Commit Graph

729 Commits

Author SHA1 Message Date
itepechi 666e3bc4ad
Merge remote-tracking branch 'upstream/develop' into bnakkoma 2024-08-15 05:40:27 +09:00
itepechi 6914aab88a
Merge remote-tracking branch 'upstream/develop' into bnakkoma 2024-06-14 13:34:42 +09:00
Oneric bed7ff8e89 mix: consistently use shell_info and shell_error
Logger output being visible depends on user configuration, but most of
the prints in mix tasks should always be shown. When running inside a
mix shell, it’s probably preferable to send output directly to it rather
than using raw IO.puts and we already have shell_* functions for this,
let’s use them everywhere.
2024-05-31 17:17:42 +02:00
Oneric 70cd5f91d8 dbprune/activites: prune array activities first
This query is less costly; if something goes wrong or gets aborted later
at least this part will arelady be done.
2024-05-31 17:16:40 +02:00
Oneric aeaebb566c dbprune: allow splitting array and single activity prunes
The former is typically just a few reports; it doesn't make sense to
rerun it over and over again in batched prunes or if a full prune OOMed.
2024-05-31 17:16:40 +02:00
Oneric 5751637926 dbprune: use query! 2024-05-31 17:16:40 +02:00
Oneric 24bab63cd8 dbprune: add more logs
Pruning can go on for a long time; give admins some insight into that
something is happening to make it less frustrating and to make it easier
which part of the process is stalled should this happen.

Again most of the changes are merely reindents;
review with whitespace changes hidden recommended.
2024-05-31 17:16:40 +02:00
Oneric 1d4c212441 dbprune: shortcut array activity search
This brought down query costs from 7,953,740.90 to 47,600.97
2024-05-31 17:16:40 +02:00
Oneric 225f87ad62 Also allow limiting the initial prune_object
May sometimes be helpful to get more predictable runtime
than just with an age-based limit.

The subquery for the non-keep-threads path is required
since delte_all does not directly accept limit().

Again most of the diff is just adjusting indentation, best
hide whitespace-only changes with git diff -w or similar.
2024-05-31 17:16:40 +02:00
Oneric e64f031167 Log number of deleted rows in prune_orphaned_activities
This gives feedback when to stop rerunning limited batches.

Most of the diff is just adjusting indentation; best reviewed
with whitespace-only changes hidden, e.g. `git diff -w`.
2024-05-31 17:16:40 +02:00
Oneric fa52093bac Add standalone prune_orphaned_activities CLI task
This part of pruning can be very expensive and bog down the whole
instance to an unusable sate for a long time. It can thus be desireable
to split it from prune_objects and run it on its own in smaller limited batches.

If the batches are smaller enough and spaced out a bit, it may even be possible
to avoid any downtime. If not, the limit can still help to at least make the
downtime duration somewhat more predictable.
2024-05-31 17:16:40 +02:00
Oneric 3126d15ffc refactor: move prune_orphaned_activities into own function
No logic changes. Preparation for standalone orphan pruning.
2024-05-31 17:16:39 +02:00
floatingghost 8f97c15b07 Merge pull request 'Preserve Meilisearch’s result ranking' (#772) from Oneric/akkoma:search-meili-order into develop
Reviewed-on: https://akkoma.dev/AkkomaGang/akkoma/pulls/772
2024-05-31 14:12:05 +00:00
Floatingghost 3af0c53a86 use proper workers for fetching pins instead of an ad-hoc task (#788)
Reviewed-on: https://akkoma.dev/AkkomaGang/akkoma/pulls/788
Co-authored-by: Floatingghost <hannah@coffee-and-dreams.uk>
Co-committed-by: Floatingghost <hannah@coffee-and-dreams.uk>
2024-05-31 08:58:52 +00:00
Oneric 59685e25d2 meilisearch: show keys by name not description
This makes show-key’s output match our documentation as of Meilisearch
1.8.0-8-g4d5971f343c00d45c11ef0cfb6f61e83a8508208. Since I’m not sure
if older versions maybe only provided description, it will fallback to
the latter if no name parameter exists.
2024-05-29 23:17:27 +00:00
itepechi 1b6a18b473
Improve search results by implementing filtering on the Meilisearch side
You will need to rerun the `search.meilisearch index` task in order to
support this. If you do not, Akkoma will only be able to filter for
newer posts than this commit and will return an error for advanced
searches if you did not update the `filterable-attributes` attribute on
the `objects` index manually.
2024-05-03 06:46:48 +09:00
Oneric a95af3ee4c exiftool: strip all non-essential tags
Documentation was already clear on this only stripping GPS tags.
But there are more potentially sensitive metadata tags (e.g. author
and possibly description) and the name alone suggests a broader effect.

Thus change the filter to strip all metadata except for colourspace info
and orientation (technically it strips everything and then readds
selected tags).

Explicitly stripping CommonIFD0 is needed since -all does not modify
IFD0 due to TIFF storing some actual image data there. CommonIFD0 then
strips a bunch of commonly used actual metadata tags from IFD0, to my
understanding leaving TIFF image data and custom metadata tags intact.
2024-04-25 23:00:42 +02:00
timorl 09d3ccf770
Read description before stripping metadata 2024-04-19 20:51:54 +02:00
timorl cd7af81896
Rename StripLocation to StripMetadata for temporal-proofing reasons 2024-04-16 20:37:00 +02:00
itepechi 5adae54d52
Removed `Dedupe` upload filter from instance generation wizard
The `Dedupe` filter is now always active, so there is no need to ask the
user to configure it anymore.
2024-04-15 04:34:38 +09:00
itepechi 0f0298abfd
Merge remote-tracking branch 'upstream/develop' into bnakkoma 2024-04-15 04:30:34 +09:00
timorl b144218dce
Merge branch 'develop' into elseinspe 2024-04-14 20:31:33 +02:00
FloatingGhost 2d439034ca Ensure that spoof-inserted does not time out 2024-03-30 12:55:22 +00:00
Oneric 0648d9ebaa Add mix tasks to detect spoofed posts and users
At least as far as we can
2024-03-26 16:05:20 -01:00
Oneric d441101200 Add mix task to detect uploaded spoof payloads 2024-03-26 16:05:20 -01:00
Oneric 0ec62acb9d Always insert Dedupe upload filter
This actually was already intended before to eradict all future
path-traversal-style exploits and to fix issues with some
characters like akkoma#610 in 0b2ec0ccee. However, Dedupe and
AnonymizeFilename got mixed up. The latter only anonymises the name
in Content-Disposition headers GET parameters (with link_name),
_not_ the upload path.

Even without Dedupe, the upload path is prefixed by an UUID,
so it _should_ already be hard to guess for attackers. But now
we actually can be sure no path shenanigangs occur, uploads
reliably work and save some disk space.

While this makes the final path predictable, this prediction is
not exploitable. Insertion of a back-reference to the upload
itself requires pulling off a successfull preimage attack against
SHA-256, which is deemed infeasible for the foreseeable futures.

Dedupe was already included in the default list in config.exs
since 28cfb2c37a, but this will get overridde by whatever the
config generated by the "pleroma.instance gen" task chose.

Upload+delete tests running in parallel using Dedupe might be flaky, but
this was already true before and needs its own commit to fix eventually.
2024-03-18 22:33:10 -01:00
Oneric fef773ca35 Drop media base_url default and recommend different domain
Same-domain setups enabled now at least two exploits,
so they ought to be discouraged and definitely not be the default.
2024-03-18 22:33:10 -01:00
itepechi 35be52eb9f Support for user search via PGroonga 2023-11-03 05:42:14 +09:00
itepechi 3b7ef1bad8 Add a setup question asking if the user wants to use AnalyzeMetadata 2023-09-27 05:32:18 +09:00
itepechi dce9ecaa9a Merge branch 'develop' into itepechi 2023-09-16 08:11:30 +09:00
itepechi 00c8a65879 Avoid dying when handling large payloads: the sequel 2023-09-09 07:14:38 +09:00
FloatingGhost 6cb40bee26 Migrate to phoenix 1.7 (#626)
Closes #612

Co-authored-by: tusooa <tusooa@kazv.moe>
Reviewed-on: https://akkoma.dev/AkkomaGang/akkoma/pulls/626
Co-authored-by: FloatingGhost <hannah@coffee-and-dreams.uk>
Co-committed-by: FloatingGhost <hannah@coffee-and-dreams.uk>
2023-08-15 10:22:18 +00:00
itepechi 06bd9130e8 Merge branch 'develop' into itepechi 2023-08-13 04:38:02 +09:00
floatingghost 0b32beb051 Merge pull request 'meilisearch: Move published date to lower priority' (#623) from norm/akkoma:meilisearch-order into develop
Reviewed-on: https://akkoma.dev/AkkomaGang/akkoma/pulls/623
2023-08-12 14:36:53 +00:00
floatingghost 7bb41bffb3 Merge pull request 'Reload emoji when using mix pleroma.emoji gen-pack and get-packs' (#563) from norm/akkoma:emoji-reload into develop
Reviewed-on: https://akkoma.dev/AkkomaGang/akkoma/pulls/563
2023-08-12 14:07:23 +00:00
Norm d79c92f9c6
meilisearch: Move published date to lower priority
Currently, Akkoma sorts by published date first before everything else.
This however makes search results pretty bad since Meilisearch uses a
bucket sort algorithm in order of the ranking rules specified:
https://www.meilisearch.com/docs/learn/core_concepts/relevancy#behavior

Since the `published` attribute is a unix timestamp, the resulting
buckets are pretty small so the other rules essentially have little to
no effect on the rankings of search results.

This fixes that issue by moving the `published:desc` rule further down
so it still sorts by date, but only after considering everything else.

AFAIK attribute and sort doesn't really affect results for Akkoma since
the only attribute considered is the `content` attribute and the `sort`
parameter isn't used in Akkoma searches. Everything else is made to
match more closely to Meilisearch's defaults.
2023-08-11 11:07:14 -04:00
itepechi af349b073a Deduplicate query_with 2023-08-07 03:53:53 +09:00
itepechi aba4e6ea60 Replace map |> join with map_join 2023-08-06 22:17:17 +09:00
itepechi 2634723958 Support for setting PGroonga indexing options 2023-08-06 06:16:38 +09:00
itepechi 033e5e27d6 Allow configuration of supported languages during instance gen 2023-08-05 04:49:52 +09:00
itepechi dddc104fb5 Fix missing variable on instance gen 2023-08-05 04:49:18 +09:00
Haelwenn (lanodan) Monnier 4f57c87be4
instance gen: Reduce permissions of pleroma directories and config files
Original: 69caedc591
2023-08-04 14:13:50 -04:00
itepechi 557fce16e5 Add support for PGroonga 2023-08-04 06:27:23 +09:00
FloatingGhost 98cb255d12 Support elixir1.15
OTP builds to 1.15

Changelog entry

Ensure policies are fully loaded

Fix :warn

use main branch for linkify

Fix warn in tests

Migrations for phoenix 1.17

Revert "Migrations for phoenix 1.17"

This reverts commit 6a3b2f15b74ea5e33150529385215b7a531f3999.

Oban upgrade

Add default empty whitelist

mix format

limit test to amd64

OTP 26 tests for 1.15

use OTP_VERSION tag

baka

just 1.15

Massive deps update

Update locale, deps

Mix format

shell????

multiline???

?

max cases 1

use assert_recieve

don't put_env in async tests

don't async conn/fs tests

mix format

FIx some uploader issues

Fix tests
2023-08-03 17:44:09 +01:00
Norm b99053d2c2 Reload emoji when using mix pleroma.emoji gen-pack and get-packs
I think it makes more sense that the emoji cache gets reloaded in Akkoma if you add or create emoji packs.
2023-06-04 02:43:18 +00:00
floatingghost 6225f24f5f Merge pull request 'Clean up bookmarks after prune_objects' (#544) from ilja/akkoma:clean_up_bookmarks_after_prune_objects into develop
Reviewed-on: https://akkoma.dev/AkkomaGang/akkoma/pulls/544
2023-05-22 21:28:48 +00:00
ilja f49e9e6d4c Clean up bookmarks after prune_objects
When doing prune_objects, it's possible that bookmarked objects are deleted.
This gave problems when fetching the bookmark TL.
Here we clean up the bookmarks during pruning in the case were it's possible that bookmarked objects are deleted.
2023-05-21 13:02:28 +02:00
FloatingGhost 522221f7fb Mix format 2023-04-14 17:56:34 +01:00
FloatingGhost 2a8c1f4192 Add extra diagnostic tasks in 2023-03-29 14:11:00 +01:00
ilja 57eef6d764 prune_objects can prune orphaned activities who reference an array of objects
E.g. Flag activities have an array of objects

We prune the activity when NONE of the objects can be found

Note that the cost of finding and deleting these is ~4x higher than finding and deleting the non-array ones

Only string:
Delete on activities  (cost=506573.48..506580.38 rows=0 width=0)

Only Array:
Delete on activities  (cost=3570359.68..4276365.34 rows=0 width=0)

(They are still executed separately, so the total cost is the sum of the two)
2023-02-26 14:41:50 +01:00