Commit graph

9481 commits

Author SHA1 Message Date
floatingghost
e3c8c4f24f Merge pull request 'mrf/object_age: fix handling of non-public objects' (#851) from Oneric/akkoma:mrf-fix-oage into develop
Reviewed-on: https://akkoma.dev/AkkomaGang/akkoma/pulls/851
2025-01-03 15:26:11 +00:00
floatingghost
67cdc38296 Merge pull request 'Only proxy HTTP and HTTP urls via Media Proxy' (#860) from nopjmp/akkoma:media-proxy-only-http into develop
Reviewed-on: https://akkoma.dev/AkkomaGang/akkoma/pulls/860
2025-01-03 15:25:14 +00:00
floatingghost
89d209f486 Merge pull request 'Fix NodeInfo content-type' (#853) from Oneric/akkoma:nodeinfo-contenttype into develop
Reviewed-on: https://akkoma.dev/AkkomaGang/akkoma/pulls/853
2025-01-03 15:24:25 +00:00
floatingghost
91bedcfa68 Merge pull request 'Completely omit id for anonymous objects' (#850) from Oneric/akkoma:ap-anonymous-errata into develop
Reviewed-on: https://akkoma.dev/AkkomaGang/akkoma/pulls/850
2025-01-03 15:23:03 +00:00
Norm
f19d5d1380 Set customize_hostname_check for Swoosh.Adapters.SMTP
This should hopefully fix issues with connecting to SMTP servers
with wildcard TLS certificates.

Taken from https://erlef.github.io/security-wg/secure_coding_and_deployment_hardening/ssl

Fixes https://akkoma.dev/AkkomaGang/akkoma/issues/660
2024-12-18 14:37:27 -05:00
nopjmp
7632765b43 Only proxy HTTP and HTTP urls via Media Proxy
We make an assumption that we are only proxying HTTP/HTTPS hosted
media through the media proxy endpoint.

Fixes: #859
2024-12-16 20:35:12 -06:00
Oneric
294de939cb signing_key: refactor nested case into with statement
The error branches were already effectively identical before.
This change is purely cosmetic.
2024-12-08 20:43:57 +00:00
Haelwenn (lanodan) Monnier
2b1a252cc7 User: truncate remote user fields instead of rejecting 2024-11-26 09:29:44 +00:00
Oneric
416aebb76a Fix NodeInfo content-type
Fixes: https://akkoma.dev/AkkomaGang/akkoma/issues/852
2024-11-19 19:25:31 +01:00
Oneric
932810c35e mrf/object_age: fix handling of non-public objects
Current logic unconditionally adds public adressing to "cc"
and follower adressing to "to" after attempting to strip it
from the other one. This creates serious problems:

First the bug prompting this investigation and fix,
unconditional addition creates duplicates when adressing
URIs already were in their intended final field; e.g.
this is prominently the case for all "unlisted" posts.
Since List.delete only removes the first occurence,
this then broke follower-adress stripping later on
making the policy ineffective.

It’s also just not safe in general wrt to non-public adressing:
e.g. pre-existing duplicates didn’t get fully stripped,
bespoke adressing modes with only one of public addressing
or follower addressing are mangled — and most importantly:
any belatedly received DM or follower-only post
also got public adressing added!
Shockingly this last point was actually asserted as "correct" in tests;
it appears to be a mistake from mindless match adjustments
while fixing crashes on nil adressing in
10c792110e.

Clean up this sloppy logic up, making sure no more duplicates are
added by us, all instances of relevant adresses are purged and only
readded when they actually existed to begin with.
2024-11-17 00:44:51 +01:00
Oneric
0f9c9aac38 Completely omit id for anonymous objects
Current AP spec demands anonymous objects to have an id value,
but explicitly set it to JSON null. Howeveras it turns out this is
incompatible with JSON-LD requiring `@id` to be a string and thus AP
spec is incompatible iwth the Ativity Streams spec it is based on.
This is an issue for (the few) AP implementers actually performing
JSON-LD processing, like IceShrimp.NET.
This was uncovered by IceShrimp.NET’s zotan due to our adoption of
anonymous objects for emoj in f101886709.

The issues is being discussed by W3C, and will most likely be resolved
via an errata redefining anonymous objects to completely omit the id
field just like transient objects already do. See:
https://github.com/w3c/activitypub/issues/476

Fixes: https://akkoma.dev/AkkomaGang/akkoma/issues/848
2024-11-09 19:29:29 +01:00
Floatingghost
c0a99df06a Merge remote-tracking branch 'oneric/varfixes' into develop 2024-10-30 15:15:00 +00:00
Floatingghost
11c5838947 standardise local key id generation 2024-10-30 12:44:01 +00:00
Floatingghost
d330c57cda make sure we correctly match key objects 2024-10-26 08:42:07 +01:00
Floatingghost
9d2c558f64 remove unused import 2024-10-26 07:42:43 +01:00
Floatingghost
ac25b051ae remove previous "allow user routes" functionality 2024-10-26 07:28:43 +01:00
Floatingghost
58d5d9d7bf fix tests, contain object 2024-10-26 06:58:47 +01:00
Floatingghost
b6e8fde4dd Merge branch 'develop' into keys-extraction 2024-10-26 06:11:29 +01:00
Floatingghost
bee10eab5e correct minor zip behaviour 2024-10-26 06:11:12 +01:00
Floatingghost
13215f5f06 remove public key field 2024-10-26 05:28:55 +01:00
Floatingghost
430b376ded mix format 2024-10-26 05:05:48 +01:00
Floatingghost
ccf1007883 Fix about a million tests 2024-10-26 05:05:48 +01:00
Floatingghost
6da783b84d Fix http signature plug tests 2024-10-26 05:05:48 +01:00
Floatingghost
8f322456a0 Allow unsigned fetches of a user's public key 2024-10-26 05:05:48 +01:00
Floatingghost
9c876cea21 Fix some tests 2024-10-26 05:05:48 +01:00
Floatingghost
9728e2f8f7 adjust logic to use relation :signing_key 2024-10-26 05:05:47 +01:00
Floatingghost
b0f7da9ce0 remove now-unused Keys module 2024-10-26 05:05:28 +01:00
Floatingghost
fc99c694e6 Add signing key modules 2024-10-26 05:05:28 +01:00
Floatingghost
fbb13fde76 Merge branch 'develop' of akkoma.dev:AkkomaGang/akkoma into develop 2024-10-26 05:04:27 +01:00
Floatingghost
cbd236aeb5 mix format 2024-10-26 05:04:20 +01:00
floatingghost
71d3bbd7be Merge pull request 'Fix wrong type when importing emojis' (#841) from tudbut/akkomafixes:emojis into develop
Reviewed-on: https://akkoma.dev/AkkomaGang/akkoma/pulls/841
2024-10-26 04:00:12 +00:00
Norm
88a8086ad3 Use LEFT JOIN instead of UNION for hashtag pruning 2024-10-25 12:26:14 -04:00
Norm
40da4e88ea Update hashtag prune to account for followed hashtags
Currently pruning hashtags with the prune_objects task only accounts
for whether that hashtag is associated with an object, but this may
lead to a foreign key constraint violation if that hashtag has no
objects but is followed by a local user.

This adds an additional check to see if that hashtag has any followers
before proceeding to delete it.
2024-10-25 11:55:37 -04:00
TudbuT
661b7fedb6
fix wrong type when importing emojis 2024-10-18 14:57:31 +02:00
TudbuT
8b5aca9619
fix fs error while unpacking frontends 2024-10-18 14:50:28 +02:00
floatingghost
f101886709 Merge pull request 'Federate emoji as anonymous objects' (#815) from Oneric/akkoma:emoji-id into develop
Reviewed-on: https://akkoma.dev/AkkomaGang/akkoma/pulls/815
2024-10-16 14:58:46 +00:00
Oneric
d5b0720596 Allow cross-domain redirects on AP requests
Since we now remember the final location redirects lead to
and use it for all further checks since
3e134b07fa, these redirects
can no longer be exploited to serve counterfeit objects.

This fixes:
 - display URLs from independent webapp clients
   redirecting to the canonical domain
 - Peertube display URLs for remote content
   (acting like the above)
2024-10-14 01:42:51 +02:00
Oneric
940792f8ba Refetch on AP ID mismatch
As hinted at in the commit message when strict checking
was added in 8684964c5d,
refetching is more robust than display URL comparison
but in exchange is harder to implement correctly.

A similar refetch approach is also employed by
e.g. Mastodon, IceShrimp and FireFish.

To make sure no checks can be bypassed by forcing
a refetch, id checking is placed at the very end.

This will fix:
 - Peertube display URL arrays our transmogrifier fails to normalise
 - non-canonical display URLs from alternative frontends
   (theoretical; we didnt’t get any actual reports about this)

It will also be helpful in the planned key handling overhaul.

The modified user collision test was introduced in
https://git.pleroma.social/pleroma/pleroma/-/merge_requests/461
and unfortunately the issues this fixes aren’t public.
Afaict it was just meant to guard against someone serving
faked data belonging to an unrelated domain. Since we now
refetch and the id actually is mocked, lookup now succeeds
but will use the real data from the authorative server
making it unproblematic. Instead modify the fake data further
and make sure we don’t end up using the spoofed version.
2024-10-14 01:42:43 +02:00
floatingghost
3bb31117e6 Merge pull request 'Handle domain mutes on the backend' (#804) from domain-mute-backend-processing into develop
Reviewed-on: https://akkoma.dev/AkkomaGang/akkoma/pulls/804
2024-08-20 10:32:47 +00:00
Floatingghost
2c5c531c35 readd comment about domain mutes 2024-08-20 11:05:36 +01:00
Oneric
a3101a435b Fix swagger-ui
Ever since the browser frontend switcher was introduced in
de64c6c54a /akkoma counts as
an API prefix and thus gets skipped by frontend plugs
breaking the old swagger ui path of /akkoma/swagger-ui.

Do the simple thing and change the frontend path to
/pleroma/swaggerui which isn't an API path and can't collide
with frontend user paths given pleroma is areserved nickname.

Reported in
  https://meta.akkoma.dev/t/view-all-endpoints/269/7
  https://meta.akkoma.dev/t/swagger-ui-not-loading/728
2024-06-27 18:29:45 +02:00
Oneric
d488cf476e Fix voters count field
Mastodon API demands this be null unless it’s a multi-selection poll.
Not abiding by this can mess up display in some clients.

Fixes: https://akkoma.dev/AkkomaGang/akkoma/issues/190
2024-06-27 18:29:45 +02:00
Oneric
ca182a0ae7 Correctly parse content types with multiple profiles
Multiple profiles can be specified as a space-separated list
and the possibility of additional profiles is explicitly brought up
in ActivityStream spec
2024-06-27 18:29:45 +02:00
Oneric
495a1a71e8 strip_metadata: skip BMP files
Not _yet_ supported as of exiftool 12.87, though
at first glance it seems like standard BMP files
can't store any metadata besides colour profiles

Fixes the specific case from
https://akkoma.dev/AkkomaGang/akkoma-fe/issues/396
although the frontend shouldn’t get bricked regardless.
2024-06-27 18:29:45 +02:00
Oneric
7cd3954152 Remove superfluous actor key suffix
Fragments are already always stripped anyway
so listing one specific fragment here is
unnecessary and potentially confusing.

This effectively reverts
4457928e32
but keeps the added bridgy testcase.
2024-06-27 18:29:45 +02:00
Oneric
4ff5293093 Federate emoji as anonymous objects
Usually an id should point to another AP object
and the image file isn’t an AP object. We currently
do not provide standalone AP objects for emoji and
don't keep track of remote emoji at all.
Thus just federate them as anonymous objects,
i.e. objects only existing within a parent context
and using an explicit null id.

IceShrimp.NET previously adopted anonymous objects
for remote emoji without any apparent issues. See:
333611f65e

Fixes: https://akkoma.dev/AkkomaGang/akkoma/issues/694
2024-06-23 20:46:59 +02:00
floatingghost
f66135ed08 Merge pull request 'Avoid accumulation of stale data in websockets' (#806) from Oneric/akkoma:websocket_fullsweep into develop
Reviewed-on: https://akkoma.dev/AkkomaGang/akkoma/pulls/806
Reviewed-by: floatingghost <hannah@coffee-and-dreams.uk>
2024-06-23 02:19:36 +00:00
Oneric
13e2a811ec Avoid accumulation of stale data in websockets
We’ve received reports of some specific instances slowly accumulating
more and more binary data over time up to OOMs and globally setting
ERL_FULLSWEEP_AFTER=0 has proven to be an effective countermeasure.
However, this incurs increased cpu perf costs everywhere and is
thus not suitable to apply out of the box.

Apparently long-lived Phoenix websocket processes are known to
often cause exactly this by getting into a state unfavourable
for the garbage collector.
Therefore it seems likely affected instances are using timeline
streaming and do so in just the right way to trigger this. We
can tune the garbage collector just for websocket processes
and use a more lenient value of 20 to keep the added perf cost
in check.

Testing on one affected instance appears to confirm this theory

Ref.:
  https://www.erlang.org/doc/man/erlang#ghlink-process_flag-2-idp226
  https://blog.guzman.codes/using-phoenix-channels-high-memory-usage-save-money-with-erlfullsweepafter
  https://git.pleroma.social/pleroma/pleroma/-/merge_requests/4060

Tested-by: bjo
2024-06-22 22:22:33 +02:00
Oneric
c3069b9478 cosmetic: fix elixir 1.17 compiler warnings in main application 2024-06-19 01:49:59 +02:00
floatingghost
5992e8bb16 Merge pull request 'Update http-signatures dep, allow created header' (#800) from created-pseudoheader into develop
Reviewed-on: https://akkoma.dev/AkkomaGang/akkoma/pulls/800
2024-06-17 21:52:59 +00:00
Floatingghost
57273754b7 we may as well handle (expires) as well 2024-06-17 22:30:14 +01:00
floatingghost
59bfdf2ca4 Merge pull request 'Add limit CLI flags to prune jobs' (#655) from Oneric/akkoma:prune-batch into develop
Reviewed-on: https://akkoma.dev/AkkomaGang/akkoma/pulls/655
2024-06-17 20:47:53 +00:00
Oneric
bf8f493ffd Remove proxy_remote vestiges
Ever since 364b6969eb
this setting wasn't used by the backend and a noop.
The stated usecase is better served by setting the base_url
to a local subdomain and using proxying in nginx/Caddy/...
2024-06-16 01:21:52 +02:00
Floatingghost
3b197503d2 me me stupid person 2024-06-15 15:30:02 +01:00
Floatingghost
c0b2bba55e revert subdomain change until i can look at why i did that 2024-06-15 15:14:42 +01:00
Floatingghost
4b765b1886 mix format 2024-06-15 15:06:28 +01:00
Floatingghost
cba2c5725f Filter emoji reaction accounts by domain blocks 2024-06-15 15:05:52 +01:00
Floatingghost
2b96c3b224 Update http-signatures dep, allow created header 2024-06-12 18:40:44 +01:00
floatingghost
b03edb4ff4 Merge pull request 'Fix StealEmoji’s max size check' (#793) from Oneric/akkoma:emojistealer_contentlength into develop
Reviewed-on: https://akkoma.dev/AkkomaGang/akkoma/pulls/793
2024-06-12 17:09:05 +00:00
Floatingghost
4d6fb43cbd No need to spawn() any more 2024-06-12 02:09:24 +01:00
Floatingghost
ad52135bf5 Convert rich media backfill to oban task 2024-06-11 18:06:51 +01:00
Floatingghost
9c5feb81aa fix tests 2024-06-09 21:26:29 +01:00
Floatingghost
a360836ce3 fix oembed test 2024-06-09 21:17:12 +01:00
Floatingghost
840c70c4fa remove prints 2024-06-09 18:52:09 +01:00
Floatingghost
c65379afea attempt to fix some tests 2024-06-09 18:45:38 +01:00
Floatingghost
16bed0562d Fix tests 2024-06-09 18:28:00 +01:00
Mark Felder
a801dd7b07 Fix module struct matching 2024-06-09 17:38:28 +01:00
Mark Felder
1e86da43f5 Credo 2024-06-09 17:38:24 +01:00
Mark Felder
411831458c Credo 2024-06-09 17:38:18 +01:00
Mark Felder
56463b2121 Fix compile warning
warning: "else" clauses will never match because all patterns in "with" will always match
  lib/pleroma/web/rich_media/parser/ttl/opengraph.ex:10
2024-06-09 17:38:12 +01:00
Mark Felder
2f5eb79473 Mastodon API: Remove deprecated GET /api/v1/statuses/:id/card endpoint
Removed back in 2019

https://github.com/mastodon/mastodon/pull/11213
2024-06-09 17:38:06 +01:00
Mark Felder
4746f98851 Fix broken Rich Media parsing when the image URL is a relative path 2024-06-09 17:36:28 +01:00
Mark Felder
765c7e98d2 Respect the TTL returned in OpenGraph tags 2024-06-09 17:36:15 +01:00
Floatingghost
4a3dd5f65e lost in cherry-pick 2024-06-09 17:34:41 +01:00
Mark Felder
bfe4152385 Increase the :max_body for Rich Media to 5MB
Websites are increasingly getting more bloated with tricks like inlining content (e.g., CNN.com) which puts pages at or above 5MB. This value may still be too low.
2024-06-09 17:34:29 +01:00
Mark Felder
5da9cbd8a5 RichMedia refactor
Rich Media parsing was previously handled on-demand with a 2 second HTTP request timeout and retained only in Cachex. Every time a Pleroma instance is restarted it will have to request and parse the data for each status with a URL detected. When fetching a batch of statuses they were processed in parallel to attempt to keep the maximum latency at 2 seconds, but often resulted in a timeline appearing to hang during loading due to a URL that could not be successfully reached. URLs which had images links that expire (Amazon AWS) were parsed and inserted with a TTL to ensure the image link would not break.

Rich Media data is now cached in the database and fetched asynchronously. Cachex is used as a read-through cache. When the data becomes available we stream an update to the clients. If the result is returned quickly the experience is almost seamless. Activities were already processed for their Rich Media data during ingestion to warm the cache, so users should not normally encounter the asynchronous loading of the Rich Media data.

Implementation notes:

- The async worker is a Task with a globally unique process name to prevent duplicate processing of the same URL
- The Task will attempt to fetch the data 3 times with increasing sleep time between attempts
- The HTTP request obeys the default HTTP request timeout value instead of 2 seconds
- URLs that cannot be successfully parsed due to an unexpected error receives a negative cache entry for 15 minutes
- URLs that fail with an expected error will receive a negative cache with no TTL
- Activities that have no detected URLs insert a nil value in the Cachex :scrubber_cache so we do not repeat parsing the object content with Floki every time the activity is rendered
- Expiring image URLs are handled with an Oban job
- There is no automatic cleanup of the Rich Media data in the database, but it is safe to delete at any time
- The post draft/preview feature makes the URL processing synchronous so the rendered post preview will have an accurate rendering

Overall performance of timelines and creating new posts which contain URLs is greatly improved.
2024-06-09 17:33:48 +01:00
Floatingghost
a924e117fd Add pool timeouts 2024-06-09 17:20:29 +01:00
Oneric
2180d068ae Raise log level for start failures 2024-06-07 16:21:21 +02:00
Oneric
a3840e7d1f Raise minimum PostgreSQL version to 12
This lets us:
 - avoid issues with broken hash indices for PostgreSQL <10
 - drop runtime checks and legacy codepaths for <11 in db search
 - always enable custom query plans for performance optimisation

PostgreSQL 11 is already EOL since 2023-11-09, so
in theory everyone should already have moved on to 12 anyway.
2024-06-07 16:21:09 +02:00
Oneric
df27567d99 mrf/steal_emoji: display download_unknown_size in admin-fe
Fixes omission in d6d838cbe8
2024-06-05 20:14:10 +02:00
Oneric
be5440c5e8 mrf/steal_emoji: fix size limit check
Headers are strings, but this expected to already get an int
thus always failing the comparison if the header was set.

Fixes mistake in d6d838cbe8
2024-06-05 20:11:53 +02:00
Floatingghost
0f65dd3ebe remove pointless logger 2024-06-04 14:34:59 +01:00
Floatingghost
38d09cb0ce remove now-pointless clause 2024-06-04 14:34:18 +01:00
Floatingghost
c9a03af7c1 Move rescue to the HTTP request itself 2024-06-04 14:30:16 +01:00
Floatingghost
0f7ae0fa21 am i baka 2024-06-04 14:26:33 +01:00
Floatingghost
30e13a8785 Don't error on rich media fail 2024-06-04 14:21:40 +01:00
Floatingghost
778b213945 enqueue pin fetches after changeset validation 2024-06-01 08:25:35 +01:00
Oneric
bed7ff8e89 mix: consistently use shell_info and shell_error
Logger output being visible depends on user configuration, but most of
the prints in mix tasks should always be shown. When running inside a
mix shell, it’s probably preferable to send output directly to it rather
than using raw IO.puts and we already have shell_* functions for this,
let’s use them everywhere.
2024-05-31 17:17:42 +02:00
Oneric
70cd5f91d8 dbprune/activites: prune array activities first
This query is less costly; if something goes wrong or gets aborted later
at least this part will arelady be done.
2024-05-31 17:16:40 +02:00
Oneric
aeaebb566c dbprune: allow splitting array and single activity prunes
The former is typically just a few reports; it doesn't make sense to
rerun it over and over again in batched prunes or if a full prune OOMed.
2024-05-31 17:16:40 +02:00
Oneric
5751637926 dbprune: use query! 2024-05-31 17:16:40 +02:00
Oneric
24bab63cd8 dbprune: add more logs
Pruning can go on for a long time; give admins some insight into that
something is happening to make it less frustrating and to make it easier
which part of the process is stalled should this happen.

Again most of the changes are merely reindents;
review with whitespace changes hidden recommended.
2024-05-31 17:16:40 +02:00
Oneric
1d4c212441 dbprune: shortcut array activity search
This brought down query costs from 7,953,740.90 to 47,600.97
2024-05-31 17:16:40 +02:00
Oneric
225f87ad62 Also allow limiting the initial prune_object
May sometimes be helpful to get more predictable runtime
than just with an age-based limit.

The subquery for the non-keep-threads path is required
since delte_all does not directly accept limit().

Again most of the diff is just adjusting indentation, best
hide whitespace-only changes with git diff -w or similar.
2024-05-31 17:16:40 +02:00
Oneric
e64f031167 Log number of deleted rows in prune_orphaned_activities
This gives feedback when to stop rerunning limited batches.

Most of the diff is just adjusting indentation; best reviewed
with whitespace-only changes hidden, e.g. `git diff -w`.
2024-05-31 17:16:40 +02:00
Oneric
fa52093bac Add standalone prune_orphaned_activities CLI task
This part of pruning can be very expensive and bog down the whole
instance to an unusable sate for a long time. It can thus be desireable
to split it from prune_objects and run it on its own in smaller limited batches.

If the batches are smaller enough and spaced out a bit, it may even be possible
to avoid any downtime. If not, the limit can still help to at least make the
downtime duration somewhat more predictable.
2024-05-31 17:16:40 +02:00
Oneric
3126d15ffc refactor: move prune_orphaned_activities into own function
No logic changes. Preparation for standalone orphan pruning.
2024-05-31 17:16:39 +02:00
floatingghost
8f97c15b07 Merge pull request 'Preserve Meilisearch’s result ranking' (#772) from Oneric/akkoma:search-meili-order into develop
Reviewed-on: https://akkoma.dev/AkkomaGang/akkoma/pulls/772
2024-05-31 14:12:05 +00:00
Floatingghost
3af0c53a86 use proper workers for fetching pins instead of an ad-hoc task (#788)
Reviewed-on: https://akkoma.dev/AkkomaGang/akkoma/pulls/788
Co-authored-by: Floatingghost <hannah@coffee-and-dreams.uk>
Co-committed-by: Floatingghost <hannah@coffee-and-dreams.uk>
2024-05-31 08:58:52 +00:00
Oneric
fc7e07f424 meilisearch: enable using search_key
Using only the admin key works as well currently
and Akkoma needs to know the admin key to be able
to add new entries etc. However the Meilisearch
key descriptions suggest the admin key is not
supposed to be used for searches, so let’s not.

For compatibility with existings configs, search_key remains optional.
2024-05-29 23:17:27 +00:00