Commit graph

9467 commits

Author SHA1 Message Date
Oneric
2ddff7e386 transmogrifier: gracefully ignore Delete of unknown objects
It's quite common to receive spurious Deletes,
so we neither want to waste resources on retrying
nor spam "invalid AP" logs
2025-01-07 20:27:28 +01:00
Oneric
cd8e6a4235 transmogrifier: gracefully ignore duplicated object deletes
The object lookup is later repeated in the validator, but due to
caching shouldn't incur any noticeable performance impact.
It’s actually preferable to check here, since it avoids the otherwise
occurring user lookup and overhead from starting and aborting a
transaction
2025-01-07 20:27:28 +01:00
Oneric
ac2327c8fc transmogrifier: be more selective about Delete retry
If something else renders the Delete invalid,
there’s no point in retrying anyway
2025-01-07 20:27:28 +01:00
Oneric
92bf93a4f7 transmogrifier: avoid crashes on non-validation Delete errors
Happens e.g. for duplicated Deletes.
The remaining tombstone object no longer has an actor,
leading to an error response during side-effect handling.
2025-01-07 20:27:28 +01:00
Oneric
7ad5f8d3c0 object_validators: only query relevant table for object
Most of them actually only accept either activities or a
non-activity object later; querying both is then a waste
of resources and may create false positives.
2025-01-07 20:27:28 +01:00
Oneric
b0387dee14 Gracefully ignore Undo activities referring to unknown objects 2025-01-07 20:27:28 +01:00
Oneric
caa4fbe326 user: avoid database work on superfluous pin
The only thing this does is changing the updated_at field of the user.
Afaict this came to be because prior to pins federating this was split
into two functions, one of which created a changeset, the other applying
a given changeset. When this was merged the bits were just copied into
place.
2025-01-07 20:27:28 +01:00
Oneric
09736431e0 Don't spam logs about deleted users
User.get_or_fetch_by_(apid|nickname) are the only external users of fetch_and_prepare_user_from_ap_id,
thus there’s no point in duplicating logging, especially not at error level.
Currently (duplicated) _not_found errors for users make up the bulk of my logs
and are created almost every second. Deleted users are a common occurrence and not
worth logging outside of debug
2025-01-07 20:27:28 +01:00
Oneric
bcf3e101f6 rich_media: lower log level of update 2025-01-07 20:27:28 +01:00
Oneric
05bbdbf388 nodeinfo: lower log level of regular actions to debug 2025-01-07 20:27:28 +01:00
Oneric
2c75600532 federation/incoming: improve link_resolve retry decision
To facilitate this ObjectValidator.fetch_actor_and_object is adapted to
return an informative error. Otherwise we’d be unable to make an
informed decision on retrying or not later. There’s no point in
retrying to fetch MRF-blocked stuff or private posts for example.
2025-01-07 20:27:28 +01:00
Oneric
0cd4040db6 Error out earlier on missing mandatory reference
This is the only user of fetch_actor_and_object which previously just
always pretended to be successful. For all the activity types handled
here, we absolutely need the referenced object to be able to process it
(other than for Announce, whether or not processing those activity types
for unknown remote objects is desirable in the first place is up for debate)

All other users of the similar fetch_actor already properly check success.

Note, this currently lumps all resolve failure reasons together,
so even e.g. boosts of MRF rejected posts will still exhaust all
retries. The following commit improves on this.
2025-01-07 20:27:28 +01:00
Oneric
0ba5c3649d federator: don't nest {:error, _} tuples
It makes decisions based on error sources harder since all possible
nesting levels need to be checked for. As shown by the return values
handled in the receiver worker something else still nests those,
but this is a first start.
2025-01-07 20:27:28 +01:00
Oneric
8e5defe6ca stats: estimate remote user count
This value is currently only used by Prometheus metrics
but is (after optimising the peer query in the preceding commit)
the most costly part of instance stats.
2025-01-07 20:27:28 +01:00
Oneric
138b1aea2f stats: use cheaper peers query
This query is one of the top cost offenders during an instance's
lifetime. For small instances it was shown to take up 30-50% of
the total database query time, while for bigger instances it still held
a spot in the top 3, almost as expensive as, or even more expensive
overall than, timeline queries!

The good news is, there’s a cheaper way using the instance table:
no need to process each entry, no need to filter NULLs
and no need to dedupe. EXPLAIN estimates the cost of the
old query as 13272.39 and the cost of the new query as 395.74
for me; i.e. a 33-fold reduction.

Results can slightly differ. E.g. we might have an old user
predating the instance table's existence with no interaction since,
or no instance table entry due to failure to query nodeinfo.
Conversely, we might have an instance entry but all known users got
deleted since.
However, this seems unproblematic in practice
and well worth the perf improvement.

Given the previous query didn’t exclude unreachable instances,
neither does the new query.
2025-01-07 20:27:28 +01:00
Oneric
8b5183cb74 stats: fix stat spec 2025-01-07 20:27:28 +01:00
Oneric
cbb0d4b0a8 receiver_worker: log unexpected errors
This can't handle process crash errors
but I hope those get a stacktrace logged by default
2025-01-07 20:27:28 +01:00
Oneric
be2c857845 receiver_worker: don't reattempt invalid documents
Ideally we’d like to split this up more and count most invalid documents
as an error, but silently drop e.g. Deletes for unknown objects.
However, this is hard to extract from the changeset and jobs canceled
with :discard don’t count as exceptions and I’m not aware of an idiomatic
way to cancel further retries while retaining the exception status.

Thus at least keep a log, but since superfluous "Delete"s
seem kinda frequent, don't log at error, only info level.
2025-01-07 20:27:28 +01:00
Oneric
9f4d3a936f cosmetic/receiver_worker: reformat error cases
The next commit adds a multi-statement case
and then mix format will enforce this anyway
2025-01-07 20:27:28 +01:00
Oneric
f9724b5879 Don’t reattempt insertion of already known objects
Might happen if we receive e.g. a Like before the Note arrives
in our inbox and we thus already queried the Note ourselves.
2025-01-07 20:27:27 +01:00
Oneric
280652651c rich_media: don't reattempt parsing on rejected URLs 2025-01-07 20:27:27 +01:00
Oneric
92544e8f99 Don't enqueue a plethora of unnecessary NodeInfoFetcher jobs
There were two issues leading to needless effort:
Most importantly, the use of AP IDs as "source_url" meant multiple
simultaneous jobs got scheduled for the same instance even with the
default unique settings.
Also jobs were scheduled unconditionally for each processed AP object
meaning we incurred overhead from managing Oban jobs even if we knew it
wasn't necessary. By comparison the single query to check if an update
is needed should be cheaper overall.
2025-01-07 20:27:27 +01:00
Oneric
d283ac52c3 Don't create noop SearchIndexingWorker jobs for passive index 2025-01-07 20:27:27 +01:00
Oneric
ed4019e7a3 workers: make custom filtering ahead of enqueue possible 2025-01-07 20:27:27 +01:00
Oneric
25d24cc5f6 validators/add_remove: don't crash on failure to resolve reference
It allows for informed error handling and retry/discard job
decisions later on which a future commit will add.
2025-01-07 20:27:27 +01:00
Oneric
ead44c6671 federator: don't fetch the user for no reason
The return value is never used here; later stages which actually need it
fetch the user themselves and it doesn't matter whether we wait for the
fetch here or later (if needed at all).

Even more, this early fetch always fails if the user was already deleted
or never known to begin with, but we get something referencing it; e.g.
the very Delete action carrying out the user deletion.
This prevents processing of the Delete, but before that it will be
reattempted several times, each time attempting to fetch the
non-existing profile, wasting resources.
2025-01-07 20:27:27 +01:00
Oneric
4859f38624 add_remove_validator: limit refetch rate to 1 per 5s
This matches the maximum_age used when processing Move activities
2025-01-07 20:27:27 +01:00
Oneric
0f4a7a185f Drop ap_enabled indicator from atom feeds 2025-01-07 20:27:27 +01:00
Haelwenn (lanodan) Monnier
c17681ae1e Purge obsolete ap_enabled indicator
It was used to migrate OStatus connections to ActivityPub if possible,
but support for OStatus was long since dropped, all new actors are always AP
and if anything wasn't migrated before, their instance is already marked
as unreachable anyway.

The associated logic was also buggy in several ways and deleted users
got set to ap_enabled=false also causing some issues.

This patch is a pretty direct port of the original Pleroma MR;
follow-up commits will further fix and clean up remaining issues.
Changes made (other than trivial merge conflict resolutions):
  - converted CHANGELOG format
  - adapted migration id for Akkoma’s timeline
  - removed ap_enabled from additional tests

Ported-from: https://git.pleroma.social/pleroma/pleroma/-/merge_requests/3880
2025-01-07 20:27:26 +01:00
Floatingghost
1ffbaa2924 don't allow a nil inbox to obliterate federation 2025-01-06 11:43:41 +00:00
floatingghost
39cef8b8d2 Merge pull request 'Set customize_hostname_check for Swoosh.Adapters.SMTP' (#861) from norm/akkoma:smtp-defaults-fix into develop
Reviewed-on: https://akkoma.dev/AkkomaGang/akkoma/pulls/861
2025-01-05 15:43:16 +00:00
floatingghost
3ba743d635 Merge pull request 'Update hashtag prune to account for followed hashtags' (#844) from norm/akkoma:hashtag-prune into develop
Reviewed-on: https://akkoma.dev/AkkomaGang/akkoma/pulls/844
2025-01-05 15:41:23 +00:00
floatingghost
8de373fa24 Merge pull request 'Fix various attachment cleanup issues' (#789) from Oneric/akkoma:attachcleanup-overeager into develop
Reviewed-on: https://akkoma.dev/AkkomaGang/akkoma/pulls/789
2025-01-05 15:39:48 +00:00
floatingghost
7c095a6b70 Merge pull request 'do not fetch if :limit_to_local_content is :all or :unauthenticated' (#582) from beerriot/akkoma:develop-no-fetch-with-local-limit into develop
Reviewed-on: https://akkoma.dev/AkkomaGang/akkoma/pulls/582
2025-01-05 15:39:13 +00:00
Oneric
e8bf4422ff Delay attachment deletion
Otherwise attachments have a high chance to disappear with akkoma-fe’s
“delete & redraft” feature when cleanup is enabled in the backend. Since
we don't know whether a deletion was intended to be part of a redraft
process or even whether the redraft was abandoned, we still have to
delete attachments eventually.
A thirty minute delay should provide sufficient time for redrafting.

Fixes: https://akkoma.dev/AkkomaGang/akkoma/issues/775
2025-01-03 20:49:11 +01:00
Oneric
bcfbfbcff5 Don't try to cleanup remote attachments
The cleanup attachment worker was run for every deleted post,
even if it’s a remote post whose attachments we don't even store.
This was especially bad due to attachment cleanup involving a
particularly heavy query wasting a bunch of database perf for nil.

This was uncovered by comparing statistics from
https://akkoma.dev/AkkomaGang/akkoma/issues/784 and
https://akkoma.dev/AkkomaGang/akkoma/issues/765#issuecomment-12256
2025-01-03 20:48:46 +01:00
floatingghost
e3c8c4f24f Merge pull request 'mrf/object_age: fix handling of non-public objects' (#851) from Oneric/akkoma:mrf-fix-oage into develop
Reviewed-on: https://akkoma.dev/AkkomaGang/akkoma/pulls/851
2025-01-03 15:26:11 +00:00
floatingghost
67cdc38296 Merge pull request 'Only proxy HTTP and HTTPS urls via Media Proxy' (#860) from nopjmp/akkoma:media-proxy-only-http into develop
Reviewed-on: https://akkoma.dev/AkkomaGang/akkoma/pulls/860
2025-01-03 15:25:14 +00:00
floatingghost
89d209f486 Merge pull request 'Fix NodeInfo content-type' (#853) from Oneric/akkoma:nodeinfo-contenttype into develop
Reviewed-on: https://akkoma.dev/AkkomaGang/akkoma/pulls/853
2025-01-03 15:24:25 +00:00
floatingghost
91bedcfa68 Merge pull request 'Completely omit id for anonymous objects' (#850) from Oneric/akkoma:ap-anonymous-errata into develop
Reviewed-on: https://akkoma.dev/AkkomaGang/akkoma/pulls/850
2025-01-03 15:23:03 +00:00
Norm
f19d5d1380 Set customize_hostname_check for Swoosh.Adapters.SMTP
This should hopefully fix issues with connecting to SMTP servers
with wildcard TLS certificates.

Taken from https://erlef.github.io/security-wg/secure_coding_and_deployment_hardening/ssl

Fixes https://akkoma.dev/AkkomaGang/akkoma/issues/660
2024-12-18 14:37:27 -05:00
nopjmp
7632765b43 Only proxy HTTP and HTTPS urls via Media Proxy
We make an assumption that we are only proxying HTTP/HTTPS hosted
media through the media proxy endpoint.

Fixes: #859
2024-12-16 20:35:12 -06:00
Oneric
294de939cb signing_key: refactor nested case into with statement
The error branches were already effectively identical before.
This change is purely cosmetic.
2024-12-08 20:43:57 +00:00
Haelwenn (lanodan) Monnier
2b1a252cc7 User: truncate remote user fields instead of rejecting 2024-11-26 09:29:44 +00:00
Oneric
416aebb76a Fix NodeInfo content-type
Fixes: https://akkoma.dev/AkkomaGang/akkoma/issues/852
2024-11-19 19:25:31 +01:00
Oneric
932810c35e mrf/object_age: fix handling of non-public objects
Current logic unconditionally adds public addressing to "cc"
and follower addressing to "to" after attempting to strip it
from the other one. This creates serious problems:

First, the bug prompting this investigation and fix:
unconditional addition creates duplicates when addressing
URIs already were in their intended final field; e.g.
this is prominently the case for all "unlisted" posts.
Since List.delete only removes the first occurrence,
this then broke follower-address stripping later on
making the policy ineffective.

It’s also just not safe in general wrt non-public addressing:
e.g. pre-existing duplicates didn’t get fully stripped,
bespoke addressing modes with only one of public addressing
or follower addressing are mangled, and most importantly:
any belatedly received DM or follower-only post
also got public addressing added!
Shockingly this last point was actually asserted as "correct" in tests;
it appears to be a mistake from mindless match adjustments
while fixing crashes on nil addressing in
10c792110e.

Clean this sloppy logic up, making sure no more duplicates are
added by us, all instances of relevant addresses are purged and only
readded when they actually existed to begin with.
2024-11-17 00:44:51 +01:00
Oneric
0f9c9aac38 Completely omit id for anonymous objects
Current AP spec demands anonymous objects to have an id value,
but explicitly set it to JSON null. However, as it turns out this is
incompatible with JSON-LD requiring `@id` to be a string and thus AP
spec is incompatible with the Activity Streams spec it is based on.
This is an issue for (the few) AP implementers actually performing
JSON-LD processing, like IceShrimp.NET.
This was uncovered by IceShrimp.NET’s zotan due to our adoption of
anonymous objects for emoji in f101886709.

The issue is being discussed by W3C, and will most likely be resolved
via an erratum redefining anonymous objects to completely omit the id
field just like transient objects already do. See:
https://github.com/w3c/activitypub/issues/476

Fixes: https://akkoma.dev/AkkomaGang/akkoma/issues/848
2024-11-09 19:29:29 +01:00
Floatingghost
c0a99df06a Merge remote-tracking branch 'oneric/varfixes' into develop 2024-10-30 15:15:00 +00:00
Floatingghost
11c5838947 standardise local key id generation 2024-10-30 12:44:01 +00:00
Floatingghost
d330c57cda make sure we correctly match key objects 2024-10-26 08:42:07 +01:00