Commit graph

9481 commits

Author SHA1 Message Date
Floatingghost
57273754b7 we may as well handle (expires) as well 2024-06-17 22:30:14 +01:00
floatingghost
59bfdf2ca4 Merge pull request 'Add limit CLI flags to prune jobs' (#655) from Oneric/akkoma:prune-batch into develop
Reviewed-on: https://akkoma.dev/AkkomaGang/akkoma/pulls/655
2024-06-17 20:47:53 +00:00
Oneric
bf8f493ffd Remove proxy_remote vestiges
Ever since 364b6969eb
this setting wasn't used by the backend and a noop.
The stated usecase is better served by setting the base_url
to a local subdomain and using proxying in nginx/Caddy/...
2024-06-16 01:21:52 +02:00
Floatingghost
3b197503d2 me me stupid person 2024-06-15 15:30:02 +01:00
Floatingghost
c0b2bba55e revert subdomain change until i can look at why i did that 2024-06-15 15:14:42 +01:00
Floatingghost
4b765b1886 mix format 2024-06-15 15:06:28 +01:00
Floatingghost
cba2c5725f Filter emoji reaction accounts by domain blocks 2024-06-15 15:05:52 +01:00
Floatingghost
2b96c3b224 Update http-signatures dep, allow created header 2024-06-12 18:40:44 +01:00
floatingghost
b03edb4ff4 Merge pull request 'Fix StealEmoji’s max size check' (#793) from Oneric/akkoma:emojistealer_contentlength into develop
Reviewed-on: https://akkoma.dev/AkkomaGang/akkoma/pulls/793
2024-06-12 17:09:05 +00:00
Floatingghost
4d6fb43cbd No need to spawn() any more 2024-06-12 02:09:24 +01:00
Floatingghost
ad52135bf5 Convert rich media backfill to oban task 2024-06-11 18:06:51 +01:00
Floatingghost
9c5feb81aa fix tests 2024-06-09 21:26:29 +01:00
Floatingghost
a360836ce3 fix oembed test 2024-06-09 21:17:12 +01:00
Floatingghost
840c70c4fa remove prints 2024-06-09 18:52:09 +01:00
Floatingghost
c65379afea attempt to fix some tests 2024-06-09 18:45:38 +01:00
Floatingghost
16bed0562d Fix tests 2024-06-09 18:28:00 +01:00
Mark Felder
a801dd7b07 Fix module struct matching 2024-06-09 17:38:28 +01:00
Mark Felder
1e86da43f5 Credo 2024-06-09 17:38:24 +01:00
Mark Felder
411831458c Credo 2024-06-09 17:38:18 +01:00
Mark Felder
56463b2121 Fix compile warning
warning: "else" clauses will never match because all patterns in "with" will always match
  lib/pleroma/web/rich_media/parser/ttl/opengraph.ex:10
2024-06-09 17:38:12 +01:00
Mark Felder
2f5eb79473 Mastodon API: Remove deprecated GET /api/v1/statuses/:id/card endpoint
Removed back in 2019

https://github.com/mastodon/mastodon/pull/11213
2024-06-09 17:38:06 +01:00
Mark Felder
4746f98851 Fix broken Rich Media parsing when the image URL is a relative path 2024-06-09 17:36:28 +01:00
Mark Felder
765c7e98d2 Respect the TTL returned in OpenGraph tags 2024-06-09 17:36:15 +01:00
Floatingghost
4a3dd5f65e lost in cherry-pick 2024-06-09 17:34:41 +01:00
Mark Felder
bfe4152385 Increase the :max_body for Rich Media to 5MB
Websites are increasingly getting more bloated with tricks like inlining content (e.g., CNN.com) which puts pages at or above 5MB. This value may still be too low.
2024-06-09 17:34:29 +01:00
Mark Felder
5da9cbd8a5 RichMedia refactor
Rich Media parsing was previously handled on-demand with a 2 second HTTP request timeout and retained only in Cachex. Every time a Pleroma instance is restarted it will have to request and parse the data for each status with a URL detected. When fetching a batch of statuses they were processed in parallel to attempt to keep the maximum latency at 2 seconds, but often resulted in a timeline appearing to hang during loading due to a URL that could not be successfully reached. URLs which had images links that expire (Amazon AWS) were parsed and inserted with a TTL to ensure the image link would not break.

Rich Media data is now cached in the database and fetched asynchronously. Cachex is used as a read-through cache. When the data becomes available we stream an update to the clients. If the result is returned quickly the experience is almost seamless. Activities were already processed for their Rich Media data during ingestion to warm the cache, so users should not normally encounter the asynchronous loading of the Rich Media data.

Implementation notes:

- The async worker is a Task with a globally unique process name to prevent duplicate processing of the same URL
- The Task will attempt to fetch the data 3 times with increasing sleep time between attempts
- The HTTP request obeys the default HTTP request timeout value instead of 2 seconds
- URLs that cannot be successfully parsed due to an unexpected error receives a negative cache entry for 15 minutes
- URLs that fail with an expected error will receive a negative cache with no TTL
- Activities that have no detected URLs insert a nil value in the Cachex :scrubber_cache so we do not repeat parsing the object content with Floki every time the activity is rendered
- Expiring image URLs are handled with an Oban job
- There is no automatic cleanup of the Rich Media data in the database, but it is safe to delete at any time
- The post draft/preview feature makes the URL processing synchronous so the rendered post preview will have an accurate rendering

Overall performance of timelines and creating new posts which contain URLs is greatly improved.
2024-06-09 17:33:48 +01:00
Floatingghost
a924e117fd Add pool timeouts 2024-06-09 17:20:29 +01:00
Oneric
2180d068ae Raise log level for start failures 2024-06-07 16:21:21 +02:00
Oneric
a3840e7d1f Raise minimum PostgreSQL version to 12
This lets us:
 - avoid issues with broken hash indices for PostgreSQL <10
 - drop runtime checks and legacy codepaths for <11 in db search
 - always enable custom query plans for performance optimisation

PostgreSQL 11 is already EOL since 2023-11-09, so
in theory everyone should already have moved on to 12 anyway.
2024-06-07 16:21:09 +02:00
Oneric
df27567d99 mrf/steal_emoji: display download_unknown_size in admin-fe
Fixes omission in d6d838cbe8
2024-06-05 20:14:10 +02:00
Oneric
be5440c5e8 mrf/steal_emoji: fix size limit check
Headers are strings, but this expected to already get an int
thus always failing the comparison if the header was set.

Fixes mistake in d6d838cbe8
2024-06-05 20:11:53 +02:00
Floatingghost
0f65dd3ebe remove pointless logger 2024-06-04 14:34:59 +01:00
Floatingghost
38d09cb0ce remove now-pointless clause 2024-06-04 14:34:18 +01:00
Floatingghost
c9a03af7c1 Move rescue to the HTTP request itself 2024-06-04 14:30:16 +01:00
Floatingghost
0f7ae0fa21 am i baka 2024-06-04 14:26:33 +01:00
Floatingghost
30e13a8785 Don't error on rich media fail 2024-06-04 14:21:40 +01:00
Floatingghost
778b213945 enqueue pin fetches after changeset validation 2024-06-01 08:25:35 +01:00
Oneric
bed7ff8e89 mix: consistently use shell_info and shell_error
Logger output being visible depends on user configuration, but most of
the prints in mix tasks should always be shown. When running inside a
mix shell, it’s probably preferable to send output directly to it rather
than using raw IO.puts and we already have shell_* functions for this,
let’s use them everywhere.
2024-05-31 17:17:42 +02:00
Oneric
70cd5f91d8 dbprune/activites: prune array activities first
This query is less costly; if something goes wrong or gets aborted later
at least this part will arelady be done.
2024-05-31 17:16:40 +02:00
Oneric
aeaebb566c dbprune: allow splitting array and single activity prunes
The former is typically just a few reports; it doesn't make sense to
rerun it over and over again in batched prunes or if a full prune OOMed.
2024-05-31 17:16:40 +02:00
Oneric
5751637926 dbprune: use query! 2024-05-31 17:16:40 +02:00
Oneric
24bab63cd8 dbprune: add more logs
Pruning can go on for a long time; give admins some insight into that
something is happening to make it less frustrating and to make it easier
which part of the process is stalled should this happen.

Again most of the changes are merely reindents;
review with whitespace changes hidden recommended.
2024-05-31 17:16:40 +02:00
Oneric
1d4c212441 dbprune: shortcut array activity search
This brought down query costs from 7,953,740.90 to 47,600.97
2024-05-31 17:16:40 +02:00
Oneric
225f87ad62 Also allow limiting the initial prune_object
May sometimes be helpful to get more predictable runtime
than just with an age-based limit.

The subquery for the non-keep-threads path is required
since delte_all does not directly accept limit().

Again most of the diff is just adjusting indentation, best
hide whitespace-only changes with git diff -w or similar.
2024-05-31 17:16:40 +02:00
Oneric
e64f031167 Log number of deleted rows in prune_orphaned_activities
This gives feedback when to stop rerunning limited batches.

Most of the diff is just adjusting indentation; best reviewed
with whitespace-only changes hidden, e.g. `git diff -w`.
2024-05-31 17:16:40 +02:00
Oneric
fa52093bac Add standalone prune_orphaned_activities CLI task
This part of pruning can be very expensive and bog down the whole
instance to an unusable sate for a long time. It can thus be desireable
to split it from prune_objects and run it on its own in smaller limited batches.

If the batches are smaller enough and spaced out a bit, it may even be possible
to avoid any downtime. If not, the limit can still help to at least make the
downtime duration somewhat more predictable.
2024-05-31 17:16:40 +02:00
Oneric
3126d15ffc refactor: move prune_orphaned_activities into own function
No logic changes. Preparation for standalone orphan pruning.
2024-05-31 17:16:39 +02:00
floatingghost
8f97c15b07 Merge pull request 'Preserve Meilisearch’s result ranking' (#772) from Oneric/akkoma:search-meili-order into develop
Reviewed-on: https://akkoma.dev/AkkomaGang/akkoma/pulls/772
2024-05-31 14:12:05 +00:00
Floatingghost
3af0c53a86 use proper workers for fetching pins instead of an ad-hoc task (#788)
Reviewed-on: https://akkoma.dev/AkkomaGang/akkoma/pulls/788
Co-authored-by: Floatingghost <hannah@coffee-and-dreams.uk>
Co-committed-by: Floatingghost <hannah@coffee-and-dreams.uk>
2024-05-31 08:58:52 +00:00
Oneric
fc7e07f424 meilisearch: enable using search_key
Using only the admin key works as well currently
and Akkoma needs to know the admin key to be able
to add new entries etc. However the Meilisearch
key descriptions suggest the admin key is not
supposed to be used for searches, so let’s not.

For compatibility with existings configs, search_key remains optional.
2024-05-29 23:17:27 +00:00
Oneric
59685e25d2 meilisearch: show keys by name not description
This makes show-key’s output match our documentation as of Meilisearch
1.8.0-8-g4d5971f343c00d45c11ef0cfb6f61e83a8508208. Since I’m not sure
if older versions maybe only provided description, it will fallback to
the latter if no name parameter exists.
2024-05-29 23:17:27 +00:00
Oneric
65aeaefa41 meilisearch: respect meili’s result ranking
Meilisearch is already configured to return results sorted by a
particular ranking configured in the meilisearch CLI task.
Resorting the returned top results by date partially negates this and
runs counter to what someone with tweaked settings expects.

Issue and fix identified by AdamK2003 in
https://akkoma.dev/AkkomaGang/akkoma/pulls/579
But instead of using a O(n^2) resorting, this commit directly
retrieves results in the correct order from the database.

Closes: https://akkoma.dev/AkkomaGang/akkoma/pulls/579
2024-05-29 23:17:27 +00:00
Oneric
5d6cb6a459 meilisearch: remove duplicate preload 2024-05-29 23:17:27 +00:00
floatingghost
5bdef8c724 Merge pull request 'Allow for attachment to be a single object in user data' (#783) from single-attachment into develop
Reviewed-on: https://akkoma.dev/AkkomaGang/akkoma/pulls/783
2024-05-27 01:44:53 +00:00
Floatingghost
f15eded3e1 Add extra test case for nonsense field, increase timeouts 2024-05-27 02:09:48 +01:00
Floatingghost
da67e69af5 Allow for attachment to be a single object in user data 2024-05-26 17:09:26 +01:00
Norm
c2d3221be3 Fix Exiftool stderr being read as an image description
Fixes: https://akkoma.dev/AkkomaGang/akkoma/issues/773
2024-05-23 14:44:17 -04:00
Floatingghost
b72127b45a Merge remote-tracking branch 'oneric-sec/media-owner' into develop 2024-05-22 19:36:10 +01:00
Oneric
9a91299f96 Don't try to handle non-media objects as media
Trying to display non-media as media crashed the renderer,
but when posting a status with a valid, non-media object id
the post was still created, but then crashed e.g. timeline rendering.
It also crashed C2S inbox reads, so this could not be used to leak
private posts.
2024-05-22 20:30:23 +02:00
Oneric
fbd961c747 Drop activity_type override for uploads
Afaict this was never used, but keeping this (in theory) possible
hinders detecting which objects are actually media uploads and
which proper ActivityPub objects.

It was originally added as part of upload support itself in
02d3dc6869 without being used
and `git log -S:activity_type` and `git log -Sactivity_type:`
don't find any other commits using this.
2024-05-22 20:30:23 +02:00
Oneric
0c2b33458d Restrict media usage to owners
In Mastodon media can only be used by owners and only be associated with
a single post. We currently allow media to be associated with several
posts and until now did not limit their usage in posts to media owners.
However, media update and GET lookup was already limited to owners.
(In accordance with allowing media reuse, we also still allow GET
lookups of media already used in a post unlike Mastodon)

Allowing reuse isn’t problematic per se, but allowing use by non-owners
can be problematic if media ids of private-scoped posts can be guessed
since creating a new post with this media id will reveal the uploaded
file content and alt text.
Given media ids are currently just part of a sequentieal series shared
with some other objects, guessing media ids is with some persistence
indeed feasible.

E.g. sampline some public media ids from a real-world
instance with 112 total and 61 monthly-active users:

  17.465.096  at  t0
  17.472.673  at  t1 = t0 + 4h
  17.473.248  at  t2 = t1 + 20min

This gives about 30 new ids per minute of which most won't be
local media but remote and local posts, poll answers etc.
Assuming the default ratelimit of 15 post actions per 10s, scraping all
media for the 4h interval takes about 84 minutes and scraping the 20min
range mere 6.3 minutes. (Until the preceding commit, post updates were
not rate limited at all, allowing even faster scraping.)
If an attacker can infer (e.g. via reply to a follower-only post not
accessbile to the attacker) some sensitive information was uploaded
during a specific time interval and has some pointers regarding the
nature of the information, identifying the specific upload out of all
scraped media for this timerange is not impossible.

Thus restrict media usage to owners.

Checking ownership just in ActivitDraft would already be sufficient,
since when a scheduled status actually gets posted it goes through
ActivityDraft again, but would erroneously return a success status
when scheduling an illegal post.

Independently discovered and fixed by mint in Pleroma
1afde067b1
2024-05-22 20:30:18 +02:00
marcin mikołajczak
3a21293970 Fix tests
Signed-off-by: marcin mikołajczak <git@mkljczk.pl>
2024-05-22 19:27:31 +01:00
marcin mikołajczak
0d66237205 Fix validate_webfinger when running a different domain for Webfinger
Signed-off-by: marcin mikołajczak <git@mkljczk.pl>
2024-05-22 19:20:02 +01:00
Oneric
6ef6b2a289 Apply rate limits to status updates 2024-05-22 20:18:08 +02:00
Oneric
94e9c8f48a Purge unused media description update on post
In MastoAPI media descriptions are updated via the
media update API not upon post creation or post update.

This functionality was originally added about 6 years ago in
ba93396649 which was part of
https://git.pleroma.social/pleroma/pleroma/-/merge_requests/626 and
https://git.pleroma.social/pleroma/pleroma-fe/-/merge_requests/450.
They introduced image descriptions to the front- and backend,
but predate adoption of Mastodon API.

For a while adding an `descriptions` array on post creation might have
continued to work as an undocumented Pleroma extension to Masto API, but
at latest when OpenAPI specs were added for those endpoints four years
ago in 7803a85d2c, these codepaths ceased
to be used. The API specs don’t list a `descriptions` parameter and
any unknown parameters are stripped out.

The attachments_from_ids function is only called from
ScheduledActivity and ActivityDraft.create with the latter
only being called by CommonAPI.{post,update} whihc in turn
are only called from ScheduledActivity again, MastoAPI controller
and without any attachment or description parameter WelcomeMessage.
Therefore no codepath can contain a descriptions parameter.
2024-05-22 20:18:08 +02:00
Oneric
873aa9da1c activity_draft: mark new/2 as private 2024-05-22 20:18:08 +02:00
Oneric
34a48cb87f scheduled_activity: mark private functions as private
And remove unused due_activities/1
2024-05-22 20:18:08 +02:00
Alex Gleason
a953b1d927 Prevent spoofing webfinger 2024-05-22 19:08:37 +01:00
floatingghost
76ded10a70 Merge pull request 'Backoff on HTTP requests when 429 is recieved' (#762) from backoff-http into develop
Reviewed-on: https://akkoma.dev/AkkomaGang/akkoma/pulls/762
2024-05-11 04:38:47 +00:00
Floatingghost
4457928e32 duct-tape fix for #438
we really need to make this less manual
2024-05-11 05:30:18 +01:00
Floatingghost
bd74693db6 additionally support retry-after values 2024-05-06 23:34:48 +01:00
Floatingghost
010e8c7bb2 where were you when lint fail 2024-04-26 19:28:01 +01:00
Floatingghost
f531484063 Merge branch 'develop' into backoff-http 2024-04-26 19:06:18 +01:00
Floatingghost
ec7e9da734 Correct ttl syntax for new cachex 2024-04-26 19:05:12 +01:00
FloatingGhost
3c384c1b76 Add ratelimit backoff to HTTP get 2024-04-26 19:01:12 +01:00
FloatingGhost
2437a3e9ba add test for backoff 2024-04-26 19:01:01 +01:00
FloatingGhost
ad7dcf38a8 Add HTTP backoff cache to respect 429s 2024-04-26 19:00:35 +01:00
Floatingghost
828158ef49 Merge remote-tracking branch 'oneric/fedfix-public-ld' into develop 2024-04-26 18:49:31 +01:00
Oneric
5ee0fb18cb exiftool: make stripped tags configurable 2024-04-26 18:57:24 +02:00
Oneric
a95af3ee4c exiftool: strip all non-essential tags
Documentation was already clear on this only stripping GPS tags.
But there are more potentially sensitive metadata tags (e.g. author
and possibly description) and the name alone suggests a broader effect.

Thus change the filter to strip all metadata except for colourspace info
and orientation (technically it strips everything and then readds
selected tags).

Explicitly stripping CommonIFD0 is needed since -all does not modify
IFD0 due to TIFF storing some actual image data there. CommonIFD0 then
strips a bunch of commonly used actual metadata tags from IFD0, to my
understanding leaving TIFF image data and custom metadata tags intact.
2024-04-25 23:00:42 +02:00
Oneric
163cb1d5e0 exiftool: strip JXL and HEIC
As of exiftool 12.57 both formats are supported, but EXIF data is
optional for JXL and if exiftool doesn’t find a preexisting metadata
chunk it will create one and treat it as a minor error resulting in
a non-zero exit code.
Setting -ignoreMinorErrors avoids failing on such uploads.
2024-04-25 23:00:42 +02:00
Oneric
b0a46c1e2e Normalise public adressing to fix federation
Due to JSON-LD compaction the full address of public scope
may also occur in shorter forms and the spec requires us to treat them
all equivalently. To save us the pain of repeatedly checking for all
variants internally, normalise inbound data to just one form.
See note at: https://www.w3.org/TR/activitypub/#public-addressing

This needs to happen very early, even before the other addressing fixes
else an earlier validator will reject the object. This in turn required
to move the list-tpye normalisation earlier as well, but since I was
unsure about putting empty lists into the data when no such field
existed before, I excluded this case and thus the later fixing had to be
kept as well.

Fixes: https://akkoma.dev/AkkomaGang/akkoma/issues/670
2024-04-25 18:45:16 +02:00
floatingghost
b1c6621e66 Merge pull request 'Read image description from EXIF data' (#744) from timorl/akkoma:elseinspe into develop
Reviewed-on: https://akkoma.dev/AkkomaGang/akkoma/pulls/744
2024-04-25 12:52:31 +00:00
floatingghost
764dbeded4 Merge pull request 'Accept all standard actor types' (#751) from Oneric/akkoma:all-actor-types into develop
Reviewed-on: https://akkoma.dev/AkkomaGang/akkoma/pulls/751
2024-04-24 17:09:02 +00:00
floatingghost
80e1c094c7 Merge pull request 'Don't strip newlines in pre' (#709) from snan/akkoma:pre into develop
Reviewed-on: https://akkoma.dev/AkkomaGang/akkoma/pulls/709
2024-04-24 17:00:34 +00:00
floatingghost
4a0e90e8a8 Merge pull request 'ReceiverWorker: Make sure non-{:ok, _} is returned as {:error, …}' (#753) from Oneric/akkoma:receive-worker-return into develop
Reviewed-on: https://akkoma.dev/AkkomaGang/akkoma/pulls/753
2024-04-24 17:00:18 +00:00
Oneric
83f75c3e93 Accept all standard actor types 2024-04-23 18:14:34 +02:00
Floatingghost
92168fa5a1 Merge remote-tracking branch 'origin/develop' into who-wants-to-yeet-c2s-i-want-to-yeet-c2s 2024-04-23 14:37:05 +01:00
Floatingghost
3e199242b0 remove upload_media from AP representation 2024-04-23 14:35:52 +01:00
Haelwenn (lanodan) Monnier
0c2f200b4d ReceiverWorker: Make sure non-{:ok, _} is returned as {:error, …}
Otherwise an error like `{:signature, {:error, {:error, :not_found}}}`
ends up considered a success.

Cherry-picked-from: a299ddb10e
2024-04-21 20:58:06 +02:00
timorl
09d3ccf770
Read description before stripping metadata 2024-04-19 20:51:54 +02:00
timorl
9da0fe930e
Format, but this time with a non-ancient version of elixir 2024-04-19 18:07:50 +02:00
timorl
2a9db73b4c
Merge branch 'develop' into elseinspe 2024-04-19 17:11:55 +02:00
Floatingghost
370576474c only consider :op and :id args in duplicate checks 2024-04-19 11:39:27 +01:00
Floatingghost
1ed975636b Keep READ endpoints, purge WRITE 2024-04-19 11:06:01 +01:00
timorl
cd7af81896
Rename StripLocation to StripMetadata for temporal-proofing reasons 2024-04-16 20:37:00 +02:00
Floatingghost
ddb8a5ef73 yeet AP C2S support
literally nothing uses C2S AP, and it's another route into core
systems which requires analysis and maintenance. A second API
is just extra surface for potentially bad things so let's take
it out back and obliterate it
2024-04-16 13:55:03 +01:00
Floatingghost
123db1abc4 Merge branch 'develop' into failed-fetch-processing 2024-04-16 12:35:54 +01:00
Floatingghost
b2c29527fb make xmerl shut up about markup 2024-04-16 10:19:30 +01:00
timorl
59d32c10d9
Formatting 2024-04-16 08:02:13 +02:00
Floatingghost
d2cee15c15 mix format says no 2024-04-16 03:07:28 +01:00
Floatingghost
d70fa16383 oban options should be a keyword list 2024-04-16 02:58:50 +01:00
Floatingghost
5043571084 Enable oban job uniqueness
by default just prevent job floods with a 1-seconds
uniqueness check, but override in RemoteFetcherWorker
for 5 minute uniqueness check over all states

:infinity is an option we can go for maybe at some point,
but that would prevent any refetches so maybe not idk.
2024-04-16 02:53:24 +01:00
Floatingghost
b7dd739de1 Make sure we return the right format for oban 2024-04-16 02:35:21 +01:00
timorl
b144218dce
Merge branch 'develop' into elseinspe 2024-04-14 20:31:33 +02:00
Floatingghost
2fc25980d1 fix pattern matching in fetch errors 2024-04-13 23:55:26 +01:00
floatingghost
c1f0b6b875 Merge pull request 'Accept body parameters for /api/pleroma/notification_settings' (#738) from Oneric/akkoma:notif-setting-parameters into develop
Reviewed-on: https://akkoma.dev/AkkomaGang/akkoma/pulls/738
2024-04-13 22:55:02 +00:00
Floatingghost
33fb74043d Bring our adjustments into line with atom-failure 2024-04-13 22:56:04 +01:00
Floatingghost
49ed27cd96 require logger 2024-04-13 22:25:31 +01:00
Mark Felder
2e369aef71 Allow the Remote Fetcher to attempt fetching an unreachable instance 2024-04-12 20:33:21 +01:00
Mark Felder
fed7a78c77 Oban jobs should be discarded on permanent errors 2024-04-12 20:33:17 +01:00
Mark Felder
c0532bcae0 Handle 401s as I have observed it in the wild 2024-04-12 20:33:11 +01:00
Mark Felder
ff515c05c3 Prevent requeuing Remote Fetcher jobs that exceed thread depth 2024-04-12 20:32:31 +01:00
Mark Felder
7e5004b3e2 Leverage existing atoms as return errors for the object fetcher 2024-04-12 20:32:13 +01:00
Mark Felder
53a9413b95 Formatting 2024-04-12 20:31:40 +01:00
Mark Felder
d69cba1b93 Remove duplicate log messages from Transmogrifier
Object fetch errors are logged in the fetcher module
2024-04-12 20:31:31 +01:00
Mark Felder
3c54f407c5 Conslidate log messages for object fetcher failures and leverage Logger.metadata 2024-04-12 20:30:38 +01:00
Mark Felder
825ae46bfa Set Logger level to error 2024-04-12 20:29:33 +01:00
Mark Felder
eeed051a0f Fix detection of user follower collection being private
We were overzealous with matching on a raw error from the object fetch that should have never been relied on like this. If we can't fetch successfully we should assume that the collection is private.

Building a more expressive and universal error struct to match on may be something to consider.
2024-04-12 20:29:11 +01:00
Mark Felder
30d63aaa6e Revert "Mark instances as unreachable when returning a 403 from an object fetch"
This reverts commit d472bafec19cee269e7c943bafae7c805785acd7.
2024-04-12 20:28:56 +01:00
Mark Felder
e2b04fac5a Skip remote fetch jobs for unreachable instances 2024-04-12 20:28:36 +01:00
Mark Felder
6d368808d3 Remove mistaken duplicate fetch 2024-04-12 20:28:31 +01:00
Mark Felder
132036f951 Cancel remote fetch jobs for deleted objects 2024-04-12 20:28:21 +01:00
Mark Felder
4ff22a409a Consolidate the HTTP status code checking into the private get_object/1 2024-04-12 20:28:16 +01:00
Mark Felder
4c29366fe5 Mark instances as unreachable when returning a 403 from an object fetch
This is a definite sign the instance is blocked and they are enforcing authorized_fetch
2024-04-12 20:27:33 +01:00
Mark Felder
ac4cc619ea Fix Transmogrifier tests
These tests relied on the removed Fetcher.fetch_object_from_id!/2 function injecting the error tuple into a log message with the exact words "Object containment failed."

We will keep this behavior by generating a similar log message, but perhaps this should do a better job of matching on the error tuple returned by Transmogrifier.handle_incoming/1
2024-04-12 20:26:56 +01:00
Mark Felder
c241b5b09f Remove Fetcher.fetch_object_from_id!/2
It was only being called once and can be replaced with a case statement.
2024-04-12 20:26:28 +01:00
floatingghost
6f3c955aa0 Merge pull request 'elixir1.16 testing' (#742) from elixir1.16 into develop
Reviewed-on: https://akkoma.dev/AkkomaGang/akkoma/pulls/742
2024-04-12 18:49:33 +00:00
floatingghost
024ffadd80 Merge pull request 'Don't list old accounts as aliases in WebFinger' (#713) from erincandescent/akkoma:no-old-account-alias into develop
Reviewed-on: https://akkoma.dev/AkkomaGang/akkoma/pulls/713
2024-04-12 18:34:14 +00:00
floatingghost
e2e4f53585 Merge pull request 'Use standard-compliant Accept header when fetching' (#740) from Oneric/akkoma:fetch_std-accept-hdr into develop
Reviewed-on: https://akkoma.dev/AkkomaGang/akkoma/pulls/740
2024-04-12 18:28:26 +00:00
Floatingghost
df25d86999 Cleaned up FEP-fffd commits a bit 2024-04-12 18:50:57 +01:00
floatingghost
4887df12d7 Merge pull request 'Allow for url to be a list' (#718) from helge/akkoma:develop into develop
Reviewed-on: https://akkoma.dev/AkkomaGang/akkoma/pulls/718
2024-04-12 17:39:38 +00:00
floatingghost
e6ca2b4d2a Merge pull request 'Fix array-less EmojiReacts' (#739) from Oneric/akkoma:tag-arrayless into develop
Reviewed-on: https://akkoma.dev/AkkomaGang/akkoma/pulls/739
2024-04-12 17:26:07 +00:00
floatingghost
6ba80aaff5 Merge pull request 'Check if data is visible before embedding it in OG tags' (#741) from ograph-restrictions into develop
Reviewed-on: https://akkoma.dev/AkkomaGang/akkoma/pulls/741
2024-04-12 17:22:59 +00:00
floatingghost
8e60177466 Merge pull request 'MRF.InlineQuotePolicy: Add link to post URL, not ID' (#733) from erincandescent/akkoma:quote-url into develop
Reviewed-on: https://akkoma.dev/AkkomaGang/akkoma/pulls/733
2024-04-12 17:02:52 +00:00
Erin Shepherd
75d9e2b375 MRF.InlineQuotePolicy: Add link to post URL, not ID
"id" is used for the canonical link to the AS2 representation of an object.
"url" is typically used for the canonical link to the HTTP representation.
It is what we use, for example, when following the "external source" link
in the frontend. However, it's not the link we include in the post contents
for quote posts.

Using URL instead means we include a more user-friendly URL for Mastodon,
and a working (in the browser) URL for Threads
2024-04-12 13:23:50 +02:00
Floatingghost
05f8179d08 check if data is visible before embedding it in OG tags
previously we would uncritically take data and format it into
tags for static-fe and the like - however, instances can be
configured to disallow unauthenticated access to these resources.

this means that OG tags as a vector for information leakage.

_technically_ this should only occur if you have both
restrict_unauthenticated *AND* you run static-fe, which makes no
sense since static-fe is for unauthenticated people in particular,
but hey ho.
2024-04-12 05:16:47 +01:00
Oneric
fae0a14ee8 Use standard-compliant Accept header when fetching
Spec says clients MUST use this header and servers MUST respond to it,
while servers merely SHOULD respond to the one we used before.
https://www.w3.org/TR/activitypub/#retrieving-objects

The old value is kept as a fallback since at least two years ago
not every implementation correctly dealt with the spec-compliant
variant, see: https://github.com/owncast/owncast/issues/1827

Fixes: https://akkoma.dev/AkkomaGang/akkoma/issues/730
2024-04-12 00:22:37 +02:00
Floatingghost
1135935cbe Merge remote-tracking branch 'oneric/ipv6' into develop 2024-04-11 20:59:49 +01:00
Oneric
bd74ad9ce4 Accept body parameters for /api/pleroma/notification_settings
This brings it in line with its documentation and akkoma-fe’s
expectations. For backwards compatibility URL parameters are still
accept with lower priority. Unfortunately this means duplicating
parameters and descriptions in the API spec.

Usually Plug already pre-merges parameters from different sources into
the plain 'params' parameter which then gets forwarded by Phoenix.
However, OpenApiSpex 3.x prevents this; 4.x is set to change this
  https://github.com/open-api-spex/open_api_spex/issues/334
  https://github.com/open-api-spex/open_api_spex/issues/92

Fixes: https://akkoma.dev/AkkomaGang/akkoma/issues/691
Fixes: https://akkoma.dev/AkkomaGang/akkoma/issues/722
2024-04-09 04:11:28 +02:00
Oneric
462225880a Accept EmojiReacts with non-array tag
JSON-LD compaction strips the array since it’s just one object

Fixes: https://akkoma.dev/AkkomaGang/akkoma/issues/720
2024-04-09 04:04:16 +02:00
Oneric
9598137d32 Drop base_url special casing in test env
61621ebdbc already explicitly added
the uploader base url to config/test.exs and it reduces differences
from prod.
2024-04-07 00:20:12 +02:00
floatingghost
554f19a9ed Merge pull request 'Refresh Users much more aggressively when processing Move activities' (#714) from erincandescent/akkoma:move-bust-cache into develop
Reviewed-on: https://akkoma.dev/AkkomaGang/akkoma/pulls/714
2024-04-03 10:03:14 +00:00
FloatingGhost
b5d97e7d85 Don't error out if we're not using the local uploader 2024-04-02 11:36:26 +01:00
FloatingGhost
f592090206 Fix tests that relied on no base_url in the uploader 2024-04-02 11:23:57 +01:00
FloatingGhost
61621ebdbc Add tests for extra warnings about media subdomains 2024-04-02 10:54:53 +01:00
FloatingGhost
4cd299bd83 Add extra warnings if the uploader is on the same domain as the main application 2024-04-02 10:20:59 +01:00
Erin Shepherd
464db9ea0b Don't list old accounts as aliases in WebFinger
Per the XRD specification:

> 2.4. Element <Alias>
>
> The <Alias> element contains a URI value that is an additional
> identifier for the resource described by the XRD. This value
> MUST be an absolute URI. The <Alias> element does not identify
> additional resources the XRD is describing, **but rather provides
> additional identifiers for the same resource.**

(http://docs.oasis-open.org/xri/xrd/v1.0/os/xrd-1.0-os.html#element.alias, emphasis mine)

In other words, the alias list is expected to link to things which are
not just semantically the same, but exactly the same. Old user accounts
don't do that

This change should not pose a compatibility issue: Mastodon does not
list old accounts here (See e1fcb02867/app/serializers/webfinger_serializer.rb (L12))

The use of as:alsoKnownAs is also not quite semantically right here
(see https://www.w3.org/TR/did-core/#dfn-alsoknownas, which defines
it to be used to refer to identifiers which are interchangable) but
that's what DID get for reusing a property definition that Mastodon
already squatted long before they got to it
2024-04-01 13:34:58 +02:00
FloatingGhost
2d439034ca Ensure that spoof-inserted does not time out 2024-03-30 12:55:22 +00:00
Oneric
0648d9ebaa Add mix tasks to detect spoofed posts and users
At least as far as we can
2024-03-26 16:05:20 -01:00