Commit graph

1334 commits

Author SHA1 Message Date
Floatingghost
ad52135bf5 Convert rich media backfill to oban task 2024-06-11 18:06:51 +01:00
Floatingghost
a360836ce3 fix oembed test 2024-06-09 21:17:12 +01:00
Mark Felder
765c7e98d2 Respect the TTL returned in OpenGraph tags 2024-06-09 17:36:15 +01:00
Mark Felder
bfe4152385 Increase the :max_body for Rich Media to 5MB
Websites are increasingly getting more bloated with tricks like inlining content (e.g., CNN.com) which puts pages at or above 5MB. This value may still be too low.
2024-06-09 17:34:29 +01:00
Mark Felder
5da9cbd8a5 RichMedia refactor
Rich Media parsing was previously handled on-demand with a 2 second HTTP request timeout and retained only in Cachex. Every time a Pleroma instance is restarted it will have to request and parse the data for each status with a URL detected. When fetching a batch of statuses they were processed in parallel to attempt to keep the maximum latency at 2 seconds, but often resulted in a timeline appearing to hang during loading due to a URL that could not be successfully reached. URLs which had images links that expire (Amazon AWS) were parsed and inserted with a TTL to ensure the image link would not break.

Rich Media data is now cached in the database and fetched asynchronously. Cachex is used as a read-through cache. When the data becomes available we stream an update to the clients. If the result is returned quickly the experience is almost seamless. Activities were already processed for their Rich Media data during ingestion to warm the cache, so users should not normally encounter the asynchronous loading of the Rich Media data.

Implementation notes:

- The async worker is a Task with a globally unique process name to prevent duplicate processing of the same URL
- The Task will attempt to fetch the data 3 times with increasing sleep time between attempts
- The HTTP request obeys the default HTTP request timeout value instead of 2 seconds
- URLs that cannot be successfully parsed due to an unexpected error receives a negative cache entry for 15 minutes
- URLs that fail with an expected error will receive a negative cache with no TTL
- Activities that have no detected URLs insert a nil value in the Cachex :scrubber_cache so we do not repeat parsing the object content with Floki every time the activity is rendered
- Expiring image URLs are handled with an Oban job
- There is no automatic cleanup of the Rich Media data in the database, but it is safe to delete at any time
- The post draft/preview feature makes the URL processing synchronous so the rendered post preview will have an accurate rendering

Overall performance of timelines and creating new posts which contain URLs is greatly improved.
2024-06-09 17:33:48 +01:00
Floatingghost
a924e117fd Add pool timeouts 2024-06-09 17:20:29 +01:00
Oneric
a3840e7d1f Raise minimum PostgreSQL version to 12
This lets us:
 - avoid issues with broken hash indices for PostgreSQL <10
 - drop runtime checks and legacy codepaths for <11 in db search
 - always enable custom query plans for performance optimisation

PostgreSQL 11 is already EOL since 2023-11-09, so
in theory everyone should already have moved on to 12 anyway.
2024-06-07 16:21:09 +02:00
Norm
8ae54b260a Remove remaining Dokku files 2024-04-29 13:45:58 -04:00
Oneric
5ee0fb18cb exiftool: make stripped tags configurable 2024-04-26 18:57:24 +02:00
Oneric
9598137d32 Drop base_url special casing in test env
61621ebdbc already explicitly added
the uploader base url to config/test.exs and it reduces differences
from prod.
2024-04-07 00:20:12 +02:00
FloatingGhost
61621ebdbc Add tests for extra warnings about media subdomains 2024-04-02 10:54:53 +01:00
Oneric
31f90bbb52 Register APNG MIME type
The newest git HEAD of MIME already knows about APNG, but this
hasn’t been released yet. Without this, APNG attachments from
remote posts won’t display as images in frontends.

Fixes: akkoma#657
2024-03-26 15:44:44 -01:00
Oneric
bcc528b2e2 Never automatically assign privileged content types
By mapping all extensions related to our custom privileged types
back to innocuous text/plain, our custom types will never automatically
be inserted which was one of the factors making impersonation possible.

Note, this does not invalidate the upload and emoji Content-Type
restrictions from previous commits. Apart from counterfeit AP objects
there are other payloads with standard types this protects against,
e.g. *.js Javascript payloads as used in prior frontend injections.
2024-03-18 22:33:10 -01:00
Oneric
0ec62acb9d Always insert Dedupe upload filter
This actually was already intended before to eradict all future
path-traversal-style exploits and to fix issues with some
characters like akkoma#610 in 0b2ec0ccee. However, Dedupe and
AnonymizeFilename got mixed up. The latter only anonymises the name
in Content-Disposition headers GET parameters (with link_name),
_not_ the upload path.

Even without Dedupe, the upload path is prefixed by an UUID,
so it _should_ already be hard to guess for attackers. But now
we actually can be sure no path shenanigangs occur, uploads
reliably work and save some disk space.

While this makes the final path predictable, this prediction is
not exploitable. Insertion of a back-reference to the upload
itself requires pulling off a successfull preimage attack against
SHA-256, which is deemed infeasible for the foreseeable futures.

Dedupe was already included in the default list in config.exs
since 28cfb2c37a, but this will get overridde by whatever the
config generated by the "pleroma.instance gen" task chose.

Upload+delete tests running in parallel using Dedupe might be flaky, but
this was already true before and needs its own commit to fix eventually.
2024-03-18 22:33:10 -01:00
Oneric
f7c9793542 Sanitise Content-Type of uploads
The lack thereof enables spoofing ActivityPub objects.

A malicious user could upload fake activities as attachments
and (if having access to remote search) trick local and remote
fedi instances into fetching and processing it as a valid object.

If uploads are hosted on the same domain as the instance itself,
it is possible for anyone with upload access to impersonate(!)
other users of the same instance.
If uploads are exclusively hosted on a different domain, even the most
basic check of domain of the object id and fetch url matching should
prevent impersonation. However, it may still be possible to trick
servers into accepting bogus users on the upload (sub)domain and bogus
notes attributed to such users.
Instances which later migrated to a different domain and have a
permissive redirect rule in place can still be vulnerable.
If — like Akkoma — the fetching server is overly permissive with
redirects, impersonation still works.

This was possible because Plug.Static also uses our custom
MIME type mappings used for actually authentic AP objects.

Provided external storage providers don’t somehow return ActivityStream
Content-Types on their own, instances using those are also safe against
their users being spoofed via uploads.

Akkoma instances using the OnlyMedia upload filter
cannot be exploited as a vector in this way — IF the
fetching server validates the Content-Type of
fetched objects (Akkoma itself does this already).

However, restricting uploads to only multimedia files may be a bit too
heavy-handed. Instead this commit will restrict the returned
Content-Type headers for user uploaded files to a safe subset, falling
back to generic 'application/octet-stream' for anything else.
This will also protect against non-AP payloads as e.g. used in
past frontend code injection attacks.

It’s a slight regression in user comfort, if say PDFs are uploaded,
but this trade-off seems fairly acceptable.

(Note, just excluding our own custom types would offer no protection
 against non-AP payloads and bear a (perhaps small) risk of a silent
 regression should MIME ever decide to add a canonical extension for
 ActivityPub objects)

Now, one might expect there to be other defence mechanisms
besides Content-Type preventing counterfeits from being accepted,
like e.g. validation of the queried URL and AP ID matching.
Inserting a self-reference into our uploads is hard, but unfortunately
*oma does not verify the id in such a way and happily accepts _anything_
from the same domain (without even considering redirects).
E.g. Sharkey (and possibly other *keys) seem to attempt to guard
against this by immediately refetching the object from its ID, but
this is easily circumvented by just uploading two payloads with the
ID of one linking to the other.

Unfortunately *oma is thus _both_ a vector for spoofing and
vulnerable to those spoof payloads, resulting in an easy way
to impersonate our users.

Similar flaws exists for emoji and media proxy.

Subsequent commits will fix this by rigorously sanitising
content types in more areas, hardening our checks, improving
the default config and discouraging insecure config options.
2024-03-18 22:33:10 -01:00
Oneric
8f1776a8a7 Purge leftovers from FollowBot MRF
It was dropped in 9db4c2429f
2024-02-19 23:13:05 +01:00
Oneric
3b0714c4fd Fix SimplePolicy blocking account updates
This fixes an oversight in e99e2407f3
which added background_removal as a possible SimplePolicy setting.
However, it did _not_ add a default value to the base config and
as it turns out instance_list doesn’t handle unset options well.

In effect this caused federating instances with SimplePolicy enabled
but background_removal not explicitly configured to always trip up for
outgoing account updates in check_background_removal (and incoming
updates from Sharkey).
For added ""fun"" this error was able to block account updates made
e.g. via /api/v1/accounts/update_credentials.

Tests were unaffected since they explicitly override
all relevant config options.

Set a default to avoid all this
(note to self: don’t forget next time, baka!)
2024-02-17 03:10:05 +01:00
Oneric
5f7d47dcb7 Drop obolete chat/shoutbox config options
Their functions were purged in 0f132b802d
2024-02-11 05:15:02 +01:00
FloatingGhost
6cb40bee26 Migrate to phoenix 1.7 (#626)
Closes #612

Co-authored-by: tusooa <tusooa@kazv.moe>
Reviewed-on: https://akkoma.dev/AkkomaGang/akkoma/pulls/626
Co-authored-by: FloatingGhost <hannah@coffee-and-dreams.uk>
Co-committed-by: FloatingGhost <hannah@coffee-and-dreams.uk>
2023-08-15 10:22:18 +00:00
FloatingGhost
e59fc0677b Update mime dep 2023-08-07 04:07:42 +01:00
FloatingGhost
98cb255d12 Support elixir1.15
OTP builds to 1.15

Changelog entry

Ensure policies are fully loaded

Fix :warn

use main branch for linkify

Fix warn in tests

Migrations for phoenix 1.17

Revert "Migrations for phoenix 1.17"

This reverts commit 6a3b2f15b74ea5e33150529385215b7a531f3999.

Oban upgrade

Add default empty whitelist

mix format

limit test to amd64

OTP 26 tests for 1.15

use OTP_VERSION tag

baka

just 1.15

Massive deps update

Update locale, deps

Mix format

shell????

multiline???

?

max cases 1

use assert_recieve

don't put_env in async tests

don't async conn/fs tests

mix format

FIx some uploader issues

Fix tests
2023-08-03 17:44:09 +01:00
XxXCertifiedForkliftDriverXxX
07b478dc49 Implement blocklists for MediaProxy 2023-06-26 15:18:31 +02:00
FloatingGhost
037f881187 Fix create processing in direct message disabled 2023-05-23 13:16:20 +01:00
FloatingGhost
f2b4e7f86b Merge branch 'develop' of akkoma.dev:AkkomaGang/akkoma into develop 2023-04-14 17:56:56 +01:00
FloatingGhost
522221f7fb Mix format 2023-04-14 17:56:34 +01:00
floatingghost
8c86a06ed1 Merge pull request 'Remove "default" image description' (#493) from ilja/akkoma:remove_default_image_description into develop
Reviewed-on: https://akkoma.dev/AkkomaGang/akkoma/pulls/493
2023-04-14 16:27:41 +00:00
FloatingGhost
4c9c959bb3 Merge branch 'develop' into frontend-switcher-9000 2023-04-14 16:56:10 +01:00
sadposter
0151ca1d52 Revert "Remove indexer plugin"
This reverts commit 1d94f2a424.
2023-03-29 03:32:30 +01:00
FloatingGhost
1d94f2a424 Remove indexer plugin 2023-03-29 01:59:19 +01:00
FloatingGhost
de64c6c54a add selection UI 2023-03-28 12:44:52 +01:00
FloatingGhost
4bbe9c8f5c Ship with hehe 2023-03-27 10:03:12 +01:00
FloatingGhost
f94e8a3713 add bubble visibility to description 2023-03-18 20:49:43 +00:00
FloatingGhost
dd44387f1a Add timeline visibility options 2023-03-17 15:33:28 +00:00
FloatingGhost
3d964a9970 Add frontend preference route 2023-03-12 23:24:07 +00:00
ilja
6c396fcab4 Remove "default" image description
When no image description is filled in, Pleroma allowed fallbacks.
Those were (based on a setting) either the filename, or a fixed description.
Neither are good options for image descriptions imo, so here we remove this.

Note that there's two tests removed who supposedly tested something else.
But examining closer, they didn't seem to test what they claimed to test,
so I removed them rather than try to "fix" them.
2023-03-12 08:42:33 +01:00
flisk
292f0444d0 update healthcheck route in locale string 2023-02-18 14:59:46 +01:00
Seirdy
676cc0d0d7
Make default outgoing-blocks setting off
This should help mitigate negative impacts related to block-retaliation
and block-circumvention when blocks become visible to the blocked party.
Instances interested in broadcasting blocks can turn this on if they
wish. This should have always been the default.

See also: https://akkoma.dev/AkkomaGang/akkoma-fe/pulls/274
2023-01-26 22:01:22 -08:00
FloatingGhost
07ccfafd92 Mix format 2022-12-19 13:07:29 +00:00
ilja
c092fc9fd6 Add translation module for Argos Translate (#351)
Argos Translate is a Python module for translation and can be used as a command line tool.

This is also the engine for LibreTranslate, for which we already have a module.
Here we can use the engine directly from our server without doing requests to a third party or having to install our own LibreTranslate webservice (obviously you do have to install Argos Translate).

One thing that's currently still missing from Argos Translate is auto-detection of languages (see <https://github.com/argosopentech/argos-translate/issues/9>). For now, when no source language is provided, we just return the text unchanged, supposedly translated from the target language. That way you get a near immediate response in pleroma-fe when clicking Translate, after which you can select the source language from a dropdown.

Argos Translate also doesn't seem to handle html very well. Therefore we give admins the option to strip the html before translating. I made this an option because I'm unsure if/how this will change in the future.

Co-authored-by: ilja <git@ilja.space>
Reviewed-on: https://akkoma.dev/AkkomaGang/akkoma/pulls/351
Co-authored-by: ilja <akkoma.dev@ilja.space>
Co-committed-by: ilja <akkoma.dev@ilja.space>
2022-12-19 13:06:39 +00:00
FloatingGhost
52d8183787 drop admin scopes on create app instead of rejecting 2022-12-17 23:14:49 +00:00
FloatingGhost
dcac8adb3d Add option to modify HTTP pool size 2022-12-16 18:33:00 +00:00
FloatingGhost
126f1ca69c increase rich media backoff time 2022-12-16 17:31:04 +00:00
Paul Dawson
eb9ef59d50 Remove legacy references to FE that is not officially supported 2022-12-16 08:08:00 -06:00
FloatingGhost
48d302a60f allow disabling prometheus entirely 2022-12-16 11:17:04 +00:00
FloatingGhost
f752126427 Remove quack, ensure adapter is finch 2022-12-11 23:22:35 +00:00
FloatingGhost
affc910372 Remove hackney/gun in favour of finch 2022-12-11 19:19:31 +00:00
FloatingGhost
6f83ae27aa extend reject MRF to check if originating instance is blocked 2022-12-09 19:57:29 +00:00
FloatingGhost
0eaec57d3f mix format 2022-12-09 10:24:38 +00:00
FloatingGhost
4e4bd24813 Add misskey markdown to format suggestions
Fixes #345
2022-12-07 15:39:19 +00:00
floatingghost
6b882a2c0b Purge Rejected Follow requests in daily task (#334)
Co-authored-by: FloatingGhost <hannah@coffee-and-dreams.uk>
Reviewed-on: https://akkoma.dev/AkkomaGang/akkoma/pulls/334
2022-12-03 23:17:43 +00:00