federation/out: tweak publish retry backoff

With the current strategy the individual and cumulative backoff
looks like this (the + part denotes max extra random delay):

attempt  backoff_single        cumulative
   1       16+30                  16+30
   2       47+60                  63+90
   3      243+90   ≈ 4min        321+180
   4     1024+120  ≈17min       1360+300  ≈23+5min
   5     3125+150  ≈52min       4500+450  ≈75+8min
   6     7776+180  ≈ 2.1h      12291+630  ≈3.4h
   7    16807+210  ≈ 4.6h      29113+840  ≈8h
   8    32768+240  ≈ 9.1h      61896+1080 ≈17h
   9    59049+270  ≈16.4h     120960+1350 ≈33h
  10   100000+300  ≈27.7h     220975+1650 ≈61h

We default to 5 retries, meaning the last backoff runs with attempt=4.
Therefore outgoing activities might already be permanently dropped by a
downtime of only 23 minutes, which doesn't seem too implausible to occur.
Furthermore it seems excessive to retry this quickly this often at the
beginning.
At the same time, we'd like to have at least one quick-ish retry to deal
with transient issues and maintain reasonable federation responsiveness.

If an admin wants to tolerate one-day downtime of remotes,
retries need to be almost doubled.

The new backoff strategy implemented in this commit instead
switches to an exponential backoff after a few initial attempts:

attempt  backoff_single        cumulative
   1       16+30                  16+30
   2      143+60                 159+90
   3     2202+90   ≈37min       2361+180  ≈40min
   4     8160+120  ≈ 2.3h      10521+300  ≈ 3h
   5    77393+150  ≈21.5h      87914+450  ≈24h

Initial retries are still fast, but the same amount of retries
now allows a remote downtime of at least 40 minutes. Customising
the retry count to 5 allows for whole-day downtimes.
2025-03-17 19:37:54 +01:00
commit 4011d20dbe
parent 74182abb5b
2 changed files with 14 additions and 1 deletions
--- a/lib/pleroma/workers/publisher_worker.ex
+++ b/lib/pleroma/workers/publisher_worker.ex
@@ -9,7 +9,11 @@ defmodule Pleroma.Workers.PublisherWorker do
   use Pleroma.Workers.WorkerHelper, queue: "federator_outgoing"
 
   def backoff(%Job{attempt: attempt}) when is_integer(attempt) do
-    Pleroma.Workers.WorkerHelper.sidekiq_backoff(attempt, 5)
+    if attempt > 3 do
+      Pleroma.Workers.WorkerHelper.exponential_backoff(attempt, 9.5)
+    else
+      Pleroma.Workers.WorkerHelper.sidekiq_backoff(attempt, 6)
+    end
   end
 
   @impl Oban.Worker
--- a/lib/pleroma/workers/worker_helper.ex
+++ b/lib/pleroma/workers/worker_helper.ex
@@ -22,6 +22,15 @@ def sidekiq_backoff(attempt, pow \\ 4, base_backoff \\ 15) do
     trunc(backoff)
   end
 
+  def exponential_backoff(attempt, base, base_backoff \\ 15) do
+    backoff =
+      :math.pow(base, attempt) +
+        base_backoff +
+        :rand.uniform(2 * base_backoff) * attempt
+
+    trunc(backoff)
+  end
+
   defmacro __using__(opts) do
     caller_module = __CALLER__.module
     queue = Keyword.fetch!(opts, :queue)
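
The tables in the commit message can be sanity-checked with a small standalone sketch. It assumes only the formulas visible in the diff and the 15-second default for base_backoff. The module BackoffSketch and its function names are hypothetical and not part of Pleroma. The random jitter term is left out of the delay functions so that the results are deterministic; its upper bound is computed separately.

```elixir
defmodule BackoffSketch do
  @moduledoc false
  # Hypothetical helper, not part of Pleroma. Mirrors the formulas from
  # the diff but drops the random jitter to keep results reproducible.

  @base_backoff 15

  # Fixed part of sidekiq_backoff/3: attempt^pow + base_backoff
  def polynomial_fixed(attempt, pow) do
    trunc(:math.pow(attempt, pow) + @base_backoff)
  end

  # Fixed part of exponential_backoff/3: base^attempt + base_backoff
  def exponential_fixed(attempt, base) do
    trunc(:math.pow(base, attempt) + @base_backoff)
  end

  # Upper bound of the jitter term :rand.uniform(2 * base_backoff) * attempt
  def max_jitter(attempt), do: 2 * @base_backoff * attempt

  # Running total of the fixed delays over a range of attempts
  def cumulative(attempts, fun) do
    attempts
    |> Enum.map(fun)
    |> Enum.scan(&(&1 + &2))
  end
end

# Old strategy (power 5), attempts 1 to 4
IO.inspect(BackoffSketch.cumulative(1..4, &BackoffSketch.polynomial_fixed(&1, 5)))

# Exponential branch of the new strategy (base 9.5), attempts 4 and 5
IO.inspect(Enum.map(4..5, &BackoffSketch.exponential_fixed(&1, 9.5)))
```

For the old strategy this gives cumulative fixed delays of 16, 63, 321 and 1360 seconds, matching the cumulative column of the first table. The exponential branch gives 8160 and 77393 seconds for attempts 4 and 5, matching the second table.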