federation/out: tweak publish retry backoff
With the current strategy the individual and cumulative backoff look like
this (the + part denotes the max extra random delay, all values in seconds):

attempt  backoff_single       cumulative
   1        16+30                 16+30
   2        47+60                 63+90
   3       243+90   ≈ 4min       321+180
   4      1024+120  ≈17min      1360+300   ≈23+5min
   5      3125+150  ≈52min      4500+450   ≈75+8min
   6      7776+180  ≈ 2.1h     12291+630   ≈3.4h
   7     16807+210  ≈ 4.6h     29113+840   ≈8h
   8     32768+240  ≈ 9.1h     61896+1080  ≈17h
   9     59049+270  ≈16.4h    120960+1350  ≈33h
  10    100000+300  ≈27.7h    220975+1650  ≈61h

We default to 5 retries, meaning the last backoff runs with attempt=4.
Therefore outgoing activities might already be permanently dropped by a
downtime of only 23 minutes, which doesn't seem too implausible to occur.
Furthermore, it seems excessive to retry this quickly this often at the
beginning. At the same time, we'd like to have at least one quick-ish retry
to deal with transient issues and maintain reasonable federation
responsiveness.

If an admin wants to tolerate one-day downtime of remotes, retries need to
be almost doubled.

The new backoff strategy implemented in this commit instead switches to an
exponential backoff after a few initial attempts:

attempt  backoff_single       cumulative
   1        16+30                 16+30
   2       143+60                159+90
   3      2202+90   ≈37min      2361+180  ≈40min
   4      8160+120  ≈ 2.3h     10521+300  ≈ 3h
   5     77393+150  ≈21.5h     87914+450  ≈24h

Initial retries are still fast, but the same amount of retries now allows a
remote downtime of at least 40 minutes. Customising the retry count to 5
allows for whole-day downtimes.
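The cumulative column of the first table can be reproduced with a small sketch. It assumes the polynomial backoff has the same shape as the exponential one added in this commit, i.e. attempt^pow plus base_backoff plus a random extra of at most 2 * base_backoff * attempt. The module name BackoffSketch and its functions are hypothetical and not part of the codebase.

```elixir
defmodule BackoffSketch do
  @moduledoc """
  Hypothetical helper, not part of this commit: tabulates the fixed part and
  the maximum random extra of a polynomial backoff, assuming the shape
  attempt^pow + base_backoff + rand(2 * base_backoff) * attempt.
  """

  # Assumed default, taken from the base_backoff \\ 15 default in the diff.
  @base_backoff 15

  # Fixed (non-random) part of the delay scheduled after `attempt`.
  def fixed_delay(attempt, pow), do: Integer.pow(attempt, pow) + @base_backoff

  # Upper bound of the random extra: :rand.uniform(n) returns at most n.
  def max_jitter(attempt), do: 2 * @base_backoff * attempt

  # Running totals {fixed, max_jitter} in seconds after attempts 1..max_attempt.
  def cumulative(max_attempt, pow) do
    1..max_attempt
    |> Enum.map(&{fixed_delay(&1, pow), max_jitter(&1)})
    |> Enum.scan(fn {f, j}, {acc_f, acc_j} -> {acc_f + f, acc_j + j} end)
  end
end

# Old strategy (pow = 5), first five attempts.
BackoffSketch.cumulative(5, 5)
|> Enum.with_index(1)
|> Enum.each(fn {{fixed, jitter}, attempt} ->
  IO.puts("attempt #{attempt}: #{fixed}+#{jitter} s cumulative")
end)
```

For pow = 5 the running totals come out as 16+30, 63+90, 321+180, 1360+300 and 4500+450 seconds, matching the cumulative column above. The totals include the fixed 15 seconds that the backoff_single column leaves out from attempt 3 onwards.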
parent 74182abb5b
commit 4011d20dbe
2 changed files with 14 additions and 1 deletions
@@ -9,7 +9,11 @@ defmodule Pleroma.Workers.PublisherWorker do
   use Pleroma.Workers.WorkerHelper, queue: "federator_outgoing"
 
   def backoff(%Job{attempt: attempt}) when is_integer(attempt) do
-    Pleroma.Workers.WorkerHelper.sidekiq_backoff(attempt, 5)
+    if attempt > 3 do
+      Pleroma.Workers.WorkerHelper.exponential_backoff(attempt, 9.5)
+    else
+      Pleroma.Workers.WorkerHelper.sidekiq_backoff(attempt, 6)
+    end
   end
 
   @impl Oban.Worker
@@ -22,6 +22,15 @@ def sidekiq_backoff(attempt, pow \\ 4, base_backoff \\ 15) do
     trunc(backoff)
   end
 
+  def exponential_backoff(attempt, base, base_backoff \\ 15) do
+    backoff =
+      :math.pow(base, attempt) +
+        base_backoff +
+        :rand.uniform(2 * base_backoff) * attempt
+
+    trunc(backoff)
+  end
+
   defmacro __using__(opts) do
     caller_module = __CALLER__.module
     queue = Keyword.fetch!(opts, :queue)