Regarding retries and timeouts: we should ship sane defaults while keeping every setting controllable for users who want to fine-tune.
We need:
- retries with exponential backoff and jitter, possibly with a cap on the delay
- timeouts: one on each network call, and a global one across the whole retry sequence, so that retrying can never make us wait indefinitely.
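
To make the two requirements concrete, here is a minimal Python sketch, not a proposed final API. Every name and default in it (`backoff_delays`, `call_with_retry`, `call_timeout`, `total_timeout`, the 0.5 s base, the 30 s cap) is hypothetical, and "full jitter" (a uniform random delay between zero and the current ceiling) is just one possible jitter strategy. It assumes the wrapped operation accepts a timeout in seconds.

```python
import random
import time


def backoff_delays(base=0.5, factor=2.0, cap=30.0, rng=random.random):
    """Yield jittered delays whose ceiling grows exponentially up to `cap`."""
    ceiling = min(base, cap)
    while True:
        # Full jitter: pick a delay in [0, ceiling) so clients do not retry in lockstep.
        yield rng() * ceiling
        ceiling = min(cap, ceiling * factor)


def call_with_retry(op, *, call_timeout=5.0, total_timeout=60.0,
                    base=0.5, factor=2.0, cap=30.0,
                    retry_on=(TimeoutError, ConnectionError),
                    clock=time.monotonic, sleep=time.sleep, rng=random.random):
    """Call op(timeout) until it succeeds or the global deadline passes."""
    deadline = clock() + total_timeout
    last_error = None
    for delay in backoff_delays(base, factor, cap, rng):
        remaining = deadline - clock()
        if remaining <= 0:
            break
        try:
            # A single call never gets more time than the global budget has left.
            return op(min(call_timeout, remaining))
        except retry_on as exc:
            last_error = exc
        remaining = deadline - clock()
        if remaining <= 0:
            break
        # Never sleep past the global deadline.
        sleep(min(delay, remaining))
    raise TimeoutError(f"gave up after {total_timeout}s") from last_error
```

All knobs are keyword arguments with defaults, so a caller who does not care gets reasonable behavior and a caller who does can override any of them. The `clock`, `sleep`, and `rng` parameters exist only so the behavior can be tested deterministically.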