Document delivery guarantees of Native SDK transports #1186

@supervacuus

Description

Some sensible questions were raised in #1185, which led me to think that we should provide essential docs about the delivery guarantees (or lack thereof) and the related trade-offs. My response there is a sensible start but might need to be reframed before it goes into sentry-docs.

Ideally, this would end up under advanced usage, since it is of particular interest to users who anticipate long offline periods or worry about us filling their users' disks.

With version 0.8.2, we enabled the crashpad retry mechanism on all platforms (upstream enables this feature only for iOS). However, this only affects the sending of crashes (when using the crashpad backend).

There is currently no mechanism in any of the Native SDK transports that attempts to guarantee delivery. If you capture an event while the network is offline, the transport will discard the event once the request fails (unless this happens due to rate limiting).

1. If my sentry-native is deployed in an offline environment, will crashpad continuously attempt to send crash logs during the offline period, or will it detect when the network is back online before attempting to send the crash logs?

There is no attempt to detect whether the network is offline/online in crashpad, but with version 0.8.2, we enabled its retry mechanism that works in the following way:

  • if a crash report upload fails, it will be persisted for retry (whereas previously, it would be marked as skipped due to upload error)
  • the retry will be attempted 5 times, after which the report will be discarded
  • the retry happens
    • on every restart of the crashpad_handler (typically the restart of your application) and
    • every 15 minutes after that

Your events will likely be discarded even with retries if you have a long offline interval.
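
To make the consequence for long offline intervals concrete, here is a minimal sketch that models the retry schedule described above. It is not crashpad code: the function name `report_survives_outage` is hypothetical, and the model assumes that the initial upload fails at minute 0, that the handler keeps running without restarts, and that each of the 5 retries fires exactly 15 minutes after the previous attempt.

```c
#include <assert.h>
#include <stdbool.h>

/* Values taken from the description above. */
#define RETRY_INTERVAL_MINUTES 15
#define MAX_RETRIES 5

/*
 * Hypothetical model, not crashpad code. Assumptions: the initial upload
 * fails at minute 0, the handler keeps running without restarts, and a
 * retry fires every RETRY_INTERVAL_MINUTES until MAX_RETRIES are used up.
 * Returns true if at least one retry falls at or after the minute the
 * network comes back, false if the report is discarded before that.
 */
static bool report_survives_outage(int offline_minutes)
{
    assert(offline_minutes >= 0);
    for (int retry = 1; retry <= MAX_RETRIES; retry++) {
        int retry_minute = retry * RETRY_INTERVAL_MINUTES;
        if (retry_minute >= offline_minutes) {
            return true; /* network is back, so this retry can succeed */
        }
    }
    return false; /* all retries were spent while offline */
}
```

Under these assumptions, a report survives an outage of up to 75 minutes, while a day-long outage exhausts all retries. Restarts of the crashpad_handler trigger additional attempts, so the real window can be shorter.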

2. If there is no network connection for an extended period and a large number of crashes occur, which are stored locally on the user's machine, is there a limit on the maximum number or size of stored crash files?

Besides discarding events via the retry mechanism after the five retries mentioned above, we trigger a cleanup of the crashpad database on every sentry_init(). Skipped or processed database entries will be removed if they are older than 2 days or if the entire database grows beyond 8MiB. Of course, if a particular crash dump is larger than 8MiB, it will stay on disk until the application initializes the SDK again (and the upload is successful or all retries fail).
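
The cleanup rule above can be sketched as a simple predicate. This is an illustration only, not the SDK's or crashpad's implementation: the type `report_entry_t` and the function `should_prune` are hypothetical names, and the sketch assumes the size limit is checked against the total database size rather than per entry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Limits taken from the description above. */
#define MAX_AGE_SECONDS (2ULL * 24 * 60 * 60) /* 2 days */
#define MAX_DB_BYTES (8ULL * 1024 * 1024)     /* 8 MiB */

/* Hypothetical view of one crashpad database entry. */
typedef struct {
    uint64_t age_seconds;
    bool completed; /* skipped or processed; false while awaiting upload or retry */
} report_entry_t;

/*
 * Returns true if the entry would be removed during cleanup on init.
 * Entries that still await upload or retry are never pruned in this model.
 */
static bool should_prune(const report_entry_t *entry, uint64_t db_total_bytes)
{
    assert(entry != NULL);
    if (!entry->completed) {
        return false;
    }
    return entry->age_seconds > MAX_AGE_SECONDS
        || db_total_bytes > MAX_DB_BYTES;
}
```

In this model, a pending entry is kept regardless of age or database size, which matches the note that a large dump stays on disk until its upload succeeds or all retries fail.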

Similarly, our internal database cleans up all old runs during sentry_init() after sending any remaining envelopes that might have persisted from the last run. There is no prune period or size boundary; if the files from a previous run were processed, they will be immediately deleted.

So, as you can see, we deliberately try not to fill the disks.

Or is there any mechanism to delete old crash files from the user's local machine?

We currently do not expose a mechanism that would safely delete old crash files on behalf of the user besides our automated removal on startup.

Originally posted by @supervacuus in #1185
