Skip to content

feat(replay): Stop replays when a segment could not be sent#6765

Merged
mydea merged 1 commit intomasterfrom
fn/replay-stop-on-error
Jan 16, 2023
Merged

feat(replay): Stop replays when a segment could not be sent#6765
mydea merged 1 commit intomasterfrom
fn/replay-stop-on-error

Conversation

@mydea
Copy link
Copy Markdown
Member

@mydea mydea commented Jan 13, 2023

After some investigation, we decided that we actually don't really need a queue for handling replay events, because this is already basically handled by flushing.

For context, the flow in replay is as follows:

  • When we try to flush (=send events), we wait debounced 5s
  • When the debounce is done, we start flushing the currently recorded events. For this we take the events from the EventBuffer (removing them from there), and create a new flushlock.
  • The flushlock resolves when the data has been successfully sent.
  • While a flushlock exists, no new flush will happen. Any existing data remains in the eventBuffer.

Because of this, no segment can be sent before the previous segment has finished being sent.

So instead of adding a new queue, the only change of this PR is to ensure we actually stop() the replay when we exceed the retry limit. This will ensure that no following segment will be sent, and we don't get any inconsistent replays. Instead, the replay will just end with the last successful segment.

Additionally, this adds a new client report category, retry_failed. I am not 100% sure if we can just add this like this or if this needs any changes on the backend?

closes #6639
closes #6671

@mydea mydea requested review from Lms24 and billyvg January 13, 2023 08:10
@mydea mydea self-assigned this Jan 13, 2023
@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Jan 13, 2023

size-limit report 📦

Path Size
@sentry/browser - ES5 CDN Bundle (gzipped + minified) 19.84 KB (-0.01% 🔽)
@sentry/browser - ES5 CDN Bundle (minified) 61.46 KB (0%)
@sentry/browser - ES6 CDN Bundle (gzipped + minified) 18.61 KB (+0.02% 🔺)
@sentry/browser - ES6 CDN Bundle (minified) 55 KB (0%)
@sentry/browser - Webpack (gzipped + minified) 20.37 KB (0%)
@sentry/browser - Webpack (minified) 66.54 KB (0%)
@sentry/react - Webpack (gzipped + minified) 20.4 KB (0%)
@sentry/nextjs Client - Webpack (gzipped + minified) 47.63 KB (0%)
@sentry/browser + @sentry/tracing - ES5 CDN Bundle (gzipped + minified) 26.82 KB (+0.02% 🔺)
@sentry/browser + @sentry/tracing - ES6 CDN Bundle (gzipped + minified) 25.25 KB (-0.01% 🔽)
@sentry/replay ES6 CDN Bundle (gzipped + minified) 43.35 KB (-0.81% 🔽)
@sentry/replay - Webpack (gzipped + minified) 38.58 KB (-0.95% 🔽)

Copy link
Copy Markdown
Member

@Lms24 Lms24 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I took a look at the base transport and just that we're aware: In case we receive an error Http response from Sentry,

// We don't want to throw on NOK responses, but we want to at least log them
if (response.statusCode !== undefined && (response.statusCode < 200 || response.statusCode >= 300)) {
__DEBUG_BUILD__ && logger.warn(`Sentry responded with status code ${response.statusCode} to sent event.`);
}

we're currently not retrying anything in Replay, as the transport doesn't throw in this case. I think, keeping it this way (i.e. don't retry) is probably the safer option as it prevents thrashing Sentry with requests e.g. if we're down. However, I think we should nevertheless stop the replay. WDYT?

Additionally, this adds a new client report category, retry_failed. I am not 100% sure if we can just add this like this or if this needs any changes on the backend?

I checked the dev spec for client reports:

In case a reason needs to be added, it also has to be added to the allowlist in snuba.

So I guess, we still need to do something here. Let's check with Relay folks if this would cause errors if we already add it or if it would simply be ignored for the time being.

Comment thread packages/replay/src/replay.ts Outdated
@billyvg
Copy link
Copy Markdown
Member

billyvg commented Jan 13, 2023

Not sure about client reports but otherwise lgtm.

@mydea mydea force-pushed the fn/replay-stop-on-error branch from bae6b5e to aba648a Compare January 16, 2023 07:41
@mydea
Copy link
Copy Markdown
Member Author

mydea commented Jan 16, 2023

Note: I will extract the client_report stuff out of this PR, as well as the "handle bad API response" stuff, to keep this more streamlined.

Copy link
Copy Markdown
Member

@Lms24 Lms24 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM now, thanks for extracting the two concerns.

@mydea mydea merged commit 56d3a74 into master Jan 16, 2023
@mydea mydea deleted the fn/replay-stop-on-error branch January 16, 2023 09:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Replay: Stop session replay recording if unable to send recording Handle replay events in a queue

3 participants