Skip to content

feat: [Trace Stats] Add skeleton of concentrator#842

Merged
lym953 merged 9 commits intomainfrom
yiming.luo/trace-stats-5
Sep 19, 2025
Merged

feat: [Trace Stats] Add skeleton of concentrator#842
lym953 merged 9 commits intomainfrom
yiming.luo/trace-stats-5

Conversation

@lym953
Copy link
Copy Markdown
Contributor

@lym953 lym953 commented Sep 17, 2025

This PR

Next steps

  • Implement StatsConcentrator, which aggregates stats data into buckets and returns it in batch
  • Add more fields to AggregationKey and Stats
  • Move the processing of stats after "obfuscation", as suggested by APM team. This will involve lots of code changes, so I'll make it a separate PR.

I'll mainly move code from this draft PR: #827

Architecture

image

Jira: https://datadoghq.atlassian.net/browse/SVLS-7593

Base automatically changed from yiming.luo/trace-stats-4 to main September 17, 2025 16:29
@lym953 lym953 force-pushed the yiming.luo/trace-stats-5 branch 2 times, most recently from 265fcd7 to 7226500 Compare September 17, 2025 18:31
@lym953 lym953 marked this pull request as ready for review September 17, 2025 18:36
@lym953 lym953 requested a review from a team as a code owner September 17, 2025 18:36
@lym953
Copy link
Copy Markdown
Contributor Author

lym953 commented Sep 18, 2025

@lym953 lym953 marked this pull request as draft September 18, 2025 20:55
@lym953 lym953 marked this pull request as ready for review September 19, 2025 17:01
Comment on lines +10 to +26
#[derive(Debug)]
pub enum StatsError {
SendError(mpsc::error::SendError<ConcentratorCommand>),
RecvError(oneshot::error::RecvError),
}

impl From<mpsc::error::SendError<ConcentratorCommand>> for StatsError {
fn from(err: mpsc::error::SendError<ConcentratorCommand>) -> Self {
StatsError::SendError(err)
}
}

impl From<oneshot::error::RecvError> for StatsError {
fn from(err: oneshot::error::RecvError) -> Self {
StatsError::RecvError(err)
}
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could probably use thiserror to avoid the boilerplate

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wondering if we should have a submodule called stats for all of this stuff, instead of having it as stats_...

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I might do it in a separate PR

@lym953 lym953 force-pushed the yiming.luo/trace-stats-5 branch from 3c1fe3e to 46efff5 Compare September 19, 2025 17:38
@lym953 lym953 merged commit 58110c0 into main Sep 19, 2025
46 checks passed
@lym953 lym953 deleted the yiming.luo/trace-stats-5 branch September 19, 2025 18:05
lym953 added a commit that referenced this pull request Sep 22, 2025
## This PR
1. Move stats generation after trace obfuscation, which is the correct
order as suggested by Trace Agent team. Right now stats generation is
before trace obfuscation.
2. Also generate trace stats for OTLP agent. Right now we only do it for
trace agent.

## Architecture
Copied from #842
<img width="1296" height="674" alt="image"
src="https://github.com/user-attachments/assets/2d4cb925-6cfc-4581-8ed6-6bd87cf0d87a"
/>

## Testing
Tested in the next PR
#856, which
implements stats concentrator. Trace stats appeared in Datadog.
<img width="538" height="317" alt="image"
src="https://github.com/user-attachments/assets/48b849cc-2413-41d5-8576-5ff657c21a0f"
/>


## Next steps
1. Implement `StatsConcentrator`
2. Rename for clarity:
  - `SendingTraceStatsProcessor` -> `TraceStatsGenerator`
  - `stats_sender` -> `stats_generator`
3. Small refactor: consider passing around `stats_sender` instead of
`stats_concentrator_handle`. Right now
`SendingTraceStatsProcessor::new()` is called in three places. It might
be possible to call it only once then pass it around.

## Notes
Jira: https://datadoghq.atlassian.net/browse/SVLS-7593
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants