Contributor

@atovpeko atovpeko commented Apr 4, 2025

No description provided.

github-actions bot commented Apr 4, 2025

Allow 10 minutes from last push for the staging site to build. If the link doesn't work, try using incognito mode instead. For internal reviewers, check web-documentation repo actions for staging build status. Link to build for this PR: http://docs-dev.timescale.com/docs-285-docs-rfc-update-caggs-docs


In addition, materializing the most recent bucket might interfere with
[real-time aggregation][future-watermark].
and extends to the beginning or end of time. If you set `end_offset` within the current time bucket, and [real-time aggregation][future-watermark] is disabled, the current time bucket is excluded. This is to improve performance: for time-series data that mostly contains writes that occur in the time stamp order, the time buckets that see lots of writes quickly have out-of-date aggregates. You get better performance by excluding the time buckets that are getting a lot of writes.
Member

FWIW: another reason is also that you cannot really refresh a partial bucket. You either compute the whole bucket or not at all.
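
To illustrate the "whole bucket or not at all" point, here is a minimal sketch of how a refresh window could be trimmed to complete buckets. It is a simplified model, not TimescaleDB's implementation: the helper names `bucket_floor` and `refresh_window` are hypothetical, buckets are assumed to be fixed-width and aligned to the Unix epoch, and `None` stands in for a `NULL` offset.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional, Tuple

# Assumption: buckets are fixed-width and aligned to the Unix epoch.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def bucket_floor(ts: datetime, width: timedelta) -> datetime:
    """Return the start of the bucket that contains ts."""
    return ts - ((ts - EPOCH) % width)


def refresh_window(
    now: datetime,
    bucket_width: timedelta,
    start_offset: Optional[timedelta],
    end_offset: Optional[timedelta],
) -> Tuple[Optional[datetime], Optional[datetime]]:
    """Model the range of complete buckets covered by a refresh.

    None for an offset models NULL: that side of the range is open-ended.
    """
    start = None
    if start_offset is not None:
        raw_start = now - start_offset
        floor = bucket_floor(raw_start, bucket_width)
        # Round up: a bucket that begins before raw_start is only partly covered.
        start = floor if floor == raw_start else floor + bucket_width

    end = None
    if end_offset is not None:
        # Round down: a bucket that ends after now - end_offset is incomplete.
        end = bucket_floor(now - end_offset, bucket_width)

    return start, end
```

With `now` at 10:37, hourly buckets, a `start_offset` of 3 hours, and an `end_offset` of 10 minutes, this model gives a window from 08:00 to 10:00, so the in-progress 10:00 bucket is left out.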

Contributor Author

added, thank you

Contributor

@billy-the-fish billy-the-fish left a comment

I've had a go, but that sentence is still tricky.

* `schedule_interval`: the refresh interval in minutes or hours. Defaults to
24 hours.

If you set the `start_offset` or `end_offset` to `NULL`, the range is open-ended
Contributor

Suggested change
If you set the `start_offset` or `end_offset` to `NULL`, the range is open-ended
If you set `start_offset` or `end_offset` to `NULL`, the range is open-ended


In addition, materializing the most recent bucket might interfere with
[real-time aggregation][future-watermark].
and extends to the beginning or end of time. If you set `end_offset` within the current time bucket, and [real-time aggregation][future-watermark] is disabled, the current time bucket is excluded. This is because the current bucket is incomplete and can't be refreshed. Excluding the current bucket also improves performance: for time-series data that mostly contains writes that occur in the time stamp order, the time buckets that see lots of writes quickly have out-of-date aggregates. You get better performance by excluding the time buckets that are getting a lot of writes.
Contributor

Suggested change
and extends to the beginning or end of time. If you set `end_offset` within the current time bucket, and [real-time aggregation][future-watermark] is disabled, the current time bucket is excluded. This is because the current bucket is incomplete and can't be refreshed. Excluding the current bucket also improves performance: for time-series data that mostly contains writes that occur in the time stamp order, the time buckets that see lots of writes quickly have out-of-date aggregates. You get better performance by excluding the time buckets that are getting a lot of writes.
and extends to the beginning or end of time. If you set `end_offset` within the current time bucket while [real-time aggregation][future-watermark] is disabled, the current time bucket is excluded. This is because the current bucket is incomplete, and incomplete time buckets cannot be refreshed. Excluding the current bucket also improves performance: time-series data mostly contains writes that occur in timestamp order, so the most recent time buckets receive many writes and their aggregates quickly become out of date.

Co-authored-by: Iain Cox <iain@timescale.com>
Signed-off-by: Anastasiia Tovpeko <114177030+atovpeko@users.noreply.github.com>
@atovpeko atovpeko merged commit bb0d5e2 into latest Apr 15, 2025
3 checks passed
@atovpeko atovpeko deleted the 285-docs-rfc-update-caggs-docs branch April 15, 2025 08:01
4 participants