Introduce TSDB blocks compactor #1942
thorfour left a comment
Looks really clean, well done.
pkg/compactor/compactor.go (outdated)
This statement looks funny to my eyes. Can this be a switch with <-ctx.Done()?
Do you mean changing it to something like the following? If so, what's the real benefit? To me it looks like we're writing the same thing using more lines of code.
    select {
    case <-c.ctx.Done():
        return c.ctx.Err()
    default:
    }
Yea it just felt more idiomatic to me to write it that way. But I have no qualms leaving it the way it is.
Personally, I would only use the select version if there were more things to select on. ctx.Err() returns a non-nil value exactly when the context is done, so I think this is fine.
pkg/compactor/compactor.go (outdated)
I've found this to be troublesome since we likely don't want per-user metrics anyways but want a rollup of all the metrics across users.
I see your point. Don't have an answer yet. As stated in the PR description, I would prefer to defer it to a subsequent PR, to not block the compactor because of this.
What's your take? Do you have any idea to solve this?
Oh definitely don't block this PR for that. I too punted on a solution for that for the other user wrapped thanos components. Just wanted to add my thoughts to the comment.
I'm not sure there can be a good solution for this without changing the Thanos code to only call the register function once.
Oh definitely don't block this PR for that
👍
I'm not sure there can be a good solution for this without changing the Thanos code
An option, in Thanos, would be to expose compact.syncerMetrics and accept an instance of it as input to compact.NewSyncer(), so that we could create it once in Cortex and pass the same syncerMetrics instance to multiple Syncers.
@bwplotka would such a refactoring in Thanos be feasible, just to help Cortex? Do you see a better way to do it?
We want to treat any Thanos package as a library, so if the use case of a struct in our package makes sense, then we are ok with it. (:
We can definitely allow passing metrics for syncer. Also, note that we changed syncer a bit and introduced block.MetadataFetcher. I assume you use different syncers because of many buckets?
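A rough sketch of the idea discussed in this thread: create the metrics once and hand the same instance to every per-user syncer, so registration happens a single time and the counters roll up across users. Everything here is hypothetical and stdlib-only; `registry`, `syncerMetrics`, `newSyncerMetrics` and `newSyncer` are stand-ins, not the Thanos or Prometheus APIs.

```go
package main

import (
	"errors"
	"fmt"
)

// registry is a stand-in for a metrics registry that rejects duplicate names.
type registry struct{ names map[string]bool }

func (r *registry) register(name string) error {
	if r.names[name] {
		return errors.New("duplicate metric: " + name)
	}
	r.names[name] = true
	return nil
}

// syncerMetrics is hypothetical: one counter shared by all syncers.
// It is not safe for concurrent use; a real version would need atomics.
type syncerMetrics struct{ syncs int }

// newSyncerMetrics registers the metric exactly once.
func newSyncerMetrics(r *registry) (*syncerMetrics, error) {
	if err := r.register("syncs_total"); err != nil {
		return nil, err
	}
	return &syncerMetrics{}, nil
}

// syncer is a per-user component that reports into the shared metrics.
type syncer struct {
	user    string
	metrics *syncerMetrics
}

func newSyncer(user string, m *syncerMetrics) *syncer {
	return &syncer{user: user, metrics: m}
}

func (s *syncer) sync() { s.metrics.syncs++ }

func main() {
	reg := &registry{names: map[string]bool{}}
	metrics, err := newSyncerMetrics(reg)
	if err != nil {
		panic(err)
	}
	for _, user := range []string{"user-1", "user-2"} {
		newSyncer(user, metrics).sync()
	}
	fmt.Println(metrics.syncs) // rollup across both users

	// Creating the metrics a second time fails, which is the per-user problem.
	_, err = newSyncerMetrics(reg)
	fmt.Println(err)
}
```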
codesome left a comment
Some initial review, haven't looked at tests yet.
pkg/compactor/compactor.go (outdated)
Single-line flag registration makes it much easier to find the flag you are looking for, though it makes each line very long. For example, in https://github.com/cortexproject/cortex/blob/3daa7eeebe49741746c66b503e0a29bb34c03f5a/pkg/ingester/ingester.go#L137-L147 all the flags and their defaults read almost like a table (when viewed without line wrapping).
@thorfour @pstibrany @codesome Could you please take another look and approve if you have no further comments and believe the current PR is ready to be merged? It would help as a signal for maintainers.
codesome left a comment
A note for the future: We can parallelize at the user level if it becomes necessary.
Integration tests are failing because of this unrelated issue (looks like a temporary networking issue):
pstibrany left a comment
Nice job. Looks good to me with some tiny nits (as usual :))
pkg/storage/tsdb/config.go (outdated)
This is tricky. The flag package supports passing the same parameter multiple times, in which case it calls the Set method once per occurrence. A more flag-compatible way of using it would therefore be: -compactor.block-ranges=2h -compactor.block-ranges=48h. It's OK if we want to support a list flag, but we should probably still append the ranges from multiple Set calls.
In either case, should these ranges be validated for being in the correct order before they are used?
I guess I was hedging a little too much against obviousness of behaviour. I thought if you had multiple flags with duration, it wasn't obvious in which order they are applied. But looking back on it now, it's pretty obvious they'll be sorted by time. I think it's probably a good change to get rid of that, and just use multiple flags if desired.
The clearing of the duration list was because the flag parse function is called twice from the main package, which was resulting in a double list being created, despite the user only submitting a single list.
I like the single comma-separated parameter option as well, but I think that multiple parameters should be supported, and "append" is the correct behaviour here, instead of overwrite.
About sorting: where does that happen? I haven't found validation or sorting (but haven't looked that hard).
The only reason for double-parsing is that we need the -config.file option to parse the config file, and then reparse the command-line flags so that they take precedence. Easy to fix. I'll send a separate PR. It came up here because this Set method uses incorrect semantics (replacing instead of appending values).
Now that #1997 is merged (with a subsequent fix as well), it would make sense to use *d = append(*d, values...) here, to make it work like a standard flag. What do you think?
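A minimal sketch of what append semantics could look like for a comma-separated duration flag. The `durationList` type and `parseRanges` helper are hypothetical stand-ins for the type under review; only the flag name and the `append` expression come from this thread.

```go
package main

import (
	"flag"
	"fmt"
	"strings"
	"time"
)

// durationList is a hypothetical flag.Value holding a list of durations.
type durationList []time.Duration

// String renders the list as a comma-separated string.
func (d *durationList) String() string {
	parts := make([]string, 0, len(*d))
	for _, v := range *d {
		parts = append(parts, v.String())
	}
	return strings.Join(parts, ",")
}

// Set parses a comma-separated list and appends it to the existing values,
// so repeating the flag accumulates ranges instead of replacing them.
func (d *durationList) Set(s string) error {
	values := make([]time.Duration, 0)
	for _, part := range strings.Split(s, ",") {
		v, err := time.ParseDuration(strings.TrimSpace(part))
		if err != nil {
			return err
		}
		values = append(values, v)
	}
	*d = append(*d, values...)
	return nil
}

// parseRanges parses args with a fresh FlagSet and returns the collected ranges.
func parseRanges(args []string) (durationList, error) {
	var ranges durationList
	fs := flag.NewFlagSet("example", flag.ContinueOnError)
	fs.Var(&ranges, "compactor.block-ranges", "Comma separated list of compaction ranges.")
	err := fs.Parse(args)
	return ranges, err
}

func main() {
	ranges, err := parseRanges([]string{"-compactor.block-ranges=2h", "-compactor.block-ranges=12h,24h"})
	if err != nil {
		panic(err)
	}
	fmt.Println(ranges.String()) // prints 2h0m0s,12h0m0s,24h0m0s
}
```

One caveat with plain append: if the list is pre-populated with defaults before parsing, user-supplied values are added to the defaults rather than replacing them. The sketch starts from an empty list and leaves that case out.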
Ah, I guess I'm too late. I can send separate PR later. :) Great to see this merged!
I didn't hold back the merge of this PR until this was addressed, because it's not a regression introduced by this PR. I would be glad if you could take care of that 🙏
Thanks @pstibrany for your review. I've addressed all the comments, except the one about
pkg/compactor/compactor.go (outdated)
Could you add the default values to the help text?
There's actually no need. When the --help output is generated, the defaults are displayed by reading them from the values already set:
    -compactor.block-ranges value
        Comma separated list of compaction ranges expressed in the time duration format (default 2h0m0s,12h0m0s,24h0m0s)
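The behaviour relies on the standard flag package: when a value is registered with Var, its current String() output is recorded as the flag's DefValue, which the help output then prints. A small sketch, where `durationList` and `defaultShown` are hypothetical stand-ins for the type in this PR:

```go
package main

import (
	"flag"
	"fmt"
	"strings"
	"time"
)

// durationList is a hypothetical flag.Value holding a list of durations.
type durationList []time.Duration

func (d *durationList) String() string {
	parts := make([]string, 0, len(*d))
	for _, v := range *d {
		parts = append(parts, v.String())
	}
	return strings.Join(parts, ",")
}

func (d *durationList) Set(s string) error {
	for _, part := range strings.Split(s, ",") {
		v, err := time.ParseDuration(strings.TrimSpace(part))
		if err != nil {
			return err
		}
		*d = append(*d, v)
	}
	return nil
}

// defaultShown registers a pre-populated list and returns the default that
// the flag package captured at registration time.
func defaultShown() string {
	ranges := durationList{2 * time.Hour, 12 * time.Hour, 24 * time.Hour}
	fs := flag.NewFlagSet("example", flag.ContinueOnError)
	fs.Var(&ranges, "compactor.block-ranges", "Comma separated list of compaction ranges expressed in the time duration format")
	return fs.Lookup("compactor.block-ranges").DefValue
}

func main() {
	fmt.Println(defaultShown()) // prints 2h0m0s,12h0m0s,24h0m0s
}
```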
Signed-off-by: Marco Pracucci <marco@pracucci.com>
What this PR does:
Following this design doc, this PR introduces a new component, called compactor, which uses the Thanos bucket compactor to compact TSDB blocks stored in the bucket.
I've got the compactor running in a dev cluster and, as expected, it significantly reduces the memory used by the querier, due to better optimized Thanos index headers. For example, in the following chart (showing queriers memory usage), the compactor dropped the number of blocks from about 1.8K to 50 (1 per day, 50 days retention so far):
There are a few things left out from this PR that will be addressed in subsequent PRs:
… TODO in pkg/compactor/compactor.go)
Which issue(s) this PR fixes:
N/A
Checklist
CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]