WIP: bump Prometheus to v0.20.0-rc.0#51

Closed
simonpasquier wants to merge 121 commits into openshift:master from simonpasquier:aos-bump-v0.20.0-rc.0

@simonpasquier

This is only to test the first release candidate of v0.20.0 against the origin test suite and it shouldn't be merged.

ArthurSens and others added 30 commits May 11, 2020 19:54
Signed-off-by: arthursens <arthursens2005@gmail.com>
…ling. (prometheus#7342)

More consistent variable names.

Fixes prometheus#7298

Signed-off-by: Brian Brazil <brian.brazil@robustperception.io>
* increase the remote write bucket range

Increase the range of remote write buckets to capture times above 10s for laggy scenarios
Buckets had been: {.005, .01, .025, .05, .1, .25, .5, 1, 2.5, 5, 10}
Buckets are now: {0.03125, 0.0625, 0.125, 0.25, 0.5, 1, 2, 4, 8, 16, 32, 64, 128, 256, 512}

Signed-off-by: Bert Hartmann <berthartm@gmail.com>

* revert back to DefBuckets with addons to be backwards compatible

Signed-off-by: Bert Hartmann <berthartm@gmail.com>

* shuffle the buckets to maintain 2-2.5x increases

Signed-off-by: Bert Hartmann <berthartm@gmail.com>
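
The two bucket lists quoted in the first commit message above can be compared by the growth factor between consecutive upper bounds. The following is a minimal Go sketch, not code from this PR; `growthFactors` is a hypothetical helper. The final bucket list chosen by the later commits is not spelled out in the messages, so it is not reproduced here.

```go
package main

import "fmt"

// growthFactors returns the ratio between each pair of consecutive bucket
// upper bounds. Hypothetical helper, for illustration only.
func growthFactors(buckets []float64) []float64 {
	factors := make([]float64, 0, len(buckets))
	for i := 1; i < len(buckets); i++ {
		factors = append(factors, buckets[i]/buckets[i-1])
	}
	return factors
}

func main() {
	// Both lists are copied from the commit message above.
	before := []float64{.005, .01, .025, .05, .1, .25, .5, 1, 2.5, 5, 10}
	proposed := []float64{0.03125, 0.0625, 0.125, 0.25, 0.5, 1, 2, 4, 8, 16, 32, 64, 128, 256, 512}

	fmt.Println("before:  ", growthFactors(before))
	fmt.Println("proposed:", growthFactors(proposed))
}
```

The first list alternates between steps of roughly 2x and 2.5x and tops out at 10s, while the proposed powers-of-two list grows by exactly 2x per step up to 512s.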
This PR is about adding some unit tests for pkg/labels/labels.go.

Signed-off-by: Hu Shuai <hus.fnst@cn.fujitsu.com>
This was a TODO because CircleCI was not on Go 1.13 yet.

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
Fixes websocket-extensions security warning.

Signed-off-by: Ben Kochie <superq@gmail.com>
This PR is about adding some unit tests for pkg/labels/labels.go.

Signed-off-by: Hu Shuai <hus.fnst@cn.fujitsu.com>
Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
* Fixed incorrect query results caused by buffer reuse in merge adapter (prometheus#7361)

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Cut 2.18.2 + cherry pick of query bugfix.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

Co-authored-by: Marco Pracucci <marco@pracucci.com>
… Async Select (prometheus#7251)

* Add errors and Warnings to SeriesSet

Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>

* Change Querier interface and refactor accordingly

Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>

* Refactor promql/engine to propagate warnings at eval stage

Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>

* Address review issues

Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>

* Make sure all the series from all Selects are pre-advanced

Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>

* Address review issues

Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>

* Separate merge series sets

Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>

* Clean

Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>

* Refactor merge querier failure handling

Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>

* Refactored and simplified fanout with improvements from incoming chunk iterator PRs.

* Secondary logic is hidden, instead of weird failed series set logic we had.
* Fanout is well commented
* Fanout closing records all errors
* MergeQuerier improved API (clearer)
* deferredGenericMergeSeriesSet is not needed as we return no samples anyway for failed series sets (next = false).

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Fix formatting

Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>

* Fix CI issues

Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>

* Added final tests for error handling.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Addressed Brian's comments.

* Moved hints in populate to be allocated only when needed.
* Used sync.Once in secondary Querier to achieve all-or-nothing partial response logic.
* Calling Select after the first Next is done will panic.

NOTE: In theory, lazySeriesSet could just panic. However, I think we can
simply return an error, since it will panic in expand anyway.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Utilize errWithWarnings

Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>

* Fix recently introduced expansion issue

Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>

* Add tests for secondary querier error handling

Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>

* Implement lazy merge

Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>

* Add name to test cases

Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>

* Reorganize

Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>

* Address review comments

Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>

* Address review comments

Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>

* Remove redundant warnings

Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>

* Fix rebase mistake

Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>

Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>
…s with useLocalStorage(), add toggleAnnotations method, and add passing tests (prometheus#7374)

Signed-off-by: Lisa Carpenter <carpenter.lisa@gmail.com>
Previously, the tests stopped reading from the results channel
prematurely: they stopped once `max` items had been received from the
channel instead of once `max` unique target groups had been received.
This caused flaky tests when the same target group was received
multiple times, as Kubernetes informers may emit the same event multiple
times.

Before this patch, running this test repeatedly eventually failed. After
this patch, I have run the test many thousands of times without failure.

```bash
go test -run TestEndpointsDiscoveryNamespaces -count 1000 -test.v
```

Signed-off-by: Frederic Branczyk <fbranczyk@gmail.com>
discovery/kubernetes: Fix incorrect premature break of reading results
Signed-off-by: Martin Lee <martin@martinlee.org>
PR prometheus#7338 was not rebased on top of master and the interface had changed.

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
Signed-off-by: Hrishikesh Barman <hrishikeshbman@gmail.com>
…etheus#7393)

* Fix off-by-one error in funcHistogramQuantile / ensureMonotonic
* Additional coverage for nonmonotonic histogram buckets

Signed-off-by: Linas Medziunas <linas.medziunas@gmail.com>
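
For readers unfamiliar with the monotonicity pass named above, here is a minimal Go sketch of what such a function does. It is an illustration under stated assumptions, not the code from this commit: the `bucket` type is simplified, and the pitfall described in the comment is only one way an off-by-one can arise in a loop like this, not necessarily the bug that was fixed.

```go
package main

import "fmt"

// bucket is a simplified stand-in for a cumulative histogram bucket.
type bucket struct {
	upperBound float64
	count      float64
}

// ensureMonotonic raises any cumulative count that is lower than the
// highest count seen in an earlier bucket, so counts never decrease.
func ensureMonotonic(buckets []bucket) {
	if len(buckets) == 0 {
		return
	}
	highest := buckets[0].count
	// Index explicitly from 1. Writing `for i := range buckets[1:]` would
	// yield indices 0..len-2 relative to the original slice, so the last
	// bucket would never be visited.
	for i := 1; i < len(buckets); i++ {
		switch {
		case buckets[i].count > highest:
			highest = buckets[i].count
		case buckets[i].count < highest:
			buckets[i].count = highest
		}
	}
}

func main() {
	bs := []bucket{{1, 1}, {2, 3}, {5, 2}, {10, 2}}
	ensureMonotonic(bs)
	fmt.Println(bs)
}
```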
) (prometheus#6297)

* Fixed evaluation_time duration parsing in promtool unit tests (Fixes prometheus#6285)

Signed-off-by: Jordan Neufeld <jordan@neufeldtech.com>
* Optimized bstream reader

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Fixed linter

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Added license to new file

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Fixed type cast

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Changed comments

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Improved comments and rolled back no-op changes

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Fixed race condition

Signed-off-by: Marco Pracucci <marco@pracucci.com>
Sections with three backticks require a blank line before them.

Signed-off-by: Alex Vandiver <alex@chmrr.net>
…ntation since v0.16.0

Signed-off-by: Manuel Fontan <mfontangarcia@slack-corp.com>
Change the remote read queries total metric to a histogram and add a read requests counter with status code

Signed-off-by: njingco <jingco.nicole@gmail.com>
sylr and others added 18 commits July 10, 2020 00:08
* Display dates as well as timestamps in the status page

Signed-off-by: Sylvain Rabot <sylvain@abstraction.fr>

* Trim trailing whitespaces

Signed-off-by: Sylvain Rabot <sylvain@abstraction.fr>
Note that by just running `make update-go-deps`, the K8s Go client was
set to `k8s.io/client-go v11.0.0+incompatible`. However, that doesn't
play well with `k8s.io/apimachinery v0.18.5`. I then manually changed
the Go client line to `k8s.io/client-go v0.18.5`, which made
everything work. I guess Go Modules got confused by the ginormous
v11.0.0 version tag. Or it is a problem that pulling k8s.io/client-go
with git results in a rather old repo without the v0.18.5
tag. github.com/kubernetes/client-go has all the right tags. I
actually don't understand how Go Modules still correctly figures out
the source from the `k8s.io/client-go v0.18.5` line.

If one of the reviewers could enlighten me, I'd much appreciate it.

Signed-off-by: beorn7 <beorn@grafana.com>
It was still an RWLock, but we never use the read lock.

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
* Fix avg_over_time with Inf and NaN values

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
…nown symbol error. (prometheus#7560)

* Fixed race between compact (gc, populate) and head append causing unknown symbol error.

Fixes prometheus#7373

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Addressed comments.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
Avoid empty mmap files by using .tmp files to write headers
Additionally, implement isolation in collectResultAppender.

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
* Replay m-map chunks irrespective of WAL

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* More logs

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
Signed-off-by: beorn7 <beorn@grafana.com>
v2.20.0-rc.0

Signed-off-by: Simon Pasquier <spasquie@redhat.com>
openshift-ci-robot added the do-not-merge/work-in-progress label on Jul 17, 2020
@openshift-ci-robot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: simonpasquier

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

openshift-ci-robot added the approved label on Jul 17, 2020
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
@simonpasquier

/close

The tests pass and after a quick look at the memory/CPU metrics of Prometheus, no regression detected. We'll wait for the official v2.20.0.

@openshift-ci-robot

@simonpasquier: Closed this PR.

Details

In response to this:

/close

The tests pass and after a quick look at the memory/CPU metrics of Prometheus, no regression detected. We'll wait for the official v2.20.0.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

simonpasquier deleted the aos-bump-v0.20.0-rc.0 branch on July 17, 2020 13:07
openshift-merge-bot bot pushed a commit that referenced this pull request Oct 9, 2025
A config reload that needs to stop some providers can sometimes hang if SIGTERM is sent to Prometheus at the same time.

1: sync.WaitGroup.Wait [83 minutes] [Created by run.(*Group).Run in goroutine 1 @ group.go:37]
    sync         sema.go:110              runtime_SemacquireWaitGroup(*uint32(#166))
    sync         waitgroup.go:118         (*WaitGroup).Wait(*WaitGroup(#23))
    discovery    manager.go:276           (*Manager).ApplyConfig(#23, #167)
    main         main.go:964              main.func5(#120)
    main         main.go:1505             reloadConfig({#183, 0x1b}, 1, #40, #43, #50, {#31, 0xa, 0})
    main         main.go:1182             main.func22()
    run          group.go:38              (*Group).Run.func1(*Group(#26), #51)

Add a test for it.

Signed-off-by: Lukasz Mierzwa <l.mierzwa@gmail.com>
Labels

approved: Indicates a PR has been approved by an approver from all required OWNERS files.
do-not-merge/work-in-progress: Indicates that a PR should not merge because it is a work in progress.
