Change the router reload suppression, and add some more controls #16564
knobunc wants to merge 4 commits into openshift:master from
Conversation
|
Code looks ok. The flow is actually clearer this way. |
Force-pushed from 204995f to cc3edd2
|
@imcsk8 Thanks for the review. I plan to add some tests soon. |
Force-pushed from 70a6561 to d09631e
|
@openshift/networking @openshift/sig-networking FYI |
Force-pushed from 902bd6f to 929af8f
|
Can we use the controls' default values (RELOAD_INTERVAL=5s, RELOAD_GAP=100ms, RELOAD_EVENT_WAIT=10ms) to verify the bug https://bugzilla.redhat.com/show_bug.cgi?id=1471899 ? If not, can you give the recommended values for the scenario with thousands of routes? And my understanding of this commit is: the max reload duration will be limited to 4890ms (5s-100ms-10ms), right? |
|
@lihongan those values would work. You could drop RELOAD_INTERVAL to 1s and RELOAD_GAP and RELOAD_EVENT_WAIT to 0ms if you wanted. That might make it easier to reproduce and validate. The duration is the longest of the times. So if a reload is desired, then it can reload if (and only if):
- at least RELOAD_INTERVAL has passed since the last reload started, and
- at least RELOAD_GAP has passed since the last reload ended, and
- at least RELOAD_EVENT_WAIT has passed since the reload was requested.
So, the longest possible reload time is really unknowable. If the time taken to actually perform the haproxy reload is longer than RELOAD_INTERVAL, the effective period will be the actual reload time + RELOAD_GAP. Otherwise, if the actual reload time is less than RELOAD_INTERVAL - RELOAD_GAP, then it will reload at RELOAD_INTERVAL. RELOAD_EVENT_WAIT will likely be irrelevant on a busy system: if events that would trigger a reload are coming in all the time, a reload will be requested long before RELOAD_INTERVAL and RELOAD_GAP allow it to run.
On the other hand, on a system with few route or endpoint changes, the reload previously happened as soon as the first event was delivered, and the second event had to wait for RELOAD_INTERVAL before it took effect, even though it was probably a route or endpoint change related to the first event, and the first event may not have been useful without it. So we now allow the router to hesitate briefly to gather more events before actually triggering the reload. |
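To make the interplay concrete, here is a rough sketch of how the three constraints could be combined; the function and variable names are illustrative, not the PR's actual code:

import "time"

// earliestReload returns the earliest instant a requested reload may run,
// given when the last reload started and ended and when the commit was
// requested. All three constraints must hold, so we take the latest time.
func earliestReload(lastStart, lastEnd, requested time.Time,
    interval, gap, eventWait time.Duration) time.Time {

    earliest := lastStart.Add(interval) // RELOAD_INTERVAL: start-to-start spacing
    if t := lastEnd.Add(gap); t.After(earliest) {
        earliest = t // RELOAD_GAP: end-to-start spacing
    }
    if t := requested.Add(eventWait); t.After(earliest) {
        earliest = t // RELOAD_EVENT_WAIT: let related events batch up
    }
    return earliest
}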
Force-pushed from a6ee50f to 9f2f273
Force-pushed from 9f2f273 to 2cb02bb
        value = time.Duration(defaultReloadEventWait * time.Millisecond)
    }
    return value
}
Kinda seems like these could be one function that takes the env var name and the default interval as arguments.
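For illustration, such a helper might look roughly like this (the name getDurationEnv matches the flag definitions quoted later in this review; the body here is an assumption, not the PR's code):

import (
    "os"
    "time"
)

// getDurationEnv reads the named environment variable as a duration string
// (e.g. "5s", "100ms") and falls back to the given default when the variable
// is unset or cannot be parsed.
func getDurationEnv(name string, defaultValue time.Duration) time.Duration {
    if raw := os.Getenv(name); raw != "" {
        if d, err := time.ParseDuration(raw); err == nil {
            return d
        }
    }
    return defaultValue
}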
    glog.V(4).Infof("Router state synchronized for the first time")
    r.synced = true
    r.stateChanged = true
    if hasSynced {
I'm confused... if hasSynced = true that implies the state is fully synchronized, but then below r.synced can be false? maybe there's a better name.
If the user tells us we have synced, but our internal state doesn't yet reflect that... update the internal state and emit a message
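In other words, something along these lines (an illustrative condensation of the diff above, not the exact code):

// The caller reports the initial sync is done; if our internal flag has not
// caught up yet, record it and mark the state as changed.
if hasSynced && !r.synced {
    glog.V(4).Infof("Router state synchronized for the first time")
    r.synced = true
    r.stateChanged = true
}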
    // We need to track the earliest commit time so we can do burst suppression.
    now := time.Now()
    r.commitReqTime = &now
    r.lock.Unlock()
What if you just defer r.lock.Unlock() at the top, and then in this block do defer r.commitWorker() so the locking is a bit simpler? The second defer should run after the unlock.
We can't do it that way because the unlock would be called after the commitWorker, since defers are called in reverse order of definition.
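A quick illustration of that ordering, since defers run last-in, first-out:

import "sync"

func example(mu *sync.Mutex, commitWorker func()) {
    mu.Lock()
    defer mu.Unlock()    // registered first, so it runs last
    defer commitWorker() // registered second, so it runs first - while the lock is still held
}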
pkg/router/template/router.go
Outdated
    lastReloadEnd time.Time
    // commitReqTime is nil if no commit is needed, otherwise it is the time the commit was requested
    commitReqTime *time.Time
    // commitRunning indicates whether there is the commitFunc is actively running
nit "commitRunning indicates whether the commitFunc is actively running"
Force-pushed from 0ed02ab to 5b50019
DirectXMan12 left a comment:
couple of nits, otherwise good
pkg/cmd/infra/router/template.go
Outdated
const defaultReloadInterval = 5

// defaultReloadGap is the minimum gap between a reload end and start in milliseconds.
const defaultReloadGap = 100
these should just be time.Durations too...
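For reference, constants typed as durations would look something like this, so the unit lives in the constant itself (names and defaults taken from the snippet above; the doc comments are paraphrased):

import "time"

// defaultReloadInterval is the minimum time between the starts of two reloads.
const defaultReloadInterval = 5 * time.Second

// defaultReloadGap is the minimum gap between the end of one reload and the start of the next.
const defaultReloadGap = 100 * time.Millisecond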
pkg/router/template/router.go
Outdated
    commitRunning bool
    // commitTimer is the timer we use to make callbacks to the delayed commits
    commitTimer *time.Timer
    // The minimum time between the starts of reloads
godoc: reloadInterval is the minimum...
ditto the next few fields
pkg/router/template/router.go
Outdated
    untilNextCallback = getNextReload(timeUntilNextAction(now, r.lastReloadEnd, r.reloadGap))
    untilNextCallback = getNextReload(timeUntilNextAction(now, *r.commitReqTime, r.reloadEventWait))

    if untilNextCallback > 0*time.Second {
you don't need * time.Second here, I believe
    if r.commitTimer == nil {
        r.commitTimer = time.AfterFunc(untilNextCallback, r.commitWorker)
    } else {
        r.commitTimer.Reset(untilNextCallback)
you should note the reason for the reset here as well ("we reset to give a chance to grab extra changes before a reload, but the maxReloadInterval above prevents us from resetting forever").
Clarified in a comment.
The main purpose is to change the locking so that a reload doesn't hold a lock on the router object for the duration of the reload, so that updates that happen while the router is reloading can be processed immediately, batched up, and included when the next reload occurs. Before this, if a reload went longer than the reload interval, one event would be processed and would then trigger a new reload, which would then lock out other event processing. So the router would not make any meaningful progress consuming events, and would spend a lot of time and CPU reloading continually.
The router controls are:
- $RELOAD_INTERVAL - max reload frequency (start-to-start of a reload), default 5s
- $RELOAD_GAP - gap between reloads (end-to-start of a reload), default 100ms
- $RELOAD_EVENT_WAIT - bunch events by waiting a little before reload (wait N ms between the time the event comes in and when the reload starts), default 10ms
The commit routine is now changed to have a top and bottom half. The top half checks to see if a reload is scheduled, and if it is, it returns immediately. If there's no reload scheduled then it calls the bottom half and returns.
The bottom half is in charge of determining whether it can reload immediately or has to wait. If it must wait, it works out the earliest time it can reload and schedules a callback to itself for that time. If it determines it can reload, then it runs the reload code immediately. When the reload is complete, it calls itself again to make sure no other pending reload came in while the reload was running.
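A condensed sketch of the top/bottom-half scheme described above, with the state pulled into one place; names and details here are illustrative rather than the PR's exact code, and error handling is elided:

import (
    "sync"
    "time"
)

type reloadCoalescer struct {
    lock            sync.Mutex
    reloadInterval  time.Duration // start-to-start spacing ($RELOAD_INTERVAL)
    reloadGap       time.Duration // end-to-start spacing ($RELOAD_GAP)
    reloadEventWait time.Duration // settle time after a request ($RELOAD_EVENT_WAIT)
    lastReloadStart time.Time
    lastReloadEnd   time.Time
    commitReqTime   *time.Time // nil means no commit is pending
    commitRunning   bool
    commitTimer     *time.Timer
    reload          func() // the slow haproxy reload, run without the lock held
}

// Commit is the "top half": note that a reload is wanted and, if none is
// already scheduled, kick the bottom half.
func (c *reloadCoalescer) Commit() {
    c.lock.Lock()
    if c.commitReqTime != nil {
        c.lock.Unlock() // a reload is already scheduled; it will pick this up
        return
    }
    now := time.Now()
    c.commitReqTime = &now
    c.lock.Unlock()
    c.commitWorker()
}

// commitWorker is the "bottom half": reload now if allowed, otherwise schedule
// a callback for the earliest time the three constraints permit.
func (c *reloadCoalescer) commitWorker() {
    c.lock.Lock()
    if c.commitRunning || c.commitReqTime == nil {
        c.lock.Unlock() // a running reload re-invokes us when it finishes
        return
    }
    now := time.Now()
    next := c.lastReloadStart.Add(c.reloadInterval)
    if t := c.lastReloadEnd.Add(c.reloadGap); t.After(next) {
        next = t
    }
    if t := c.commitReqTime.Add(c.reloadEventWait); t.After(next) {
        next = t
    }
    if wait := next.Sub(now); wait > 0 {
        // Too early: schedule ourselves instead of blocking, reusing the timer.
        if c.commitTimer == nil {
            c.commitTimer = time.AfterFunc(wait, c.commitWorker)
        } else {
            c.commitTimer.Reset(wait)
        }
        c.lock.Unlock()
        return
    }
    // Reload now. Drop the lock so new events keep flowing while it runs.
    c.commitRunning = true
    c.commitReqTime = nil
    c.lastReloadStart = now
    c.lock.Unlock()

    c.reload()

    c.lock.Lock()
    c.commitRunning = false
    c.lastReloadEnd = time.Now()
    c.lock.Unlock()
    c.commitWorker() // catch anything that arrived while we were reloading
}

The key property is that the actual reload runs with the lock released, so route and endpoint events can keep updating the in-memory state and are picked up by the trailing commitWorker() call.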
These are the autogenerated shell completion files that correspond to the new router reload command line parameters.
Added new environment variables to help with debugging:
- OPENSHIFT_LOG_LEVEL: Defaults to 4, but sets the debug level to the given value
- OPENSHIFT_GET_ALL_DOCKER_LOGS: A boolean that enables dumping of all container logs if any container failed (rather than just giving the logs from the failure)
The reload constants were previously integers that were multiplied by a duration constant later. This is confusing since the units are different and the environment variables are duration strings. This patch also changes the way the constants are used and moves them inline in the parameter definitions, with a helper function to translate to durations.
Force-pushed from 5b50019 to 4099687
|
/approve |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull-request has been approved by: knobunc. Assign the PR to them by writing
The full list of commands accepted by this bot can be found here.
Details: Needs approval from an approver in each of these OWNERS files:
You can indicate your approval by writing |
|
/assign @smarterclayton |
    flag.StringVar(&o.TemplateFile, "template", util.Env("TEMPLATE_FILE", ""), "The path to the template file to use")
    flag.StringVar(&o.ReloadScript, "reload", util.Env("RELOAD_SCRIPT", ""), "The path to the reload script to use")
    flag.DurationVar(&o.ReloadInterval, "interval", reloadInterval(), "Controls how often router reloads are invoked. Multiple router reload requests are coalesced for the duration of this interval since the last reload time.")
    flag.DurationVar(&o.ReloadInterval, "interval", getDurationEnv("RELOAD_INTERVAL", 5*time.Second), "Controls how often router reloads are invoked. Multiple router reload requests are coalesced for the duration of this interval since the time the last reload started.")
    flag.DurationVar(&o.ReloadGap, "reload-gap", getDurationEnv("RELOAD_GAP", 100*time.Millisecond), "Controls how often router reloads are invoked. Multiple router reload requests are coalesced for the duration of this interval since the time the last reload ended.")
How often are we really going to tune these? Adding lots of extra tunables comes with a cost - do these really need to be changeable or can they just be proportional to the overall duration?
Every tunable we add means more choices that someone can get wrong. Also, I couldn't understand how the flags are used from the descriptions, which is a problem.
        return fmt.Errorf("invalid reload interval: %v - must be a positive duration", nsecs)
    }

    if o.ReloadGap < 0 {
This should be in validate
Should the other checks in here be in validate too? Or is there some pattern that I don't grok fully.
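For context, a validate-style home for these checks might look roughly like this; the option and method names are assumptions based on the snippet above, not the actual file layout:

import "fmt"

// Validate rejects nonsensical reload-related durations before the router starts.
func (o *TemplateRouterOptions) Validate() error {
    if o.ReloadInterval <= 0 {
        return fmt.Errorf("invalid reload interval: %v - must be a positive duration", o.ReloadInterval)
    }
    if o.ReloadGap < 0 {
        return fmt.Errorf("invalid reload gap: %v - must not be a negative duration", o.ReloadGap)
    }
    return nil
}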
    // lock is a mutex used to prevent concurrent router reloads.
    // lock is a mutex used to prevent concurrent router state updates
    lock sync.Mutex
    // Track the start and end of router reloads
Oh wow. This is a LOT of variables. This should be abstracted into some sort of separate object that is passed in because the rest of the template router has nothing to do with it.
Please hide this complexity so that it doesn't get entangled with other places.
//
// Note: If this is changed FakeCommit() in fake.go should also be updated
func (r *templateRouter) Commit() {
func (r *templateRouter) realCommit(hasSynced bool) {
Wow.
So this is incredibly complicated and really needs to be abstracted into something smaller and better tested on its own. The original point of the "commit wrapper" was to have a function that you invoke that does the right thing, and then the rest of it is ignored. This has now become entangled with the template router.
Please separate all of this logic out into three pieces - the signal (from the rest of the template router to reload), the rate limiter / tracker (the bit that says when an actual commit should go), and the underlying commit action that controls how fast it happens.
I.e.:
- router controller signals (changes happened)
- a rate limiter / commit controller decides whether to pass it on (this code)
- the underlying commit.
There really cannot be a connection between them, and the code here is too complicated for me to be confident it works.
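Purely for illustration, the three pieces could be separated along these lines (the interfaces are invented here, not code from the PR or its follow-up):

// CommitSignaler is all the template router needs to see: "state changed".
type CommitSignaler interface {
    Commit()
}

// CommitRateLimiter sits between the signal and the real work: it decides
// when a signaled commit is actually allowed to run.
type CommitRateLimiter interface {
    CommitSignaler
    // SetCommitFunc wires in the underlying commit action (write config,
    // reload haproxy) that the rate limiter invokes when it fires.
    SetCommitFunc(commit func() error)
}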
@liggitt can you help suggest here? I think of this as a controller model - several caches:
informer cache -> feeding secondary cache -> controller loop that reads secondary cache and needs to sync up with secondary cache
I think I have worked out a possible solution.
    keyFunc := func(_ interface{}) (string, error) {
        return "templaterouter", nil
    }
    // Do an immediate commit so that we can reply to health checks before the state syncs
does this make the router reply that it is healthy prior to having synced? is that correct? couldn't that cause traffic to be directed to it prior to actually having all the routes set up?
We already cover that behavior in the router... the haproxy template is passed a synced flag. So it doesn't bind 80/443 until the sync happens, but it does bind the status port for health checks.
That seems really odd. It reports healthy before it can serve traffic?
t1 (the time between the start and end of a reload) is variable, as it depends on the number of routes/endpoints and also on CPU load on the router machine. So as routes/endpoints increase or decrease, whatever value we initially set may not be accurate/desired. I think instead of 3 knobs, just the RELOAD_GAP knob will solve most of the reload problems, and the current implementation of doing the router reload without blocking other in-memory router updates during this time will help a lot. |
|
Superseded by #17049 |