Optimizations for HAProxy reloads#6744
ahardin-rh merged 1 commit into openshift:master from jmencak:haproxy-reloads
Hi @jmencak, thanks for writing this.
Just for clarification in my own mind:
- What is the default reload interval?
- When you say "it is currently recommended", could you please be more specific about what to look for when considering this change? What is the "fingerprint" to look for on the system (slow route propagation?)
- You may want to put a specific # of routes. For example, I think we saw it as low as 3000, but in some other environments it might have been a bit higher?
- Default reload interval is 5s.
- 15s comes from what Ben recommended in https://bugzilla.redhat.com/show_bug.cgi?id=1471899, but that was to address an immediate BZ issue, which should now be fixed in other ways. Perhaps I should reformulate this or remove it completely, as it is trying to fix another problem. 15s could make the number of HAProxy processes 3x lower, in theory.
- Not sure about this. The docs are for 3.8+ and the ~3000 route issue I was seeing should be fixed by "Change the router reload suppression so that it doesn't block updates" (origin#17049), which has already merged.
Sounds like we need to re-test after #17049 and then re-approach this particular documentation. Unless you wanted to target this at "3.7 and earlier" versions of the docs only?
This section has nothing to do with #17049 or BZ1471899, other than the fact that increasing RELOAD_INTERVAL can alleviate the problems seen in BZ1471899 for versions 3.6 and earlier. I didn't write the section for that purpose. This section has everything to do with the inherent inability of HAProxy to reload its configuration without forking another process while serving (old and new) connections. BZ1471899 and #17049 were retested by QA, but I can retest, not a big deal.
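For anyone who wants to sanity-check the "3x lower, in theory" figure, here is a rough back-of-the-envelope sketch. It is only a model under stated assumptions, not how the router actually counts anything: it assumes a reload fires on every interval (worst case) and that each replaced process keeps at least one old connection open for a fixed lifetime. The function name and the 300-second lifetime are hypothetical, chosen for illustration only.

```python
import math


def max_concurrent_processes(reload_interval_s: float, old_conn_lifetime_s: float) -> int:
    """Rough upper bound on HAProxy processes alive at once.

    Assumptions (illustrative only): a reload happens on every interval,
    and every replaced process lingers for old_conn_lifetime_s seconds
    because it still serves at least one old connection.
    """
    if reload_interval_s <= 0:
        raise ValueError("reload interval must be positive")
    # Old processes that have not yet drained their connections.
    lingering = math.ceil(old_conn_lifetime_s / reload_interval_s)
    # Plus the one current process started by the latest reload.
    return 1 + lingering


if __name__ == "__main__":
    for interval in (5, 15):
        print(interval, max_concurrent_processes(interval, 300))
```

Under these assumptions the count of lingering processes scales inversely with the reload interval, so going from 5s to 15s cuts it by a factor of three. Real numbers depend on how often routes actually change and on how long connections stay open.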
processes must handle old connections, which
extra space before large
A large number of...
ROUTER_DEFAULT_SERVER_TIMEOUT, and RELOAD_INTERVAL.
@jmencak Just some nits from me. Thanks for this! Also, can you please confirm that this is targeting 3.9? Is there an associated Trello card? Thanks again!
Yes, targeting 3.9. No Trello card, simple change.
Latest changes LGTM. Thanks!
(cherry picked from commit 253c88d) xref:openshift#6744
[rev_history]
@jeremyeder could you please take a look?
/cc @ahardin-rh