event: scaled range bucket timer #12414
Conversation
Add an interface for timers that can be enabled for some timeout within a range. The actual choice of when the timer should be triggered is not specified by the interface. Possible implementations of the interface include:

- timers that are triggered when a thread is not busy
- timers that are triggered early/late in response to load

Signed-off-by: Alex Konradi <akonradi@google.com>
This adds a ScaledRangeTimerManager class that can construct RangeTimer objects backed by a provided dispatcher, and which contains an adjustable scaling factor that is applied to all previously constructed timers (that haven't been deleted). The implementation uses buckets spaced at fixed, exponentially increasing time intervals to get constant-time performance for all operations. The trade-off is that for a timer scheduled to fire between [min, max], the actual firing time is not `S * (max - min)` (where S is the scaling factor), but `S * B_i`, where `B_i` is the interval of the largest bucket less than `max - min`. Buckets are (as of now) spaced at powers of 2, so a timer scheduled for [0, 10] might actually fire at `S * 8`.

Signed-off-by: Alex Konradi <akonradi@google.com>
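The bucket quantization described above can be sketched roughly as follows. This is an illustrative sketch with hypothetical names, not the PR's actual code; power-of-2 bucket spacing is assumed.

```cpp
// Sketch (hypothetical names) of power-of-2 bucket quantization.
// A timer with window `max - min` lands in the largest bucket B_i <= window,
// and its scaled delay is S * B_i rather than S * (max - min).
#include <chrono>
#include <cmath>

inline int bucketIndex(std::chrono::milliseconds window) {
  return window.count() <= 1
             ? 0
             : static_cast<int>(
                   std::floor(std::log2(static_cast<double>(window.count()))));
}

inline std::chrono::milliseconds scaledDelay(double scale_factor,
                                             std::chrono::milliseconds window) {
  const long long bucket_interval = 1LL << bucketIndex(window); // B_i
  return std::chrono::milliseconds(
      static_cast<long long>(scale_factor * bucket_interval));
}
```

For example, a timer scheduled for [0, 10] lands in the 8 ms bucket, so with S = 1 this sketch fires it after 8 ms rather than 10.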
antoniovicente left a comment
Thanks for taking on this important work.
```cpp
 * - pending: enabled, min timeout not elapsed
 * - active: enabled, min timeout elapsed, max timeout not elapsed
 */
class ScaledRangeTimerManager::ScaledRangeTimer final : public RangeTimer {
```
I think this class should be called RangeTimerImpl, for consistency with other Envoy classes that implement an interface.
```cpp
namespace Envoy {
namespace Event {

class ScaledRangeTimerManager {
```
Do we need an interface for this concrete class?
How will connections create instances of RangeTimer?
My plan is to have the ThreadLocalOverloadStateImpl construct instances of this class if the to-be-added action is configured. I don't know that we need an interface, but it might be useful for unit testing of the OverloadManagerImpl. Thoughts?
```cpp
const int index =
    max_duration <= minimum_duration_
        ? 0
        : std::floor(std::log(max_duration.count()) / std::log(kBucketScaleFactor) -
```
There may be a bug here related to the difference in units between max_duration and minimum_duration_.
This computation only makes sense if max_duration is in milliseconds. You should accept a millisecond duration instead of a generic duration.
Please add tests to cover the function that computes a bucket index from an input duration.
Yeah this is a major bug, good catch. I have been trying to use MonotonicTime::duration everywhere in the manager, but I missed minimum_duration_. Now fixed, and I'm hesitant to add tests since this is an internal function.
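A minimal sketch of the kind of fix being discussed: normalize both durations to the same chrono representation before taking the logarithm, so the two counts can't silently be in different units. Names here are hypothetical, not the PR's actual code.

```cpp
// Hypothetical sketch: both arguments are converted to nanoseconds at the
// call boundary (std::chrono::milliseconds converts implicitly), so the
// ratio inside the log is unitless and the index is unit-safe.
#include <chrono>
#include <cmath>

static constexpr double kBucketScaleFactor = 2.0;

inline int bucketIndexFor(std::chrono::nanoseconds max_duration,
                          std::chrono::nanoseconds minimum_duration) {
  if (max_duration <= minimum_duration) {
    return 0;
  }
  // Same unit on both sides, so the ratio is dimensionless.
  const double ratio = static_cast<double>(max_duration.count()) /
                       static_cast<double>(minimum_duration.count());
  return static_cast<int>(std::floor(std::log(ratio) / std::log(kBucketScaleFactor)));
}
```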
Addresses some review feedback.

Signed-off-by: Alex Konradi <akonradi@google.com>

This prevents timer starvation, which could happen when new timers are repeatedly enabled.

Signed-off-by: Alex Konradi <akonradi@google.com>
```cpp
ScaledRangeTimerManager::BucketEnabledList::iterator
ScaledRangeTimerManager::add(RangeTimerImpl& timer, const MonotonicTime::duration max_duration) {
  Bucket& bucket = getOrCreateBucket(max_duration);
  const auto quantized_duration = getBucketedDuration(max_duration);
```
What guarantees are you providing with your implementation? I thought we wanted timers to fire close to the configured max time when the overload factor is 0.0 (scale factor 1.0).
Yeah, this is something I was hoping to hash out here on the PR. Here are a few options, most of which we've discussed:

1. [as implemented] adjust the max duration based on the bucket so the timer fires on time for S=0 and early for S=1, but within the window
2. adjust the min duration for the wait-for-min phase so that `max - min` is a bucket boundary; this will fire late for S=0 but within the window, and on time for S=1
3. switch between these two by looking at the current scaling factor when enableTimer is called; this works really well if the scale factor doesn't change too much
4. use more buckets by decreasing the exponentiation factor

I'm a fan of option 2 over 1 since it is perfectly accurate for S=1, which preserves the existing behavior for proxies that don't change the default. For S=0, the timer might fire anywhere between min and min + (max - min)/2. We can do better than that by adding in option 4.

I don't know what to think about option 3. It seems like an extra complication that relies on heuristics for correctness. Maybe worth implementing later, but I don't think we need it.

@antoniovicente thoughts? If this seems okay I'll go ahead and switch from 1 to 2 and reduce the exponentiation factor.
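For illustration, options 1 and 2 differ at the scale-factor extremes roughly as follows. This is a sketch under assumed names and power-of-2 buckets, not the PR's implementation.

```cpp
// Option 1 vs. option 2 at the scale-factor extremes (hypothetical names).
// B is the bucket interval: the largest power of 2 <= (max - min).
#include <chrono>
#include <cmath>
using std::chrono::milliseconds;

inline milliseconds bucketInterval(milliseconds window) {
  const int idx = static_cast<int>(
      std::floor(std::log2(static_cast<double>(window.count()))));
  return milliseconds(1LL << idx);
}

// Option 1: wait min, then S * B. Exact at S=0, early at S=1.
inline milliseconds fireAtOption1(milliseconds min, milliseconds max, double s) {
  const long long b = bucketInterval(max - min).count();
  return min + milliseconds(static_cast<long long>(s * b));
}

// Option 2: stretch min to max - B so the window is a bucket boundary.
// Exact at S=1, late (but within [min, max]) at S=0.
inline milliseconds fireAtOption2(milliseconds min, milliseconds max, double s) {
  const milliseconds adjusted_min = max - bucketInterval(max - min);
  const long long b = bucketInterval(max - min).count();
  return adjusted_min + milliseconds(static_cast<long long>(s * b));
}
```

For a [10, 100] timer the window is 90 ms and B = 64 ms, so option 1 fires at 74 ms for S=1 (early) while option 2 fires at exactly 100 ms.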
I'm not seeing the following option among the ones listed above:
Add timers based on `max_configured - min_configured` to a bucket. Use a priority queue or multiset to keep the timers sorted by max expiry time. Expire the timer with the smallest max expiry time using the following formula:

`expiry_time - overload_factor * (max_configured - min_configured) <= current_time`

Properties of the algorithm above include: timers fire at the right time for S=1. Also, timers fire at the right time when scaling if all entries in the bucket have the same `max - min`.
I'm not sure how to get other people to look over these proposals. I think that I'm currently the only person reviewing this PR. It may be necessary to move the conversation back to the associated issue or a design doc.
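The multiset-based alternative above could be sketched like this. All names are assumptions for illustration, not code from the PR, and absolute times are represented as plain millisecond offsets for simplicity.

```cpp
// Sketch of the reviewer's alternative: keep timers in a bucket sorted by
// max expiry time, and fire the front while
//   expiry_time - overload_factor * (max - min) <= current_time.
#include <chrono>
#include <set>
using std::chrono::milliseconds;

struct ScheduledTimer {
  milliseconds expiry_time; // absolute max expiry (monotonic offset)
  milliseconds window;      // max_configured - min_configured
  bool operator<(const ScheduledTimer& other) const {
    return expiry_time < other.expiry_time;
  }
};

// Returns true if the earliest-expiring timer in the bucket should fire now.
inline bool shouldFireFront(const std::multiset<ScheduledTimer>& bucket,
                            double overload_factor, milliseconds now) {
  if (bucket.empty()) {
    return false;
  }
  const ScheduledTimer& front = *bucket.begin();
  const milliseconds scaled_back(
      static_cast<long long>(overload_factor * front.window.count()));
  return front.expiry_time - scaled_back <= now;
}
```

At overload_factor 0 a timer fires exactly at its max expiry; at 1 it fires a full window early, i.e. at its min.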
```cpp
const MonotonicTime now = manager_.dispatcher_.timeSource().monotonicTime();
auto it = manager_.add(*this, pending.latest_trigger - now);
state_.emplace<Active>(pending.latest_trigger - now, it);
```
My recommendation is that you prefer use of max_configured-min_configured which seems more deterministic by avoiding computations based on approximateMonotonicTime. If epoll durations and delay are high, we have bigger problems.
Doing a delta computation based on Dispatcher::approximateMonotonicTime effectively mixes time sources since timer registration uses a different monotonic timer internally (event_base_gettimeofday_cached). Mixing time sources adds inaccuracies. The result is not obviously better than max_configured-min_configured.
Also use the pending timer callback to execute the trigger callback if the min and max times are the same. Signed-off-by: Alex Konradi <akonradi@google.com>
When the scaling factor is 1, timers will run for their full maximum duration, at the cost of running for longer than their minimum when the scaling factor is 0. This biases timers towards the non-overload case. Signed-off-by: Alex Konradi <akonradi@google.com>
This increases timer accuracy at the cost of doubling the number of buckets. Signed-off-by: Alex Konradi <akonradi@google.com>
```diff
 };

-static constexpr int kBucketScaleFactor = 2;
+static constexpr float kBucketScaleFactor = 1.41421356237; // = sqrt(2)
```
You may want to use log base 2, and then multiply the log result by some number. Multiplying by 2 would be equivalent to using sqrt(2) as the base, multiplying by 0.5 would be equivalent to using a log base of 4.
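A small sketch of the identity behind this suggestion: dividing by `log(sqrt(2))` is the same as multiplying `log2` by 2 (and multiplying by 0.5 would correspond to base 4). The function name is hypothetical.

```cpp
// floor(log_sqrt(2)(d)) == floor(2 * log2(d)), since
// log_sqrt(2)(d) = ln(d) / ln(2^(1/2)) = ln(d) / (0.5 * ln 2) = 2 * log2(d).
#include <cmath>

inline int bucketIndexSqrt2(double unitless_duration) {
  return static_cast<int>(std::floor(2.0 * std::log2(unitless_duration)));
}
```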
```cpp
// bucket `bucket_handle`.
if (wait_time <= std::chrono::milliseconds::zero()) {
  auto it = manager_.add(*this, bucket_handle);
  state_.emplace<ScalingMax>(bucket_handle, it);
```
I encourage you to seek feedback from others on this comment thread: #12414 (comment)
Signed-off-by: Alex Konradi <akonradi@google.com>
/wait

This pull request has been automatically marked as stale because it has not had activity in the last 7 days. It will be closed in 7 days if no further activity occurs. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions!

This pull request has been automatically closed because it has not had activity in the last 14 days. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions!
Commit Message: Add RangeTimer and scaling implementation
Additional Description:
Add an interface for timers that can be enabled for some timeout within a range. Also add an implementation where timers are created by a manager object which can efficiently scale their timeouts.
Risk Level: low - no functional changes
Testing: wrote and ran unit tests
Docs Changes: none
Release Notes: none
For #11427, and supersedes #12106