
event: scaled range bucket timer #12414

Closed
akonradi wants to merge 13 commits into envoyproxy:master from akonradi:scaled-range-bucket-timer

Conversation

@akonradi
Contributor

Commit Message: Add RangeTimer and scaling implementation
Additional Description:
Add an interface for timers that can be enabled for some timeout within a range. Also add an implementation where timers are created by a manager object which can efficiently scale their timeouts.

Risk Level: low - no functional changes
Testing: wrote and ran unit tests
Docs Changes: none
Release Notes: none

For #11427, and supersedes #12106
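The RangeTimer interface described in this PR might look roughly like the following minimal sketch. Names and signatures here are illustrative, not the actual Envoy API; `FakeRangeTimer` is a hypothetical stand-in added only to show the contract.

```cpp
#include <chrono>

// Minimal sketch of a range timer interface: the timer may fire at some
// implementation-chosen point within [min_duration, max_duration] from now.
// The interface deliberately does not specify where in the range the
// trigger lands.
class RangeTimer {
public:
  virtual ~RangeTimer() = default;
  virtual void enableTimer(std::chrono::milliseconds min_duration,
                           std::chrono::milliseconds max_duration) = 0;
  virtual void disableTimer() = 0;
  virtual bool enabled() = 0;
};

// Trivial stand-in used only to illustrate the contract; a real
// implementation would schedule against a dispatcher.
class FakeRangeTimer : public RangeTimer {
public:
  void enableTimer(std::chrono::milliseconds /*min*/,
                   std::chrono::milliseconds /*max*/) override {
    enabled_ = true;
  }
  void disableTimer() override { enabled_ = false; }
  bool enabled() override { return enabled_; }

private:
  bool enabled_{false};
};
```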

akonradi added 4 commits July 15, 2020 11:50
Add an interface for timers that can be enabled for some timeout within
a range. The actual choice of when the timer should be triggered is
not specified by the interface. Possible implementations of the
interface include

 - timers that are triggered when a thread is not busy
 - timers that are triggered early/late in response to load

Signed-off-by: Alex Konradi <akonradi@google.com>
Signed-off-by: Alex Konradi <akonradi@google.com>
Signed-off-by: Alex Konradi <akonradi@google.com>
This adds a ScaledRangeTimerManager class that can construct RangeTimer
objects backed by a provided dispatcher, and which contains an
adjustable scaling factor that is applied to all previously constructed
timers (that haven't been deleted).

The implementation uses buckets spaced at fixed exponentially increasing
time intervals to get constant time performance for all operations. The
trade-off is that for a timer scheduled to fire between [min, max], the
actual firing time is not `S * (max - min)` (where S is the scaling
factor), but `S * B_i`, where B_i is the interval of the largest bucket
less than `max - min`. Buckets are (as of now) spaced out at powers of
2, so a timer scheduled for [0, 10] might actually fire at `S * 8`.
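The quantization trade-off described in this commit message can be sketched as follows. This is illustrative, not the PR's actual code: `bucketedDuration` is a hypothetical helper assuming power-of-2 buckets and millisecond granularity.

```cpp
#include <chrono>
#include <cmath>

// With buckets at powers of 2, a timer's effective scaled duration is
// S * B_i, where B_i is the largest power-of-2 interval not exceeding the
// requested window. A [0, 10] window therefore quantizes down to 8ms.
inline std::chrono::milliseconds
bucketedDuration(std::chrono::milliseconds window) {
  if (window.count() <= 1) {
    return std::chrono::milliseconds(1);
  }
  const int index = static_cast<int>(
      std::floor(std::log2(static_cast<double>(window.count()))));
  return std::chrono::milliseconds(1LL << index);
}
```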

Signed-off-by: Alex Konradi <akonradi@google.com>
Signed-off-by: Alex Konradi <akonradi@google.com>
Contributor

@antoniovicente antoniovicente left a comment


Thanks for taking on this important work.

* - pending: enabled, min timeout not elapsed
* - active: enabled, min timeout elapsed, max timeout not elapsed
*/
class ScaledRangeTimerManager::ScaledRangeTimer final : public RangeTimer {
Contributor


I think this override should be named RangeTimerImpl, for consistency with other Envoy classes that implement an interface.

namespace Envoy {
namespace Event {

class ScaledRangeTimerManager {
Contributor


Do we need an interface for this concrete class?

How will connections create instances of RangeTimer?

Contributor Author


My plan is to have the ThreadLocalOverloadStateImpl construct instances of this class if the to-be-added action is configured. I don't know that we need an interface, but it might be useful for unit testing of the OverloadManagerImpl. Thoughts?

Outdated comment threads: one on include/envoy/event/timer.h, six on source/common/event/scaled_range_timer.cc.
const int index =
max_duration <= minimum_duration_
? 0
: std::floor(std::log(max_duration.count()) / std::log(kBucketScaleFactor) -
Contributor


There may be a bug here related to the difference in units between max_duration and minimum_duration_.

This computation only makes sense if max_duration is in milliseconds. You should accept a millisecond duration instead of a generic duration.

Please add tests to cover the function that computes a bucket index from an input duration.

Contributor Author

@akonradi akonradi Aug 4, 2020


Yeah this is a major bug, good catch. I have been trying to use MonotonicTime::duration everywhere in the manager, but I missed minimum_duration_. Now fixed, and I'm hesitant to add tests since this is an internal function.
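The unit-mixing fix being discussed could look like this sketch, which normalizes both operands to the same representation before taking the log ratio. The helper name and scale factor are illustrative, not the PR's actual code.

```cpp
#include <chrono>
#include <cmath>

constexpr double kBucketScaleFactor = 2.0; // illustrative

// Compute a bucket index from a duration. Because both counts are in
// milliseconds, the ratio (and thus its log) is unit-safe.
inline int bucketIndex(std::chrono::milliseconds max_duration,
                       std::chrono::milliseconds minimum_duration) {
  if (max_duration <= minimum_duration) {
    return 0;
  }
  return static_cast<int>(std::floor(
      std::log(static_cast<double>(max_duration.count()) /
               static_cast<double>(minimum_duration.count())) /
      std::log(kBucketScaleFactor)));
}
```

Taking the ratio of two like-typed durations, rather than `log(max) - log(min)` on raw counts of possibly different units, is what makes the computation robust to the representation of the inputs.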

Addresses some review feedback

Signed-off-by: Alex Konradi <akonradi@google.com>
Signed-off-by: Alex Konradi <akonradi@google.com>
This prevents timer starvation, which could happen when new timers are
repeatedly enabled.

Signed-off-by: Alex Konradi <akonradi@google.com>
Signed-off-by: Alex Konradi <akonradi@google.com>
ScaledRangeTimerManager::BucketEnabledList::iterator
ScaledRangeTimerManager::add(RangeTimerImpl& timer, const MonotonicTime::duration max_duration) {
Bucket& bucket = getOrCreateBucket(max_duration);
const auto quantized_duration = getBucketedDuration(max_duration);
Contributor


What guarantees are you providing with your implementation? I thought we wanted to prefer having timers fire close to the configured max time when the overload factor is 0.0, scale factor is 1.0

Contributor Author


Yeah, this is something I was hoping to hash out here on the PR. Here are a few options, most of which we've discussed:

  1. [as implemented] adjust max duration based on bucket so the timer fires on time for S=0 and early for S=1, but within the window
  2. adjust min duration for the wait-for-min phase so that max-min is a bucket boundary; this will fire late for S=0 but within the window, and on-time for S=1
  3. switch between these two by looking at the current scaling factor when enableTimer is called; this works really well if the scale factor doesn't change too much
  4. use more buckets by decreasing the exponentiation factor

I'm a fan of option 2 over 1 since it is perfectly accurate for S=1, which preserves the existing behavior for proxies that don't change the default. For S=0, the timer might fire anywhere between min and min + (max - min)/2. We can do better than that by adding in option 4.

I don't know what to think about option 3. It seems like an extra complication that relies on heuristics for correctness. Maybe worth implementing later but I don't think we need it.

@antoniovicente thoughts? If this seems okay, I'll go ahead with switching from 1 to 2 and reducing the exponentiation factor.
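Option 2 above can be sketched as follows (a hypothetical helper, assuming power-of-2 buckets; not code from the PR): round the window down to a bucket boundary, then raise min so max stays exact.

```cpp
#include <chrono>
#include <cmath>

struct AdjustedRange {
  std::chrono::milliseconds min;
  std::chrono::milliseconds max;
};

// Round (max - min) down to a power-of-2 bucket boundary, then raise min so
// the timer still fires exactly at max when the scale factor S is 1. For
// S = 0 the timer fires later than min, but still within [min, max].
inline AdjustedRange adjustForBucket(std::chrono::milliseconds min,
                                     std::chrono::milliseconds max) {
  const long long window = (max - min).count();
  if (window <= 1) {
    return {min, max};
  }
  const long long bucket =
      1LL << static_cast<int>(std::floor(std::log2(static_cast<double>(window))));
  return {max - std::chrono::milliseconds(bucket), max};
}
```

For example, a [0, 10] range quantizes to the 8ms bucket, so the adjusted range becomes [2, 10]: exact at max for S=1, and at worst 2ms late relative to min for S=0.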

Contributor


I'm not seeing the following option among the ones listed above:
Add timers based on max_configured-min_configured to a bucket. Use a priority queue or multiset to keep the timers sorted by max expiry time. Expire the timer with the smallest max expiry time using the following formula:
expiry_time - overload_factor * (max_configured-min_configured) <= current_time

Properties of the algorithm above include: timers fire at the right time for S=1. Also, timers fire at the right time when scaling if all entries in the bucket have the same (max-min).

I'm not sure how to get other people to look over these proposals. I think that I'm currently the only person reviewing this PR. It may be necessary to move the conversation back to the associated issue or a design doc.
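The alternative proposed in this comment, keeping timers sorted by max expiry time and expiring against the stated formula, might be sketched like this (names are hypothetical, not the PR's code):

```cpp
#include <chrono>
#include <set>

using Clock = std::chrono::steady_clock;

struct ScaledEntry {
  Clock::time_point expiry_time;    // when the timer would fire unscaled (S = 1)
  std::chrono::milliseconds window; // max_configured - min_configured
  bool operator<(const ScaledEntry& other) const {
    return expiry_time < other.expiry_time;
  }
};

// Expire the front (smallest max expiry) entry once
//   expiry_time - overload_factor * (max_configured - min_configured) <= now
// With overload_factor 0 this fires exactly at max; with overload_factor 1
// it fires at min.
inline bool shouldExpireFront(const std::multiset<ScaledEntry>& bucket,
                              Clock::time_point now, double overload_factor) {
  if (bucket.empty()) {
    return false;
  }
  const ScaledEntry& front = *bucket.begin();
  const auto scaled_back =
      std::chrono::duration_cast<Clock::duration>(overload_factor * front.window);
  return front.expiry_time - scaled_back <= now;
}
```

As the comment notes, this fires at the right time for S=1 in general, and at the right time under scaling whenever all entries in a bucket share the same (max - min).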


const MonotonicTime now = manager_.dispatcher_.timeSource().monotonicTime();
auto it = manager_.add(*this, pending.latest_trigger - now);
state_.emplace<Active>(pending.latest_trigger - now, it);
Contributor


My recommendation is to prefer using max_configured - min_configured, which seems more deterministic because it avoids computations based on approximateMonotonicTime. If epoll durations and delays are high, we have bigger problems.

Doing a delta computation based on Dispatcher::approximateMonotonicTime effectively mixes time sources, since timer registration internally uses a different monotonic timer (event_base_gettimeofday_cached). Mixing time sources adds inaccuracy, and the result is not obviously better than max_configured - min_configured.

Also use the pending timer callback to execute the trigger callback if
the min and max times are the same.

Signed-off-by: Alex Konradi <akonradi@google.com>
When the scaling factor is 1, timers will run for their full maximum
duration, at the cost of running for longer than their minimum when the
scaling factor is 0. This biases timers towards the non-overload case.

Signed-off-by: Alex Konradi <akonradi@google.com>
This increases timer accuracy at the cost of doubling the number of
buckets.

Signed-off-by: Alex Konradi <akonradi@google.com>
};

static constexpr int kBucketScaleFactor = 2;
static constexpr float kBucketScaleFactor = 1.41421356237; // = sqrt(2)
Contributor


You may want to use log base 2, and then multiply the log result by some number. Multiplying by 2 would be equivalent to using sqrt(2) as the base, multiplying by 0.5 would be equivalent to using a log base of 4.
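The suggested log-base trick can be sketched as follows (illustrative helpers, not the PR's code): scaling the result of log2 is equivalent to changing the log base, so one log2 call covers any power-of-2-derived base.

```cpp
#include <cmath>

// log_{sqrt(2)}(x) == 2 * log2(x), since sqrt(2) = 2^(1/2).
inline int bucketIndexBaseSqrt2(double window_ms) {
  return static_cast<int>(std::floor(2.0 * std::log2(window_ms)));
}

// log_4(x) == 0.5 * log2(x), since 4 = 2^2.
inline int bucketIndexBase4(double window_ms) {
  return static_cast<int>(std::floor(0.5 * std::log2(window_ms)));
}
```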

// bucket `bucket_handle`.
if (wait_time <= std::chrono::milliseconds::zero()) {
auto it = manager_.add(*this, bucket_handle);
state_.emplace<ScalingMax>(bucket_handle, it);
Contributor


I encourage you to seek feedback from others on this comment thread: #12414 (comment)

Signed-off-by: Alex Konradi <akonradi@google.com>
@antoniovicente
Contributor

/wait

@stale

stale Bot commented Aug 29, 2020

This pull request has been automatically marked as stale because it has not had activity in the last 7 days. It will be closed in 7 days if no further activity occurs. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions!

The stale bot added the stale (stalebot believes this issue/PR has not been touched recently) label on Aug 29, 2020.
@stale

stale Bot commented Sep 5, 2020

This pull request has been automatically closed because it has not had activity in the last 14 days. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions!

The stale bot closed this on Sep 5, 2020.
Labels

stale (stalebot believes this issue/PR has not been touched recently), waiting

2 participants