
event: scaled range bucket timer #12414

Closed
akonradi wants to merge 13 commits into envoyproxy:master from akonradi:scaled-range-bucket-timer

Conversation

@akonradi
Contributor

Commit Message: Add RangeTimer and scaling implementation
Additional Description:
Add an interface for timers that can be enabled for some timeout within a range. Also add an implementation where timers are created by a manager object which can efficiently scale their timeouts.

Risk Level: low - no functional changes
Testing: wrote and ran unit tests
Docs Changes: none
Release Notes: none

For #11427, and supersedes #12106
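The RangeTimer interface described in this PR might look roughly like the following minimal sketch. Names and signatures here are illustrative, not the actual Envoy API; `FakeRangeTimer` is a hypothetical stand-in added only to show the contract.

```cpp
#include <chrono>

// Minimal sketch of a range timer interface: the timer may fire at some
// implementation-chosen point within [min_duration, max_duration] from now.
// The interface deliberately does not specify where in the range the
// trigger lands.
class RangeTimer {
public:
  virtual ~RangeTimer() = default;
  virtual void enableTimer(std::chrono::milliseconds min_duration,
                           std::chrono::milliseconds max_duration) = 0;
  virtual void disableTimer() = 0;
  virtual bool enabled() = 0;
};

// Trivial stand-in used only to illustrate the contract; a real
// implementation would schedule against a dispatcher.
class FakeRangeTimer : public RangeTimer {
public:
  void enableTimer(std::chrono::milliseconds /*min*/,
                   std::chrono::milliseconds /*max*/) override {
    enabled_ = true;
  }
  void disableTimer() override { enabled_ = false; }
  bool enabled() override { return enabled_; }

private:
  bool enabled_{false};
};
```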

akonradi added 4 commits July 15, 2020 11:50
Add an interface for timers that can be enabled for some timeout within
a range. The actual choice of when the timer should be triggered is
not specified by the interface. Possible implementations of the
interface include

 - timers that are triggered when a thread is not busy
 - timers that are triggered early/late in response to load

Signed-off-by: Alex Konradi <akonradi@google.com>
Signed-off-by: Alex Konradi <akonradi@google.com>
Signed-off-by: Alex Konradi <akonradi@google.com>
This adds a ScaledRangeTimerManager class that can construct RangeTimer
objects backed by a provided dispatcher, and which contains an
adjustable scaling factor that is applied to all previously constructed
timers (that haven't been deleted).

The implementation uses buckets spaced at fixed exponentially increasing
time intervals to get constant time performance for all operations. The
trade-off is that for a timer scheduled to fire between [min, max], the
actual firing time is not `S * (max - min)` (where S is the scaling
factor), but `S * B_i`, where B_i is the interval of the largest bucket
less than `max - min`. Buckets are (as of now) spaced out at powers of
2, so a timer scheduled for [0, 10] might actually fire at `S * 8`.
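The quantization trade-off described in this commit message can be sketched as follows. This is illustrative, not the PR's actual code: `bucketedDuration` is a hypothetical helper assuming power-of-2 buckets and millisecond granularity.

```cpp
#include <chrono>
#include <cmath>

// With buckets at powers of 2, a timer's effective scaled duration is
// S * B_i, where B_i is the largest power-of-2 interval not exceeding the
// requested window. A [0, 10] window therefore quantizes down to 8ms.
inline std::chrono::milliseconds
bucketedDuration(std::chrono::milliseconds window) {
  if (window.count() <= 1) {
    return std::chrono::milliseconds(1);
  }
  const int index = static_cast<int>(
      std::floor(std::log2(static_cast<double>(window.count()))));
  return std::chrono::milliseconds(1LL << index);
}
```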

Signed-off-by: Alex Konradi <akonradi@google.com>
Signed-off-by: Alex Konradi <akonradi@google.com>
Contributor

@antoniovicente antoniovicente left a comment


Thanks for taking on this important work.

* - pending: enabled, min timeout not elapsed
* - active: enabled, min timeout elapsed, max timeout not elapsed
*/
class ScaledRangeTimerManager::ScaledRangeTimer final : public RangeTimer {
Contributor


I think this override should be named RangeTimerImpl, for consistency with other Envoy classes that implement an interface.

namespace Envoy {
namespace Event {

class ScaledRangeTimerManager {
Contributor


Do we need an interface for this concrete class?

How will connections create instances of RangeTimer?

Contributor Author


My plan is to have the ThreadLocalOverloadStateImpl construct instances of this class if the to-be-added action is configured. I don't know that we need an interface, but it might be useful for unit testing of the OverloadManagerImpl. Thoughts?

Outdated comment threads: one on include/envoy/event/timer.h, six on source/common/event/scaled_range_timer.cc.
const int index =
max_duration <= minimum_duration_
? 0
: std::floor(std::log(max_duration.count()) / std::log(kBucketScaleFactor) -
Contributor


There may be a bug here related to the difference in units between max_duration and minimum_duration_.

This computation only makes sense if max_duration is in milliseconds. You should accept a millisecond duration instead of a generic duration.

Please add tests to cover the function that computes a bucket index from an input duration.

Contributor Author

@akonradi akonradi Aug 4, 2020


Yeah this is a major bug, good catch. I have been trying to use MonotonicTime::duration everywhere in the manager, but I missed minimum_duration_. Now fixed, and I'm hesitant to add tests since this is an internal function.
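The unit-mixing fix being discussed could look like this sketch, which normalizes both operands to the same representation before taking the log ratio. The helper name and scale factor are illustrative, not the PR's actual code.

```cpp
#include <chrono>
#include <cmath>

constexpr double kBucketScaleFactor = 2.0; // illustrative

// Compute a bucket index from a duration. Because both counts are in
// milliseconds, the ratio (and thus its log) is unit-safe.
inline int bucketIndex(std::chrono::milliseconds max_duration,
                       std::chrono::milliseconds minimum_duration) {
  if (max_duration <= minimum_duration) {
    return 0;
  }
  return static_cast<int>(std::floor(
      std::log(static_cast<double>(max_duration.count()) /
               static_cast<double>(minimum_duration.count())) /
      std::log(kBucketScaleFactor)));
}
```

Taking the ratio of two like-typed durations, rather than `log(max) - log(min)` on raw counts of possibly different units, is what makes the computation robust to the representation of the inputs.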

Addresses some review feedback

Signed-off-by: Alex Konradi <akonradi@google.com>
Signed-off-by: Alex Konradi <akonradi@google.com>
This prevents timer starvation, which could happen when new timers are
repeatedly enabled.

Signed-off-by: Alex Konradi <akonradi@google.com>
Signed-off-by: Alex Konradi <akonradi@google.com>
ScaledRangeTimerManager::BucketEnabledList::iterator
ScaledRangeTimerManager::add(RangeTimerImpl& timer, const MonotonicTime::duration max_duration) {
Bucket& bucket = getOrCreateBucket(max_duration);
const auto quantized_duration = getBucketedDuration(max_duration);
Contributor


What guarantees are you providing with your implementation? I thought we wanted to prefer having timers fire close to the configured max time when the overload factor is 0.0, scale factor is 1.0

Contributor Author


Yeah, this is something I was hoping to hash out here on the PR. Here are a few options, most of which we've discussed:

  1. [as implemented] adjust max duration based on bucket so the timer fires on time for S=0 and early for S=1, but within the window
  2. adjust min duration for the wait-for-min phase so that max-min is a bucket boundary; this will fire late for S=0 but within the window, and on-time for S=1
  3. switch between these two by looking at the current scaling factor when enableTimer is called; this works really well if the scale factor doesn't change too much
  4. use more buckets by decreasing the exponentiation factor

I'm a fan of option 2 over 1 since it is perfectly accurate for S=1, which preserves the existing behavior for proxies that don't change the default. For S=0, the timer might fire anywhere between min and min + (max - min)/2. We can do better than that by adding in option 4.

I don't know what to think about option 3. It seems like an extra complication that relies on heuristics for correctness. Maybe worth implementing later but I don't think we need it.

@antoniovicente thoughts? If this seems okay, I'll go ahead with switching from 1 to 2 and reducing the exponentiation factor.
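Option 2 above can be sketched as follows (a hypothetical helper, assuming power-of-2 buckets; not code from the PR): round the window down to a bucket boundary, then raise min so max stays exact.

```cpp
#include <chrono>
#include <cmath>

struct AdjustedRange {
  std::chrono::milliseconds min;
  std::chrono::milliseconds max;
};

// Round (max - min) down to a power-of-2 bucket boundary, then raise min so
// the timer still fires exactly at max when the scale factor S is 1. For
// S = 0 the timer fires later than min, but still within [min, max].
inline AdjustedRange adjustForBucket(std::chrono::milliseconds min,
                                     std::chrono::milliseconds max) {
  const long long window = (max - min).count();
  if (window <= 1) {
    return {min, max};
  }
  const long long bucket =
      1LL << static_cast<int>(std::floor(std::log2(static_cast<double>(window))));
  return {max - std::chrono::milliseconds(bucket), max};
}
```

For example, a [0, 10] range quantizes to the 8ms bucket, so the adjusted range becomes [2, 10]: exact at max for S=1, and at worst 2ms late relative to min for S=0.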

Contributor


I'm not seeing the following option among the ones listed above:
Add timers based on max_configured-min_configured to a bucket. Use a priority queue or multiset to keep the timers sorted by max expiry time. Expire the timer with the smallest max expiry time using the following formula:
expiry_time - overload_factor * (max_configured-min_configured) <= current_time

Properties of the algorithm above include: timers fire at the right time for S=1. Also, timers fire at the right time when scaling if all entries in the bucket have the same (max-min).

I'm not sure how to get other people to look over these proposals. I think that I'm currently the only person reviewing this PR. It may be necessary to move the conversation back to the associated issue or a design doc.
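The alternative proposed in this comment, keeping timers sorted by max expiry time and expiring against the stated formula, might be sketched like this (names are hypothetical, not the PR's code):

```cpp
#include <chrono>
#include <set>

using Clock = std::chrono::steady_clock;

struct ScaledEntry {
  Clock::time_point expiry_time;    // when the timer would fire unscaled (S = 1)
  std::chrono::milliseconds window; // max_configured - min_configured
  bool operator<(const ScaledEntry& other) const {
    return expiry_time < other.expiry_time;
  }
};

// Expire the front (smallest max expiry) entry once
//   expiry_time - overload_factor * (max_configured - min_configured) <= now
// With overload_factor 0 this fires exactly at max; with overload_factor 1
// it fires at min.
inline bool shouldExpireFront(const std::multiset<ScaledEntry>& bucket,
                              Clock::time_point now, double overload_factor) {
  if (bucket.empty()) {
    return false;
  }
  const ScaledEntry& front = *bucket.begin();
  const auto scaled_back =
      std::chrono::duration_cast<Clock::duration>(overload_factor * front.window);
  return front.expiry_time - scaled_back <= now;
}
```

As the comment notes, this fires at the right time for S=1 in general, and at the right time under scaling whenever all entries in a bucket share the same (max - min).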


const MonotonicTime now = manager_.dispatcher_.timeSource().monotonicTime();
auto it = manager_.add(*this, pending.latest_trigger - now);
state_.emplace<Active>(pending.latest_trigger - now, it);
Contributor


My recommendation is to prefer using max_configured - min_configured, which seems more deterministic because it avoids computations based on approximateMonotonicTime. If epoll durations and delays are high, we have bigger problems.

Doing a delta computation based on Dispatcher::approximateMonotonicTime effectively mixes time sources, since timer registration internally uses a different monotonic timer (event_base_gettimeofday_cached). Mixing time sources adds inaccuracy, and the result is not obviously better than max_configured - min_configured.

Also use the pending timer callback to execute the trigger callback if
the min and max times are the same.

Signed-off-by: Alex Konradi <akonradi@google.com>
When the scaling factor is 1, timers will run for their full maximum
duration, at the cost of running for longer than their minimum when the
scaling factor is 0. This biases timers towards the non-overload case.

Signed-off-by: Alex Konradi <akonradi@google.com>
This increases timer accuracy at the cost of doubling the number of
buckets.

Signed-off-by: Alex Konradi <akonradi@google.com>
};

static constexpr int kBucketScaleFactor = 2;
static constexpr float kBucketScaleFactor = 1.41421356237; // = sqrt(2)
Contributor


You may want to use log base 2, and then multiply the log result by some number. Multiplying by 2 would be equivalent to using sqrt(2) as the base, multiplying by 0.5 would be equivalent to using a log base of 4.
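The suggested log-base trick can be sketched as follows (illustrative helpers, not the PR's code): scaling the result of log2 is equivalent to changing the log base, so one log2 call covers any power-of-2-derived base.

```cpp
#include <cmath>

// log_{sqrt(2)}(x) == 2 * log2(x), since sqrt(2) = 2^(1/2).
inline int bucketIndexBaseSqrt2(double window_ms) {
  return static_cast<int>(std::floor(2.0 * std::log2(window_ms)));
}

// log_4(x) == 0.5 * log2(x), since 4 = 2^2.
inline int bucketIndexBase4(double window_ms) {
  return static_cast<int>(std::floor(0.5 * std::log2(window_ms)));
}
```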

// bucket `bucket_handle`.
if (wait_time <= std::chrono::milliseconds::zero()) {
auto it = manager_.add(*this, bucket_handle);
state_.emplace<ScalingMax>(bucket_handle, it);
Contributor


I encourage you to seek feedback from others on this comment thread: #12414 (comment)

Signed-off-by: Alex Konradi <akonradi@google.com>
@antoniovicente
Contributor

/wait

@stale

stale Bot commented Aug 29, 2020

This pull request has been automatically marked as stale because it has not had activity in the last 7 days. It will be closed in 7 days if no further activity occurs. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions!

The stale bot added the stale (stalebot believes this issue/PR has not been touched recently) label on Aug 29, 2020.
@stale

stale Bot commented Sep 5, 2020

This pull request has been automatically closed because it has not had activity in the last 14 days. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions!

The stale bot closed this on Sep 5, 2020.
Labels

stale (stalebot believes this issue/PR has not been touched recently), waiting

2 participants