Enable recall/percentile latency charts on results by filipecosta90 · Pull Request #300 · erikbern/ann-benchmarks

filipecosta90 · 2022-05-30T22:46:44Z

This PR enables latency-by-percentile analysis on the generated results website for a set of common percentiles: p50, p95, p99, and p999, i.e. the median and the tail latencies. In other words, they are the latency thresholds below which 50%, 95%, 99%, and 99.9% of queries complete.

latency and qps analysis -- and why we need both

For non-normal distributions, such as latency, many “basic rules” of normally distributed statistics are violated. Instead of computing just the mean (or a single number inversely related to it, such as qps), which tries to express the whole distribution in a single result, we can sample the distribution at intervals -- percentiles, which tell you how many requests would actually experience a given delay.
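To make the point above concrete, here is a minimal sketch (not code from this PR) that compares the mean with nearest-rank percentiles on a skewed sample. The latency values and the helper name `nearest_rank_percentile` are made up for illustration.

```python
import math
import statistics

# Hypothetical per-query latencies in milliseconds:
# 98 fast queries plus two slow outliers.
latencies_ms = [1.0] * 98 + [50.0, 100.0]


def nearest_rank_percentile(samples, pct):
    """Return the smallest sample such that at least pct% of samples are <= it."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


mean_ms = statistics.fmean(latencies_ms)
# Throughput of a single sequential client implied by the mean latency.
qps_single_client = 1000.0 / mean_ms

p50 = nearest_rank_percentile(latencies_ms, 50)
p95 = nearest_rank_percentile(latencies_ms, 95)
p99 = nearest_rank_percentile(latencies_ms, 99)

print(f"mean={mean_ms:.2f} ms  qps={qps_single_client:.1f}")
print(f"p50={p50} ms  p95={p95} ms  p99={p99} ms")
```

Here the mean (2.48 ms) is more than twice the median (1 ms), and neither reveals that the slowest 2% of queries take 50 ms or more; p99 does.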

Further notes

I believe that as soon as this benchmark utility allows for extended latency tracking/analysis, further questions/requests will arise, for example allowing constant-throughput benchmarks (but first things first).

If you want to dive deeper into extended latency analysis, here's a list of further references (documentation, projects, and presentations) on the subject:

maumueller · 2022-06-01T07:03:52Z

Thanks!

Some general observations:

The individual running times are part of the result hdf5 file. (It's the 'times' group.) I don't see why you should compute this during query time.
I would prefer using numpy.percentile instead of another dependency. Am I missing why it doesn't work?

filipecosta90 · 2022-06-01T07:52:14Z

WRT:

1. The individual running times are part of the result hdf5 file. (It's the `'times'` group.) I don't see why you should compute this during query time.

I missed that. Will make the change 👍

WRT:

2. I would prefer using `numpy.percentile` instead of another dependency. Am I missing why it doesn't work?

It will work, and for this case (a small number of total observations) a sketching data structure might not pay off, given that, as you mentioned above, you already preserve all latencies. It makes sense to use hdrhistogram or any other approximate sketch when you don't want to keep the entire data set. Given that the entire data set is kept, there is no reason to use it. Will swap to numpy.percentile over the times group.
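For reference, a minimal sketch of what computing percentiles over the stored latencies could look like with `numpy.percentile`. This is not the code merged in this PR: the function name `latency_percentiles`, the assumption that times are stored in seconds, and the synthetic input are all illustrative. In practice the array would be read from the 'times' entry of the result hdf5 file.

```python
import numpy as np


def latency_percentiles(times_s, percentiles=(50, 95, 99, 99.9)):
    """Map labels such as 'p50' or 'p999' to latencies in milliseconds.

    times_s is assumed to hold one latency per query, in seconds.
    """
    times_ms = np.asarray(times_s, dtype=float) * 1000.0
    values = np.percentile(times_ms, percentiles)
    # 99.9 becomes the label 'p999', 50 becomes 'p50', and so on.
    labels = [f"p{str(p).replace('.', '')}" for p in percentiles]
    return {label: float(v) for label, v in zip(labels, values)}


# Synthetic stand-in for the stored per-query times: 1 ms, 2 ms, ..., 1000 ms.
times_s = np.arange(1, 1001) / 1000.0
result = latency_percentiles(times_s)
for label, value_ms in result.items():
    print(f"{label}: {value_ms:.2f} ms")
```

Because every observation is kept, the exact computation is cheap here; a sketch such as hdrhistogram only becomes attractive when the raw samples are not retained.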

maumueller · 2022-06-01T07:55:17Z

Thanks!

filipecosta90 · 2022-06-04T19:56:18Z

Thanks!

@maumueller I've updated the PR based upon your recommendations. Ready for review =)

erikbern · 2022-06-05T03:53:48Z

Interesting change!

Do you have results for this? My guess would be that you don't have the power-law type fat-tailed latencies you typically see in distributed systems, but that the tails are fairly thin. But I could be wrong!

GuyAv46

Looks good. One fix

filipecosta90 · 2022-06-06T20:45:34Z

Interesting change!

Do you have results for this? My guess would be that you don't have the power-law type fat-tailed latencies you typically see in distributed systems, but that the tails are fairly thin. But I could be wrong!

Given that long tails can be caused by resource saturation, queuing, etc., I believe we will still see long tails -- and there is no harm there :). However, one favorable thing is that I think we will avoid multi-modal latency distributions, given that we're targeting the same type of operation throughout.

maumueller · 2022-07-11T08:41:39Z

Sorry, @filipecosta90, I lost track of it. PR looks good and I merged it. Thanks for the contribution.

filipecosta90 force-pushed the recall.percentiles branch from 51d412c to 0cdb68b June 4, 2022 19:34

Enable recall/latency charts on results

d3dc75d

filipecosta90 force-pushed the recall.percentiles branch from aa43352 to d3dc75d June 4, 2022 19:55

filipecosta90 mentioned this pull request Jun 4, 2022

CP for #300: Enable recall/latency charts on results RedisAI/ann-benchmarks#26

Merged

GuyAv46 reviewed Jun 6, 2022

Fixes per PR review

5e98754

filipecosta90 added a commit to RedisAI/ann-benchmarks that referenced this pull request Jun 9, 2022

Merge pull request #26 from RedisAI/multiclient_latencies

46c5430

CP for erikbern#300: Enable recall/latency charts on results

maumueller merged commit 4387b94 into erikbern:master Jul 11, 2022

filipecosta90 deleted the recall.percentiles branch July 11, 2022 11:46

erikbern pushed a commit that referenced this pull request Apr 14, 2023

Enable recall/percentile latency charts on results (#300)

842b5a3

* Enable recall/latency charts on results
* Fixes per PR review

maumueller merged 2 commits into erikbern:master from RedisAI:recall.percentiles


4 participants
