Skip to content

stats: Reduce stat memory by removing duplicate string storage in map keys.#4711

Closed
jmarantz wants to merge 40 commits intoenvoyproxy:masterfrom
jmarantz:no-dup-strings-in-map-keys
Closed

stats: Reduce stat memory by removing duplicate string storage in map keys.#4711
jmarantz wants to merge 40 commits intoenvoyproxy:masterfrom
jmarantz:no-dup-strings-in-map-keys

Conversation

@jmarantz
Copy link
Copy Markdown
Contributor

@jmarantz jmarantz commented Oct 13, 2018

**Note: this code is ready IMO, but #4714 should be checked in first to get a baseline. **

Description: A significant amount of storage is wasted in the maps in ThreadLocalStore with copies of the strings. This changes the maps to use const char* keys, which are held in every form of stat. This is one step toward resolving #4196. It reduces memory from 68k/cluster to 54k/cluster in the Istio test referenced #4196.

A significant follow-up to this would be to tackle the redundant storage for tagExtractedName and TagVector.

Risk Level: low
Testing: //test/...
Docs Changes: n/a
Release Notes:n/a

Signed-off-by: Joshua Marantz <jmarantz@google.com>
Comment thread source/common/stats/heap_stat_data.cc
Comment thread source/common/common/hash.h
… as a key.

Also reduce HeapStatAllocator key overhead by 8 bytes by just using the char*.

Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
…ction boundaries around HeapStatData construction/destruction.

Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
…mbol-table mem.

Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
@jmarantz jmarantz changed the title stat: Reduce stat memory by removing duplicate string storage in map keys. WIP stat: Reduce stat memory by removing duplicate string storage in map keys. Oct 13, 2018
Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
Copy link
Copy Markdown
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool stuff. Two comments from a quick skim.

Comment thread source/common/stats/thread_local_store.h
Comment thread include/envoy/stats/stats.h
@jmarantz jmarantz changed the title WIP stat: Reduce stat memory by removing duplicate string storage in map keys. WIP stats: Reduce stat memory by removing duplicate string storage in map keys. Oct 13, 2018
Signed-off-by: Joshua Marantz <jmarantz@google.com>
@jmarantz
Copy link
Copy Markdown
Contributor Author

Closing temporarily until (a) #4714 can be merged and (b) I write a stats impl overview per @mattklein123's suggestion above.

…tats are robust at runtime.

Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
htuch pushed a commit that referenced this pull request Oct 16, 2018
…ce/doc/stats/md (#4729)

per @mattklein123 's suggestion on #4711 , adds a stats design doc. In this rev it only pulls in material from a comment in thread_local_store.h. A few TBD sections were added to suggest structure, but the intent is not to fill them out in this PR, which just moves the existing material and adds an overview for context. Future stats-related PRs should include edits to this file to keep it up-to-date.

This is one step toward resolving #4196.

Risk Level: low
Testing: none
Docs Changes: this is a doc change.
Release Notes: n/a

Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
…tructure.

Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
@jmarantz jmarantz reopened this Oct 16, 2018
@jmarantz
Copy link
Copy Markdown
Contributor Author

Re-opening so we can discuss more about the code.

It would be nice to get #4741 and #4714 merged first and then this PR (once master-merged) will show more clearly the stats doc changes and the memory consumption win.

htuch pushed a commit that referenced this pull request Oct 17, 2018
… name structure (#4741)

Describe the current state of stat name representation, as a pre-cursor to #4711 which should be updated to modify the stat naming doc.

Risk Level: low
Testing: none
Docs Changes: n/a
Release Notes: n/a

Signed-off-by: Joshua Marantz <jmarantz@google.com>
soya3129 pushed a commit to soya3129/envoy that referenced this pull request Oct 19, 2018
…ce/doc/stats/md (envoyproxy#4729)

per @mattklein123 's suggestion on envoyproxy#4711 , adds a stats design doc. In this rev it only pulls in material from a comment in thread_local_store.h. A few TBD sections were added to suggest structure, but the intent is not to fill them out in this PR, which just moves the existing material and adds an overview for context. Future stats-related PRs should include edits to this file to keep it up-to-date.

This is one step toward resolving envoyproxy#4196.

Risk Level: low
Testing: none
Docs Changes: this is a doc change.
Release Notes: n/a

Signed-off-by: Joshua Marantz <jmarantz@google.com>

Signed-off-by: Yang Song <yasong@yasong00.cam.corp.google.com>
soya3129 pushed a commit to soya3129/envoy that referenced this pull request Oct 19, 2018
… name structure (envoyproxy#4741)

Describe the current state of stat name representation, as a pre-cursor to envoyproxy#4711 which should be updated to modify the stat naming doc.

Risk Level: low
Testing: none
Docs Changes: n/a
Release Notes: n/a

Signed-off-by: Joshua Marantz <jmarantz@google.com>

Signed-off-by: Yang Song <yasong@yasong00.cam.corp.google.com>
@mattklein123 mattklein123 self-assigned this Oct 23, 2018
Signed-off-by: Joshua Marantz <jmarantz@google.com>
…eflect intent.

Would appreciate suggestions on how to improve the name, if you have any.

Signed-off-by: Joshua Marantz <jmarantz@google.com>
…ings-in-map-keys

Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
@jmarantz jmarantz changed the title WIP stats: Reduce stat memory by removing duplicate string storage in map keys. stats: Reduce stat memory by removing duplicate string storage in map keys. Oct 25, 2018
Copy link
Copy Markdown
Contributor Author

@jmarantz jmarantz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ptal -- this now directly shows memory-reduction in unit-tests, as well as an associated edit of the stats design doc.

Copy link
Copy Markdown
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general this makes sense to me. However, do you want to get a speed benchmark in place before we merge this? Even for this change it seems worth it?

Comment thread source/common/stats/heap_stat_data.cc Outdated
auto data = std::make_unique<HeapStatData>(name);
// required to use this allocator. Note that data must be freed by calling
// its free() method, and not by destructing or unique_ptr. So this method
// cannot call anything that might throw, and thus it cannot through itself.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/through/throw

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no longer needed.

Comment thread source/common/stats/raw_stat_data.h Outdated
@@ -77,7 +77,7 @@ struct RawStatData {
/**
* Returns the name as a std::string.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: update

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done and switched to ersatz-doxygen here and elsewhere in the class.

Comment thread source/common/stats/heap_stat_data.cc Outdated
// required to use this allocator.
auto data = std::make_unique<HeapStatData>(name);
// required to use this allocator. Note that data must be freed by calling
// its free() method, and not by destructing or unique_ptr. So this method
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Up to you, but you could use CSmartPtr (search code) to use RAII around calling free().

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CSmartPtr doesn't quite work in this scenario, but unique_ptr does support a custom deletion functor so I just used it directly.

Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
@jmarantz jmarantz mentioned this pull request Oct 26, 2018
Signed-off-by: Joshua Marantz <jmarantz@google.com>
Copy link
Copy Markdown
Contributor Author

@jmarantz jmarantz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

perf test in #4867 and merged here, showing a slight degradation in performance with non TLS, and a slight increase in performance with TLS, though these could be due to transient noisy neighbors as these runs were not properly interleaved.

The big win in both perf and memory comes from combining this and flat-hash-map. @mattklein123 would you prefer I just issue a combined PR for those?

old:

--------------------------------------------------------------
Benchmark                       Time           CPU Iterations
--------------------------------------------------------------
BM_StatsNoTls_mean       16249144 ns   16248983 ns         37
BM_StatsNoTls_median     16176011 ns   16175443 ns         37
BM_StatsNoTls_stddev       484905 ns     484909 ns         37
BM_StatsWithTls_mean     20958468 ns   20958244 ns         30
BM_StatsWithTls_median   20899829 ns   20899217 ns         30
BM_StatsWithTls_stddev     576702 ns     576784 ns         30

new:

--------------------------------------------------------------
Benchmark                       Time           CPU Iterations
--------------------------------------------------------------
BM_StatsNoTls_mean       17098922 ns   17099054 ns         34
BM_StatsNoTls_median     17049225 ns   17049419 ns         34
BM_StatsNoTls_stddev       354827 ns     354787 ns         34
BM_StatsWithTls_mean     20593702 ns   20593854 ns         30
BM_StatsWithTls_median   20527260 ns   20527525 ns         30
BM_StatsWithTls_stddev     532875 ns     532881 ns         30

flat-hash-map (#4715)

--------------------------------------------------------------
Benchmark                       Time           CPU Iterations
--------------------------------------------------------------
BM_StatsNoTls_mean       13582401 ns   13582484 ns         40
BM_StatsNoTls_median     13519546 ns   13519464 ns         40
BM_StatsNoTls_stddev       269552 ns     269544 ns         40
BM_StatsWithTls_mean     18171743 ns   18171829 ns         28
BM_StatsWithTls_median   18116579 ns   18116550 ns         28
BM_StatsWithTls_stddev     257126 ns     257189 ns         28

combined (#4872)

--------------------------------------------------------------
Benchmark                       Time           CPU Iterations
--------------------------------------------------------------
BM_StatsNoTls_mean       13779759 ns   13779854 ns         39
BM_StatsNoTls_median     13630165 ns   13630325 ns         39
BM_StatsNoTls_stddev       576447 ns     576471 ns         39
BM_StatsWithTls_mean     16336826 ns   16336967 ns         35
BM_StatsWithTls_median   16240568 ns   16240563 ns         35
BM_StatsWithTls_stddev     544943 ns     544942 ns         35

Comment thread source/common/stats/heap_stat_data.cc Outdated
// required to use this allocator.
auto data = std::make_unique<HeapStatData>(name);
// required to use this allocator. Note that data must be freed by calling
// its free() method, and not by destructing or unique_ptr. So this method
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CSmartPtr doesn't quite work in this scenario, but unique_ptr does support a custom deletion functor so I just used it directly.

Comment thread source/common/stats/heap_stat_data.cc Outdated
auto data = std::make_unique<HeapStatData>(name);
// required to use this allocator. Note that data must be freed by calling
// its free() method, and not by destructing or unique_ptr. So this method
// cannot call anything that might throw, and thus it cannot through itself.
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no longer needed.

Comment thread source/common/stats/raw_stat_data.h Outdated
@@ -77,7 +77,7 @@ struct RawStatData {
/**
* Returns the name as a std::string.
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done and switched to ersatz-doxygen here and elsewhere in the class.

…ass method comments.

Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
@jmarantz
Copy link
Copy Markdown
Contributor Author

jmarantz commented Oct 26, 2018

I decided to close this for now in favor of #4872 which incorporates the flat_hash_map change. That avoids the risk that this PR on its own has of slightly degrading one axis of performance (though improving several others).

LMK if you still think this one makes sense to go in on its own; the argument for that is that this change is complicated but bounded in scope.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants