Skip to content

common inline map registry support#27928

Closed
wbpcode wants to merge 39 commits into
envoyproxy:mainfrom
wbpcode:dev-speed-up-map
Closed

common inline map registry support#27928
wbpcode wants to merge 39 commits into
envoyproxy:mainfrom
wbpcode:dev-speed-up-map

Conversation

@wbpcode
Copy link
Copy Markdown
Member

@wbpcode wbpcode commented Jun 13, 2023

Commit Message: common inline map registry support
Additional Description:

To close #27794.

This PR added a new template class InlineMapRegistry. This template could be used as an alternative of normal hash map to store the key/value pairs. But by this template, you can register some frequently used keys as inline keys and get the
handle for the key. Then you can use the handle to access the key/value pair in the map without even one time hash searching.

This is useful when some keys are frequently used and the keys are known at compile time or bootstrap time. You can get superior performance by using the inline handle. For example, the filter state always uses the filter name as the key and the filter name is known at compile time. By using the inline handle, the filter state could get the key/value pair without any hash.

This template also provides the interface to access the key/value pair by the normal key and the interface has similar searching performance as the normal hash map. But the insertion and removal by the normal key is slower than the normal hash map.
The Scope template parameter is used to distinguish different inline map registry. Different Scope will have different inline key registrations and different scope id.

These is usage example:

  // Define a scope type.
  class ExampleScope {
  public:
    static absl::string_view name() { return "ExampleScope"; }
  };

  // Register the inline key. We should never do this after bootstrapping. Typically, we can
  // do this by define a global variable.
  auto inline_handle = InlineMapRegistry<ExampleScope>::registerInlineKey("inline_key");

  // Create the inline map.
  auto inline_map = InlineMapRegistry<ExampleScope>::InlineMap<std::string>::createInlineMap();

  // Get value by inline handle.
  inline_map->insert(inline_handle, std::make_unique<std::string>("value"));
  EXPECT_EQ(*inline_map->lookup(inline_handle), "value");

Risk Level: low.
Testing: n/a.
Docs Changes: n/a.
Release Notes: n/a.
Platform Specific Features: n/a.

Signed-off-by: wbpcode <wangbaiping@corp.netease.com>
@wbpcode
Copy link
Copy Markdown
Member Author

wbpcode commented Jun 13, 2023

Additional test are necessary.

@wbpcode
Copy link
Copy Markdown
Member Author

wbpcode commented Jun 13, 2023

A quick benchmark of find with 200 short string keys. If normal string key is used to search inline entry, then find of inline map will be slower than the normal map. The result is as expected because inline map may do two hash searching (one searching to ensure if it's inline key, one searching for underlay map) for once find or lookup.

If inline handle is used, the inline map is super fast. I am a little doubt this result (it's too fast) so I will write some more unit test to ensure everything is right.

---------------------------------------------------------------
Benchmark                     Time             CPU   Iterations
---------------------------------------------------------------
BM_InlineMapFind/0/0       5493 ns         5493 ns       130140  
BM_InlineMapFind/0/0       5442 ns         5442 ns       121725  # Normal map, normal key, call `find` by string.
BM_InlineMapFind/1/0       5457 ns         5457 ns       130060  # Inline map, normal key, call `find` by string.
BM_InlineMapFind/2/0       6855 ns         6855 ns       102479  # Inline map, inline key, call `find` by string.
BM_InlineMapFind/2/1      0.000 ns        0.000 ns   1000000000  # Inline map, inline key, call `find` by inline handle.

wbpcode added 2 commits June 13, 2023 04:01
Signed-off-by: wbpcode <wangbaiping@corp.netease.com>
Signed-off-by: wbpcode <wangbaiping@corp.netease.com>
@jmarantz
Copy link
Copy Markdown
Contributor

Can you write more detail in the PR description and class comments when this should be used vs normal maps, including rough performance expectations?

Also, I think this might have been inspired by the HeaderMap implementation; should the HeaderMap implementation be migrated to use this shared infrastructure? Are there pros/cons?

@jmarantz
Copy link
Copy Markdown
Contributor

/wait

@wbpcode
Copy link
Copy Markdown
Member Author

wbpcode commented Jun 14, 2023

Can you write more detail in the PR description and class comments when this should be used vs normal maps, including rough performance expectations?

I will do it after I complete the unit tests.

Also, I think this might have been inspired by the HeaderMap implementation; should the HeaderMap implementation be migrated to use this shared infrastructure? Are there pros/cons?

The header map is a little complex. It's impossible to replace the current header map impl in short term (I think it's a long term target).

The initial target is using it as underlay container of filter state or per filter config to speed up the searching operations on the key path of request processing.

wbpcode added 2 commits June 14, 2023 08:17
Signed-off-by: wbpcode <wangbaiping@corp.netease.com>
Signed-off-by: wbpcode <wangbaiping@corp.netease.com>
wbpcode added 5 commits June 14, 2023 09:11
Signed-off-by: wbpcode <wangbaiping@corp.netease.com>
Signed-off-by: wbpcode <wangbaiping@corp.netease.com>
Signed-off-by: wbpcode <wangbaiping@corp.netease.com>
Signed-off-by: wbpcode <wangbaiping@corp.netease.com>
Signed-off-by: wbpcode <wangbaiping@corp.netease.com>
Copy link
Copy Markdown
Contributor

@jmarantz jmarantz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

did a quick pass; flushing a few comments

Comment thread test/common/common/inline_map_registry_speed_test.cc Outdated
Comment thread envoy/common/inline_map_registry.h Outdated
Comment thread envoy/common/inline_map_registry.h Outdated
Comment thread envoy/common/inline_map_registry.h Outdated
Signed-off-by: wbpcode <wangbaiping@corp.netease.com>
@wbpcode
Copy link
Copy Markdown
Member Author

wbpcode commented Jun 16, 2023

Should I update the following code to using CONSTRUCT_ON_FIRST_USE?

  static uint64_t registrationMapSize() {
    static uint64_t size = []() {
      finalize();
      return mutableRegistrationMap().size();
    }();

    return size;
  }

I think this object will only be constructed on first call and there should has no static initialization/deinitializatoin order problems for simple uint64_t with copy semantic?

Copy link
Copy Markdown
Contributor

@jmarantz jmarantz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WDYT of, for the moment, putting an application if this new API in Envoys' codebase into the PR, or making a branched PR that adds a usage. It would just be helpful to see the use-case that inspired this as I have opinions about the usage model that would be better informed if I saw what you had in mind.

Comment thread envoy/common/inline_map_registry.h Outdated
Comment thread envoy/common/inline_map_registry.h Outdated
Comment thread envoy/common/inline_map_registry.h Outdated
Comment thread envoy/common/inline_map_registry.h Outdated
Comment thread envoy/common/inline_map_registry.h Outdated
Comment thread envoy/common/inline_map_registry.h Outdated
@wbpcode
Copy link
Copy Markdown
Member Author

wbpcode commented Jun 19, 2023

WDYT of, for the moment, putting an application if this new API in Envoys' codebase into the PR, or making a branched PR that adds a usage. It would just be helpful to see the use-case that inspired this as I have opinions about the usage model that would be better informed if I saw what you had in mind.

Thanks for you detailed review. I can make the filter state a example to the usage of the new inline map. I will add some new commit to this PR directly.

And after we complete most reviews and if it's necessary, we can split this PR finally.

wbpcode added 2 commits June 19, 2023 11:32
Signed-off-by: wbpcode <wangbaiping@corp.netease.com>
Signed-off-by: wbpcode <wangbaiping@corp.netease.com>
Copy link
Copy Markdown
Contributor

@jmarantz jmarantz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's your thinking about my higher level suggestion to make this class more stl-like and not have any static data at all? Just make it owned by the registry object and leave that in user control.

Your concepts like 'scope ID' are all fine if they are helpful for debugging; just make them member variables in the registry.

/wait

@wbpcode
Copy link
Copy Markdown
Member Author

wbpcode commented Jun 19, 2023

What's your thinking about my higher level suggestion to make this class more stl-like and not have any static data at all? Just make it owned by the registry object and leave that in user control.

If it could make this registry easier to user or maintain, I think it's completely OK for me.

But I am not sure I get the stl-like clearly. I think you want the inline map could be used in a way like normal stl containers and has no special thread limitation or requirement?


I think there are mainly two different parts:

First part, the registry which will container the size and entries of inline keys. This registry will use some singleton data to store these registered keys and we assume all the registration will be done in the main thread before the bootstrapping.

Second part, the inline map which is a dict that provides some methods to accept inline handle as a parameter. This part could be used as normal map and there are no any special requirements to it.

@jmarantz
Copy link
Copy Markdown
Contributor

jmarantz commented Jun 19, 2023

This is not quite a drop-in replacement for hash_map, but I think you should use hash_map as a starting point. The difference is that you have a registry object that needs to be created. Something like this usage model might make sense:

InlineRegistry ireg;
InlineRegistry::Key key1 = ireg.addKey("key1");
InlineRegistry::Key key2 = ireg.addKey("key2");
InlineRegistry::Map<int> map = ireg.createMap<int>; // templatized on value. Keys are either inline or strings.
   OR
InlineRegistry::Map<int> map(ireg);  // I think factory is easier if you want to inline the storage.
map[key1] = 25;  // pre-registered key, instant access
map[key2] = 50;  // inline
map["dynamic key"] = 100;  // dynamic key, hash-lookup speed.

Then you can do instant lookups using InlineRegistry::Key (which could just be an index), and hash lookups using other strings.

Now if the caller wants to make InlineRegistry or its Keys be singletons or globals or whatever, that's up to them to sort out how to make them thread safe. But I think this basic functionality should be "pure" and not have any hidden state. Let the caller choose the semantics that work for them.

@wbpcode
Copy link
Copy Markdown
Member Author

wbpcode commented Jun 19, 2023

This is not quite a drop-in replacement for hash_map, but I think you should use hash_map as a starting point. The difference is that you have a registry object that needs to be created. Something like this usage model might make sense:

InlineRegistry ireg;
InlineRegistry::Key key1 = ireg.addKey("key1");
InlineRegistry::Key key2 = ireg.addKey("key2");
InlineRegistry::Map<int> map; // templatized on value. Keys are either inline or strings.
map[key1] = 25;  // pre-registered key, instant access
map[key2] = 50;  // inline
map["dynamic key"] = 100;  // dynamic key, hash-lookup speed.

Then you can do instant lookups using InlineRegistry::Key (which could just be an index), and hash lookups using other strings.

Now if the caller wants to make InlineRegistry or its Keys be singletons or globals or whatever, that's up to them to sort out how to make them thread safe. But I think this basic functionality should be "pure" and not have any hidden state. Let the caller choose the semantics that work for them.

I think this is what I want and what I did. Let me take the filter state as a example to check if it works and what's the final pattern.

wbpcode added 5 commits September 1, 2023 06:26
Signed-off-by: wbpcode <wangbaiping@corp.netease.com>
Signed-off-by: wbpcode <wangbaiping@corp.netease.com>
Signed-off-by: wbpcode <wangbaiping@corp.netease.com>
Signed-off-by: wbpcode <wangbaiping@corp.netease.com>
Signed-off-by: wbpcode <wangbaiping@corp.netease.com>
Copy link
Copy Markdown
Contributor

@jmarantz jmarantz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left a few preliminary comments.

As you know I don't love this singleton pattern, and the way it's expressed here seems more fine-grained than necessary.

I'm not completely convinced you need to do it this way. Can we look for a way to use existing Factory objects to store the inline keys?

Comment thread envoy/common/optref.h Outdated
* @return true if the object has a value.
*/
bool has_value() const { return ptr_ != nullptr; }
bool has_value() const { return ptr_ != nullptr; } // // NOLINT(readability-identifier-naming)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: you can remove extra //

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand why this is needed if optref.h must be modified, but another option is to just remove optref.h from this PR. My guess is you thought you needed to change it somehow, and then realized you didn't.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

got it. I think this is an unexpected change. Although it fixed a clang tidy problem.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right -- it often happens that new clang-tidy checks are added leaving some errors in existing files that are only cleaned up the next time that file is modified. However in this case I don't think you need to modify this file at all so you might as well make the PR smaller.

/**
* Get scope name and all managed descriptors' inline keys as string.
*/
static RegistryInfo registryInfo();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why copy the vector of string? Can you return it as a const ref?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Basically this method should only be called few times at server instance(s) is/are initialized. It's performance doesn't matter.
And use copy semantics, we needn't considering the case where multiple server instances are initialized and the method is called from multiple threads at same time.

/**
* Call once flag to ensure the registry is finalized only once on multi-thread environment.
*/
static std::once_flag& onceFlag() { MUTABLE_CONSTRUCT_ON_FIRST_USE(std::once_flag); }
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't love this pattern of having lots of MUTABLE_CONSTRUCT_ON_FIRST_USE in separate static member functions. How about just one MUTABLE_CONSTRUCT_ON_FIRST_USE for the class instance then use normal accessors on that class for each member var.

Remember each MUTABLE_CONSTRUCT_ON_FIRST_USE needs to do some kind of lock or atomic to avoid races, so it's better to just do one of them.

Copy link
Copy Markdown
Member Author

@wbpcode wbpcode Sep 5, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SGTM. I can have a try.

@jmarantz
Copy link
Copy Markdown
Contributor

jmarantz commented Sep 5, 2023

I want to be more explicit with a recipe for how I would do this. You made the claim that doing this properly (w/o building complex structures before main()) would touch too many files and I think you have the right to ask me to explain how I think it should be done. So I did a little digging.

I think you need access to the inline keys in FilterStateImpl, but it's got no context. I think that's resolveable in these steps:

  1. Create FilterState::Context object, in which we construct a map descriptor. I think this object can be owned by Server::Instance, similar to Http::Context and Router::Context. See source/server/server.h:392. I think we will know all of the linked-in extensions when Server::InstanceImpl ctor runs.
  2. We will add a reference to FiltertStateContext into Http::Context, and add interfaces to that context to get the FilterState::Context&.
  3. FilterStateImpl is created in three places: ./common/stream_info/filter_state_impl.cc (in a few places) where we create filter states from other filter states, ./common/http/conn_manager_impl.cc, and ./common/router/upstream_request.cc. The HTTP one is easy because Http::Context is already passed to the ConnectionManagerImpl ctor, so you can now get the FilterState::Context from the Http::Context at the call-site. You will need to add an http_context_ member var to that class; the context is already passed to its constructor but it's used to construct the user_agent_ and is not currently saved.
  4. UpstreamRequest has an extra step. However I think it's not too bad. The Http::Context is already passed to the constuctor for Router::Filter and Router::FilterConfig, and is available as interfactes on those objects. I think it won't be too hard to pass it it to the place where the FilterState is constructed.

So there are a few changes needed to plumb this through but I think they are all very easy and non-controversial, following the established pattern of creating the Http::Context. The hardest part (to me, since I don't know how to do it) is knowing exactly which inline keys to declare in the Server.

@jmarantz
Copy link
Copy Markdown
Contributor

jmarantz commented Sep 5, 2023

/wait

@wbpcode
Copy link
Copy Markdown
Member Author

wbpcode commented Sep 5, 2023

As you know I don't love this singleton pattern, and the way it's expressed here seems more fine-grained than necessary.
I'm not completely convinced you need to do it this way. Can we look for a way to use existing Factory objects to store the inline keys?

Yeah, I know you prefer the context pattern. I have thought it in deep. But there are two reasons prevent me to use it:

  1. Huge refactor: as you know, filter state and stream info is used very very widely. No matter a new context or exist context is used to store the inline keys, we need propagate the context to almost everywhere. This will bring super lots of code modifications.
  2. Discrete inline key registration/declaration: I hope third party extensions could register or declare new inline keys in their code and provide a easy way to refer or use the inline handle. If context pattern is used, this target would be hard to achieve.

Considering above two problems, I think RIO of context parttern is relatively low, singleton parttern is cheap and simple and has proved could work well (proved by HTTP inline headers).

@wbpcode
Copy link
Copy Markdown
Member Author

wbpcode commented Sep 5, 2023

I want to be more explicit with a recipe for how I would do this. You made the claim that doing this properly (w/o building complex structures before main()) would touch too many files and I think you have the right to ask me to explain how I think it should be done. So I did a little digging.

I think you need access to the inline keys in FilterStateImpl, but it's got no context. I think that's resolveable in these steps:

  1. Create FilterState::Context object, in which we construct a map descriptor. I think this object can be owned by Server::Instance, similar to Http::Context and Router::Context. See source/server/server.h:392. I think we will know all of the linked-in extensions when Server::InstanceImpl ctor runs.
  2. We will add a reference to FiltertStateContext into Http::Context, and add interfaces to that context to get the FilterState::Context&.
  3. FilterStateImpl is created in three places: ./common/stream_info/filter_state_impl.cc (in a few places) where we create filter states from other filter states, ./common/http/conn_manager_impl.cc, and ./common/router/upstream_request.cc. The HTTP one is easy because Http::Context is already passed to the ConnectionManagerImpl ctor, so you can now get the FilterState::Context from the Http::Context at the call-site. You will need to add an http_context_ member var to that class; the context is already passed to its constructor but it's used to construct the user_agent_ and is not currently saved.
  4. UpstreamRequest has an extra step. However I think it's not too bad. The Http::Context is already passed to the constuctor for Router::Filter and Router::FilterConfig, and is available as interfactes on those objects. I think it won't be too hard to pass it it to the place where the FilterState is constructed.

So there are a few changes needed to plumb this through but I think they are all very easy and non-controversial, following the established pattern of creating the Http::Context. The hardest part (to me, since I don't know how to do it) is knowing exactly which inline keys to declare in the Server.

I had figured out a similar path. Except the step 3 is different. There are not only three places to update. Note, the FilterStateImpl is used in the StreamInfoImpl. And all places that used StreamInfoImpl need to be updated. The health checkers, the dubbo/thrift/generic/sip/udp proxy, golang filter, the async HTTP/gRPC client, QUIC session, TCP socket, any position that will create connections, etc. There are super lots of files need to be updated.

Signed-off-by: wbpcode <wangbaiping@corp.netease.com>
Signed-off-by: wbpcode <wangbaiping@corp.netease.com>
@jmarantz
Copy link
Copy Markdown
Contributor

jmarantz commented Sep 5, 2023

OK I'll look at the singletons later to try to make sure they are as clean as possible.

In the meantime, did you wind up with a perf test for the impact of using inline maps here? It's a ton of complexity being added; want to make sure the perf benefit is worth it.

@wbpcode
Copy link
Copy Markdown
Member Author

wbpcode commented Sep 6, 2023

In the meantime, did you wind up with a perf test for the impact of using inline maps here? It's a ton of complexity being added; want to make sure the perf benefit is worth it.

I think we have made benchmark tests for the inline map directly. So, new benchmark tests for filter state should make no big difference.
I will run the envoy binary directly to see the cpu overhead of new filter state.

@jmarantz
Copy link
Copy Markdown
Contributor

jmarantz commented Sep 6, 2023

I don't think the inline map benchmarks tell us much detail about the impact on system performacne. If after all this we get <1% I have to ask if the complexity is worth it.

@wbpcode
Copy link
Copy Markdown
Member Author

wbpcode commented Sep 6, 2023

I don't think the inline map benchmarks tell us much detail about the impact on system performacne. If after all this we get <1% I have to ask if the complexity is worth it.

I will run the envoy binary directly to see the cpu overhead of new filter state.

I mean run the benchmark test for the single filter state class make no big sense because we have did the benchmark test for the inline map class. So I will run the binary to compare the actual cpu overhead of filter state.

@wbpcode
Copy link
Copy Markdown
Member Author

wbpcode commented Sep 6, 2023

if the complexity is worth it

Back to the complexity, if sington pattern is used, I think the inline map should be pretty simple to use.

Signed-off-by: wbpcode <wangbaiping@corp.netease.com>
@wbpcode
Copy link
Copy Markdown
Member Author

wbpcode commented Sep 6, 2023

/wait

@jmarantz
Copy link
Copy Markdown
Contributor

jmarantz commented Sep 6, 2023

RE "Back to the complexity, if sington pattern is used, I think the inline map should be pretty simple to use." -- I strongly disagree. The code changes are bounded in scope but the complexity of maintaining a codebase with a ton of complex things happening before main() is a high cost that will continue to be paid in the maintenance. I want to make sure it is worth it.

@wbpcode
Copy link
Copy Markdown
Member Author

wbpcode commented Sep 6, 2023

The code changes are bounded in scope but the complexity of maintaining a codebase with a ton of complex things happening before main() is a high cost that will continue to be paid in the maintenance. I want to make sure it is worth it.

I basically think the key/descriptor registrations are complete same with the various extension factory registrations. So, we have introduced the complexity in long time ago. The new key/descriptor registrations won't make thing worse.

Of course, there maybe are some technical disagreement, but it doesn't matter because I will pend this PR up temporarily because I also agree the gain is trival for now.

@jmarantz jmarantz marked this pull request as draft September 6, 2023 13:32
@github-actions
Copy link
Copy Markdown

github-actions Bot commented Oct 6, 2023

This pull request has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in 7 days if no further activity occurs. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions!

@github-actions github-actions Bot added the stale stalebot believes this issue/PR has not been touched recently label Oct 6, 2023
@github-actions
Copy link
Copy Markdown

This pull request has been automatically closed because it has not had activity in the last 37 days. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions!

@github-actions github-actions Bot closed this Oct 13, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

stale stalebot believes this issue/PR has not been touched recently waiting

Projects

None yet

Development

Successfully merging this pull request may close these issues.

common inline key registry support

3 participants