[C++] PIP 37: Support large message size #13627
Conversation
It looks like some code is not compatible with the test env. I'll fix it soon. |
Hi @merlimat, could you review this PR?
Ping @merlimat, please help review this PR.
…13883) In the C++ client, there is a corner case: when a reader's start message ID is the last message of a topic, `hasMessageAvailable` returns true. However, it should return false, because the start message ID is exclusive and in this case `readNext` would never return a message unless new messages arrived. The current C++ implementation of `hasMessageAvailable` is very old and has many problems, so this PR migrates the Java implementation of `hasMessageAvailable` to the C++ client. After these modifications, `hasMessageAvailable` needs to access `startMessageId`, and it is called from a different thread than `connectionOpened`, which might modify `startMessageId`; a common mutex `mutexForMessageIds` protects the access to `lastDequedMessageId_` and `lastMessageIdInBroker_`. To fix the original tests when `startMessageId` is latest, this PR adds a `GetLastMessageIdResponse` as the response of the `GetLastMessageId` request. The `GetLastMessageIdResponse` contains the `consumer_mark_delete_position` introduced in #9652 to compare with `last_message_id` when `startMessageId` is latest. This change added tests `ReaderTest#testHasMessageAvailableWhenCreated` and `MessageIdTest#testCompareLedgerAndEntryId`. (cherry picked from commit e50493e) Fix the conflicts by:
- Remove `ReaderImpl::getLastMessageIdAsync` introduced in #11723
- Remove the `getPriorityLevel()` method introduced in #12076
- Revert changes of `registerConsumer` from #12118
- Remove new fields introduced in #13627
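The threading concern described above can be illustrated with a minimal sketch, assuming a heavily simplified consumer: the mutex and member names follow the commit message, but the class, the `MessageId` stand-in, and the comparison logic are hypothetical (the real logic also consults the mark-delete position when `startMessageId` is latest).

```cpp
#include <cstdint>
#include <mutex>

// Simplified stand-in for the real Pulsar MessageId (hypothetical).
struct MessageId {
    int64_t ledgerId = -1;
    int64_t entryId = -1;
    bool operator<(const MessageId& other) const {
        return ledgerId != other.ledgerId ? ledgerId < other.ledgerId : entryId < other.entryId;
    }
};

class ConsumerLikeSketch {
   public:
    // Called from the reader's thread: is anything available after the start/last-dequeued position?
    bool hasMessageAvailable() {
        std::lock_guard<std::mutex> lock(mutexForMessageIds_);
        // The start message ID is exclusive, so "equal to the last ID in the broker" means
        // nothing is available until new messages arrive.
        const MessageId& reference =
            (lastDequedMessageId_.entryId >= 0) ? lastDequedMessageId_ : startMessageId_;
        return reference < lastMessageIdInBroker_;
    }

    // Called from the connection thread after (re)connecting; may rewrite the shared IDs.
    void connectionOpened(const MessageId& newStartMessageId, const MessageId& lastIdFromBroker) {
        std::lock_guard<std::mutex> lock(mutexForMessageIds_);
        startMessageId_ = newStartMessageId;
        lastMessageIdInBroker_ = lastIdFromBroker;
    }

   private:
    std::mutex mutexForMessageIds_;  // guards the three fields below across both threads
    MessageId startMessageId_;
    MessageId lastDequedMessageId_;
    MessageId lastMessageIdInBroker_;
};
```

The point is only that the reader-facing check and the connection callback take the same lock before touching the shared message IDs.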
Hi @momo-jun, could you please take a look at this PR? Similar to the Java client, do we need to add chunking docs for the C++ client? Thanks |
Sure. Will look into this. |
Fixes apache#13267

### Motivation

Unlike the Java client, the C++ client's `Client` has a `shutdown` method that is responsible for the following steps:

1. Call `shutdown` on all internal producers and consumers.
2. Close all connections in the pool.
3. Close all executors of the executor providers.

When an executor is closed, it calls `io_service::stop()`, which makes the event loop (`io_service::run()`) in another thread return as soon as possible. However, there is no wait operation. If a client failed to create a producer or consumer, the `close` method will call `shutdown`, close all executors immediately, and the application exits. In this case, the detached event loop thread might not have exited yet, so valgrind will detect a memory leak. This memory leak can be avoided by sleeping for a while after `Client::close` returns, or if there is still other work to do after that. However, we should still adopt the semantics that after `Client::shutdown` returns, all event loop threads have terminated.

### Modifications

- Add a timeout parameter to the `close` method of `ExecutorService` and `ExecutorServiceProvider`, used as the maximum blocking timeout when it's non-negative.
- Add a `TimeoutProcessor` helper class to update the remaining timeout after each call to the methods that accept the timeout parameter.
- Call `close` on all `ExecutorServiceProvider`s in `ClientImpl::shutdown` with a 500 ms timeout, which should be long enough. In addition, in the `handleClose` method, call `shutdown` in another thread to avoid a deadlock.

### Verifying this change

After applying this patch, the reproduction code in apache#13627 passes the valgrind check.

```
==3013== LEAK SUMMARY:
==3013==   definitely lost: 0 bytes in 0 blocks
==3013==   indirectly lost: 0 bytes in 0 blocks
==3013==   possibly lost: 0 bytes in 0 blocks
```
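A rough sketch of the timed-shutdown idea, assuming a single-threaded executor built on `boost::asio`; `SimpleExecutor` and `closeAll` are illustrative stand-ins, not the actual `ExecutorService`/`TimeoutProcessor` code.

```cpp
#include <boost/asio/io_service.hpp>

#include <algorithm>
#include <chrono>
#include <future>
#include <memory>
#include <vector>

// A much-simplified executor (illustrative only): one io_service, one event-loop thread,
// and a close() that stops the loop and waits for it with an upper bound.
class SimpleExecutor {
   public:
    SimpleExecutor() : work_(new boost::asio::io_service::work(io_)) {
        loopDone_ = std::async(std::launch::async, [this] { io_.run(); });
    }

    // Stop the event loop and, if the timeout is non-negative, wait up to `timeout`
    // for the loop thread to terminate. Call this before destroying the executor.
    void close(std::chrono::milliseconds timeout) {
        work_.reset();  // let io_service::run() return once pending work drains
        io_.stop();     // ask the event loop to return as soon as possible
        if (timeout.count() >= 0 && loopDone_.valid()) {
            (void)loopDone_.wait_for(timeout);  // bounded wait, like the new close(timeout)
        }
    }

   private:
    boost::asio::io_service io_;
    std::unique_ptr<boost::asio::io_service::work> work_;
    std::future<void> loopDone_;
};

// Shutdown-style usage: close several executors within one overall 500 ms budget,
// shrinking the remaining budget after each call (the role TimeoutProcessor plays).
inline void closeAll(std::vector<SimpleExecutor*>& executors) {
    using namespace std::chrono;
    const auto deadline = steady_clock::now() + milliseconds(500);
    for (auto* executor : executors) {
        auto left = duration_cast<milliseconds>(deadline - steady_clock::now());
        executor->close(std::max(left, milliseconds(0)));
    }
}
```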
Motivation
This PR is the C++ catch-up of #4400, which implements PIP 37.
Modifications
Changes to the interfaces:
- Add a `chunkingEnabled` config to the producer, so a producer can chunk messages by enabling `chunkingEnabled` and disabling `batchingEnabled`.
- Add `maxPendingChunkedMessage` and `autoOldestChunkedMessageOnQueueFull` configs to the consumer to control how chunks are handled.
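A usage sketch of the new configs: the setter names mirror the config names above and are assumptions (the exact method names in the released headers may differ), the topic and service URL are hypothetical, and error handling is omitted.

```cpp
#include <pulsar/Client.h>

int main() {
    pulsar::Client client("pulsar://localhost:6650");  // hypothetical service URL

    // Producer side: chunking only takes effect when batching is disabled.
    // Setter names below follow this PR's description; check the released
    // ProducerConfiguration/ConsumerConfiguration headers for the exact API.
    pulsar::ProducerConfiguration producerConf;
    producerConf.setBatchingEnabled(false);
    producerConf.setChunkingEnabled(true);

    pulsar::Producer producer;
    client.createProducer("persistent://public/default/large-messages", producerConf, producer);

    // Consumer side: bound how many incomplete chunked messages are buffered, and
    // whether the oldest pending one is discarded when that buffer is full.
    pulsar::ConsumerConfiguration consumerConf;
    consumerConf.setMaxPendingChunkedMessage(10);                // assumed setter name
    consumerConf.setAutoOldestChunkedMessageOnQueueFull(true);   // assumed setter name

    pulsar::Consumer consumer;
    client.subscribe("persistent://public/default/large-messages", "sub", consumerConf, consumer);

    // ... produce a payload larger than maxMessageSize and receive it as one message ...

    client.close();
    return 0;
}
```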
Implementation of the producer side:

- Modify the `sendAsync` method to follow logic similar to the Java client: if `canAddToBatch` returns false and the message size exceeds the limit, split the large message into chunks and send them one by one.
- Fix a bug in `Commands::newSend`: it didn't use `readableBytes()` as the buffer's size, which could lead to a wrong checksum when sending chunks, because each chunk is a reference to the original large buffer and only the reader index differs.
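A simplified sketch of the splitting step described above: the payload is cut into slices of at most the maximum message size, all sharing one UUID and a total chunk count so the consumer can reassemble them. The struct and function here are illustrative, not the actual `sendAsync` code.

```cpp
#include <algorithm>
#include <string>
#include <utility>
#include <vector>

// Per-chunk metadata so the consumer can reassemble the original payload
// (field names are illustrative; the real client carries these in the message metadata).
struct ChunkMeta {
    std::string uuid;      // shared by every chunk of one large message
    int chunkId;           // 0-based index of this chunk
    int numChunksFromMsg;  // total number of chunks
    size_t totalSize;      // size of the original payload in bytes
};

// Split `payload` into slices of at most `maxMessageSize` bytes (maxMessageSize > 0).
// In the real client each chunk references the same underlying buffer and only the reader
// index differs, which is why the checksum must cover readableBytes() of each slice.
std::vector<std::pair<ChunkMeta, std::string>> splitIntoChunks(const std::string& payload,
                                                               size_t maxMessageSize,
                                                               const std::string& uuid) {
    const int numChunks = static_cast<int>((payload.size() + maxMessageSize - 1) / maxMessageSize);
    std::vector<std::pair<ChunkMeta, std::string>> chunks;
    for (int i = 0; i < numChunks; i++) {
        const size_t begin = static_cast<size_t>(i) * maxMessageSize;
        const size_t len = std::min(maxMessageSize, payload.size() - begin);
        chunks.push_back({ChunkMeta{uuid, i, numChunks, payload.size()}, payload.substr(begin, len)});
    }
    return chunks;  // sent one by one, in order
}
```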
Implementation of the consumer side:

- Add a `MapCache` class that wraps a hash map with a double-ended queue recording the UUIDs in order of arrival, because removing a UUID must remove the related cache entry from both the map and the queue. The class only exposes the necessary methods, so that neither of the two data structures can be left uncleaned; in addition, it makes it easier for tests to verify that the cache is cleared after chunks are merged.
- Add `processMessageChunk` to cache incoming chunks and merge the chunks with the same UUID into the complete message once the last chunk arrives.
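A minimal sketch of the `MapCache` idea, assuming string UUID keys and a simplified chunk context: the map holds the pending chunked messages while the deque remembers arrival order, and only combined operations are exposed so neither structure is left stale. Method names and the eviction policy are illustrative, not the actual class.

```cpp
#include <deque>
#include <string>
#include <unordered_map>

// Chunked-message accumulation context (simplified).
struct ChunkedMessageCtx {
    int receivedChunks = 0;
    int totalChunks = 0;
    std::string buffer;  // concatenated chunk payloads
};

// Pairs an unordered_map with a deque of keys in arrival order so that the oldest pending
// chunked message can be evicted when the cache is full, and so that removing a key always
// updates both structures together.
class MapCache {
   public:
    ChunkedMessageCtx* find(const std::string& uuid) {
        auto it = map_.find(uuid);
        return it == map_.end() ? nullptr : &it->second;
    }

    ChunkedMessageCtx& putIfAbsent(const std::string& uuid) {
        auto result = map_.emplace(uuid, ChunkedMessageCtx{});
        if (result.second) keys_.push_back(uuid);  // record arrival order only on first insert
        return result.first->second;
    }

    void remove(const std::string& uuid) {
        map_.erase(uuid);
        for (auto it = keys_.begin(); it != keys_.end(); ++it) {
            if (*it == uuid) { keys_.erase(it); break; }
        }
    }

    // Drop the oldest pending entry (e.g. when maxPendingChunkedMessage is exceeded).
    void removeOldest() {
        if (!keys_.empty()) {
            map_.erase(keys_.front());
            keys_.pop_front();
        }
    }

    size_t size() const { return map_.size(); }

   private:
    std::unordered_map<std::string, ChunkedMessageCtx> map_;
    std::deque<std::string> keys_;
};
```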
Tests:

- Set `maxMessageSize=10240` in `test-conf/standalone-ssl.conf` so that we don't need to create an overly large message in tests.
- Add `MapCacheTest` to verify the public methods of `MapCache`.
- Add `MessageChunkingTest` to verify the basic end-to-end case with all compression types.

Does this pull request potentially affect one of the following parts: