Summary
Several architecture improvements for the merge pipeline to reduce lock contention, improve extensibility, and fix subtle merge semantics.
Proposed solution (optional)
1. Double-buffering for HybridDiscovery mutex
HybridDiscoveryStrategy::refresh() holds the mutex during the entire pipeline_.execute() call, which includes synchronous ROS 2 graph introspection. All HTTP handlers calling discover_*() block during refresh, causing request latency spikes.
File: src/ros2_medkit_gateway/src/discovery/hybrid_discovery.cpp:48-55
Fix: Execute pipeline without lock, then swap result under lock.
2. Replace dynamic_cast for RuntimeLayer statistics
MergePipeline::execute() uses dynamic_cast<RuntimeLayer*> to get last_filtered_count(). This couples the pipeline to a concrete type, violating OCP.
File: src/ros2_medkit_gateway/src/discovery/merge_pipeline.cpp:418
Fix: Add virtual size_t filtered_count() const { return 0; } to DiscoveryLayer base class.
3. Fix merge_bool sticky OR for external field
merge_bool() uses target = target || source for MergeWinner::BOTH. Once any layer sets a boolean to true, it can never be reverted. Correct for is_online but wrong for external.
File: src/ros2_medkit_gateway/src/discovery/merge_pipeline.cpp:89-103
Fix: Move external to METADATA field group where manifest is AUTHORITATIVE by default.
4. Make MergePipeline accessors private
get_last_report() and get_linking_result() are public but only called through HybridDiscoveryStrategy. Direct calls would be a data race (no internal mutex).
File: src/ros2_medkit_gateway/include/ros2_medkit_gateway/discovery/merge_pipeline.hpp:67-85
Fix: Make them private with friend class HybridDiscoveryStrategy.
Additional context (optional)
Found during self-review of PR #258. Item 1 (mutex contention) is the most impactful for production - it affects request latency during every discovery refresh cycle.
Summary
Several architecture improvements for the merge pipeline to reduce lock contention, improve extensibility, and fix subtle merge semantics.
Proposed solution (optional)
1. Double-buffering for HybridDiscovery mutex
HybridDiscoveryStrategy::refresh()holds the mutex during the entirepipeline_.execute()call, which includes synchronous ROS 2 graph introspection. All HTTP handlers callingdiscover_*()block during refresh, causing request latency spikes.File:
src/ros2_medkit_gateway/src/discovery/hybrid_discovery.cpp:48-55Fix: Execute pipeline without lock, then swap result under lock.
2. Replace dynamic_cast for RuntimeLayer statistics
MergePipeline::execute()usesdynamic_cast<RuntimeLayer*>to getlast_filtered_count(). This couples the pipeline to a concrete type, violating OCP.File:
src/ros2_medkit_gateway/src/discovery/merge_pipeline.cpp:418Fix: Add
virtual size_t filtered_count() const { return 0; }toDiscoveryLayerbase class.3. Fix merge_bool sticky OR for
externalfieldmerge_bool()usestarget = target || sourceforMergeWinner::BOTH. Once any layer sets a boolean totrue, it can never be reverted. Correct foris_onlinebut wrong forexternal.File:
src/ros2_medkit_gateway/src/discovery/merge_pipeline.cpp:89-103Fix: Move
externalto METADATA field group where manifest is AUTHORITATIVE by default.4. Make MergePipeline accessors private
get_last_report()andget_linking_result()are public but only called throughHybridDiscoveryStrategy. Direct calls would be a data race (no internal mutex).File:
src/ros2_medkit_gateway/include/ros2_medkit_gateway/discovery/merge_pipeline.hpp:67-85Fix: Make them private with
friend class HybridDiscoveryStrategy.Additional context (optional)
Found during self-review of PR #258. Item 1 (mutex contention) is the most impactful for production - it affects request latency during every discovery refresh cycle.