Unified TimeSeriesStore with pluggable backends, global rewrite of timed event storage#1080
paul-nechifor merged 45 commits into dev
- quality_barrier: Callable[[Observable[T]], Observable[T]]
- sharpness_barrier: Callable[[Observable[Image]], Observable[Image]]
- Implement find_closest(), first_timestamp(), iterate(), iterate_ts(), iterate_realtime() methods using abstract _iter_items/_find_closest_timestamp
- Add scheduler-based stream() with absolute time reference to prevent timing drift during long playback (ported from replay.py)
- Move imports to top of file, add proper typing throughout
- Fix pickledir.py mypy error (pickle.load returns Any)
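The drift fix mentioned above can be illustrated with a small sketch. This is not the scheduler-based `stream()` from the PR; it is a plain generator, and `replay_realtime`, `clock`, and `sleep` are hypothetical names chosen for the example. The idea it shows is that every deadline is computed from one fixed start time, so time spent by the consumer is absorbed instead of accumulating.

```python
import time
from typing import Callable, Iterable, Iterator, Optional, Tuple, TypeVar

T = TypeVar("T")


def replay_realtime(
    items: Iterable[Tuple[float, T]],
    speed: float = 1.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[Tuple[float, T]]:
    """Yield (timestamp, item) pairs paced against a single start time."""
    first_ts: Optional[float] = None
    start_wall = 0.0
    for ts, item in items:
        if first_ts is None:
            # Anchor both clocks once; all later deadlines derive from these.
            first_ts = ts
            start_wall = clock()
        # Absolute deadline: does not depend on how long earlier steps took.
        deadline = start_wall + (ts - first_ts) / speed
        delay = deadline - clock()
        if delay > 0:
            sleep(delay)
        yield ts, item
```

Sleeping for `ts - previous_ts` after each item would instead add the consumer's processing time to every step, which is where drift over long playback comes from.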
- Single-file SQLite storage with indexed timestamp queries
- BLOB storage for pickled sensor data
- INSERT OR REPLACE for duplicate timestamp handling
- Supports multiple tables per database (different sensors)
- Added to parametrized tests (15 tests across 3 backends)
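A minimal sketch of the storage scheme these bullets describe, using only the standard library. The schema, column names, and helper functions (`open_store`, `save`, `find_closest`) are assumptions for illustration and may differ from the actual `SqliteStore`.

```python
import pickle
import sqlite3
from typing import Any, Optional


def open_store(path: str = ":memory:", table: str = "lidar") -> sqlite3.Connection:
    """Open a database and create one table per sensor if missing."""
    conn = sqlite3.connect(path)
    # PRIMARY KEY on timestamp gives an index for range/nearest queries.
    # The table name is assumed to be validated before reaching this point.
    conn.execute(
        f"CREATE TABLE IF NOT EXISTS {table} "
        "(timestamp REAL PRIMARY KEY, data BLOB NOT NULL)"
    )
    return conn


def save(conn: sqlite3.Connection, table: str, ts: float, data: Any) -> None:
    """Store a pickled payload; a duplicate timestamp overwrites the old row."""
    conn.execute(
        f"INSERT OR REPLACE INTO {table} (timestamp, data) VALUES (?, ?)",
        (ts, pickle.dumps(data)),
    )
    conn.commit()


def find_closest(conn: sqlite3.Connection, table: str, ts: float) -> Optional[Any]:
    """Return the payload nearest to ts using two index-friendly lookups."""
    before = conn.execute(
        f"SELECT timestamp, data FROM {table} "
        "WHERE timestamp <= ? ORDER BY timestamp DESC LIMIT 1",
        (ts,),
    ).fetchone()
    after = conn.execute(
        f"SELECT timestamp, data FROM {table} "
        "WHERE timestamp >= ? ORDER BY timestamp ASC LIMIT 1",
        (ts,),
    ).fetchone()
    candidates = [row for row in (before, after) if row is not None]
    if not candidates:
        return None
    _, blob = min(candidates, key=lambda row: abs(row[0] - ts))
    return pickle.loads(blob)
```

Unpickling is only safe for databases you wrote yourself, since `pickle.loads` can execute arbitrary code from untrusted input.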
- PostgresStore implements SensorStore[T] + Resource for lifecycle management
- Multiple stores can share same database with different tables
- Tables created automatically on first save
- Tests are optional - skip gracefully if PostgreSQL not available
- Added psycopg2-binary and types-psycopg2 dependencies
- Includes reset_db() helper for simple migrations (drop/recreate)
- Validate table/database names in SqliteStore and PostgresStore using regex (alphanumeric/underscore, not starting with digit)
- Fix Transform.to_pose() return type using TYPE_CHECKING import
- Add return type annotation to TF.get_pose()
- Fix ambiguous doclink in transports.md
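The name validation rule (alphanumeric or underscore, not starting with a digit) could look roughly like this. The exact pattern and the function name `validate_identifier` are assumptions; the point is that table names cannot be passed as bound SQL parameters, so they need an allow-list check before being interpolated into a statement.

```python
import re

# Letters, digits, underscore; first character must not be a digit.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def validate_identifier(name: str) -> str:
    """Return name unchanged if it is a safe SQL identifier, else raise."""
    if not _IDENTIFIER.fullmatch(name):
        raise ValueError(f"invalid table/database name: {name!r}")
    return name
```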
SqliteStore now accepts a name (e.g. "recordings/lidar") that gets resolved via get_data_dir to data/recordings/lidar.db. Still supports absolute paths and :memory: for backward compatibility.
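As a sketch of that resolution rule: `resolve_db_path` is a hypothetical helper, and its `data_dir` parameter stands in for whatever `get_data_dir` returns, whose real signature is not shown here.

```python
from pathlib import Path
from typing import Union


def resolve_db_path(
    name: Union[str, Path], data_dir: Path = Path("data")
) -> Union[str, Path]:
    """Map a store name to a database location."""
    if str(name) == ":memory:":
        # In-memory databases are passed through untouched.
        return ":memory:"
    path = Path(name)
    if path.is_absolute():
        # Absolute paths are used as given.
        return path
    # Relative names live under the data directory with a .db suffix.
    return data_dir / f"{name}.db"
```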
- Add TypeVar bound: T = TypeVar("T", bound=Timestamped)
- Simplify save() to always use data.ts (no more optional timestamp)
- Update tests to use SampleData(Timestamped) instead of strings
- SqliteStore accepts str | Path for backward compatibility
# Conflicts: # dimos/models/manipulation/contact_graspnet_pytorch/inference.py
Required when cupy/contact_graspnet are installed locally without type stubs.
Resolve conflicts:
- Accept dev's Image refactor (numpy-only, no CudaImage/AbstractImage)
- Keep spatial_db2's psycopg2-binary dep, add dev's annotation-protocol + toolz
- Keep spatial_db2's print_loss_heatmap in benchmark type
- Keep spatial_db2's typed sharpness_barrier signature
- Merge TYPE_CHECKING imports in Transform.py (rerun + PoseStamped)
- Trivial: accept dev's type-ignore formatting (space after comma)
- Add import-untyped to xacro type: ignore comment in mesh_utils.py
- Remove unused record/replay RPC methods from ModuleBase
Remove the Timestamped bound from SensorStore's TypeVar, enabling storage of arbitrary data types. Timestamps are now provided explicitly via save(ts, data), with Timestamped convenience methods (save_ts, pipe_save_ts, consume_stream_ts) as opt-in helpers. iterate_realtime() and stream() now use stored timestamps instead of data.ts.
Implement _delete for InMemoryStore, SqliteStore, PickleDirStore, PostgresStore (LegacyPickleStore raises NotImplementedError). Fix find_closest docstring placement and add get/add/prune_old convenience methods.
…tedKeyList Replace InMemoryStore's dict + sorted-cache (O(n log n) rebuild on every write) with SortedKeyList for O(log n) insert, delete, and range queries. Add collection methods to TimeSeriesStore base: __len__, __iter__, last/last_timestamp, start_ts/ end_ts, time_range, duration, find_before/find_after, slice_by_time. Implement backing abstract methods (_count, _last_timestamp, _find_before, _find_after) in all five backends. Performance benchmarks confirm InMemoryStore matches TimestampedCollection on 100k items.
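To make the lookup methods concrete, here is a sketch of nearest, before, after, and slice queries on a sorted timeline. It swaps in the standard-library `bisect` module for `SortedKeyList`, so lookups are O(log n) but inserts are O(n) because of list shifting. `SortedTimeline` is a hypothetical class, and the boundary semantics (strict before/after, half-open slices) are assumptions rather than the documented behaviour of `TimeSeriesStore`.

```python
import bisect
from typing import Generic, List, Optional, TypeVar

T = TypeVar("T")


class SortedTimeline(Generic[T]):
    """Items kept in timestamp order for binary-search queries."""

    def __init__(self) -> None:
        self._ts: List[float] = []
        self._items: List[T] = []

    def add(self, ts: float, item: T) -> None:
        i = bisect.bisect_right(self._ts, ts)
        self._ts.insert(i, ts)
        self._items.insert(i, item)

    def find_before(self, ts: float) -> Optional[T]:
        # Last item with timestamp strictly less than ts.
        i = bisect.bisect_left(self._ts, ts)
        return self._items[i - 1] if i > 0 else None

    def find_after(self, ts: float) -> Optional[T]:
        # First item with timestamp strictly greater than ts.
        i = bisect.bisect_right(self._ts, ts)
        return self._items[i] if i < len(self._ts) else None

    def find_closest(self, ts: float) -> Optional[T]:
        if not self._ts:
            return None
        i = bisect.bisect_left(self._ts, ts)
        # Only the neighbours on either side of the insertion point can win.
        candidates = [j for j in (i - 1, i) if 0 <= j < len(self._ts)]
        best = min(candidates, key=lambda j: abs(self._ts[j] - ts))
        return self._items[best]

    def slice_by_time(self, start: float, end: float) -> List[T]:
        # Half-open interval [start, end).
        lo = bisect.bisect_left(self._ts, start)
        hi = bisect.bisect_left(self._ts, end)
        return self._items[lo:hi]
```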
…nmemory.py Rename the module to better reflect its purpose. Extract InMemoryStore from base.py into its own file (inmemory.py) to keep base.py focused on the abstract TimeSeriesStore class. Update all internal and external imports.
…, store T directly
- Bound T to Timestamped; no more raw/non-Timestamped data paths
- Removed save_raw, pipe_save_raw, consume_stream_raw
- InMemoryStore stores T directly in SortedKeyList (no _Entry wrapper)
- Removed duplicate-check on insert (same semantics as TimestampedCollection)
- Performance now at parity with TimestampedCollection
- Delete TimestampedCollection class (replaced by InMemoryStore)
- Rewrite TimestampedBufferCollection to inherit InMemoryStore
- Remove TBuffer_old dead code from tf.py
- Fix prune_old mutation-during-iteration bug in base.py
- Break circular import with TYPE_CHECKING guard in base.py
- Update Image.py to use public API instead of _items access
- Update tests to use InMemoryStore directly
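The class of bug behind the `prune_old` fix can be shown on a plain dict. This illustrates mutation during iteration in general, not the actual code from base.py; both function names below are hypothetical.

```python
from typing import Dict


def prune_old_buggy(store: Dict[float, str], cutoff: float) -> None:
    """Deletes while iterating the live dict, which Python rejects."""
    for ts in store:
        if ts < cutoff:
            del store[ts]  # the next loop step raises RuntimeError


def prune_old(store: Dict[float, str], cutoff: float) -> int:
    """Snapshot the stale keys first, then delete them."""
    stale = [ts for ts in store if ts < cutoff]
    for ts in stale:
        del store[ts]
    return len(stale)
```

Collecting the keys into a list before deleting keeps iteration and mutation separate, at the cost of one temporary list.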
Personally, I don't like the API. It's similar to what we do with […]

Sessions

    store = SqliteStore("recordings/lidar")
    store.save(data)

Without a session, API calls race to create […]

    store = SqliteStore("recordings/lidar")
    session = store.session()
    with session:
        session.save(data)

Then SqliteStoreSession has […]

Lack of owner

Most systems separate the database from the sessions. This makes it easy to close the database, which closes all the sessions.

    store = SqliteStore("recordings/lidar")
    session = store.session()
    store.close()
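One way to read the ownership point is that the store keeps track of the sessions it hands out, so closing the store closes them all. The sketch below is only an illustration of that idea under assumed names (`Store`, `Session`); it is not the dimos API or the follow-up design.

```python
import sqlite3
from typing import List


class Session:
    """Owns one connection; safe to close more than once."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self._conn = conn
        self.closed = False

    def close(self) -> None:
        if not self.closed:
            self._conn.close()
            self.closed = True

    def __enter__(self) -> "Session":
        return self

    def __exit__(self, *exc: object) -> None:
        self.close()


class Store:
    """Owns the database location and every session created from it."""

    def __init__(self, path: str = ":memory:") -> None:
        self._path = path
        self._sessions: List[Session] = []

    def session(self) -> Session:
        s = Session(sqlite3.connect(self._path))
        self._sessions.append(s)
        return s

    def close(self) -> None:
        # Closing the owner closes everything it handed out.
        for s in self._sessions:
            s.close()
        self._sessions.clear()
```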
Low cohesion in TimeSeriesStore

TimeSeriesStore is a database, a session, and a query set. Most of the methods belong to the query set. (QuerySet is a Django term; others use Manager or Collection for the object which does the querying.) But the code forces people to extend TimeSeriesStore to get what they need instead of TimeSeriesStore using what it needs. Example:

    class TimeSeriesStore(Generic[T], ABC):
        @abstractmethod
        def _save(self, timestamp: float, data: T) -> None: ...

        def save(self, *data: T) -> None:
            for item in data:
                self._save(item.ts, item)

    class MyStore(TimeSeriesStore[T]):
        def _save(self, timestamp: float, data: T) -> None: ...

I suggest this:

    class Session(Protocol[T]):
        def save(self, timestamp: float, data: T) -> None: ...

    class MySession(Session[T]):
        def save(self, timestamp: float, data: T) -> None:
            # implementation
            ...

    class TimeSeriesStore(Generic[T]):
        def __init__(self, session: Session[T]):
            self._session = session

        def save(self, *data: T) -> None:
            for item in data:
                self._session.save(item.ts, item)
Verbosity

If the concern is that this is too verbose, you can always have helper methods. But if the API mixes concerns too much, that's not easy to untangle. Example helper method:

    with store_session("sqlite", "/path/to/sensors.db") as store:
        store.save(...)
greptile is saying 5/5, paul is saying 0/5, this is a classic good cop bad cop situation. Sounds good! Will externalize the sessions.
We decided to merge this large thing so it doesn't dangle. Paul's feedback on the store init/shutdown API is good, so I will rewrite to include external session handling in a follow-up.
Import Module and ModuleConfig from dimos.core.module instead of the lazy-loaded dimos.core namespace, which mypy sees as type Any.
Release v0.0.10: Manipulation Stack, MuJoCo Simulation, DDS Transport, Web and Native Visualization via Rerun

## Highlights

88+ commits, 20 contributors, 700+ files changed.

The TLDR: **a complete manipulation stack**, **MuJoCo simulation**, **DDS transport**, and **a rewritten visualization pipeline**. Agents are no longer bolted on top; they're refactored as native modules with direct stream access. The entire ROS message dependency has been removed from core DimOS, and we've added VR, phone, and arm teleoperation stacks. You can now vibecode a pick-and-place task from natural language to motor commands.

Installation has been significantly streamlined: no more direnv, simpler setup, and the web viewer is now the default.

---

## 🚀 New Features

### Simulation

- **MuJoCo simulation module**: Run any DimOS blueprint in simulation with no hardware. Supports xArm and Unitree embodiments, parses MJCF/URDF for robot properties, monotonic clock timing (no `time.sleep`). `dimos --simulation run unitree-go2` ([#1035](#1035)) by @jca0
- **Simulation teleop blueprints**: Added simulation teleop blueprints for Piper, xArm6, and xArm7. ([#1308](#1308)) by @mustafab0

### Manipulation

- **Modular manipulation stack**: Full planning stack with Drake: FK/IK solvers (Jacobian + Drake optimization), RRT path planning, world model with obstacle monitoring, multi-robot management. xArm6/7 and Piper support. ([#1079](#1079)) by @mustafab0
- **Joint servo and cartesian controllers**: Joint position/velocity controllers and cartesian IK task with Pinocchio solver. PoseStamped stream input for real-time control. ([#1116](#1116)) by @mustafab0
- **GraspGen integration**: Grasp generation via Docker-hosted GPU model. Lazy container startup, thread-safe init, RPC `generate_grasps()` returns ranked PoseArray. ([#1119](#1119), [#1234](#1234)) by @JalajShuklaSS
- **Gripper control**: Gripper RPC methods on control coordinator, exposed adapter property for custom implementations. ([#1213](#1213)) by @mustafab0
- **Detection3D and Object support**: Object input topics, TF support on manipulation module, pointcloud-to-convex-hull for Drake imports. ([#1236](#1236)) by @mustafab0
- **Agentic pick and place**: Reimplemented manipulation skills for agent-driven pick-and-place workflows. ([#1237](#1237)) by @mustafab0

### Teleoperation

- **Quest VR teleoperation**: Full WebXR + Deno bridge stack. Quest controller data (pose, trigger, grip) streamed to DimOS modules. Monitor-style locking for control loops. ([#1215](#1215)) by @ruthwikdasyam
- **Phone teleoperation**: Control Go2 from your phone with a web-based teleop interface. ([#1280](#1280)) by @ruthwikdasyam
- **Arm teleop with Pinocchio IK**: Single and dual arm teleoperation using Pinocchio inverse kinematics. Blueprints for xArm, Piper, and dual configurations. ([#1246](#1246)) by @ruthwikdasyam

### Transports & Infrastructure

- **DDS transport protocol**: CycloneDDS transport with configurable QoS (high-throughput and reliable profiles). Optional install, benchmark integration. ([#1174](#1174)) by @Kaweees
- **Pubsub pattern subscriptions**: Glob and regex pattern matching for topic subscriptions. `subscribe_all()` for bridge-style consumers. Topic type encoding in channel strings (`/topic#module.ClassName`). ([#1114](#1114)) by @leshy
- **LCM raw bytes passthrough**: Skip `lcm_encode()` when message is already bytes. ([#1223](#1223)) by @leshy
- **Unified TimeSeriesStore**: Pluggable backends (InMemory, SQLite, Pickle, PostgreSQL) with SortedKeyList for O(log n) operations. Replaces the old replay system and TimestampedCollection. Collection API with slice, range, and streaming methods. ([#1080](#1080)) by @leshy
- **DimosROS benchmark tests**: Benchmark suite for ROS transport performance. ([#1087](#1087)) by @leshy

### Navigation

- **FASTLIO2 support**: Hardware-verified localization with arm64 support. Docker deployment with FAR Planner, terrain analysis, and bagfile playback mode. Builds or-tools from source on arm64. ([#1149](#1149)) by @baishibona
- **Native Livox + FASTLIO2 module**: First-class DimOS native module for Livox Mid-360 lidar with FASTLIO2 localization. ([#1235](#1235)) by @leshy

### Visualization

- **RerunBridge module and CLI**: New bridge that subscribes to all LCM messages and logs those with `to_rerun()` to Rerun viewer. GlobalConfig singleton, web viewer support. Replaces the old rerun initialization system. ([#1154](#1154)) by @leshy
- **Webcam rerun visualization**: Camera module logs to Rerun with pinhole projection for 3D visualization. ([#1117](#1117)) by @ruthwikdasyam
- **Default viewer switched to rerun-web**: Browser-based viewer is now the default for broader compatibility. No native viewer install needed. ([#1324](#1324)) by @spomichter

### Agents

- **Agent refactor**: Restructured agent module with cleaner imports and global config integration. ([#1211](#1211)) by @paul-nechifor
- **Timestamp knowledge**: Agents now have timestamp awareness in prompts for temporal reasoning. ([#1093](#1093)) by @ClaireBookworm
- **Observe skill**: Go2 can now observe (capture and describe) its environment via agent skill. ([#1109](#1109)) by @paul-nechifor

### Platform & Hardware

- **G1 without ROS**: Unitree G1 blueprints decoupled from ROS dependency. Lazy imports for fast startup. ([#1221](#1221)) by @jeff-hykin
- **ARM (aarch64) support**: DimOS runs on ARM hardware. Platform-conditional dependencies, open3d source builds for arm64. ([#1229](#1229)) by @jeff-hykin
- **Universal joint/hardware schema**: `HardwareComponent` dataclass with `JointState`, `JointName` type aliases. Backend registry with auto-discovery for SDK adapters. ([#1040](#1040), [#1067](#1067)) by @mustafab0

---

## 🔧 Improvements

- **Optional Dask**: Start without Dask using `--no-dask` flag. Startup time reduced from ~60s to ~45s. ([#1111](#1111), [#1232](#1232)) by @paul-nechifor
- **RPC rework**: Renamed `ModuleBlueprint` → `_BlueprintAtom`, `ModuleBlueprintSet` → `Blueprint`, `ModuleConnection` → `Stream`. Added `ModuleRef`, improved type hints throughout. ([#1143](#1143)) by @jeff-hykin
- **Image class simplification**: Rewritten as pure NumPy dataclass. Removed CUDA backend, unused methods (solve_pnp, csrt_tracker), and image_impls/ directory. ([#1161](#1161)) by @leshy
- **Odometry message cleanup**: Simplified Odometry message type. ([#1256](#1256)) by @leshy
- **Remove all ROS message dependencies**: Purged ROS message types from core DimOS. Refactored rosnav to use ROSTransport. Removed dead ROS bridge code. ([#1230](#1230)) by @alexlin2
- **Removed bad function serialization**: Eliminated unnecessary serialization of Python functions. ([#1121](#1121)) by @paul-nechifor
- **Benchmark IEC units**: Switched bandwidth benchmarks from SI to IEC units for accuracy. ([#1147](#1147)) by @leshy
- **Pubsub typing improvements**: Thread-safety locks on `subscribe_new_topics` and `subscribe_all`. Proper type params across pubsub stack. ([#1153](#1153)) by @leshy
- **Autogenerated blueprint list**: Blueprints are now auto-discovered and listed. ([#1100](#1100)) by @paul-nechifor
- **Generic Buttons message**: Renamed `QuestButtons` to `Buttons` with generic field names for cross-platform teleop. ([#1261](#1261)) by @ruthwikdasyam
- **Dev container uses ros-dev image**: `./bin/dev` now runs the ROS-enabled dev image. ([#1170](#1170)) by @leshy
- **LSP support**: Added python-lsp-server and python-lsp-ruff to dev dependencies. ([#1169](#1169)) by @leshy
- **Lazy-load pyrealsense2**: RealSense camera module uses lazy imports to avoid errors in simulation environments without the SDK. ([#1309](#1309)) by @spomichter
- **Removed unused mmcv and mmengine**: Dead Detic dependencies removed, eliminating slow source builds from install. ([#1319](#1319)) by @spomichter
- **Simplified installation**: Removed direnv requirement, streamlined install instructions across all platforms. ([#1315](#1315)) by @spomichter
- **DDS extra excluded from --all-extras**: `cyclonedds` requires a source build, so `dds` is now excluded from `uv sync --all-extras` by default. ([#1318](#1318)) by @spomichter
- **Nix pre-commit skip**: Skip pre-commit install if hooks already exist. ([#1162](#1162)) by @leshy
- **Removed base-requirements**: Consolidated dependency management. ([#1098](#1098)) by @paul-nechifor
- **Removed old graspnet**: Cleaned up deprecated graspnet version. ([#1248](#1248)) by @paul-nechifor
- **Code cleanup**: Removed `tofix` markers ([#1216](#1216)), fixed ruff issues ([#1112](#1112)), removed old README_installation.md ([#1101](#1101)) by @paul-nechifor

---

## 🐛 Bug Fixes

- Fix LFS updating (move from .local to venv) ([#1090](#1090)) by @jeff-hykin
- Launch hotfixes: git clone HTTPS, get_data main branch ([#1091](#1091)) by @spomichter
- Fix camera demo not showing in Rerun ([#1148](#1148)) by @jeff-hykin
- Default to rerun native viewer ([#1099](#1099)) by @Nabla7
- Fix exploration blocking agent loop ([#1258](#1258)) by @paul-nechifor
- Fix person-follow blocking agent loop ([#1278](#1278)) by @paul-nechifor
- Skip metric3d tests on unsupported xformers GPUs (Blackwell compute capability >9.0) ([#1225](#1225)) by @leshy
- Fix manipulation tests ([#1218](#1218), [#1247](#1247)) by @jeff-hykin, @paul-nechifor
- Fix control coordinator e2e test ([#1212](#1212)) by @mustafab0
- Fix xarm7-sim broken e2e tests ([#1294](#1294)) by @paul-nechifor
- Pin langchain to restore supported providers ([#1241](#1241)) by @spomichter
- Fix missing library dependencies in Nix flake ([#1240](#1240)) by @Kaweees
- Fix discord invite link ([#1122](#1122)) by @spomichter
- macOS edgecase fix ([#1096](#1096)) by @jeff-hykin
- Fix second N in logo ([#1250](#1250)) by @jeff-hykin
- Fix Unitree Go2 minor issues ([#1307](#1307)) by @paul-nechifor
- Fix broken tests ([#1305](#1305)) by @ruthwikdasyam
- Fix `uv sync` for some macOS systems ([#1322](#1322)) by @jeff-hykin
- Fix mmcv install ([#1313](#1313)) by @paul-nechifor
- Fix mypy issues ([#1150](#1150), [#1167](#1167), [#1257](#1257)) by @leshy, @paul-nechifor, @jeff-hykin
- Fix Nix install uv pip extras ([#1321](#1321)) by @spomichter

---

## 📚 Documentation

- **Major docs overhaul**: New README with feature grid, hardware table, quickstart. Navigation, transports, data streams, and agent docs. ([#1295](#1295)) by @leshy
- **Day 1 docs**: Comprehensive getting started guide, development docs, contributing guide, architecture overview. Executable blueprint docs via md-babel-py. ([#1064](#1064)) by @jeff-hykin
- **Arm integration guide**: How-to for integrating new robotic arms with DimOS. ([#1238](#1238)) by @mustafab0
- **MCP documentation update**: Updated MCP install and usage instructions. ([#1251](#1251)) by @Kaweees
- **Docker docs**: First pass on Docker deployment documentation. ([#1151](#1151)) by @leshy
- **Transports documentation**: Encode/decode mixins, SHM examples, ROS/DDS transport docs. ([#1107](#1107)) by @leshy
- **Rerun API examples**: Updated examples for the new RerunBridge API. ([#1262](#1262)) by @jeff-hykin
- **PR template** added ([#1172](#1172)) by @christiefhyang
- **Simplified install instructions**: Removed direnv, streamlined across all platforms. ([#1315](#1315)) by @spomichter
- **Python example restored**: Added back the Python usage example. ([#1317](#1317)) by @jeff-hykin
- **Nix install updated**: Replaced uv with pip for Nix compatibility. ([#1326](#1326)) by @ruthwikdasyam
- **README improvements** ([#1311](#1311)) by @paul-nechifor
- **Simplified writing docs**: Consolidated writing_docs to a single markdown file. ([#1254](#1254)) by @jeff-hykin

---

## 🏗️ CI & Build

- **ci-complete gate**: Dynamic branch protection via single aggregated status check. MD-only PRs no longer blocked. ([#1279](#1279)) by @spomichter
- **Path-based test filtering**: Test jobs fully skip (no container spin-up) when no relevant code changed. ([#1284](#1284), [#1286](#1286)) by @spomichter
- **Navigation docker build workflow**: CI builds for the ROS navigation stack. ([#1259](#1259)) by @spomichter
- **CUDA test marker**: `@pytest.mark.cuda` for GPU-dependent tests. ([#1220](#1220)) by @jeff-hykin
- **e2e test marker**: Marked end-to-end tests for selective CI runs. ([#1110](#1110)) by @paul-nechifor
- **pytest stdin fix**: Added `-s` to default addopts for LCM autoconf compatibility. ([#1320](#1320)) by @spomichter

---

## ⚠️ Breaking Changes

- **RPC renames**: `ModuleBlueprint` → `_BlueprintAtom`, `ModuleBlueprintSet` → `Blueprint`, `ModuleConnection` → `Stream` ([#1143](#1143))
- **Image class rewrite**: `CudaImage` and `NumpyImage` removed. Image is now a pure NumPy dataclass. Methods like `solve_pnp`, `csrt_tracker`, `from_depth`, `to_depth_meters` removed. ([#1161](#1161))
- **ROS messages removed from core**: All `to_ros`/`from_ros` conversion methods removed. Use `ROSTransport` instead. ([#1230](#1230))
- **QuestButtons → Buttons**: Renamed with generic field names. ([#1261](#1261))
- **RerunBridge replaces old rerun init**: `dimos.dashboard.rerun_init` removed. Use `RerunBridgeModule` or the `rerun-bridge` CLI. ([#1154](#1154))
- **Unitree directory restructuring**: `unitree_go2` → `unitree/go2`, `unitree_g1` → `unitree/g1`. Blueprint names updated. ([#1221](#1221))
- **Default viewer is now rerun-web**: Use `--viewer-backend rerun` to restore native viewer. ([#1324](#1324))

---

## Quickstart

```bash
# Install
uv pip install dimos[base,unitree]

# Try it (no hardware needed)
# NOTE: First run downloads ~2.4 GB from LFS
dimos --replay run unitree-go2

# Simulate
uv pip install dimos[base,unitree,sim]
dimos --simulation run unitree-go2
```

---

## New Contributors 🎉

- @ruthwikdasyam: Quest VR teleoperation, phone teleop, arm teleop, webcam rerun viz
- @JalajShuklaSS: GraspGen integration
- @jca0: MuJoCo simulation module
- @christiefhyang: PR template

---

**Full Changelog**: [v0.0.9...v0.0.10](v0.0.9...v0.0.10)
Summary
This is a first pass on memory; it cleans up the way we deal with timed events in dimos.

Creates a `TimeSeriesStore[T]` abstraction for timestamped data with multiple backend implementations.

Backend implementations are very simple, and TBH I expect to use sqlite rather than psql; I implemented them more as examples.
TimeSeriesStore[T] Base Class (dimos/memory/timeseries/base.py)

Generic abstract base. Backends implement `_save`, `_load`, `_delete`, `_iter_items`, `_find_closest_timestamp`, `_count`, `_last_timestamp`, `_find_before`, `_find_after`.

API built on top: `save()`, `find_closest()`, `find_before()`, `find_after()`, `prune_old()`, `slice_by_time()`, `iterate()`, `iterate_realtime()`, `stream()`, `consume_stream()`.

`T` is bound to `Timestamped`; timestamps come from the `.ts` attribute.

Backend Implementations

- `InMemoryStore` (`inmemory.py`)
- `PickleDirStore` (`pickledir.py`)
- `SqliteStore` (`sqlite.py`)
- `PostgresStore` (`postgres.py`): `Resource` lifecycle
- `LegacyPickleStore` (`legacy.py`): `TimedSensorReplay` recordings

Unified the way we treat storage of timed items
Replaced TimestampedCollection

`TimestampedCollection` and its subclass `TimestampedBufferCollection` were used for:

- […] (`TBuffer`)
- […] (`align_timestamped`)

All now use `InMemoryStore` directly. `TimestampedBufferCollection` is a thin wrapper adding auto-prune on insert. The `TimestampedCollection` class is deleted entirely.

Replaced Pickle Sensor Replay System with standard time series store interface
Future recording and replay can go via sqlite, for example.
Replaced Transform service in-memory transform storage
They all depend on the same base, so storing transforms in postgres (not that we want to) is a one-line change.
Other changes
- `Embedding` type: all embedding models return the same type

Usage
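A rough sketch of how the layering described in this summary fits together, using self-contained stand-ins rather than the dimos classes. `Reading`, `ToyStore`, and `ToyMemoryStore` are hypothetical names, only three of the backing methods are modelled, and `prune_old` taking an absolute cutoff timestamp is an assumption.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Dict, Generic, Iterator, List, Optional, Tuple, TypeVar


@dataclass
class Reading:
    """Stand-in for a Timestamped payload: anything with a .ts attribute."""
    ts: float
    value: str


T = TypeVar("T", bound=Reading)


class ToyStore(Generic[T], ABC):
    """Public API lives in the base; backends fill in the underscore hooks."""

    @abstractmethod
    def _save(self, timestamp: float, data: T) -> None: ...

    @abstractmethod
    def _delete(self, timestamp: float) -> None: ...

    @abstractmethod
    def _iter_items(self) -> Iterator[Tuple[float, T]]: ...

    def save(self, *items: T) -> None:
        for item in items:
            self._save(item.ts, item)  # timestamp comes from the item itself

    def find_closest(self, ts: float) -> Optional[T]:
        best = min(self._iter_items(), key=lambda p: abs(p[0] - ts), default=None)
        return best[1] if best is not None else None

    def prune_old(self, cutoff: float) -> None:
        stale: List[float] = [t for t, _ in self._iter_items() if t < cutoff]
        for t in stale:
            self._delete(t)


class ToyMemoryStore(ToyStore[T]):
    """Dict-backed backend; a real one would keep items sorted."""

    def __init__(self) -> None:
        self._data: Dict[float, T] = {}

    def _save(self, timestamp: float, data: T) -> None:
        self._data[timestamp] = data

    def _delete(self, timestamp: float) -> None:
        self._data.pop(timestamp, None)

    def _iter_items(self) -> Iterator[Tuple[float, T]]:
        return iter(sorted(self._data.items(), key=lambda p: p[0]))


store: ToyMemoryStore[Reading] = ToyMemoryStore()
store.save(Reading(1.0, "a"), Reading(2.0, "b"), Reading(3.0, "c"))
closest = store.find_closest(2.2)
store.prune_old(2.0)
```

Swapping the backend means replacing only `ToyMemoryStore`; code written against `save`, `find_closest`, and `prune_old` stays the same, which is the property the summary relies on when it calls a backend change a one-line change.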