models/ refactor #819

Merged: leshy merged 42 commits into dev from embedding_rewrite on Dec 16, 2025

leshy commented Dec 9, 2025

  • adds ignore missing imports for ros/mujoco

  • writes consistent type spec for models:
      - visual language models: models/vl/base.py
      - embedding models: models/embedding/base.py
      - local models: models/base.py
      - huggingface models: models/base.py

classDiagram
    class LocalModel {
        +device
        +dtype
        +_model
    }

    class HuggingFaceModel {
        +model_name
        +from_pretrained()
    }

    class Captioner {
        <<ABC>>
        +caption(image)
    }

    class VlModel {
        +query(image, query)
        +query_json()
        +query_detections()
    }

    class EmbeddingModel~E~ {
        <<ABC>>
        +embed(*images)
        +embed_text(*texts)
    }

    class Embedding {
        +vector
        +to_numpy()
        +to_torch()
    }

    LocalModel <|-- HuggingFaceModel

    Captioner <|-- VlModel

    Captioner <|-- Florence2Model
    HuggingFaceModel <|-- Florence2Model

    EmbeddingModel <|-- CLIPModel
    HuggingFaceModel <|-- CLIPModel

    EmbeddingModel <|-- MobileCLIPModel
    LocalModel <|-- MobileCLIPModel

    VlModel <|-- Qwen : remote API
    VlModel <|-- Moondream : local GPU
    VlModel <|-- MoondreamHosted : remote API

    EmbeddingModel ..> Embedding : produces


  • comprehensive parametric tests for all vl/, embedding/ and captioning models (you don't write your own tests for a new module, just add it to a list)

  • ensures models are resources, so they have start() and stop() methods that clean up CUDA resources; adds an optional warmup argument at initialization

  • adds methods for batch querying of models (for speed)

  • adds optional auto-resize of input images to models (for speed)

  • adds florence2 fast captioning model (implements captioning model type, all vlms are captioners, not all captioners are vlms)

  • adds torch reid models to lfs

  • adds mobileclip models to lfs

  • cleans up dead code (models/segmentation, pointcloud, labels, qwen)

  • adds a quick cli graphing system at dimos/utils/cli/plot.py for profiling
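
The "just add it to a list" approach to testing mentioned above can be sketched roughly as follows. This is a minimal illustration, not the PR's test code: `FakeCaptionerA`, `FakeCaptionerB`, `CAPTIONERS`, `check_captioner` and `run_all` are hypothetical names, and the `Captioner` class here is a stand-in that only mirrors the `caption(image)` method from the class diagram. In a pytest suite the same list would typically be passed to `@pytest.mark.parametrize`.

```python
from abc import ABC, abstractmethod


class Captioner(ABC):
    """Hypothetical minimal captioner interface, mirroring the class diagram."""

    @abstractmethod
    def caption(self, image: object) -> str: ...


class FakeCaptionerA(Captioner):
    """Placeholder model used only for this illustration."""

    def caption(self, image: object) -> str:
        return "a placeholder caption"


class FakeCaptionerB(Captioner):
    """Second placeholder model used only for this illustration."""

    def caption(self, image: object) -> str:
        return "another placeholder caption"


# Registering a new model for testing means appending one entry here.
CAPTIONERS: list[type[Captioner]] = [FakeCaptionerA, FakeCaptionerB]


def check_captioner(model_cls: type[Captioner]) -> None:
    """Shared contract check that every registered captioner must pass."""
    model = model_cls()
    text = model.caption(object())
    assert isinstance(text, str) and text, f"{model_cls.__name__} returned no caption"


def run_all() -> int:
    """Run the shared check against every registered model; return the count."""
    for model_cls in CAPTIONERS:
        check_captioner(model_cls)
    return len(CAPTIONERS)
```

The design choice is that the contract lives in one place, so a new model gets the full set of checks without any new test code.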

[Screenshot: 2025-12-09_17-49]


greptile-apps bot commented Dec 9, 2025

Greptile Overview

Greptile Summary

This PR delivers a comprehensive refactoring of the models architecture, establishing a clean type hierarchy for vision-language models, embedding models, and captioning models. The refactor introduces consistent Resource lifecycle management with start() and stop() methods for proper CUDA memory cleanup.

Key improvements:

  • Unified base classes (LocalModel, HuggingFaceModel) with proper device/dtype configuration
  • Clear separation of concerns: Captioner ABC for captioning, VlModel for vision-language queries, EmbeddingModel for embeddings
  • Performance optimizations: batch query methods, auto-resize for faster inference, warmup support
  • New Florence-2 fast captioning model and MobileCLIP embedding model
  • Comprehensive parametric test suite covering all model types
  • CLI plotting utility for profiling model performance
  • Dead code cleanup (segmentation, pointcloud, labels modules)

Architecture:
The class diagram in the PR description accurately reflects the new hierarchy. VLMs inherit from both Captioner and Resource, while embedding models follow a generic pattern with typed Embedding subclasses.
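
As a rough illustration of the hierarchy described here, the sketch below shows how the pieces could fit together. It is not the code from this PR: the `Resource` stub, the default `caption()` implementation, the list return types, and the `EchoVlModel` toy class are all assumptions made for the example; only the class and method names `Captioner`, `VlModel`, `EmbeddingModel`, `Embedding`, `caption`, `query`, `embed` and `embed_text` come from the class diagram.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Generic, TypeVar


class Resource:
    """Assumed lifecycle interface: start() acquires, stop() releases."""

    def start(self) -> None: ...

    def stop(self) -> None: ...


class Captioner(ABC):
    @abstractmethod
    def caption(self, image: object) -> str: ...


class VlModel(Captioner, Resource):
    """Every VLM is a captioner; here caption() is a query with a fixed prompt."""

    @abstractmethod
    def query(self, image: object, query: str) -> str: ...

    def caption(self, image: object) -> str:
        # Assumption for this sketch: captioning is delegated to query().
        return self.query(image, "Describe this image.")


@dataclass
class Embedding:
    vector: list[float]


# E is the concrete Embedding subclass a given model produces.
E = TypeVar("E", bound=Embedding)


class EmbeddingModel(ABC, Generic[E]):
    @abstractmethod
    def embed(self, *images: object) -> list[E]: ...

    @abstractmethod
    def embed_text(self, *texts: str) -> list[E]: ...


class EchoVlModel(VlModel):
    """Toy stand-in for a real model such as Moondream or Qwen."""

    def query(self, image: object, query: str) -> str:
        return f"answer to: {query}"
```

With this shape, code that only needs captions can accept any `Captioner`, which covers both a caption-only model like Florence-2 and any full VLM.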

Integration:
Changes to detection types (adding is_valid() method, fixing typing to float for pixel coordinates) ensure proper integration with VLM detection queries. The reid system continues to work correctly with the refactored embedding models.

Confidence Score: 5/5

  • This PR is safe to merge with excellent code quality and comprehensive test coverage
  • This is a well-executed refactor with clean architecture, comprehensive tests, proper resource management, and minimal breaking changes. The code follows best practices with ABCs, cached properties, proper typing, and thorough documentation. All model integrations have been verified through parametric tests.
  • No files require special attention

Important Files Changed

File Analysis

| Filename | Score | Overview |
|---|---|---|
| dimos/models/base.py | 5/5 | Clean base class refactor introducing LocalModel and HuggingFaceModel with proper Resource lifecycle management |
| dimos/models/vl/base.py | 5/5 | Introduces Captioner ABC and VlModel with auto-resize support and batch query methods |
| dimos/models/embedding/base.py | 5/5 | Well-designed EmbeddingModel ABC with Embedding class supporting efficient GPU/CPU tensor operations |
| dimos/models/vl/florence.py | 5/5 | New Florence-2 captioning model with efficient batch processing and multiple detail levels |
| dimos/models/embedding/clip.py | 5/5 | CLIP model refactored to use new base classes with proper resource cleanup |
| dimos/models/embedding/mobileclip.py | 5/5 | MobileCLIP implementation using open_clip with local model weights from LFS |
| dimos/models/vl/moondream.py | 5/5 | Moondream VLM with query_multi optimization and auto-resize default at 512x512 |

Sequence Diagram

sequenceDiagram
    participant App
    participant VlModel
    participant LocalModel
    participant Resource
    participant GPU

    Note over App,GPU: Model Initialization
    App->>VlModel: __init__(auto_resize=(512,512))
    VlModel->>LocalModel: super().__init__()
    LocalModel->>LocalModel: Check autostart/warmup
    
    Note over App,GPU: Resource Lifecycle
    App->>VlModel: start()
    VlModel->>LocalModel: start()
    LocalModel->>LocalModel: Access _model @cached_property
    LocalModel->>GPU: Load model to device
    GPU-->>LocalModel: Model loaded
    VlModel->>VlModel: Run warmup query
    
    Note over App,GPU: Query Processing
    App->>VlModel: query(image, "What is this?")
    VlModel->>VlModel: _prepare_image() - resize if needed
    VlModel->>GPU: Forward pass
    GPU-->>VlModel: Response
    VlModel-->>App: String response
    
    Note over App,GPU: Batch Query Optimization
    App->>VlModel: query_batch([img1, img2, img3], "Describe")
    VlModel->>VlModel: Prepare all images
    VlModel->>GPU: Batched forward pass
    GPU-->>VlModel: Batch responses
    VlModel-->>App: List of responses
    
    Note over App,GPU: Cleanup
    App->>VlModel: stop()
    VlModel->>LocalModel: stop()
    LocalModel->>LocalModel: Delete _model from __dict__
    LocalModel->>GPU: torch.cuda.empty_cache()
    GPU-->>LocalModel: Memory freed
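
The lifecycle steps in the sequence diagram (lazy load through a cached property, then dropping `_model` from `__dict__` and emptying the CUDA cache on stop) could look roughly like the sketch below. It is an illustration under assumptions, not the PR's `LocalModel`: the `load_count` counter and the dictionary standing in for model weights are hypothetical, and the `torch` calls are skipped when PyTorch is not installed.

```python
from functools import cached_property


class LocalModel:
    """Sketch of a lazily loaded model that can be started and stopped."""

    def __init__(self, autostart: bool = False) -> None:
        self.load_count = 0  # hypothetical counter, only here to show reloads
        if autostart:
            self.start()

    @cached_property
    def _model(self) -> dict:
        # Stand-in for an expensive load of weights onto the device.
        self.load_count += 1
        return {"weights": "loaded"}

    def start(self) -> None:
        # Touching the cached property triggers the one-time load.
        _ = self._model

    def stop(self) -> None:
        # cached_property stores its value in the instance __dict__, so
        # removing the entry releases the model; the next access reloads it.
        self.__dict__.pop("_model", None)
        try:
            import torch
        except ImportError:
            return
        if torch.cuda.is_available():
            # Return cached, unused GPU memory to the driver.
            torch.cuda.empty_cache()
```

Calling `start()` twice loads once, and `start()` after `stop()` loads again, which is what makes the model behave like a restartable resource.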

greptile-apps bot left a comment

37 files reviewed, 1 comment

leshy commented Dec 9, 2025

@greptileai rereview please, also don't include this sequence diagram please

greptile-apps bot left a comment

Additional Comments (1)

  1. dimos/models/vl/moondream.py, line 154-156 (link)

    logic: query_detections() converts normalized coords to pixels using original image dimensions, but _to_pil() may have resized the image via auto_resize. When auto_resize is active, moondream returns coords normalized to the resized image, so scaling them back to the original dimensions will be incorrect.

    Use the resized dimensions from the PIL image instead:

37 files reviewed, 1 comment


leshy commented Dec 9, 2025

> Additional Comments (1)
>
> 1. `dimos/models/vl/moondream.py`, line 154-156 ([link](/dimensionalos/dimos/blob/5381ff33e3aa1fb1a993fcbb6d72ab5f4e269966/dimos/models/vl/moondream.py#L154-L156))
>    **logic:** `query_detections()` converts normalized coords to pixels using original image dimensions, but `_to_pil()` may have resized the image via `auto_resize`. When auto_resize is active, moondream returns coords normalized to the **resized** image, so scaling them back to the original dimensions will be incorrect.
>    Use the resized dimensions from the PIL image instead:
>
> 37 files reviewed, 1 comment

@greptileai this is not correct, moondream returns normalized (0.0-1.0) coordinates so we don't need to take scaling factor into account when generating detections

greptile-apps bot left a comment

You're absolutely right! I made an error in my analysis.

Moondream's detect() method returns normalized coordinates (0.0-1.0) regardless of the input image size, so the coordinates should indeed be scaled directly to the original image dimensions without considering any resizing factor.

The current implementation in query_detections() is correct:

pil_image = self._to_pil(image)  # May resize image
result = self._model.detect(pil_image, query, settings=settings)
# result contains normalized coords (0.0-1.0)

# Scale normalized coords to original image dimensions - this is correct
detection = Detection2DBBox(
    bbox=(
        result["x1"] * image.width,   # Scale to original width
        result["y1"] * image.height,  # Scale to original height  
        result["x2"] * image.width,
        result["y2"] * image.height,
    ),
    # ...
)

Since Moondream normalizes its output coordinates, they represent the same relative positions regardless of input image size, so scaling them to the original image dimensions is the right approach. Thank you for the correction!
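
The arithmetic behind this point can be checked with a small worked example. The sketch below is only an illustration: the helper names, the image sizes and the box values are made up, and it assumes the resize is a plain rescale of the whole frame. If the resize padded or cropped the image, normalized coordinates would no longer map directly back to the original.

```python
import math


def to_normalized(pixel_box, width, height):
    """Map a pixel (x1, y1, x2, y2) box to normalized 0.0-1.0 coordinates."""
    x1, y1, x2, y2 = pixel_box
    return (x1 / width, y1 / height, x2 / width, y2 / height)


def to_pixels(norm_box, width, height):
    """Map a normalized (x1, y1, x2, y2) box to pixel coordinates."""
    x1, y1, x2, y2 = norm_box
    return (x1 * width, y1 * height, x2 * width, y2 * height)


def boxes_close(a, b):
    """Compare two boxes with a floating-point tolerance."""
    return all(math.isclose(p, q) for p, q in zip(a, b))


# Hypothetical object location in a 1920x1080 original image.
original_size = (1920, 1080)
original_box = (480.0, 270.0, 960.0, 810.0)

# The same object after a plain rescale of the frame to 512x512.
resized_size = (512, 512)
sx = resized_size[0] / original_size[0]
sy = resized_size[1] / original_size[1]
resized_box = (
    original_box[0] * sx,
    original_box[1] * sy,
    original_box[2] * sx,
    original_box[3] * sy,
)

# A detector that sees only the resized image and reports normalized coords...
norm_box = to_normalized(resized_box, *resized_size)

# ...gives values that scale straight back onto the original dimensions.
recovered_box = to_pixels(norm_box, *original_size)
```

Each axis is normalized independently, so the resize factors cancel out even when the rescale changes the aspect ratio.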

leshy commented Dec 9, 2025

> You're absolutely right! I made an error in my analysis.

@greptileai so why does your review score my moondream implementation 3/5 still, can u update

greptile-apps bot left a comment

37 files reviewed, 1 comment

Comment on lines +30 to +31
# tw, th = plt.terminal_size()
# return tw, th // 2

Should this be uncommented?

paul-nechifor previously approved these changes Dec 13, 2025
leshy merged commit cbe68d2 into dev Dec 16, 2025
21 of 25 checks passed
leshy added a commit that referenced this pull request Jan 5, 2026
* captioner modules implemented in models/vl, flake.nix fixes

* model structure rework

* refactor

* bugfix

* removed double update_intrinsic on metric3d

* mypy

* typing fixes

* embedding models rewrite

* mobileclip preprocess accessor rewrite

* torch reid models added to lfs, reid/embedding model cleanup

* mobileclip upload

* batch vlm querying

* moondream batch queries and tests

* type fixes

* proper model resource management, speed tests, auto-resizing, plotting

* type fixes

* tests, mypy, correct cleanup

* metric3d tests

* attempting to remove dead code

* scaling bugfix for visual models

* docstring fix

* plotext dep

* open clip dep

* open clip dep fix

* gdown dep

* tensorboard dep

* typing fixes for detections and plotter

* person tracker typing fix

* py 3.10 typing fix

* last type fix

* ignore missing imports (for ros deps)

* nicer init for florence

* type fixes

* mypy ignore ros/mujoco

* addressing PR comments

* image is a fixture

* captioner fixtures

* all PR comments addressed
spomichter added a commit that referenced this pull request Jan 8, 2026
Release v0.0.6: Pre-Launch Unitree Go2 Release

## What's Changed
* Added is_flying_to_target agent skill and fly_to now return string for agent feeback by @spomichter in #635
* Release v0.0.5 by @spomichter in #697
* Rebase ivan g1 by @paul-nechifor in #709
* Navspec by @leshy in #648
* Remove depth module from base unitree go2 blueprints by @spomichter in #712
* Fix Unitree Go2 (replay and spatial memory) by @paul-nechifor in #714
* Add G1 blueprints, and simulation by @paul-nechifor in #724
* New g1 blueprint runfiles by @spomichter in #706
* Update G1/Go2 skills and remove some Robot interfaces by @paul-nechifor in #717
* Add dimos-robot end-to-end test with agents by @paul-nechifor in #716
* Run DimOS and ROS nav in Docker by @paul-nechifor in #700
* Anim experiment by @leshy in #701
* G1 navigation documentation fixes by @spomichter in #738
* Rename dimos-robot to dimos by @paul-nechifor in #740
* Use a process for MuJoCo by @paul-nechifor in #747
* Remove unneeded code files by @paul-nechifor in #718
* Make pygame G1JoystickModule usable for all modules by @paul-nechifor in #741
* error on conflicts by @paul-nechifor in #763
* Hosted Moondream 3 for VLM queries by @alexlin2 in #751
* transport: Remove DaskTransport dead code by @ym-han in #767
* Add editorconfig by @paul-nechifor in #769
* add `type: ignore` by @paul-nechifor in #768
* exclude .md changes from CICD builds by @spomichter in #770
* Working Ivan g1 detection in blueprints by @spomichter in #737
* small env fixes on a fresh install by @leshy in #778
* autofixes by @paul-nechifor in #744
* Support running local agents by @paul-nechifor in #739
* pin major version of langchain packages by @paul-nechifor in #789
* Deduplicate Unitree connections/entrypoints. by @paul-nechifor in #749
* Add TTS and STT by @paul-nechifor in #753
* fix mypy errors by @paul-nechifor in #791
* Use structlog and store JSON logs on disk by @paul-nechifor in #715
* Rpc fixes merge by @paul-nechifor in #801
* transport improvements by @leshy in #713
* Added concurrency check by @spomichter in #803
* make connections work with string annotations by @paul-nechifor in #807
* Run mypy checks in GitHub Actions by @paul-nechifor in #805
* Fix incorrect `= None` by @paul-nechifor in #802
* increase mujoco timeout by @paul-nechifor in #823
* MacOS Support: tests + devShell + mujoco by @jeff-hykin in #745
* nix flake revert by @leshy in #824
* fix mypy issues by @paul-nechifor in #827
* PRODUCTION Nav skills on drone with tracking by @spomichter in #640
* Fix added memory limit to blueprint global config by @spomichter in #856
* models/ refactor by @leshy in #819
* Point Detections by @leshy in #859
* Add generic ignore to gitignore by @jeff-hykin in #864
* fix set transport by @paul-nechifor in #866
* cli-precedence by @paul-nechifor in #857
* show `get_data` progress by @paul-nechifor in #873
* skip if OPENAI_API_KEY not defined by @paul-nechifor in #872
* build foxglove extension by @paul-nechifor in #871
* New planner by @paul-nechifor in #792
* Use `uv` by @paul-nechifor in #870
* Add direnv to gitignore by @Kaweees in #875
* Cuda mapper by @leshy in #862
* rename agents to agents_deprecated by @paul-nechifor in #877
* new planner new mapper by @paul-nechifor in #879
* odom ts parsing by @leshy in #882
* Sim fix by @paul-nechifor in #881
* navigation tuning by @leshy in #883
* Fix: Module init and agents by @leshy in #876
* Remove old setup.sh by @paul-nechifor in #888
* Release planner by @leshy in #887
* fix replay leak by @paul-nechifor in #890
* first pass on large file deletions by @leshy in #891
* Generalized manipulator driver by @mustafab0 in #831
* Restore MacOS Support (flake.nix) by @jeff-hykin in #863
* check-uv by @paul-nechifor in #902
* Make dimos pip-installable by @paul-nechifor in #731
* Revert "Restore MacOS Support (flake.nix)" by @leshy in #907
* jeff flake without py env stuff by @leshy in #911
* remove deprecated docker files by @paul-nechifor in #912
* command center stop and home by @leshy in #893
* use packages by @paul-nechifor in #915
* Fix agents prompt by @paul-nechifor in #914
* fix manifest by @paul-nechifor in #916
* fix move skill by @paul-nechifor in #913
* Ignore individual errors by @paul-nechifor in #919
* Feat/rerun latency panels by @Nabla7 in #917
* WIP Release detections by @leshy in #889
* Remove old navigation modules by @paul-nechifor in #923
* Feat/rerun latency panels by @Nabla7 in #925
* Repair camera module by @leshy in #929
* Repair Stream by @leshy in #932
* Docs Clean by @leshy in #933
* docs: sensor streams by @leshy in #934
* Docs: bugfixes by @leshy in #940
* Fixed doclinks to use git ls by @spomichter in #943
* Examples: third party language interop by @leshy in #946
* DOCS: temporal alignment docs improvements by @leshy in #944
* filter bots from commits by @leshy in #947
* Fix skills by @paul-nechifor in #950
* Limit Rerun viewer memory to 4GB default by @Nabla7 in #949
* Working dimensional MCP server - tested with Claude Code MCP client by @spomichter in #945
* allow registration of different agents by @paul-nechifor in #951
* Pre commit large files by @leshy in #953
* Proper Realsense and ZED Camera Drivers by @alexlin2 in #935
* Granular deps by @leshy in #894
* class VLMAgent(AgentSpec, Module) for streamed VLM queries over Transport by @spomichter in #960
* mac compatible commit filter by @paul-nechifor in #961

## New Contributors
* @ym-han made their first contribution in #767
* @jeff-hykin made their first contribution in #745
* @Kaweees made their first contribution in #875
* @mustafab0 made their first contribution in #831
* @Nabla7 made their first contribution in #917

**Full Changelog**: v0.0.5...v0.0.6
spomichter pushed a commit that referenced this pull request Jan 8, 2026

Former-commit-id: cbe68d2
spomichter added a commit that referenced this pull request Jan 8, 2026

Former-commit-id: 26e61a70a9469f2e33e51f1296f082b470009c09 [formerly 7ffc878]
Former-commit-id: 725e628
spomichter pushed a commit that referenced this pull request Jan 8, 2026

Former-commit-id: 2026ce3
spomichter added a commit that referenced this pull request Jan 8, 2026

Former-commit-id: 7ffc878
spomichter added a commit that referenced this pull request Jan 8, 2026
Release v0.0.6: Pre-Launch Unitree Go2 Release

Former-commit-id: 7ffc878
Former-commit-id: 067332a
spomichter pushed a commit that referenced this pull request Jan 8, 2026
* captioner modules implemented in models/vl, flake.nix fixes

* model structure rework

* refactor

* bugfix

* removed double update_intrinsic on metric3d

* mypy

* typing fixes

* embedding models rewrite

* mobileclip preprocess accessor rewrite

* torch reid models added to lfs, reid/embedding model cleanup

* mobileclip upload

* batch vlm querying

* moondream batch queries and tests

* type fixes

* proper model resource management, speed tests, auto-resizing, plotting

* type fixes

* tests, mypy, correct cleanup

* metric3d tests

* attempting to remove dead code

* scaling bugfix for visual models

* docstring fix

* plotext dep

* open clip dep

* open clip dep fix

* gdown dep

* tensorboard dep

* typing fixes for detections and plotter

* person tracker typing fix

* py 3.10 typing fix

* last type fix

* ignore missing imports (for ros deps)

* nicer init for florence

* type fixes

* mypy ignore ros/mujoco

* addressing PR comments

* image is a fixture

* captioner fixtures

* all PR comments addressed

Former-commit-id: 81f5aa8 [formerly 2026ce3]
Former-commit-id: 3f5919d
spomichter added a commit that referenced this pull request Jan 8, 2026
Release v0.0.6: Pre-Launch Unitree Go2 Release

Former-commit-id: 26e61a70a9469f2e33e51f1296f082b470009c09 [formerly 7ffc878]
Former-commit-id: 725e628
Former-commit-id: 2e5f1d4
paul-nechifor pushed a commit that referenced this pull request Jan 8, 2026
Former-commit-id: 2be857a [formerly cbe68d2]
Former-commit-id: 3583bdb
jeff-hykin pushed a commit that referenced this pull request Jan 9, 2026
Former-commit-id: 8be510d [formerly cbe68d2]
Former-commit-id: 3583bdb