fix(tests): simplify testing by paul-nechifor · Pull Request #1343 · dimensionalOS/dimos

paul-nechifor · 2026-02-22T04:19:24Z

Problem

Our testing setup is quite complicated.

Closes DIM-559

Solution

* Remove all marks except: tool, slow (replacement for heavy/integration/e2e), and mujoco.
* Have a single testing job which runs all pytest tests.
* Fix broken neverending and module tests.
* Add custom skipif markers like skipif_in_ci, skipif_no_openai to lessen duplication.
* Fix the remaining tests which didn't close threads.
* Update the docs.
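
The custom skipif markers listed above could be built roughly like this. It is a minimal sketch, not the code from this PR: the helper names `skipif_env_set` and `skipif_env_missing`, and the environment variable names `CI` and `OPENAI_API_KEY`, are assumptions for illustration.

```python
import os

import pytest


def skipif_env_set(var, reason):
    """Build a skipif marker that skips when `var` is set to a non-empty value."""
    return pytest.mark.skipif(bool(os.environ.get(var)), reason=reason)


def skipif_env_missing(var, reason):
    """Build a skipif marker that skips when `var` is unset or empty."""
    return pytest.mark.skipif(not os.environ.get(var), reason=reason)


# Assumed variable names; the real conftest.py may check different ones.
skipif_in_ci = skipif_env_set("CI", "does not run in CI")
skipif_no_openai = skipif_env_missing("OPENAI_API_KEY", "OPENAI_API_KEY is not set")


@skipif_no_openai
def test_needs_openai():
    # Only runs when the key is present, so the lookup cannot fail here.
    assert os.environ["OPENAI_API_KEY"]
```

Defining each marker once and importing it into test files is what removes the duplication: the condition and the reason string live in a single place.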

Breaking Changes

None

How to Test

Run the commands from https://github.com/dimensionalOS/dimos/blob/paul/fix/testing-docs-2/docs/development/testing.md .

Contributor License Agreement

I have read and approved the CLA.

greptile-apps · 2026-02-22T04:21:52Z

Greptile Summary

This PR refactors the test infrastructure by consolidating 13 pytest markers down to 3 (tool, slow, mujoco), removing deprecated/unused markers like heavy, integration, e2e, ros, lcm, gpu, cuda, module, neverending, vis, and exclude.

Key changes:

* Simplified pytest configuration in pyproject.toml with unified slow marker for all tests taking >1 second
* Replaced CUDA detection logic with environment-based skipif_* markers (skipif_in_ci, skipif_no_openai, skipif_no_alibaba)
* Consolidated CI workflow from 5 separate test jobs (run-tests, run-heavy-tests, run-lcm-tests, run-integration-tests, run-ros-tests) into 2 jobs with simplified filters
* Updated 40+ test files to use new marker system
* Removed obsolete test stubs and tests marked as neverending, exclude, or lcm-only
* Improved test_stream.py with event-based synchronization instead of time.sleep() and proper cleanup
* Updated documentation to reflect new simplified testing approach
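
To illustrate the event-based synchronization mentioned for test_stream.py, here is a minimal sketch using only the standard library. It is not the code from the PR: `wait_for_messages` and `fake_stream` are hypothetical names, and the subscribe function is assumed to return an unsubscribe callable.

```python
import threading


def wait_for_messages(subscribe, expected, timeout=2.0):
    """Collect `expected` messages from `subscribe`, or raise if the timeout expires."""
    received = []
    done = threading.Event()
    lock = threading.Lock()

    def on_message(msg):
        with lock:
            received.append(msg)
            if len(received) >= expected:
                done.set()  # wake the waiting test immediately

    unsubscribe = subscribe(on_message)
    try:
        if not done.wait(timeout):
            raise TimeoutError(f"got {len(received)} of {expected} messages")
    finally:
        unsubscribe()  # always clean up, even when the wait fails
    with lock:
        return list(received)


def fake_stream(callback):
    """Hypothetical publisher: emits 0, 1, 2 from a background thread."""

    def run():
        for i in range(3):
            callback(i)

    thread = threading.Thread(target=run)
    thread.start()
    return thread.join  # "unsubscribing" here waits for the thread to exit


messages = wait_for_messages(fake_stream, expected=3)
```

Compared with a fixed `time.sleep()`, the test returns as soon as the last message arrives and fails with a clear error if it never does.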

The refactor makes the test suite more maintainable by reducing complexity and providing clearer categorization between fast/slow tests.

Confidence Score: 5/5

This PR is safe to merge with minimal risk - it's a well-executed test infrastructure refactor
The changes are systematic and consistent across all files, replacing deprecated markers with a simpler system. The refactor improves test organization, removes dead code, and makes the CI pipeline more efficient by consolidating jobs. All changes follow clear patterns and improve code quality without touching production logic.
No files require special attention - all changes follow consistent patterns

Important Files Changed

| Filename | Overview |
| --- | --- |
| pyproject.toml | Simplified pytest markers from 13 down to 3 (`tool`, `slow`, `mujoco`) and updated test filter to match |
| dimos/conftest.py | Replaced CUDA detection logic with configurable `skipif_*` markers for CI, OpenAI, and Alibaba API keys |
| docs/development/testing.md | Updated documentation to reflect new simplified marker system (`slow` instead of `integration`, `heavy`, `e2e`, etc.) |
| .github/workflows/docker.yml | Consolidated 5 separate test jobs into a single job running all non-`tool`/`mujoco` tests with duration reporting |
| dimos/core/test_stream.py | Changed `@pytest.mark.module` to `@pytest.mark.slow`, added proper cleanup calls and event-based synchronization instead of sleep |
| dimos/msgs/nav_msgs/test_OccupancyGrid.py | Removed 104-line `@pytest.mark.lcm` test function for LCM broadcasting |
| dimos/perception/detection/test_moduleDB.py | Removed entire 59-line `test_moduledb_basic` function that was marked with `@pytest.mark.neverending` |
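
The PR registers its three markers in pyproject.toml. For readers who prefer to see the registration in Python, the same effect can be had from a `pytest_configure` hook in conftest.py. This is a sketch only: the description for `slow` follows the summary above, while the descriptions for `tool` and `mujoco` are assumptions.

```python
# conftest.py (sketch)

MARKERS = {
    "tool": "developer tooling, not part of the default run (assumed meaning)",
    "slow": "tests taking more than about one second",
    "mujoco": "tests that need the MuJoCo simulator (assumed meaning)",
}


def pytest_configure(config):
    """Register the markers so pytest does not warn about unknown marks."""
    for name, description in MARKERS.items():
        config.addinivalue_line("markers", f"{name}: {description}")
```

With the markers registered, `pytest -m "not slow"` deselects the slow tests, and `pytest --durations=10` lists the ten slowest items, which helps decide what deserves the `slow` mark.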

Last reviewed commit: 3e10cd5

greptile-apps

33 files reviewed, no comments

greptile-apps

48 files reviewed, no comments

jeff-hykin

I love the color red when it comes to PRs. My only questions are:

* Is cuda still fine? (E.g. non-cuda systems just don't add the cuda flag, and that's why we don't need a has_cuda check.)
* I'm a bit unsure of our policy for .stop/close/shutdown/unsub. It feels like there should be a consistent naming system, e.g. stop is always stop-all, or some similar rule.

jeff-hykin · 2026-02-24T06:20:47Z

dimos/core/test_core.py

+    nav.stop()
+    nav.stop_rpc_client()
+    robot.stop_rpc_client()
+    dimos.close_all()


Is this typical/recommended over dimos.shutdown?

Yes, dimos.close_all does a few other things, but we won't be using Dask for long as I'm removing it.

I guess we can have whatever our forking coordinator is conform to Resource, so stop() recursively goes through active modules/processes?
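
A rough sketch of that idea, assuming a hypothetical `Resource` base class (the real dimos `Resource` interface is not shown in this thread and may differ): each resource keeps track of its children, and `stop()` walks the tree.

```python
class Resource:
    """Hypothetical base: owns child resources and stops them recursively."""

    def __init__(self, name):
        self.name = name
        self.children = []
        self.stopped = False

    def add(self, child):
        self.children.append(child)
        return child

    def stop(self):
        if self.stopped:  # idempotent: a second stop() is a no-op
            return
        # Stop children in reverse order of registration, then ourselves.
        for child in reversed(self.children):
            child.stop()
        self.stopped = True


coordinator = Resource("coordinator")
nav = coordinator.add(Resource("nav"))
nav.add(Resource("nav-rpc-client"))
robot = coordinator.add(Resource("robot"))

coordinator.stop()  # one call replaces the four explicit cleanup calls
```

Under this design a test only needs to stop the top-level object; nothing depends on the caller remembering every module and RPC client.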

jeff-hykin · 2026-02-24T06:25:29Z

dimos/core/test_stream.py


    @rpc
-    def stop(self) -> None:
+    def unsub_all(self) -> None:


Seems kinda nice for stop to be a universal method that includes stuff like unsub_all.

I agree. But this test module defined stop whilst not actually calling super().stop() and then calling dimos.close_all(). It felt easier at the time to just rename but I should probably fix it the right way. 😅
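
For reference, "the right way" would look roughly like the sketch below: the subclass keeps the name `stop`, releases its own subscriptions, and then defers to the base class. `Module` and `SubscriberModule` are hypothetical stand-ins, not the dimos classes, and a plain list plays the role of a stream.

```python
class Module:
    """Hypothetical stand-in for a base module with its own cleanup."""

    def __init__(self):
        self.running = True

    def stop(self):
        self.running = False


class SubscriberModule(Module):
    def __init__(self):
        super().__init__()
        self._unsubs = []  # callables that each undo one subscription

    def subscribe(self, stream, callback):
        stream.append(callback)
        self._unsubs.append(lambda: stream.remove(callback))

    def stop(self):
        # Drop our subscriptions first, then let the base class finish cleanup.
        while self._unsubs:
            self._unsubs.pop()()
        super().stop()


stream = []
module = SubscriberModule()
module.subscribe(stream, print)
module.stop()
```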

paul-nechifor · 2026-02-24T06:49:08Z

* is cuda still fine? (E.g. non cuda systems just don't add the cuda flag, that's why we don't need a has_cuda check)

There are no CUDA tests left. The Image CUDA stuff was removed a while ago, and the Metric3D thing was removed recently. It might make sense to add it again later, but it isn't needed for now.

leshy · 2026-02-24T08:13:25Z

.github/workflows/docker.yml

-
  ci-complete:
-    needs: [check-changes, ros, python, ros-python, dev, ros-dev, run-tests, run-heavy-tests, run-lcm-tests, run-integration-tests, run-ros-tests, run-mypy]
+    needs: [check-changes, ros, python, ros-python, dev, ros-dev, run-ros-tests, run-mypy]


Just to know what the plan was with ROS/non-ROS parallel tests: I planned to treat the ROS build as a separate OS, since it does heavy intervention into the OS itself, and wanted to make sure that unguarded ROS imports don't crash non-ROS machines. So potentially we want to re-introduce them.

It makes sense to test that, but maybe not on every PR? Maybe we can have a periodic runner which runs all the tests on a variety of environments like Ubuntu 22.04/24.04, with ROS/without ROS, macOS, etc.
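
As a small illustration of the "unguarded ROS imports" concern, a module can check for ROS before importing it. This is a sketch under assumptions: `has_module`, `HAS_ROS`, and `require_ros` are hypothetical names, and `rclpy` is used as the example ROS 2 package.

```python
import importlib.util


def has_module(name):
    """Return True if `name` looks importable, without importing the module itself."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # A dotted name with a missing parent package raises instead of returning None.
        return False


HAS_ROS = has_module("rclpy")


def require_ros():
    """Import rclpy lazily so that merely importing this file never needs ROS."""
    if not HAS_ROS:
        raise RuntimeError("rclpy is not installed on this machine")
    import rclpy

    return rclpy
```

In a test file, `pytest.importorskip("rclpy")` gives a similar guard by skipping the test when the import fails.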

leshy

This is so amazing, thanks Paul

leshy · 2026-02-24T08:16:14Z

Paul is codebase Jesus, suffering for our sins

greptile-apps bot reviewed Feb 22, 2026

paul-nechifor marked this pull request as draft February 22, 2026 06:40

paul-nechifor force-pushed the paul/fix/testing-docs-2 branch 5 times, most recently from 8ac5305 to f5f6f03 Compare February 24, 2026 06:06

paul-nechifor marked this pull request as ready for review February 24, 2026 06:06

greptile-apps bot reviewed Feb 24, 2026

paul-nechifor changed the title ~~fix(tests): fix~~ fix(tests): simplify testing Feb 24, 2026

paul-nechifor force-pushed the paul/fix/testing-docs-2 branch from 3e10cd5 to 70d79b5 Compare February 24, 2026 06:36

jeff-hykin reviewed Feb 24, 2026

leshy reviewed Feb 24, 2026

leshy previously approved these changes Feb 24, 2026

fix(tests): simplify testing

e65a531

paul-nechifor dismissed leshy’s stale review via e65a531 February 24, 2026 10:13

paul-nechifor force-pushed the paul/fix/testing-docs-2 branch from 70d79b5 to e65a531 Compare February 24, 2026 10:13

leshy approved these changes Feb 24, 2026

leshy merged commit 3b15cde into dev Feb 24, 2026
12 checks passed
