feat(ci): add macOS CI runners for mypy + tests (DIM-696) #1482

Open
spomichter wants to merge 17 commits into dev from feat/dim-696-macos-ci

@spomichter
Contributor

Summary

Adds macOS Apple Silicon (M1) to the CI test matrix. Runs mypy and pytest in parallel with the existing Linux pipeline.

Linear: DIM-696

What's new

Two new jobs in docker.yml:

| Job | Runner | What it does |
| --- | --- | --- |
| macos-tests | macos-latest (M1 arm64, 3 CPU, 7GB RAM) | pytest with coverage |
| macos-mypy | macos-latest | mypy type checking |

Both gate ci-complete alongside existing Linux checks.

How it works

No Docker — runs directly on bare metal:

  1. actions/checkout@v4 (no LFS)
  2. Install uv via astral-sh/setup-uv@v6
  3. uv python install 3.12
  4. uv sync --all-extras --no-extra cuda --no-extra cpu --no-extra dds --no-extra unitree --frozen
  5. Run tests / mypy

Excluded extras (no macOS wheels)

  • cuda — nvidia/CUDA packages
  • cpu — ctransformers (no macOS build)
  • dds — cyclonedds
  • unitree — unitree-webrtc-connect

Storage budget (14GB SSD)

  • OS + tools: ~5GB
  • Repo (no LFS): ~300MB
  • Venv (no CUDA): ~4-5GB
  • Headroom: ~4GB
  • Disk usage logged pre/post test for monitoring
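
The storage budget above is plain subtraction, and it can be checked in a few lines. This is a minimal sketch, assuming the midpoint (4.5 GB) of the 4-5 GB venv estimate; the function names are illustrative and not part of the workflow, and `shutil.disk_usage` only stands in for however the job actually logs disk usage.

```python
import shutil

# Figures from the PR description, in GB.
DISK_GB = 14.0
BUDGET_GB = {
    "os_and_tools": 5.0,
    "repo_no_lfs": 0.3,
    "venv_no_cuda": 4.5,  # assumed midpoint of the 4-5 GB estimate
}


def headroom_gb(total=DISK_GB, budget=BUDGET_GB):
    """Planned free space once every budgeted item is on disk."""
    return total - sum(budget.values())


def free_gb(path="."):
    """Free space right now on the filesystem that contains `path`."""
    return shutil.disk_usage(path).free / 1e9


if __name__ == "__main__":
    print(f"planned headroom: {headroom_gb():.1f} GB")
    print(f"free right now:   {free_gb():.1f} GB")
```

With these figures the planned headroom is about 4.2 GB, which matches the ~4GB quoted in the list.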

Changes

  • .github/workflows/docker.yml: +78 lines (two new jobs + ci-complete gate update)

Add two parallel macOS jobs to the CI pipeline:
- macos-tests: pytest on Apple Silicon (macos-latest, M1 arm64)
- macos-mypy: mypy type checking on macOS

Uses GitHub-hosted runners (no Docker, no containers). Installs
deps via uv with --all-extras minus cuda/cpu/dds/unitree (no macOS
wheels). LFS files are not fetched (pointer files only).

Both jobs gate ci-complete alongside existing Linux checks.

@greptile-apps
Contributor

greptile-apps bot commented Mar 7, 2026

Greptile Summary

This PR extends the CI matrix by adding two GitHub-hosted macOS (Apple Silicon) jobs — macos-tests and macos-mypy — which run pytest and mypy directly on bare metal using uv, without Docker. Both jobs are correctly wired into the ci-complete gate and use appropriate extras exclusions for platform-incompatible packages (CUDA, CPU, DDS, Unitree). The mypy configuration already marks ROS/platform-specific stubs as ignore_missing_imports, so the macOS mypy run should be clean.

Key observations:

  • Missing uv caching: Neither macos-tests nor macos-mypy configures enable-cache: true / cache-dependency-glob on astral-sh/setup-uv@v6. This causes a full dependency reinstall (~4–5 GB) on every run, significantly increasing job duration and macOS runner minute costs.
  • New critical-path dependency on GitHub-hosted runners: Both macOS jobs are added to ci-complete's needs, meaning a GitHub macOS runner outage or quota exhaustion can block all PR merges, even those unrelated to macOS. The existing gate only depended on self-hosted Linux runners.
  • Minor asymmetry: run-tests (Linux) uses --durations=0 and _DIMOS_COV=1 ... coverage combine, while macos-tests uses --durations=10 and omits coverage combine. The omission of coverage combine is consistent given the macOS job doesn't set _DIMOS_COV=1, but the --durations difference is a minor inconsistency worth being aware of.

Confidence Score: 3/5

  • Safe to merge with minor caveats — the new jobs are functionally correct but introduce a performance gap (no caching) and a new availability risk on the CI gate.
  • The YAML logic is sound: conditions are correctly scoped, extras exclusions are appropriate for macOS, and mypy's ignore_missing_imports on ROS stubs prevents false failures. However, the missing uv cache configuration will make every macOS run unnecessarily slow and expensive, and wiring GitHub-hosted runners into ci-complete introduces a new class of CI flakiness not present before this PR.
  • .github/workflows/docker.yml — specifically the ci-complete needs list and the missing cache config in both macOS jobs.

Important Files Changed

| Filename | Overview |
| --- | --- |
| .github/workflows/docker.yml | Adds two new macOS CI jobs (macos-tests, macos-mypy) using GitHub-hosted runners and wires them into the ci-complete gate; both jobs lack uv dependency caching, which will make every run slow and expensive, and placing GitHub-hosted runners on the critical merge path introduces a new availability risk. |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A([push / pull_request]) --> B[check-changes\nself-hosted Linux]

    B -->|ros/python/dev/tests changes| C[ros]
    B -->|ros/python/dev/tests changes| D[python]
    B -->|always| E[ros-python]
    B -->|always| F[dev]
    B -->|always| G[ros-dev]
    B -->|tests/ros/python/dev| H[run-tests\nLinux Docker]
    B -->|tests/ros/python/dev| I[run-mypy\nLinux Docker]
    B -->|tests/python| J[macos-tests\nGitHub-hosted macOS]
    B -->|tests/python| K[macos-mypy\nGitHub-hosted macOS]

    C --> E
    D --> F
    E --> G
    G --> H
    G --> I

    H --> L{ci-complete}
    I --> L
    J --> L
    K --> L

    L -->|any failure or cancelled| M([❌ CI Failed])
    L -->|all passed or skipped| N([✅ CI Passed])

Last reviewed commit: 647074c

Comment on lines +269 to +271
      - name: Install uv
        uses: astral-sh/setup-uv@v6

Missing uv dependency caching

Neither macos-tests nor macos-mypy configures caching for the uv virtual environment. Every CI run will re-download and reinstall the entire dependency set from scratch (potentially 4–5 GB of packages), which will be both slow and costly on GitHub-hosted macOS runners.

astral-sh/setup-uv@v6 supports built-in caching via enable-cache and cache-dependency-glob. Adding these options would cache the resolved packages across runs, so only changed dependencies need to be re-fetched.

Suggested change
-      - name: Install uv
-        uses: astral-sh/setup-uv@v6
+      - name: Install uv
+        uses: astral-sh/setup-uv@v6
+        with:
+          enable-cache: true
+          cache-dependency-glob: "uv.lock"

The same fix applies to the Install uv step in macos-mypy (line 314).

Comment on lines +238 to +239
   ci-complete:
-    needs: [check-changes, ros, python, ros-python, dev, ros-dev, run-tests, run-mypy]
+    needs: [check-changes, ros, python, ros-python, dev, ros-dev, run-tests, run-mypy, macos-tests, macos-mypy]

macOS jobs are now on the critical merge path

Adding macos-tests and macos-mypy to ci-complete's needs means that any outage of GitHub-hosted macOS runners will block all PRs from passing the CI gate — even changes that have no impact on macOS (e.g., ROS-only or Docker-only changes). Previously, ci-complete only depended on self-hosted Linux runners.

The macOS jobs already use a condition that skips them when tests or python changes aren't detected, so for most non-Python changes they'll appear as skipped. The concern is when they are triggered but the GitHub macOS runner pool is unavailable or the job times out — in that case ci-complete will fail with contains(needs.*.result, 'failure').

Consider whether this is intentional, or whether the gate should be more selective (e.g., only fail if macOS jobs explicitly fail rather than treating a stall as a block).
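
The gate behaviour described in the flowchart (fail on any failure or cancellation, pass when everything succeeded or was skipped) and the "more selective" alternative raised here can be compared side by side. This is only a sketch of the decision logic in Python, not the workflow's actual expression; `ci_complete`, `selective_gate`, and `ADVISORY_JOBS` are hypothetical names, and treating a stalled job as `cancelled` is an assumption.

```python
# Hypothetical set of jobs that should block only on an explicit failure.
ADVISORY_JOBS = frozenset({"macos-tests", "macos-mypy"})


def ci_complete(results):
    """Current gate: any 'failure' or 'cancelled' result fails CI.

    `results` maps job name -> 'success' | 'failure' | 'cancelled' | 'skipped'.
    """
    return not any(r in {"failure", "cancelled"} for r in results.values())


def selective_gate(results, advisory=ADVISORY_JOBS):
    """Alternative gate: advisory jobs block only when they explicitly fail."""
    for job, result in results.items():
        if result == "failure":
            return False
        if result == "cancelled" and job not in advisory:
            return False
    return True


if __name__ == "__main__":
    stalled = {"run-tests": "success", "run-mypy": "success", "macos-tests": "cancelled"}
    print(ci_complete(stalled))     # the strict gate blocks the merge
    print(selective_gate(stalled))  # the selective gate lets it through
```

The trade-off is that the selective variant can hide a real macOS regression whenever the macOS job is cancelled rather than failed, so it only makes sense if the macOS jobs are treated as advisory.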

Separate macos.yml workflow (not in docker.yml) so macOS-only pushes
don't trigger the full Docker/navigation pipeline.

- macos-tests: pytest on Apple Silicon (macos-latest, M1 arm64)
- macos-mypy: mypy type checking on macOS
- Explicit extras: dev, agents, web, visualization, sim, manipulation,
  drone, psql (no torch/cuda/unitree/dds)
- uv cache enabled for faster repeat installs
- paths-ignore: markdown, docker files
- Change filter: only runs when dimos/*, pyproject.toml, uv.lock, or
  the workflow file itself changes
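
The change filter in this commit amounts to "run only if at least one changed file matches a trigger pattern". A rough Python sketch of that rule follows, assuming the workflow lives at `.github/workflows/macos.yml` (the commit message names only `macos.yml`) and using `fnmatch`-style globs, which only approximate GitHub's own path-filter syntax; `should_run` is a hypothetical helper.

```python
from fnmatch import fnmatchcase

# Patterns paraphrased from the commit message; the workflow path is assumed.
TRIGGER_PATTERNS = (
    "dimos/*",
    "pyproject.toml",
    "uv.lock",
    ".github/workflows/macos.yml",
)


def should_run(changed_files, patterns=TRIGGER_PATTERNS):
    """True if any changed file matches any trigger pattern.

    In fnmatch-style globs, `*` also matches `/`, so `dimos/*` covers nested files.
    """
    return any(
        fnmatchcase(path, pattern)
        for path in changed_files
        for pattern in patterns
    )


if __name__ == "__main__":
    print(should_run(["dimos/core/stats.py"]))  # source change
    print(should_run(["README.md"]))            # docs-only change
```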
@spomichter force-pushed the feat/dim-696-macos-ci branch from 29609a7 to f4419a0 on March 7, 2026 18:48

…bility

- Add missing dependency groups to macOS workflow: misc, unitree, perception
- Fix psutil io_counters() mypy error on macOS with type ignore comment
- This resolves missing packages: googlemaps, unitree-webrtc-connect, transformers, ultralytics, moondream

All four install cleanly on macOS arm64:
- perception: transformers, ultralytics (torch CPU ~800MB)
- misc: googlemaps, open_clip_torch, torchreid
- unitree: unitree-webrtc-connect-leshy (pure Python, py3-none-any)
- base: core deps

Only cuda, cpu, dds remain excluded (genuine platform incompatibility).
Also revert cron bot's incorrect changes to extras list.
Keep psutil type: ignore fix from cron.

unitree-webrtc-connect-leshy depends on pyaudio, which needs the
portaudio.h header from the portaudio system library. Add a brew install portaudio step.

Per docs/installation/osx.md:
- brew install gnu-sed gcc portaudio git-lfs libjpeg-turbo
- uv sync --all-extras --no-extra dds --frozen

Only dds (cyclonedds) is excluded on macOS. Everything else installs.

NVIDIA CUDA packages don't have macOS wheels and cause
uv sync --all-extras to fail on macOS runners.

Exclude the cuda extra alongside the existing dds exclusion.

Slow tests (daemon e2e, MCP stress) hang or take 60+ min on the
3-core M1 runner. Skip them with -m 'not (tool or slow or mujoco)'.
Also add 30min job timeout and 120s per-test timeout as safety nets.

Fast tests + mypy still validate macOS compatibility.
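
The marker expression `not (tool or slow or mujoco)` keeps a test only when it carries none of the three markers. The sketch below models that selection rule in plain Python and is not how pytest implements it; the test names and the `lcm` marker in the example are made up for illustration.

```python
# Markers excluded by the -m expression in this commit.
EXCLUDED_MARKERS = frozenset({"tool", "slow", "mujoco"})


def is_selected(test_markers, excluded=EXCLUDED_MARKERS):
    """Equivalent of `not (tool or slow or mujoco)` for one test's marker set."""
    return excluded.isdisjoint(test_markers)


def select(tests, excluded=EXCLUDED_MARKERS):
    """Keep the names of tests whose markers avoid every excluded marker."""
    return [name for name, markers in tests.items() if is_selected(markers, excluded)]


if __name__ == "__main__":
    # Hypothetical test inventory: name -> markers applied to it.
    tests = {
        "test_config_load": set(),
        "test_daemon_e2e": {"slow"},
        "test_mcp_stress": {"tool", "slow"},
        "test_transport": {"lcm"},
    }
    print(select(tests))
```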
Revert cron bot's stats.py edit so docker workflow doesn't detect
Python changes and trigger run-tests on the Linux runners.

…rs on macOS

- io_counters() method is not available on all platforms including macOS
- Added hasattr() check to handle platform differences gracefully
- Maintains backward compatibility by falling back to zero values when unavailable
- Fixes mypy error: 'Process' has no attribute 'io_counters' on macOS CI
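
The hasattr() fallback described in this commit can be sketched as follows. This is an illustration and not the code in stats.py: `read_io_bytes` is a hypothetical helper, and the two stand-in classes only mimic a `psutil.Process` with and without `io_counters()` so the example runs without psutil installed.

```python
from types import SimpleNamespace


def read_io_bytes(proc):
    """Return (read_bytes, write_bytes), or zeros when io_counters() is missing."""
    if hasattr(proc, "io_counters"):
        counters = proc.io_counters()
        return counters.read_bytes, counters.write_bytes
    # Platform without per-process I/O counters (e.g. macOS): fall back to zeros.
    return 0, 0


class LinuxLikeProcess:
    """Stand-in for a process object that exposes io_counters()."""

    def io_counters(self):
        return SimpleNamespace(read_bytes=1024, write_bytes=2048)


class MacLikeProcess:
    """Stand-in for a process object without io_counters()."""


if __name__ == "__main__":
    print(read_io_bytes(LinuxLikeProcess()))
    print(read_io_bytes(MacLikeProcess()))
```

A check like this handles the runtime case, but a static checker may still flag the attribute access, depending on how the type stubs declare it per platform.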
- Scope tests to core/ + utils/ (287 tests, ~5 min vs 995 @ 40+ min)
- Add LCM multicast route + UDP buffer sysctl before tests
  (same as dimos autoconf for macOS, which is skipped when CI=1)
- Tests were hanging because LCM couldn't bind multicast without route
- mypy still checks all of dimos/

@spomichter force-pushed the feat/dim-696-macos-ci branch from 7d7af63 to 993d2fb on March 7, 2026 21:12

…duce buffer size

- Add hasattr() check for psutil.Process.io_counters() in stats.py (not available on macOS)
- Reduce kern.ipc.maxsockbuf from 8388608 to 6291456 in macOS CI workflow (macOS limit)
- Enable multicast on loopback interface explicitly
- Configure LCM to use localhost-only networking (udpm://127.0.0.1:7667?ttl=0)
- Add additional networking sysctls for IP forwarding and TTL
- Add debug output for network configuration
- Set LCM_DEFAULT_URL environment variable for tests

This should resolve the 'No route to host' LCM networking failures on macOS GitHub runners by avoiding problematic multicast networking and using localhost-only communication.

- Reset LCM networking to exact autoconf equivalents (route + sysctl)
- Remove cron bot's stats.py edits and backup file
- Only macos.yml changed vs dev
- kern.ipc.maxsockbuf capped at 6291456 on macOS (8388608 = 'Result too large')
- io_counters() doesn't exist on macOS psutil; runtime already catches
  AttributeError but mypy flags it; type: ignore[attr-defined] fixes it.

LCM can't create multicast sockets on GitHub-hosted macOS runners
despite correct route + sysctl config. Skip specific LCM tests via -k.
Non-LCM tests (types, config, blueprints, daemon signals) still run.
LCM tests validated on local macOS + Linux CI.