Skip to content

feat(tests): add --models-dir flag for read-only HF cache#8362

Merged
ranrubin merged 30 commits into
mainfrom
ops-4330/support-local-models
Apr 21, 2026
Merged

feat(tests): add --models-dir flag for read-only HF cache#8362
ranrubin merged 30 commits into
mainfrom
ops-4330/support-local-models

Conversation

@ranrubin
Copy link
Copy Markdown
Contributor

@ranrubin ranrubin commented Apr 20, 2026

Overview:

Adds --models-dir support to the pytest test suite so CI can run model-dependent tests against a pre-staged, read-only HuggingFace cache without downloading anything at test time.

Details:

New functionality (--models-dir flag)

  • pytest_addoption registers --models-dir <path>
  • pytest_configure validates the path exists early (exits with code 2 if not)
  • _models_dir_env session fixture sets HF_HOME or HF_HUB_CACHE (auto-detected from cache layout), HF_HUB_OFFLINE=1, TRANSFORMERS_OFFLINE=1, and DYNAMO_MODELS_DIR
  • predownload_models and predownload_tokenizers short-circuit when --models-dir is active — no downloads attempted
  • download_lora() in lora_utils.py calls pytest.skip() when DYNAMO_MODELS_DIR is set

Bug fixes

  • predownload_models/predownload_tokenizers crash: the early-exit guard documented in the docstring was missing; without it both fixtures called download_models() with HF_HUB_OFFLINE=1 already set, raising OfflineModeIsEnabled
  • PYTHONPATH set to "" on disable: _disable_offline_with_mistral_patch was writing an empty string to PYTHONPATH when the patch dir was its only entry, inserting . into sys.path on some platforms; now pops the key instead
  • _enable_offline_with_mistral_patch not idempotent: added _mistral_patch_applied flag, reset in _disable, so each enable/disable cycle correctly re-injects PYTHONPATH while the class-level patch (which can't be undone) is applied only once per cycle without harmful double-wrapping
  • Dead return value: _apply_models_dir_env always returned (orig, None); removed the dead second element and updated all callers
  • Misleading comment: clarified the HF_HUB_OFFLINE double-write in _apply_models_dir_env

Cleanup

  • Moved from tests.hf_cache import ... to top of conftest.py (PEP 8 / isort)
  • Improved sitecustomize.py content using textwrap.dedent for readability
  • Added fixture dependency comments to predownload_models/predownload_tokenizers explaining the non-obvious _models_dir_env ordering dependency
  • Added @pytest.mark.gpu_0 to all tests (required Hardware marker)

Test improvements

  • monkeypatch.delenv("PYTHONPATH") added to layout tests to prevent sitecustomize pollution on assertion failure
  • PYTHONPATH teardown assertion added to test_restore_clears_vars_that_were_absent
  • caplog assertion added to test_restore_handles_missing_tmp_cache to verify the warning is logged
  • New test_pythonpath_restored_after_apply_restore: verifies pre-existing PYTHONPATH survives an apply/restore round-trip
  • New test_enable_disable_enable_cycle: covers the enable→disable→enable sequence, verifying PYTHONPATH and HF_HUB_OFFLINE are correct after each call
  • Comment added to subprocess test acknowledging it collects from itself

Where should the reviewer start?

  • tests/hf_cache.py — core env management: _mistral_patch_applied per-cycle guard and PYTHONPATH fixes
  • tests/conftest.py — early-exit guards in predownload_models/predownload_tokenizers
  • tests/test_models_dir_flag.py — test isolation improvements and new cycle/PYTHONPATH tests

Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)

  • Relates to OPS-4330

Summary by CodeRabbit

  • New Features
    • Add a CLI flag to run tests against a pre-populated, read-only model cache and forward it to subprocesses.
  • Chores
    • Apply and restore cache/offline environment variables and offline runtime patches across test sessions and subprocesses.
  • Bug Fixes
    • Network-dependent tests (e.g., LoRA downloads) are auto-skipped under read-only cache to avoid failures.
  • Tests
    • Add tests validating cache-layouts, offline behavior, env restore semantics, subprocess forwarding, and skip messaging.
  • Documentation
    • Document accepted cache layouts, offline behavior, and compatibility notes for the flag.

Adds a new pytest CLI option `--models-dir` that lets developers run
the test suite against a pre-populated, read-only HuggingFace cache
without writing to it or making network downloads.

- Register `--models-dir` option in `pytest_addoption`; fail fast in
  `pytest_configure` if the path does not exist (returncode 4)
- Add `_apply_models_dir_env`/`_restore_models_dir_env` helpers that set
  `HF_HUB_CACHE` (or `HF_HOME`), `HF_HUB_OFFLINE=1`,
  `TRANSFORMERS_OFFLINE=1`, a writable `TRANSFORMERS_CACHE` temp dir,
  and the `DYNAMO_MODELS_DIR` sentinel
- Add `_models_dir_env` session-scoped fixture so env setup/teardown
  happens exactly once; both predownload fixtures depend on it and
  short-circuit (no downloads, no writes) when the flag is active
- Guard `download_lora()` with a `pytest.skip()` when `DYNAMO_MODELS_DIR`
  is set, preventing silent failures on air-gapped machines
- Add 5 unit tests in `tests/test_models_dir_flag.py` covering both
  cache layouts, teardown correctness, and writable TRANSFORMERS_CACHE
- Document the flag in `.ai/pytest-guidelines.md`

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Signed-off-by: rrubin <rrubin@nvidia.com>
@github-actions github-actions Bot added feat documentation Improvements or additions to documentation labels Apr 20, 2026
Fixes five issues from code review on #8362:

1. Unit tests: wrap apply/restore in try/finally so env vars are
   always restored even when an assertion fails mid-test.

2. LoRA skip: move pytest.skip() out of download_lora() (utility
   method) and into the minio_lora_service fixture. download_lora()
   now raises RuntimeError so it is safe to call outside pytest.

3. _models_dir_env fixture: wrap yield in try/finally so
   _restore_models_dir_env runs even if _apply_models_dir_env raises.

4. Temp dir leak: _apply_models_dir_env now returns (orig, tmp_cache_dir)
   and _restore_models_dir_env accepts the tmp_cache_dir and deletes it
   with shutil.rmtree(ignore_errors=True).

5. Weak assertions: test_apply_bare_cache_layout now asserts
   "HF_HOME" not in os.environ; test_apply_hf_home_layout asserts
   "HF_HUB_CACHE" not in os.environ.

Minor: add logging.info() to _apply_models_dir_env showing detected
layout; add PYTHONPATH exclusion comment to _MODELS_DIR_ENV_KEYS.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Signed-off-by: rrubin <rrubin@nvidia.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Signed-off-by: rrubin <rrubin@nvidia.com>
Fixes all actionable items from the second review:

Bug fixes:
- #1: Change returncode=4 → returncode=2 in pytest_configure exit
  (4 is reserved by pytest for EXIT_NOTESTSCOLLECTED)
- #2: Add comment clarifying HF_HUB_OFFLINE double-clear is safe
  (already in _MODELS_DIR_ENV_KEYS; loop correctly restores original)

Test quality:
- #7: Add missing assertions to test_apply_hf_home_layout
  (HF_HUB_OFFLINE, TRANSFORMERS_OFFLINE, DYNAMO_MODELS_DIR, TRANSFORMERS_CACHE)
- #8: Use monkeypatch in tests 3 & 4 for proper env isolation
  (prevents pre-existing env vars from leaking on test failure)

Design / correctness:
- #3: Fix _models_dir_env docstring ("exactly once" → "once per worker")
- #4: Add comment noting TRANSFORMERS_CACHE deprecation
- #5: Update --models-dir help text and docs to reflect both supported
  layouts (bare HF_HUB_CACHE and HF_HOME), not just bare
- #10: Restore pytest.skip() in download_lora() (test-only infra);
  remove now-redundant guard from minio_lora_service fixture
- #11: Raise hub/ detection log to WARNING with guidance
- #12: Replace shutil.rmtree(ignore_errors=True) with try/except
  so cleanup failures are logged rather than silently swallowed

Not addressed: #6 (keep gpu_0 per project marker policy), #9 (pytester
test deferred — complex due to conftest dependencies, low severity)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Signed-off-by: rrubin <rrubin@nvidia.com>
Both test_apply_bare_cache_layout and test_apply_hf_home_layout assert
that one of HF_HOME/HF_HUB_CACHE is absent after _apply_models_dir_env.
Without clearing ambient env vars first, these assertions fail spuriously
on developer machines that have those vars set in their shell.

Fix: clear all _MODELS_DIR_ENV_KEYS via monkeypatch before calling
_apply_models_dir_env, consistent with the test_restore_* tests.

Also parametrize test_restore_preserves_preexisting_values with
use_hf_home=[False, True] to cover the HF_HOME restore branch, which
was previously never exercised (tmp_path has no hub/ subdir by default).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Signed-off-by: rrubin <rrubin@nvidia.com>
Bug fixes:
- #1: Add monkeypatch env isolation to test_apply_sets_writable_transformers_cache
- #2: Add TRANSFORMERS_CACHE assertion to test_apply_bare_cache_layout
  (bare-layout path was missing the writable-dir check present in HF_HOME test)

Minor cleanups:
- #4: Move `import pytest` from inside download_lora() to module top-level
  (lora_utils.py is test-only infra; pytest is always available)
- #5: Replace pytestconfig.getoption("--models-dir") re-checks in
  predownload_models/predownload_tokenizers with os.environ.get("DYNAMO_MODELS_DIR")
  (_models_dir_env runs first and sets the var; single source of truth)

New coverage tests:
- #7: test_models_dir_nonexistent_exits_with_code_2 — subprocess test
  verifying pytest_configure exits with returncode=2 on bad path
- #8: test_download_lora_skips_in_models_dir_mode — verifies download_lora()
  raises pytest.skip.Exception when DYNAMO_MODELS_DIR is set

Not addressed: #3 (keep gpu_0 per project guidelines and previous review
retraction), #6 (hook ordering is guaranteed), #9 (complex, low priority)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Signed-off-by: rrubin <rrubin@nvidia.com>
- Fix test_models_dir_nonexistent_exits_with_code_2: cwd was set to an
  empty tmp_path so conftest.py was never discovered and --models-dir
  was never registered. Switch to Path(__file__).parents[1] (project
  root) so the option is registered before pytest_configure fires.

- Document edge case in predownload_models and predownload_tokenizers:
  checking DYNAMO_MODELS_DIR (rather than --models-dir getoption) means
  a developer with that var set in their shell silently skips downloads
  but without HF offline env vars being configured.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Signed-off-by: rrubin <rrubin@nvidia.com>
…tent-path test

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Signed-off-by: rrubin <rrubin@nvidia.com>
Signed-off-by: rrubin <rrubin@nvidia.com>
…ning on transformers≥4.22

Setting TRANSFORMERS_CACHE in _apply_models_dir_env caused a FutureWarning at
transformers import time. With filterwarnings=error in pyproject.toml this
turned every --models-dir test into a failure on the pod (transformers 4.57.6).
TRANSFORMERS_OFFLINE=1 is sufficient; TRANSFORMERS_CACHE was only a guard for
transformers<4.22.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Signed-off-by: rrubin <rrubin@nvidia.com>
…pty components

- _disable_offline_with_mistral_patch now calls shutil.rmtree on the
  sitecustomize patch directory so stale /tmp dirs don't accumulate
- Disable PYTHONPATH filter now drops empty components (e and e != patch_dir)
- Enable PYTHONPATH construction filters empty components from the existing
  value before joining, preventing malformed paths when existing PYTHONPATH
  has leading/trailing colons
- Two new unit tests: test_disable_removes_patch_dir and
  test_enable_normalizes_pythonpath_empty_components

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Signed-off-by: rrubin <rrubin@nvidia.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Signed-off-by: rrubin <rrubin@nvidia.com>
@ranrubin
Copy link
Copy Markdown
Contributor Author

@coderabbitai full review

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Apr 21, 2026

✅ Actions performed

Full review triggered.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
tests/hf_cache.py (1)

98-114: ⚠️ Potential issue | 🟡 Minor

Use os.pathsep for PYTHONPATH splitting and joining.

The exact-entry filtering is good, but hardcoding ":" breaks on platforms where Python path entries are separated differently. Use os.pathsep in both enable and disable paths, and update the corresponding tests to build PYTHONPATH with os.pathsep.

🛠️ Proposed fix
-    existing_entries = [e for e in os.environ.get("PYTHONPATH", "").split(":") if e]
-    os.environ["PYTHONPATH"] = ":".join([patch_dir] + existing_entries)
+    existing_entries = [
+        e for e in os.environ.get("PYTHONPATH", "").split(os.pathsep) if e
+    ]
+    os.environ["PYTHONPATH"] = os.pathsep.join([patch_dir, *existing_entries])
@@
     patch_dir = os.path.join(tempfile.gettempdir(), f"dynamo_test_hf_patch_{worker_id}")
     pythonpath = os.environ.get("PYTHONPATH", "")
-    result = ":".join(e for e in pythonpath.split(":") if e and e != patch_dir)
+    result = os.pathsep.join(
+        e for e in pythonpath.split(os.pathsep) if e and e != patch_dir
+    )

Read-only verification:

#!/bin/bash
# Description: Find hardcoded PYTHONPATH separators in the helper and its tests.

rg -n 'PYTHONPATH|split\(":\"\)|":".join' tests/hf_cache.py tests/test_models_dir_flag.py -C 2
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/hf_cache.py` around lines 98 - 114, Update both
_enable_offline_with_mistral_patch and _disable_offline_with_mistral_patch to
use os.pathsep instead of the hardcoded ":" when splitting and joining
PYTHONPATH entries: replace split(":") and ":".join(...) with split(os.pathsep)
and os.pathsep.join(...), ensure filtering logic that removes patch_dir
(variable patch_dir) and builds existing_entries remains correct, and update any
tests that construct PYTHONPATH to use os.pathsep so they behave correctly on
non-UNIX platforms.
🧹 Nitpick comments (1)
tests/conftest.py (1)

150-157: Apply --models-dir env vars before test collection to ensure they're set at import time.

Currently _models_dir_env runs as a session fixture after test modules are collected/imported. If any test module imports huggingface_hub or transformers at the top level, it reads HF env vars before this fixture applies them. While tests currently follow the convention of side-effect-free module-level imports, the design should protect against this rather than rely on convention.

Move _apply_models_dir_env() into pytest_configure after path validation, store the snapshot on config.stash, and restore in pytest_unconfigure. Keep _models_dir_env as a compatibility no-op fixture for predownload dependencies that declare it as a requirement.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/conftest.py` around lines 150 - 157, Move application of the models-dir
env snapshot into pytest_configure: after validating models_dir in
pytest_configure, call _apply_models_dir_env() and save its returned snapshot
onto config.stash (e.g., config.stash["models_dir_env_snapshot"] = snapshot);
implement pytest_unconfigure to restore the snapshot from config.stash by
invoking the corresponding restore logic; leave the _models_dir_env fixture
present but make it a no-op that only exists for compatibility so predownload
dependencies depending on it still resolve. Ensure you reference the functions
_apply_models_dir_env, pytest_configure, pytest_unconfigure, and the
_models_dir_env fixture and use config.stash for storage/restoration.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/hf_cache.py`:
- Around line 126-153: Update the environment keys snapshot tuple
_MODELS_DIR_ENV_KEYS to include "TRANSFORMERS_CACHE",
"PYTORCH_TRANSFORMERS_CACHE", and "PYTORCH_PRETRAINED_BERT_CACHE" so that
_apply_models_dir_env() captures and can restore/clear these legacy Transformers
cache variables; ensure the existing logic that pops/sets HF_HUB_CACHE and
HF_HOME also pops those added keys (using os.environ.pop(..., None)) before
setting HF_HUB_OFFLINE, TRANSFORMERS_OFFLINE and DYNAMO_MODELS_DIR, so
pre-existing TRANSFORMERS_CACHE or legacy vars cannot override the intended
read-only --models-dir behavior.

---

Duplicate comments:
In `@tests/hf_cache.py`:
- Around line 98-114: Update both _enable_offline_with_mistral_patch and
_disable_offline_with_mistral_patch to use os.pathsep instead of the hardcoded
":" when splitting and joining PYTHONPATH entries: replace split(":") and
":".join(...) with split(os.pathsep) and os.pathsep.join(...), ensure filtering
logic that removes patch_dir (variable patch_dir) and builds existing_entries
remains correct, and update any tests that construct PYTHONPATH to use
os.pathsep so they behave correctly on non-UNIX platforms.

---

Nitpick comments:
In `@tests/conftest.py`:
- Around line 150-157: Move application of the models-dir env snapshot into
pytest_configure: after validating models_dir in pytest_configure, call
_apply_models_dir_env() and save its returned snapshot onto config.stash (e.g.,
config.stash["models_dir_env_snapshot"] = snapshot); implement
pytest_unconfigure to restore the snapshot from config.stash by invoking the
corresponding restore logic; leave the _models_dir_env fixture present but make
it a no-op that only exists for compatibility so predownload dependencies
depending on it still resolve. Ensure you reference the functions
_apply_models_dir_env, pytest_configure, pytest_unconfigure, and the
_models_dir_env fixture and use config.stash for storage/restoration.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: cbdf8383-a1c4-48c7-91a5-384f93b436a1

📥 Commits

Reviewing files that changed from the base of the PR and between 8679faa and 660bd6e.

📒 Files selected for processing (5)
  • .ai/pytest-guidelines.md
  • tests/conftest.py
  • tests/hf_cache.py
  • tests/serve/lora_utils.py
  • tests/test_models_dir_flag.py

Comment thread tests/hf_cache.py
…--models-dir mode

TRANSFORMERS_CACHE, PYTORCH_TRANSFORMERS_CACHE, and PYTORCH_PRETRAINED_BERT_CACHE
take precedence over HF_HUB_CACHE/HF_HOME; if pre-set in the environment they
silently override --models-dir, defeating read-only cache mode.

Adds _TRANSFORMERS_CACHE_OVERRIDE_KEYS tuple, includes these keys in
_MODELS_DIR_ENV_KEYS so they are snapshotted and restored, and pops them
explicitly in _apply_models_dir_env after the HF_HOME/HF_HUB_CACHE logic.
Tests updated to assert all three keys are absent after apply.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Signed-off-by: rrubin <rrubin@nvidia.com>
@ranrubin
Copy link
Copy Markdown
Contributor Author

@coderabbitai full review

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Apr 21, 2026

✅ Actions performed

Full review triggered.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
tests/hf_cache.py (1)

98-99: Nit: use unpacking to avoid list concatenation.

Ruff RUF005 flags this. Trivial change:

♻️ Proposed refactor
-    existing_entries = [e for e in os.environ.get("PYTHONPATH", "").split(":") if e]
-    os.environ["PYTHONPATH"] = ":".join([patch_dir] + existing_entries)
+    existing_entries = [e for e in os.environ.get("PYTHONPATH", "").split(":") if e]
+    os.environ["PYTHONPATH"] = ":".join([patch_dir, *existing_entries])
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/hf_cache.py` around lines 98 - 99, Replace the list concatenation when
building the PYTHONPATH with sequence unpacking to satisfy Ruff RUF005: instead
of creating existing_entries and using [patch_dir] + existing_entries, construct
the new path using unpacking so os.environ["PYTHONPATH"] = ":".join([patch_dir,
*existing_entries]) (referencing variables patch_dir and existing_entries and
the os.environ["PYTHONPATH"] assignment).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@tests/hf_cache.py`:
- Around line 98-99: Replace the list concatenation when building the PYTHONPATH
with sequence unpacking to satisfy Ruff RUF005: instead of creating
existing_entries and using [patch_dir] + existing_entries, construct the new
path using unpacking so os.environ["PYTHONPATH"] = ":".join([patch_dir,
*existing_entries]) (referencing variables patch_dir and existing_entries and
the os.environ["PYTHONPATH"] assignment).

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: b6363dad-3a71-4152-82b9-f19ceafeb54b

📥 Commits

Reviewing files that changed from the base of the PR and between 8679faa and 2e7887d.

📒 Files selected for processing (5)
  • .ai/pytest-guidelines.md
  • tests/conftest.py
  • tests/hf_cache.py
  • tests/serve/lora_utils.py
  • tests/test_models_dir_flag.py

Comment thread tests/test_models_dir_flag.py Outdated
…_dir_env

The parameter was never passed by any production caller (conftest.py and all
other test call sites use the single-arg form). Its only caller was the
companion test test_restore_handles_missing_tmp_cache, which existed solely to
cover that unreachable branch. Remove both together.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Signed-off-by: rrubin <rrubin@nvidia.com>
@ranrubin ranrubin enabled auto-merge (squash) April 21, 2026 16:17
@ranrubin ranrubin disabled auto-merge April 21, 2026 16:17
@ranrubin ranrubin enabled auto-merge (squash) April 21, 2026 16:45
@ranrubin ranrubin merged commit b688597 into main Apr 21, 2026
90 of 92 checks passed
@dmitry-tokarev-nv
Copy link
Copy Markdown
Contributor

@ranrubin Should this test be modified/skipped when the model is present in the cache?
https://github.com/ai-dynamo/dynamo/blob/main/tests/test_predownload_models.py

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

actions documentation Improvements or additions to documentation feat size/XL

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants