CI: Cache image on main branch by saturley-hall · Pull Request #7 · ai-dynamo/dynamo

saturley-hall · 2025-03-04T17:13:23Z

No description provided.

github-actions · 2025-03-04T17:47:52Z

Test Results

2 files 2 suites 26s ⏱️
71 tests 71 ✅ 0 💤 0 ❌
89 runs 88 ✅ 1 💤 0 ❌

Results for commit e688847.

saturley-hall · 2025-03-04T17:52:34Z

Closing to reopen

Fixes all actionable items from the second review: Bug fixes: - #1: Change returncode=4 → returncode=2 in pytest_configure exit (4 is reserved by pytest for EXIT_NOTESTSCOLLECTED) - #2: Add comment clarifying HF_HUB_OFFLINE double-clear is safe (already in _MODELS_DIR_ENV_KEYS; loop correctly restores original) Test quality: - #7: Add missing assertions to test_apply_hf_home_layout (HF_HUB_OFFLINE, TRANSFORMERS_OFFLINE, DYNAMO_MODELS_DIR, TRANSFORMERS_CACHE) - #8: Use monkeypatch in tests 3 & 4 for proper env isolation (prevents pre-existing env vars from leaking on test failure) Design / correctness: - #3: Fix _models_dir_env docstring ("exactly once" → "once per worker") - #4: Add comment noting TRANSFORMERS_CACHE deprecation - #5: Update --models-dir help text and docs to reflect both supported layouts (bare HF_HUB_CACHE and HF_HOME), not just bare - #10: Restore pytest.skip() in download_lora() (test-only infra); remove now-redundant guard from minio_lora_service fixture - #11: Raise hub/ detection log to WARNING with guidance - #12: Replace shutil.rmtree(ignore_errors=True) with try/except so cleanup failures are logged rather than silently swallowed Not addressed: #6 (keep gpu_0 per project marker policy), #9 (pytester test deferred — complex due to conftest dependencies, low severity) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Signed-off-by: rrubin <rrubin@nvidia.com>

Bug fixes: - #1: Add monkeypatch env isolation to test_apply_sets_writable_transformers_cache - #2: Add TRANSFORMERS_CACHE assertion to test_apply_bare_cache_layout (bare-layout path was missing the writable-dir check present in HF_HOME test) Minor cleanups: - #4: Move `import pytest` from inside download_lora() to module top-level (lora_utils.py is test-only infra; pytest is always available) - #5: Replace pytestconfig.getoption("--models-dir") re-checks in predownload_models/predownload_tokenizers with os.environ.get("DYNAMO_MODELS_DIR") (_models_dir_env runs first and sets the var; single source of truth) New coverage tests: - #7: test_models_dir_nonexistent_exits_with_code_2 — subprocess test verifying pytest_configure exits with returncode=2 on bad path - #8: test_download_lora_skips_in_models_dir_mode — verifies download_lora() raises pytest.skip.Exception when DYNAMO_MODELS_DIR is set Not addressed: #3 (keep gpu_0 per project guidelines and previous review retraction), #6 (hook ordering is guaranteed), #9 (complex, low priority) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Signed-off-by: rrubin <rrubin@nvidia.com>

…later Reader-first ordering. Old order buried Quick start at position 8 of 9; new order surfaces the runnable commands above all the reference sections (FSx vs EBS, label conventions, aiperf SHA rationale). New section order: 1. Title + 3-config table 2. Pre-requisites (was #3) 3. Quick start (was #8 — promoted) 4. Directory layout (was #7 — now serves as map for the rest) 5. Hardware targets (was #2 — now pure reference; invocation examples moved into Quick start) 6. Storage (was #5) 7. aiperf install (was #6) 8. Naming & ownership (was #4) 9. Notes (unchanged) Also drop the stray "We use ebs by default" sentence — it contradicted both the Storage section and the actual yaml (where the PVC block is fully commented out, no default storage class is set). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…cycle Address senior-architect review of the metrics infrastructure: #1 (two parallel Prometheus paths): merge `register_prometheus` + `setup_component_metrics` into one `setup_metrics(ctx) -> MetricsBindings` method on the Rust `LLMEngine` trait. The trait surface shrinks from 7 → 6 methods; both expfmt bridging and the structured component publisher go through one place. The Python ABC keeps its two methods — the PyO3 bridge adapts (engine-author API unchanged for Python). #3 + #8 (framework lifecycle gauges gated on engine opt-in): - `cleanup_time_seconds` and `drain_time_seconds` move to Rust-side `LifecycleGauges` (prometheus crate gauges registered via the runtime's `MetricsRegistry`). Worker constructs them after `engine.start()` succeeds; observes during cleanup/drain. Emits regardless of engine opt-in. - Drop `set_cleanup_time` / `set_drain_time` from the `ComponentMetricsPublisher` trait — engine no longer responsible. - Drop the corresponding methods from `PyComponentMetricsPublisher`. - `model_load_time_seconds` STAYS on Python `LLMBackendMetrics` for parity with the legacy entry points (both legacy `main.py` and the unified bridge populate it). The unified bridge now constructs `LLMBackendMetrics` UNCONDITIONALLY so the gauge emits even when the engine returns no per-rank sources. #5 (kv_cache_hit_rate Optional semantics): clearer docstring on both Rust `ComponentSnapshot` and the Python ABC — tri-state explicitly documented (`None` = no data, `0.0` = legitimate zero-hit). Deferred with rationale (explained in the post-review architect's note): - #2 (HashMap<MetricKey, f64> ComponentSnapshot): real future-proofing but no current bug; 5 fields for 5 signals isn't worth the refactor cost yet. Revisit at ~10+ signals. - #4 (per-source cadence): explicit defer in original review. - #7 (conformance in CI): workflow change, separate from trait surface. Conformance kit collapses `check_register_prometheus` + `check_component_metrics` into `check_setup_metrics`. All 68 backend-common unit tests pass. Mocker example updated. PyO3 bridge wheel rebuilt successfully. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

ci: cache image on main for speed of ci

89d25e2

saturley-hall temporarily deployed to GITLAB March 4, 2025 17:13 — with GitHub Actions Inactive

ci: run it on PR before setting back to main

e688847

saturley-hall temporarily deployed to GITLAB March 4, 2025 17:17 — with GitHub Actions Inactive

saturley-hall temporarily deployed to GITLAB March 4, 2025 17:23 — with GitHub Actions Inactive

saturley-hall closed this Mar 4, 2025

dagil-nvidia mentioned this pull request Oct 21, 2025

feat: Add automated dependency version tracking and extraction #3547

Closed

coderabbitai Bot mentioned this pull request Nov 5, 2025

feat: Update Dynamo k8s deployment example to use ModelExpress #4112

Closed

MatejKosec mentioned this pull request Feb 25, 2026

fix(responses): accept assistant output_text messages without id/status in input #6599

Merged

3 tasks

tanmayv25 mentioned this pull request Apr 15, 2026

DEP: Backend Interface -- LLMEngine ABC and Worker #8251

Open

atchernych mentioned this pull request Apr 23, 2026

Proposal: Hybrid Priority Scheduling in Dynamo #8580

Open

keivenchang mentioned this pull request Apr 24, 2026

[INVESTIGATE] sglang + DSv4: kwarg workaround breaks live — needs diagnosis #8671

Closed

4 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

CI: Cache image on main branch#7

CI: Cache image on main branch#7
saturley-hall wants to merge 2 commits into
mainfrom
harrison/main_branch_cache

saturley-hall commented Mar 4, 2025

Uh oh!

github-actions Bot commented Mar 4, 2025

Uh oh!

saturley-hall commented Mar 4, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

saturley-hall commented Mar 4, 2025

Uh oh!

github-actions Bot commented Mar 4, 2025

Test Results

Uh oh!

saturley-hall commented Mar 4, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant