ci: remove timeout for cargo deny check#4
Closed
nv-anants wants to merge 3 commits into
Closed
Conversation
Contributor
Test Results0 tests 0 ✅ 0s ⏱️ Results for commit d750624. ♻️ This comment has been updated with latest results. |
Member
|
Merged into PR #3 |
kylehh
pushed a commit
to kylehh/dynamo
that referenced
this pull request
Apr 11, 2025
elyasmnvidian
added a commit
that referenced
this pull request
Oct 14, 2025
elyasmnvidian
added a commit
that referenced
this pull request
Oct 15, 2025
ryanolson
added a commit
that referenced
this pull request
Dec 2, 2025
Signed-off-by: Ryan Olson <rolson@nvidia.com>
soodoshll
added a commit
to soodoshll/dynamo
that referenced
this pull request
Feb 9, 2026
… (text mode) (ai-dynamo#4) * fix * ups Signed-off-by: Qidong Su <soodoshll@gmail.com> * upd * update * fix * upd * fix --------- Signed-off-by: Qidong Su <soodoshll@gmail.com>
4 tasks
ranrubin
added a commit
that referenced
this pull request
Apr 20, 2026
Fixes all actionable items from the second review: Bug fixes: - #1: Change returncode=4 → returncode=2 in pytest_configure exit (4 is reserved by pytest for EXIT_NOTESTSCOLLECTED) - #2: Add comment clarifying HF_HUB_OFFLINE double-clear is safe (already in _MODELS_DIR_ENV_KEYS; loop correctly restores original) Test quality: - #7: Add missing assertions to test_apply_hf_home_layout (HF_HUB_OFFLINE, TRANSFORMERS_OFFLINE, DYNAMO_MODELS_DIR, TRANSFORMERS_CACHE) - #8: Use monkeypatch in tests 3 & 4 for proper env isolation (prevents pre-existing env vars from leaking on test failure) Design / correctness: - #3: Fix _models_dir_env docstring ("exactly once" → "once per worker") - #4: Add comment noting TRANSFORMERS_CACHE deprecation - #5: Update --models-dir help text and docs to reflect both supported layouts (bare HF_HUB_CACHE and HF_HOME), not just bare - #10: Restore pytest.skip() in download_lora() (test-only infra); remove now-redundant guard from minio_lora_service fixture - #11: Raise hub/ detection log to WARNING with guidance - #12: Replace shutil.rmtree(ignore_errors=True) with try/except so cleanup failures are logged rather than silently swallowed Not addressed: #6 (keep gpu_0 per project marker policy), #9 (pytester test deferred — complex due to conftest dependencies, low severity) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Signed-off-by: rrubin <rrubin@nvidia.com>
ranrubin
added a commit
that referenced
this pull request
Apr 20, 2026
Bug fixes: - #1: Add monkeypatch env isolation to test_apply_sets_writable_transformers_cache - #2: Add TRANSFORMERS_CACHE assertion to test_apply_bare_cache_layout (bare-layout path was missing the writable-dir check present in HF_HOME test) Minor cleanups: - #4: Move `import pytest` from inside download_lora() to module top-level (lora_utils.py is test-only infra; pytest is always available) - #5: Replace pytestconfig.getoption("--models-dir") re-checks in predownload_models/predownload_tokenizers with os.environ.get("DYNAMO_MODELS_DIR") (_models_dir_env runs first and sets the var; single source of truth) New coverage tests: - #7: test_models_dir_nonexistent_exits_with_code_2 — subprocess test verifying pytest_configure exits with returncode=2 on bad path - #8: test_download_lora_skips_in_models_dir_mode — verifies download_lora() raises pytest.skip.Exception when DYNAMO_MODELS_DIR is set Not addressed: #3 (keep gpu_0 per project guidelines and previous review retraction), #6 (hook ordering is guaranteed), #9 (complex, low priority) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Signed-off-by: rrubin <rrubin@nvidia.com>
4 tasks
yifjiang
added a commit
to yifjiang/dynamo
that referenced
this pull request
May 4, 2026
…code-waiter cleanup Addresses two correctness issues raised by dynamo-ops review on PR 8965. 1. **Race on PrefillReady re-insert** (model_manager.rs) Both `register_prefill_router` and `activate_prefill_router` previously used a `remove → process → insert` pattern on `prefill_router_activators`. Between the `remove` and the `insert` the entry is gone from the map, so a concurrent `remove_prefill_activator` call (called by the watcher on prefill-component teardown) hits the empty map, skips its cleanup, and we then re-insert a stale `PrefillReady` for a prefill that's already gone. The next decode rebuild's lookup finds the resurrected cache and activates a `PrefillRouter` against a dead endpoint. Refactor both functions to use DashMap's `Entry` API. The shard lock is held for the duration of the OccupiedEntry, so any concurrent `remove_prefill_activator` serializes after us and observes the entry it needs to clear. The PrefillReady-consume path in `register_prefill_router` now reads-and-clones in place without removing; the DecodeWaiting/PrefillReady → PrefillReady transition in `activate_prefill_router` uses `OccupiedEntry::insert(...)` for an atomic value swap that returns the old value. 2. **Decode-waiter cleanup gated on `removed.is_some()`** (watcher.rs) `handle_delete_helper`'s decode branch only called `remove_decode_prefill_waiter` when `remove_worker_set` returned `Some(_)`. If decode registered (creating a `DecodeWaiting` activator entry via `register_prefill_router`) but `handle_add_helper` failed later (e.g., on `kv_chooser_for`, monitor setup, or `build_routed_pipeline`) before `add_worker_set`, no WorkerSet ever landed in the manager — and on later teardown `remove_worker_set` returns `None`, leaving the stale `DecodeWaiting` entry orphaned in the activator map. Drop the `removed.is_some()` guard. The helper itself is state-safe (uses `DashMap::remove_if(|_, v| matches!(v, DecodeWaiting(_)))`) so calling it on a key that's vacant or holds `PrefillReady` is a no-op. The prefill-teardown branch keeps its `removed.is_some()` guard because `remove_prefill_activator` blanket-removes both states, and calling it without a real WorkerSet teardown could orphan a live decode-side `DecodeWaiting`. Behavior is unchanged for all four reproducers (ai-dynamo#1 burst-sweep canary OFF, ai-dynamo#2 Pattern 1 multi-decode restart, ai-dynamo#3 Pattern 2 stale-DecodeWaiting, ai-dynamo#4 burst-sweep canary ON); will re-verify on a fresh build.
4 tasks
furionw
added a commit
that referenced
this pull request
May 12, 2026
…later Reader-first ordering. Old order buried Quick start at position 8 of 9; new order surfaces the runnable commands above all the reference sections (FSx vs EBS, label conventions, aiperf SHA rationale). New section order: 1. Title + 3-config table 2. Pre-requisites (was #3) 3. Quick start (was #8 — promoted) 4. Directory layout (was #7 — now serves as map for the rest) 5. Hardware targets (was #2 — now pure reference; invocation examples moved into Quick start) 6. Storage (was #5) 7. aiperf install (was #6) 8. Naming & ownership (was #4) 9. Notes (unchanged) Also drop the stray "We use ebs by default" sentence — it contradicted both the Storage section and the actual yaml (where the PVC block is fully commented out, no default storage class is set). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
joshuayao
pushed a commit
to joshuayao/dynamo
that referenced
this pull request
May 13, 2026
…xl v1.0.1 - sglang_ref: e5c2a9b -> v0.5.11 - SGLANG_KERNEL_REF: c668bb67 -> 3fbd4d3372 (latest sgl-kernel-xpu main) - nixl_ref: 0.10.1 -> v1.0.1 Manually re-applied from 7990.patch commit ai-dynamo#4 (skipped during git am due to context drift in container/context.yaml). Signed-off-by: yuanwu <yuan.wu@intel.com>
nv-rinig
added a commit
that referenced
this pull request
May 14, 2026
That reference was tied to the CI fan-out section, which is no longer in this PR. Signed-off-by: Rini Gupta <rinig@nvidia.com>
This was referenced May 15, 2026
tanmayv25
added a commit
that referenced
this pull request
May 16, 2026
…cycle Address senior-architect review of the metrics infrastructure: #1 (two parallel Prometheus paths): merge `register_prometheus` + `setup_component_metrics` into one `setup_metrics(ctx) -> MetricsBindings` method on the Rust `LLMEngine` trait. The trait surface shrinks from 7 → 6 methods; both expfmt bridging and the structured component publisher go through one place. The Python ABC keeps its two methods — the PyO3 bridge adapts (engine-author API unchanged for Python). #3 + #8 (framework lifecycle gauges gated on engine opt-in): - `cleanup_time_seconds` and `drain_time_seconds` move to Rust-side `LifecycleGauges` (prometheus crate gauges registered via the runtime's `MetricsRegistry`). Worker constructs them after `engine.start()` succeeds; observes during cleanup/drain. Emits regardless of engine opt-in. - Drop `set_cleanup_time` / `set_drain_time` from the `ComponentMetricsPublisher` trait — engine no longer responsible. - Drop the corresponding methods from `PyComponentMetricsPublisher`. - `model_load_time_seconds` STAYS on Python `LLMBackendMetrics` for parity with the legacy entry points (both legacy `main.py` and the unified bridge populate it). The unified bridge now constructs `LLMBackendMetrics` UNCONDITIONALLY so the gauge emits even when the engine returns no per-rank sources. #5 (kv_cache_hit_rate Optional semantics): clearer docstring on both Rust `ComponentSnapshot` and the Python ABC — tri-state explicitly documented (`None` = no data, `0.0` = legitimate zero-hit). Deferred with rationale (explained in the post-review architect's note): - #2 (HashMap<MetricKey, f64> ComponentSnapshot): real future-proofing but no current bug; 5 fields for 5 signals isn't worth the refactor cost yet. Revisit at ~10+ signals. - #4 (per-source cadence): explicit defer in original review. - #7 (conformance in CI): workflow change, separate from trait surface. Conformance kit collapses `check_register_prometheus` + `check_component_metrics` into `check_setup_metrics`. All 68 backend-common unit tests pass. Mocker example updated. PyO3 bridge wheel rebuilt successfully. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What does the PR do?
Checklist
<commit_type>: <Title>Commit Type:
Check the conventional commit type
box here and add the label to the github PR.
Related PRs:
Where should the reviewer start?
Test plan:
Caveats:
Background
Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)