
[Perf] Streams 2: Add AMDGPU/HIP stream support#408

Open
hughperkins wants to merge 60 commits into hp/streams-quadrantsic-1-cuda-streams from hp/streams-quadrantsic-2-amdgpu-cpu

Conversation

@hughperkins
Collaborator

Mirrors the CUDA stream implementation for HIP: adds stream_ member to AMDGPUContext, stream_destroy/stream_wait_event/malloc_async/ mem_free_async to HIP driver functions, and AMDGPU branches in all Program stream/event methods. Converts AMDGPU kernel launcher to use async memory operations through the active stream. CPU backend returns 0 handles (no-op).

Issue: #

Brief Summary

copilot:summary

Walkthrough

copilot:walkthrough

@hughperkins hughperkins marked this pull request as draft March 11, 2026 23:54
Batch the device_result_buffer free into the stream pipeline before the
sync barrier, matching the CUDA kernel launcher's ordering for
consistency and marginal performance improvement.
Use memcpy_host_to_device_async for external array transfers so they
are properly ordered on the active stream, matching the CUDA launcher.
Lower GPU speedup threshold from 1.5x to 1.3x to reduce flakiness in
CI under contention, and print actual timings for diagnostics.
…ead_local

Mirror the CUDA fixes: guard stream_synchronize against handle==0 to
avoid unintentional default stream sync, and make AMDGPUContext::stream_
thread_local for thread-safety.
@hughperkins
Collaborator Author

Opus review (written before last 6 commits above):

Code Review: hp/streams-quadrantsic-2-amdgpu-cpu

Summary

This PR adds AMDGPU (HIP) support for the stream and event APIs that were previously implemented only for CUDA. It touches four areas:

  1. program.cpp -- Mirrors every CUDA stream/event entry point with an AMDGPU equivalent under #ifdef QD_WITH_AMDGPU
  2. AMDGPU RHI layer (amdgpu_context.h/cpp, amdgpu_driver_functions.inc.h) -- Adds stream_ member, getter/setter, missing HIP driver function bindings, and wires the active stream through kernel launch
  3. AMDGPU kernel launcher -- Migrates from synchronous malloc/memcpy/mem_free to their async variants, stream-aware throughout
  4. Tests -- Extends test_create_and_destroy_stream and test_create_and_destroy_event to AMDGPU; adds a new concurrency timing test

The approach is sound: it closely mirrors the existing CUDA stream implementation, and the diff is consistent and well-scoped.


Architecture & Design

The pattern of duplicating each #ifdef QD_WITH_CUDA block with an #ifdef QD_WITH_AMDGPU block in program.cpp is consistent with the existing codebase style. However, this does produce a lot of nearly-identical code (9 blocks). This is a pre-existing design issue and not something to block the PR on, but it's worth noting that a future refactor toward a backend-agnostic GPUDriver interface could eliminate this duplication.

The AMDGPU kernel launcher now structurally matches the CUDA one closely, which is the right approach.


Issues

1. Operation ordering divergence from CUDA kernel launcher (minor)

In the CUDA kernel launcher, mem_free_async(device_result_buffer) is queued on the stream before the transfers are synchronized:

// quadrants/runtime/cuda/kernel_launcher.cpp:144-198
// ... kernel launches ...
if (ctx.arg_buffer_size > 0) {
    CUDADriver::get_instance().mem_free_async(device_arg_buffer, active_stream);
}
if (ctx.result_buffer_size > 0) {
    CUDADriver::get_instance().memcpy_device_to_host_async(
        host_result_buffer, device_result_buffer, ctx.result_buffer_size,
        active_stream);
}
CUDADriver::get_instance().mem_free_async(device_result_buffer,
                                          active_stream);
// copy data back to host
if (transfers.size() > 0) {
    CUDADriver::get_instance().stream_synchronize(active_stream);
    // ...
}

In the AMDGPU version, mem_free_async(device_result_buffer) comes after the transfers sync:

// quadrants/runtime/amdgpu/kernel_launcher.cpp:168-180
if (transfers.size()) {
    AMDGPUDriver::get_instance().stream_synchronize(active_stream);
    for (auto itr = transfers.begin(); itr != transfers.end(); itr++) {
        // ...
    }
}
AMDGPUDriver::get_instance().mem_free_async(device_result_buffer,
                                            active_stream);

This isn't a correctness bug, but it's a minor inefficiency: in the CUDA version, the free is batched into the stream pipeline before the sync barrier; in the AMDGPU version, the sync has already drained the stream, so mem_free_async runs on an idle stream. Consider matching the CUDA ordering for consistency and a marginal performance gain.
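For illustration, the queueing difference can be modeled with a minimal host-side mock (MockStream and the op-name strings are hypothetical stand-ins, not the repo's API):

```python
class MockStream:
    """Records operation order; synchronize() marks a full drain of the queue."""

    def __init__(self):
        self.log = []

    def enqueue(self, op):
        self.log.append(op)

    def synchronize(self):
        self.log.append("SYNC")


# CUDA ordering: the free is enqueued before the barrier, so one drain
# covers both the result copy-back and the buffer free.
cuda = MockStream()
cuda.enqueue("memcpy_device_to_host_async(result)")
cuda.enqueue("mem_free_async(device_result_buffer)")
cuda.synchronize()

# AMDGPU ordering (as reviewed): the sync drains the queue first, then
# the "async" free lands on an already-idle stream.
amd = MockStream()
amd.enqueue("memcpy_device_to_host_async(result)")
amd.synchronize()
amd.enqueue("mem_free_async(device_result_buffer)")

assert cuda.log.index("mem_free_async(device_result_buffer)") < cuda.log.index("SYNC")
assert amd.log.index("mem_free_async(device_result_buffer)") > amd.log.index("SYNC")
```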

2. memcpy_host_to_device for external arrays is still synchronous

// quadrants/runtime/amdgpu/kernel_launcher.cpp:71-72
AMDGPUDriver::get_instance().memcpy_host_to_device(
    (void *)device_ptrs[data_ptr_idx], data_ptr, arr_sz);

The external array transfer path still uses synchronous memcpy_host_to_device, which will operate on the default stream rather than active_stream. This is consistent with the CUDA launcher (which does the same), but worth noting: if a user sets a custom stream and passes host-backed external arrays, the transfer won't be ordered with respect to that stream. This is a pre-existing limitation in both backends.

3. free_kernel_arg_pointer uses synchronous mem_free on async-allocated memory

// quadrants/rhi/amdgpu/amdgpu_context.h:44-50
void free_kernel_arg_pointer() {
    for (auto &i : kernel_arg_pointer_) {
        AMDGPUDriver::get_instance().mem_free(i);
    }
    kernel_arg_pointer_.erase(kernel_arg_pointer_.begin(),
                              kernel_arg_pointer_.end());
}

context_pointer is allocated with malloc_async on active_stream but freed later via mem_free (synchronous). Mixing is allowed by HIP, and the synchronous free will implicitly wait, so this isn't a correctness issue. But if free_kernel_arg_pointer is ever called while the kernel is still in flight on a non-default stream, the synchronous free will stall. This is pre-existing and not introduced by this PR.

4. hipMallocAsync / hipFreeAsync minimum ROCm version requirement

The old code had a comment noting "Malloc_Async and Free_Async are available after ROCm 5.4". This comment was removed in the diff (which is fine since it was in the wrong place), but there's no build-system or runtime check ensuring the user's HIP runtime supports these functions. If someone runs this on ROCm < 5.4, the PER_AMDGPU_FUNCTION dynamic symbol lookup will fail at runtime. Consider documenting the minimum ROCm version requirement, or adding a graceful fallback.
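A graceful fallback could look like the following sketch (the symbol table here is a mock; the real lookup would go through PER_AMDGPU_FUNCTION's dynamic symbol loading, and the helper name is hypothetical):

```python
# Mock symbol tables standing in for the HIP runtime's exported symbols.
ROCM_5_4_SYMBOLS = {"hipMallocAsync": object(), "hipFreeAsync": object()}
OLD_ROCM_SYMBOLS = {}  # pre-5.4 runtime: async allocator symbols absent


def resolve_alloc_path(symbols):
    """Prefer the stream-ordered allocator; fall back to the synchronous
    malloc/free path when the runtime predates ROCm 5.4."""
    if "hipMallocAsync" in symbols and "hipFreeAsync" in symbols:
        return "async"
    return "sync-fallback"


assert resolve_alloc_path(ROCM_5_4_SYMBOLS) == "async"
assert resolve_alloc_path(OLD_ROCM_SYMBOLS) == "sync-fallback"
```

The same check could instead fail fast with a clear "requires ROCm >= 5.4" error, which is simpler than maintaining a second allocation path.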


Test Feedback

5. Timing-based concurrency test may be flaky in CI

The test_concurrent_streams_with_events test asserts speedup > 1.5 on GPU. Timing-based tests are inherently fragile, especially in shared CI environments or on GPUs under contention. The warmup and JIT compilation are handled well, but 5 million spin iterations may not produce enough wall-clock time for a stable ratio on very fast GPUs (where launch overhead dominates), or could be too slow on older GPUs.

Consider:

  • Logging the actual timings (serial/stream) for diagnostic purposes even on pass
  • Using a more relaxed threshold (e.g., > 1.3) to reduce flakiness
  • Adding a pytest.mark.slow or similar marker if this isn't intended to run in every CI pass
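The first two suggestions can be sketched host-side (threads and sleeps stand in for stream-launched kernels; the 1.3 threshold and the timing printout are the suggested changes, not the repo's current test code):

```python
import threading
import time


def spin(duration=0.1):
    # Stand-in for a GPU kernel launched on one stream.
    time.sleep(duration)


def timed(fn):
    t0 = time.perf_counter()
    fn()
    return time.perf_counter() - t0


def run_serial():
    spin()
    spin()


def run_concurrent():
    workers = [threading.Thread(target=spin) for _ in range(2)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()


serial_time = timed(run_serial)
stream_time = timed(run_concurrent)
speedup = serial_time / stream_time
# Log the actual timings even on pass, so CI flakes are diagnosable:
print(f"serial={serial_time:.3f}s stream={stream_time:.3f}s speedup={speedup:.2f}x")
assert speedup > 1.3, f"Expected >1.3x speedup, got {speedup:.2f}x"
```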

6. test_concurrent_streams_with_events uses @test_utils.test() (all archs) including CPU

The test correctly handles CPU with a relaxed speedup > 0.75 check. However, the CPU path still runs two heavy spin loops (5M iterations each) serially, which adds significant wall time to the CPU test suite. If this is a concern, you could restrict the test to GPU-only and keep the CPU path in a lighter test.


Nits

  • The removed comment block in kernel_launcher.cpp (lines 37-44 in the old code) was verbose but contained useful context about why device_result_buffer must always be allocated. The replacement is more concise and loses that specific error-message context, but is adequate.

  • In program.cpp, the function set_current_cuda_stream is semantically misnamed for AMDGPU -- it sets the HIP stream, not a CUDA stream. This is pre-existing (the function name comes from the base branch), but might be worth a follow-up rename to something like set_current_gpu_stream.


Verdict

The PR is clean, well-structured, and correctly mirrors the CUDA stream implementation for AMDGPU. The core changes (async memory operations, stream plumbing through context and kernel launch, driver function bindings) are correct. The issues raised are mostly minor consistency/efficiency points and pre-existing design concerns. I'd recommend addressing items 1 (ordering) and 4 (ROCm version) before merging; the rest can be tracked as follow-ups.

@hughperkins
Collaborator Author

I'm reverting the relaxation of the timing threshold in the tests. I think 1.5x is already pretty relaxed.

@hughperkins hughperkins marked this pull request as ready for review March 12, 2026 01:24
@hughperkins hughperkins changed the title [Perf] Streams part 2: Add AMDGPU/HIP stream support [Perf] Streams 2: Add AMDGPU/HIP stream support Mar 12, 2026
@hughperkins hughperkins marked this pull request as draft March 12, 2026 04:59
Comment on lines +128 to +132
PER_AMDGPU_FUNCTION(stream_wait_event,
                    hipStreamWaitEvent,
                    void *,
                    void *,
                    uint32);
Contributor


Maybe it is time to increase the line width of C++ code. 80 chars is painful nowadays. I would recommend either 100 or 120, as a matter of taste.

Comment on lines -36 to -43
// Here we have to guarantee the result_result_buffer isn't nullptr
// It is interesting - The code following
// L60: DeviceAllocation devalloc =
// executor->allocate_memory_on_device( call another kernel and it will result
// in
// Memory access fault by GPU node-1 (Agent handle: 0xeda5ca0) on address
// (nil). Reason: Page not present or supervisor privilege.
// if you don't allocate it.
Contributor


Why did you remove this comment? Is it no longer applicable? Is it irrelevant?

…reams' into hp/streams-quadrantsic-2-amdgpu-cpu

Made-with: Cursor

# Conflicts:
#	quadrants/rhi/amdgpu/amdgpu_context.cpp
#	quadrants/rhi/amdgpu/amdgpu_context.h
#	quadrants/rhi/amdgpu/amdgpu_driver_functions.inc.h
#	quadrants/runtime/amdgpu/kernel_launcher.cpp
Made-with: Cursor
…reams' into hp/streams-quadrantsic-2-amdgpu-cpu

Made-with: Cursor

# Conflicts:
#	quadrants/rhi/amdgpu/amdgpu_context.h
#	quadrants/runtime/amdgpu/kernel_launcher.cpp
The pure-Python perf_dispatch timing test is unreliable on Mac Metal
and Vulkan (MoltenVK) where timing differences between implementations
are too small to consistently pick the fastest one.

Made-with: Cursor
…in test_perf_dispatch.py (take base branch's narrower exclude list)

Made-with: Cursor
@hughperkins
Collaborator Author

Migrated to use a single PR on streams 4.

@hughperkins hughperkins reopened this Apr 28, 2026
@github-actions

Coverage Report (228150ad2)

File Coverage Missing
🟢 tests/python/test_streams.py 100%

Diff coverage: 100% · Overall: 73% · 61 lines, 0 missing

Full annotated report

@github-actions

github-actions Bot commented May 1, 2026

Coverage Report (ce2fc6bfd)

File Coverage Missing
🟢 tests/python/test_streams.py 98% 270

Diff coverage: 98% · Overall: 74% · 61 lines, 1 missing

Full annotated report

…reams' into hp/streams-quadrantsic-2-amdgpu-cpu
@github-actions

github-actions Bot commented May 1, 2026

Coverage Report (117a71ffb)

File Coverage Missing
🟢 tests/python/test_streams.py 100%

Diff coverage: 100% · Overall: 74% · 61 lines, 0 missing

Full annotated report

hughperkins and others added 3 commits May 2, 2026 01:46
Streams + autodiff now throws an exception (base branch commit 360adc8),
so our stream-aware adstack fixes in shared code (llvm_runtime_executor.cpp)
and the AMDGPU resolve_num_threads async DtoH are unnecessary. Take the base
branch version for those shared-file sections. Re-apply AMDGPU-specific
non-autodiff stream work: active_stream async DtoH/HtoD in
launch_llvm_kernel, context_synchronize/device_synchronize in synchronize().

Co-authored-by: Cursor <cursoragent@cursor.com>
…reams' into hp/streams-quadrantsic-2-amdgpu-cpu
The comment explains a non-obvious race condition: context_pointer must be
freed directly (now via mem_free_async on active_stream) rather than through
AMDGPUContext's deferred free list, because that list is drained by
LlvmRuntimeExecutor::synchronize which can be called mid-launch.

Co-authored-by: Cursor <cursoragent@cursor.com>
@github-actions

github-actions Bot commented May 2, 2026

Coverage Report (6e49c52d1)

File Coverage Missing
🟢 tests/python/test_streams.py 100%

Diff coverage: 100% · Overall: 74% · 61 lines, 0 missing

Full annotated report

hughperkins and others added 5 commits May 2, 2026 04:57
Base branch extracted stream/event methods from program.cpp into
program_stream.cpp (CUDA only). Add the AMDGPU branches with
make_current() calls to all nine methods in the new file.

Co-authored-by: Cursor <cursoragent@cursor.com>
…reams' into hp/streams-quadrantsic-2-amdgpu-cpu
Co-authored-by: Cursor <cursoragent@cursor.com>
…reams' into hp/streams-quadrantsic-2-amdgpu-cpu
…pile_config()

The base branch refactored stream/event methods from Program:: to
StreamManager::, which stores the arch in arch_. Our AMDGPU branches
still referenced compile_config().arch which is a Program method.

Co-authored-by: Cursor <cursoragent@cursor.com>
@github-actions

github-actions Bot commented May 2, 2026

Coverage Report (d3317f5cf)

File Coverage Missing
🟢 tests/python/test_streams.py 98% 270

Diff coverage: 98% · Overall: 74% · 61 lines, 1 missing

Full annotated report

hughperkins and others added 7 commits May 2, 2026 12:19
…reams' into hp/streams-quadrantsic-2-amdgpu-cpu
…sis-Embodied-AI/gstaichi into hp/streams-quadrantsic-2-amdgpu-cpu
…reams' into hp/streams-quadrantsic-2-amdgpu-cpu
…reams' into hp/streams-quadrantsic-2-amdgpu-cpu
Co-authored-by: Cursor <cursoragent@cursor.com>
…reams' into hp/streams-quadrantsic-2-amdgpu-cpu
@github-actions

github-actions Bot commented May 2, 2026

Coverage Report (93cd1660e)

File Coverage Missing
🟢 tests/python/test_streams.py 100%

Diff coverage: 100% · Overall: 74% · 61 lines, 0 missing

Full annotated report

@hughperkins
Collaborator Author

@codex review


@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 93cd1660eb

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread: quadrants/program/program_stream.cpp (Outdated)
}
#endif
#ifdef QD_WITH_AMDGPU
if (arch_ == Arch::amdgpu && stream_handle != 0) {


P2: Synchronize default AMDGPU stream when handle is zero

Remove the stream_handle != 0 guard here so the default stream can be synchronized. In this API, handle 0 represents the default stream (as used by Event.wait()/record() defaults and by backend stream semantics), but this branch currently turns stream_synchronize(0) into a no-op on AMDGPU, so callers that explicitly synchronize the default stream can proceed while queued work is still running.

Useful? React with 👍 / 👎.
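The handle-zero convention Codex describes can be illustrated with a mock driver (names hypothetical; in HIP, a null stream pointer denotes the default stream, and hipStreamSynchronize(nullptr) synchronizes it):

```python
class MockDriver:
    """Records which stream pointers get synchronized."""

    def __init__(self):
        self.synced = []

    def stream_synchronize(self, stream_ptr):
        # In HIP, a null stream pointer means the default stream.
        self.synced.append(stream_ptr)


def synchronize_stream(driver, stream_handle):
    # Handle 0 is the API's default stream: map it to the backend's
    # null stream instead of guarding it out as a no-op.
    stream_ptr = None if stream_handle == 0 else stream_handle
    driver.stream_synchronize(stream_ptr)


driver = MockDriver()
synchronize_stream(driver, 0)        # default stream: must still sync
synchronize_stream(driver, 0xBEEF)   # explicit stream handle
assert driver.synced == [None, 0xBEEF]
```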

hughperkins and others added 2 commits May 2, 2026 12:18
The stream_handle != 0 guard made synchronize_stream a no-op for the
default stream on AMDGPU, unlike the CUDA path. HIP supports
hipStreamSynchronize(nullptr), so remove the guard to match CUDA
semantics.

Co-authored-by: Cursor <cursoragent@cursor.com>
…reams' into hp/streams-quadrantsic-2-amdgpu-cpu
@hughperkins
Collaborator Author

@codex review


@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 3f5a868958



speedup = serial_time / stream_time
if qd.lang.impl.current_cfg().arch in (qd.cuda, qd.amdgpu):
    assert speedup > 1.5, f"Expected >1.5x speedup, got {speedup:.2f}x"


P2: Relax fixed 1.5x stream-speedup assertion

This assertion makes the test depend on hardware/driver scheduling behavior rather than correctness: CUDA/HIP stream concurrency is workload- and device-dependent, so valid implementations can produce <1.5x speedup (for example on small or heavily occupied GPUs, or when concurrent kernels are not co-scheduled). That can cause nondeterministic CI failures even when stream/event semantics are correct; the test should assert functional ordering/correctness and use a looser or optional perf check.


@github-actions

github-actions Bot commented May 2, 2026

Coverage Report (3f5a86895)

File Coverage Missing
🟢 tests/python/test_streams.py 100%

Diff coverage: 100% · Overall: 74% · 61 lines, 0 missing

Full annotated report

…reams' into hp/streams-quadrantsic-2-amdgpu-cpu
@github-actions

github-actions Bot commented May 3, 2026

Coverage Report (392b19a3c)

File Coverage Missing
🟢 tests/python/test_streams.py 100%

Diff coverage: 100% · Overall: 74% · 61 lines, 0 missing

Full annotated report

