
perf(amdgpu): deferred check_errno #48

Merged
yaoliu13 merged 2 commits into amd-integration from perf/deepsek/expose_perf_vars
Apr 29, 2026

Conversation

@deepsek
Collaborator

@deepsek deepsek commented Apr 29, 2026

Pipelines the errno check so it no longer blocks on a device sync.


@jamesETsmith jamesETsmith left a comment


🚀 lgtm thanks @deepsek

@rtmadduri rtmadduri self-requested a review April 29, 2026 00:51
@rtmadduri
Collaborator

I tested this locally and I can see throughput at 1121493

@rtmadduri
Collaborator

/run-ci

3 similar comments
@yaoliu13
Collaborator

/run-ci

@yaoliu13
Collaborator

/run-ci

@deepsek
Collaborator Author

deepsek commented Apr 29, 2026

/run-ci

@deepsek
Collaborator Author

deepsek commented Apr 29, 2026

/run-ci


Copilot AI left a comment


Pull request overview

This PR introduces a deferred (pipelined) check_errno path for the AMDGPU + zero-copy backend to reduce sync-induced blocking while still surfacing solver errors.

Changes:

  • Add a new kernel_bit_reduction_into kernel to write reduced errno into a provided output buffer.
  • Implement a double-buffered, event-gated deferred errno snapshot/readback path in RigidSolver.
  • Add flush_errno() calls in Simulator.destroy() and before backward (_step_grad) to drain pending deferred snapshots; invalidate pending snapshots on state reset paths.
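The deferred path described in the bullets above can be illustrated with a small host-only sketch. This is not the code from the PR: the class `DeferredErrnoChecker`, the callback `read_device_errno`, and the `invalidate` method are hypothetical names, and a plain Python value stands in for the GPU buffer and the event that gates its readback. The sketch only shows the control flow: each call records a snapshot for the current step and inspects the snapshot from the previous step, so an error surfaces one step late unless `flush_errno()` drains it.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class DeferredErrnoChecker:
    """Hypothetical host-only model of a double-buffered deferred errno check."""

    # Stand-in for launching the reduction kernel; returns the reduced errno.
    read_device_errno: Callable[[], int]
    # Two snapshot slots; None means the slot holds no pending snapshot.
    slots: List[Optional[int]] = field(default_factory=lambda: [None, None])
    step: int = 0

    def check_errno(self) -> None:
        """Snapshot this step's errno, then inspect the previous step's slot."""
        cur = self.step % 2
        prev = 1 - cur
        # Record the snapshot for the current step (assumed asynchronous on a GPU).
        self.slots[cur] = self.read_device_errno()
        self.step += 1
        # Read back only the older snapshot, which is assumed to have completed.
        self._consume(prev)

    def flush_errno(self) -> None:
        """Drain every pending snapshot, e.g. before teardown or a backward pass."""
        for i in range(2):
            self._consume(i)

    def invalidate(self) -> None:
        """Drop pending snapshots, e.g. after a state reset."""
        self.slots = [None, None]

    def _consume(self, i: int) -> None:
        # Clear the slot before raising so the same error is not reported twice.
        value, self.slots[i] = self.slots[i], None
        if value:
            raise RuntimeError(f"solver errno 0x{value:x}")
```

In this model an error recorded on the last step is never seen by `check_errno()` alone, which is presumably why the PR adds explicit `flush_errno()` calls at destroy time and before the backward pass.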

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| genesis/engine/solvers/rigid/rigid_solver.py | Implements deferred errno reduction/readback, adds flush/invalidation helpers, and integrates into check_errno() + reset-like paths. |
| genesis/engine/solvers/rigid/abd/misc.py | Adds kernel_bit_reduction_into to support writing reductions into a provided buffer/slot. |
| genesis/engine/simulator.py | Flushes deferred errno snapshots during destroy and before backward to avoid missing final/forward-pass errors. |
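As a rough illustration of what "writing reductions into a provided buffer/slot" could mean, the sketch below reduces per-environment error flags into one slot of a caller-owned list. It assumes the "bit reduction" is a bitwise OR, which the review summary does not state, and the function name `bit_reduction_into` and its signature are hypothetical rather than the signature of `kernel_bit_reduction_into`.

```python
from functools import reduce
from operator import or_
from typing import List, Sequence


def bit_reduction_into(per_env_errno: Sequence[int], out: List[int], slot: int) -> None:
    """OR all per-environment error flags together and store the result in out[slot].

    Assumption: the reduction is a bitwise OR; the output buffer is owned by the caller.
    """
    out[slot] = reduce(or_, per_env_errno, 0)


# Two snapshot slots, as in a double-buffered scheme; only slot 1 is written here.
snapshots = [0, 0]
bit_reduction_into([0b001, 0b000, 0b100], snapshots, slot=1)
```

Writing into a caller-provided slot, rather than returning a value, lets the caller choose which of the two snapshot buffers a given step fills.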


Comment thread genesis/engine/solvers/rigid/rigid_solver.py
Comment thread genesis/engine/solvers/rigid/rigid_solver.py
Comment thread genesis/engine/simulator.py
Collaborator

@yaoliu13 yaoliu13 left a comment


Waiting for a test run

Collaborator

@yaoliu13 yaoliu13 left a comment


LGTM

@yaoliu13 yaoliu13 merged commit 1df6d91 into amd-integration Apr 29, 2026
4 checks passed
@yaoliu13 yaoliu13 deleted the perf/deepsek/expose_perf_vars branch April 29, 2026 17:26

5 participants