GPU implementation of sparse equality constraint Jacobian#49

Closed
pelesh wants to merge 3 commits into olcf-hackathon-2026-dev from kasia/equality-constraint-jac
Conversation

@pelesh
Collaborator

@pelesh pelesh commented Apr 23, 2026

Merge request type

  • New feature
  • Resolves bug
  • Documentation
  • Other

Relates to

  • OPFLOW
  • SOPFLOW
  • SCOPFLOW
  • TCOPFLOW
  • CMake build system
  • Spack configuration
  • Manual
  • Web docs
  • Other

This MR updates

  • Header files
  • Source code
  • CMake build system
  • Spack configuration
  • Web docs
  • Manual
  • Other

Summary

Merges the branch by @kswirydo, rebased onto the "OLCF" branch.

Port equality constraint Jacobian from PETSc to RAJA GPU kernels

Replace the PETSc-based equality constraint Jacobian computation in the
PBPOLRAJAHIOPSPARSE model with direct GPU kernels using RAJA, eliminating
the D2H-compute-H2D round trip. The sparsity pattern is now computed on
the host during setup and the values are computed entirely on device.
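
The host/device split described above can be sketched as follows. This is an illustrative outline only, not code from the PR: the names (`SparsityPattern`, `build_pattern`, `compute_values`) and the placeholder derivative are hypothetical, and a plain loop stands in for the `RAJA::forall` device kernel.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical, simplified sketch of the setup/evaluation split.
struct SparsityPattern {
    std::vector<int> rows, cols;        // COO pattern, fixed at setup time
    std::size_t nnz() const { return rows.size(); }
};

// Setup phase (host): walk the network once and record where each
// Jacobian element lives in the flat value array.
SparsityPattern build_pattern(int nbus) {
    SparsityPattern sp;
    for (int b = 0; b < nbus; ++b) {    // e.g. one diagonal entry per bus
        sp.rows.push_back(b);
        sp.cols.push_back(b);
    }
    return sp;
}

// Evaluation phase (on device via RAJA::forall in the real code; a plain
// loop stands in here): values are written straight into the flat array
// using the precomputed indices -- no D2H-compute-H2D round trip.
void compute_values(const SparsityPattern& sp,
                    const std::vector<double>& x,
                    std::vector<double>* vals) {
    vals->assign(sp.nnz(), 0.0);
    for (std::size_t k = 0; k < sp.nnz(); ++k)
        (*vals)[k] = 2.0 * x[sp.cols[k]];  // placeholder partial derivative
}
```

Because the pattern never changes after setup, only the value array needs to be touched per evaluation, which is what makes the pure-device value kernel possible.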

Key changes:
- Add ComputeEqJacValuesGPU_PBPOLRAJAHIOPSPARSE in new gpu.cpp/hpp files
- Add device arrays for flat-array indices (bus eqjacsp_selfidx, line
  eqjacsp_idx/eqjacsp_diag_idx/isdcline, gen xpdevidx/xpsetidx)
- Fix nnz counting bugs (missing gen/load entries, off-by-one in line
  loop) and populate flat-array indices during model setup
- Replace PETSc MatGetRow extraction in sparsity and values phases
- Handle parallel lines by sharing off-diagonal positions with atomicAdd
- Use pre-computed nnz in get_sparse_blocks_info instead of PETSc query
- Add correctness test (test_eqjac_compare) and performance benchmark
  (test_eqjac_perf)

Made-with: Cursor

This PR breaks HiOp MDS tests with RAJA! Need to investigate before merging to develop.
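
The parallel-lines point in the change list can be illustrated with a small sketch: two transmission lines between the same bus pair map to the same off-diagonal Jacobian slot, so concurrent kernel iterations must accumulate into it atomically. The real code uses `RAJA::atomicAdd` on device; the portable CAS-loop equivalent below is only an analogue, and all names in it are illustrative, not taken from the PR.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Portable atomic add for double (std::atomic<double>::fetch_add is C++20;
// this compare-exchange loop works from C++11 on).
inline void atomic_add(std::atomic<double>& slot, double v) {
    double old = slot.load(std::memory_order_relaxed);
    while (!slot.compare_exchange_weak(old, old + v)) {
        // 'old' is refreshed on failure; retry until the add lands.
    }
}

// Each "line" contributes its term to the shared off-diagonal slot from
// its own thread, mimicking concurrent kernel iterations.
double accumulate_parallel_lines(const std::vector<double>& contributions) {
    std::atomic<double> slot{0.0};
    std::vector<std::thread> workers;
    for (double c : contributions)
        workers.emplace_back([&slot, c] { atomic_add(slot, c); });
    for (auto& t : workers) t.join();
    return slot.load();
}
```

Without the atomic, two lines writing the same slot would race and one contribution could be lost.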

@pelesh pelesh self-assigned this Apr 23, 2026
@pelesh pelesh added the "enhancement" (New feature or request) and "opflow" (Related to ACOPF computations) labels Apr 23, 2026
@pelesh
Collaborator Author

pelesh commented Apr 23, 2026

Closing in favor of #48.

@pelesh pelesh closed this Apr 23, 2026