Skip to content

Simplex work estimates#856

Merged
rapids-bot[bot] merged 5 commits intoNVIDIA:mainfrom
chris-maes:simplex_work_estimates
Feb 18, 2026
Merged

Simplex work estimates#856
rapids-bot[bot] merged 5 commits intoNVIDIA:mainfrom
chris-maes:simplex_work_estimates

Conversation

@chris-maes
Copy link
Copy Markdown
Contributor

@chris-maes chris-maes commented Feb 13, 2026

This PR converts dual simplex to use manual work estimates to simplify the code and ease maintenance.

Work estimates are also calculated for more functions.

@chris-maes chris-maes requested a review from a team as a code owner February 13, 2026 04:34
@copy-pr-bot
Copy link
Copy Markdown

copy-pr-bot bot commented Feb 13, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@chris-maes chris-maes added non-breaking Introduces a non-breaking change improvement Improves an existing functionality dual_simplex labels Feb 13, 2026
@chris-maes chris-maes added this to the 26.04 milestone Feb 13, 2026
@chris-maes
Copy link
Copy Markdown
Contributor Author

/ok to test e4985d0

@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Feb 13, 2026

📝 Walkthrough

Walkthrough

Adds a runtime work-accounting parameter (f_t& work_estimate) propagated through many dual-simplex solver routines and replaces custom instrumented vector types (ins_vector) with std::vector across matrix/vector APIs; also fixes a few direct debug/access sites to use direct member access (e.g., A.x, A.i).

Changes

Cohort / File(s) Summary
Dense/sparse container migration
cpp/src/dual_simplex/sparse_matrix.hpp, cpp/src/dual_simplex/sparse_matrix.cpp, cpp/src/dual_simplex/sparse_vector.hpp, cpp/src/dual_simplex/sparse_vector.cpp
Replaced ins_vector with std::vector for CSC/CSR and sparse-vector members; removed instrumented-vector overloads and memory-instrumentation includes; simplified cumulative_sum and scatter_dense to concrete std::vector signatures.
Work-estimate instrumentation (core LU & factorization)
cpp/src/dual_simplex/basis_solves.hpp, cpp/src/dual_simplex/basis_solves.cpp, cpp/src/dual_simplex/right_looking_lu.hpp, cpp/src/dual_simplex/right_looking_lu.cpp, cpp/src/dual_simplex/right_looking_lu.*
Added f_t& work_estimate parameter to factorize_basis, basis_repair, form_b, right_looking_lu and many internal LU helpers; propagated increments through pivoting, Schur/assembly, and updated explicit instantiations.
Basis updates & bookkeeping
cpp/src/dual_simplex/basis_updates.hpp, cpp/src/dual_simplex/basis_updates.cpp
Added mutable work_estimate_ state, accessor work_estimate() and clear_work_estimate(); updated reset/compute_transposes/clear to accumulate work accounting.
Triangular solves & reachability updates
cpp/src/dual_simplex/triangle_solve.hpp, cpp/src/dual_simplex/triangle_solve.cpp
Changed lower/upper triangular solve APIs to accept std::vector<f_t>& x and f_t& work_estimate; added work_estimate to reach, depth_first_search, and sparse_triangle_solve and updated instantiations.
Phase-2, primal, crossover & plumbing
cpp/src/dual_simplex/phase2.cpp, cpp/src/dual_simplex/primal.cpp, cpp/src/dual_simplex/initial_basis.cpp, cpp/src/dual_simplex/crossover.cpp
Massively threaded f_t& work_estimate through phase-2, primal-phase2, initial-basis, and crossover flows; converted many ins_vector parameters to std::vector and updated numerous helper function signatures and call sites.
Singletons & permutation helpers
cpp/src/dual_simplex/singletons.hpp, cpp/src/dual_simplex/singletons.cpp
Unified permutation APIs to std::vector<i_t>, added f_t template param and f_t& work_estimate to singleton/order helpers, updated explicit instantiations.
Vector utilities
cpp/src/dual_simplex/vector_math.hpp, cpp/src/dual_simplex/vector_math.cpp
Introduced concrete std::vector-based permute_vector, inverse_permute_vector, and inverse_permutation templates; moved implementations to .cpp with explicit double/integer instantiations.
Ratio test & bound-flipping instrumentation
cpp/src/dual_simplex/bound_flipping_ratio_test.hpp, cpp/src/dual_simplex/bound_flipping_ratio_test.cpp
Added private work_estimate_ and public accessor; instrumented compute_breakpoints, single_pass, heap_passes, compute_step_length to update work accounting and iterate only nonzero delta_z entries.
Right-looking LU signature cleanup
cpp/src/dual_simplex/right_looking_lu.hpp, cpp/src/dual_simplex/right_looking_lu.cpp
Removed VectorI template parameter, switched to std::vector<i_t> for column_list/q/pinv, added f_t& work_estimate output and adjusted row-permutation-only signature.
Sparse matrix/vector helpers & solves
cpp/src/dual_simplex/sparse_matrix.cpp, cpp/src/dual_simplex/sparse_matrix.hpp, cpp/src/dual_simplex/sparse_vector.*
Concrete std::vector overloads for scatter_dense and related helpers; removed generic VectorF/VectorI templates and ins_vector-based overloads; adjusted explicit instantiations.
Minor direct-access and CUDA view fixes
cpp/src/dual_simplex/bounds_strengthening.cpp, cpp/src/dual_simplex/folding.cpp, cpp/src/barrier/barrier.cu, cpp/src/barrier/cusparse_view.cu
Replaced .underlying() access with direct members (A.i, A.x, A.col_start, Arow.j, Arow.x); small debug print change and CSR copy now uses direct member vector copies.
Triangle/solve instrumentation propagation
cpp/src/dual_simplex/triangle_solve.*, cpp/src/dual_simplex/vector_math.*
Propagated work_estimate through DFS/reach/triangle-solve code and added explicit instantiations for work_estimate-aware signatures.
External/problem translation updates
cpp/src/mip_heuristics/problem/problem.cu, cpp/src/pdlp/translate.hpp
Switched host CSR assignments from ins_vector initializers to direct std::vector copies when constructing host-side CSR structures.

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

🚥 Pre-merge checks | ✅ 3 | ❌ 1
❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.69% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'Simplex work estimates' directly and clearly describes the main purpose of this PR: adding work estimation functionality to the simplex solver implementation.
Merge Conflict Detection ✅ Passed ✅ No merge conflicts detected when merging into main

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 7

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
cpp/src/dual_simplex/basis_updates.hpp (1)

171-194: ⚠️ Potential issue | 🔴 Critical

work_estimate_ is read before initialization — undefined behavior.

Neither constructor initializes work_estimate_ in its member initializer list. Both constructors call clear(), which performs work_estimate_ += 2 * refactor_frequency_ and work_estimate_ += xi_workspace_.size() + x_workspace_.size() (lines 408, 412), reading the uninitialized field via +=. The second constructor additionally calls compute_transposes() (line 375), which also uses +=.

Add work_estimate_(0.0) to both constructors' initializer lists.

Proposed fix

For the first constructor (line 171):

   basis_update_mpf_t(i_t n, const i_t refactor_frequency)
     : L0_(n, n, 1),
       ...
-      hypersparse_threshold_(0.05)
+      hypersparse_threshold_(0.05),
+      work_estimate_(0.0)
   {

For the second constructor (line 196):

   basis_update_mpf_t(const csc_matrix_t<i_t, f_t>& Linit,
                      ...
-      hypersparse_threshold_(0.05)
+      hypersparse_threshold_(0.05),
+      work_estimate_(0.0)
   {
cpp/src/dual_simplex/right_looking_lu.cpp (1)

18-18: ⚠️ Potential issue | 🟡 Minor

Remove unused using cuopt::ins_vector; — leftover from the ins_vectorstd::vector migration.

The ins_vector alias is imported on line 18 but never referenced anywhere in the file. The using declaration and the associated include on line 10 can both be removed.

Proposed fix
-#include <utilities/memory_instrumentation.hpp>
-
 `#include` <raft/common/nvtx.hpp>

 `#include` <cassert>
 `#include` <cmath>
 `#include` <cstdio>

-using cuopt::ins_vector;
🤖 Fix all issues with AI agents
In `@cpp/src/dual_simplex/basis_updates.cpp`:
- Around line 1264-1268: The else branch calls WT.transpose(V) but uses V_nz
still zero when updating work_estimate_, so update V_nz using the actual
non-zero count before accounting for the transpose cost; e.g., set V_nz =
V.col_start[m] (or otherwise compute V's nnz for column m) prior to doing
work_estimate_ += 3 * V_nz so the cost of WT.transpose(V) is correctly included
(refer to WT.transpose(V), V_nz, work_estimate_, and V.col_start).
- Around line 2215-2219: When etilde.i.size() can be zero, computing
work_estimate_ += etilde.i.size() * std::log2(etilde.i.size()) produces NaN; fix
by checking the size first: call etilde.sort() as before, then store auto n =
etilde.i.size(); if (n > 0) add work_estimate_ += n * std::log2(n); then always
call S_.append_column(etilde) and update work_estimate_ += 4 * n; this ensures
the log2 term is only computed for n>0 while preserving the rest of the logic
around etilde, work_estimate_, etilde.sort(), and S_.append_column(etilde).

In `@cpp/src/dual_simplex/bound_flipping_ratio_test.cpp`:
- Around line 234-239: In the loop that pops from bare_idx (the while loop
computing work_estimate_), guard the std::log2 call against a zero size so
std::log2(0) can't produce -inf; update the calculation for work_estimate_
(which currently does work_estimate_ += 2 * std::log2(bare_idx.size())) to use a
minimum of 1 for the argument (e.g., std::log2(std::max<size_t>(1,
bare_idx.size()))) or perform the addition only when bare_idx.size() > 0 so
work_estimate_ cannot become -infinity after pop_back(); adjust the code around
the pop and work_estimate_ update in the same block to use this guard.

In `@cpp/src/dual_simplex/bound_flipping_ratio_test.hpp`:
- Around line 56-58: The member work_estimate_ is not initialized and can be
read by the accessor work_estimate(), so initialize work_estimate_ to a sensible
default to avoid undefined reads; locate the class that defines work_estimate_,
add an initialization either at the member declaration (e.g., f_t work_estimate_
= 0;) or in the class constructor initializer list, and ensure the same
initialization is applied for the other occurrences referenced around lines
100-103 so all code paths see a defined value before any assignment.

In `@cpp/src/dual_simplex/phase2.cpp`:
- Around line 2508-2512: The phase2_work_estimate update is using a product
instead of the sum, over-counting allocated elements; change the increment from
using n * 2 * m to n + 2 * m so it reflects the sizes of objective (n), c_basic
(m) and xB_workspace (m). Locate the site updating phase2_work_estimate (the
line near phase2_work_estimate += n * 2 * m) and replace the expression with n +
2 * m so the estimate correctly sums the allocations for objective, c_basic, and
xB_workspace.
- Around line 3484-3485: The work estimation is using delta_z_indices.size()
after phase2::clear_delta_z has already cleared that vector, so
phase2_work_estimate += 3 * delta_z_indices.size() is a no-op; capture the size
(or increment phase2_work_estimate) before calling phase2::clear_delta_z (or
compute auto n = delta_z_indices.size(); phase2_work_estimate += 3 * n; then
call phase2::clear_delta_z(entering_index, leaving_index, delta_z_mark,
delta_z_indices, delta_z)) so the work is accounted correctly.
🧹 Nitpick comments (5)
cpp/src/dual_simplex/vector_math.cpp (1)

127-168: New permutation utilities are correct and well-documented.

The implementations for permute_vector, inverse_permute_vector, and inverse_permutation are standard and correct. The work-cost comments are a nice addition.

Minor nit: auto n = p.size() yields size_t, which may trigger signed/unsigned comparison warnings in the loop for (i_t k = 0; k < n; ++k). Consider const i_t n = static_cast<i_t>(p.size()) for consistency with the rest of the codebase.

cpp/src/dual_simplex/crossover.cpp (1)

500-536: Factorize→repair→refactorize pattern is duplicated 4 times.

This pattern (lines 500–536, 857–896, 1231–1257, 1403–1430) follows the same structure: factorize_basis → check rank → basis_repairfactorize_basis → check rank again. Extracting this into a shared helper (e.g., factorize_with_repair) would reduce ~120 lines of near-identical code and make the work_estimate plumbing less repetitive. Each call site differs only in return type mapping and a couple of post-repair steps.

cpp/src/dual_simplex/right_looking_lu.cpp (1)

36-43: Internal helpers still use typename VectorI template — could be simplified.

initialize_degree_data and load_elements (lines 36, 82) still take a generic VectorI for column_list, but all callers in this file pass std::vector<i_t>. Since the PR is removing ins_vector throughout, these could be simplified to match. Low priority since they're in an anonymous namespace and work correctly.

cpp/src/dual_simplex/singletons.cpp (1)

217-226: Redundant static_castcol_perm and row_perm are already std::vector<i_t>&.

These casts were meaningful when the parameters were ins_vector, but after the migration to std::vector they are identity casts that add confusion. The same pattern repeats at lines 253–254.

Proposed fix (lines 217–226)
-    auto& col_perm_vec = static_cast<std::vector<i_t>&>(col_perm);
-    auto& row_perm_vec = static_cast<std::vector<i_t>&>(row_perm);
     row_col_graph_t<i_t> graph{Cdeg.begin(),
-                               col_perm_vec.begin(),
+                               col_perm.begin(),
                                A.col_start.cbegin(),
                                A.i.cbegin(),
                                Rdeg.begin(),
-                               row_perm_vec.begin(),
+                               row_perm.begin(),
                                Rp.cbegin(),
                                Rj.cbegin()};
Proposed fix (lines 253–262)
-    auto& row_perm_vec2 = static_cast<std::vector<i_t>&>(row_perm);
-    auto& col_perm_vec2 = static_cast<std::vector<i_t>&>(col_perm);
     row_col_graph_t<i_t> graph{Rdeg.begin(),
-                               row_perm_vec2.begin(),
+                               row_perm.begin(),
                                Rp.cbegin(),
                                Rj.cbegin(),
                                Cdeg.begin(),
-                               col_perm_vec2.begin(),
+                               col_perm.begin(),
                                A.col_start.cbegin(),
                                A.i.cbegin()};
cpp/src/dual_simplex/phase2.cpp (1)

3490-3498: Redundant null check on work_unit_context.

Line 3490 already guards the block with && work_unit_context, so the inner check on line 3495 is redundant. Not harmful, but unnecessary.

Proposed simplification
     if ((iter % FEATURE_LOG_INTERVAL) == 0 && work_unit_context) {
       [[maybe_unused]] i_t iters_elapsed = iter - last_feature_log_iter;

       phase2_work_estimate += ft.work_estimate();
       ft.clear_work_estimate();
-      if (work_unit_context) {
-        work_unit_context->record_work_sync_on_horizon(phase2_work_estimate / 1e8);
-      }
+      work_unit_context->record_work_sync_on_horizon(phase2_work_estimate / 1e8);
       phase2_work_estimate = 0.0;

@chris-maes
Copy link
Copy Markdown
Contributor Author

/ok to test 8766092

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
cpp/src/dual_simplex/right_looking_lu.cpp (1)

314-523: ⚠️ Potential issue | 🔴 Critical

Fix iterator invalidation after pop_back() in col_count/row_count removal loops.

Computing it - begin() after pop_back() when the iterator pointed to the erased element is undefined behavior. If the found element is the last in the vector, the iterator becomes invalid after removal. Capture the index before pop_back() to fix this pattern in four locations: update_Cdegree_and_col_count (line 337), update_Rdegree_and_row_count (line 373), and two removals inside schur_complement (lines 489 and 504).

Suggested fix
-        std::swap(*it, col_count[cdeg].back());
-        col_count[cdeg].pop_back();
-        work_estimate += (it - col_count[cdeg].begin());
+        const auto idx = static_cast<i_t>(it - col_count[cdeg].begin());
+        std::swap(*it, col_count[cdeg].back());
+        col_count[cdeg].pop_back();
+        work_estimate += idx;
cpp/src/dual_simplex/phase2.cpp (1)

182-183: ⚠️ Potential issue | 🟡 Minor

Format string bug: %d used for f_t (double) argument.

lp.upper[leaving_index] is f_t (a double), but the format specifier is %d (int). This is undefined behavior and will print garbage.

Proposed fix
-  printf(
-    "lower %e x %e upper %d\n", lp.lower[leaving_index], x[leaving_index], lp.upper[leaving_index]);
+  printf(
+    "lower %e x %e upper %e\n", lp.lower[leaving_index], x[leaving_index], lp.upper[leaving_index]);
🤖 Fix all issues with AI agents
In `@cpp/src/dual_simplex/phase2.cpp`:
- Around line 1408-1409: The work_estimate is over-counting because after
v_sparse.scatter(v) only v_sparse.i.size() entries are written, but the code
multiplies by v.size() (the full dense length); update the charge to use the
sparse count instead. Replace the current increment using 2 * v.size() with 2 *
v_sparse.i.size() (or equivalent access to the sparse index count) so that
work_estimate reflects the actual number of scattered entries; ensure you use
the same v_sparse symbol used in the scatter call.
- Around line 2519-2525: The work estimate incorrectly charges only m for two
size-n copies: update the increment of phase2_work_estimate after the
copy-constructors of vstatus_old and z_old to account for both size-n copies
(e.g. replace phase2_work_estimate += m with phase2_work_estimate += 2 * n, or
+= 4 * n if you want to include allocation cost), ensuring the change is made
where vstatus_old and z_old are created so the estimate correctly reflects the
cost of copying those n-sized vectors.
🧹 Nitpick comments (1)
cpp/src/dual_simplex/phase2.cpp (1)

1185-1188: atilde_start_size should be an integer type, not f_t.

atilde_index.size() returns size_t, but it is stored in f_t (double). While practically safe for reasonable sizes, using i_t or size_t is more appropriate and avoids potential precision loss for very large values. Same issue on line 1198.

Proposed fix
-      const f_t atilde_start_size = atilde_index.size();
+      const i_t atilde_start_size = atilde_index.size();

@chris-maes
Copy link
Copy Markdown
Contributor Author

/ok to test 9effcff

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
cpp/src/dual_simplex/phase2.cpp (1)

182-183: ⚠️ Potential issue | 🟡 Minor

Format specifier bug: %d used for a double argument.

lp.upper[leaving_index] is of type f_t (double) but is printed with %d, causing undefined behavior and garbage output.

Proposed fix
-  printf(
-    "lower %e x %e upper %d\n", lp.lower[leaving_index], x[leaving_index], lp.upper[leaving_index]);
+  printf(
+    "lower %e x %e upper %e\n", lp.lower[leaving_index], x[leaving_index], lp.upper[leaving_index]);
🧹 Nitpick comments (1)
cpp/src/dual_simplex/phase2.cpp (1)

3489-3495: Accumulated work is lost on non-periodic exit paths.

When the main loop terminates (OPTIMAL at line 2852, or early returns like NUMERICAL/TIME_LIMIT/DUAL_UNBOUNDED), the phase2_work_estimate accumulated since the last FEATURE_LOG_INTERVAL checkpoint is never recorded to work_unit_context. If precision of total work tracking matters, a final flush before each exit point would be needed.

Copy link
Copy Markdown
Contributor

@aliceb-nv aliceb-nv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a ton for this work Chris, that's a lot of changes :)
Just a few minor observations; we can merge as I observed no significant changes on MIPLIB2017 runs (same number of feasible, similar gap when running in determinism mode)

I wonder if we could unify some common operations through utility functions, say, a tracked_sort (that computes n log n) or tracked_copy when doing std::vector assignments. As it stands I'm afraid it will be all too easy to forget some work estimate updates :)
I'm also thinking that the work estimate updates tend to bloat the code a little, which isn't bad in itself (since we are after all performing more things), but it is interwoven with the rest of the business logic. Maybe there could be a way to make the work estimate logic clearer/more decoupled. But let's discuss that another time! I really appreciate all the work you've done on this.

Comment on lines -2728 to -2754
cuopt::scope_guard work_unit_guard([&]() {
i_t remaining_iters = iter - last_feature_log_iter;
if (remaining_iters <= 0) return;

auto [total_loads, total_stores] = aggregator.collect_and_flush();
// features.byte_loads = total_loads;
// features.byte_stores = total_stores;

// f_t now = toc(start_time);
// features.interval_runtime = now - interval_start_time;
// interval_start_time = now;

// features.iteration = iter;
// features.num_refactors = num_refactors;
// features.num_basis_updates = ft.num_updates();
// features.sparse_delta_z_count = sparse_delta_z;
// features.dense_delta_z_count = dense_delta_z;
// features.total_bound_flips = total_bound_flips;
// features.num_infeasibilities = infeasibility_indices.size();
// features.delta_y_nz_percentage = delta_y_nz_percentage;
// features.log_features(settings);

if (work_unit_context) {
// TEMP;
work_unit_context->record_work_sync_on_horizon((total_loads + total_stores) / 1e8);
}
});
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was removing the scope guard intended? We lose the work estimate update if we do an early exit, for example on NUMERICAL

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah. I didn't understand how this work. This is supposed to run the code when it goes out of scope?

@chris-maes
Copy link
Copy Markdown
Contributor Author

/merge

@rapids-bot rapids-bot bot merged commit 337a11f into NVIDIA:main Feb 18, 2026
271 of 276 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

dual_simplex improvement Improves an existing functionality non-breaking Introduces a non-breaking change

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants