Fix printing of non-zero coeffs after cut passes #858

Closed
mtanneau wants to merge 2 commits into NVIDIA:main from mtanneau:mt/nnz-with-cuts

Conversation

@mtanneau
Contributor

@mtanneau mtanneau commented Feb 13, 2026

Description

I tracked the nonsensical number of non-zeros down to this line:

original_lp_.A.col_start[original_lp_.A.n]);

And replaced the manual access to A.col_start[A.n] with A.nnz().

For completeness, I tried the following variants:

  • ❌ Same as main: pass original_lp_.A.col_start[original_lp_.A.n] directly to printf.
  • ✅ A.nnz()
  • ✅ Grab number of non-zeros outside of the printf statement, then print it. Use templated integer type.
    const i_t num_non_zeros = original_lp_.A.col_start[original_lp_.A.n];
    ... printf("%d", num_non_zeros);  // illustrative, not functional as-is
  • ❌ Grab number of non-zeros outside of the printf statement, then print it. Use auto type.
    const auto num_non_zeros = original_lp_.A.col_start[original_lp_.A.n];
    ... printf("%d", num_non_zeros);

I suspect that passing A.col_start[A.n] directly causes a value of the wrong type to reach printf, which then misreads its arguments. To see this, change the printf statement above so that the non-zero count has three %d placeholders fed by only two arguments:

settings_.log.printf("Size with cuts : %d constraints, %d variables, [%d, %d, %d] nonzeros\n",
                         original_lp_.num_rows,
                         original_lp_.num_cols,
                         original_lp_.A.col_start[original_lp_.A.n],
                         12345);

which should output something like

Size with cuts : 336 constraints, 6411 variables, [-994767476, -1061948824, 12345] nonzeros

when solving the air05.mps instance.

I also checked that all 4 variants are displayed as expected when printed directly to std::cout.

Issue

Addresses part of #855, namely the number of non-zeros reported after cuts have been added.

Checklist

  • I am familiar with the Contributing Guidelines.
  • Testing
    • New or existing tests cover these changes
    • Added tests
    • Created an issue to follow-up
    • NA
  • Documentation
    • The documentation is up to date with these changes
    • Added new documentation
    • NA

Summary by CodeRabbit

  • Bug Fixes
    • Corrected nonzero count reporting in problem size diagnostics to accurately reflect actual values, improving the accuracy of size metrics displayed during problem solving.

Signed-off-by: mtanneau <mathieu.tanneau@gmail.com>
@mtanneau mtanneau requested a review from a team as a code owner February 13, 2026 16:49
@copy-pr-bot

copy-pr-bot bot commented Feb 13, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@coderabbitai

coderabbitai bot commented Feb 13, 2026

Walkthrough

A single line in a logging statement was modified to replace accessing the nonzero count via original_lp_.A.col_start[original_lp_.A.n] with a direct call to original_lp_.A.nnz(). This is a non-functional logging improvement with no control flow impact.

Changes

Cohort / File(s) | Summary
Logging Fix: cpp/src/branch_and_bound/branch_and_bound.cpp | Replaced indirect nonzero count access in "Size with cuts" log message with direct nnz() method call for clarity and correctness.

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~2 minutes

🚥 Pre-merge checks | ✅ 3 | ❌ 1

❌ Failed checks (1 warning)
Check name | Status | Explanation | Resolution
Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (3 passed)
Check name | Status | Explanation
Title check | ✅ Passed | The title clearly and specifically describes the main change: fixing the printing of non-zero coefficients after cut passes, which is the core issue addressed in the PR.
Merge Conflict Detection | ✅ Passed | No merge conflicts detected when merging into main
Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled.


@chris-maes
Contributor

The line of code should be correct as is. I think you may be exposing another bug.

Can you send the output you get that doesn't make sense?

Contributor

@chris-maes chris-maes left a comment

A.col_start[A.n] should always be equal to the number of nonzeros. I think you may be hitting another bug.

Let's fix the original bug rather than merging this change.
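
As a minimal sketch of why A.col_start[A.n] should equal the number of nonzeros in compressed sparse column (CSC) storage: col_start has n + 1 entries, column j occupies the index range [col_start[j], col_start[j + 1]), and the last entry therefore counts all stored values. The struct below is a hypothetical stand-in that only reuses the field names quoted in this thread (col_start, n, nnz()); it is not the cuOpt class, and its nnz() is an assumed implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical minimal CSC matrix; not the actual cuOpt sparse matrix type.
struct csc_matrix_t {
  int m;                       // number of rows
  int n;                       // number of columns
  std::vector<int> col_start;  // size n + 1; column j spans [col_start[j], col_start[j + 1])
  std::vector<int> row_index;  // row index of each stored value
  std::vector<double> values;  // stored nonzero values

  // Assumed implementation: the last column pointer is the nonzero count.
  int nnz() const { return col_start[static_cast<std::size_t>(n)]; }
};

// Checks the structural invariants that make col_start[n] equal to the nonzero count.
inline bool is_consistent(const csc_matrix_t& A)
{
  if (A.n < 0 || A.col_start.size() != static_cast<std::size_t>(A.n) + 1) { return false; }
  if (A.col_start.front() != 0) { return false; }
  for (int j = 0; j < A.n; ++j) {
    if (A.col_start[j] > A.col_start[j + 1]) { return false; }  // pointers must be non-decreasing
  }
  return A.row_index.size() == A.values.size() &&
         static_cast<std::size_t>(A.col_start[A.n]) == A.values.size();
}

// Small 3x3 example with four stored values.
inline csc_matrix_t make_example()
{
  return csc_matrix_t{3, 3, {0, 2, 3, 4}, {0, 2, 1, 2}, {1.0, 2.0, 3.0, 4.0}};
}
```

If a check like is_consistent passes while the printed count is still wrong, the stored data is fine and the problem lies in how the value is printed.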

@chris-maes
Copy link
Copy Markdown
Contributor

Here is the output I get on air05.mps

cuOpt version: 26.4.0, git hash: e4985d05d, host arch: x86_64, device archs: 75-real
CPU: AMD Ryzen Threadripper PRO 3975WX 32-Cores, threads (physical/logical): 32/64, RAM: 27.25 GiB
CUDA 12.9, device: Quadro RTX 8000 (ID 0), VRAM: 47.25 GiB
CUDA device UUID: 5c4dffffffbfffffffee-5c38-ffffff8957

Solving a problem with 426 constraints, 7195 variables (7195 integers), and 52121 nonzeros
Problem scaling:
Objective coefficents range:          [4e+01, 3e+03]
Constraint matrix coefficients range: [1e+00, 1e+00]
Constraint rhs / bounds range:        [1e+00, 1e+00]
Variable bounds range:                [0e+00, 1e+00]

Original problem: 426 constraints, 7195 variables, 52121 nonzeros
Calling Papilo presolver (git hash 741a2b9c)
Presolve status: reduced the problem
Presolve removed: 90 constraints, 1116 variables, 16171 nonzeros
Presolved problem: 336 constraints, 6079 variables, 35950 nonzeros
Objective function is integral
Papilo presolve time: 0.65
Objective offset 9167.000000 scaling_factor 1.000000
Model fingerprint: 0x3257a27
Running presolve!
Unused variables detected, eliminating them! Unused var count 4
After cuOpt presolve: 335 constraints, 6075 variables, objective offset 9167.000000.
cuOpt presolve time: 2.67

Solving LP root relaxation in concurrent mode
Skipping column scaling
Dual Simplex Phase 1
Dual feasible solution found.
Dual Simplex Phase 2
 Iter     Objective           Num Inf.  Sum Inf.     Perturb  Time
    0 -8.1936000000000000e+04     277 2.31020000e+04 0.00e+00 3.37
    1 -8.0926000000000000e+04     266 1.48790000e+04 0.00e+00 3.37
 1000 -5.7739216141603800e+04     137 1.69703732e+02 0.00e+00 3.53


Root relaxation solution found in 1122 iterations and 0.20s by Dual Simplex
Root relaxation objective +2.58776093e+04


 | Explored | Unexplored |    Objective    |     Bound     | IntInf | Depth | Iter/Node |   Gap    |  Time  |
           0            0             +inf    +2.588508e+04      228      0   1.2e+03        -        3.70
Gomory    cuts : 1
MIR       cuts : 0
Knapsack  cuts : 0
Strong CG cuts : 0
Cut pool size  : 215
Size with cuts : 336 constraints, 6411 variables, 42275 nonzeros
Strong branching using 63 threads and 228 fractional variables
H                            +4.887100e+04    +2.588508e+04                                47.0%      3.78
H                            +4.876600e+04    +2.588508e+04                                46.9%      3.78
H                            +4.795900e+04    +2.588508e+04                                46.0%      3.78
H                            +4.746900e+04    +2.588508e+04                                45.5%      3.78
H                            +4.709900e+04    +2.588508e+04                                45.0%      3.78
H                            +4.596300e+04    +2.588508e+04                                43.7%      3.78
Strong branching completed in 0.31s
Exploring the B&B tree using 63 threads

 | Explored | Unexplored |    Objective    |     Bound     | IntInf | Depth | Iter/Node |   Gap    |  Time  |
H                            +4.496500e+04    +2.588508e+04                                42.4%      4.01
H                            +4.484300e+04    +2.588508e+04                                42.3%      4.01
D        179          181    +2.653300e+04    +2.595461e+04        0     11   1.2e+02       2.2%      5.08
B        223          179    +2.640200e+04    +2.595461e+04        0     13   1.1e+02       1.7%      5.12
D        238          165    +2.637400e+04    +2.600182e+04        0     29   1.1e+02       1.4%      5.16
Explored 633 nodes in 5.61s.
Absolute Gap 2.419355e+00 Objective 2.6374000000000004e+04 Lower Bound 2.6371580645161292e+04
Optimal solution found within relative MIP gap tolerance (1.0e-04)

@mtanneau
Contributor Author

mtanneau commented Feb 13, 2026

Let me run this again and share the logs here.

OK, here we go.

Note the following:

  • The number of non-zeros displayed after the cut pass is nonsensical (-1488061860).
  • The final absolute gap is negative (-5.595838e+00).

Setting CUDA_MODULE_LOADING to EAGER
Reading file air05.mps
cuOpt version: 26.4.0, git hash: b3bb21e4, host arch: aarch64, device archs: 121a-real
CPU: Unknown, threads (physical/logical): 1/20, RAM: 96.08 GiB
CUDA 12.9, device: NVIDIA GB10 (ID 0), VRAM: 119.70 GiB
CUDA device UUID: 43bce240-d7d6-65e5-ca7a-e351bab5de7a

Solving a problem with 426 constraints, 7195 variables (7195 integers), and 52121 nonzeros
Problem scaling:
Objective coefficents range:          [4e+01, 3e+03]
Constraint matrix coefficients range: [1e+00, 1e+00]
Constraint rhs / bounds range:        [1e+00, 1e+00]
Variable bounds range:                [0e+00, 1e+00]

Original problem: 426 constraints, 7195 variables, 52121 nonzeros
Calling Papilo presolver (git hash 741a2b9c)
Presolve status: reduced the problem
Presolve removed: 90 constraints, 1116 variables, 16171 nonzeros
Presolved problem: 336 constraints, 6079 variables, 35950 nonzeros
Objective function is integral
Papilo presolve time: 0.42
Objective offset 9167.000000 scaling_factor 1.000000
Model fingerprint: 0x3257a27
Running presolve!
Unused variables detected, eliminating them! Unused var count 4
After cuOpt presolve: 335 constraints, 6075 variables, objective offset 9167.000000.
cuOpt presolve time: 3.85

Solving LP root relaxation in concurrent mode
Skipping column scaling
Dual Simplex Phase 1
Dual feasible solution found.
Dual Simplex Phase 2
 Iter     Objective           Num Inf.  Sum Inf.     Perturb  Time
    0 -8.1936000000000000e+04     277 2.31020000e+04 0.00e+00 4.51
    1 -8.0926000000000000e+04     266 1.48790000e+04 0.00e+00 4.51
 1000 -5.7784953825729302e+04     127 1.13852044e+02 0.00e+00 4.61


Root relaxation solution found in 1194 iterations and 0.13s by Dual Simplex
Root relaxation objective +2.58776093e+04


 | Explored | Unexplored |    Objective    |     Bound     | IntInf | Depth | Iter/Node |   Gap    |  Time  |
           0            0             +inf    +2.588508e+04      228      0   1.2e+03        -        4.74
Gomory    cuts : 1
MIR       cuts : 0
Knapsack  cuts : 0
Strong CG cuts : 0
Cut pool size  : 211
Size with cuts : 336 constraints, 6411 variables, -1488061860 nonzeros
Strong branching using 19 threads and 228 fractional variables
Strong branching completed in 0.47s
Exploring the B&B tree using 19 threads

 | Explored | Unexplored |    Objective    |     Bound     | IntInf | Depth | Iter/Node |   Gap    |  Time  |
H                            +4.466800e+04    +2.588508e+04                                42.1%      5.28
H                            +4.448000e+04    +2.588508e+04                                41.8%      5.28
H                            +4.430000e+04    +2.588508e+04                                41.6%      5.29
H                            +4.387300e+04    +2.588508e+04                                41.0%      5.29
H                            +4.354400e+04    +2.588508e+04                                40.6%      5.30
H                            +4.287600e+04    +2.588508e+04                                39.6%      5.31
H                            +4.191500e+04    +2.588508e+04                                38.2%      5.32
H                            +4.166800e+04    +2.588508e+04                                37.9%      5.34
H                            +4.149500e+04    +2.588508e+04                                37.6%      5.35
H                            +4.121300e+04    +2.588508e+04                                37.2%      5.37
D         42           44    +2.657100e+04    +2.591578e+04        0     16   1.4e+02       2.5%      6.78
B         44           44    +2.645600e+04    +2.591578e+04        0     19   1.3e+02       2.0%      6.78
D         62           56    +2.644800e+04    +2.591578e+04        0     30   1.3e+02       2.0%      6.85
D         71           63    +2.644500e+04    +2.595460e+04        0     30   1.3e+02       1.9%      6.87
D         71           63    +2.644100e+04    +2.595460e+04        0     29   1.3e+02       1.8%      6.87
D         77           63    +2.643900e+04    +2.595460e+04        0     27   1.3e+02       1.8%      6.88
B        170          110    +2.640200e+04    +2.610151e+04        0     12   1.1e+02       1.1%      7.16
D        342          127    +2.640000e+04    +2.624972e+04        0     19   9.6e+01       0.6%      7.60
D        344          129    +2.637500e+04    +2.624972e+04        0     17   9.6e+01       0.5%      7.61
D        348          128    +2.637400e+04    +2.626036e+04        0     17   9.6e+01       0.4%      7.62
Explored 485 nodes in 7.79s.
Absolute Gap -5.595838e+00 Objective 2.6374000000000022e+04 Lower Bound 2.6379595837897017e+04
Optimal solution found.
Solution objective: 26374.000000 , relative_mip_gap 0.000212 solution_bound 26379.595838 presolve_time 4.275866 total_solve_time 7.854454 max constraint violation 0.000000 max int violation 0.000000 max var bounds violation 0.000000 nodes 485 simplex_iterations 40126

  • cuOpt was built from source using the Conda-based workflow recommended in CONTRIBUTING.md.
  • I'm running on a DGX Spark desktop (GB10 GPU and 20-core CPU), which is an aarch64 Linux machine.
  • Runs are not deterministic: the number of non-zeros and the final gap displayed are not consistent across runs.

@mtanneau
Contributor Author

The issue does seem to come from how printf displays that number, not from the underlying numerical value being wrong.

Printing directly to the console using cout and printf as follows

std::cout << "(  cout) nnz = " << original_lp_.A.col_start[original_lp_.A.n] << std::endl;
printf("(printf) nnz = %d\n", original_lp_.A.col_start[original_lp_.A.n]);

yields the following output for me:

(  cout) nnz = 42295
(printf) nnz = 459570076  # typically a nonsensical value

@mtanneau
Contributor Author

I'm able to narrow it down to the following MRE:

// this displays weird output
ins_vector<i_t> x(2);
x[0] = 0;
x[1] = 1;
printf("(printf) x=[%d, %d]\n", x[0], x[1]);
// but this doesn't
std::vector<i_t> y = {10, 11};
printf("(printf) y=[%d, %d]\n", y[0], y[1]);

which, on my system, outputs

(printf) x=[1208483168, 1635283872]
(printf) y=[10, 11]

My best guess is that when the array element is passed to printf directly, printf somehow sees extra data (maybe an extra pointer). No issue occurs when the value is first extracted and cast to i_t before printing, nor when a std::vector container is used.
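
One possible explanation for the difference between the `auto` and `i_t` variants, sketched under the assumption that ins_vector's operator[] returns a proxy object rather than a plain i_t: `auto` keeps the proxy type, while initializing a typed local invokes the proxy's conversion and yields a real integer. The types proxy_t and instrumented_vector_t below are hypothetical stand-ins written for illustration, not the cuOpt types.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <type_traits>
#include <vector>

// Hypothetical stand-in for an element proxy; not the actual cuOpt type.
struct proxy_t {
  int* ptr;
  operator int() const { return *ptr; }  // implicit conversion to the element value
};

// Hypothetical instrumented vector whose operator[] returns a proxy by value.
struct instrumented_vector_t {
  std::vector<int> data;
  proxy_t operator[](std::size_t i) { return proxy_t{&data[i]}; }
};

// Reads element i as a plain int, which is safe to pass to printf("%d").
inline int element_value(instrumented_vector_t& v, std::size_t i)
{
  auto deduced = v[i];  // `auto` deduces the proxy type, not int
  static_assert(std::is_same<decltype(deduced), proxy_t>::value, "auto keeps the proxy");

  const int value = v[i];  // a typed local triggers operator int()
  static_assert(std::is_same<decltype(value), const int>::value, "typed local holds an int");

  (void)deduced;  // only used for the compile-time check above
  return value;
}

// Prints the last element after converting it to int first.
inline void print_last(instrumented_vector_t& v)
{
  assert(!v.data.empty());
  std::printf("last = %d\n", element_value(v, v.data.size() - 1));
}
```

Under this assumption, std::cout works because overload resolution applies the conversion, whereas a varargs call receives the proxy object itself.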

@chris-maes
Contributor

Ok. Thanks. The printing issue is likely caused by the ins_vector. PR #856 removes the ins_vector from the sparse matrix code.

The bigger issue is that the gap is negative: the lower bound is greater than the upper bound.
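
A minimal sketch of the consistency check this implies, assuming a minimization problem and an absolute gap defined as objective minus lower bound (which matches the logged values: 26374.000000000022 - 26379.595837897017 ≈ -5.595838). The names gap_check_t and check_gap and the tolerance are hypothetical and not part of cuOpt.

```cpp
#include <cassert>

// Hypothetical result type for the check below.
struct gap_check_t {
  double absolute_gap;  // objective - lower_bound
  bool consistent;      // true when the lower bound does not exceed the objective (within tol)
};

// For a minimization problem, a valid lower bound can never exceed the
// incumbent objective, so the absolute gap should be non-negative.
inline gap_check_t check_gap(double objective, double lower_bound, double tol = 1e-6)
{
  assert(tol >= 0.0);
  const double gap = objective - lower_bound;
  return gap_check_t{gap, gap >= -tol};
}
```

Applied to the two logs above, the x86_64 run (gap 2.419355e+00) passes this check and the aarch64 run (gap -5.595838e+00) fails it.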

@aliceb-nv
Contributor

I confirm this is caused by a nasty interaction between ins_vector, which uses an element_proxy_t object to monitor dereferences, and the C-style varargs used by printf. C++ implicit type conversions usually prevent this, but varargs were obviously not conceived with fancy C++ tricks in mind :)

Chris' PR will resolve this naturally, as ins_vectors will be phased out.
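
Until such proxies are gone, one way to catch this class of bug at compile time is a thin variadic-template wrapper that rejects any argument that is not a plain arithmetic value or a pointer. This is only a sketch of the idea: checked_printf and is_varargs_safe are hypothetical names, and the wrapper is not part of cuOpt's logger.

```cpp
#include <cassert>
#include <cstdio>
#include <type_traits>

// Hypothetical trait: types that can safely travel through C varargs in this sketch.
template <typename T>
struct is_varargs_safe
  : std::integral_constant<bool, std::is_arithmetic<T>::value || std::is_pointer<T>::value> {};

// Hypothetical wrapper around std::printf. Arguments are taken by value, so
// string literals decay to const char*. A proxy object fails the static_assert
// instead of being passed through "..." unconverted.
template <typename... Args>
int checked_printf(const char* fmt, Args... args)
{
  static_assert((... && is_varargs_safe<Args>::value),
                "checked_printf: convert class-type arguments explicitly, e.g. static_cast<int>(x)");
  assert(fmt != nullptr);
  return std::printf(fmt, args...);
}
```

With such a wrapper, a call like checked_printf("%d\n", x[0]) on a proxy-returning container would fail to compile, while checked_printf("%d\n", static_cast<int>(x[0])) would be accepted. The fold expression requires C++17.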

@mtanneau
Contributor Author

Thanks @chris-maes and @aliceb-nv for looking into this!
I am closing this PR and will update the issue to focus on the negative gap.

3 participants