Sub-MIP recombiner and B&B global variable changes #259

Merged
rapids-bot[bot] merged 31 commits into NVIDIA:branch-25.10 from akifcorduk:paper_tests
Aug 22, 2025
Conversation

@akifcorduk
Contributor

This PR adds a new recombiner: sub-MIP. We use the B&B solver to solve the subproblem within the recombiner's time limit. This PR also runs both local searches (line segment and FJ) one after another, instead of running only one of them, chosen with 50% probability.

This PR also changes all the global variables in B&B to class member variables. This way, multiple B&B instances, including the one used by the sub-MIP recombiner, can run in parallel in the same process.
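
To illustrate why moving shared state from globals into members allows several solver instances in one process, here is a minimal sketch. The class `toy_branch_and_bound_t` and the helper `run_two_instances` are hypothetical stand-ins, not the cuOpt classes; only the idea of a per-instance mutex and upper bound comes from the PR description.

```cpp
#include <algorithm>
#include <cassert>
#include <limits>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for a B&B solver; not the cuOpt class.
class toy_branch_and_bound_t {
 public:
  // Record an incumbent objective value found by a worker.
  void report_incumbent(double objective)
  {
    std::lock_guard<std::mutex> guard(mutex_upper_);
    upper_bound_ = std::min(upper_bound_, objective);
  }

  double get_upper_bound()
  {
    std::lock_guard<std::mutex> guard(mutex_upper_);
    return upper_bound_;
  }

 private:
  // Per-instance state. With globals, every instance would share these two.
  std::mutex mutex_upper_;
  double upper_bound_ = std::numeric_limits<double>::infinity();
};

// Drive two independent instances from two threads and return both bounds.
inline std::pair<double, double> run_two_instances(const std::vector<double>& main_values,
                                                   const std::vector<double>& sub_mip_values)
{
  assert(!main_values.empty() && !sub_mip_values.empty());
  toy_branch_and_bound_t main_solver;
  toy_branch_and_bound_t sub_mip_solver;
  std::thread t1([&] {
    for (double v : main_values) { main_solver.report_incumbent(v); }
  });
  std::thread t2([&] {
    for (double v : sub_mip_values) { sub_mip_solver.report_incumbent(v); }
  });
  t1.join();
  t2.join();
  return {main_solver.get_upper_bound(), sub_mip_solver.get_upper_bound()};
}
```

Because each instance owns its mutex and bound, an incumbent reported to the sub-MIP solver cannot overwrite the bound of the main solver.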

@akifcorduk added this to the 25.10 milestone Aug 6, 2025
@akifcorduk requested a review from a team as a code owner August 6, 2025 12:03
@akifcorduk added the non-breaking (Introduces a non-breaking change) and improvement (Improves an existing functionality) labels Aug 6, 2025
@akifcorduk requested review from kaatish and rg20 August 6, 2025 12:03
@rgsl888prabhu changed the base branch from branch-25.08 to branch-25.10 August 6, 2025 14:25
CUOPT_LOG_DEBUG(
"n_vars_from_guiding %d n_vars_from_other %d", n_vars_from_guiding, n_vars_from_other);
this->compute_vars_to_fix(offspring, vars_to_fix, n_vars_from_other, n_vars_from_guiding);
auto [fixed_problem, fixed_assignment, variable_map] = offspring.fix_variables(vars_to_fix);
Contributor

@hlinsen hlinsen Aug 6, 2025

Do you know if it would cause size issues if we were to remove fixed variables or apply presolve to the fixed problem?

Contributor Author

What do you mean by "removing fixed variable"?

I don't think it would cause any problems if we applied presolve as long as the conversions are done correctly.

Contributor

I mean whether we can change the size of the CSR matrix.

Contributor Author

We are changing the size of the CSR matrix already. That's the fixed_problem.
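
As a rough picture of how fixing variables shrinks a CSR constraint matrix, consider the sketch below. All names (`csr_matrix_t`, `fixed_problem_t`, `fix_variables_sketch`) are hypothetical and the layout is an assumption; this is not the `fix_variables` method from the PR. Fixed columns are dropped, their contribution is accumulated per row, and a map from reduced to original column indices is kept.

```cpp
#include <cassert>
#include <vector>

// Hypothetical minimal CSR matrix: rows are constraints, columns are variables.
struct csr_matrix_t {
  int num_cols = 0;
  std::vector<int> row_offsets;  // size num_rows + 1
  std::vector<int> col_indices;  // size nnz
  std::vector<double> values;    // size nnz
};

struct fixed_problem_t {
  csr_matrix_t matrix;            // only the free columns remain
  std::vector<double> rhs_shift;  // per-row contribution of the fixed variables
  std::vector<int> variable_map;  // reduced column index -> original column index
};

// fixed_value[j] is only read where is_fixed[j] is true.
inline fixed_problem_t fix_variables_sketch(const csr_matrix_t& A,
                                            const std::vector<bool>& is_fixed,
                                            const std::vector<double>& fixed_value)
{
  assert(!A.row_offsets.empty());
  assert(static_cast<int>(is_fixed.size()) == A.num_cols);
  assert(fixed_value.size() == is_fixed.size());

  fixed_problem_t out;
  std::vector<int> new_index(A.num_cols, -1);
  for (int j = 0; j < A.num_cols; ++j) {
    if (!is_fixed[j]) {
      new_index[j] = static_cast<int>(out.variable_map.size());
      out.variable_map.push_back(j);
    }
  }
  out.matrix.num_cols = static_cast<int>(out.variable_map.size());

  const int num_rows = static_cast<int>(A.row_offsets.size()) - 1;
  out.rhs_shift.assign(num_rows, 0.0);
  out.matrix.row_offsets.push_back(0);
  for (int i = 0; i < num_rows; ++i) {
    for (int k = A.row_offsets[i]; k < A.row_offsets[i + 1]; ++k) {
      const int j = A.col_indices[k];
      if (is_fixed[j]) {
        // The fixed variable becomes a constant that moves to the right-hand side.
        out.rhs_shift[i] += A.values[k] * fixed_value[j];
      } else {
        out.matrix.col_indices.push_back(new_index[j]);
        out.matrix.values.push_back(A.values[k]);
      }
    }
    out.matrix.row_offsets.push_back(static_cast<int>(out.matrix.col_indices.size()));
  }
  return out;
}
```

A caller would subtract `rhs_shift[i]` from the bounds of row `i` and use `variable_map` to translate a solution of the reduced problem back to the original variables.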

global_variables::mutex_upper.unlock();
return upper_bound;
mutex_upper.lock();
const f_t upper_bound_ = upper_bound;
Contributor

Thanks for making these changes, @akifcorduk!

One nitpick. We have local variables for the lower bound, upper bound, and gap, as well as the member variables that are shared between threads.

I think you used the convention of adding an underscore suffix to local variables (i.e. upper_bound_ or lower_bound_). I'm used to the exact opposite convention: that member variables have an underscore suffix. So I read lower_bound_ as a member variable of the class.

To avoid confusion, would you be ok if we used the underscore prefix for the member variables?

Contributor

I agree with Chris :) To me, at first glance, a prefix or suffix underscore suggests a member variable.

Contributor Author

Actually, I wanted to get rid of these local variables used for reading. I wanted to convert all global/shared variables into atomics. I can do it in this PR, or later in another PR. What's your preference, @chris-maes?

Contributor Author

Handled it, let's do the atomic changes in another PR.

Contributor

Sounds great. Thanks
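
For the follow-up idea of replacing the mutex-protected bound and its local copies with an atomic, one possible shape is sketched below. The class `incumbent_bound_t` is hypothetical and only illustrates the technique (a compare-and-swap loop that keeps the minimum); it is not the change that was eventually made. Member variables use the underscore suffix discussed above.

```cpp
#include <atomic>
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical shared upper bound that threads can read without a lock.
class incumbent_bound_t {
 public:
  // Lower the bound to `candidate` if it is an improvement.
  // Returns true if this call changed the bound.
  bool try_improve(double candidate)
  {
    assert(!std::isnan(candidate));
    double current = upper_bound_.load();
    while (candidate < current) {
      // On failure, compare_exchange_weak reloads `current` and the loop re-checks.
      if (upper_bound_.compare_exchange_weak(current, candidate)) { return true; }
    }
    return false;
  }

  double get() const { return upper_bound_.load(); }

 private:
  std::atomic<double> upper_bound_{std::numeric_limits<double>::infinity()};
};
```

Readers call `get()` directly, so a separate local copy taken under a lock is no longer needed for a single value. Quantities that must stay mutually consistent, such as a bound together with its solution vector, would still need a lock or a different design.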

Comment thread cpp/src/dual_simplex/pseudo_costs.hpp Outdated
Comment thread cpp/src/dual_simplex/pseudo_costs.hpp Outdated
Comment thread cpp/src/dual_simplex/pseudo_costs.cpp
branch_and_bound_settings.integer_tol = context.settings.tolerances.integrality_tolerance;
// disable B&B logs, so that it is not interfering with the main B&B thread
branch_and_bound_settings.log.log = false;
dual_simplex::branch_and_bound_t<i_t, f_t> branch_and_bound(branch_and_bound_problem,
Contributor

I think you should add a callback here, since any feasible solution of the sub-MIP should also be a feasible solution of the original problem. So you want to propagate solutions out of the sub-MIP.

Contributor Author

Good catch. I wanted to implement getting the best primal solution; this way we can get all intermediate solutions.
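
A callback of this kind could look roughly like the sketch below. The names `solution_callback_t` and `make_lifting_callback` are hypothetical, and the sketch assumes the sub-MIP solver invokes a user callback with each reduced-space solution; the actual B&B callback interface is not shown in this thread. Each reduced solution is lifted back to the original variable space using the fixed assignment and the variable map, then forwarded.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

using solution_callback_t = std::function<void(const std::vector<double>&)>;

// Build a callback for the sub-MIP solver that lifts every reduced solution
// into the original space and hands it to `on_full_solution`.
inline solution_callback_t make_lifting_callback(
  std::vector<double> full_assignment,  // original-space values, including the fixed ones
  std::vector<int> variable_map,        // reduced index -> original index
  solution_callback_t on_full_solution)
{
  return [full = std::move(full_assignment),
          map  = std::move(variable_map),
          sink = std::move(on_full_solution)](const std::vector<double>& reduced) {
    assert(reduced.size() == map.size());
    std::vector<double> lifted = full;  // start from the fixed values
    for (std::size_t k = 0; k < map.size(); ++k) {
      lifted[static_cast<std::size_t>(map[k])] = reduced[k];
    }
    sink(lifted);
  };
}
```

With this shape, every intermediate incumbent of the sub-MIP reaches the outer solver, not only the final best solution.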

fixed_problem.get_host_user_problem(branch_and_bound_problem);
branch_and_bound_solution.resize(branch_and_bound_problem.num_cols);
// Fill in the settings for branch and bound
branch_and_bound_settings.time_limit = sub_mip_recombiner_config_t::sub_mip_time_limit;
Contributor

Probably instead of, or in addition to, a time limit, we should add a node limit to branch and bound.

Contributor Author

Why is that the case? Does returning early provide any value? Or does B&B have diminishing returns in the sub-MIP context when we run it for more than a certain number of nodes?

Contributor

As we discussed offline, I think using a node limit just makes things more deterministic. We can switch later when the MIP heuristics are deterministic. Fine to use time limit for now.
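
The determinism argument can be seen in a toy tree search. Everything in the sketch below (`search_limits_t`, `toy_tree_search`, the knapsack-style problem) is hypothetical and far simpler than real branch and bound. The point is only that a node limit stops the search at the same node on every run, while a time limit stops wherever the clock happens to be.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

struct search_limits_t {
  long node_limit       = -1;    // negative means no node limit
  double time_limit_sec = 1e30;  // effectively unlimited by default
};

struct search_result_t {
  long nodes_explored = 0;
  int best_value      = 0;
};

struct node_t {
  std::size_t depth;  // number of items already decided
  int weight;
  int value;
};

// Depth-first enumeration: pick items to maximize value with weight <= capacity.
inline search_result_t toy_tree_search(const std::vector<int>& weights,
                                       const std::vector<int>& values,
                                       int capacity,
                                       const search_limits_t& limits)
{
  assert(weights.size() == values.size());
  using clock      = std::chrono::steady_clock;
  const auto start = clock::now();

  search_result_t result;
  std::vector<node_t> stack;
  stack.push_back(node_t{0, 0, 0});
  while (!stack.empty()) {
    // Node limit: depends only on the search order, so it is reproducible.
    if (limits.node_limit >= 0 && result.nodes_explored >= limits.node_limit) { break; }
    // Time limit: depends on machine speed and load, so it is not.
    const double elapsed = std::chrono::duration<double>(clock::now() - start).count();
    if (elapsed >= limits.time_limit_sec) { break; }

    const node_t node = stack.back();
    stack.pop_back();
    ++result.nodes_explored;
    if (node.weight > capacity) { continue; }  // infeasible, prune
    if (node.value > result.best_value) { result.best_value = node.value; }
    if (node.depth == weights.size()) { continue; }  // leaf
    // Branch: push "skip item" first so that "take item" is explored first.
    stack.push_back(node_t{node.depth + 1, node.weight, node.value});
    stack.push_back(node_t{node.depth + 1,
                           node.weight + weights[node.depth],
                           node.value + values[node.depth]});
  }
  return result;
}
```

Two runs with the same node limit explore the same nodes and return the same incumbent, which is what makes results comparable across machines.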

Comment thread benchmarks/linear_programming/cuopt/mip_test_instances.hpp
Comment on lines +22 to +42
namespace diversity_config_t {
static double time_ratio_on_init_lp = 0.1;
static double max_time_on_lp = 30;
static double time_ratio_of_probing_cache = 0.10;
static double max_time_on_probing = 60;
static size_t max_iterations_without_improvement = 15;
static int max_var_diff = 256;
static size_t max_solutions = 32;
static double initial_infeasibility_weight = 1000.;
static double default_time_limit = 10.;
static int initial_island_size = 3;
static int maximum_island_size = 8;
static bool use_avg_diversity = false;
static double generation_time_limit_ratio = 0.6;
static double max_island_gen_time = 600;
static size_t n_sol_for_skip_init_gen = 3;
static double max_fast_sol_time = 10;
static double lp_run_time_if_feasible = 15.;
static double lp_run_time_if_infeasible = 1;
static bool halve_population = true;
}; // namespace diversity_config_t
Contributor

Was the loss of the const/constexpr qualifier intended?
Mutable global variables may cause issues.

Contributor Author

Yes, this was intended. We are changing those constants depending on an environment config variable. They are wrapped in a namespace, so I doubt they will cause issues.

Contributor

Okay, makes sense!
Although, ideally, to make the semantics clearer, I'd see this as a separate struct type, held as a 'const' member and properly initialized at solver construction.
Even if in this context it is unlikely to cause issues, global mutable variables are just inherently alarming, especially in a large multithreaded codebase.

Contributor

@aliceb-nv aliceb-nv Aug 7, 2025

If you keep them as globals, please mark them as 'static inline' or 'extern'. As it stands, a static variable declared in a header will be instantiated with internal linkage in each separate translation unit, so if one of them is modified in a given .cu file, other .cu files won't see this change.

Contributor Author

Okay, I like the idea of a member; I will do that.
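
The member-based approach could be sketched as follows. The struct and class here are hypothetical and show only three of the fields; the environment variable name is chosen by the caller and is not a real cuOpt setting. One language detail is worth keeping in mind for the global alternative: at namespace scope a `static` variable has internal linkage even when it is also `inline`, so a single object shared by all translation units needs a plain C++17 `inline` variable or an `extern` declaration with one definition.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <string>

// Hypothetical config struct mirroring a few of the fields above.
struct diversity_config_t {
  double time_ratio_on_init_lp = 0.1;
  std::size_t max_solutions    = 32;
  bool halve_population        = true;
};

// Build the config once. If the environment variable is set, it overrides
// max_solutions. std::stoul throws if the value is not a number.
inline diversity_config_t make_diversity_config(const char* env_name)
{
  assert(env_name != nullptr);
  diversity_config_t config;
  if (const char* raw = std::getenv(env_name)) {
    config.max_solutions = static_cast<std::size_t>(std::stoul(std::string(raw)));
  }
  return config;
}

// Hypothetical owner: the config is a const member set at construction.
class toy_diversity_manager_t {
 public:
  explicit toy_diversity_manager_t(const diversity_config_t& config) : config_(config) {}

  std::size_t max_solutions() const { return config_.max_solutions; }

 private:
  const diversity_config_t config_;  // cannot change after the solver is built
};
```

Each solver instance then carries its own immutable settings, so two solvers in one process can use different configurations without touching shared mutable state.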

Comment thread cpp/src/mip/diversity/recombiners/recombiner_stats.hpp Outdated
Comment thread cpp/src/mip/diversity/diversity_manager.cu Outdated
@akifcorduk requested review from a team as code owners August 8, 2025 13:42
@akifcorduk requested review from tmckayus and removed request for a team August 8, 2025 13:42
@akifcorduk removed request for a team and tmckayus August 8, 2025 13:44
Contributor

@chris-maes chris-maes left a comment

Changes to branch and bound and pseudocosts look good to me.

Awesome work Akif. I'm excited to have sub-MIPping in cuOpt!

@akifcorduk
Contributor Author

/merge

@rapids-bot merged commit 2283fd5 into NVIDIA:branch-25.10 Aug 22, 2025
142 of 144 checks passed
jieyibi pushed a commit to yining043/cuopt that referenced this pull request Mar 26, 2026
Authors:
  - Akif ÇÖRDÜK (https://github.com/akifcorduk)

Approvers:
  - Chris Maes (https://github.com/chris-maes)

URL: NVIDIA#259