Skip to content

Add process_investment_constraints input file#1020

Merged
Aurashk merged 30 commits intomainfrom
add-process-investment-constraints-input
Dec 9, 2025
Merged

Add process_investment_constraints input file#1020
Aurashk merged 30 commits intomainfrom
add-process-investment-constraints-input

Conversation

@Aurashk
Copy link
Copy Markdown
Collaborator

@Aurashk Aurashk commented Dec 1, 2025

Description

Adds investment constraints associated with each process which limits the amount each agent can invest in it based on specified parameters

Initial implementation adds an investment_constraints map to each process, which is keyed by region and associated year, and each key corresponds to a vector of ProcessInvestmentConstraint structs.

Fixes #124

Type of change

  • Bug fix (non-breaking change to fix an issue)
  • New feature (non-breaking change to add functionality)
  • Refactoring (non-breaking, non-functional change to improve maintainability)
  • Optimization (non-breaking change to speed up the code)
  • Breaking change (whatever its nature)
  • Documentation (improve or add documentation)

Key checklist

  • All tests pass: $ cargo test
  • The documentation builds and looks OK: $ cargo doc

Further checks

  • Code is commented, particularly in hard-to-understand areas
  • Tests added that prove fix is effective or that feature works

@codecov
Copy link
Copy Markdown

codecov bot commented Dec 1, 2025

Codecov Report

❌ Patch coverage is 88.09524% with 10 lines in your changes missing coverage. Please review.
✅ Project coverage is 81.69%. Comparing base (7dd9790) to head (2f30d33).
⚠️ Report is 9 commits behind head on main.

Files with missing lines Patch % Lines
src/input/process/investment_constraints.rs 88.60% 3 Missing and 6 partials ⚠️
src/input/process.rs 75.00% 0 Missing and 1 partial ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##             main    #1020   +/-   ##
=======================================
  Coverage   81.68%   81.69%           
=======================================
  Files          53       53           
  Lines        6608     6610    +2     
  Branches     6608     6610    +2     
=======================================
+ Hits         5398     5400    +2     
+ Misses        950      949    -1     
- Partials      260      261    +1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

src/process.rs Outdated
pub discount_rate: Dimensionless,
}

/// Value specification for different possible types of investment constraints
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need some input on this part:
I'm guessing in the end we don't want to store limit_type directly - we should just use it to store the correct parameters in InvestmentConstraintValue. I've also assumed the parameters can depend on ConstraintType but perhaps it's possible to simplify.

Is it enough to have one float stored in Addition which will tell you the yearly limit to apply? Or does it need to be a lower limit and an upper limit as decided by limit_type? I.e each agent needs to invest between low_lim and upper_lim per year. I'm a little confused because we also have a Limit ConstraintType and I'm not sure what that means in this context.

Not so worried about what fields are in InvestmentConstraintValue::Growth and InvestmentConstraintValue::Limit as we are starting with Addition but happy to take input on that too.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Copy Markdown
Collaborator

@tsmbland tsmbland left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good start. I have some comments which hopefully answer your questions. I would say for now we should just support upper limits. Hopefully that should simplify things.

I also wonder whether it might be simpler to have dedicated columns for each constraint type (similar to MUSE1), instead of the constraint_type column? For example,

process_id,regions,commission_years,constraint_type,value,growth_constraint_seed
WNDFRM,all,all,addition,100,
WNDFRM,all,all,growth,0.5,10
WNDFRM,all,all,total,1000,

would become

process_id,regions,commission_years,addition_limit,growth_limit,growth_seed,total_limit
WNDFRM,all,all,100,0.5,10,1000

Seems a bit clearer to me, only one extra column, and there's less onus on the user to remember the syntax for the constraint_type column. I also find the growth_constraint_seed column a bit confusing in the top format, especially when the constraint is not the "growth" type.

The downsides would be if the user only wants to define one type of constraint, or perhaps one constraint for all regions and another constraint for only some regions.

Not sure to be honest, happy to get other opinions

#[derive(DeserializeLabeledStringEnum)]
enum LimitType {
#[derive(DeserializeLabeledStringEnum, Debug, Clone, Copy, PartialEq, Eq)]
pub enum LimitType {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've got rid of this in #1018 and gone for range syntax instead. We should probably do the same here.

That said, I wonder if it's worth just storing upper limits for now? Lower limits are a new idea that we don't have in MUSE1, and it seems a bit unclear to me how we might enforce these

/// Represents a row of the process investment constraints CSV file
#[derive(Deserialize)]
struct ProcessInvestmentConstraintRaw {
constraint_name: String,
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I realise this was in the schema, but I can't see why we'd need a name, so let's get rid of this

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do have a slight pushback for this, when we give error messages for validation it might be useful to reference the name they provided so they can quickly find the associated constraint, if you think the inconvenience of having to define on isn't worth it though I'll get rid of it. We can still report the process ID and region etc

Copy link
Copy Markdown
Collaborator

@tsmbland tsmbland Dec 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need it, to be honest - you can just quote the aberrant value and the user will be able to find it

e.g. "Addition constraint value must be non-negative for constraint '{}'" -> "Invalid value for addition constraint: -1; must be non-negative"

Assuming the user can do ctrl-f they should be able to find the constraint from just the value. In this case we don't need any more context because -1 is always invalid

struct ProcessInvestmentConstraintRaw {
constraint_name: String,
process_id: String,
#[serde(default = "default_region")]
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's not give a default, to be consistent with the other files

process_id: String,
#[serde(default = "default_region")]
regions: String,
commission_years: u32,
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be a String and parsed with parse_year_str (e.g. to allow "all" or "2020;2025"). See how we do this for process_parameters

src/process.rs Outdated
growth_constraint_seed: f64,
},
/// Limit constraint: Not implemented yet
Limit {},
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rename Limit to Total

The values will be:

  • Addition: addition_limit
  • Growth: growth_limit, growth_seed
  • Total: total_limit

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That said, I think storing a Vec<ProcessInvestmentConstraint>> probably isn't ideal because most likely in a particular part of the code we'll want to extract a particular limit, and it would be a bit annoying to have to loop through the vec to find it.

What about a struct instead, with fields:

  • addition_limit
  • growth_limit
  • total_limit

Addition and total would just be Option<f64>. Since growth limits need a value and a seed, you'd store this either as a tuple or another struct (GrowthLimit) with named fields

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just want to check I understand this bit. For each process, region and year you can have any combination of the three constraints but only one of each type?

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes exactly.

@Aurashk Aurashk marked this pull request as ready for review December 8, 2025 15:16
@Aurashk
Copy link
Copy Markdown
Collaborator Author

Aurashk commented Dec 8, 2025

Thanks @tsmbland. Think the latest commit should address all your comments but let me know if I've misinterpreted anything. I went with the data structure you suggested for now, but haven't added Growth and Total constraints since we're implementing Addition first. Should be very straightforward to add them into this PR if it's useful though. Regarding the file format the only other possibility I can think of is having three separate files (all of them optional), one for each possible type of constraint i.e.
investment_constraints_addition.csv (identical to what this PR currently has):

process_id,regions,commission_years,addition_limit
WNDFRM,all,all,100

investment_constraints_growth.csv:

process_id,regions,commission_years,growth_limit, growth_seed
WNDFRM,all,all, 0.5, 10

investment_constraints_total.csv

process_id,regions,commission_years,total_limit
WNDFRM,all,all,1000

I guess it depends on if it's common to want all multiple constraints on the same process at once, or more common to have one type?

@Aurashk Aurashk requested a review from dalonsoa December 8, 2025 15:46
@Aurashk Aurashk changed the title added the investment_constraints input file Add process_investment_constraints input file Dec 8, 2025
Copy link
Copy Markdown
Collaborator

@tsmbland tsmbland left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly there! A couple of comments, and we'll need a yaml file in schemas/input

})?;

// Validate all parsed years are milestone years
for year in &constraint_years {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

parse_year_str will already do this check so we don't need this

})?;

// Parse associated commission years
let constraint_years = parse_year_str(&record.commission_years, milestone_years)
Copy link
Copy Markdown
Collaborator

@tsmbland tsmbland Dec 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be all milestone years within the year range of the process, rather than all milestone years

let constraints = vec![ProcessInvestmentConstraintRaw {
process_id: "process1".into(),
regions: "GBR".into(),
commission_years: "2025".into(), // Outside process years (2010-2020)
Copy link
Copy Markdown
Collaborator

@tsmbland tsmbland Dec 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps, but it's also not a milestone year, which is what's probably causing the error here. I don't think there are any checks in place on process years specifically (see above suggestion)?

&milestone_years,
);

assert!(result.is_err());
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's useful to check the contents of the error message. We have the assert_error! macro for this

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a few of these so I'd go through and fix them all up

@tsmbland
Copy link
Copy Markdown
Collaborator

tsmbland commented Dec 8, 2025

Thanks @tsmbland. Think the latest commit should address all your comments but let me know if I've misinterpreted anything. I went with the data structure you suggested for now, but haven't added Growth and Total constraints since we're implementing Addition first. Should be very straightforward to add them into this PR if it's useful though. Regarding the file format the only other possibility I can think of is having three separate files (all of them optional), one for each possible type of constraint i.e. investment_constraints_addition.csv (identical to what this PR currently has):

process_id,regions,commission_years,addition_limit
WNDFRM,all,all,100

investment_constraints_growth.csv:

process_id,regions,commission_years,growth_limit, growth_seed
WNDFRM,all,all, 0.5, 10

investment_constraints_total.csv

process_id,regions,commission_years,total_limit
WNDFRM,all,all,1000

I guess it depends on if it's common to want all multiple constraints on the same process at once, or more common to have one type?

Let's go with the current approach for now, and we can always change it down the line if it becomes limiting

@Aurashk
Copy link
Copy Markdown
Collaborator Author

Aurashk commented Dec 9, 2025

Thanks, I've added the yaml file and implemented the other improvements @tsmbland

Copy link
Copy Markdown
Collaborator

@tsmbland tsmbland left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great! Could you just add a note somewhere in the yaml that the constraint isn't applied yet? Also looks like there's duplicate test. Otherwise, good to go

assert_eq!(process_constraints.len(), 6);
}
#[rstest]
fn test_read_constraints_invalid_year(processes: ProcessMap) {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can get rid of this as it's equivalent to the test below, right?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, good catch!

add note about unused input invemstment constraint
@Aurashk Aurashk merged commit a82a2fe into main Dec 9, 2025
8 checks passed
@Aurashk Aurashk deleted the add-process-investment-constraints-input branch December 9, 2025 16:25
@tsmbland tsmbland mentioned this pull request Jan 26, 2026
11 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Allow for adding process investment constraints

2 participants