Add process_investment_constraints input file#1020
Conversation
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #1020 +/- ##
=======================================
Coverage 81.68% 81.69%
=======================================
Files 53 53
Lines 6608 6610 +2
Branches 6608 6610 +2
=======================================
+ Hits 5398 5400 +2
+ Misses 950 949 -1
- Partials 260 261 +1 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
src/process.rs
Outdated
| pub discount_rate: Dimensionless, | ||
| } | ||
|
|
||
| /// Value specification for different possible types of investment constraints |
There was a problem hiding this comment.
Need some input on this part:
I'm guessing in the end we don't want to store limit_type directly - we should just use it to store the correct parameters in InvestmentConstraintValue. I've also assumed the parameters can depend on ConstraintType but perhaps it's possible to simplify.
Is it enough to have one float stored in Addition which will tell you the yearly limit to apply? Or does it need to be a lower limit and an upper limit as decided by limit_type? I.e each agent needs to invest between low_lim and upper_lim per year. I'm a little confused because we also have a Limit ConstraintType and I'm not sure what that means in this context.
Not so worried about what fields are in InvestmentConstraintValue::Growth and InvestmentConstraintValue::Limit as we are starting with Addition but happy to take input on that too.
tsmbland
left a comment
There was a problem hiding this comment.
Good start. I have some comments which hopefully answer your questions. I would say for now we should just support upper limits. Hopefully that should simplify things.
I also wonder whether it might be simpler to have dedicated columns for each constraint type (similar to MUSE1), instead of the constraint_type column? For example,
process_id,regions,commission_years,constraint_type,value,growth_constraint_seed
WNDFRM,all,all,addition,100,
WNDFRM,all,all,growth,0.5,10
WNDFRM,all,all,total,1000,
would become
process_id,regions,commission_years,addition_limit,growth_limit,growth_seed,total_limit
WNDFRM,all,all,100,0.5,10,1000
Seems a bit clearer to me, only one extra column, and there's less onus on the user to remember the syntax for the constraint_type column. I also find the growth_constraint_seed column a bit confusing in the top format, especially when the constraint is not the "growth" type.
The downsides would be if the user only wants to define one type of constraint, or perhaps one constraint for all regions and another constraint for only some regions.
Not sure to be honest, happy to get other opinions
src/input/process/availability.rs
Outdated
| #[derive(DeserializeLabeledStringEnum)] | ||
| enum LimitType { | ||
| #[derive(DeserializeLabeledStringEnum, Debug, Clone, Copy, PartialEq, Eq)] | ||
| pub enum LimitType { |
There was a problem hiding this comment.
I've got rid of this in #1018 and gone for range syntax instead. We should probably do the same here.
That said, I wonder if it's worth just storing upper limits for now? Lower limits are a new idea that we don't have in MUSE1, and it seems a bit unclear to me how we might enforce these
| /// Represents a row of the process investment constraints CSV file | ||
| #[derive(Deserialize)] | ||
| struct ProcessInvestmentConstraintRaw { | ||
| constraint_name: String, |
There was a problem hiding this comment.
I realise this was in the schema, but I can't see why we'd need a name, so let's get rid of this
There was a problem hiding this comment.
I do have a slight pushback for this, when we give error messages for validation it might be useful to reference the name they provided so they can quickly find the associated constraint, if you think the inconvenience of having to define on isn't worth it though I'll get rid of it. We can still report the process ID and region etc
There was a problem hiding this comment.
I don't think we need it, to be honest - you can just quote the aberrant value and the user will be able to find it
e.g. "Addition constraint value must be non-negative for constraint '{}'" -> "Invalid value for addition constraint: -1; must be non-negative"
Assuming the user can do ctrl-f they should be able to find the constraint from just the value. In this case we don't need any more context because -1 is always invalid
| struct ProcessInvestmentConstraintRaw { | ||
| constraint_name: String, | ||
| process_id: String, | ||
| #[serde(default = "default_region")] |
There was a problem hiding this comment.
Let's not give a default, to be consistent with the other files
| process_id: String, | ||
| #[serde(default = "default_region")] | ||
| regions: String, | ||
| commission_years: u32, |
There was a problem hiding this comment.
This should be a String and parsed with parse_year_str (e.g. to allow "all" or "2020;2025"). See how we do this for process_parameters
src/process.rs
Outdated
| growth_constraint_seed: f64, | ||
| }, | ||
| /// Limit constraint: Not implemented yet | ||
| Limit {}, |
There was a problem hiding this comment.
I'd rename Limit to Total
The values will be:
Addition:addition_limitGrowth:growth_limit,growth_seedTotal:total_limit
There was a problem hiding this comment.
That said, I think storing a Vec<ProcessInvestmentConstraint>> probably isn't ideal because most likely in a particular part of the code we'll want to extract a particular limit, and it would be a bit annoying to have to loop through the vec to find it.
What about a struct instead, with fields:
addition_limitgrowth_limittotal_limit
Addition and total would just be Option<f64>. Since growth limits need a value and a seed, you'd store this either as a tuple or another struct (GrowthLimit) with named fields
There was a problem hiding this comment.
Just want to check I understand this bit. For each process, region and year you can have any combination of the three constraints but only one of each type?
|
Thanks @tsmbland. Think the latest commit should address all your comments but let me know if I've misinterpreted anything. I went with the data structure you suggested for now, but haven't added Growth and Total constraints since we're implementing Addition first. Should be very straightforward to add them into this PR if it's useful though. Regarding the file format the only other possibility I can think of is having three separate files (all of them optional), one for each possible type of constraint i.e. investment_constraints_growth.csv: investment_constraints_total.csv I guess it depends on if it's common to want all multiple constraints on the same process at once, or more common to have one type? |
tsmbland
left a comment
There was a problem hiding this comment.
Mostly there! A couple of comments, and we'll need a yaml file in schemas/input
| })?; | ||
|
|
||
| // Validate all parsed years are milestone years | ||
| for year in &constraint_years { |
There was a problem hiding this comment.
parse_year_str will already do this check so we don't need this
| })?; | ||
|
|
||
| // Parse associated commission years | ||
| let constraint_years = parse_year_str(&record.commission_years, milestone_years) |
There was a problem hiding this comment.
This should be all milestone years within the year range of the process, rather than all milestone years
| let constraints = vec![ProcessInvestmentConstraintRaw { | ||
| process_id: "process1".into(), | ||
| regions: "GBR".into(), | ||
| commission_years: "2025".into(), // Outside process years (2010-2020) |
There was a problem hiding this comment.
Perhaps, but it's also not a milestone year, which is what's probably causing the error here. I don't think there are any checks in place on process years specifically (see above suggestion)?
| &milestone_years, | ||
| ); | ||
|
|
||
| assert!(result.is_err()); |
There was a problem hiding this comment.
It's useful to check the contents of the error message. We have the assert_error! macro for this
There was a problem hiding this comment.
There's a few of these so I'd go through and fix them all up
Let's go with the current approach for now, and we can always change it down the line if it becomes limiting |
Co-authored-by: Tom Bland <t.bland@imperial.ac.uk>
…rocess years range add yaml for investment constraints file
|
Thanks, I've added the yaml file and implemented the other improvements @tsmbland |
tsmbland
left a comment
There was a problem hiding this comment.
Great! Could you just add a note somewhere in the yaml that the constraint isn't applied yet? Also looks like there's duplicate test. Otherwise, good to go
| assert_eq!(process_constraints.len(), 6); | ||
| } | ||
| #[rstest] | ||
| fn test_read_constraints_invalid_year(processes: ProcessMap) { |
There was a problem hiding this comment.
I think we can get rid of this as it's equivalent to the test below, right?
add note about unused input invemstment constraint
Description
Adds investment constraints associated with each process which limits the amount each agent can invest in it based on specified parameters
Initial implementation adds an
investment_constraintsmap to each process, which is keyed by region and associated year, and each key corresponds to a vector ofProcessInvestmentConstraintstructs.Fixes #124
Type of change
Key checklist
$ cargo test$ cargo docFurther checks