@divine7022
Collaborator

originally discussed -- comment

Description

Add a generate_OAT_SA_design() function that creates input design matrices for one-at-a-time (OAT) sensitivity analysis (SA).
SA requires isolating the effect of each parameter by holding all other inputs (met, IC, soil) constant. The existing generate_joint_ensemble_design() randomizes these inputs, which is correct for ensemble runs but invalidates the SA variance decomposition.

Added tests in test.input_design.R to validate the OAT sensitivity design, including checks that:

  • the total number of runs is calculated correctly
  • all non-parameter inputs (such as met, IC, and soil data) remain constant across OAT steps, as required for a valid SA
  • the OAT design differs from the ensemble design in the expected way (comparison test)
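
For readers unfamiliar with OAT designs, the idea behind the checks above can be sketched roughly as follows. This is a minimal illustration in base R, not the code in this PR: `make_oat_design()`, its arguments, and the column names are hypothetical, and the run count (one median run plus one run per parameter and non-median quantile) is an assumption about how the design is laid out.

```r
# Hypothetical sketch, not the PEcAn implementation.
# Builds a one-at-a-time (OAT) design: a single median run, then one run per
# (parameter, non-median quantile) pair. Non-parameter inputs never change.
make_oat_design <- function(param_names, quantiles,
                            fixed_inputs = c("met", "ic", "soil")) {
  non_median <- quantiles[quantiles != 0.5]
  grid <- expand.grid(quantile = non_median, param = param_names,
                      stringsAsFactors = FALSE)
  design <- rbind(
    # the median run perturbs no parameter, so `param` is NA
    data.frame(param = NA_character_, quantile = 0.5,
               stringsAsFactors = FALSE),
    grid[, c("param", "quantile")]
  )
  # hold every non-parameter input at the same index in every run
  for (input in fixed_inputs) {
    design[[input]] <- 1L
  }
  rownames(design) <- NULL
  design
}

# illustrative parameter names and quantiles
design <- make_oat_design(c("SLA", "Vcmax"), c(0.159, 0.5, 0.841))
```

An ensemble design would instead draw the `met`, `ic`, and `soil` indices at random for each run, which is the difference the comparison test is meant to expose.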

Motivation and Context

Review Time Estimate

  • Immediately
  • Within one week
  • When possible

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

  • My change requires a change to the documentation.
  • My name is in the list of CITATION.cff
  • I agree that PEcAn Project may distribute my contribution under any or all of
    • the same license as the existing code,
    • and/or the BSD 3-clause license.
  • I have updated the CHANGELOG.md.
  • I have updated the documentation accordingly.
  • I have read the CONTRIBUTING document.
  • I have added tests to cover my changes.
  • All new and existing tests passed.

Member

@mdietze mdietze left a comment

A good overall test that this function is working correctly would be to verify that the run.write.configs module runs correctly with the OAT design as an input and continues to generate output that's readable by the SA postprocessing and graphing functions. If it doesn't, then this function is generating inputs that serve no useful purpose. Along the way, we may want (or get) to simplify the run.write.configs logic.

#' all inputs vary together.
#'
#' @param settings PEcAn settings object
#' @param sa_samples Optional. Pre-loaded SA samples from samples.Rdata.
Member

This argument suggests a fundamental design misunderstanding. sa_samples should be generated by this function, not read by it. I'm OK with taking this as an optional input if one wants to reuse a previous sa_samples, but if it's not provided it needs to be generated

Collaborator Author

The parameter documentation now clearly indicates that it is optional and meant for reusing existing samples, not the expected input.

if (is.null(sa_samples)) {
samples_file <- file.path(settings$outdir, "samples.Rdata")

# generate samples if they don't exist (safety fallback)
Member

This should be the default, not the safety fallback

Collaborator Author

removed "safety fallback" comment
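
A minimal sketch of the pattern requested in this thread, where samples are generated by default and a supplied `sa_samples` is only a way to reuse earlier results. The names `resolve_sa_samples()`, `generate_fn`, and `fake_generator()` are hypothetical stand-ins for illustration, not PEcAn functions.

```r
# Hypothetical sketch of "generate by default, reuse if supplied".
# `generate_fn` stands in for whatever actually produces the SA samples.
resolve_sa_samples <- function(sa_samples = NULL, generate_fn) {
  if (!is.null(sa_samples)) {
    return(sa_samples)  # caller chose to reuse previously generated samples
  }
  generate_fn()         # default path: generate fresh samples
}

# toy generator returning one data frame of samples per PFT
fake_generator <- function() list(pft1 = data.frame(SLA = c(10, 15, 20)))

fresh  <- resolve_sa_samples(generate_fn = fake_generator)
reused <- resolve_sa_samples(sa_samples = list(pft1 = "cached"),
                             generate_fn = fake_generator)
```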


PEcAn.uncertainty::get.parameter.samples(
settings,
ensemble.size = 1, # SA doesn't need ensemble samples
Member

not needed as this is the default

Collaborator Author

removed, good catch!

settings,
ensemble.size = 1, # SA doesn't need ensemble samples
posterior.files,
ens.sample.method
Member

also not needed as this isn't an ensemble sample

#' @export
#' @author Akash B V
#' @importFrom rlang %||%
generate_OAT_SA_design <- function(settings, sa_samples = NULL) {
Member

It might be too much for this PR, but it would be good to move away from settings as a dependency unless the underlying number of pieces of information is too large to meaningfully pass to the function. But in that case it would be good to document exactly what part of the settings is required. Here I think it might just be settings$outdir, settings$pfts, settings$sensitivity.analysis, and settings$ensemble (i.e. I wonder if you could get away with a function that has outdir, pfts, sensitivity.analysis, ensemble, and sa_samples as arguments?). It would also be good to better document semi-hidden dependencies (e.g. what does the pft$posterior.files need to point to for the function to actually sample parameters correctly?).

Collaborator Author

I agree with reducing the dependency on settings in the input design functions by passing specific arguments. But after analyzing both functions to keep the architecture consistent, I found that their parameter requirements are not the same: generate_OAT_SA_design needs outdir, pfts, samplingspace, and sensitivity.analysis, while generate_joint_ensemble_design additionally needs run$inputs (via input.ens.gen(), which samples from settings$run$inputs[[input]]$path).
However, there is a deeper blocker: both functions call get.parameter.samples(settings, ...), which itself uses many settings fields (database$bety, host$name, sensitivity.analysis, etc.). So even with explicit parameters in the design functions, we'd still pass settings through to get.parameter.samples, and changing that would involve refactoring the SDA and Sobol callers.
Anyway, I have documented which settings fields are used, along with the semi-hidden dependencies, in both design functions. Happy to hear your thoughts.

Member

I think it's fine if the parameter requirements are not the same. I also think it's fine to push the refactor of the generate design functions and get.parameter.samples to a future PR.
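
For the future refactor discussed in this thread, one possible shape is a core function with explicit arguments plus a thin wrapper that unpacks settings. The sketch below is hypothetical: `oat_design_core()` and `oat_design_from_settings()` are invented names, the body is a placeholder that only echoes what it received, and the contents of the example `settings` list are illustrative rather than a real PEcAn settings object.

```r
# Hypothetical sketch: explicit arguments in the core, settings only in a wrapper.
oat_design_core <- function(outdir, pfts, sensitivity.analysis,
                            sa_samples = NULL) {
  # validate the few pieces of information the function actually depends on
  stopifnot(is.character(outdir), length(outdir) == 1L, is.list(pfts))
  # placeholder body: a real implementation would build the design here
  list(
    outdir    = outdir,
    n_pfts    = length(pfts),
    quantiles = sensitivity.analysis$quantiles,
    reused    = !is.null(sa_samples)
  )
}

# thin wrapper that keeps existing settings-based callers working
oat_design_from_settings <- function(settings, sa_samples = NULL) {
  oat_design_core(
    outdir               = settings$outdir,
    pfts                 = settings$pfts,
    sensitivity.analysis = settings$sensitivity.analysis,
    sa_samples           = sa_samples
  )
}

# illustrative stand-in for a settings object
settings <- list(
  outdir = tempdir(),
  pfts = list(list(name = "example.pft")),
  sensitivity.analysis = list(quantiles = c(0.159, 0.5, 0.841))
)
res <- oat_design_from_settings(settings)
```

The signature of the core function then documents exactly which parts of settings are required, which is the point raised above.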

@mdietze
Member

mdietze commented Dec 26, 2025

── Failed tests ────────────────────────────────────────────────────────────────
Error ('test.input_design.R:165:3'): OAT design integrates with write.sa.configs for SA postprocessing
Error in PEcAn.uncertainty::write.sa.configs(defaults = settings$pfts, quantile.samples = sa_samples, settings = settings, model = "FAKE", write.to.db = FALSE, input_design = input_design): unused argument (input_design = input_design)

@divine7022
Collaborator Author

divine7022 commented Dec 26, 2025

PEcAn.uncertainty::write.sa.configs(defaults = settings$pfts, quantile.samples = sa_samples, settings = settings, model = "FAKE", write.to.db = FALSE, input_design = input_design): unused argument (input_design = input_design)

We are passing input_design as a new parameter to write.sa.configs() in #3708; all tests passed when I branched from that PR.
This will be fixed once #3708 is merged.
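
For context, R raises this "unused argument" error whenever a call names an argument that the function's formals do not include and the function has no `...`. The sketch below reproduces that with stand-in functions; `old_writer()`, `new_writer()`, and `accepts_arg()` are hypothetical and are not the real write.sa.configs().

```r
# Stand-ins for the function before and after the new argument is added.
old_writer <- function(settings, write.to.db = TRUE) "old"
new_writer <- function(settings, write.to.db = TRUE, input_design = NULL) "new"

# TRUE if `fn` declares a formal argument called `arg`
accepts_arg <- function(fn, arg) arg %in% names(formals(fn))

# calling the old signature with the new argument fails at call time
err <- tryCatch(
  old_writer(settings = list(), input_design = 1),
  error = function(e) conditionMessage(e)
)

old_ok <- accepts_arg(old_writer, "input_design")
new_ok <- accepts_arg(new_writer, "input_design")
```

A check like `accepts_arg()` could be used to skip the integration test until the dependency is merged, for example through testthat's `skip_if_not()`, though whether that is preferable to merging the PRs in order is a judgment call.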

@divine7022
Collaborator Author

https://github.com/PecanProject/pecan/actions/runs/20528083080/job/58974923917?pr=3729

modules/meta.analysis/vignettes/meta-analysis-demo.Rmd is failing because of use().
It seems that #3728 introduced this failure.
Is there any reason to use use() here instead of library()?
