31 commits
2b08818
support SA input_design
divine7022 Dec 7, 2025
efc6a33
update .Rd
divine7022 Dec 7, 2025
ceb3573
add rlang to import
divine7022 Dec 7, 2025
31b4c12
update namespace
divine7022 Dec 7, 2025
5902fdb
runs are now stored in runs_manifest.csv
divine7022 Dec 7, 2025
16499f4
update run.write.configs.Rd
divine7022 Dec 7, 2025
6d234f9
write configs to use runs_manifest.csv and vectorized variable readin…
divine7022 Dec 7, 2025
7e62bef
write configs to use runs_manifest.csv and vectorized variable readin…
divine7022 Dec 7, 2025
a223998
update read.sa.output.Rd
divine7022 Dec 7, 2025
8223f1d
update write.sa.configs.Rd
divine7022 Dec 7, 2025
770814f
use runs_manifest.csv in load_ensemble
divine7022 Dec 7, 2025
20d14e0
update changelog
divine7022 Dec 7, 2025
dd525d1
update NEWS.md
divine7022 Dec 7, 2025
ceb53cc
add comment
divine7022 Dec 7, 2025
982cd35
update roxygen
divine7022 Dec 7, 2025
fc86545
update write.sa.configs.Rd
divine7022 Dec 7, 2025
78bf963
fix indentation
divine7022 Dec 7, 2025
3ce205b
add rlang to DESCRIPTION
divine7022 Dec 8, 2025
20643d5
update docker depends
divine7022 Dec 8, 2025
502622e
remove 'samples' ref
divine7022 Dec 9, 2025
c9be470
check the priority of input design and handle the edge case which was…
divine7022 Dec 10, 2025
d06704d
Merge remote-tracking branch 'origin/develop' into run-manifest
divine7022 Dec 10, 2025
1918d46
remove separate input_design arguments and consolidate into a single input_design
divine7022 Dec 31, 2025
60d80b3
update .Rd
divine7022 Dec 31, 2025
9a6c512
removed separate design parameters and encapsulated redundant chunk i…
divine7022 Dec 31, 2025
0aa313e
update runModule.run.write.configs.Rd
divine7022 Dec 31, 2025
d8682e3
simplify by extracting run_ids once
divine7022 Dec 31, 2025
7cc8762
added todo comment to benchmark, defer to future PR if complex post-p…
divine7022 Dec 31, 2025
80fd907
add dot-prepare_input_designs.Rd
divine7022 Dec 31, 2025
708254b
add tests for multisite SA handling by run.write.configs
divine7022 Dec 31, 2025
af8f89e
add tests to check coordination of newly added input_design argument to w…
divine7022 Dec 31, 2025
2 changes: 2 additions & 0 deletions CHANGELOG.md
@@ -53,6 +53,8 @@ For more information about this file see also [Keep a Changelog](http://keepacha
- Stopped testing on R 4.1, started testing on R 4.5, and updated prebuilt Docker images to match -- they are now available for R releases 4.2 through 4.5 as well as for R under development.
- `write.config.STICS()` now modifies parameters with vectors rather than individually.
- Code for DART has been moved from `modules/` to `contrib/` and its license more clearly described.
- Sensitivity analysis and ensemble runs now generate separate input design matrices with appropriate dimensions, fixing dimension mismatch errors in multisite workflows. (#3708)
- Generated runs are now stored in a `runs_manifest.csv` file in the output directory instead of modifying `samples.Rdata` (#3708)



1 change: 1 addition & 0 deletions base/workflow/DESCRIPTION
@@ -40,6 +40,7 @@ Imports:
PEcAn.uncertainty,
PEcAn.utils,
purrr (>= 0.2.3),
rlang,
XML
Suggests:
mockery,
1 change: 1 addition & 0 deletions base/workflow/NAMESPACE
@@ -10,3 +10,4 @@ export(runModule_start_model_runs)
export(start_model_runs)
importFrom(dplyr,"%>%")
importFrom(dplyr,.data)
importFrom(rlang,"%||%")
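The `%||%` operator imported here is rlang's null-default operator: it returns its left-hand side unless that is `NULL`, in which case it returns the right-hand side. A minimal sketch of the fallback pattern this PR uses for the ensemble size (`settings$ensemble$size %||% 1`). The operator is redefined locally only so the snippet runs without rlang, and the two `settings_*` lists are made-up stand-ins for PEcAn settings objects.

```r
# Local stand-in with the same semantics as rlang's `%||%`
# (defined here only so the example runs without rlang installed).
`%||%` <- function(x, y) if (is.null(x)) y else x

# Hypothetical, minimal stand-ins for PEcAn settings lists.
settings_with_size <- list(ensemble = list(size = 10))
settings_without_size <- list(ensemble = list())

# When size is set it is used; when it is missing the default of 1 applies.
size_a <- settings_with_size$ensemble$size %||% 1
size_b <- settings_without_size$ensemble$size %||% 1

print(size_a)
print(size_b)
```

Note that the fallback only triggers on `NULL`; a size of `0` or `NA` would pass through unchanged.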
2 changes: 2 additions & 0 deletions base/workflow/NEWS.md
@@ -8,6 +8,8 @@
- **New Default**: The sampling method default has changed from "uniform" to "random"
- **Impact**: This ensures consistency across sites in ensemble runs but may produce different results compared to previous versions

* Sensitivity analysis and ensemble runs now generate separate input design matrices with appropriate dimensions, fixing dimension mismatch errors in multisite workflows. (#3708)
* Generated runs are now stored in a `runs_manifest.csv` file in the output directory instead of modifying `samples.Rdata` (#3708)
Comment on lines +11 to +12
Review comment (Member):

Once 1.10.0 is released, these entries will need to move under the next unreleased heading, but that has to wait until then.


# PEcAn.workflow 1.9.0

68 changes: 47 additions & 21 deletions base/workflow/R/run.write.configs.R
@@ -8,11 +8,12 @@
#'
#' @param settings a PEcAn settings list
#' @param ensemble.size number of ensemble runs
#' @param input_design input indices for samples
#' @param write should the runs be written to the database?
#' @param posterior.files Filenames for posteriors for drawing samples for ensemble and sensitivity
#' analysis (e.g. post.distns.Rdata, or prior.distns.Rdata)
#' @param overwrite logical: Replace output files that already exist?
#' @param input_design Input design specification. A list with \code{ensemble} and/or
#' \code{sensitivity} entries, each containing a data.frame of input indices.
#'
#' @details The default value for \code{posterior.files} is NA, in which case the
#' most recent posterior or prior (in that order) for the workflow is used.
@@ -23,11 +24,16 @@
#' @return an updated settings list, which includes ensemble IDs for SA and ensemble analysis
#' @export
#'
#' @author David LeBauer, Shawn Serbin, Ryan Kelly, Mike Dietze
#' @author David LeBauer, Shawn Serbin, Ryan Kelly, Mike Dietze, Akash B V

run.write.configs <- function(settings, ensemble.size, input_design, write = TRUE,
posterior.files = rep(NA, length(settings$pfts)),
overwrite = TRUE) {

# extract designs from input_design list
input_design_ens <- if (!is.null(input_design)) input_design$ensemble else NULL
input_design_sa <- if (!is.null(input_design)) input_design$sensitivity else NULL
Review comment (Member):

Per earlier comments, I'd recommend only having one input_design, not separate ens and sa designs


## Skip database connection if settings$database is NULL or write is False
if (!isTRUE(write) && is.null(settings$database)) {
PEcAn.logger::logger.info("Not writing this run to database, so database connection skipped")
@@ -105,10 +111,10 @@ run.write.configs <- function(settings, ensemble.size, input_design, write = TRU

samples.file <- file.path(settings$outdir, "samples.Rdata")
if (file.exists(samples.file)) {
samples <- new.env()
load(samples.file, envir = samples) ## loads ensemble.samples, trait.samples, sa.samples, runs.samples, env.samples
trait.samples <- samples$trait.samples
trait_sample_indices <- input_design[["param"]]
existing_data <- new.env()
load(samples.file, envir = existing_data) ## loads ensemble.samples, trait.samples, sa.samples, runs.samples, env.samples
trait.samples <- existing_data$trait.samples
trait_sample_indices <- input_design_ens[["param"]]
ensemble.samples <- list()
for (pft in names(trait.samples)) {
pft_traits <- trait.samples[[pft]]
@@ -120,9 +126,9 @@ run.write.configs <- function(settings, ensemble.size, input_design, write = TRU
)
names(ensemble.samples[[pft]]) <- names(pft_traits)
}
sa.samples <- samples$sa.samples
runs.samples <- samples$runs.samples
## env.samples <- samples$env.samples
sa.samples <- existing_data$sa.samples
## runs.samples <- existing_data$runs.samples
## env.samples <- existing_data$env.samples
} else {
PEcAn.logger::logger.error(samples.file, "not found, this file is required by the run.write.configs function")
}
@@ -159,6 +165,9 @@ run.write.configs <- function(settings, ensemble.size, input_design, write = TRU
pft.names <- names(trait.samples)
trait.names <- lapply(trait.samples, names)

# Initialize the Manifest Dataframe
run_manifest_df <- data.frame()

### NEED TO IMPLEMENT: Load Environmental Priors and Posteriors

### Sensitivity Analysis
@@ -170,11 +179,17 @@ run.write.configs <- function(settings, ensemble.size, input_design, write = TRU
quantile.samples = sa.samples,
settings = settings,
model = model,
input_design = input_design_sa,
write.to.db = write
)

# collect manifest data
if ("manifest" %in% names(sa.runs)) {
run_manifest_df <- rbind(run_manifest_df, sa.runs$manifest)
}

# Store output in settings and output variables
runs.samples$sa <- sa.run.ids <- sa.runs$runs
sa.run.ids <- sa.runs$runs
settings$sensitivity.analysis$ensemble.id <- sa.ensemble.id <- sa.runs$ensemble.id

# Save sensitivity analysis info
@@ -192,12 +207,17 @@ run.write.configs <- function(settings, ensemble.size, input_design, write = TRU
ensemble.samples = ensemble.samples,
settings = settings,
model = model,
input_design = input_design,
input_design = input_design_ens,
write.to.db = write
)

# collect manifest data
if ("manifest" %in% names(ens.runs)) {
run_manifest_df <- rbind(run_manifest_df, ens.runs$manifest)
}

# Store output in settings and output variables
runs.samples$ensemble <- ens.run.ids <- ens.runs$runs
ens.run.ids <- ens.runs$runs
settings$ensemble$ensemble.id <- ens.ensemble.id <- ens.runs$ensemble.id
ens.samples <- ensemble.samples # rename just for consistency

@@ -211,13 +231,19 @@ run.write.configs <- function(settings, ensemble.size, input_design, write = TRU
PEcAn.logger::logger.info("###### Finished writing model run config files #####")
PEcAn.logger::logger.info("config files samples in ", file.path(settings$outdir, "run"))

### Save output from SA/Ensemble runs
# A lot of this is duplicate with the ensemble/sa specific output above, but kept for backwards compatibility.
save(ensemble.samples, trait.samples, sa.samples, runs.samples, pft.names, trait.names,
file = file.path(settings$outdir, "samples.Rdata")
)
PEcAn.logger::logger.info("parameter values for runs in ", file.path(settings$outdir, "samples.RData"))
# write runs manifest
manifest.file <- file.path(settings$outdir, "runs_manifest.csv")

# always write manifest (even if empty) so downstream knows workflow completed
utils::write.table(run_manifest_df,
file = manifest.file,
sep = ",",
row.names = FALSE,
col.names = overwrite || !file.exists(manifest.file),
append = !overwrite)

PEcAn.logger::logger.info("Run manifest written to ", manifest.file)

options(scipen = scipen)
invisible(settings)
return(settings)
}
return(invisible(settings))
}
Review comment (Member):

Nit: Please don't remove terminating newlines. These days the rule is mostly just to avoid needless extra changes, but there do exist some (mostly ancient 😅 ) tools that complain if it isn't there.
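The manifest-writing step in the diff above ties the header and append behaviour to `overwrite`: with `overwrite = TRUE` the file is rewritten with a header, and with `overwrite = FALSE` rows are appended and the header is written only if the file does not exist yet. This matters for MultiSettings, where each per-site call is made with `overwrite = FALSE`. A minimal sketch of that behaviour follows; `write_manifest()` is a hypothetical wrapper around the same `utils::write.table()` call, and the `run_id` and `type` columns are illustrative assumptions, not the PR's actual manifest schema.

```r
# Hypothetical helper mirroring the write.table() call in the diff.
write_manifest <- function(df, path, overwrite = TRUE) {
  utils::write.table(df,
    file = path,
    sep = ",",
    row.names = FALSE,
    # header only on a fresh write or when the file does not exist yet
    col.names = overwrite || !file.exists(path),
    append = !overwrite
  )
}

manifest_file <- tempfile(fileext = ".csv")

# Column names are illustrative assumptions, not the PR's real schema.
site1 <- data.frame(run_id = c("r1", "r2"), type = "ensemble")
site2 <- data.frame(run_id = c("r3", "r4"), type = "ensemble")

write_manifest(site1, manifest_file, overwrite = TRUE)   # fresh file with header
write_manifest(site2, manifest_file, overwrite = FALSE)  # rows appended, no second header

manifest <- utils::read.csv(manifest_file)
print(manifest)
```

Appending relies on every call producing the same columns in the same order, since only the first write emits a header.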

117 changes: 96 additions & 21 deletions base/workflow/R/runModule.run.write.configs.R
@@ -2,47 +2,63 @@
#'
#' @param settings a PEcAn Settings or MultiSettings object
#' @param overwrite logical: Replace config files if they already exist?
#' @param input_design the input indices for samples
#' @param input_design Optional. Input design specification. Can be:
#' \itemize{
#' \item A list with \code{ensemble} and/or \code{sensitivity} entries
Review comment (Member):

Suggested change
#' \item A list with \code{ensemble} and/or \code{sensitivity} entries

#' \item A single data.frame (interpreted as ensemble design)
#' \item NULL to auto-generate designs based on settings
#' }
#'
#' @return A modified settings object, invisibly
#'
#' @details
#' This function serves as the orchestration layer between PEcAn workflows and
#' the config-writing machinery. It generates appropriate input designs
#' (ensemble and/or SA) if not provided. For MultiSettings, it generates designs once
#' from the first site then shares across all sites for consistent sampling. Finally,
#' it delegates to \code{\link{run.write.configs}} for actual config generation.
#' The input design determines how parameter samples and input files (met, soil,
#' etc.) are coordinated across runs. Ensemble designs typically use random or
#' quasi-random sampling, while SA designs hold non-parameter inputs constant
#' (OAT methodology).
#'
#' @importFrom dplyr %>%
#' @importFrom rlang %||%
#' @export


runModule.run.write.configs <- function(settings,
overwrite = TRUE,
input_design = NULL) {

if (PEcAn.settings::is.MultiSettings(settings)) {
if (overwrite && file.exists(file.path(settings$rundir, "runs.txt"))) {
PEcAn.logger::logger.warn("Existing runs.txt file will be removed.")
unlink(file.path(settings$rundir, "runs.txt"))
}
if (is.null(input_design)) {
ensemble_size <- settings$ensemble$size
design_result <- PEcAn.uncertainty::generate_joint_ensemble_design(
settings = settings[1],
ensemble_size = ensemble_size
)
input_design <- design_result$X
}

# prepare designs once for all sites (consistent sampling)
designs <- .prepare_input_designs(settings[1], input_design)

return(PEcAn.settings::papply(settings,
runModule.run.write.configs,
overwrite = FALSE,
input_design = input_design))
input_design = designs))

} else if (PEcAn.settings::is.Settings(settings)) {
# double check making sure we have method for parameter sampling
if (is.null(settings$ensemble$samplingspace$parameters$method)) {
settings$ensemble$samplingspace$parameters$method <- "uniform"
}
if (is.null(input_design)) {
ensemble_size <- settings$ensemble$size
design_result <- PEcAn.uncertainty::generate_joint_ensemble_design(
settings = settings,
ensemble_size = ensemble_size
)
input_design <- design_result$X
}
ensemble_size <- nrow(input_design)

# prepare designs (may already be normalized from MultiSettings)
designs <- .prepare_input_designs(settings, input_design)

# determine ensemble size from design
ensemble_size <- if (!is.null(designs$ensemble)) {
nrow(designs$ensemble)
} else {
settings$ensemble$size %||% 1
}

# check to see if there are posterior.files tags under pft
posterior.files <- settings$pfts %>%
Expand All @@ -54,9 +70,68 @@ runModule.run.write.configs <- function(settings,
write = isTRUE(settings$database$bety$write), # treat null as FALSE
posterior.files = posterior.files,
overwrite = overwrite,
input_design = input_design
input_design = designs
))
} else {
stop("runModule.run.write.configs only works with Settings or MultiSettings")
}
}


#' Prepare input designs for ensemble and sensitivity analysis
#'
#' Normalizes and generates input design matrices. This helper ensures
#' consistent handling of the various input_design formats and
#' auto-generates designs when needed.
#'
#' @param settings A single PEcAn settings object
#' @param input_design Input design specification (see \code{runModule.run.write.configs})
#' @return A list with \code{ensemble} and \code{sensitivity} entries (each a data.frame or NULL)
#'
#' @details
#' Input normalization rules:
#' \itemize{
#' \item If \code{input_design} is already a list with \code{ensemble}/\code{sensitivity}
#' keys, return as-is
#' \item If \code{input_design} is a single data.frame, interpret as ensemble design
#' \item If NULL and \code{settings$ensemble} exists, generate via
#' \code{generate_joint_ensemble_design}
#' \item If NULL and \code{settings$sensitivity.analysis} exists, generate via
#' \code{generate_OAT_SA_design}
#' }
#'
#' @keywords internal

.prepare_input_designs <- function(settings, input_design) {

# already normalized? return as-is
if (is.list(input_design) &&
any(c("ensemble", "sensitivity") %in% names(input_design))) {
return(input_design)
}

designs <- list(ensemble = NULL, sensitivity = NULL)

# single data.frame = ensemble design
if (is.data.frame(input_design)) {
designs$ensemble <- input_design
}

# generate ensemble design if needed
if (is.null(designs$ensemble) && "ensemble" %in% names(settings)) {
ensemble_size <- settings$ensemble$size %||% 1
design_result <- PEcAn.uncertainty::generate_joint_ensemble_design(
settings = settings,
ensemble_size = ensemble_size
)
designs$ensemble <- design_result$X
}

# generate SA design if needed
if (is.null(designs$sensitivity) && "sensitivity.analysis" %in% names(settings)) {
design_result <- PEcAn.uncertainty::generate_OAT_SA_design(settings)
designs$sensitivity <- design_result$X
}

return(designs)
}
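The normalization rules of `.prepare_input_designs()` can be exercised without PEcAn installed. The sketch below copies the control flow from the diff but replaces the two `PEcAn.uncertainty` generators with function arguments; `prepare_designs_sketch()`, `fake_ensemble()`, `fake_sa()`, and the toy `settings` list are all hypothetical names introduced for illustration.

```r
# Hypothetical re-implementation of the normalization rules, with the
# PEcAn.uncertainty generators swapped for function arguments.
prepare_designs_sketch <- function(settings, input_design, make_ensemble, make_sa) {
  # already normalized? return as-is
  if (is.list(input_design) &&
      any(c("ensemble", "sensitivity") %in% names(input_design))) {
    return(input_design)
  }

  designs <- list(ensemble = NULL, sensitivity = NULL)

  # single data.frame = ensemble design
  if (is.data.frame(input_design)) {
    designs$ensemble <- input_design
  }

  # generate whatever is still missing and requested in settings
  if (is.null(designs$ensemble) && "ensemble" %in% names(settings)) {
    designs$ensemble <- make_ensemble(settings)
  }
  if (is.null(designs$sensitivity) && "sensitivity.analysis" %in% names(settings)) {
    designs$sensitivity <- make_sa(settings)
  }
  designs
}

# Stub generators returning toy designs (assumed shapes, not PEcAn output).
fake_ensemble <- function(settings) data.frame(param = seq_len(3), met = c(1, 2, 1))
fake_sa <- function(settings) data.frame(param = seq_len(5), met = 1)

settings <- list(ensemble = list(size = 3), sensitivity.analysis = list())

from_null <- prepare_designs_sketch(settings, NULL, fake_ensemble, fake_sa)
from_df <- prepare_designs_sketch(settings, data.frame(param = 1:2), fake_ensemble, fake_sa)
from_list <- prepare_designs_sketch(
  settings, list(ensemble = data.frame(param = 1:4)), fake_ensemble, fake_sa
)
```

One consequence visible in this sketch: a list that supplies only `ensemble` is returned unchanged, so no SA design is generated for it even when `settings` has a `sensitivity.analysis` section. Whether that is intended in the PR is not stated in the source.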
34 changes: 34 additions & 0 deletions base/workflow/man/dot-prepare_input_designs.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

5 changes: 3 additions & 2 deletions base/workflow/man/run.write.configs.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.
