Refactor workflow to stateless run manifest architecture and SA input design coordination #3708

divine7022 · 2025-12-07T20:16:55Z

Description

Updated run.write.configs to generate a runs_manifest.csv file containing run metadata (run_id, site_id, pft_name, trait, quantile, type) instead of appending to samples.Rdata.
samples.Rdata is now treated as a read-only "master parameter definition" file.
Updated write.sa.configs and write.ensemble.configs to return manifest data frames.
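To make the manifest layout concrete, here is a minimal sketch of a data frame with the six columns listed above being written to and read back from runs_manifest.csv. The helper name make_manifest_rows, the run IDs, the site and PFT names, and the type labels are all made up for illustration; the actual code in write.sa.configs and write.ensemble.configs may build the rows differently.

```r
# Hypothetical helper (not part of PEcAn): one manifest row per run.
make_manifest_rows <- function(run_ids, site_id, pft_name, trait, quantile, type) {
  data.frame(
    run_id = as.character(run_ids),
    site_id = site_id,
    pft_name = pft_name,
    trait = trait,
    quantile = quantile,
    type = type,
    stringsAsFactors = FALSE
  )
}

# Illustrative values only.
sa_rows <- make_manifest_rows(
  run_ids = c("SA-median", "SA-SLA-15.9"),
  site_id = "site-1", pft_name = "example.pft",
  trait = c(NA_character_, "SLA"), quantile = c(50, 15.9),
  type = "sensitivity"
)
ens_rows <- make_manifest_rows(
  run_ids = c("ENS-00001", "ENS-00002"),
  site_id = "site-1", pft_name = "example.pft",
  trait = NA_character_, quantile = NA_real_,
  type = "ensemble"
)
run_manifest_df <- rbind(sa_rows, ens_rows)

# Write the manifest and read it back, as a downstream consumer would.
manifest_file <- file.path(tempdir(), "runs_manifest.csv")
utils::write.csv(run_manifest_df, manifest_file, row.names = FALSE)
manifest_back <- utils::read.csv(manifest_file, stringsAsFactors = FALSE)
```

Because the manifest is a plain CSV, consumers can filter it by type or trait without loading samples.Rdata.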

Refactored read.sa.output and read.ensemble.output to pass the full vector of requested variables to read.output in a single call.
This eliminates the inefficient inner loop that opened and closed NetCDF files once for every individual variable.
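The I/O difference can be illustrated with a stand-in reader that counts how often it is called. mock_read_output below is an assumption made purely for this sketch and is not PEcAn.utils::read.output; it only mimics "one call means one round of file opens".

```r
# Counter for simulated file opens.
open_count <- 0

# Stand-in reader (hypothetical): each call counts as one open/close cycle
# and returns a named list with one dummy series per requested variable.
mock_read_output <- function(run_id, variables) {
  open_count <<- open_count + 1
  stats::setNames(lapply(variables, function(v) seq_len(3)), variables)
}

variables <- c("NPP", "GPP", "LAI")

# Old pattern: one call (and one simulated open) per variable.
open_count <- 0
per_variable <- lapply(variables, function(v) mock_read_output("run-1", v)[[v]])
opens_per_variable <- open_count

# New pattern: the full variable vector in a single call.
open_count <- 0
all_at_once <- mock_read_output("run-1", variables)
opens_single_call <- open_count
```

In this toy setup the per-variable loop scales with the number of variables, while the single call does not.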

Additionally, this PR separates input design generation for SA and ensemble runs, fixing dimension mismatch errors in multisite workflows.
Previously both SA and ensemble used a single input_design matrix, which failed whenever the two needed different numbers of runs:

Ensemble needed N runs (e.g. 50)
SA needed 1 (median) + (traits * quantiles) runs (e.g. 145)

A single matrix cannot have both row counts, so input design coordination failed with dimension mismatch errors.
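A small arithmetic sketch of the mismatch follows. The split of 145 into 24 traits and 6 quantiles is an assumption chosen only because it reproduces the example number; design_fits is a hypothetical check, not a PEcAn function.

```r
# Ensemble: N runs.
n_ensemble <- 50

# SA: 1 median run plus one run per trait/quantile combination.
# 24 traits x 6 quantiles is an assumed breakdown that yields 145.
n_traits <- 24
n_quantiles <- 6
n_sa <- 1 + n_traits * n_quantiles

# A single shared design sized for the ensemble (one row per run).
shared_design <- data.frame(met = seq_len(n_ensemble))

# Hypothetical check: a design fits only if it has one row per run.
design_fits <- function(design, n_runs) nrow(design) == n_runs

fits_ensemble <- design_fits(shared_design, n_ensemble)
fits_sa <- design_fits(shared_design, n_sa)
```

The shared design matches the ensemble but has 95 rows too few for the SA, which is the situation the separate designs are meant to avoid.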

runModule.run.write.configs generates separate ens_input_design and sa_input_design matrices.
run.write.configs accepts both input_design_ens and input_design_sa parameters.
The old input_design parameter still works, with a deprecation warning.

Motivation and Context

Review Time Estimate

Immediately
Within one week
When possible

Types of changes

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

My change requires a change to the documentation.
My name is in the list of CITATION.cff
I agree that PEcAn Project may distribute my contribution under any or all of
- the same license as the existing code,
- and/or the BSD 3-clause license.
I have updated the CHANGELOG.md.
I have updated the documentation accordingly.
I have read the CONTRIBUTING document.
I have added tests to cover my changes.
All new and existing tests passed.

infotroph · 2025-12-12T09:48:53Z

base/workflow/R/run.write.configs.R

+#' @param input_design DEPRECATED. Use input_design_ens and input_design_sa instead
+#' @param input_design_ens Input design matrix for ensemble analysis
+#' @param input_design_sa Input design matrix for sensitivity analysis


I'm not a fan of having this be three separate arguments. What if we kept input_design, expecting a list with entries ensemble and sensitivity and also accepting a single design object (interpreting it as being for the ensemble)?

(We discussed live that it'd be nice to encode SA and ensemble setup into the design such that run.write.configs doesn't even need to know which one it's working with, but that's still well in the future)
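A minimal sketch of the "list or single design" idea described in this comment. normalize_input_design is a hypothetical name, and the rule that a data frame is treated as a single ensemble design is an assumption about how a design object is represented.

```r
# Hypothetical normalizer: accept NULL, a single design (data frame),
# or a list with `ensemble` and/or `sensitivity` entries.
normalize_input_design <- function(input_design) {
  if (is.null(input_design)) {
    return(list(ensemble = NULL, sensitivity = NULL))
  }
  is_design_list <- is.list(input_design) && !is.data.frame(input_design)
  if (!is_design_list) {
    # A bare design object is interpreted as the ensemble design.
    return(list(ensemble = input_design, sensitivity = NULL))
  }
  entry_names <- names(input_design)
  unknown <- setdiff(entry_names, c("ensemble", "sensitivity"))
  if (is.null(entry_names) || length(unknown) > 0) {
    stop("input_design list may only contain 'ensemble' and 'sensitivity' entries")
  }
  list(ensemble = input_design$ensemble, sensitivity = input_design$sensitivity)
}

ens_design <- data.frame(met = 1:3)
sa_design <- data.frame(met = rep(1, 5))

from_single <- normalize_input_design(ens_design)
from_list <- normalize_input_design(list(ensemble = ens_design, sensitivity = sa_design))
```

Normalizing once at the top of the function keeps the rest of the code from having to branch on the argument's shape.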

I 100% agree with @infotroph that we shouldn't have multiple input_design arguments. Where I disagree is that I don't think it makes sense to have ensemble and sensitivity elements. I think we should move to a place where a single call to run.write.configs runs a single input_design. If you want to run both an ensemble projection and a SA then you should call run.write.configs twice with different input_designs (one for SA one for ensemble). In other words, I think the SA/ensemble distinction should be deprecated at this point w.r.t. run.write.configs. For example, a Sobol SA is already actually executed via the "ensemble" pathway.

infotroph · 2025-12-12T09:56:07Z

base/workflow/R/run.write.configs.R

  invisible(settings)
  return(settings)
-}
+}


Nit: Please don't remove terminating newlines. These days the rule is mostly just to avoid needless extra changes, but there do exist some (mostly ancient 😅 ) tools that complain if it isn't there.

infotroph · 2025-12-12T10:04:12Z

base/workflow/R/run.write.configs.R

+
+  manifest.file <- file.path(settings$outdir, "runs_manifest.csv")
+
+  if (nrow(run_manifest_df) > 0) {


Is it OK for the manifest to be empty? If so I'd favor writing a zero-row file so it's easy to tell that afterwards; if not OK then should we throw an error here instead of skipping?

I think the better approach here is to always write the manifest:
that distinguishes "workflow completed with no runs" from "workflow crashed before writing", and lets downstream consumers handle the empty case.
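As a sketch of the "always write" option: writing a zero-row data frame still produces a file containing the header line, so a consumer can tell "no runs" apart from "no file". The column names follow the PR description; the temp-file path is only for illustration.

```r
# Zero-row manifest with the expected columns.
empty_manifest <- data.frame(
  run_id = character(0),
  site_id = character(0),
  pft_name = character(0),
  trait = character(0),
  quantile = numeric(0),
  type = character(0),
  stringsAsFactors = FALSE
)

manifest_file <- file.path(tempdir(), "runs_manifest_empty.csv")

# Written unconditionally, with no nrow() > 0 guard.
utils::write.csv(empty_manifest, manifest_file, row.names = FALSE)

# Downstream: the file exists and parses to zero rows with known columns.
manifest_back <- utils::read.csv(manifest_file, stringsAsFactors = FALSE)
n_runs <- nrow(manifest_back)
```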

infotroph · 2025-12-12T10:08:41Z

base/workflow/R/run.write.configs.R

+    if (overwrite && file.exists(manifest.file)) {
+      unlink(manifest.file)
+    }
+
+    utils::write.table(run_manifest_df,
+                       file = manifest.file,
+                       sep = ",",
+                       row.names = FALSE,
+                       col.names = !file.exists(manifest.file),
+                       append = file.exists(manifest.file))


write.table removes existing files unless append=TRUE, so this unlink isn't needed

Suggested change

-    if (overwrite && file.exists(manifest.file)) {
-      unlink(manifest.file)
-    }
-
-    utils::write.table(run_manifest_df,
-                       file = manifest.file,
-                       sep = ",",
-                       row.names = FALSE,
-                       col.names = !file.exists(manifest.file),
-                       append = file.exists(manifest.file))
+    utils::write.table(run_manifest_df,
+                       file = manifest.file,
+                       sep = ",",
+                       row.names = FALSE,
+                       col.names = !file.exists(manifest.file),
+                       append = !overwrite)

Exactly, it's defensive code that adds complexity without benefit.
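One detail that may be worth double-checking when dropping the unlink: with overwrite = TRUE and a file already on disk, col.names = !file.exists(manifest.file) evaluates to FALSE, so the rewritten file would appear to lose its header. A hedged sketch that ties the header to whether the call is really appending is shown below; write_manifest is a hypothetical wrapper, not the function in this PR.

```r
# Hypothetical wrapper: write the header exactly when a new file is started.
write_manifest <- function(run_manifest_df, manifest.file, overwrite = TRUE) {
  appending <- !overwrite && file.exists(manifest.file)
  utils::write.table(run_manifest_df,
                     file = manifest.file,
                     sep = ",",
                     row.names = FALSE,
                     col.names = !appending,
                     append = appending)
  invisible(manifest.file)
}

manifest_path <- tempfile(fileext = ".csv")
batch <- data.frame(run_id = c("a", "b"), type = "ensemble",
                    stringsAsFactors = FALSE)

write_manifest(batch, manifest_path, overwrite = TRUE)   # new file with header
write_manifest(batch, manifest_path, overwrite = FALSE)  # appends rows only
rows_after_append <- nrow(utils::read.csv(manifest_path))

write_manifest(batch, manifest_path, overwrite = TRUE)   # replaces, header kept
after_overwrite <- utils::read.csv(manifest_path, stringsAsFactors = FALSE)
```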

infotroph · 2025-12-12T10:13:42Z

base/workflow/R/run.write.configs.R

  invisible(settings)
  return(settings)


Not your fault but fixing this while I see it -- invisible() was doing nothing here and usually only makes sense as a return value

Suggested change

-  invisible(settings)
-  return(settings)
+  return(invisible(settings))
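The effect of the suggestion can be checked with base R's withVisible(). The two functions below are toy stand-ins for the end of run.write.configs, written only to show the visibility difference.

```r
# Original pattern: the invisible() result is discarded, so the
# following return(settings) yields a visible value.
returns_visible <- function(settings) {
  invisible(settings)
  return(settings)
}

# Suggested pattern: invisibility is applied to the returned value.
returns_invisible <- function(settings) {
  return(invisible(settings))
}

toy_settings <- list(outdir = "out")

visible_before <- withVisible(returns_visible(toy_settings))$visible
visible_after <- withVisible(returns_invisible(toy_settings))$visible
```

Both functions return the same object; only auto-printing at the console differs.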

I should know this, but what's actually been changed in the returned settings object? I'm wondering whether settings is still the right thing to return at this point (e.g., are we moving to a place where the manifest might be the thing a user actually wants back?)

Changed: at least ensemble IDs and PFT outdirs (if they weren't there in the input settings).

Return value: My gut is settings is still the right thing to return. I'm saying this without checking, but I think at the moment every runModule.*() takes a MultiSettings and returns a possibly modified MultiSettings, so changing that would be an even bigger overhaul. But maybe this is an argument for putting (a copy of) the manifest into the returned settings?

But maybe this is an argument for putting (a copy of) the manifest into the returned settings?

If you mean settings$runs_manifest <- run_manifest_df, then the manifest would be written to pecan.CONFIGS.xml via listToXml(), which would serialize every row as nested XML elements. For large SDA runs that would increase the file size on disk (possibly into gigabytes) and slow down read.settings() / write.settings().

infotroph · 2025-12-16T01:06:08Z

modules/uncertainty/R/ensemble.R

+  for (i in seq_len(nrow(ens.run.ids))) {
+
+    # Handle column name difference (manifest has 'run_id', legacy 'ens.run.ids' might have 'id')
+    run.id <- if ("run_id" %in% names(ens.run.ids)) ens.run.ids$run_id[i] else ens.run.ids$id[i]


Would it work to iterate over the IDs directly instead of by index?

Suggested change

-  for (i in seq_len(nrow(ens.run.ids))) {
-
-    # Handle column name difference (manifest has 'run_id', legacy 'ens.run.ids' might have 'id')
-    run.id <- if ("run_id" %in% names(ens.run.ids)) ens.run.ids$run_id[i] else ens.run.ids$id[i]
+  # Handle column name difference (manifest has 'run_id', legacy 'ens.run.ids' might have 'id')
+  run_ids <- ens.run.ids$run_id %||% ens.run.ids$id
+  for (run_id in run_ids) {

The index i is needed for output list indexing, but we can simplify by extracting run_ids once before the loop. Having a loop variable named run_id hold the actual IDs also seemed confusing.
I considered your suggested for (run_id in run_ids) but kept the index-based approach because ensemble.output[[as.character(i)]] uses position-based keys, not run IDs, and seq_along() provides the index without having to maintain a separate idx <- idx + 1 counter.
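A sketch of the compromise described in this reply: extract the IDs once, then loop by position so the output list keeps its position-based keys. The local `%||%` definition and the read_one callback are assumptions that keep the example self-contained; the PR itself imports the operator from rlang.

```r
# Local null-default operator so the sketch runs on its own.
`%||%` <- function(x, y) if (is.null(x)) y else x

# Hypothetical collector: `read_one` stands in for the per-run reader.
collect_outputs <- function(ens.run.ids, read_one) {
  # Manifest has 'run_id'; a legacy ens.run.ids table might have 'id'.
  run_ids <- ens.run.ids$run_id %||% ens.run.ids$id
  ensemble.output <- list()
  for (i in seq_along(run_ids)) {
    # Keys stay position-based ("1", "2", ...), not run IDs.
    ensemble.output[[as.character(i)]] <- read_one(run_ids[i])
  }
  ensemble.output
}

manifest_style <- data.frame(run_id = c("r10", "r11"), stringsAsFactors = FALSE)
legacy_style <- data.frame(id = c("r10", "r11"), stringsAsFactors = FALSE)

out_manifest <- collect_outputs(manifest_style, function(id) paste0("output-of-", id))
out_legacy <- collect_outputs(legacy_style, function(id) paste0("output-of-", id))
```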

infotroph · 2025-12-16T01:14:28Z

modules/uncertainty/R/sensitivity.R

+      if (nrow(subset_df) == 1) {
+         run.id <- subset_df$run_id
+      } else if (nrow(subset_df) > 1) {
+         PEcAn.logger::logger.warn("Multiple runs found for", trait, quantile, "- using the last one.")


When would this happen and should it be an error instead of a warning?
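If duplicates are considered a bug rather than something to tolerate, the lookup could fail loudly instead of warning. The sketch below uses a hypothetical find_sa_run_id helper and toy manifest values; whether an error is the right behavior here is exactly the open question in this comment.

```r
# Hypothetical lookup: exactly one manifest row is expected per
# trait/quantile pair; more than one is treated as an error.
find_sa_run_id <- function(manifest, trait, quantile) {
  rows <- which(manifest$trait == trait & manifest$quantile == quantile)
  if (length(rows) == 0) {
    return(NA_character_)
  }
  if (length(rows) > 1) {
    stop("Multiple runs found for ", trait, " at quantile ", quantile, ": ",
         paste(manifest$run_id[rows], collapse = ", "), call. = FALSE)
  }
  manifest$run_id[rows]
}

# Toy manifest with a deliberate duplicate for SLA at 15.9.
toy_manifest <- data.frame(
  run_id = c("r1", "r2", "r3"),
  trait = c("SLA", "SLA", "Vcmax"),
  quantile = c(15.9, 15.9, 50),
  stringsAsFactors = FALSE
)

vcmax_run <- find_sa_run_id(toy_manifest, "Vcmax", 50)
missing_run <- find_sa_run_id(toy_manifest, "SLA", 50)
dup_message <- tryCatch(find_sa_run_id(toy_manifest, "SLA", 15.9),
                        error = function(e) conditionMessage(e))
```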

infotroph · 2025-12-16T01:20:53Z

modules/uncertainty/R/sensitivity.R

+         next
+      }
+
+      pass_pft <- switch(per.pft + 1, NULL, pft.name)


Weirdly phrased -- I see this was existing code but I wonder if I'm missing a reason it wasn't pass_pft <- if(isTRUE(per.pft)) pft.name else NULL, which I'd find easier to read.
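For a plain TRUE/FALSE per.pft the two spellings give the same result, which the following sketch checks. The wrapper function names and the PFT name are made up for the comparison.

```r
# Existing spelling: numeric switch, FALSE + 1 picks NULL, TRUE + 1 picks pft.name.
pass_pft_switch <- function(per.pft, pft.name) {
  switch(per.pft + 1, NULL, pft.name)
}

# Proposed spelling: explicit if/else.
pass_pft_if <- function(per.pft, pft.name) {
  if (isTRUE(per.pft)) pft.name else NULL
}

same_when_true <- identical(pass_pft_switch(TRUE, "example.pft"),
                            pass_pft_if(TRUE, "example.pft"))
same_when_false <- identical(pass_pft_switch(FALSE, "example.pft"),
                             pass_pft_if(FALSE, "example.pft"))
```

The if/else form additionally maps anything that is not exactly TRUE, such as NA, to NULL.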

infotroph · 2025-12-16T01:50:50Z

modules/uncertainty/R/sensitivity.R

+      pass_pft <- switch(per.pft + 1, NULL, pft.name)
+
+      # Pass ALL variables at once to avoid repeated file opening. And call read.output once
+      out.tmp <- PEcAn.utils::read.output(


This might be for a different PR, but since this is getting converted to data frame a few lines down it might be worth benchmarking the existing approach against passing dataframe = TRUE into read.output

I have added a TODO comment and would defer this to a future PR, in case complex filtering is needed.
I suspect keeping dataframe = FALSE makes sense because the current aggregation doesn't benefit from a data.frame structure. If we add more complex post-processing, we can benchmark then.

infotroph · 2025-12-18T09:29:30Z

base/workflow/R/runModule.run.write.configs.R

+      ens_input_design <- design_result$X
+    }
+
+    # handle SA design - check if already provided before generating


This is now turning into a fair amount of code repeated in both legs of the settings/multisettings if/else. Could we do some of this unconditionally outside the if? Or define a helper function to encapsulate it?

mdietze · 2025-12-31T17:50:09Z

base/workflow/R/run.write.configs.R

+
+  # extract designs from input_design list
+  input_design_ens <- if (!is.null(input_design)) input_design$ensemble else NULL
+  input_design_sa <- if (!is.null(input_design)) input_design$sensitivity else NULL


Per earlier comments, I'd recommend only having one input_design, not separate ens and sa designs

mdietze · 2025-12-31T17:51:07Z

base/workflow/R/runModule.run.write.configs.R

-#' @param sa_input_design Input design matrix for SA (internal use)
+#' @param input_design Optional. Input design specification. Can be:
+#'   \itemize{
+#'     \item A list with \code{ensemble} and/or \code{sensitivity} entries


Suggested change

#' \item A list with \code{ensemble} and/or \code{sensitivity} entries

divine7022 added 11 commits December 7, 2025 19:12

support SA input_design

2b08818

update .Rd

efc6a33

add rlang to import

ceb3573

update namespace

31b4c12

runs are now stored in runs_manifest.csv

5902fdb

update run.write.configs.Rd

16499f4

write configs to use runs_manifest.csv and vectorized variable reading in read.ensemble.output to fix I/O bottleneck

6d234f9

write configs to use runs_manifest.csv and vectorized variable reading in read.sa.output to fix I/O bottleneck

7e62bef

update read.sa.output.Rd

a223998

update write.sa.configs.Rd

8223f1d

use runs_manifest.csv in load_ensemble

770814f

github-actions bot added Modules Base labels Dec 7, 2025

divine7022 added 6 commits December 7, 2025 20:21

update changelog

20d14e0

update NEWS.md

dd525d1

add comment

ceb53cc

update roxygen

982cd35

update write.sa.configs.Rd

fc86545

fix indentation

78bf963

divine7022 requested review from dlebauer, infotroph and mdietze December 7, 2025 20:29

divine7022 added 2 commits December 8, 2025 04:37

add rlang to DESCRIPTION

3ce205b

update doceker depends

20643d5

github-actions bot added the Dockerfile label Dec 8, 2025

divine7022 added 3 commits December 9, 2025 12:38

remove 'samples' ref

502622e

check the priority of input design and handle the edge case which was hidden

c9be470

Merge remote-tracking branch 'origin/develop' into run-manifest

d06704d

divine7022 mentioned this pull request Dec 11, 2025

add input design documentation to book source #3716

Merged

14 tasks

infotroph reviewed Dec 18, 2025


divine7022 mentioned this pull request Dec 26, 2025

Add generate_OAT_SA_design() for sensitivity analysis input design #3729

Open

14 tasks

divine7022 added 9 commits December 31, 2025 01:10

remove seperate input_desing argu and consolidate to single input_desing

1918d46

update .Rd

60d80b3

removed seperate desing parameters and encapsulated redundent chunk into helper function

9a6c512

update runModule.run.write.configs.Rd

0aa313e

simplify by extracting run_ids once

d8682e3

added todo comment to benchmark, defer to future PR if complex post-processing filtering needed

7cc8762

add dot-prepare_input_designs.Rd

80fd907

add tests for multisite SA handing by run.write.configs

708254b

add tests to check coordination of newly added input_desing argu to write.sa.configs

af8f89e

github-actions bot added the Tests label Dec 31, 2025

mdietze reviewed Dec 31, 2025


