score() does not group by scale for forecast_multivariate_sample with transform_forecasts(append=TRUE) #1108

@annakrystalli

Summary

score() does not correctly handle the scale column when scoring forecast_multivariate_sample objects produced by transform_forecasts(append=TRUE). The two scales (natural + transformed) are collapsed into a single set of scored rows instead of being scored separately. The univariate forecast_sample path handles this correctly.

Reprex

library(scoringutils)
#> scoringutils_2.1.2.9000

# === Univariate: works correctly ===
transformed_uni <- transform_forecasts(
  example_sample_continuous, fun = sqrt, append = TRUE, label = "sqrt"
)
scored_uni <- score(transformed_uni)
table(scored_uni$scale, useNA = "always")
#> natural    sqrt    <NA>
#>     887     878       0
# Two sets of scores, one per scale -- correct.

# === Multivariate: broken ===
transformed_mv <- suppressWarnings(transform_forecasts(
  example_multivariate_sample, fun = sqrt, append = TRUE, label = "sqrt"
))

nrow(example_multivariate_sample)
#> [1] 35624
nrow(transformed_mv)
#> [1] 71248
# Correctly doubled.

table(transformed_mv$scale, useNA = "always")
#> natural    sqrt    <NA>
#>   35624   35624       0
# Both scales present in transformed data.

scored_mv <- score(transformed_mv)
nrow(scored_mv)
#> [1] 224
# Expected 448 (224 per scale). Only one set of scores returned.

table(scored_mv$scale, useNA = "always")
#> <NA>
#>    0
# scale column dropped entirely.

Session info

R version 4.5.1 (2025-06-13)
Platform: aarch64-apple-darwin20
Running under: macOS Sequoia 15.7.3

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] scoringutils_2.1.2.9000

Expected behaviour

score() should group by scale (in addition to the forecast unit columns) when scoring, producing separate scores for each scale. This is how it works for univariate forecast types (e.g., forecast_sample, forecast_quantile).
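
To make the expected grouping concrete, here is a minimal base-R sketch on toy data. It does not use scoringutils internals: the column names (`location`, `sample_id`, `abs_error`) and the mean-absolute-error "score" are hypothetical stand-ins, chosen only to show how including `scale` in the grouping changes the number of scored rows.

```r
# Toy data: 2 locations x 3 samples x 2 scales (all names are hypothetical).
toy <- expand.grid(
  location = c("A", "B"),
  sample_id = 1:3,
  scale = c("natural", "sqrt"),
  stringsAsFactors = FALSE
)
toy$observed <- ifelse(toy$location == "A", 4, 9)
# Samples centred on the observation: observed - 1, observed, observed + 1.
toy$predicted <- toy$observed + (toy$sample_id - 2)

# Apply the sqrt transform to the rows belonging to the transformed scale.
is_sqrt <- toy$scale == "sqrt"
toy$observed[is_sqrt] <- sqrt(toy$observed[is_sqrt])
toy$predicted[is_sqrt] <- sqrt(toy$predicted[is_sqrt])

# Stand-in score: absolute error per sample, averaged within each group.
toy$abs_error <- abs(toy$predicted - toy$observed)

# Grouping by the forecast unit only pools rows from both scales.
pooled <- aggregate(abs_error ~ location, data = toy, FUN = mean)

# Grouping by the forecast unit AND scale gives one row per unit per scale.
by_scale <- aggregate(abs_error ~ location + scale, data = toy, FUN = mean)

nrow(pooled)
nrow(by_scale)
```

In this toy case the second grouping returns twice as many rows as the first and keeps the `scale` column, which is the shape I would expect from score() on the multivariate object (448 rows rather than 224).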

Context

Related to #1071 / #1072 -- the transform_forecasts(append=TRUE) fix resolved the immediate error for multivariate forecasts, but this downstream issue in score() remains.

Discovered while implementing sample forecast scoring in hubverse-org/hubEvals#94.

Labels: bug
