score() does not group by scale for forecast_multivariate_sample with transform_forecasts(append=TRUE)

## Summary

`score()` does not handle the `scale` column correctly when scoring `forecast_multivariate_sample` objects produced by `transform_forecasts(append = TRUE)`. Instead of scoring the two scales (natural and transformed) separately, it returns a single set of scores (224 rows instead of the expected 448) and drops the `scale` column from the output. The univariate `forecast_sample` path handles this correctly.

## Reprex

```r
library(scoringutils)
#> scoringutils_2.1.2.9000

# === Univariate: works correctly ===
transformed_uni <- transform_forecasts(
  example_sample_continuous, fun = sqrt, append = TRUE, label = "sqrt"
)
scored_uni <- score(transformed_uni)
table(scored_uni$scale, useNA = "always")
#> natural    sqrt    <NA>
#>     887     878       0
# Two sets of scores, one per scale -- correct.

# === Multivariate: broken ===
transformed_mv <- suppressWarnings(transform_forecasts(
  example_multivariate_sample, fun = sqrt, append = TRUE, label = "sqrt"
))

nrow(example_multivariate_sample)
#> [1] 35624
nrow(transformed_mv)
#> [1] 71248
# Correctly doubled.

table(transformed_mv$scale, useNA = "always")
#> natural    sqrt    <NA>
#>   35624   35624       0
# Both scales present in transformed data.

scored_mv <- score(transformed_mv)
nrow(scored_mv)
#> [1] 224
# Expected 448 (224 per scale). Only one set of scores returned.

table(scored_mv$scale, useNA = "always")
#> <NA>
#>    0
# scale column dropped entirely.
```

<details><summary>Session info</summary>

```
R version 4.5.1 (2025-06-13)
Platform: aarch64-apple-darwin20
Running under: macOS Sequoia 15.7.3

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] scoringutils_2.1.2.9000
```

</details>

## Expected behaviour

`score()` should group by `scale` (in addition to the forecast unit columns) when scoring, producing separate scores for each scale. This is how it works for univariate forecast types (e.g., `forecast_sample`, `forecast_quantile`).
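
To make the expectation concrete, here is a minimal sketch of a check that could back a regression test. `has_scores_per_scale()` is a hypothetical helper, not part of scoringutils, and the two data frames are made-up stand-ins for `score()` output rather than real results.

```r
# Hypothetical helper, not part of scoringutils: TRUE if a scored table
# kept its `scale` column and contains rows for every expected scale.
has_scores_per_scale <- function(scored, scales) {
  "scale" %in% names(scored) && all(scales %in% scored$scale)
}

# Toy stand-ins for score() output (made-up values, for illustration only).
scored_ok <- data.frame(
  model = c("m1", "m1"),
  scale = c("natural", "sqrt"),
  score = c(1.2, 0.4)
)
# Same table with the `scale` column dropped and only one set of scores left,
# mimicking the shape of the multivariate output in the reprex.
scored_collapsed <- scored_ok[1, c("model", "score")]

has_scores_per_scale(scored_ok, c("natural", "sqrt"))
#> [1] TRUE
has_scores_per_scale(scored_collapsed, c("natural", "sqrt"))
#> [1] FALSE
```

Applied to the reprex, `has_scores_per_scale(scored_mv, c("natural", "sqrt"))` should return `FALSE` with the behaviour reported above (the `scale` column is missing) and `TRUE` once `score()` groups by `scale`.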

## Context

Related to #1071 / #1072 -- the `transform_forecasts(append=TRUE)` fix resolved the immediate error for multivariate forecasts, but this downstream issue in `score()` remains.

Discovered while implementing sample forecast scoring in https://github.com/hubverse-org/hubEvals/issues/94.

