
add fraction matching based on only the light / na labels #125

Merged
tonywu1999 merged 3 commits into devel from feat-turnover
Apr 10, 2026


@tonywu1999 tonywu1999 commented Apr 7, 2026

Motivation and Context

This PR refines the fraction-matching logic so that fraction selection for a feature is driven only by light (IsotopeLabelType == "L") and NA observations; heavy ("H") observations are ignored when deciding which fraction to pick. The chosen fraction therefore reflects the label-free (or light/NA) signal, which matters for turnover analyses where the light peptide abundance should determine the fraction. H rows from the selected fraction are still retained in the final output. If a feature has no L/NA observations, the logic falls back to H observations for both measurement counts and mean-abundance tie-breaking.

Detailed Changes

  • R/utils_fractions.R

    • .removeOverlappingFeatures()
      • Compute measurement counts per (feature, Fraction) using only rows where IsotopeLabelType == "L" or is.na(IsotopeLabelType), and Intensity is non-missing and > 0.
      • For feature–fraction groups with zero L/NA observations, recompute and append n_obs derived from H rows to the measurement_count used to select is_max (fallback behavior).
      • Prevent dropping features solely because they lack L/NA rows; ensure H-derived counts are considered only when L/NA are absent.
    • .resolveFractionTies()
      • When resolving ties among fractions with equal maximum measurement counts, compute mean_abundance using only L/NA rows first.
      • For tied features that lack L/NA rows, include mean_abundance computed from H rows as a fallback and combine appropriately before selecting the fraction with the highest mean_abundance.
  • inst/tinytest/test_fractions.R

    • Added IsotopeLabelType = "L" to existing fixtures and introduced new fixtures:
      • fractionated_lh (mix of L and H, plus NA-only cases)
      • fractionated_h_only and fractionated_h_tied (H-only scenarios to validate fallback)
    • Added assertions verifying:
      • Fraction selection uses only L/NA observations (H does not influence choice).
      • Final output retains both L and H rows for the selected fraction.
      • NA-only features are resolved using NA observation counts.
      • When only H rows exist, selection falls back to H-based counts and breaks ties by higher mean Intensity.
      • Full-dataset identity checks for the expected subset of selected rows.
  • man/dot-getCorrectFraction.Rd

    • Removed the generated documentation file for internal helper .getCorrectFraction (documentation-only deletion; no runtime change).
  • man/dot-resolveFractionTies.Rd

    • Added documentation for the internal function .resolveFractionTies describing purpose, usage, arguments, and return value.
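
The counting and fallback rules listed above can be sketched in a few lines. This is a minimal illustration only, written in base R so it runs without data.table; it is not the code in R/utils_fractions.R. The helper names `count_runs` and `select_max_fractions` are hypothetical, and the sketch assumes the fallback is decided per feature (a feature uses H rows only when it has no eligible L/NA rows at all).

```r
# Hypothetical helper: distinct runs per (feature, Fraction).
count_runs <- function(d) {
  d <- unique(d[, c("feature", "Fraction", "Run")])
  if (nrow(d) == 0) {
    return(data.frame(feature = character(0), Fraction = numeric(0),
                      n_obs = integer(0)))
  }
  out <- aggregate(d["Run"],
                   by = list(feature = d$feature, Fraction = d$Fraction),
                   FUN = length)
  names(out)[names(out) == "Run"] <- "n_obs"
  out
}

# Hypothetical helper: fractions with the maximum run count per feature.
select_max_fractions <- function(input) {
  ok <- !is.na(input$Intensity) & input$Intensity > 0
  anchor <- input$IsotopeLabelType == "L" | is.na(input$IsotopeLabelType)
  counts <- count_runs(input[ok & anchor, ])
  # Assumed fallback: features with no eligible L/NA rows use their H rows.
  h_only <- !(input$feature %in% counts$feature)
  counts <- rbind(counts, count_runs(input[ok & h_only, ]))
  counts$is_max <- counts$n_obs == ave(counts$n_obs, counts$feature, FUN = max)
  counts[counts$is_max, ]
}

# Made-up data: pepA has more H runs in fraction 2, but more L runs in fraction 1.
input <- data.frame(
  feature = c("pepA", "pepA", "pepA", "pepA", "pepA", "pepB", "pepB", "pepB"),
  Fraction = c(1, 1, 2, 2, 2, 1, 2, 2),
  Run = c("r1", "r2", "r3", "r4", "r5", "r1", "r3", "r4"),
  IsotopeLabelType = c("L", "L", "L", "H", "H", "H", "H", "H"),
  Intensity = c(10, 12, 9, 50, 60, 5, 7, 8)
)
selected <- select_max_fractions(input)
print(selected)
```

In this made-up input, pepA is assigned by its L rows alone, while pepB, which has only H rows, is assigned by H-derived counts.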

Unit Tests Added or Modified

  • inst/tinytest/test_fractions.R
    • New and extended test cases cover:
      • Mixed L/H features: confirm fraction chosen by L/NA counts and ties by mean L/NA intensity; ensure retained output includes both labels for the chosen fraction.
      • NA-only features: confirm selection based on NA observations.
      • H-only features and ties: confirm fallback to H-derived counts and mean-intensity tie-breaking.
      • Full-dataset tests validating the final selected-row subset matches expectations.
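
As a rough illustration of what the tie-breaking cases check, the sketch below builds a small made-up fixture and resolves ties by mean intensity. It is written in base R with `stopifnot()` rather than tinytest, and the function name `pick_by_mean` and the fixture are hypothetical; the real fixtures (`fractionated_lh`, `fractionated_h_tied`) are not reproduced here. It assumes every tied feature has at least one positive-intensity row.

```r
# Hypothetical helper: among tied fractions, pick the one with the highest
# mean intensity, using L/NA rows first and H rows only as a fallback.
pick_by_mean <- function(input, tied) {
  ok <- !is.na(input$Intensity) & input$Intensity > 0
  anchor <- input$IsotopeLabelType == "L" | is.na(input$IsotopeLabelType)
  picks <- lapply(unique(tied$feature), function(f) {
    fractions <- tied$Fraction[tied$feature == f]
    rows <- ok & input$feature == f & input$Fraction %in% fractions
    use <- if (any(rows & anchor)) rows & anchor else rows
    means <- tapply(input$Intensity[use], input$Fraction[use], mean)
    data.frame(feature = f,
               Fraction = as.numeric(names(means)[which.max(means)]))
  })
  do.call(rbind, picks)
}

# Made-up fixture: pepC is tied on L counts; pepD has H rows only.
input <- data.frame(
  feature = c("pepC", "pepC", "pepC", "pepC", "pepD", "pepD"),
  Fraction = c(1, 1, 2, 2, 1, 2),
  Run = c("r1", "r1", "r3", "r3", "r1", "r3"),
  IsotopeLabelType = c("L", "H", "L", "H", "H", "H"),
  Intensity = c(100, 5, 20, 1000, 30, 40)
)
tied <- unique(input[, c("feature", "Fraction")])
picks <- pick_by_mean(input, tied)

# Keep every row (L and H) of the chosen fraction.
kept <- merge(input, picks, by = c("feature", "Fraction"))

stopifnot(picks$Fraction[picks$feature == "pepC"] == 1)  # H = 1000 is ignored
stopifnot(picks$Fraction[picks$feature == "pepD"] == 2)  # H fallback
stopifnot(setequal(kept$IsotopeLabelType[kept$feature == "pepC"], c("L", "H")))
```

The large H intensity in fraction 2 of pepC does not affect the choice, yet the H row of the chosen fraction 1 is still present in `kept`.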

Coding Guidelines / Violations

  • No coding guideline violations identified. Changes adhere to existing code patterns and conventions (data.table usage, internal function naming, tests in tinytest).


coderabbitai Bot commented Apr 7, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 3f568781-89d4-4a12-b324-954bd54e6e43

📥 Commits

Reviewing files that changed from the base of the PR and between 4e6ee04 and 4d8bd68.

📒 Files selected for processing (2)
  • R/utils_fractions.R
  • inst/tinytest/test_fractions.R
🚧 Files skipped from review as they are similar to previous changes (1)
  • R/utils_fractions.R

📝 Walkthrough


Restricted overlap-counting and tie-breaking in fraction selection to use only rows with IsotopeLabelType == "L" or NA (and non-missing, >0 Intensity); when no such rows exist for a feature, fall back to using "H" rows to compute measurement counts or mean abundances. Tests and documentation adjusted accordingly.

Changes

Cohort / File(s): Summary

  • Core logic (R/utils_fractions.R): Updated .removeOverlappingFeatures() and .resolveFractionTies() to compute run counts and tie-breaking mean abundances first from rows where IsotopeLabelType == "L" or is.na(IsotopeLabelType) (with non-missing, positive Intensity); when none exist for a feature, incorporate "H" rows as a fallback for measurement counts or mean abundance before selecting the winning Fraction.
  • Tests (inst/tinytest/test_fractions.R): Added IsotopeLabelType = "L" to fixtures, introduced fractionated_lh, fractionated_h_only, and fractionated_h_tied cases, and expanded assertions to verify selection uses L/NA observations first, retains both L and H rows for chosen fractions, and falls back to H-based counts/means when needed.
  • Documentation (man/dot-getCorrectFraction.Rd, man/dot-resolveFractionTies.Rd): Removed generated .Rd for .getCorrectFraction; added man/dot-resolveFractionTies.Rd documenting .resolveFractionTies(input, max_fractions) and its tie-resolution behavior.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

  • Fix fraction efficient #126 — Modifies the same fraction-selection logic in R/utils_fractions.R, adjusting how run counts and mean intensities are computed for overlapping fractions and tie-breaking.

Poem

🐰 I nibble through fractions, light and neat,
I count the L's where intensities meet,
If none are found, I peek at H's tune,
Then pick the slice that sings the loudest rune. 🥕✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

  • Description check: ⚠️ Warning
    • Explanation: The pull request description is entirely missing, lacking all required sections: motivation/context, detailed changes list, testing information, and pre-review checklist.
    • Resolution: Add a comprehensive description following the template, including motivation for the change, detailed bullet-point list of modifications, testing approach, and completed pre-review checklist.

✅ Passed checks (2 passed)

  • Title check: ✅ Passed
    • Explanation: The title accurately summarizes the main change: updating fraction matching logic to prioritize light and NA isotope labels in the decision-making process.
  • Docstring Coverage: ✅ Passed
    • Explanation: No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

@tonywu1999 tonywu1999 marked this pull request as ready for review April 10, 2026 17:39

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
R/utils_fractions.R (1)

279-280: ⚠️ Potential issue | 🟠 Major

Prevent silent feature drops when no L/NA rows qualify.

Because fraction_map is derived from the filtered subset, nomatch = 0 can remove entire features (e.g., H-only rows) without an explicit warning/error.

💡 Proposed safeguard
     if (data.table::uniqueN(input$Fraction) > 1) {
         measurement_count = input[
             (IsotopeLabelType == "L" | is.na(IsotopeLabelType)) & 
             !is.na(Intensity) & Intensity > 0,
             .(n_obs = uniqueN(Run)),
             by = .(feature, Fraction)
         ]
+        missing_anchor_features = setdiff(unique(input$feature), unique(measurement_count$feature))
+        if (length(missing_anchor_features) > 0) {
+            msg = paste(
+                "** No L/NA positive-intensity observations for feature(s):",
+                paste(missing_anchor_features, collapse = ", ")
+            )
+            getOption("MSstatsLog")("ERROR", msg)
+            stop(msg)
+        }
         measurement_count[, is_max := n_obs == max(n_obs), by = "feature"]
         max_fractions = measurement_count[(is_max)]
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@R/utils_fractions.R` around lines 279 - 280, The current filtering uses
fraction_map from .resolveFractionTies and then does input = input[fraction_map,
on = .(feature, Fraction), nomatch = 0], which can silently drop entire features
when no L/NA rows qualify; change the logic to detect features present in input
but missing in fraction_map (compute setdiff between unique(input$feature) and
unique(fraction_map$feature)), and for any missing features either emit a
warning naming those features or fall back to keeping their original rows (i.e.,
do not apply the nomatch = 0 drop for those features); implement this check
around the call to .resolveFractionTies and the subsequent join/filter so that
fraction_map, input, and feature are used to decide whether to warn or to
preserve rows instead of silently dropping them.
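
One way the "warn and preserve" option described in this prompt could look is sketched below. It is a base-R illustration, not the package's data.table join; the function name `filter_to_selected` and the example data are hypothetical.

```r
# Hypothetical helper: keep only the selected (feature, Fraction) rows, but
# warn about and preserve features that have no entry in fraction_map.
filter_to_selected <- function(input, fraction_map) {
  missing_features <- setdiff(unique(input$feature),
                              unique(fraction_map$feature))
  if (length(missing_features) > 0) {
    warning("No fraction selected for feature(s): ",
            paste(missing_features, collapse = ", "),
            "; keeping their original rows.")
  }
  key <- function(d) paste(d$feature, d$Fraction, sep = "\t")
  keep <- key(input) %in% key(fraction_map) |
    input$feature %in% missing_features
  input[keep, , drop = FALSE]
}

# Made-up data: pepB has no entry in fraction_map.
input <- data.frame(
  feature = c("pepA", "pepA", "pepB"),
  Fraction = c(1, 2, 1),
  IsotopeLabelType = c("L", "L", "H"),
  Intensity = c(10, 20, 30)
)
fraction_map <- data.frame(feature = "pepA", Fraction = 2)

result <- suppressWarnings(filter_to_selected(input, fraction_map))
print(result)
```

Whether to warn, stop, or fall back is a design choice the review leaves open; the PR description indicates the merged change takes the H-fallback route inside the counting step instead.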
🧹 Nitpick comments (1)
man/dot-resolveFractionTies.Rd (1)

22-25: Document the isotope/intensity eligibility used in tie-breaking.

The description should mention that mean intensity is computed only from rows with IsotopeLabelType == "L" or NA, and Intensity > 0, to match implementation behavior.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@man/dot-resolveFractionTies.Rd` around lines 22 - 25, Update the
documentation for .resolveFractionTies to explicitly state the tie-breaker
computes mean intensity only over measurements where IsotopeLabelType is "L" or
NA and Intensity > 0; mention that fractions tied by count are resolved by
selecting the fraction with the highest mean intensity computed using only those
eligible rows to match the implementation.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@R/utils_fractions.R`:
- Around line 271-274: The subset step assumes IsotopeLabelType exists; ensure
it is present and normalized before filtering by explicitly adding a default and
normalization: in the preprocessing path that leads into
.removeOverlappingFeatures (or right before the block using IsotopeLabelType,
Intensity, feature, Fraction, uniqueN(Run)), add code to create IsotopeLabelType
if missing (e.g., set IsotopeLabelType := NA_character_ when !"IsotopeLabelType"
%in% names(dt)) and normalize unexpected values to NA (e.g., convert non "L"/"H"
entries to NA) so the condition (IsotopeLabelType == "L" |
is.na(IsotopeLabelType)) & !is.na(Intensity) & Intensity > 0 works without
error. Ensure this change is applied where inputs are shaped
(preprocessing/clean functions) or immediately before the grouping that computes
.(n_obs = uniqueN(Run)) by .(feature, Fraction).
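
The defaulting and normalization step suggested in this comment might look roughly like the following. This is a base-R sketch under stated assumptions, not code from the package: the function name `normalize_label_type` is hypothetical, and mapping unexpected labels to NA follows the reviewer's suggestion, which means such rows would then count as L/NA rows during fraction selection.

```r
# Hypothetical helper: make sure IsotopeLabelType exists and holds only
# "L", "H", or NA before the L/NA filter is applied.
normalize_label_type <- function(input) {
  if (!"IsotopeLabelType" %in% names(input)) {
    input$IsotopeLabelType <- rep(NA_character_, nrow(input))
  }
  labels <- as.character(input$IsotopeLabelType)
  labels[!labels %in% c("L", "H")] <- NA_character_
  input$IsotopeLabelType <- labels
  input
}

# Made-up inputs: one without the column, one with an unexpected label.
no_label <- normalize_label_type(data.frame(feature = "pepA", Intensity = 10))
mixed <- normalize_label_type(data.frame(
  feature = c("a", "b", "c", "d"),
  IsotopeLabelType = c("L", "heavy", "H", NA)
))
print(mixed$IsotopeLabelType)
```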

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 9ebf79ff-e76c-4873-9c84-95e8e0d4e4ea

📥 Commits

Reviewing files that changed from the base of the PR and between 077b51f and 4e6ee04.

📒 Files selected for processing (4)
  • R/utils_fractions.R
  • inst/tinytest/test_fractions.R
  • man/dot-getCorrectFraction.Rd
  • man/dot-resolveFractionTies.Rd
💤 Files with no reviewable changes (1)
  • man/dot-getCorrectFraction.Rd

Comment thread R/utils_fractions.R Outdated
@tonywu1999 tonywu1999 merged commit 0eae2d3 into devel Apr 10, 2026
2 checks passed