add fraction matching based on only the light / na labels#125
add fraction matching based on only the light / na labels#125tonywu1999 merged 3 commits intodevelfrom
Conversation
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (2)
🚧 Files skipped from review as they are similar to previous changes (1)
📝 WalkthroughWalkthroughRestricted overlap-counting and tie-breaking in fraction selection to use only rows with Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
f831151 to
4e6ee04
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
R/utils_fractions.R (1)
279-280:⚠️ Potential issue | 🟠 MajorPrevent silent feature drops when no L/NA rows qualify.
Because
fraction_mapis derived from the filtered subset,nomatch = 0can remove entire features (e.g., H-only rows) without an explicit warning/error.💡 Proposed safeguard
if (data.table::uniqueN(input$Fraction) > 1) { measurement_count = input[ (IsotopeLabelType == "L" | is.na(IsotopeLabelType)) & !is.na(Intensity) & Intensity > 0, .(n_obs = uniqueN(Run)), by = .(feature, Fraction) ] + missing_anchor_features = setdiff(unique(input$feature), unique(measurement_count$feature)) + if (length(missing_anchor_features) > 0) { + msg = paste( + "** No L/NA positive-intensity observations for feature(s):", + paste(missing_anchor_features, collapse = ", ") + ) + getOption("MSstatsLog")("ERROR", msg) + stop(msg) + } measurement_count[, is_max := n_obs == max(n_obs), by = "feature"] max_fractions = measurement_count[(is_max)]🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@R/utils_fractions.R` around lines 279 - 280, The current filtering uses fraction_map from .resolveFractionTies and then does input = input[fraction_map, on = .(feature, Fraction), nomatch = 0], which can silently drop entire features when no L/NA rows qualify; change the logic to detect features present in input but missing in fraction_map (compute setdiff between unique(input$feature) and unique(fraction_map$feature)), and for any missing features either emit a warning naming those features or fall back to keeping their original rows (i.e., do not apply the nomatch = 0 drop for those features); implement this check around the call to .resolveFractionTies and the subsequent join/filter so that fraction_map, input, and feature are used to decide whether to warn or to preserve rows instead of silently dropping them.
🧹 Nitpick comments (1)
man/dot-resolveFractionTies.Rd (1)
22-25: Document the isotope/intensity eligibility used in tie-breaking.The description should mention that mean intensity is computed only from rows with
IsotopeLabelType == "L"orNA, andIntensity > 0, to match implementation behavior.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@man/dot-resolveFractionTies.Rd` around lines 22 - 25, Update the documentation for resolveFractionTies to explicitly state the tie-breaker computes mean intensity only over measurements where IsotopeLabelType is "L" or NA and Intensity > 0; mention that fractions tied by count are resolved by selecting the fraction with the highest mean intensity computed using only those eligible rows to match the implementation.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@R/utils_fractions.R`:
- Around line 271-274: The subset step assumes IsotopeLabelType exists; ensure
it is present and normalized before filtering by explicitly adding a default and
normalization: in the preprocessing path that leads into
.removeOverlappingFeatures (or right before the block using IsotopeLabelType,
Intensity, feature, Fraction, uniqueN(Run)), add code to create IsotopeLabelType
if missing (e.g., set IsotopeLabelType := NA_character_ when !"IsotopeLabelType"
%in% names(dt)) and normalize unexpected values to NA (e.g., convert non "L"/"H"
entries to NA) so the condition (IsotopeLabelType == "L" |
is.na(IsotopeLabelType)) & !is.na(Intensity) & Intensity > 0 works without
error. Ensure this change is applied where inputs are shaped
(preprocessing/clean functions) or immediately before the grouping that computes
.(n_obs = uniqueN(Run)) by .(feature, Fraction).
---
Outside diff comments:
In `@R/utils_fractions.R`:
- Around line 279-280: The current filtering uses fraction_map from
.resolveFractionTies and then does input = input[fraction_map, on = .(feature,
Fraction), nomatch = 0], which can silently drop entire features when no L/NA
rows qualify; change the logic to detect features present in input but missing
in fraction_map (compute setdiff between unique(input$feature) and
unique(fraction_map$feature)), and for any missing features either emit a
warning naming those features or fall back to keeping their original rows (i.e.,
do not apply the nomatch = 0 drop for those features); implement this check
around the call to .resolveFractionTies and the subsequent join/filter so that
fraction_map, input, and feature are used to decide whether to warn or to
preserve rows instead of silently dropping them.
---
Nitpick comments:
In `@man/dot-resolveFractionTies.Rd`:
- Around line 22-25: Update the documentation for resolveFractionTies to
explicitly state the tie-breaker computes mean intensity only over measurements
where IsotopeLabelType is "L" or NA and Intensity > 0; mention that fractions
tied by count are resolved by selecting the fraction with the highest mean
intensity computed using only those eligible rows to match the implementation.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 9ebf79ff-e76c-4873-9c84-95e8e0d4e4ea
📒 Files selected for processing (4)
R/utils_fractions.Rinst/tinytest/test_fractions.Rman/dot-getCorrectFraction.Rdman/dot-resolveFractionTies.Rd
💤 Files with no reviewable changes (1)
- man/dot-getCorrectFraction.Rd
Motivation and Context
This PR refines fraction-matching logic so fraction selection for a feature is driven by light (IsotopeLabelType == "L") and NA observations only, ignoring heavy ("H") observations when deciding which fraction to pick. This ensures fraction choice reflects the label-free (or light/NA) signal — important for turnover analyses where the light peptide abundance should determine the chosen fraction — while still retaining H rows from the selected fraction in the final output. If a feature lacks any L/NA observations, the logic falls back to H observations for measurement counts and mean-abundance tie-breaking.
Detailed Changes
R/utils_fractions.R
inst/tinytest/test_fractions.R
man/dot-getCorrectFraction.Rd
man/dot-resolveFractionTies.Rd
Unit Tests Added or Modified
Coding Guidelines / Violations