fix(checks): support 'heavy'/'light' IsotopeLabelType explicitly#189
fix(checks): support 'heavy'/'light' IsotopeLabelType explicitly#189tonywu1999 merged 6 commits intodevelfrom
Conversation
|
Warning Rate limit exceeded
Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 13 minutes and 1 seconds. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. 📝 WalkthroughWalkthroughIntroduced an internal helper Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
PR Reviewer Guide 🔍Here are some key observations to aid the review process:
|
PR Code Suggestions ✨Explore these optional code suggestions:
|
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@R/utils_checks.R`:
- Around line 234-243: The label normalization change must be applied to the
raw-input validation path too: update .checkUnProcessedDataValidity to use the
same label_map and factor creation as in .prepareForDataProcess (use
label_map[as.character(input$ISOTOPELABELTYPE)] and set ISOTOPELABELTYPE <-
factor(..., levels = intersect(c("H","L"),
label_map[as.character(input$ISOTOPELABELTYPE)]))) instead of the old
single-level fallback to "L", so single-level values like "heavy" map to "H"
consistently across both preprocessing paths.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: b799e890-dfc9-41e8-ab17-13ff6a9c5ecd
📒 Files selected for processing (3)
R/utils_checks.Rinst/tinytest/test_dataProcess.Rinst/tinytest/test_utils_checks.R
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
R/utils_checks.R (1)
219-227:⚠️ Potential issue | 🟠 MajorNormalize
ISOTOPELABELTYPEbefore enforcing the “max 2 levels” check.The validation currently runs on raw values, so alias mixes like
"H","heavy", and"L"can be incorrectly rejected as “more than two levels” before canonicalization.Proposed fix
- if (data.table::uniqueN(input$ISOTOPELABELTYPE) > 2) { + input$ISOTOPELABELTYPE <- .mapIsotopeLabelType(input$ISOTOPELABELTYPE) + normalized_labels <- stats::na.omit(as.character(input$ISOTOPELABELTYPE)) + if (data.table::uniqueN(normalized_labels) > 2) { getOption("MSstatsLog")( "ERROR", paste( "There are more than two levels of labeling.", "So far, only label-free or reference-labeled experiment are supported. - stop")) stop("Statistical tools in MSstats are only proper for label-free or with reference peptide experiments.") } - - input$ISOTOPELABELTYPE <- .mapIsotopeLabelType(input$ISOTOPELABELTYPE) input }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@R/utils_checks.R` around lines 219 - 227, Normalize ISOTOPELABELTYPE before counting unique levels: call .mapIsotopeLabelType on input$ISOTOPELABELTYPE first (so aliases like "H"/"heavy"/"L" map to canonical values), then perform data.table::uniqueN(...) and the error/stop logic; update the check that currently uses data.table::uniqueN(input$ISOTOPELABELTYPE) to use the mapped value and keep the existing processLogger/error message and stop call intact.
🧹 Nitpick comments (1)
R/utils_checks.R (1)
150-157: Consider case-insensitive, trimmed label mapping for better alias handling.Current mapping is exact-match only; variants like
"Heavy"," light ", or"h"becomeNA. Making this normalization tolerant would reduce avoidable missing labels.Proposed refactor
.mapIsotopeLabelType = function(x) { - label_map <- c( - "H" = "H", - "L" = "L", - "heavy" = "H", - "light" = "L" - ) - mapped <- label_map[as.character(x)] + label_map <- c( + "h" = "H", + "l" = "L", + "heavy" = "H", + "light" = "L" + ) + key <- tolower(trimws(as.character(x))) + mapped <- unname(label_map[key]) factor(mapped, levels = intersect(c("H", "L"), mapped)) }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@R/utils_checks.R` around lines 150 - 157, Normalize and trim input before lookup: create a normalized version of x (e.g., lowercased and whitespace-trimmed) and map common aliases like "h", "l", "heavy", "light" to canonical "H"/"L" via the existing label_map; replace the current lookup (mapped <- label_map[as.character(x)]) with one that uses the normalized values, then produce the factor with levels = intersect(c("H","L"), mapped) as before (referencing label_map, mapped, x, and the factor call).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@R/utils_checks.R`:
- Around line 219-227: Normalize ISOTOPELABELTYPE before counting unique levels:
call .mapIsotopeLabelType on input$ISOTOPELABELTYPE first (so aliases like
"H"/"heavy"/"L" map to canonical values), then perform data.table::uniqueN(...)
and the error/stop logic; update the check that currently uses
data.table::uniqueN(input$ISOTOPELABELTYPE) to use the mapped value and keep the
existing processLogger/error message and stop call intact.
---
Nitpick comments:
In `@R/utils_checks.R`:
- Around line 150-157: Normalize and trim input before lookup: create a
normalized version of x (e.g., lowercased and whitespace-trimmed) and map common
aliases like "h", "l", "heavy", "light" to canonical "H"/"L" via the existing
label_map; replace the current lookup (mapped <- label_map[as.character(x)])
with one that uses the normalized values, then produce the factor with levels =
intersect(c("H","L"), mapped) as before (referencing label_map, mapped, x, and
the factor call).
PR Type
Bug fix, Tests
Description
Normalize
heavy/lightisotope labelsPreserve consistent
H/Lfactor levelsAdd preprocessing edge-case coverage
Verify SRM light/protein outputs
Diagram Walkthrough
File Walkthrough
utils_checks.R
Normalize isotope label aliases during preprocessingR/utils_checks.R
heavy/lighttoH/LmappingISOTOPELABELTYPEfrom normalized valuesH/Lstatestest_dataProcess.R
Strengthen SRM `dataProcess` output assertionsinst/tinytest/test_dataProcess.R
FeatureLevelDatacontainsLlabelsProteinLevelDatais non-emptytest_utils_checks.R
Cover isotope label normalization edge casesinst/tinytest/test_utils_checks.R
PEPTIDEMODIFIEDSEQUENCEheavy/lightnormalization toH/LNAlabels becomeNAMotivation and Context
The MSstats package needed to support explicit "heavy"/"light" IsotopeLabelType encodings in addition to the canonical "H"/"L" values. Previously, the code did not normalize these alternative labels, causing inconsistent factor level handling. This fix introduces a normalization layer that maps these common labels to standard values while ensuring factor levels are consistently set based on the actual data present.
Solution Summary
The PR adds an isotope label mapping mechanism that:
Detailed Changes
R/utils_checks.R
.prepareForDataProcess():if (uniqueN == 2) then c("H","L") else "L")label_mapdictionary mapping:"H"→"H""L"→"L""heavy"→"H""light"→"L"factor(label_map[as.character(input$ISOTOPELABELTYPE)], levels = c("H", "L", "NA"))inst/tinytest/test_utils_checks.R
.prepareForDataProcess()with multiple scenarios:PEPTIDEMODIFIEDSEQUENCEis renamed toPEPTIDESEQUENCEand the original column is removed"heavy"maps to factor level"H"and"light"maps to"L"with exact factor levelsc("H","L")"L"c("H","L")"test") produceNAvaluesc("H","L")preserved in mixed NA/L/H scenariosinst/tinytest/test_dataProcess.R
QuantDataDefault(result ofdataProcess(SRMRawData, ...))FeatureLevelData$LABELcontains light-label ("L") rowsProteinLevelDatais non-empty (nrow(...) > 0)"H") and light ("L") label preservationUnit Tests
Test Coverage Summary
test_utils_checks.R (new file, 92 lines):
test_dataProcess.R (2 additional assertions):
FeatureLevelData$LABELcontains light-label rowsProteinLevelDatanon-empty validationCoding Guidelines
No coding guideline violations detected. The implementation follows R conventions for factor creation and mapping, uses consistent naming conventions (
.prefix for internal functions), and maintains consistency with existing codebase patterns.