fix(checks): support 'heavy'/'light' IsotopeLabelType explicitly by tonywu1999 · Pull Request #189 · Vitek-Lab/MSstats

tonywu1999 · 2026-04-12T20:56:46Z

PR Type

Bug fix, Tests

Description

Normalize heavy/light isotope labels
Preserve consistent H/L factor levels
Add preprocessing edge-case coverage
Verify SRM light/protein outputs

Diagram Walkthrough

flowchart LR
  a["Raw ISOTOPELABELTYPE values"]
  b["Map heavy/light to H/L"]
  c["Create normalized factor levels"]
  d["Validate preprocessing and SRM outputs"]
  a -- "normalize" --> b
  b -- "build" --> c
  c -- "covered by" --> d

File Walkthrough

Relevant files

Bug fix

utils_checks.R (R/utils_checks.R, +10/-6): Normalize isotope label aliases during preprocessing
- Add explicit `heavy`/`light` to `H`/`L` mapping
- Rebuild `ISOTOPELABELTYPE` from normalized values
- Restrict factor levels to present `H`/`L` states

Tests

test_dataProcess.R (inst/tinytest/test_dataProcess.R, +8/-0): Strengthen SRM `dataProcess` output assertions
- Assert `FeatureLevelData` contains `L` labels
- Assert `ProteinLevelData` is non-empty

test_utils_checks.R (inst/tinytest/test_utils_checks.R, +92/-0): Cover isotope label normalization edge cases
- Add rename coverage for `PEPTIDEMODIFIEDSEQUENCE`
- Test `heavy`/`light` normalization to `H`/`L`
- Verify single and mixed label factor levels
- Confirm unknown and `NA` labels become `NA`

Motivation and Context

The MSstats package needed to support explicit "heavy"/"light" IsotopeLabelType encodings in addition to the canonical "H"/"L" values. Previously, the code did not normalize these alternative labels, causing inconsistent factor level handling. This fix introduces a normalization layer that maps these common labels to standard values while ensuring factor levels are consistently set based on the actual data present.

Solution Summary

The PR adds an isotope label mapping mechanism that:

- Converts explicit "heavy" and "light" strings to canonical "H" and "L" values
- Normalizes factor levels to reflect only the labels actually present in the data (instead of forcing levels unconditionally)
- Handles edge cases including unknown labels (mapped to NA) and NA inputs (preserved as NA with appropriate factor levels)

Detailed Changes

R/utils_checks.R

Replaced label handling logic in .prepareForDataProcess():
- Removed the previous conditional factor level assignment (if (uniqueN == 2) then c("H","L") else "L")
- Introduced an explicit label_map dictionary mapping:
  - "H" → "H"
  - "L" → "L"
  - "heavy" → "H"
  - "light" → "L"
- Applied mapping via factor(label_map[as.character(input$ISOTOPELABELTYPE)], levels = intersect(c("H", "L"), label_map[as.character(input$ISOTOPELABELTYPE)]))
Impact: Non-H/L encodings are now normalized; factor levels are restricted to mapped values intersected with canonical levels, eliminating the prior behavior of forcing all non-2-level cases to level "L"
Lines changed: +10/-6 (net +4)
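
A minimal sketch of the mapping described above, written as a standalone function so it can be run outside the package. The name `map_isotope_label` is hypothetical and the body is an illustration built from the bullets in this section, not a copy of the package source.

```r
# Hypothetical stand-in for the mapping described above; not the package source.
map_isotope_label <- function(x) {
    label_map <- c("H" = "H", "L" = "L", "heavy" = "H", "light" = "L")
    # Names that are not in label_map (and NA inputs) index to NA.
    mapped <- unname(label_map[as.character(x)])
    # Keep only canonical levels that actually occur, in H-then-L order.
    factor(mapped, levels = intersect(c("H", "L"), mapped))
}

both <- map_isotope_label(c("heavy", "light", "L"))
only_light <- map_isotope_label(c("light", "light"))
with_unknown <- map_isotope_label(c("H", "test", NA))

print(levels(both))
print(levels(only_light))
print(is.na(with_unknown))
```

Using `intersect()` for the levels means an all-light input ends up with the single level "L", while unknown labels and NA inputs become NA values without adding a level of their own.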

inst/tinytest/test_utils_checks.R

New comprehensive test file covering .prepareForDataProcess() with multiple scenarios:
- Column renaming: Verifies PEPTIDEMODIFIEDSEQUENCE is renamed to PEPTIDESEQUENCE and the original column is removed
- Heavy/light mapping: Confirms "heavy" maps to factor level "H" and "light" maps to "L" with exact factor levels c("H","L")
- Single-label cases: Tests all-"light" and all-"L" inputs produce single factor level "L"
- Mixed labels: Verifies H/L string inputs preserve per-row labels with factor levels c("H","L")
- Unknown labels: Confirms unmapped labels (e.g., "test") produce NA values
- NA handling: Validates NA inputs map to NA, with factor levels c("H","L") preserved in mixed NA/L/H scenarios
Lines changed: +92/-0
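
The scenarios listed above could be written as table-driven checks. The sketch below is an assumption-laden illustration: it uses base R comparisons instead of tinytest, and a local stand-in (`normalize_labels`, a hypothetical name) instead of calling .prepareForDataProcess(), so it does not reproduce the actual test file.

```r
# Local stand-in for the normalization under test (an assumption, not MSstats code).
normalize_labels <- function(x) {
    label_map <- c("H" = "H", "L" = "L", "heavy" = "H", "light" = "L")
    mapped <- unname(label_map[as.character(x)])
    factor(mapped, levels = intersect(c("H", "L"), mapped))
}

# Each case: raw input, expected values, expected factor levels.
cases <- list(
    heavy_light = list(input = c("heavy", "light"), values = c("H", "L"), levels = c("H", "L")),
    all_light   = list(input = c("light", "light"), values = c("L", "L"), levels = "L"),
    all_L       = list(input = c("L", "L"), values = c("L", "L"), levels = "L"),
    unknown     = list(input = c("test", "L"), values = c(NA, "L"), levels = "L"),
    mixed_na    = list(input = c(NA, "L", "H"), values = c(NA, "L", "H"), levels = c("H", "L"))
)

# A case passes only if both the per-row values and the factor levels match.
results <- vapply(cases, function(case) {
    out <- normalize_labels(case$input)
    identical(as.character(out), case$values) &&
        identical(levels(out), case$levels)
}, logical(1))

print(results)
```

Checking values and levels separately matters here because a factor can carry the right values with the wrong level set, or the reverse.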

inst/tinytest/test_dataProcess.R

Strengthened SRM output validation: Added assertions for QuantDataDefault (result of dataProcess(SRMRawData, ...))
- Confirms FeatureLevelData$LABEL contains light-label ("L") rows
- Verifies ProteinLevelData is non-empty (nrow(...) > 0)
Impact: Expands coverage to confirm that light ("L") labels are preserved in the SRM feature-level output and that protein-level summarization returns rows
Lines changed: +8/-0

Unit Tests

Test Coverage Summary

test_utils_checks.R (new file, 92 lines):
- 1 test for PEPTIDEMODIFIEDSEQUENCE renaming
- 7 tests for IsotopeLabelType normalization covering: heavy/light mapping, single labels, mixed H/L, unknown labels, NA propagation, and mixed NA/L/H scenarios
- All tests validate both value-level behavior (including NA propagation) and factor level outcomes
test_dataProcess.R (2 additional assertions):
- Assertion 1: FeatureLevelData$LABEL contains light-label rows
- Assertion 2: ProteinLevelData non-empty validation

Coding Guidelines

No coding guideline violations detected. The implementation follows R conventions for factor creation and mapping, uses consistent naming conventions (. prefix for internal functions), and maintains consistency with existing codebase patterns.

coderabbitai · 2026-04-12T20:56:59Z

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 36b0dc31-6892-4365-a7ca-c31109cc48d1

📥 Commits

Reviewing files that changed from the base of the PR and between 13fadf0 and 7030c00.

📒 Files selected for processing (1)

R/utils_checks.R

📝 Walkthrough

Walkthrough

Introduced an internal helper .mapIsotopeLabelType() that maps various IsotopeLabelType encodings (e.g., "heavy"/"light") to canonical "H"/"L" and uses those mapped values as factor levels in .checkUnProcessedDataValidity() and .prepareForDataProcess(). Added tests covering many IsotopeLabelType scenarios.

Changes

Cohort / File(s)	Summary
Label mapping helper `R/utils_checks.R`	Added `.mapIsotopeLabelType()` and replaced prior `factor()`+binary-level fallback logic with an explicit mapping to canonical `"H"`/`"L"` and restricted factor levels.
Updated tests (existing) `inst/tinytest/test_dataProcess.R`	Expanded assertions for `dataProcess(...)` outputs: check presence of `"L"` in `FeatureLevelData$LABEL` and that `ProteinLevelData` is non-empty.
New tests (coverage) `inst/tinytest/test_utils_checks.R`	New test file exercising `.prepareForDataProcess()` with varied `IsotopeLabelType` inputs: `"heavy"`→`"H"`, `"light"`→`"L"`, mixed `"H"/"L"`, unknown labels→`NA`, and NA propagation; asserts values and factor levels.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

Add unit tests for normalization #188: Related label-handling tests and dataProcess label preservation checks.

Poem

🐰
Hop-hop, I mapped the tags just right,
"heavy" to H and "light" to L in sight,
Tests nibble through cases, bold and spry,
Missing labels hide, mixed levels fly,
A little rabbit cheers — code clarified!

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Description check	⚠️ Warning	The PR description deviates from the template structure, using a custom format with diagrams and file walkthrough tables instead of the required sections.	Restructure the description to follow the template: add 'Motivation and Context' section, provide detailed bullet-point 'Changes' list, describe 'Testing' additions, and complete the checklist.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title clearly and concisely summarizes the main change: adding explicit support for 'heavy'/'light' IsotopeLabelType values, which aligns with the core fix implemented in utils_checks.R.
Docstring Coverage	✅ Passed	No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

github-actions · 2026-04-12T20:58:34Z

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

⏱️ Estimated effort to review: 2 🔵🔵⚪⚪⚪
🧪 PR contains tests
🔒 No security concerns identified
⚡ Recommended focus areas for review

github-actions · 2026-04-12T20:59:41Z

PR Code Suggestions ✨

Explore these optional code suggestions:

Category: Possible issue
Impact: Low

Normalize label values first

Normalize `ISOTOPELABELTYPE` before the lookup. As written, common inputs like `Heavy`, `LIGHT`, or values with surrounding spaces are converted to `NA`, which will silently misclassify valid rows.

R/utils_checks.R [234-243]

+normalized_labels <- tolower(trimws(as.character(input$ISOTOPELABELTYPE)))
 label_map <- c(
-    "H" = "H",
-    "L" = "L",
+    "h" = "H",
+    "l" = "L",
     "heavy" = "H",
     "light" = "L"
 )
+mapped_labels <- unname(label_map[normalized_labels])
 input$ISOTOPELABELTYPE <- factor(
-    label_map[as.character(input$ISOTOPELABELTYPE)],
-    levels = intersect(c("H", "L"), label_map[as.character(input$ISOTOPELABELTYPE)])
+    mapped_labels,
+    levels = intersect(c("H", "L"), mapped_labels)
 )

Suggestion importance[1-10]: 6

Why: This is a valid robustness improvement because the current `label_map` only handles exact `H`/`L`/`heavy`/`light` values, so case or whitespace variants would become `NA`. It could prevent mislabeling in `.prepareForDataProcess`, but the PR itself does not show those variants are currently expected.

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@R/utils_checks.R`:
- Around line 234-243: The label normalization change must be applied to the
raw-input validation path too: update .checkUnProcessedDataValidity to use the
same label_map and factor creation as in .prepareForDataProcess (use
label_map[as.character(input$ISOTOPELABELTYPE)] and set ISOTOPELABELTYPE <-
factor(..., levels = intersect(c("H","L"),
label_map[as.character(input$ISOTOPELABELTYPE)]))) instead of the old
single-level fallback to "L", so single-level values like "heavy" map to "H"
consistently across both preprocessing paths.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: b799e890-dfc9-41e8-ab17-13ff6a9c5ecd

📥 Commits

Reviewing files that changed from the base of the PR and between 0cd7d27 and f4c4189.

📒 Files selected for processing (3)

R/utils_checks.R
inst/tinytest/test_dataProcess.R
inst/tinytest/test_utils_checks.R

coderabbitai

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

R/utils_checks.R (1)

219-227: ⚠️ Potential issue | 🟠 Major

Normalize ISOTOPELABELTYPE before enforcing the “max 2 levels” check.

The validation currently runs on raw values, so alias mixes like "H", "heavy", and "L" can be incorrectly rejected as “more than two levels” before canonicalization.

Proposed fix

-    if (data.table::uniqueN(input$ISOTOPELABELTYPE) > 2) {
+    input$ISOTOPELABELTYPE <- .mapIsotopeLabelType(input$ISOTOPELABELTYPE)
+    normalized_labels <- stats::na.omit(as.character(input$ISOTOPELABELTYPE))
+    if (data.table::uniqueN(normalized_labels) > 2) {
         getOption("MSstatsLog")(
           "ERROR",  paste(
             "There are more than two levels of labeling.",
             "So far, only label-free or reference-labeled experiment are supported. - stop"))
         stop("Statistical tools in MSstats are only proper for label-free or with reference peptide experiments.")
     }
-    
-    input$ISOTOPELABELTYPE <- .mapIsotopeLabelType(input$ISOTOPELABELTYPE)
     input
 }

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@R/utils_checks.R` around lines 219 - 227, Normalize ISOTOPELABELTYPE before
counting unique levels: call .mapIsotopeLabelType on input$ISOTOPELABELTYPE
first (so aliases like "H"/"heavy"/"L" map to canonical values), then perform
data.table::uniqueN(...) and the error/stop logic; update the check that
currently uses data.table::uniqueN(input$ISOTOPELABELTYPE) to use the mapped
value and keep the existing processLogger/error message and stop call intact.
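
To make the ordering issue concrete, here is a small sketch that counts distinct labels before and after normalization. It assumes the mapping quoted in this review; `map_isotope_label` is a hypothetical stand-in for .mapIsotopeLabelType(), and base `unique()` is used in place of data.table::uniqueN() to keep the example dependency-free.

```r
# Hypothetical stand-in for .mapIsotopeLabelType(), based on the mapping quoted in this review.
map_isotope_label <- function(x) {
    label_map <- c("H" = "H", "L" = "L", "heavy" = "H", "light" = "L")
    mapped <- unname(label_map[as.character(x)])
    factor(mapped, levels = intersect(c("H", "L"), mapped))
}

raw <- c("H", "heavy", "L")

# Counting raw strings sees three distinct labels and would trip the "> 2" guard.
raw_count <- length(unique(raw))

# Counting after normalization (NA dropped) sees only the two canonical labels.
normalized <- stats::na.omit(as.character(map_isotope_label(raw)))
normalized_count <- length(unique(normalized))

print(c(raw = raw_count, normalized = normalized_count))
```

One consequence worth keeping in mind: under this mapping unknown labels become NA, so a count taken after normalization cannot exceed two. Whether unknown labels should instead raise an error is a separate decision that this sketch does not address.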

🧹 Nitpick comments (1)

R/utils_checks.R (1)

150-157: Consider case-insensitive, trimmed label mapping for better alias handling.

Current mapping is exact-match only; variants like "Heavy", " light ", or "h" become NA. Making this normalization tolerant would reduce avoidable missing labels.

Proposed refactor

 .mapIsotopeLabelType = function(x) {
-    label_map <- c(
-        "H"     = "H",
-        "L"     = "L",
-        "heavy" = "H",
-        "light" = "L"
-    )
-    mapped <- label_map[as.character(x)]
+    label_map <- c(
+        "h"     = "H",
+        "l"     = "L",
+        "heavy" = "H",
+        "light" = "L"
+    )
+    key <- tolower(trimws(as.character(x)))
+    mapped <- unname(label_map[key])
     factor(mapped, levels = intersect(c("H", "L"), mapped))
 }

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@R/utils_checks.R` around lines 150 - 157, Normalize and trim input before
lookup: create a normalized version of x (e.g., lowercased and
whitespace-trimmed) and map common aliases like "h", "l", "heavy", "light" to
canonical "H"/"L" via the existing label_map; replace the current lookup (mapped
<- label_map[as.character(x)]) with one that uses the normalized values, then
produce the factor with levels = intersect(c("H","L"), mapped) as before
(referencing label_map, mapped, x, and the factor call).
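
The tolerant lookup proposed in this nitpick can be tried in isolation. The sketch below follows the proposed refactor; the function name `map_isotope_label_tolerant` is hypothetical, and whether case or whitespace variants actually occur in real inputs is not established by this PR.

```r
# Sketch of the tolerant lookup proposed above; the function name is hypothetical.
map_isotope_label_tolerant <- function(x) {
    label_map <- c("h" = "H", "l" = "L", "heavy" = "H", "light" = "L")
    # Lowercase and strip surrounding whitespace so "Heavy" and " light " match.
    key <- tolower(trimws(as.character(x)))
    # unname() drops the lookup keys, which would otherwise be carried as names.
    mapped <- unname(label_map[key])
    factor(mapped, levels = intersect(c("H", "L"), mapped))
}

tolerant <- map_isotope_label_tolerant(c("Heavy", " light ", "h", "L", "test"))
print(as.character(tolerant))
print(levels(tolerant))
```

Labels that are still unrecognized after lowercasing and trimming, such as "test", continue to map to NA.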

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 99a28fb7-f0a4-4f41-8392-d57a313d28f6

📥 Commits

Reviewing files that changed from the base of the PR and between f4c4189 and 13fadf0.

📒 Files selected for processing (1)

R/utils_checks.R

tonywu1999 added 4 commits April 12, 2026 16:37

fix(checks): support 'heavy'/'light' IsotopeLabelType explicitly

f1892bd

added unit tests

2633317

fix unit tests

b0f8c8e

really fix unit tests

f4c4189

tonywu1999 changed the title ~~Feat turnover 1~~ fix(checks): support 'heavy'/'light' IsotopeLabelType explicitly Apr 12, 2026

github-actions Bot added the Review effort 2/5 label Apr 12, 2026

coderabbitai Bot reviewed Apr 12, 2026

Comment thread R/utils_checks.R Outdated

address coderabbit comment

13fadf0

coderabbitai Bot reviewed Apr 12, 2026

address coderabbit comment 2

7030c00

tonywu1999 merged commit d9418ed into devel Apr 12, 2026
2 checks passed

tonywu1999 deleted the feat-turnover-1 branch April 12, 2026 21:21

coderabbitai Bot mentioned this pull request Apr 14, 2026

feat(plotting): Refactor QC plot headings to be compatible for protein turnover #194

Merged

tonywu1999 merged 6 commits into devel from feat-turnover-1
