feat(plotting): Refactor QC plot headings to be compatible for protein turnover by tonywu1999 · Pull Request #194 · Vitek-Lab/MSstats

tonywu1999 · 2026-04-14T20:00:16Z

PR Type

Bug fix, Enhancement

Description

Preserve original turnover groups
Use is_labeled_ref in plots
Support Heavy/Light label fallback
Fix QC group axis handling

Diagram Walkthrough

flowchart LR
  A["Validated turnover input"]
  B["Original GROUP and SUBJECT kept"]
  C["Plot labels derived from metadata"]
  D["QC and turnover plots corrected"]
  A -- "preserve metadata" --> B
  B -- "drives plot grouping" --> D
  C -- "uses `is_labeled_ref` or fallback labels" --> D

File Walkthrough

Relevant files

Bug fix

dataProcessPlots.R `Improve turnover label and group plotting` R/dataProcessPlots.R Prefer `is_labeled_ref` for `LABEL` mapping Add fallback `Heavy`/`Light` label handling Update single-label naming and colors Keep all `GROUP` values in axes	+26/-23
utils_checks.R `Retain original grouping in validation` R/utils_checks.R Preserve `GROUP_ORIGINAL` as `GROUP` Preserve `SUBJECT_ORIGINAL` as `SUBJECT` Remove hard-coded `"0"` turnover assignments	+2/-2

Motivation and Context

This PR adjusts plotting functions to properly handle protein turnover experiments, which differ from SRM (Selected Reaction Monitoring) experiments in their use of isotope labels. In protein turnover studies, both heavy (H) and light (L) labeled channels represent independent biological measurements and are processed independently. The PR refactors data preparation and plotting logic to support this distinction through the introduction of an is_labeled_ref column that marks whether a row represents a labeled reference (used in SRM) versus endogenous labeled channels (used in turnover studies).

Detailed Changes

R/utils_checks.R - Data Column Preparation

Removed conditional GROUP/SUBJECT assignment: Previously, the function conditionally assigned GROUP and SUBJECT based on LABEL == "L", using a hardcoded sentinel value "0" for non-"L" labels
Simplified logic: Now directly assigns GROUP and SUBJECT from their original values (GROUP_ORIGINAL and SUBJECT_ORIGINAL respectively)
This eliminates the "0" sentinel value that was preventing proper group numbering in downstream processing

R/dataProcessPlots.R - Profile and QC Plotting Functions

Added is_labeled_ref column detection: Both .plotProfile and .plotQC functions now check for the presence of the is_labeled_ref column
- If present: LABEL is remapped to "Reference" (for TRUE values) and "Endogenous" (for FALSE values) with appropriate color coding
- If absent: Falls back to legacy label detection logic that maps labels based on their presence and count
Simplified tempGroupName construction: Removed conditional logic that filtered based on length(unique(processed$LABEL)) == 2 and GROUP == '0'; now always orders by RUN before deriving group levels and factoring GROUP
Applied consistent LABEL handling: Both plot functions use the same LABEL remapping logic for consistency

Unit Tests

Existing tests in inst/tinytest/test_dataProcess.R validate the is_labeled_ref column behavior:

Tests verify that when is_labeled_ref=TRUE is present, protein summarization uses the labeled reference path (via .runTukey with is_labeled_reference=TRUE)
Tests confirm that labeled reference rows (H channel in SRM) are not imputed despite being censored
Tests validate that labeled endogenous rows (L channel) receive proper imputation when censored

Coding Guidelines

No coding guideline violations detected. The changes maintain consistency with the existing codebase patterns for data manipulation using data.table, conditional column operations, and factor level assignments.

coderabbitai · 2026-04-14T20:00:31Z

📝 Walkthrough

Walkthrough

The PR refactors label and group handling logic across two utility files. Changes simplify tempGroupName construction, replace conditional label remapping with a column-aware approach (checking for is_labeled_ref column), and remove conditional derivation of GROUP and SUBJECT from LABEL values in the column processing function.

Changes

Cohort / File(s)	Summary
Label Remapping Logic `R/dataProcessPlots.R`	Replaced conditional `LABEL` remapping with column-aware approach: if `is_labeled_ref` column exists, map to "Reference"/"Endogenous"; otherwise detect levels and map to "Heavy"/"Light" or "Endogenous"/"Heavy" based on detected labels. Simplified `tempGroupName` construction by removing conditional filtering. Applied changes to both `.plotProfile()` and `.plotQC()` functions.
Column Processing Updates `R/utils_checks.R`	Modified `.updateColumnsForProcessing()` to directly assign `GROUP` and `SUBJECT` from their `_ORIGINAL` counterparts instead of deriving them conditionally from `LABEL == "L"`, removing the sentinel `"0"` value injection for non-`"L"` labels.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

Vitek-Lab/MSstats#189: Modifies isotope label normalization and heavy/light label mapping downstream of label assignment logic.
Vitek-Lab/MSstats#192: Directly aligned refactor introducing is_labeled_reference column handling and replacing LABEL-based conditionals throughout the codebase.

Suggested labels

Review effort 3/5

Poem

🐰 Labels were tangled, conditions were deep,
Now columns speak clearly, no secrets to keep,
A "Reference" here, "Endogenous" there,
The logic flows cleaner through processing air! ✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Description check	⚠️ Warning	The PR description uses a custom format instead of the required template, lacks detailed motivation/context, and leaves the 'Changes' section empty despite significant code modifications.	Restructure the description following the template: add comprehensive 'Motivation and Context' section, replace the diagram with a detailed bullet-point 'Changes' list, add 'Testing' subsection describing unit tests, and complete the pre-review checklist.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Docstring Coverage	✅ Passed	No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Title check	✅ Passed	The title 'Refactor QC plot headings to be compatible for protein turnover' is partially related to the changeset. It specifically references QC plot headings and protein turnover compatibility, which align with some aspects of the changes (particularly in .plotQC modifications). However, the actual changes are broader—they refactor GROUP/SUBJECT derivation logic, LABEL remapping based on is_labeled_ref detection, and affect both profile and QC plotting functions. The title narrows focus to 'QC plot headings' when the changes span multiple plotting functions and core label handling logic.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch feat-turnover-final

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

github-actions · 2026-04-14T20:02:24Z

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

⏱️ Estimated effort to review: 2 🔵🔵⚪⚪⚪
🧪 No relevant tests
🔒 No security concerns identified
⚡ Recommended focus areas for review Missing labels When `is_labeled_ref` is present but some rows are `NA`, this branch overwrites `LABEL` with `NA` instead of falling back to the existing label values. In partially annotated turnover inputs, those rows can drop out of the plot or lose their legend/color assignment. if ("is_labeled_ref" %in% colnames(processed)) { processed[, LABEL := factor( ifelse(is_labeled_ref, "Reference", "Endogenous"), levels = c("Reference", "Endogenous") )] Fallback relabel The two-label fallback renames any pair of label values to `Heavy`/`Light` purely from factor level order. If an input already uses semantic labels such as `Reference` and `Endogenous` but does not include `is_labeled_ref`, the plot legend will be wrong and the series can even be swapped by alphabetical ordering. label_levels = levels(factor(processed$LABEL)) if (length(label_levels) == 2) { processed[, LABEL := factor(LABEL, labels = c("Heavy", "Light"))] } else if ("L" %in% label_levels) {

github-actions · 2026-04-14T20:04:14Z

PR Code Suggestions ✨

Explore these optional code suggestions:

Category	Suggestion	Impact
Possible issue	Guard label remapping `length(label_levels) == 2` will relabel any two-valued input as `Heavy/Light`, which silently mislabels older `Reference/Endogenous` datasets when `is_labeled_ref` is absent. Branch on the actual raw `LABEL` values instead, and keep unknown schemes unchanged rather than forcing them into turnover labels. Mirror the same fix in the later `LABEL` normalization block as well. R/dataProcessPlots.R [295-302] -label_levels = levels(factor(processed$LABEL)) -if (length(label_levels) == 2) { - processed[, LABEL := factor(LABEL, labels = c("Heavy", "Light"))] -} else if ("L" %in% label_levels) { - processed[, LABEL := factor(LABEL, labels = c("Endogenous"))] +label_levels = sort(unique(stats::na.omit(as.character(processed$LABEL)))) +if (identical(label_levels, c("H", "L"))) { + processed[, LABEL := factor(as.character(LABEL), + levels = c("H", "L"), + labels = c("Heavy", "Light"))] +} else if (identical(label_levels, "L")) { + processed[, LABEL := factor("Endogenous", levels = "Endogenous")] +} else if (identical(label_levels, "H")) { + processed[, LABEL := factor("Heavy", levels = "Heavy")] } else { - processed[, LABEL := factor(LABEL, labels = c("Heavy"))] + processed[, LABEL := factor(as.character(LABEL), levels = label_levels)] } Suggestion importance[1-10]: 7 __ Why: This correctly points out that `length(label_levels) == 2` is too broad and can mislabel any two-valued `LABEL` scheme as `Heavy/Light`. It addresses a real plotting/legend correctness issue in `dataProcessPlots`, though its impact is limited to label interpretation rather than core computation.	Medium
Possible issue	Skip placeholder axis group `tempGroupName` now includes every `GROUP`, so legacy labeled inputs that still carry placeholder group `0` will produce a fake x-axis group and shift `groupAxis`/`lineNameAxis`. Exclude that placeholder only when multiple labels are present so the plot remains correct for both old and new data layouts. R/dataProcessPlots.R [264-265] tempGroupName = unique(processed[, c("GROUP", "RUN")]) +if (data.table::uniqueN(stats::na.omit(processed$LABEL)) > 1 && + "0" %in% as.character(tempGroupName$GROUP)) { + tempGroupName = tempGroupName[as.character(GROUP) != "0"] +} tempGroupName = tempGroupName[order(RUN), ] Suggestion importance[1-10]: 3 __ Why: The suggestion is plausible as a backward-compatibility safeguard because a placeholder `GROUP` value of `0` would distort `groupAxis` and `lineNameAxis`. However, this PR also changes `R/utils_checks.R` so `GROUP` now comes directly from `GROUP_ORIGINAL`, making the issue mostly defensive rather than clearly necessary in the new flow.	Low

tonywu1999 added 4 commits April 14, 2026 15:26

remove the hard-coded "0" for GROUP

0748d48

change group number to process qc plots properly

eb86e66

use is_labeled_ref

241c30e

adjust plots for protein turnover

7483f5e

tonywu1999 changed the title ~~Feat turnover final~~ feat(plotting): Rename QC plot and profile plot headings for protein turnover Apr 14, 2026

github-actions Bot changed the title ~~feat(plotting): Rename QC plot and profile plot headings for protein turnover~~ Feat turnover final Apr 14, 2026

github-actions Bot added the Review effort 2/5 label Apr 14, 2026

fix srm test file

d2df267

tonywu1999 changed the title ~~Feat turnover final~~ feat(plotting): Rename QC plot headings for protein turnover Apr 14, 2026

tonywu1999 changed the title ~~feat(plotting): Rename QC plot headings for protein turnover~~ feat(plotting): Refactor QC plot headings to be compatible for protein turnover Apr 14, 2026

fix profile plots

3a986ab

tonywu1999 merged commit f07eeeb into devel Apr 14, 2026
2 checks passed

tonywu1999 deleted the feat-turnover-final branch April 14, 2026 20:36

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(plotting): Refactor QC plot headings to be compatible for protein turnover#194

feat(plotting): Refactor QC plot headings to be compatible for protein turnover#194
tonywu1999 merged 6 commits intodevelfrom
feat-turnover-final

tonywu1999 commented Apr 14, 2026 •

edited

Loading

Uh oh!

coderabbitai Bot commented Apr 14, 2026 •

edited

Loading

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Suggested labels

Poem

❌ Failed checks (1 warning)

Uh oh!

github-actions Bot commented Apr 14, 2026

Uh oh!

github-actions Bot commented Apr 14, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

tonywu1999 commented Apr 14, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Type

Description

Diagram Walkthrough

File Walkthrough

Motivation and Context

Detailed Changes

R/utils_checks.R - Data Column Preparation

R/dataProcessPlots.R - Profile and QC Plotting Functions

Unit Tests

Coding Guidelines

Uh oh!

coderabbitai Bot commented Apr 14, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Suggested labels

Poem

❌ Failed checks (1 warning)

Uh oh!

github-actions Bot commented Apr 14, 2026

PR Reviewer Guide 🔍

Uh oh!

github-actions Bot commented Apr 14, 2026

PR Code Suggestions ✨

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

tonywu1999 commented Apr 14, 2026 •

edited

Loading

coderabbitai Bot commented Apr 14, 2026 •

edited

Loading