Ylim down#131
Conversation
📝 WalkthroughWalkthroughUpdates y-axis limit calculation for Profile/QC plots from hardcoded zeros to data-dependent values computed as Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~12 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 inconclusive)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
man/MSstatsPTM.Rd (1)
1-2:⚠️ Potential issue | 🔴 CriticalRevert this edit to
man/MSstatsPTM.Rdand add the URL toR/MSstatsPTM.Rinstead.This file is auto-generated by roxygen2 and must not be edited manually. The header explicitly directs you to edit
R/MSstatsPTM.R. Manual changes will be lost when documentation is regenerated.To properly add the website URL:
- Add it to the roxygen2 comments in
R/MSstatsPTM.R(in a@seealsosection)- Run
roxygen2::roxygenize()to regenerateman/MSstatsPTM.Rd- Commit the updated
R/MSstatsPTM.Randman/MSstatsPTM.Rd(auto-generated by roxygen2)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@man/MSstatsPTM.Rd` around lines 1 - 2, Revert the manual edits to man/MSstatsPTM.Rd (restore the version generated by roxygen2) and instead add the website URL into the roxygen comments of R/MSstatsPTM.R (for example in a `@seealso` block or other appropriate roxygen tag), then run roxygen2::roxygenize() to regenerate man/MSstatsPTM.Rd and commit the updated R/MSstatsPTM.R and the auto-generated man/MSstatsPTM.Rd; do not edit the .Rd file by hand.
🧹 Nitpick comments (3)
R/utils_dataProcessPlots.R (2)
875-880: Minor inconsistency: gate useslength(data.table.list) == 4here butplot_globalin the profile wrappers.
.profile.tmt(Line 556) and.profile.lf(Line 1380) gate the samec(ptm, protein)decision onplot_global, while.qc.tmt(Line 878) and.qc.lf(Line 1677) re-derive it fromlength(data.table.list) == 4. Functionally equivalent today, but it would be cleaner to computeplot_globalonce near the top of each wrapper (before this block) and use it consistently — either as part of the helper suggested above, or by hoisting the existingplot_global = length(...) == 4assignment up. That avoids future drift if one branch changes how "global protein data is present" is determined.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@R/utils_dataProcessPlots.R` around lines 875 - 880, Compute and reuse a single boolean "plot_global" flag instead of re-checking length(data.table.list) == 4 in multiple places: add a plot_global <- (length(data.table.list) == 4) near the top of each wrapper (.profile.tmt, .qc.tmt, .profile.lf, .qc.lf) and replace the inline checks (e.g., the expression used when building all_abund) with that plot_global variable so both c(ptm, protein) gating and other branches consistently use plot_global.
553-558: Recommended refactor: extract the duplicatedy.limdownauto-calculation into a small helper.The same two-line
all_abund/floor(min(...) - 3)pattern is repeated in.profile.tmt(Lines 553-558),.qc.tmt(Lines 875-880),.profile.lf(Lines 1377-1382), and.qc.lf(Lines 1674-1679). Centralizing it would keep the four wrappers in lock-step if the formula ever changes (e.g., switching the-3margin or the clamp), and would make the intent named/searchable.♻️ Example helper
# `@noRd` .compute.ylim.down = function(ylimDown, ptm_abund, protein_abund = NULL) { if (is.numeric(ylimDown)) return(ylimDown) all_abund = if (!is.null(protein_abund)) c(ptm_abund, protein_abund) else ptm_abund if (!length(all_abund) || all(is.na(all_abund))) return(0) max(floor(min(all_abund, na.rm = TRUE) - 3), 0) }Then each wrapper becomes a single call, e.g.:
- if (is.numeric(ylimDown)) { - y.limdown = ylimDown - } else { - all_abund = if (plot_global) c(datafeature.ptm$ABUNDANCE, datafeature.protein$ABUNDANCE) else datafeature.ptm$ABUNDANCE - y.limdown = max(floor(min(all_abund, na.rm = TRUE) - 3), 0) - } + y.limdown = .compute.ylim.down( + ylimDown, + datafeature.ptm$ABUNDANCE, + if (plot_global) datafeature.protein$ABUNDANCE else NULL + )As a side effect this also handles the edge case where
all_abundis empty or all-NA (currentlymin(..., na.rm = TRUE)would emit a warning and returnInf, producingy.limdown = Inf).🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@R/utils_dataProcessPlots.R` around lines 553 - 558, Extract the duplicated y.limdown auto-calculation into a small helper (e.g. .compute.ylim.down) and replace the repeated blocks in .profile.tmt, .qc.tmt, .profile.lf and .qc.lf with a single call to that helper; the helper should accept ylimDown, ptm_abund and optional protein_abund, return ylimDown immediately if numeric, combine ptm and protein abundance when provided, handle empty/all-NA vectors by returning 0, and otherwise compute max(floor(min(all_abund, na.rm = TRUE) - 3), 0) so callers like the code that previously set y.limdown follow the same behavior without duplication.R/dataProcessPlotsPTM.R (1)
31-33: Doc update reads correctly and matches the implementation.The new
ylimDowndefault description aligns withmax(floor(min(all_abund, na.rm = TRUE) - 3), 0)inR/utils_dataProcessPlots.R.One small nit: in
.format.data.process.plots,ABUNDANCEis taken either fromINTENSITY(withlog2()applied) or from already-logged columns likeLOGINTENSITIES/LOG2INTENSITY(lines 91-98 ofR/utils_dataProcessPlots.R), so calling them strictly "log2(intensities)" is slightly imprecise for theLOGINTENSITIESpath. Consider "minimum log-intensity" to cover both cases.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@R/dataProcessPlotsPTM.R` around lines 31 - 33, Update the wording for ylimDown to use "minimum log-intensity" instead of "minimum log2(intensities)" to accurately reflect that ABUNDANCE in .format.data.process.plots is derived either by log2(INTENSITY) or taken from already-logged columns (LOGINTENSITIES / LOG2INTENSITY); change the param description in R/dataProcessPlotsPTM.R and any related user-facing docs to mention "minimum log-intensity" so it covers both paths, referencing .format.data.process.plots and the ABUNDANCE assignment logic.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@man/MSstatsPTM.Rd`:
- Around line 1-2: Revert the manual edits to man/MSstatsPTM.Rd (restore the
version generated by roxygen2) and instead add the website URL into the roxygen
comments of R/MSstatsPTM.R (for example in a `@seealso` block or other appropriate
roxygen tag), then run roxygen2::roxygenize() to regenerate man/MSstatsPTM.Rd
and commit the updated R/MSstatsPTM.R and the auto-generated man/MSstatsPTM.Rd;
do not edit the .Rd file by hand.
---
Nitpick comments:
In `@R/dataProcessPlotsPTM.R`:
- Around line 31-33: Update the wording for ylimDown to use "minimum
log-intensity" instead of "minimum log2(intensities)" to accurately reflect that
ABUNDANCE in .format.data.process.plots is derived either by log2(INTENSITY) or
taken from already-logged columns (LOGINTENSITIES / LOG2INTENSITY); change the
param description in R/dataProcessPlotsPTM.R and any related user-facing docs to
mention "minimum log-intensity" so it covers both paths, referencing
.format.data.process.plots and the ABUNDANCE assignment logic.
In `@R/utils_dataProcessPlots.R`:
- Around line 875-880: Compute and reuse a single boolean "plot_global" flag
instead of re-checking length(data.table.list) == 4 in multiple places: add a
plot_global <- (length(data.table.list) == 4) near the top of each wrapper
(.profile.tmt, .qc.tmt, .profile.lf, .qc.lf) and replace the inline checks
(e.g., the expression used when building all_abund) with that plot_global
variable so both c(ptm, protein) gating and other branches consistently use
plot_global.
- Around line 553-558: Extract the duplicated y.limdown auto-calculation into a
small helper (e.g. .compute.ylim.down) and replace the repeated blocks in
.profile.tmt, .qc.tmt, .profile.lf and .qc.lf with a single call to that helper;
the helper should accept ylimDown, ptm_abund and optional protein_abund, return
ylimDown immediately if numeric, combine ptm and protein abundance when
provided, handle empty/all-NA vectors by returning 0, and otherwise compute
max(floor(min(all_abund, na.rm = TRUE) - 3), 0) so callers like the code that
previously set y.limdown follow the same behavior without duplication.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: a1abe8c0-ed7d-45b8-a6c3-c49c8eaf44a2
📒 Files selected for processing (4)
R/dataProcessPlotsPTM.RR/utils_dataProcessPlots.Rman/MSstatsPTM.Rdman/dataProcessPlotsPTM.Rd
Motivation and Context
This PR modifies the default behavior for the
ylimDownparameter in thedataProcessPlotsPTMfunction, which controls the lower y-axis limit for Profile and QC plots. Previously, whenylimDownwas not explicitly specified (defaultFALSE), plots automatically used a lower limit of 0. This change implements a data-driven lower limit calculation that adapts to the actual intensity ranges in the dataset, specifically using the minimum log2-transformed abundance value (log2 intensity) minus 3, floored and clamped at 0 to prevent negative axis limits.Changes Made
Documentation Updates (
R/dataProcessPlotsPTM.Rand generatedman/dataProcessPlotsPTM.Rd):ylimDownparameter documentation to describe the new default behavior: "floor of minimum log2(intensities) after normalization - 3, or 0 if that value is negative"Core Logic Implementation (
R/utils_dataProcessPlots.R):.profile.tmt,.qc.tmt,.profile.lf,.qc.lf) to computey.limdowndynamically whenylimDownis not provided as a numeric valueylimDown = FALSE(default), the new calculation is:y.limdown = max(floor(min(all_abund, na.rm = TRUE) - 3), 0)all_abundvector includes both PTM and protein abundances when protein/global data is present, otherwise only PTM abundancesylimDownis explicitly provided as a numeric value, the behavior remains unchanged (uses the provided value)Package Documentation (
man/MSstatsPTM.Rd):https://vitek-lab.github.io/MSstatsPTM/)Minor Punctuation Fix: One commit addresses a minor punctuation issue
Unit Tests
No new unit tests were added or modified to verify these changes. The existing test suite (
inst/tinytest/) contains tests for data summarization, conversion utilities, and parameter validation checks, but does not include tests for the plotting functionality (dataProcessPlotsPTM, ProfilePlot, or QCPlot).Coding Guidelines Violations
Minor inconsistencies in code style:
Ternary-style if expressions: The code uses
all_abund = if (condition) value1 else value2syntax, which is unconventional for R style (more common in functional programming languages). This pattern is applied consistently across all four modified functions but may not align with traditional R coding guidelines that typically prefer explicit conditional statements.Inconsistent protocol detection: Different functions use different methods to detect whether protein data is present: some use
plot_globalvariable while others uselength(data.table.list) == 4. Both approaches are functionally equivalent but represent inconsistent code patterns within the same file.