Skip to content

feat(protein turnover): Create initial protein turnover functions for ratio calculation#7

Merged
sszvetecz merged 4 commits intodevelfrom
main
Apr 9, 2026
Merged

feat(protein turnover): Create initial protein turnover functions for ratio calculation#7
sszvetecz merged 4 commits intodevelfrom
main

Conversation

@tonywu1999
Copy link
Copy Markdown
Contributor

@tonywu1999 tonywu1999 commented Apr 3, 2026

Motivation and Context

This PR adds foundational protein turnover functionality to MSstatsResponse, enabling conversion of MSstats FeatureLevelData into heavy/light (H/L) turnover ratios and integrating support for precomputed ratios across the dose-response, IC50 prediction, and visualization code paths. This lets users fit isotonic/IC50 models directly on ratio data (e.g., tracer incorporation experiments) and supports bidirectional (increasing/decreasing) model selection.

Detailed Changes

  • New protein-turnover helper and utilities

    • Added calculateTurnoverRatios(feature_data, channel_col = "LABEL", heavy_label = "H", light_label = "L", time_col = "GROUP", peptide_col = "PEPTIDE", protein_col = "PROTEIN", intensity_col = "INTENSITY", run_col = "RUN", peptide_selector = NULL, agg_function = max, normalize_tracer = FALSE, tracer_constants = NULL)
      • Validates/standardizes input columns to Protein, BaseSequence (strips bracketed suffixes), Label, TimeVal (via parse_timepoint), Intensity, Run
      • Filters to heavy/light labels, drops unparseable timepoints
      • Optional peptide selection via peptide_selector (grouped group_modify)
      • Aggregates duplicates on (Protein, BaseSequence, TimeVal, Run, Label) using agg_function (default max with na.rm=TRUE)
      • Pivots heavy/light to wide, computes Total, H_frac, L_frac
      • Optional tracer normalization when normalize_tracer=TRUE using provided tracer_constants
      • Returns data frame with Protein, BaseSequence, TimeVal, Run, Heavy, Light, Total, H_frac, L_frac (or empty data.frame() with warning if nothing matches)
    • Added selectTopNPeptides(df, n = 10): example peptide selector that returns top-n BaseSequence per Protein by summed intensity
    • Added non-exported parse_timepoint(time_strings): parses numeric hours from strings, converts days/weeks to hours
  • Isotonic regression / dose-response changes

    • doseResponseFit(): supports increasing parameter as logical or "both"
      • When "both", fits decreasing and increasing isotonic models per protein, applies BH FDR per direction, selects direction with larger F_statistic (ties → decreasing), returns single row per protein with new direction column
      • Added precalculated_ratios = FALSE parameter; when TRUE, skips internal ratio computation
      • Output now includes direction column
    • fitIsotonicRegression(): added precalculated_ratios parameter and returned model list includes the flag
  • IC50 prediction & bootstrapping

    • predictIC50():
      • increasing accepts "both"
      • Added precalculated_ratios parameter; target_response default changed 0.5 → NULL with auto-derivation (0.5 for decreasing, 1.5 for increasing); error if precalculated_ratios=TRUE and target_response not provided
      • Returns direction column
    • IC50_prediction.R:
      • Baseline/target-response logic conditional on precalculated_ratios (uses mean(y[x==0]) when precalculated)
      • Supports increasing="both" by fitting both directions and selecting prediction (default to decreasing when both succeed)
      • Added bootstrapIC50Precalculated(dose, response, n_samples = 1000, alpha = 0.10, increasing = FALSE, target_response = 0.5, transform_x = TRUE) for bootstrapping on precomputed ratio responses; integrated into bootstrap control flow
  • Visualization

    • visualizeResponseProtein(): added precalculated_ratios, color_by, x_lab, y_lab, title; increasing supports "both"; default alpha changed 0.05 → 0.10; fits both directions when requested and selects by SSE; passes increasing_final and precalculated_ratios to predictIC50
    • plotIsotonic(): added precalculated_ratios, color_by, original_data; x_lab/y_lab default to NULL with internal defaults; target_response default set explicitly to 0.5; conditional point coloring when color_by + original_data provided; adjusted IC50 target computation to use local adjusted_target and bypass DMSO-based log2 adjustment when precalculated_ratios or ratio enabled; CI bands drawn only when ci contains non-missing ci_lower and ci_upper
  • Package metadata, namespace, docs, vignette

    • NAMESPACE: exports calculateTurnoverRatios and selectTopNPeptides; added imports: dplyr::pull/rename/slice_head/ungroup, stringr::str_detect/str_extract/str_remove_all, tidyr::pivot_wider
    • DESCRIPTION: added tidyr and stringr to Imports
    • Added roxygen-generated man pages: calculateTurnoverRatios, selectTopNPeptides, bootstrapIC50Precalculated, parse_timepoint; updated man pages for doseResponseFit, predictIC50, visualizeResponseProtein, plotIsotonic, fitIsotonicRegression
    • Added vignette vignettes/Protein_turnover_vignettes.Rmd demonstrating a complete turnover workflow (data preprocessing, calculateTurnoverRatios with tracer normalization, formatting output for MSstatsResponse, and visualization)

Unit Tests

  • No new unit tests were added for:
    • calculateTurnoverRatios, selectTopNPeptides, parse_timepoint
    • new precalculated_ratios mode paths in doseResponseFit / fitIsotonicRegression / predictIC50 / IC50 bootstrap functions
    • visualization paths exercising precalculated_ratios and increasing="both"
  • Existing test files (e.g., test-PredictIC50.R, test_Isotonic_Regression.R, test-IC50_prediction.R, test-Visualize_Isotonic_Fit.R) remain but do not cover the new turnover helpers or the new parameter modes.

Coding Guidelines / Issues

  • Primary code-quality gap: missing unit tests for all newly added functionality and new code paths (protein-turnover helpers, parse_timepoint, selectTopNPeptides, precalculated_ratios modes, bootstrapIC50Precalculated, increasing="both" branches in fitting/visualization). These should be added before merging.
  • Other notes (non-blocking):
    • New package Imports (tidyr, stringr) and additional dplyr symbols are consistent with tidyverse usage in the project but increase surface area—verify namespace collisions and NAMESPACE ordering if necessary.
    • Behavior change: predictIC50 now errors when precalculated_ratios=TRUE and target_response not provided; ensure downstream callers (including vignette/examples) supply target_response or accept the new behavior.

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Apr 3, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 03131f98-7d7f-45f4-ab68-6fc7fc3d64e0

📥 Commits

Reviewing files that changed from the base of the PR and between cd1cdd7 and 61d206b.

📒 Files selected for processing (3)
  • DESCRIPTION
  • NAMESPACE
  • man/DoseResponseFit.Rd
✅ Files skipped from review due to trivial changes (2)
  • DESCRIPTION
  • NAMESPACE
🚧 Files skipped from review as they are similar to previous changes (1)
  • man/DoseResponseFit.Rd

📝 Walkthrough

Walkthrough

Adds bidirectional isotonic fitting with per-protein model selection, a precalculated_ratios mode propagated through fitting/IC50/visualization, and a new protein turnover module (ratio calculation + peptide selection) with vignette and docs.

Changes

Cohort / File(s) Summary
Isotonic Regression Core
R/Fit_Isotonic_Regression.R
increasing accepts TRUE/FALSE/"both"; fits both directions per protein when requested, selects by F-statistic; added precalculated_ratios param and direction output column.
IC50 Prediction & Bootstrapping
R/IC50_prediction.R, R/PredictIC50.R
Adds precalculated_ratios flow including a new bootstrapIC50Precalculated(); target_response default → NULL with auto-derivation; supports increasing="both"; direction included in outputs and propagated into bootstrap/CI paths.
Visualization & Plotting
R/Visualize_Isotonic_Fit.R
visualizeResponseProtein() and plotIsotonic() accept precalculated_ratios, color_by, custom labels/title; handle increasing="both" and pass selected direction into CI/IC50 calls; tightened CI rendering.
Protein Turnover Module
R/protein_turnover_ratio_helper.R
New exports calculateTurnoverRatios() (FeatureLevelData → heavy/light fractions, optional peptide selection, aggregation, tracer normalization), selectTopNPeptides(), and internal parse_timepoint() helper.
Package Exports & Imports
NAMESPACE, DESCRIPTION
Exported calculateTurnoverRatios and selectTopNPeptides; added imports from dplyr, stringr, tidyr; DESCRIPTION updated to import tidyr and stringr.
Documentation & Help
man/*.Rd
Added/updated docs for precalculated_ratios, increasing="both", direction output, new functions (bootstrapIC50Precalculated, calculateTurnoverRatios, selectTopNPeptides, parse_timepoint) and updated examples.
Vignette
vignettes/Protein_turnover_vignettes.Rmd
New vignette demonstrating data prep, calculateTurnoverRatios() (with tracer normalization), mapping results for MSstatsResponse flow, and visualization examples.

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant doseResponseFit as doseResponseFit
    participant fitIsotonic as fitIsotonicRegression
    participant ModelDec as DecreasingModel
    participant ModelInc as IncreasingModel
    participant Selector as DirectionSelector
    participant Result as ResultDF

    User->>doseResponseFit: Call with data, increasing="both", precalculated_ratios?
    doseResponseFit->>fitIsotonic: Fit with increasing=FALSE
    fitIsotonic->>ModelDec: build decreasing isotonic model
    ModelDec-->>fitIsotonic: model + F_statistic
    doseResponseFit->>fitIsotonic: Fit with increasing=TRUE
    fitIsotonic->>ModelInc: build increasing isotonic model
    ModelInc-->>fitIsotonic: model + F_statistic_alt
    doseResponseFit->>Selector: compare F_statistic vs F_statistic_alt
    Selector->>Result: choose direction, attach `direction` column
    Result-->>User: return selected-model rows and metadata
Loading
sequenceDiagram
    participant User
    participant Calc as calculateTurnoverRatios
    participant Validate as ValidateInputs
    participant Parse as parse_timepoint
    participant Filter as FilterRows
    participant PeptideSel as peptide_selector (optional)
    participant Aggregate as AggregateIntensity
    participant Pivot as pivot_wider
    participant Tracer as TracerNormalization
    participant Output as RatiosDF

    User->>Calc: provide FeatureLevelData + params
    Calc->>Validate: check required columns
    Calc->>Parse: convert time strings → hours
    Parse-->>Calc: TimeVal
    Calc->>Filter: keep valid timepoints & labels
    Filter->>PeptideSel: apply optional peptide selector
    PeptideSel-->>Aggregate: return selected peptides
    Aggregate->>Pivot: pivot heavy/light intensities wide
    Pivot->>Tracer: if normalize_tracer, apply tracer constants
    Tracer-->>Output: compute H_frac, L_frac, Total
    Output-->>User: return DataFrame with ratios
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Possibly related PRs

  • Main #4: Overlaps changes to IC50 fitting and visualization (R/IC50_prediction.R, R/Visualize_Isotonic_Fit.R) with related plotting/target-response logic.

Suggested labels

enhancement

Poem

🐰 I hopped through code both left and right,
Fitted curves in morning light—
Ratios counted, peptides picked,
Models chosen, ties untwitched—
A floppy cheer for changes bright!

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately and specifically describes the main addition of the PR: introducing new protein turnover functions for ratio calculation, particularly the calculateTurnoverRatios function and supporting utilities.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch main

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 12

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@man/calculateTurnoverRatios.Rd`:
- Around line 60-100: The examples in man/calculateTurnoverRatios.Rd reference
an undefined quant_data (used as quant_data$FeatureLevelData) which will break R
CMD check; update the examples in the calculateTurnoverRatios Rd file to either
wrap the entire examples block in \dontrun{} or add a short preamble that loads
the example dataset (mirroring the pattern in man/DoseResponseFit.Rd) so
quant_data is defined before calls to calculateTurnoverRatios, and ensure any
example variable names (quant_data, tracer_consts) match the loaded data.

In `@man/PredictIC50.Rd`:
- Around line 41-44: The man page description for the parameter target_response
is incorrect; update the documentation for PredictIC50 to state that
target_response defaults to NULL (not 0.5) and that when NULL the function
auto-adjusts to 0.5 for decreasing curves and 1.5 for increasing curves, and
that passing precalculated_ratios = TRUE with target_response = NULL raises an
error; reference the PredictIC50 parameter target_response and the
precalculated_ratios behavior (implementation in PredictIC50.R) and change the
"Default" text to accurately describe these semantics.
- Around line 126-135: The example calls predictIC50 with precalculated_ratios =
TRUE but omits the required target_response; update the example to pass an
explicit target_response value (e.g., target_response = 0.5 or whatever
normalized response is intended) when calling predictIC50 so it matches the
behavior enforced in R/PredictIC50.R; locate the example block where predictIC50
is invoked and add the target_response parameter to the function call.

In `@man/selectTopNPeptides.Rd`:
- Around line 20-27: The example uses an undefined quant_data causing R CMD
check failures; update the man page example around selectTopNPeptides and
calculateTurnoverRatios to either wrap the example in \dontrun{ ... } or include
a short preamble that loads or constructs a valid quant_data object (e.g.,
load(exampleDataset) or create minimal quant_data$FeatureLevelData) so the call
peptide_selector = function(df) selectTopNPeptides(df, n = 10) runs without
error; ensure references to selectTopNPeptides and calculateTurnoverRatios are
preserved and the chosen fix is applied consistently in the examples block.

In `@R/IC50_prediction.R`:
- Around line 118-124: The current branch unconditionally chooses the decreasing
fit when both ic50_dec and ic50_inc are available; instead compute a fit-quality
metric (e.g., SSE = sum((y - predict(fit))^2) or residual sum of squares) for
both fit_dec and fit_inc, compare them, and select the model with the lower SSE.
Update ic50_est to ic50_dec or ic50_inc, chosen_direction to "decreasing" or
"increasing", fit_try to the corresponding fit (fit_dec or fit_inc), and set
increasing_final to FALSE or TRUE based on that comparison.

In `@R/PredictIC50.R`:
- Around line 122-137: The null-target_response branch currently falls back to
defaults and causes incorrect behavior when increasing == "both"; update the
logic in PredictIC50 (the block handling target_response, increasing, and
precalculated_ratios) so that if is.null(target_response) and increasing ==
"both" you stop with an error demanding an explicit target_response (unless
precalculated_ratios forces a different check), instead of assigning
target_response = 0.5; ensure the error clearly references target_response and
increasing so callers must supply the response level when using both-fitting
mode.

In `@R/protein_turnover_ratio_helper.R`:
- Around line 170-180: parse_timepoint() currently truncates decimals and fails
on NA unit detection; change the numeric extraction to allow decimals (e.g. use
a regex like "^[0-9]+(\\.[0-9]+)?" or "^[0-9]*\\.?[0-9]+") so numeric_part
preserves fractional values, convert that to numeric, and ensure unit flags
(is_days, is_weeks) are coerced to FALSE for missing inputs (e.g. replace NA
with FALSE or compute flags as !is.na(time_strings) & str_detect(...)). Then
compute hours from numeric_part and the safe unit flags and return the result.
Reference symbols: function parse_timepoint, variables numeric_part, is_days,
is_weeks, and hours.
- Around line 23-61: Update the examples in protein_turnover_ratio_helper.R so
they are self-contained for R CMD check: use a packaged example dataset (e.g.,
included sample data object) instead of undefined quant_data, either explicitly
call dplyr verbs with namespace (dplyr::filter, dplyr::group_by, %>% via
magrittr::`%>%`) or add a library(dplyr) call inside a \dontrun{} block, and fix
tracer normalization by making tracer_constants keys match the TimeVal lookup
used in calculateTurnoverRatios (e.g., use as.character(TimeVal) keys or wrap
the lookup with as.character) so tracer_constants[as.character(TimeVal)] works;
wrap longer multi-step workflows/examples in \dontrun{} to avoid heavy runtime
during checks and apply the same changes to the other example blocks referenced
(lines ~193-197).
- Around line 149-153: The code silently produces NA when a TimeVal has no entry
in tracer_constants and only adjusts H_frac, breaking the L_frac complement;
update the df_wide processing so you first map tracer_factor <-
tracer_constants[as.character(TimeVal)], fail fast if any mapped tracer_factor
is NA (e.g. stop() with a clear message including offending TimeVal values),
then normalize H_frac <- H_frac / tracer_factor and immediately recompute L_frac
<- 1 - H_frac to keep the two fractions consistent (make these changes in the
mutate block around df_wide, TimeVal, H_frac, L_frac, and tracer_constants).

In `@R/Visualize_Isotonic_Fit.R`:
- Around line 312-316: The current legend-hiding block only runs when color_by
is NULL, so grouped plots ignore legend = FALSE; update Visualize_Isotonic_Fit.R
to apply the legend toggle unconditionally after the plot is constructed: after
building model_fit_plot (the ggplot object created by the function), if legend
is FALSE call model_fit_plot = model_fit_plot + ggplot2::theme(legend.position =
"none") so the legend is suppressed regardless of color_by; ensure you locate
the existing conditional around model_fit_plot and move or duplicate the theme
application to run after plot construction and before returning model_fit_plot.

In `@vignettes/Protein_turnover_vignettes.Rmd`:
- Around line 18-20: The vignette currently uses a hard-coded relative path in
the readRDS call that will break outside your local checkout; move the sample
RDS file into the package’s inst/extdata directory and update the vignette to
load it via system.file (so quant_data_fraction_impute <-
readRDS(system.file("extdata","quant_data_fraction_imputed_new.rds",
package="YourPackage"))) or, if you don't want to bundle the file, mark the R
chunk that contains the readRDS call (the chunk that creates
quant_data_fraction_impute) with eval = FALSE so it doesn't run during
CI/rendering.
- Around line 86-105: The vignette uses example_proteins before it's defined;
define example_proteins (e.g., example_proteins <-
unique(test_ratio_calc$protein) or set to a specific known protein string)
before any calls that index it (the preview chunk and the
visualizeResponseProtein call) so that example_proteins[1] is valid; ensure the
variable is a character/vector and placed prior to the test_ratio_calc %>%
filter(...) and the visualizeResponseProtein(...) invocation.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 5255f89d-7253-47fa-add6-eb6f85318eda

📥 Commits

Reviewing files that changed from the base of the PR and between 8739ce4 and 90a7fa2.

📒 Files selected for processing (16)
  • NAMESPACE
  • R/Fit_Isotonic_Regression.R
  • R/IC50_prediction.R
  • R/PredictIC50.R
  • R/Visualize_Isotonic_Fit.R
  • R/protein_turnover_ratio_helper.R
  • man/DoseResponseFit.Rd
  • man/PredictIC50.Rd
  • man/VisualizeResponseProtein.Rd
  • man/bootstrapIC50Precalculated.Rd
  • man/calculateTurnoverRatios.Rd
  • man/fitIsotonicRegression.Rd
  • man/parse_timepoint.Rd
  • man/plotIsotonic.Rd
  • man/selectTopNPeptides.Rd
  • vignettes/Protein_turnover_vignettes.Rmd

Comment thread man/calculateTurnoverRatios.Rd
Comment thread man/PredictIC50.Rd
Comment on lines +41 to +44
\item{target_response}{Numeric, the response fraction (e.g., 0.5, 0.25, 0.75, 1.5).
For decreasing curves: use values < 1 (e.g., 0.5 for IC50).
For increasing curves: use values > 1 (e.g., 1.5 for 50\% increase).
Default = 0.5 (auto-adjusts to 1.5 for increasing curves).}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Documentation for target_response default is inconsistent with actual behavior.

Line 17 shows target_response = NULL but line 44 states "Default = 0.5". Per the implementation in R/PredictIC50.R:111-138, the actual default is NULL which:

  • Auto-adjusts to 0.5 for decreasing curves
  • Auto-adjusts to 1.5 for increasing curves
  • Throws an error when precalculated_ratios = TRUE

Suggest updating the description to reflect NULL default with auto-adjustment semantics.

Proposed fix
 \item{target_response}{Numeric, the response fraction (e.g., 0.5, 0.25, 0.75, 1.5).
 For decreasing curves: use values < 1 (e.g., 0.5 for IC50).
 For increasing curves: use values > 1 (e.g., 1.5 for 50\% increase).
-Default = 0.5 (auto-adjusts to 1.5 for increasing curves).}
+Default = NULL (auto-adjusts to 0.5 for decreasing or 1.5 for increasing curves;
+required when precalculated_ratios = TRUE).}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
\item{target_response}{Numeric, the response fraction (e.g., 0.5, 0.25, 0.75, 1.5).
For decreasing curves: use values < 1 (e.g., 0.5 for IC50).
For increasing curves: use values > 1 (e.g., 1.5 for 50\% increase).
Default = 0.5 (auto-adjusts to 1.5 for increasing curves).}
\item{target_response}{Numeric, the response fraction (e.g., 0.5, 0.25, 0.75, 1.5).
For decreasing curves: use values < 1 (e.g., 0.5 for IC50).
For increasing curves: use values > 1 (e.g., 1.5 for 50\% increase).
Default = NULL (auto-adjusts to 0.5 for decreasing or 1.5 for increasing curves;
required when precalculated_ratios = TRUE).}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@man/PredictIC50.Rd` around lines 41 - 44, The man page description for the
parameter target_response is incorrect; update the documentation for PredictIC50
to state that target_response defaults to NULL (not 0.5) and that when NULL the
function auto-adjusts to 0.5 for decreasing curves and 1.5 for increasing
curves, and that passing precalculated_ratios = TRUE with target_response = NULL
raises an error; reference the PredictIC50 parameter target_response and the
precalculated_ratios behavior (implementation in PredictIC50.R) and change the
"Default" text to accurately describe these semantics.

Comment thread man/PredictIC50.Rd
Comment on lines +126 to +135
# Example 7: Using pre-calculated ratios
turnover_data <- prepared_data
turnover_data$response <- 2^turnover_data$response /
mean(2^turnover_data$response[turnover_data$dose == 0])

ic50_precalc <- predictIC50(
data = turnover_data,
precalculated_ratios = TRUE,
bootstrap = TRUE
)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Example 7 will fail: target_response is required when precalculated_ratios = TRUE.

Per the implementation in R/PredictIC50.R, when precalculated_ratios = TRUE and target_response = NULL, the function throws an error requiring explicit specification. This example omits target_response.

Proposed fix
 # Example 7: Using pre-calculated ratios
 turnover_data <- prepared_data
 turnover_data$response <- 2^turnover_data$response /
                           mean(2^turnover_data$response[turnover_data$dose == 0])

 ic50_precalc <- predictIC50(
   data = turnover_data,
   precalculated_ratios = TRUE,
+  target_response = 0.5,
   bootstrap = TRUE
 )
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
# Example 7: Using pre-calculated ratios
turnover_data <- prepared_data
turnover_data$response <- 2^turnover_data$response /
mean(2^turnover_data$response[turnover_data$dose == 0])
ic50_precalc <- predictIC50(
data = turnover_data,
precalculated_ratios = TRUE,
bootstrap = TRUE
)
# Example 7: Using pre-calculated ratios
turnover_data <- prepared_data
turnover_data$response <- 2^turnover_data$response /
mean(2^turnover_data$response[turnover_data$dose == 0])
ic50_precalc <- predictIC50(
data = turnover_data,
precalculated_ratios = TRUE,
target_response = 0.5,
bootstrap = TRUE
)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@man/PredictIC50.Rd` around lines 126 - 135, The example calls predictIC50
with precalculated_ratios = TRUE but omits the required target_response; update
the example to pass an explicit target_response value (e.g., target_response =
0.5 or whatever normalized response is intended) when calling predictIC50 so it
matches the behavior enforced in R/PredictIC50.R; locate the example block where
predictIC50 is invoked and add the target_response parameter to the function
call.

Comment thread man/selectTopNPeptides.Rd
Comment on lines +20 to +27
\examples{
# Use as peptide_selector
ratios <- calculateTurnoverRatios(
feature_data = quant_data$FeatureLevelData,
peptide_selector = function(df) selectTopNPeptides(df, n = 10)
)

}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Example references undefined quant_data variable.

The example code will fail when executed during R CMD check because quant_data is not defined. Either wrap in \dontrun{} or add a preamble loading example data.

Proposed fix
 \examples{
 # Use as peptide_selector
+\dontrun{
 ratios <- calculateTurnoverRatios(
   feature_data = quant_data$FeatureLevelData,
   peptide_selector = function(df) selectTopNPeptides(df, n = 10)
 )
+}

 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@man/selectTopNPeptides.Rd` around lines 20 - 27, The example uses an
undefined quant_data causing R CMD check failures; update the man page example
around selectTopNPeptides and calculateTurnoverRatios to either wrap the example
in \dontrun{ ... } or include a short preamble that loads or constructs a valid
quant_data object (e.g., load(exampleDataset) or create minimal
quant_data$FeatureLevelData) so the call peptide_selector = function(df)
selectTopNPeptides(df, n = 10) runs without error; ensure references to
selectTopNPeptides and calculateTurnoverRatios are preserved and the chosen fix
is applied consistently in the examples block.

Comment thread R/IC50_prediction.R
Comment on lines +118 to +124
} else if (!is.na(ic50_dec) && !is.na(ic50_inc)) {
# Both work - choose based on better fit (could use SSE if stored)
# Default to decreasing for now
ic50_est = ic50_dec
chosen_direction = "decreasing"
fit_try = fit_dec
increasing_final = FALSE
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Don't default to decreasing when both fits cross the target.

When ic50_dec and ic50_inc are both non-NA, this path unconditionally picks the decreasing model. That can return the wrong IC50/direction even though both fitted curves are already available. Compare fit quality here (for example residual SSE from y - y_pred) before choosing.

Possible fix
       } else if (!is.na(ic50_dec) && !is.na(ic50_inc)) {
-        # Both work - choose based on better fit (could use SSE if stored)
-        # Default to decreasing for now
-        ic50_est = ic50_dec
-        chosen_direction = "decreasing"
-        fit_try = fit_dec
-        increasing_final = FALSE
+        sse_dec = sum((y - fit_dec$y_pred)^2, na.rm = TRUE)
+        sse_inc = sum((y - fit_inc$y_pred)^2, na.rm = TRUE)
+
+        if (sse_dec <= sse_inc) {
+          ic50_est = ic50_dec
+          chosen_direction = "decreasing"
+          fit_try = fit_dec
+          increasing_final = FALSE
+        } else {
+          ic50_est = ic50_inc
+          chosen_direction = "increasing"
+          fit_try = fit_inc
+          increasing_final = TRUE
+        }
       } else {
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
} else if (!is.na(ic50_dec) && !is.na(ic50_inc)) {
# Both work - choose based on better fit (could use SSE if stored)
# Default to decreasing for now
ic50_est = ic50_dec
chosen_direction = "decreasing"
fit_try = fit_dec
increasing_final = FALSE
} else if (!is.na(ic50_dec) && !is.na(ic50_inc)) {
sse_dec = sum((y - fit_dec$y_pred)^2, na.rm = TRUE)
sse_inc = sum((y - fit_inc$y_pred)^2, na.rm = TRUE)
if (sse_dec <= sse_inc) {
ic50_est = ic50_dec
chosen_direction = "decreasing"
fit_try = fit_dec
increasing_final = FALSE
} else {
ic50_est = ic50_inc
chosen_direction = "increasing"
fit_try = fit_inc
increasing_final = TRUE
}
} else {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@R/IC50_prediction.R` around lines 118 - 124, The current branch
unconditionally chooses the decreasing fit when both ic50_dec and ic50_inc are
available; instead compute a fit-quality metric (e.g., SSE = sum((y -
predict(fit))^2) or residual sum of squares) for both fit_dec and fit_inc,
compare them, and select the model with the lower SSE. Update ic50_est to
ic50_dec or ic50_inc, chosen_direction to "decreasing" or "increasing", fit_try
to the corresponding fit (fit_dec or fit_inc), and set increasing_final to FALSE
or TRUE based on that comparison.

Comment on lines +149 to +153
df_wide <- df_wide %>%
mutate(
tracer_factor = tracer_constants[as.character(TimeVal)],
H_frac = H_frac / tracer_factor
)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Validate tracer factors and keep both fractions consistent.

Any missing timepoint mapping currently turns H_frac into NA silently, and after dividing only H_frac, the returned L_frac is no longer the complement. Fail fast on missing tracer_constants entries and recompute L_frac from the normalized heavy fraction.

Possible fix
   if (normalize_tracer) {
     if (is.null(tracer_constants)) {
       stop("normalize_tracer = TRUE but tracer_constants was not provided")
     }
+
+    missing_timepoints <- setdiff(as.character(unique(df_wide$TimeVal)), names(tracer_constants))
+    if (length(missing_timepoints) > 0) {
+      stop("Missing tracer constants for timepoints: ",
+           paste(missing_timepoints, collapse = ", "))
+    }
 
     df_wide <- df_wide %>%
       mutate(
-        tracer_factor = tracer_constants[as.character(TimeVal)],
-        H_frac = H_frac / tracer_factor
+        tracer_factor = unname(tracer_constants[as.character(TimeVal)]),
+        H_frac = H_frac / tracer_factor,
+        L_frac = 1 - H_frac
       )
   }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@R/protein_turnover_ratio_helper.R` around lines 149 - 153, The code silently
produces NA when a TimeVal has no entry in tracer_constants and only adjusts
H_frac, breaking the L_frac complement; update the df_wide processing so you
first map tracer_factor <- tracer_constants[as.character(TimeVal)], fail fast if
any mapped tracer_factor is NA (e.g. stop() with a clear message including
offending TimeVal values), then normalize H_frac <- H_frac / tracer_factor and
immediately recompute L_frac <- 1 - H_frac to keep the two fractions consistent
(make these changes in the mutate block around df_wide, TimeVal, H_frac, L_frac,
and tracer_constants).

Comment on lines +170 to +180
parse_timepoint <- function(time_strings) {
# Extract numeric part
numeric_part <- as.numeric(str_extract(time_strings, "^[0-9]+"))

# Handle different units
is_days <- str_detect(time_strings, "d|day")
is_weeks <- str_detect(time_strings, "w|week")

hours <- numeric_part
hours[is_days] <- numeric_part[is_days] * 24
hours[is_weeks] <- numeric_part[is_weeks] * 24 * 7
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

parse_timepoint() mishandles missing and fractional labels.

str_extract(..., "^[0-9]+") truncates values like 0.5hr to 0, and str_detect() returns NA for missing inputs, which can make the indexed assignments fail before the caller reaches filter(!is.na(TimeVal)). Parse decimals and coerce missing unit flags to FALSE.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@R/protein_turnover_ratio_helper.R` around lines 170 - 180, parse_timepoint()
currently truncates decimals and fails on NA unit detection; change the numeric
extraction to allow decimals (e.g. use a regex like "^[0-9]+(\\.[0-9]+)?" or
"^[0-9]*\\.?[0-9]+") so numeric_part preserves fractional values, convert that
to numeric, and ensure unit flags (is_days, is_weeks) are coerced to FALSE for
missing inputs (e.g. replace NA with FALSE or compute flags as
!is.na(time_strings) & str_detect(...)). Then compute hours from numeric_part
and the safe unit flags and return the result. Reference symbols: function
parse_timepoint, variables numeric_part, is_days, is_weeks, and hours.

Comment on lines +312 to 316
# Handle legend display
if (!legend && is.null(color_by)) {
model_fit_plot = model_fit_plot +
ggplot2::theme(legend.position = "none")
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Honor legend = FALSE even when color_by is used.

This condition only hides the legend in the default coloring path, so grouped plots always show a legend regardless of the caller's legend argument. Apply the legend toggle after building the plot, independent of color_by.

Possible fix
-  if (!legend && is.null(color_by)) {
+  if (!legend) {
     model_fit_plot = model_fit_plot +
       ggplot2::theme(legend.position = "none")
   }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@R/Visualize_Isotonic_Fit.R` around lines 312 - 316, The current legend-hiding
block only runs when color_by is NULL, so grouped plots ignore legend = FALSE;
update Visualize_Isotonic_Fit.R to apply the legend toggle unconditionally after
the plot is constructed: after building model_fit_plot (the ggplot object
created by the function), if legend is FALSE call model_fit_plot =
model_fit_plot + ggplot2::theme(legend.position = "none") so the legend is
suppressed regardless of color_by; ensure you locate the existing conditional
around model_fit_plot and move or duplicate the theme application to run after
plot construction and before returning model_fit_plot.

Comment on lines +18 to +20
Load most recent data ---
```{r}
quant_data_fraction_impute = readRDS("../../../ProteinTurnover/data/quant_data_fraction_imputed_new.rds")
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Bundle the vignette dataset with the package.

readRDS("../../../ProteinTurnover/...") only works from one local checkout layout, so this vignette will fail on CI, in installed packages, and in rendered docs. Please move the sample data under inst/extdata and load it via system.file(...), or mark this chunk eval = FALSE.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@vignettes/Protein_turnover_vignettes.Rmd` around lines 18 - 20, The vignette
currently uses a hard-coded relative path in the readRDS call that will break
outside your local checkout; move the sample RDS file into the package’s
inst/extdata directory and update the vignette to load it via system.file (so
quant_data_fraction_impute <-
readRDS(system.file("extdata","quant_data_fraction_imputed_new.rds",
package="YourPackage"))) or, if you don't want to bundle the file, mark the R
chunk that contains the readRDS call (the chunk that creates
quant_data_fraction_impute) with eval = FALSE so it doesn't run during
CI/rendering.

Comment on lines +86 to +105
test_ratio_calc %>% filter (protein == example_proteins[1]) # should have 71 rows

```


```{r}
# visualize response
visualizeResponseProtein(
test_ratio_calc,
protein_name = example_proteins[1],
drug_name = "Synthesis",
precalculated_ratios = TRUE,
ratio_response = FALSE,
transform_dose = FALSE,
increasing = TRUE,
show_ic50 = TRUE,
add_ci = FALSE,
target_response = 0.5,
y_lab = "relative abundance",x_lab = "time (hrs)",
color_by = 'BaseSequence')
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Define example_proteins before using it.

Lines 86 and 95 index example_proteins[1], but this vignette never creates that vector, so the preview chunk and visualizeResponseProtein() call will abort. Derive it from unique(test_ratio_calc$protein) or hard-code a known protein from the bundled example data.

Possible fix
+# Pick one protein from the calculated ratios for the vignette example
+example_proteins <- unique(test_ratio_calc$protein)
+
 # add color_by option for visualization 
 test_ratio_calc %>% filter (protein == example_proteins[1]) # should have 71 rows 
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@vignettes/Protein_turnover_vignettes.Rmd` around lines 86 - 105, The vignette
uses example_proteins before it's defined; define example_proteins (e.g.,
example_proteins <- unique(test_ratio_calc$protein) or set to a specific known
protein string) before any calls that index it (the preview chunk and the
visualizeResponseProtein call) so that example_proteins[1] is valid; ensure the
variable is a character/vector and placed prior to the test_ratio_calc %>%
filter(...) and the visualizeResponseProtein(...) invocation.

@sszvetecz sszvetecz merged commit 2f5792f into devel Apr 9, 2026
1 of 2 checks passed
@tonywu1999 tonywu1999 deleted the main branch April 17, 2026 15:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants