Handle big spec file #144
Conversation
…g on a server or locally
Updated UI for big spec file
📝 Walkthrough
Adds a large-Spectronaut-file workflow: new UI controls and a local-file browser, an MSstatsBig-based conversion path in the server utils, and related changes.
Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant User
    participant UI as Loadpage UI
    participant FileBrowser as shinyFiles Browser
    participant Server as Loadpage Server
    participant MSB as MSstatsBig
    participant Results
    User->>UI: Select filetype = "spec"
    UI->>UI: Show big-file toggle (big_file_spec)
    alt big_file_spec = TRUE
        User->>UI: Click browse
        UI->>FileBrowser: Open local file chooser
        FileBrowser-->>UI: Return local path
        UI->>Server: Proceed with local path & options
        Server->>Server: .get_data_source_type("spec", TRUE) => "big_spectronaut"
        Server->>MSB: bigSpectronauttoMSstatsFormat(local_path, filters)
        MSB-->>Server: Return converted data or error
        Server->>Results: Store/display converted data (or show error)
    else standard path
        User->>UI: Upload/select standard spec
        UI->>Server: Proceed without local path
        Server->>Server: .get_data_source_type(...) => "standard"
        Server->>Server: Existing getData() conversion
        Server->>Results: Store/display data
    end
    Results->>User: Present feedback/results
```
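The routing decision in the diagram can be sketched as a tiny resolver. This is a hypothetical re-statement for illustration only; the actual helper lives in R/module-loadpage-server.R and may differ in detail:

```r
# Hypothetical sketch of the data-source resolver shown in the diagram.
# isTRUE() makes the check safe when the checkbox input is NULL or NA.
.get_data_source_type <- function(filetype, big_file_flag) {
  if (identical(filetype, "spec") && isTRUE(big_file_flag)) {
    return("big_spectronaut")
  }
  "standard"
}

.get_data_source_type("spec", TRUE)   # "big_spectronaut"
.get_data_source_type("spec", NULL)   # "standard" (NULL is treated as FALSE)
.get_data_source_type("maxq", TRUE)   # "standard"
```

Everything except the "big_spectronaut" combination falls through to the existing getData() path.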
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~22 minutes
Possibly related issues
🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 inconclusive)
✅ Passed checks (2 passed)
Failed to generate code suggestions for PR
Actionable comments posted: 2
🧹 Nitpick comments (1)
R/module-loadpage-server.R (1)
338-350: Consider adding server-side validation for processing parameters.
While the UI provides default values for parameters like `qvalue_cutoff` and `max_feature_count`, adding server-side validation would make the code more robust against edge cases or direct API calls.
Example validation:
```diff
+ # Validate input parameters
+ qvalue_cutoff <- input$qvalue_cutoff
+ if (is.null(qvalue_cutoff) || qvalue_cutoff < 0 || qvalue_cutoff > 1) {
+   qvalue_cutoff <- 0.01
+ }
+
+ max_feature_count <- input$max_feature_count
+ if (is.null(max_feature_count) || max_feature_count < 1) {
+   max_feature_count <- 20
+ }
+
  # Call the big file conversion function from MSstatsConvert
  converted_data <- MSstatsBig::bigSpectronauttoMSstatsFormat(
    input_file = local_path_spec(),
    output_file_name = "output_file.csv",
    backend = "arrow",
    filter_by_excluded = input$filter_by_excluded,
    filter_by_identified = input$filter_by_identified,
    filter_by_qvalue = input$filter_by_qvalue,
-   qvalue_cutoff = input$qvalue_cutoff,
-   max_feature_count = input$max_feature_count,
+   qvalue_cutoff = qvalue_cutoff,
+   max_feature_count = max_feature_count,
    filter_unique_peptides = input$filter_unique_peptides,
    aggregate_psms = input$aggregate_psms,
    filter_few_obs = input$filter_few_obs
  )
```
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
- DESCRIPTION
- R/module-loadpage-server.R
- R/module-loadpage-ui.R
- R/server.R
- man/loadpageServer.Rd
- tests/testthat/test-loadpage-server.R
- tests/testthat/test-module-loadpage-ui.R
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build
🔇 Additional comments (12)
R/module-loadpage-ui.R (3)
226-226: LGTM! Annotation upload correctly gated for big Spectronaut files.
The conditional logic properly excludes annotation file upload when the big-file Spectronaut path is selected, which aligns with the PR objectives.
288-320: LGTM! Well-structured UI for large Spectronaut file workflow.
The new UI flow properly supports both standard file input and local file browsing with appropriate processing options. All input IDs are correctly namespaced.
518-525: LGTM! Pre-processing options correctly hidden for big Spectronaut files.
The conditional logic properly hides standard pre-processing options when using the large-file Spectronaut path, as those options are now provided in the big-file-specific UI section.
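Gating of this kind is typically a conditionalPanel keyed on the namespaced checkbox input. A hypothetical sketch (the input IDs here are illustrative, not taken from the PR):

```r
library(shiny)

# Hypothetical sketch: hide a standard pre-processing option whenever the
# big-file checkbox is ticked. IDs are assumptions for illustration.
big_file_gate_ui <- function(id) {
  ns <- NS(id)
  conditionalPanel(
    condition = sprintf("!input['%s']", ns("big_file_spec")),
    selectInput(ns("remove_few"), "Remove proteins with few measurements?",
                choices = c("Yes", "No"))
  )
}
```

The sprintf() builds the JavaScript condition against the module-namespaced ID, so the panel works correctly inside the loadpage module.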
R/server.R (1)
35-35: LGTM! Environment flag correctly passed to loadpageServer.
The `is_web_server` parameter is properly passed to enable environment-specific behavior in the loadpage module.
man/loadpageServer.Rd (1)
7-14: LGTM! Documentation correctly reflects the new parameter.
The `is_web_server` parameter is properly documented with its default value and description.
tests/testthat/test-loadpage-server.R (1)
1-21: LGTM! Comprehensive test coverage for data source type helper.
The tests validate all relevant scenarios including edge cases (NULL, non-spec file types). This provides good confidence in the routing logic.
tests/testthat/test-module-loadpage-ui.R (1)
397-436: LGTM! Thorough UI testing for the big Spectronaut file workflow.
The tests verify all key UI elements and conditional logic for the new large-file processing path. Good attention to detail with HTML entity encoding checks.
R/module-loadpage-server.R (4)
1-13: LGTM! Clean and safe data source type resolver.
The helper function correctly uses `isTRUE()` for safe boolean checking, which handles NULL and NA cases gracefully.
33-62: LGTM! Proper isolation of local file browser logic.
The shinyFiles integration is correctly gated behind the `is_web_server` check, ensuring local file browsing is only available when running locally. The reactive path handling is clean and safe.
200-204: LGTM! Proceed button logic correctly handles both file upload paths.
The conditional logic properly distinguishes between standard Spectronaut file uploads and the new large-file browser workflow.
354-366: LGTM! Proper error handling for memory allocation failures.
The error handling correctly catches memory errors, provides user feedback, and prevents further processing. The notification includes technical error details which are helpful for troubleshooting.
DESCRIPTION (1)
27-28: Both `shinyFiles` and `MSstatsBig` are confirmed as actively used and available from reputable sources.
`shinyFiles` (0.9.3) is available on CRAN and is used for file browsing functionality in the load page module. `MSstatsBig` is available on Bioconductor and is used for the Spectronaut data conversion function. Both packages integrate properly with the codebase and are compatible with the required R version.
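In DESCRIPTION these would appear under Imports, roughly as follows (a sketch; the exact version constraints are assumptions, not copied from the PR):

```
Imports:
    shiny,
    shinyFiles (>= 0.9.3),
    MSstatsBig
```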
Actionable comments posted: 3
🤖 Fix all issues with AI agents
In @R/module-loadpage-server.R:
- Around line 297-309: Before calling MSstatsBig::bigSpectronauttoMSstatsFormat
to produce converted_data, validate inputs: ensure local_big_file_path() exists
and is readable, input$qvalue_cutoff is numeric and 0 <= value <= 1,
input$max_feature_count is numeric and > 0, and required boolean flags like
input$filter_by_excluded/input$filter_by_identified are non-NULL; if any check
fails, stop the call and return a clear error message to the user (or set a
reactive error state) so the function is only invoked with validated values.
- Around line 291-299: Replace the hardcoded "output_file.csv" with a unique,
session-scoped temporary filename and ensure it is removed after processing:
generate a temp path (e.g., via tempfile() or by combining a session ID/UUID
with a .csv extension) and pass that path into
MSstatsBig::bigSpectronauttoMSstatsFormat (where local_big_file_path() is the
input), then wrap the call and downstream use of converted_data in a
tryCatch/on.exit block to unlink() the temp file regardless of success or error
so concurrent users don't collide and files are cleaned up.
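Stripped of the MSstatsBig specifics, the pattern being requested is just tempfile() plus on.exit() cleanup. A generic sketch (convert_fn stands in for the real conversion call):

```r
# Generic sketch of the unique-temp-file-plus-cleanup pattern.
# The temp file is removed on exit whether convert_fn succeeds or errors.
convert_with_cleanup <- function(convert_fn) {
  output_path <- tempfile("spectronaut_", fileext = ".csv")
  on.exit(if (file.exists(output_path)) unlink(output_path), add = TRUE)
  tryCatch(convert_fn(output_path), error = function(e) NULL)
}

# tempfile() yields a distinct path per call, so concurrent sessions
# cannot clobber each other's output:
tempfile("spectronaut_", fileext = ".csv") != tempfile("spectronaut_", fileext = ".csv")  # TRUE
```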
🧹 Nitpick comments (3)
R/module-loadpage-ui.R (1)
288-337: Consider extracting parameter definitions to reduce duplication.
The filtering and processing options (lines 313-323) are hardcoded in the UI function. If these parameters need to be referenced elsewhere (e.g., for validation or defaults), consider extracting them into a shared configuration object or constant.
♻️ Optional refactor to centralize parameter definitions
Create a configuration list that can be reused:
```r
# At the top of the file or in a separate config file
SPECTRONAUT_BIG_FILE_DEFAULTS <- list(
  filter_by_excluded = FALSE,
  filter_by_identified = FALSE,
  filter_by_qvalue = TRUE,
  qvalue_cutoff = 0.01,
  max_feature_count = 20,
  filter_unique_peptides = FALSE,
  aggregate_psms = FALSE,
  filter_few_obs = FALSE
)
```

This would make it easier to maintain consistency between UI defaults and server-side processing.
R/module-loadpage-server.R (2)
30-30: Document the is_web_server parameter behavior.
While the parameter has a sensible default, the function documentation should clarify what constitutes a "web server" context and how it affects the module's behavior (e.g., disabling local file browsing).
313-323: Improve error message clarity for users.
The error message on line 318 exposes the raw error message from R, which may be technical and confusing for end users. Consider providing more actionable guidance.
💬 Provide user-friendly error messages
```diff
 converted_data <- tryCatch({
   dplyr::collect(converted_data)
 }, error = function(e) {
   shinybusy::remove_modal_spinner()
+  # Determine if this is a memory error or something else
+  is_memory_error <- grepl("allocate|memory|cannot allocate vector", e$message, ignore.case = TRUE)
+
+  if (is_memory_error) {
+    showNotification(
+      "Memory Error: The dataset is too large to load into memory after conversion. Consider:\n1. Running on a machine with more RAM\n2. Further reducing the dataset using the filtering options\n3. Processing the data in chunks using MSstatsBig directly",
+      type = "error",
+      duration = NULL
+    )
+  } else {
     showNotification(
-      paste("Memory Error: The dataset is too large to process in-memory.", e$message),
+      paste("Error processing data:", e$message),
       type = "error",
       duration = NULL
     )
+  }
   return(NULL)
 })
```
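The classification step in that suggestion is just a grepl over the condition message. Isolated as a standalone function (name hypothetical, pattern taken from the suggested diff):

```r
# Classify whether an error message looks like a memory-allocation failure,
# so the UI can choose between actionable advice and a generic error.
is_memory_error <- function(msg) {
  grepl("allocate|memory|cannot allocate vector", msg, ignore.case = TRUE)
}

is_memory_error("cannot allocate vector of size 8.2 Gb")  # TRUE
is_memory_error("object 'converted_data' not found")      # FALSE
```

Pattern-matching on error messages is inherently fragile across R versions and locales, which is one reason to keep the fallback branch generic.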
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- R/module-loadpage-server.R
- R/module-loadpage-ui.R
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build
🔇 Additional comments (5)
R/module-loadpage-ui.R (2)
535-535: Gating logic correctly hides incompatible options.
The processing options are appropriately hidden when `big_file_spec` is active, preventing users from selecting options that don't apply to the MSstatsBig workflow. This maintains UI consistency with the dual-path approach.
Also applies to: 542-542
226-226: Verify that MSstats scheme output from Spectronaut contains all necessary metadata.
The annotation file upload is hidden when `big_file_spec` is checked, and the `bigSpectronauttoMSstatsFormat()` function does not accept an annotation parameter. Ensure the Spectronaut MSstats scheme output format includes run-to-sample mapping (conditions, bioreplicates) embedded in the file, as this information would normally come from a separate annotation file in standard workflows.
R/module-loadpage-server.R (3)
1-13: LGTM - Clear routing logic.
The `.get_data_source_type` helper function provides clean separation between standard and big-file processing paths with appropriate defensive checking via `isTRUE()`.
159-161: Enable button logic correctly handles both file input modes.
The proceed button is properly enabled for both standard file upload (`spec_regular_file_ok`) and local file browsing (`spec_big_file_ok`) modes, ensuring users can only proceed when a valid file is selected.
297-309: MSstatsBig API compatibility verified.
The function `MSstatsBig::bigSpectronauttoMSstatsFormat` exists, with all parameters used in the code (`input_file`, `output_file_name`, `backend`, `filter_by_excluded`, `filter_by_identified`, `filter_by_qvalue`, `qvalue_cutoff`, `max_feature_count`, `filter_unique_peptides`, `aggregate_psms`, `filter_few_obs`) confirmed as valid parameter names. The `backend = "arrow"` option is supported. MSstatsBig is properly declared as a dependency in DESCRIPTION.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@R/utils.R`:
- Around line 530-545: Wrap the MSstatsBig::bigSpectronauttoMSstatsFormat call
in tryCatch so any error is handled: in the error handler call
shinybusy::remove_modal_spinner(), show a user-facing notification (e.g.,
shiny::showNotification with type="error") that includes the error message, and
return early (e.g., return(NULL)) so downstream code using converted_data is not
executed; locate the call to bigSpectronauttoMSstatsFormat and replace it with
the tryCatch-wrapped version that references local_big_file_path and assigns to
converted_data on success.
♻️ Duplicate comments (1)
R/utils.R (1)
533-536: Avoid hardcoded output_file_name; use a unique temp path + cleanup.
Line 535 writes to a fixed `output_file.csv`, which can be overwritten across concurrent sessions and left behind on disk. Use a unique temp path and unlink after processing.
🔧 Suggested fix
```diff
+  output_path <- tempfile("spectronaut_", fileext = ".csv")
+  on.exit({
+    if (file.exists(output_path)) unlink(output_path)
+  }, add = TRUE)
-  converted_data <- MSstatsBig::bigSpectronauttoMSstatsFormat(
+  converted_data <- MSstatsBig::bigSpectronauttoMSstatsFormat(
     input_file = local_big_file_path,
-    output_file_name = "output_file.csv",
+    output_file_name = output_path,
     backend = "arrow",
     filter_by_excluded = input$filter_by_excluded,
     filter_by_identified = input$filter_by_identified,
     filter_by_qvalue = input$filter_by_qvalue,
     qvalue_cutoff = input$qvalue_cutoff,
     max_feature_count = input$max_feature_count,
     filter_unique_peptides = input$filter_unique_peptides,
     aggregate_psms = input$aggregate_psms,
     filter_few_obs = input$filter_few_obs
   )
```
🧹 Nitpick comments (1)
R/module-loadpage-server.R (1)
19-48: Confirm is_web_server is set correctly to avoid exposing server files.
Lines 21-48 enable shinyFiles browsing when `is_web_server` is FALSE. If a hosted deployment passes FALSE, users could browse the server filesystem. Please verify the deployment config always sets `is_web_server = TRUE` for hosted instances.
…ing the server side code for Big spec file
Overview
This PR adds support for processing large Spectronaut datasets that exceed available RAM by integrating MSstatsBig functionality. Users can now enable large dataset processing via a checkbox option.
- Added checkbox UI element to enable MSstatsBig processing for Spectronaut data
- Updated pre-processing functions to conditionally route to the appropriate converter based on checkbox state
- Removed annotation file upload/selection from the Spectronaut large dataset workflow
- Added exception handling for datasets where even compressed/reduced data exceeds RAM
Summary by CodeRabbit
New Features
Documentation
Tests