Conversation

@t0mdavid-m
Member

@t0mdavid-m t0mdavid-m commented Aug 14, 2025

  • Download button to the left of tables
  • Selected masses are now highlighted in Deconv View
  • Prominent linking to the CLI
  • Support for separate MS1/MS2 heatmaps
  • Downloads on Windows no longer fail for long file names

Summary by CodeRabbit

  • New Features

    • Added MS2 Raw and MS2 Deconvolved heatmaps, including responsive rendering and selection.
    • Enhanced Quickstart with modular, responsive download boxes for Windows app and FLASH CLI tools.
  • Refactor

    • Unified heatmap handling for MS1/MS2 to streamline selection and updates.
    • Reworked Quickstart download section into reusable components with consistent styling.
  • Bug Fixes

    • Trim whitespace when saving parameter values to prevent accidental inconsistencies.
  • Chores

    • Updated the OpenMS Streamlit Vue component submodule to the latest revision.

@coderabbitai
Contributor

coderabbitai bot commented Aug 14, 2025

Walkthrough

Adds MS2 heatmap support end-to-end: propagates MSLevel in parsed data, exports per-MSLevel heatmaps, initializes and updates MS2 heatmap components, and unifies heatmap selection handling. Standardizes stored download archive filenames, introduces modular download UI in quickstart, trims whitespace in saved parameters, and updates a submodule reference.

Changes

Cohort / File(s) and summary of changes:

  • Standardize stored download archive name (content/FLASHDeconv/FLASHDeconvDownload.py, content/FLASHTnT/FLASHTnTDownload.py)
    Store archives as download_archive.zip under the key download_archive; the user-facing download still uses {experiment}.zip.
  • Layout: add MS2 heatmap components (content/FLASHDeconv/FLASHDeconvLayoutManager.py)
    Added “MS2 raw heatmap” and “MS2 deconvolved heatmap” options and identifiers in the component mappings.
  • Parsing: propagate MSLevel and export per-level heatmaps (src/parse/masstable.py, src/parse/deconv.py)
    Propagate MSLevel into annotated and MSSignal data; generate and compress heatmaps separately for MS1 and MS2, storing them under ms{level}_... keys (see the sketch after this list).
  • Render init: MS2 heatmaps and spectrum source change (src/render/initialize.py)
    Initialize branches for MS2 raw/deconv heatmaps using per-level datasets and caches; use combined_spectrum for the annotated spectrum.
  • Render update: unified heatmap handling (src/render/update.py)
    Consolidated MS1/MS2 heatmap update paths; dynamic selection keys for raw/deconv variants; a single render call per type.
  • Parameter persistence hygiene (src/workflow/ParameterManager.py)
    Trim leading/trailing whitespace from string TOPP parameters before saving.
  • Quickstart: modular download UI (content/quickstart.py)
    Added reusable boxed download components and responsive rendering; replaced the inline enhanced download section.
  • Submodule bump (openms-streamlit-vue-component)
    Updated the submodule commit reference; no code changes in this repo.
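
For orientation, a minimal sketch of the per-MSLevel export pattern summarized above; file_manager.store_data and the MSLevel column follow this PR's diffs, while the helper name itself is hypothetical:

import pandas as pd

def export_per_level_heatmaps(file_manager, dataset_id, heatmap: pd.DataFrame, descriptor: str):
    # Partition the combined heatmap by MS level and store each slice
    # under an ms{level}_... key, dropping the helper column first.
    for ms_level in (1, 2):
        part = heatmap[heatmap['MSLevel'] == ms_level].drop(columns=['MSLevel'])
        file_manager.store_data(dataset_id, f'ms{ms_level}_{descriptor}_heatmap', part)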

Sequence Diagram(s)

sequenceDiagram
  participant UI as User/UI
  participant Init as initialize_data
  participant FM as FileManager
  participant Cache as Cache/Store
  participant Plot as PlotlyHeatmap

  UI->>Init: initialize_data("ms2_deconv_heat_map", ...)
  Init->>FM: get("ms2_deconv_heatmap")
  FM-->>Init: data_full (MS2 deconv heatmap)
  Init->>Cache: compute_compression_levels(len(data_full))
  loop For each size
    Init->>FM: get(f"ms2_deconv_heatmap_{size}")
    FM-->>Init: cached subset
  end
  Init->>Plot: render with smallest cache + additional_data (all caches)
  Plot-->>UI: MS2 Deconvolved Heatmap displayed
sequenceDiagram
  participant UI as User/UI
  participant Update as filter_data (update)
  participant Store as selection_store
  participant Render as render_heatmap

  UI->>Update: selection on "Raw MS2 Heatmap"
  Update->>Update: selection = "heatmap_raw2"
  Update->>Store: read selection_store[selection]
  alt selection exists
    Update->>Render: render_heatmap(additional_data["raw_heatmap_df"], dataset, selection, component)
  else No-op
  end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

  • update vue #22: Updates the same openms-streamlit-vue-component submodule reference.
  • Make Heatmap Interactive #20: Modifies getMSSignalDF to add per-row metadata; related to this PR’s MSLevel propagation in MSSignal.

Poem

A rabbit taps keys with a hop and a sweep,
MS1, MS2—two heatmaps to keep.
Zips named neat, parameters trim,
Boxes for downloads, edges slim.
With plots that glow and caches that zip—
I thump in delight: ship, ship, ship! 🐇✨




Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 5

🔭 Outside diff range comments (1)
src/render/update.py (1)

30-52: Avoid downsampling on empty data to prevent runtime errors

If the heatmap partition is empty (e.g., no MS2 spectra in the dataset), calling downsample_heatmap can raise a runtime error. Add early returns for empty frames.

 @st.cache_data(max_entries=4, show_spinner=False, hash_funcs=hash_funcs)
 def render_heatmap(full_data, selection, dataset_name, component_name):
+    # Guard: if the smallest cache is empty, return early
+    if len(full_data) > 0 and isinstance(full_data[0], pd.DataFrame) and full_data[0].empty:
+        return full_data[0]
     if (
         (selection['xRange'][0] < 0)
         and (selection['xRange'][1] < 0)
         and (selection['yRange'][0] < 0)
         and (selection['yRange'][1] < 0)
     ):
-        return downsample_heatmap(full_data[0])
+        return full_data[0] if len(full_data[0]) <= 20000 else downsample_heatmap(full_data[0])
     else:
         for dataset in full_data:
 
             relevant_data = dataset.loc[
                 (dataset['rt'] >= selection['xRange'][0])
                 & (dataset['rt'] <= selection['xRange'][1])
                 & (dataset['mass'] >= selection['yRange'][0])
                 & (dataset['mass'] <= selection['yRange'][1])
             ]
+            if relevant_data.empty:
+                continue
             if len(relevant_data) >= 20000:
                 break
         if len(relevant_data) <= 20000:
             return relevant_data
         downsampled = downsample_heatmap(relevant_data)
         return downsampled
🧹 Nitpick comments (13)
openms-streamlit-vue-component (1)

1-1: Optional: Align the submodule pointer with a tagged release or document the rationale.

For traceability, consider bumping to an annotated tag (if available) or adding a brief note in the CHANGELOG/PR description explaining why commit 9245874 is required (e.g., needed for MS2 heatmap/UI fixes). This helps future maintainers.

content/FLASHTnT/FLASHTnTDownload.py (2)

58-62: Avoid potential filename collisions inside ZIP

Using arcname=Path(filepath).name flattens the directory structure and can lead to collisions if two different outputs share the same basename. Prefer organizing by result key to preserve uniqueness and improve traceability.

Apply this diff to preserve a folder per target in the archive:

-                            for filepath in file_manager.get_results(
-                                experiment, targets, partial=True
-                            ).values():
-                                f.write(filepath, arcname=Path(filepath).name)
+                            for target_key, filepath in file_manager.get_results(
+                                experiment, targets, partial=True
+                            ).items():
+                                arcname = f"{target_key}/{Path(filepath).name}"
+                                f.write(filepath, arcname=arcname)

71-76: Set explicit MIME type for downloads

Small UX nit: set mime to 'application/zip' so browsers consistently recognize the file type.

-                        button_placeholder.download_button(
-                            "Download ⬇️", f, 
-                            file_name = f'{experiment}.zip',
-                            use_container_width=True
-                        )
+                        button_placeholder.download_button(
+                            "Download ⬇️", f,
+                            file_name=f'{experiment}.zip',
+                            mime='application/zip',
+                            use_container_width=True
+                        )
content/FLASHDeconv/FLASHDeconvDownload.py (2)

58-62: Prevent duplicate basenames in the ZIP

Flattening all files to their basenames can produce duplicate entries. Consider namespacing by target key.

-                            for filepath in file_manager.get_results(
-                                experiment, targets, partial=True
-                            ).values():
-                                f.write(filepath, arcname=Path(filepath).name)
+                            for target_key, filepath in file_manager.get_results(
+                                experiment, targets, partial=True
+                            ).items():
+                                arcname = f"{target_key}/{Path(filepath).name}"
+                                f.write(filepath, arcname=arcname)

71-76: Minor: add MIME type for the download

Improves browser handling.

-                        button_placeholder.download_button(
-                            "Download ⬇️", f, 
-                            file_name = f'{experiment}.zip',
-                            use_container_width=True
-                        )
+                        button_placeholder.download_button(
+                            "Download ⬇️", f,
+                            file_name=f'{experiment}.zip',
+                            mime='application/zip',
+                            use_container_width=True
+                        )
content/FLASHDeconv/FLASHDeconvLayoutManager.py (2)

283-289: Set correct MIME type for saved layout JSON

Current mime='json' is non-standard. Use 'application/json'.

 c4.download_button(
     label="Save Setting",
     data=json.dumps(getTrimmedLayoutSetting()),
     file_name='FLASHViewer_layout_settings.json',
-    mime='json',
+    mime='application/json',
     disabled=(validateSubmittedLayout()!=''),
 )

9-28: Avoid parallel lists for options/names to prevent drift

Maintaining two parallel lists risks misalignment over time. Consider a single source of truth structure (e.g., list of tuples or an OrderedDict) and derive both options and names from it.

Example pattern:

COMPONENTS = [
    ("MS1 raw heatmap", "ms1_raw_heatmap"),
    ("MS2 raw heatmap", "ms2_raw_heatmap"),
    ("MS1 deconvolved heatmap", "ms1_deconv_heat_map"),
    ("MS2 deconvolved heatmap", "ms2_deconv_heat_map"),
    # ...
]
COMPONENT_OPTIONS = [label for label, _ in COMPONENTS]
COMPONENT_NAMES   = [name  for _, name in COMPONENTS]

This reduces the chance of index mismatches.

src/parse/masstable.py (1)

218-224: Include MSLevel in MSSignalDF — consider tighter dtype to reduce memory

The levels array is correctly aligned with intensities. Minor improvement: MS level values are typically 1/2; using uint8 saves memory for large datasets.

-    levels = np.concatenate(
+    levels = np.concatenate(
         [
             [anno_df.loc[index, 'MSLevel']]*len(anno_df.loc[index, "intarray"]) 
             for index in anno_df.index
         ],
-        dtype=np.int32
+        dtype=np.uint8
     )
@@
-        'MSLevel' : levels
+        'MSLevel': levels

Also applies to: 243-244

content/quickstart.py (1)

145-159: Generalize rendering to follow available_boxes order

Currently, when two boxes are present, the order is hard-coded (Windows then CLI) and the available_boxes list is only used for its count. Consider iterating over available_boxes and dispatching per entry to reduce duplication and make ordering explicit.

Example approach:

renderers = {
    "windows": (render_windows_download_box, "windows_download_container"),
    "flash_cli": (render_flash_cli_tools_box, "flash_cli_tools_container"),
}
with center_col:
    for key in available_boxes:
        render, container = renderers[key]
        rendered = render()
        if rendered is not False:  # windows returns False if missing
            apply_download_box_styling(container)
src/parse/deconv.py (1)

35-44: Confirm intent: iterative downsampling mutates and stores progressively smaller caches

Iterating sizes in reversed order while reusing the progressively downsampled DataFrame is an optimization (fewer operations on decreasing data), but it changes the statistical properties vs. independent downsampling from the full set for each size. If you require each cache to be a downsample from the full data (not from the previous stage), we should recompute per size from the original partition.

If independent downsampling is desired:

-            for size in reversed(compute_compression_levels(20000, len(relevant_heatmap), logger=logger)):
-                # Downsample iteratively
-                relevant_heatmap = downsample_heatmap(relevant_heatmap, max_datapoints=size)
+            base_heatmap = relevant_heatmap
+            for size in reversed(compute_compression_levels(20000, len(base_heatmap), logger=logger)):
+                # Downsample independently from the original partition
+                relevant_heatmap = downsample_heatmap(base_heatmap, max_datapoints=size)
src/render/initialize.py (3)

35-57: MS2 deconvolved heatmap path looks correct and consistent

Fetching full data, collecting caches, selecting the smallest for initial render, and passing the full cache list via additional_data is consistent with the MS1 path.

There’s duplication across the four heatmap branches. Consider extracting a helper:

def _prepare_heatmap(file_manager, dataset, key_prefix):
    data_full = file_manager.get_results(dataset, [key_prefix])[key_prefix]
    caches = [file_manager.get_results(dataset, [f"{key_prefix}_{s}"])[f"{key_prefix}_{s}"]
              for s in compute_compression_levels(20000, len(data_full))]
    caches.append(data_full)
    return caches[0], caches

Then use:

data_to_send['deconv_heatmap_df'], additional_data['deconv_heatmap_df'] = _prepare_heatmap(file_manager, selected_data, 'ms2_deconv_heatmap')

81-104: MS2 raw heatmap initialization looks good

Mirrors the deconvolved branch properly. See previous comment regarding optional refactor to reduce duplication.


117-119: Switch to combined_spectrum for Annotated Spectrum: verify downstream assumptions

Using combined_spectrum for the Annotated Spectrum component changes the columns available to the PlotlyLineplot compared to anno_spectrum. Ensure the plotting code and any selection slicing in update.filter_data still align with these columns.

If you want, I can scan for accesses to per_scan_data columns under the "Annotated Spectrum" component and verify compatibility.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these settings in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between c691053 and 045cc1d.

⛔ Files ignored due to path filters (3)
  • js-component/dist/assets/index-1a66aa50.js is excluded by !**/dist/**
  • js-component/dist/assets/index-36755211.css is excluded by !**/dist/**
  • js-component/dist/index.html is excluded by !**/dist/**
📒 Files selected for processing (10)
  • content/FLASHDeconv/FLASHDeconvDownload.py (1 hunks)
  • content/FLASHDeconv/FLASHDeconvLayoutManager.py (2 hunks)
  • content/FLASHTnT/FLASHTnTDownload.py (1 hunks)
  • content/quickstart.py (2 hunks)
  • openms-streamlit-vue-component (1 hunks)
  • src/parse/deconv.py (1 hunks)
  • src/parse/masstable.py (3 hunks)
  • src/render/initialize.py (3 hunks)
  • src/render/update.py (1 hunks)
  • src/workflow/ParameterManager.py (1 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (2)
src/parse/deconv.py (2)
src/workflow/FileManager.py (1)
  • store_data (309-327)
src/render/compression.py (2)
  • compute_compression_levels (7-15)
  • downsample_heatmap (18-56)
src/render/initialize.py (3)
src/workflow/FileManager.py (1)
  • get_results (410-455)
src/render/compression.py (1)
  • compute_compression_levels (7-15)
src/render/components.py (1)
  • PlotlyHeatmap (38-45)
🪛 Ruff (0.12.2)
content/quickstart.py

39-40: Use a single with statement with multiple contexts instead of nested with statements

Combine with statements

(SIM117)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: build-openms
  • GitHub Check: build-vue-js-component
  • GitHub Check: build-full-app
🔇 Additional comments (10)
openms-streamlit-vue-component (1)

1-1: Submodule bump pinned to commit: looks good for reproducibility.

The submodule is pinned to an exact commit (9245874…), which is good for deterministic builds.

content/FLASHTnT/FLASHTnTDownload.py (1)

63-66: Good fix: standardize stored archive filename to avoid Windows path issues

Storing the cached archive as a constant 'download_archive.zip' under the 'download_archive' key is a solid approach to avoid long-path or illegal-character issues on Windows while keeping the user-facing name as '{experiment}.zip'.
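
As a minimal sketch of that naming split (archive_path and experiment are hypothetical variables; only the st.download_button call is real API):

import streamlit as st

# The on-disk cache uses the constant, Windows-safe name download_archive.zip;
# only the user-facing file_name carries the experiment name.
with open(archive_path, 'rb') as f:  # archive_path: hypothetical path to the cached zip
    st.download_button(
        "Download ⬇️", f,
        file_name=f'{experiment}.zip',
        mime='application/zip',
    )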

content/FLASHDeconv/FLASHDeconvDownload.py (1)

63-66: Consistent archive-name standardization looks good

Same as FLASHTnT: caching the archive as 'download_archive.zip' under the 'download_archive' key while exposing '{experiment}.zip' to users is a pragmatic fix for Windows path/charset issues.

src/workflow/ParameterManager.py (2)

74-77: Whitespace trimming for TOPP string params is a good improvement

Trimming before persisting reduces noisy diffs and avoids subtle CLI mismatches caused by stray whitespace.
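
A one-line sketch of the trimming step, assuming values may arrive as arbitrary Python objects (the helper name is hypothetical):

def normalize_topp_value(value):
    # Strip stray whitespace from string parameters; pass non-strings through.
    return value.strip() if isinstance(value, str) else value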


70-77: Verification complete — no params.json strings with leading/trailing whitespace; list-trimming optional

Scanned all params.json files in the repo; no string values or list-of-string elements with leading/trailing whitespace were found.

Files checked:

  • ./example-data/workspaces/default/flashdeconv/params.json
  • ./example-data/workspaces/default/flashtnt/params.json
  • ./example-data/workspaces/default/params.json
  • ./example-data/workspaces/demo_antibody/flashdeconv/params.json
  • ./example-data/workspaces/demo_antibody/flashtnt/params.json
  • ./example-data/workspaces/demo_antibody/params.json
  • ./example-data/workspaces/demo_aqpz/flashdeconv/params.json
  • ./example-data/workspaces/demo_aqpz/flashtnt/params.json
  • ./example-data/workspaces/demo_aqpz/params.json

Location to update if you want list trimming as a precaution:

  • src/workflow/ParameterManager.py (around lines 70–77)

Suggested (optional) diff to also trim list-of-strings:

-                    if isinstance(value, str):
-                        value = value.strip()
+                    if isinstance(value, str):
+                        value = value.strip()
+                    elif isinstance(value, list) and all(isinstance(x, str) for x in value):
+                        value = [x.strip() for x in value]
    
content/FLASHDeconv/FLASHDeconvLayoutManager.py (1)

11-14: MS2 heatmap identifiers verified — handlers exist

Ripgrep confirms the exact identifiers are used downstream.

  • content/FLASHDeconv/FLASHDeconvLayoutManager.py — COMPONENT_NAMES contains 'ms2_raw_heatmap' and 'ms2_deconv_heat_map'.
  • src/render/initialize.py — handler branches for both identifiers:
    • elif comp_name == 'ms2_deconv_heat_map' (handler)
    • elif comp_name == 'ms2_raw_heatmap' (handler), file_manager.get_results calls using ['ms2_raw_heatmap'] and cache keys like f'ms2_raw_heatmap_{size}'.

No further action required.

src/parse/masstable.py (2)

179-181: Propagating MSLevel to annotateddf enables consistent downstream filtering

Adding annotateddf['MSLevel'] ensures both deconvolved and annotated DataFrames carry MS level metadata. This is needed for per-MSLevel visualizations.


196-205: No action required — MSLevel is populated on both DataFrames

Verified: parseFLASHDeconvOutput assigns MSLevel to both returned DataFrames, and getMSSignalDF is only called with those DataFrames.

  • src/parse/masstable.py — msLevels are collected via spec.getMSLevel() and assigned to both df['MSLevel'] and annotateddf['MSLevel'] (around lines ~173–179).
  • src/parse/deconv.py — parseFLASHDeconvOutput(...) returns deconv_df, anno_df and getMSSignalDF(df) is called for both deconv_df and anno_df (parseDeconv, lines ~13 and ~23).
content/quickstart.py (2)

95-116: Consistent box styling is solid

The key-scoped CSS for container styling is clean and avoids global leakage. Good use of the st-key selector approach for Streamlit.
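
For reference, a minimal sketch of the st-key selector approach, assuming a Streamlit version that emits .st-key-<key> wrapper classes for keyed containers (the key name here is hypothetical):

import streamlit as st

with st.container(key="windows_download_container"):
    st.markdown("**FLASHApp for Windows**")

# Style only the container with that key, avoiding global CSS leakage.
st.markdown(
    """
    <style>
    .st-key-windows_download_container {
        border: 1px solid #e0e0e0;
        border-radius: 8px;
        padding: 1rem;
    }
    </style>
    """,
    unsafe_allow_html=True,
)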


74-82: Verify Streamlit version compatibility for link_button(type=...)

st.link_button’s type="secondary" parameter requires a recent Streamlit version. In older environments it may fall back to the default style or raise an error.

Would you like me to add a small capability check with a graceful fallback?
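
One possible shape for that check, a minimal sketch that inspects the installed signature instead of pinning a version (the wrapper name is hypothetical):

import inspect
import streamlit as st

def safe_link_button(label, url, **kwargs):
    # Drop `type` if this Streamlit version's link_button does not accept it.
    if "type" not in inspect.signature(st.link_button).parameters:
        kwargs.pop("type", None)
    return st.link_button(label, url, **kwargs)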

Comment on lines +39 to +49
    with cols[1]:
        with open("OpenMS-App.zip", "rb") as file:
            st.download_button(
                label="📥 Download for Windows",
                data=file,
                file_name="OpenMS-App.zip",
                mime="archive/zip",
                type="secondary",
                use_container_width=True,
                help="Download FLASHApp for Windows systems"
            )

🛠️ Refactor suggestion

Combine nested context managers and fix MIME type for ZIP

  • Address Ruff SIM117 by combining the nested with-statements.
  • Use the correct MIME type for ZIP files: application/zip (archive/zip is non-standard).
-    # Center the download button
-    with cols[1]:
-        with open("OpenMS-App.zip", "rb") as file:
-            st.download_button(
-                label="📥 Download for Windows",
-                data=file,
-                file_name="OpenMS-App.zip",
-                mime="archive/zip",
-                type="secondary",
-                use_container_width=True,
-                help="Download FLASHApp for Windows systems"
-            )
+    # Center the download button
+    with cols[1], open("OpenMS-App.zip", "rb") as file:
+        st.download_button(
+            label="📥 Download for Windows",
+            data=file,
+            file_name="OpenMS-App.zip",
+            mime="application/zip",
+            type="secondary",
+            use_container_width=True,
+            help="Download FLASHApp for Windows systems"
+        )
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

-    with cols[1]:
-        with open("OpenMS-App.zip", "rb") as file:
-            st.download_button(
-                label="📥 Download for Windows",
-                data=file,
-                file_name="OpenMS-App.zip",
-                mime="archive/zip",
-                type="secondary",
-                use_container_width=True,
-                help="Download FLASHApp for Windows systems"
-            )
+    # Center the download button
+    with cols[1], open("OpenMS-App.zip", "rb") as file:
+        st.download_button(
+            label="📥 Download for Windows",
+            data=file,
+            file_name="OpenMS-App.zip",
+            mime="application/zip",
+            type="secondary",
+            use_container_width=True,
+            help="Download FLASHApp for Windows systems"
+        )
🧰 Tools
🪛 Ruff (0.12.2)

39-40: Use a single with statement with multiple contexts instead of nested with statements

Combine with statements

(SIM117)

🤖 Prompt for AI Agents
In content/quickstart.py around lines 39 to 49, combine the nested
with-statements into a single context manager (e.g., "with cols[1],
open('OpenMS-App.zip', 'rb') as file:") to resolve Ruff SIM117, and change the
MIME type from the non-standard "archive/zip" to the correct "application/zip"
in the st.download_button call; keep the same parameters and help text.

@@ -1 +1 @@
-Subproject commit 1d46a276800d1b3d01c5515d059215253dee44bb
+Subproject commit 9245874feed9f2044b09a6023202adde20619eab

💡 Verification agent

🧩 Analysis chain

Verify CI and local dev flows fetch submodules recursively.

Ensure your CI and contributor docs clone/update submodules (since the sandbox won’t auto-fetch them). This avoids runtime/import errors when the submodule isn’t present.

Run this script to check for .gitmodules presence and whether workflows/docs include recursive submodule steps:


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "1) Verify .gitmodules exists and references 'openms-streamlit-vue-component'"
if [ -f .gitmodules ]; then
  echo "Found .gitmodules:"
  cat .gitmodules
else
  echo "WARNING: .gitmodules not found at repo root."
fi

echo
echo "2) Search for submodule instructions in docs and workflows"
# Look for typical workflow and doc files
fd -a -E node_modules -E dist -E build -e yml -e yaml -e md -e txt -e sh -e Makefile -e mk -e Dockerfile | \
  rg -n --hidden -e 'submodule|--recursive|git clone|git submodule'

echo
echo "3) Suggested commands to include (if missing):"
cat << 'EOF'
# CI step (example):
git submodule sync --recursive
git submodule update --init --recursive

# Local dev (README):
git clone --recurse-submodules <repo-url>
# or if already cloned:
git submodule update --init --recursive
EOF

Length of output: 757


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "1) .gitmodules (repo root):"
if [ -f .gitmodules ]; then
  sed -n '1,200p' .gitmodules
else
  echo "No .gitmodules found"
fi

echo
echo "2) CI / workflow files present (common locations):"
if [ -d .github/workflows ]; then
  find .github/workflows -maxdepth 1 -type f -print
else
  echo "No .github/workflows directory"
fi
[ -f .gitlab-ci.yml ] && echo ".gitlab-ci.yml exists"
[ -d .circleci ] && find .circleci -maxdepth 1 -type f -print || echo "No .circleci directory"
[ -f azure-pipelines.yml ] && echo "azure-pipelines.yml exists"

echo
echo "3) Top-level README / CONTRIBUTING / docs files (showing first 120 lines):"
for f in README* CONTRIBUTING*; do
  if [ -f "$f" ]; then
    echo "== $f =="
    sed -n '1,120p' "$f"
    echo
  fi
done
if [ -d docs ]; then
  find docs -maxdepth 2 -type f -print
fi

echo
echo "4) Search repository for submodule-related mentions (patterns: submodule|--recursive|git submodule|--recurse):"
rg -n --hidden -S -e 'submodule|--recursive|--recurse|git clone' || true

Length of output: 4398


Ensure CI and contributor docs fetch submodules recursively

Verified: .gitmodules exists and references openms-streamlit-vue-component, but docs/CI/Docker steps do not consistently fetch submodules recursively — this can cause missing files at build/runtime.

Files to update (examples)

  • .gitmodules — present (no change required).
  • README.md — currently shows git submodule init / git submodule update. Update to recommend recursive fetch (see snippet below).
  • Dockerfile (Dockerfile:20) — RUN git clone -b ${VUE_BRANCH} --single-branch ${VUE_REPO} clones non-recursively. Either add --recurse-submodules to the clone or run git submodule update --init --recursive after cloning.
  • .github/workflows/build-windows-executable-app.yaml — contains git submodule update --init THIRDPARTY (non-recursive). Change to use --recursive or ensure the checkout step fetches submodules.
  • Check other CI/workflow files (e.g., .github/workflows/build-docker-images.yml) and any Docker build contexts to ensure submodule contents are available to the build.
  • Note: local .git/config currently has recurseSubmodules = false — don’t rely on contributor-local config; enforce recursion in CI and docs.

Suggested commands to add/update:

# CI step (example):
git submodule sync --recursive
git submodule update --init --recursive

# Local dev (README):
git clone --recurse-submodules <repo-url>
# or if already cloned:
git submodule update --init --recursive

Please update README, Dockerfile, and CI workflow(s) (paths above) to ensure submodules are fetched recursively.

🤖 Prompt for AI Agents
In openms-streamlit-vue-component (README.md, Dockerfile line ~20,
.github/workflows/build-windows-executable-app.yaml and other CI workflow files
referenced) the project and CI docs/steps currently fetch submodules
non-recursively which can leave required files out of builds; update README to
recommend cloning with --recurse-submodules (and include the alternative git
submodule update --init --recursive for already-cloned repos), modify the
Dockerfile clone step to either add --recurse-submodules or run git submodule
sync --recursive && git submodule update --init --recursive immediately after
cloning, change the workflow step in
.github/workflows/build-windows-executable-app.yaml (and any other CI workflows
like build-docker-images.yml) to perform git submodule sync --recursive and git
submodule update --init --recursive (or configure the checkout action to fetch
submodules), and search other CI/Docker build contexts to replace non-recursive
submodule commands with the recursive variants so CI and container builds never
rely on contributor local config.

Comment on lines +25 to +33
+        for ms_level in [1, 2]:
+
-            # Downsample iteratively
-            heatmap = downsample_heatmap(heatmap, max_datapoints=size)
-            # Store compressed version
+            relevant_heatmap = heatmap[heatmap['MSLevel'] == ms_level].drop(columns=['MSLevel'])

             # Store full sized version
             file_manager.store_data(
-                dataset_id, f'ms1_{descriptor}_heatmap_{size}', heatmap
+                dataset_id, f'ms{ms_level}_{descriptor}_heatmap', relevant_heatmap
             )



💡 Verification agent

🧩 Analysis chain

Guard against missing MSLevel column and empty partitions

If MSLevel isn’t present (backward compatibility, older caches) the current code raises a KeyError on the filter. Also, empty per-level partitions should be handled explicitly to avoid downstream issues.

-        for ms_level in [1, 2]:
-            
-            relevant_heatmap = heatmap[heatmap['MSLevel'] == ms_level].drop(columns=['MSLevel'])
+        for ms_level in [1, 2]:
+            if 'MSLevel' in heatmap.columns:
+                relevant_heatmap = heatmap.loc[heatmap['MSLevel'] == ms_level].drop(columns=['MSLevel'])
+            else:
+                # Backward compatibility: treat all as MS1, make MS2 empty
+                relevant_heatmap = heatmap.copy() if ms_level == 1 else heatmap.iloc[0:0].copy()

Additionally, consider short-circuiting when relevant_heatmap is empty to avoid unnecessary work (compression loop will be skipped already, but this improves clarity).


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "Repository root: $(pwd)"
echo
echo "Checking for src/parse/deconv.py and nearby context..."
if [ -f src/parse/deconv.py ]; then
  echo "=== src/parse/deconv.py (first 240 lines) ==="
  nl -ba src/parse/deconv.py | sed -n '1,240p'
else
  echo "src/parse/deconv.py not found. Searching for files named deconv.py..."
  rg -n "deconv.py" || true
fi

echo
echo "Searching the repo for occurrences of MSLevel, heatmap, and relevant_heatmap..."
rg -n "MSLevel" || true
echo
rg -n "relevant_heatmap" || true
echo
rg -n "heatmap" || true

Length of output: 506


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "Repository root: $(pwd)"
echo

FILE="src/parse/deconv.py"
echo "Looking for $FILE..."
if [ -f "$FILE" ]; then
  echo "=== $FILE (first 240 lines) ==="
  sed -n '1,240p' "$FILE" || true
else
  echo "$FILE not found. Searching for deconv.py..."
  rg -n "deconv.py" || true
fi

echo
echo "Searching the repo for occurrences of MSLevel, relevant_heatmap, and heatmap..."
rg -n "MSLevel" || true
echo
rg -n "relevant_heatmap" || true
echo
rg -n "heatmap" || true

Length of output: 293913


🏁 Script executed:

#!/bin/bash
set -euo pipefail

FILE="src/parse/masstable.py"
echo "Showing $FILE (first 400 lines):"
if [ -f "$FILE" ]; then
  sed -n '1,400p' "$FILE" || true
else
  echo "$FILE not found"
fi

echo
echo "Searching for getMSSignalDF and MSLevel occurrences:"
rg -n "def getMSSignalDF|getMSSignalDF|MSLevel" || true

Length of output: 80766


Guard MSLevel usage in src/parse/deconv.py (add defensive check and skip empty partitions)

The pipeline normally adds MSLevel in src/parse/masstable.py (parseFLASHDeconvOutput / getMSSignalDF), but older caches or external inputs may omit it — add a defensive check and skip empty per-level partitions to avoid KeyError and needless storage.

Files to update

  • src/parse/deconv.py — loop that partitions heatmap by MSLevel (around the for ms_level in [1, 2]: block).
  • (reference) src/parse/masstable.py — MSLevel is set there (parseFLASHDeconvOutput / getMSSignalDF, around lines ~170–180), so the guard is only for backward compatibility.

Suggested change (replace the existing per-level loop with this):

-        for ms_level in [1, 2]:
-            
-            relevant_heatmap = heatmap[heatmap['MSLevel'] == ms_level].drop(columns=['MSLevel'])
-
-            # Store full sized version
-            file_manager.store_data(
-                dataset_id, f'ms{ms_level}_{descriptor}_heatmap', relevant_heatmap
-            )
+        for ms_level in [1, 2]:
+            # Defensive: some inputs/caches may not have MSLevel (backward compatibility)
+            if 'MSLevel' in heatmap.columns:
+                relevant_heatmap = heatmap.loc[heatmap['MSLevel'] == ms_level]
+                if 'MSLevel' in relevant_heatmap.columns:
+                    relevant_heatmap = relevant_heatmap.drop(columns=['MSLevel'])
+            else:
+                # Treat all data as MS1; leave MS2 empty
+                relevant_heatmap = heatmap.copy() if ms_level == 1 else heatmap.iloc[0:0].copy()
+
+            # Short-circuit empty partitions to avoid storing/processing them
+            if relevant_heatmap.empty:
+                continue
+
+            # Store full sized version
+            file_manager.store_data(
+                dataset_id, f'ms{ms_level}_{descriptor}_heatmap', relevant_heatmap
+            )


📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

-        for ms_level in [1, 2]:
-
-            relevant_heatmap = heatmap[heatmap['MSLevel'] == ms_level].drop(columns=['MSLevel'])
-
-            # Store full sized version
-            file_manager.store_data(
-                dataset_id, f'ms{ms_level}_{descriptor}_heatmap', relevant_heatmap
-            )
+        for ms_level in [1, 2]:
+            # Defensive: some inputs/caches may not have MSLevel (backward compatibility)
+            if 'MSLevel' in heatmap.columns:
+                relevant_heatmap = heatmap.loc[heatmap['MSLevel'] == ms_level]
+                if 'MSLevel' in relevant_heatmap.columns:
+                    relevant_heatmap = relevant_heatmap.drop(columns=['MSLevel'])
+            else:
+                # Treat all data as MS1; leave MS2 empty
+                relevant_heatmap = heatmap.copy() if ms_level == 1 else heatmap.iloc[0:0].copy()
+
+            # Short-circuit empty partitions to avoid storing/processing them
+            if relevant_heatmap.empty:
+                continue
+
+            # Store full sized version
+            file_manager.store_data(
+                dataset_id, f'ms{ms_level}_{descriptor}_heatmap', relevant_heatmap
+            )
🤖 Prompt for AI Agents
In src/parse/deconv.py around lines 25 to 33, the loop that partitions heatmap
by MSLevel must be made defensive: first check if the heatmap has an 'MSLevel'
column and skip the per-level partitioning entirely if it's missing; inside the
loop, after selecting rows for each ms_level, skip storing when the resulting
relevant_heatmap is empty (continue) to avoid KeyError and needless storage;
ensure you only call file_manager.store_data for non-empty partitions and do not
change behavior otherwise.

Comment on lines +113 to +121
     elif (component in ['Deconvolved MS1 Heatmap', 'Deconvolved MS2 Heatmap']):
         selection = 'heatmap_deconv' if '1' in component else 'heatmap_deconv2'
         if selection in selection_store:
             data['deconv_heatmap_df'] = render_heatmap(
                 additional_data['deconv_heatmap_df'],
-                selection_store['heatmap_deconv'],
+                selection_store[selection],
                 additional_data['dataset'], component
             )
-    elif (component == 'Raw MS1 Heatmap'):
-        if 'heatmap_raw' in selection_store:
+    elif (component == ['Raw MS1 Heatmap', 'Raw MS2 Heatmap']):

🛠️ Refactor suggestion

Make MS-level selection detection robust

Using '1' in component to infer MS1 vs MS2 is brittle. Prefer an explicit check (e.g., component contains 'MS1 ' vs 'MS2 ') or a small helper.

-    elif (component in ['Deconvolved MS1 Heatmap', 'Deconvolved MS2 Heatmap']):
-        selection = 'heatmap_deconv' if '1' in component else 'heatmap_deconv2'
+    elif component in ['Deconvolved MS1 Heatmap', 'Deconvolved MS2 Heatmap']:
+        selection = 'heatmap_deconv' if 'MS1 ' in component else 'heatmap_deconv2'
         if selection in selection_store:
             data['deconv_heatmap_df'] = render_heatmap(
                 additional_data['deconv_heatmap_df'], 
                 selection_store[selection],
                 additional_data['dataset'], component
             )
-    elif (component == ['Raw MS1 Heatmap', 'Raw MS2 Heatmap']):
-        selection = 'heatmap_raw' if '1' in component else 'heatmap_raw2'
+    elif component in ['Raw MS1 Heatmap', 'Raw MS2 Heatmap']:
+        selection = 'heatmap_raw' if 'MS1 ' in component else 'heatmap_raw2'
         if selection in selection_store:
             data['raw_heatmap_df'] = render_heatmap(
                 additional_data['raw_heatmap_df'], 
                 selection_store[selection],
                 additional_data['dataset'], component
             )

Optionally, add a guard to avoid KeyError if additional_data keys are missing:

if selection in selection_store and 'deconv_heatmap_df' in additional_data:
    ...

Also applies to: 121-129

🤖 Prompt for AI Agents
In src/render/update.py around lines 113 to 121, the MS-level detection is
brittle (using "'1' in component" and an incorrect "elif (component == ['Raw MS1
Heatmap', 'Raw MS2 Heatmap'])" list comparison) and can raise KeyError when
additional_data keys are missing; replace the fragile check with an explicit
MS-level test (e.g., check if 'MS1' in component or 'MS2' in component or
implement a small helper that returns 'heatmap_deconv' vs 'heatmap_deconv2'),
fix the elif to compare strings/containment correctly, and add guards like
"selection in selection_store and 'deconv_heatmap_df' in additional_data" before
calling render_heatmap; apply the same changes to the analogous block at lines
121-129.

⚠️ Potential issue

Fix raw heatmap branch: equality to list is always False

The condition uses equality against a list, so the branch never executes. This breaks Raw MS1/MS2 heatmap updates.

-    elif (component == ['Raw MS1 Heatmap', 'Raw MS2 Heatmap']):
+    elif (component in ['Raw MS1 Heatmap', 'Raw MS2 Heatmap']):
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

-    elif (component in ['Deconvolved MS1 Heatmap', 'Deconvolved MS2 Heatmap']):
-        selection = 'heatmap_deconv' if '1' in component else 'heatmap_deconv2'
-        if selection in selection_store:
-            data['deconv_heatmap_df'] = render_heatmap(
-                additional_data['deconv_heatmap_df'],
-                selection_store[selection],
-                additional_data['dataset'], component
-            )
-    elif (component == ['Raw MS1 Heatmap', 'Raw MS2 Heatmap']):
+    elif (component in ['Deconvolved MS1 Heatmap', 'Deconvolved MS2 Heatmap']):
+        selection = 'heatmap_deconv' if '1' in component else 'heatmap_deconv2'
+        if selection in selection_store:
+            data['deconv_heatmap_df'] = render_heatmap(
+                additional_data['deconv_heatmap_df'],
+                selection_store[selection],
+                additional_data['dataset'], component
+            )
+    elif (component in ['Raw MS1 Heatmap', 'Raw MS2 Heatmap']):
🤖 Prompt for AI Agents
In src/render/update.py around lines 113 to 121, the Raw heatmap branch uses
equality against a list (component == ['Raw MS1 Heatmap', 'Raw MS2 Heatmap']) so
it never executes; change the condition to check membership (e.g. if component
in ['Raw MS1 Heatmap', 'Raw MS2 Heatmap']) and mirror the deconvolved branch's
selection logic by setting selection = 'heatmap' if '1' in component else
'heatmap2', then if selection in selection_store call render_heatmap with the
appropriate raw heatmap data key (e.g. additional_data['raw_heatmap_df']) and
assign the result into data['raw_heatmap_df'] (ensuring you pass
selection_store[selection], additional_data['dataset'], and component in the
same argument order).

@t0mdavid-m t0mdavid-m merged commit 9a261ed into develop Aug 14, 2025
6 of 8 checks passed