New Features #37
Conversation
Walkthrough

Adds MS2 heatmap support end-to-end: propagates MSLevel in parsed data, exports per-MSLevel heatmaps, initializes and updates MS2 heatmap components, and unifies heatmap selection handling. Standardizes stored download archive filenames, introduces modular download UI in quickstart, trims whitespace in saved parameters, and updates a submodule reference.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant UI as User/UI
    participant Init as initialize_data
    participant FM as FileManager
    participant Cache as Cache/Store
    participant Plot as PlotlyHeatmap
    UI->>Init: initialize_data("ms2_deconv_heat_map", ...)
    Init->>FM: get("ms2_deconv_heatmap")
    FM-->>Init: data_full (MS2 deconv heatmap)
    Init->>Cache: compute_compression_levels(len(data_full))
    loop For each size
        Init->>FM: get(f"ms2_deconv_heatmap_{size}")
        FM-->>Init: cached subset
    end
    Init->>Plot: render with smallest cache + additional_data (all caches)
    Plot-->>UI: MS2 Deconvolved Heatmap displayed
```

```mermaid
sequenceDiagram
    participant UI as User/UI
    participant Update as filter_data (update)
    participant Store as selection_store
    participant Render as render_heatmap
    UI->>Update: selection on "Raw MS2 Heatmap"
    Update->>Update: selection = "heatmap_raw2"
    Update->>Store: read selection_store[selection]
    alt selection exists
        Update->>Render: render_heatmap(additional_data["raw_heatmap_df"], dataset, selection, component)
    else No-op
    end
```
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
Actionable comments posted: 5
🔭 Outside diff range comments (1)
src/render/update.py (1)
30-52: Avoid downsampling on empty data to prevent runtime errors

If the heatmap partition is empty (e.g., no MS2 in dataset), calling downsample_heatmap can error. Add early returns for empty frames.
```diff
 @st.cache_data(max_entries=4, show_spinner=False, hash_funcs=hash_funcs)
 def render_heatmap(full_data, selection, dataset_name, component_name):
+    # Guard: if the smallest cache is empty, return early
+    if len(full_data) > 0 and isinstance(full_data[0], pd.DataFrame) and full_data[0].empty:
+        return full_data[0]
     if (
         (selection['xRange'][0] < 0) and (selection['xRange'][1] < 0) and
         (selection['yRange'][0] < 0) and (selection['yRange'][1] < 0)
     ):
-        return downsample_heatmap(full_data[0])
+        return full_data[0] if len(full_data[0]) <= 20000 else downsample_heatmap(full_data[0])
     else:
         for dataset in full_data:
             relevant_data = dataset.loc[
                 (dataset['rt'] >= selection['xRange'][0]) &
                 (dataset['rt'] <= selection['xRange'][1]) &
                 (dataset['mass'] >= selection['yRange'][0]) &
                 (dataset['mass'] <= selection['yRange'][1])
             ]
+            if relevant_data.empty:
+                continue
             if len(relevant_data) >= 20000:
                 break
         if len(relevant_data) <= 20000:
             return relevant_data
         downsampled = downsample_heatmap(relevant_data)
         return downsampled
```
🧹 Nitpick comments (13)
openms-streamlit-vue-component (1)
1-1: Optional: Align the submodule pointer with a tagged release or document rationale.

For traceability, consider bumping to an annotated tag (if available) or add a brief note in CHANGELOG/PR description explaining why commit 9245874 is required (e.g., needed for MS2 heatmap/UI fixes). This helps future maintainers.
content/FLASHTnT/FLASHTnTDownload.py (2)
58-62: Avoid potential filename collisions inside ZIP

Using arcname=Path(filepath).name flattens the directory structure and can lead to collisions if two different outputs share the same basename. Prefer organizing by result key to preserve uniqueness and improve traceability.
Apply this diff to preserve a folder per target in the archive:
```diff
-        for filepath in file_manager.get_results(
-            experiment, targets, partial=True
-        ).values():
-            f.write(filepath, arcname=Path(filepath).name)
+        for target_key, filepath in file_manager.get_results(
+            experiment, targets, partial=True
+        ).items():
+            arcname = f"{target_key}/{Path(filepath).name}"
+            f.write(filepath, arcname=arcname)
```
71-76: Set explicit MIME type for downloads

Small UX nit: set mime to 'application/zip' so browsers consistently recognize the file type.
```diff
-        button_placeholder.download_button(
-            "Download ⬇️", f,
-            file_name = f'{experiment}.zip',
-            use_container_width=True
-        )
+        button_placeholder.download_button(
+            "Download ⬇️", f,
+            file_name=f'{experiment}.zip',
+            mime='application/zip',
+            use_container_width=True
+        )
```

content/FLASHDeconv/FLASHDeconvDownload.py (2)
58-62: Prevent duplicate basenames in the ZIP

Flattening all files to their basenames can produce duplicate entries. Consider namespacing by target key.
```diff
-        for filepath in file_manager.get_results(
-            experiment, targets, partial=True
-        ).values():
-            f.write(filepath, arcname=Path(filepath).name)
+        for target_key, filepath in file_manager.get_results(
+            experiment, targets, partial=True
+        ).items():
+            arcname = f"{target_key}/{Path(filepath).name}"
+            f.write(filepath, arcname=arcname)
```
71-76: Minor: add MIME type for the download

Improves browser handling.
```diff
-        button_placeholder.download_button(
-            "Download ⬇️", f,
-            file_name = f'{experiment}.zip',
-            use_container_width=True
-        )
+        button_placeholder.download_button(
+            "Download ⬇️", f,
+            file_name=f'{experiment}.zip',
+            mime='application/zip',
+            use_container_width=True
+        )
```

content/FLASHDeconv/FLASHDeconvLayoutManager.py (2)
283-289: Set correct MIME type for saved layout JSON

Current mime='json' is non-standard. Use 'application/json'.
```diff
 c4.download_button(
     label="Save Setting",
     data=json.dumps(getTrimmedLayoutSetting()),
     file_name='FLASHViewer_layout_settings.json',
-    mime='json',
+    mime='application/json',
     disabled=(validateSubmittedLayout()!=''),
 )
```
9-28: Avoid parallel lists for options/names to prevent drift

Maintaining two parallel lists risks misalignment over time. Consider a single source of truth structure (e.g., a list of tuples or an OrderedDict) and derive both options and names from it.
Example pattern:
```python
COMPONENTS = [
    ("MS1 raw heatmap", "ms1_raw_heatmap"),
    ("MS2 raw heatmap", "ms2_raw_heatmap"),
    ("MS1 deconvolved heatmap", "ms1_deconv_heat_map"),
    ("MS2 deconvolved heatmap", "ms2_deconv_heat_map"),
    # ...
]
COMPONENT_OPTIONS = [label for label, _ in COMPONENTS]
COMPONENT_NAMES = [name for _, name in COMPONENTS]
```

This reduces the chance of index mismatches.
src/parse/masstable.py (1)
218-224: Include MSLevel in MSSignalDF — consider tighter dtype to reduce memory

The levels array is correctly aligned with intensities. Minor improvement: MS level values are typically 1/2; using uint8 saves memory for large datasets.
```diff
-    levels = np.concatenate(
+    levels = np.concatenate(
         [
             [anno_df.loc[index, 'MSLevel']]*len(anno_df.loc[index, "intarray"])
             for index in anno_df.index
         ],
-        dtype=np.int32
+        dtype=np.uint8
     )
@@
-        'MSLevel' : levels
+        'MSLevel': levels
```

Also applies to: 243-244
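For scale, a standalone sketch of the savings (illustrative array size, not from the repo):

```python
import numpy as np

# MS levels only ever hold 1 or 2, so uint8 is ample.
levels_int32 = np.ones(10_000_000, dtype=np.int32)
levels_uint8 = np.ones(10_000_000, dtype=np.uint8)

# int32 stores 4 bytes per value, uint8 stores 1.
print(levels_int32.nbytes // 2**20, "MiB vs", levels_uint8.nbytes // 2**20, "MiB")  # 38 MiB vs 9 MiB
```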
content/quickstart.py (1)
145-159: Generalize rendering to follow available_boxes order

Currently, when two boxes are present, the order is hard-coded (Windows then CLI) and the available_boxes list is only used for its count. Consider iterating over available_boxes and dispatching per entry to reduce duplication and make ordering explicit.
Example approach:
```python
renderers = {
    "windows": (render_windows_download_box, "windows_download_container"),
    "flash_cli": (render_flash_cli_tools_box, "flash_cli_tools_container"),
}

with center_col:
    for key in available_boxes:
        render, container = renderers[key]
        rendered = render()
        if rendered is not False:  # windows returns False if missing
            apply_download_box_styling(container)
```

src/parse/deconv.py (1)
35-44: Confirm intent: iterative downsampling mutates and stores progressively smaller caches

Iterating sizes in reversed order while reusing the progressively downsampled DataFrame is an optimization (fewer operations on decreasing data), but it changes the statistical properties vs. independent downsampling from the full set for each size. If you require each cache to be a downsample from the full data (not from the previous stage), we should recompute per size from the original partition.
If independent downsampling is desired:
```diff
-    for size in reversed(compute_compression_levels(20000, len(relevant_heatmap), logger=logger)):
-        # Downsample iteratively
-        relevant_heatmap = downsample_heatmap(relevant_heatmap, max_datapoints=size)
+    base_heatmap = relevant_heatmap
+    for size in reversed(compute_compression_levels(20000, len(base_heatmap), logger=logger)):
+        # Downsample independently from the original partition
+        relevant_heatmap = downsample_heatmap(base_heatmap, max_datapoints=size)
```

src/render/initialize.py (3)
35-57: MS2 deconvolved heatmap path looks correct and consistent

Fetching full data, collecting caches, selecting the smallest for initial render, and passing the full cache list via additional_data is consistent with the MS1 path.
There’s duplication across the four heatmap branches. Consider extracting a helper:
```python
def _prepare_heatmap(file_manager, dataset, key_prefix):
    data_full = file_manager.get_results(dataset, [key_prefix])[key_prefix]
    caches = [
        file_manager.get_results(dataset, [f"{key_prefix}_{s}"])[f"{key_prefix}_{s}"]
        for s in compute_compression_levels(20000, len(data_full))
    ]
    caches.append(data_full)
    return caches[0], caches
```

Then use:
```python
data_to_send['deconv_heatmap_df'], additional_data['deconv_heatmap_df'] = _prepare_heatmap(
    file_manager, selected_data, 'ms2_deconv_heatmap'
)
```
81-104: MS2 raw heatmap initialization looks good

Mirrors the deconvolved branch properly. See previous comment regarding optional refactor to reduce duplication.
117-119: Switch to combined_spectrum for Annotated Spectrum: verify downstream assumptions

Using combined_spectrum for the Annotated Spectrum component changes the columns available to the PlotlyLineplot compared to anno_spectrum. Ensure the plotting code and any selection slicing in update.filter_data still align with these columns.
If you want, I can scan for accesses to per_scan_data columns under the "Annotated Spectrum" component and verify compatibility.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these settings in your CodeRabbit configuration.
⛔ Files ignored due to path filters (3)
- `js-component/dist/assets/index-1a66aa50.js` is excluded by `!**/dist/**`
- `js-component/dist/assets/index-36755211.css` is excluded by `!**/dist/**`
- `js-component/dist/index.html` is excluded by `!**/dist/**`
📒 Files selected for processing (10)
- content/FLASHDeconv/FLASHDeconvDownload.py (1 hunk)
- content/FLASHDeconv/FLASHDeconvLayoutManager.py (2 hunks)
- content/FLASHTnT/FLASHTnTDownload.py (1 hunk)
- content/quickstart.py (2 hunks)
- openms-streamlit-vue-component (1 hunk)
- src/parse/deconv.py (1 hunk)
- src/parse/masstable.py (3 hunks)
- src/render/initialize.py (3 hunks)
- src/render/update.py (1 hunk)
- src/workflow/ParameterManager.py (1 hunk)
🧰 Additional context used
🧬 Code Graph Analysis (2)
src/parse/deconv.py (2)
- src/workflow/FileManager.py (1): store_data (309-327)
- src/render/compression.py (2): compute_compression_levels (7-15), downsample_heatmap (18-56)

src/render/initialize.py (3)
- src/workflow/FileManager.py (1): get_results (410-455)
- src/render/compression.py (1): compute_compression_levels (7-15)
- src/render/components.py (1): PlotlyHeatmap (38-45)
🪛 Ruff (0.12.2)
content/quickstart.py
39-40: Use a single with statement with multiple contexts instead of nested with statements
Combine with statements
(SIM117)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: build-openms
- GitHub Check: build-vue-js-component
- GitHub Check: build-full-app
🔇 Additional comments (10)
openms-streamlit-vue-component (1)
1-1: Submodule bump pinned to commit: looks good for reproducibility.

The submodule is pinned to an exact commit (9245874…), which is good for deterministic builds.
content/FLASHTnT/FLASHTnTDownload.py (1)
63-66: Good fix: standardize stored archive filename to avoid Windows path issues

Storing the cached archive as a constant 'download_archive.zip' under the 'download_archive' key is a solid approach to avoid long-path or illegal-character issues on Windows while keeping the user-facing name as '{experiment}.zip'.
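A minimal sketch of the naming split (helper name and workspace layout are hypothetical, for illustration only):

```python
from pathlib import Path

def archive_paths(workspace: Path, experiment: str) -> tuple[Path, str]:
    """Return (on-disk cache path, user-facing download name).

    The stored filename is a constant, Windows-safe name; only the
    download_button file_name carries the experiment name.
    """
    stored = workspace / "download_archive" / "download_archive.zip"
    user_facing = f"{experiment}.zip"
    return stored, user_facing
```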
content/FLASHDeconv/FLASHDeconvDownload.py (1)
63-66: Consistent archive-name standardization looks good

Same as FLASHTnT: caching the archive as 'download_archive.zip' under the 'download_archive' key while exposing '{experiment}.zip' to users is a pragmatic fix for Windows path/charset issues.
src/workflow/ParameterManager.py (2)
74-77: Whitespace trimming for TOPP string params is a good improvement

Trimming before persisting reduces noisy diffs and avoids subtle CLI mismatches caused by stray whitespace.
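As a quick illustration of why trimming matters (parameter names and values are made up):

```python
# A stray space silently changes the argument handed to the TOPP tool.
raw_params = {"in": "input.mzML ", "algorithm:min_charge": " 2"}

cleaned = {
    key: value.strip() if isinstance(value, str) else value
    for key, value in raw_params.items()
}
assert cleaned["in"] == "input.mzML"
assert cleaned["algorithm:min_charge"] == "2"
```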
70-77: Verification complete — no params.json strings with leading/trailing whitespace; list-trimming optional

Scanned all params.json files in the repo; no string values or list-of-string elements with leading/trailing whitespace were found.
Files checked:
- ./example-data/workspaces/default/flashdeconv/params.json
- ./example-data/workspaces/default/flashtnt/params.json
- ./example-data/workspaces/default/params.json
- ./example-data/workspaces/demo_antibody/flashdeconv/params.json
- ./example-data/workspaces/demo_antibody/flashtnt/params.json
- ./example-data/workspaces/demo_antibody/params.json
- ./example-data/workspaces/demo_aqpz/flashdeconv/params.json
- ./example-data/workspaces/demo_aqpz/flashtnt/params.json
- ./example-data/workspaces/demo_aqpz/params.json
Location to update if you want list trimming as a precaution:
- src/workflow/ParameterManager.py (around lines 70–77)
Suggested (optional) diff to also trim list-of-strings:
```diff
 if isinstance(value, str):
     value = value.strip()
+elif isinstance(value, list) and all(isinstance(x, str) for x in value):
+    value = [x.strip() for x in value]
```
11-14: MS2 heatmap identifiers verified — handlers exist

Ripgrep confirms the exact identifiers are used downstream.
- content/FLASHDeconv/FLASHDeconvLayoutManager.py — COMPONENT_NAMES contains 'ms2_raw_heatmap' and 'ms2_deconv_heat_map'.
- src/render/initialize.py — handler branches for both identifiers:
- elif comp_name == 'ms2_deconv_heat_map' (handler)
- elif comp_name == 'ms2_raw_heatmap' (handler), file_manager.get_results calls using ['ms2_raw_heatmap'] and cache keys like f'ms2_raw_heatmap_{size}'.
No further action required.
src/parse/masstable.py (2)
179-181: Propagating MSLevel to annotateddf enables consistent downstream filtering

Adding annotateddf['MSLevel'] ensures both deconvolved and annotated DataFrames carry MS level metadata. This is needed for per-MSLevel visualizations.
196-205: No action required — MSLevel is populated on both DataFrames

Verified: parseFLASHDeconvOutput assigns MSLevel to both returned DataFrames, and getMSSignalDF is only called with those DataFrames.
- src/parse/masstable.py — msLevels are collected via spec.getMSLevel() and assigned to both df['MSLevel'] and annotateddf['MSLevel'] (around lines ~173–179).
- src/parse/deconv.py — parseFLASHDeconvOutput(...) returns deconv_df, anno_df and getMSSignalDF(df) is called for both deconv_df and anno_df (parseDeconv, lines ~13 and ~23).
content/quickstart.py (2)
95-116: Consistent box styling is solid

The key-scoped CSS for container styling is clean and avoids global leakage. Good use of the st-key selector approach for Streamlit.
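For reference, the pattern looks roughly like this (container key and CSS values are illustrative; assumes a Streamlit version that adds an `st-key-<key>` class to keyed containers):

```python
import streamlit as st

# Keyed container: Streamlit tags it with the CSS class "st-key-download_box".
with st.container(key="download_box"):
    st.markdown("Download box content")

# CSS scoped to that single container, so the styling doesn't leak globally.
st.markdown(
    """
    <style>
    .st-key-download_box {
        border: 1px solid #d0d0d0;
        border-radius: 8px;
        padding: 1rem;
    }
    </style>
    """,
    unsafe_allow_html=True,
)
```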
74-82: Verify Streamlit version compatibility for link_button(type=...)

st.link_button's type="secondary" parameter requires a recent Streamlit version. If older environments are in play, this may render as default or error.
Would you like me to add a small capability check with a graceful fallback?
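One possible shape for such a check (a sketch; it assumes `inspect.signature` can introspect the installed `st.link_button`):

```python
import inspect
import streamlit as st

def safe_link_button(label: str, url: str, **kwargs):
    """Call st.link_button, dropping kwargs the installed version doesn't accept."""
    supported = inspect.signature(st.link_button).parameters
    accepted = {k: v for k, v in kwargs.items() if k in supported}
    return st.link_button(label, url, **accepted)

# 'type' is passed through on recent Streamlit and silently dropped on older versions.
safe_link_button("Open docs", "https://example.com", type="secondary", use_container_width=True)
```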
```python
with cols[1]:
    with open("OpenMS-App.zip", "rb") as file:
        st.download_button(
            label="📥 Download for Windows",
            data=file,
            file_name="OpenMS-App.zip",
            mime="archive/zip",
            type="secondary",
            use_container_width=True,
            help="Download FLASHApp for Windows systems"
        )
```
🛠️ Refactor suggestion
Combine nested context managers and fix MIME type for ZIP
- Address Ruff SIM117 by combining the nested with-statements.
- Use the correct MIME type for ZIP files: application/zip (archive/zip is non-standard).
```diff
-        # Center the download button
-        with cols[1]:
-            with open("OpenMS-App.zip", "rb") as file:
-                st.download_button(
-                    label="📥 Download for Windows",
-                    data=file,
-                    file_name="OpenMS-App.zip",
-                    mime="archive/zip",
-                    type="secondary",
-                    use_container_width=True,
-                    help="Download FLASHApp for Windows systems"
-                )
+        # Center the download button
+        with cols[1], open("OpenMS-App.zip", "rb") as file:
+            st.download_button(
+                label="📥 Download for Windows",
+                data=file,
+                file_name="OpenMS-App.zip",
+                mime="application/zip",
+                type="secondary",
+                use_container_width=True,
+                help="Download FLASHApp for Windows systems"
+            )
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```diff
-with cols[1]:
-    with open("OpenMS-App.zip", "rb") as file:
-        st.download_button(
-            label="📥 Download for Windows",
-            data=file,
-            file_name="OpenMS-App.zip",
-            mime="archive/zip",
-            type="secondary",
-            use_container_width=True,
-            help="Download FLASHApp for Windows systems"
-        )
+# Center the download button
+with cols[1], open("OpenMS-App.zip", "rb") as file:
+    st.download_button(
+        label="📥 Download for Windows",
+        data=file,
+        file_name="OpenMS-App.zip",
+        mime="application/zip",
+        type="secondary",
+        use_container_width=True,
+        help="Download FLASHApp for Windows systems"
+    )
```
🧰 Tools
🪛 Ruff (0.12.2)
39-40: Use a single with statement with multiple contexts instead of nested with statements
Combine with statements
(SIM117)
🤖 Prompt for AI Agents
In content/quickstart.py around lines 39 to 49, combine the nested
with-statements into a single context manager (e.g., "with cols[1],
open('OpenMS-App.zip', 'rb') as file:") to resolve Ruff SIM117, and change the
MIME type from the non-standard "archive/zip" to the correct "application/zip"
in the st.download_button call; keep the same parameters and help text.
```diff
@@ -1 +1 @@
-Subproject commit 1d46a276800d1b3d01c5515d059215253dee44bb
+Subproject commit 9245874feed9f2044b09a6023202adde20619eab
```
💡 Verification agent
🧩 Analysis chain
Verify CI and local dev flows fetch submodules recursively.
Ensure your CI and contributor docs clone/update submodules (since the sandbox won’t auto-fetch them). This avoids runtime/import errors when the submodule isn’t present.
Run this script to check for .gitmodules presence and whether workflows/docs include recursive submodule steps:
🏁 Script executed:

```bash
#!/bin/bash
set -euo pipefail

echo "1) Verify .gitmodules exists and references 'openms-streamlit-vue-component'"
if [ -f .gitmodules ]; then
  echo "Found .gitmodules:"
  cat .gitmodules
else
  echo "WARNING: .gitmodules not found at repo root."
fi
echo

echo "2) Search for submodule instructions in docs and workflows"
# Look for typical workflow and doc files
fd -a -E node_modules -E dist -E build -e yml -e yaml -e md -e txt -e sh -e Makefile -e mk -e Dockerfile | \
  rg -n --hidden -e 'submodule|--recursive|git clone|git submodule'
echo

echo "3) Suggested commands to include (if missing):"
cat << 'EOF'
# CI step (example):
git submodule sync --recursive
git submodule update --init --recursive

# Local dev (README):
git clone --recurse-submodules <repo-url>
# or if already cloned:
git submodule update --init --recursive
EOF
```

Length of output: 757
🏁 Script executed:

```bash
#!/bin/bash
set -euo pipefail

echo "1) .gitmodules (repo root):"
if [ -f .gitmodules ]; then
  sed -n '1,200p' .gitmodules
else
  echo "No .gitmodules found"
fi
echo

echo "2) CI / workflow files present (common locations):"
if [ -d .github/workflows ]; then
  find .github/workflows -maxdepth 1 -type f -print
else
  echo "No .github/workflows directory"
fi
[ -f .gitlab-ci.yml ] && echo ".gitlab-ci.yml exists"
[ -d .circleci ] && find .circleci -maxdepth 1 -type f -print || echo "No .circleci directory"
[ -f azure-pipelines.yml ] && echo "azure-pipelines.yml exists"
echo

echo "3) Top-level README / CONTRIBUTING / docs files (showing first 120 lines):"
for f in README* CONTRIBUTING*; do
  if [ -f "$f" ]; then
    echo "== $f =="
    sed -n '1,120p' "$f"
    echo
  fi
done
if [ -d docs ]; then
  find docs -maxdepth 2 -type f -print
fi
echo

echo "4) Search repository for submodule-related mentions (patterns: submodule|--recursive|git submodule|--recurse):"
rg -n --hidden -S -e 'submodule|--recursive|--recurse|git clone' || true
```

Length of output: 4398
Ensure CI and contributor docs fetch submodules recursively
Verified: .gitmodules exists and references openms-streamlit-vue-component, but docs/CI/Docker steps do not consistently fetch submodules recursively — this can cause missing files at build/runtime.
Files to update (examples)
- .gitmodules — present (no change required).
- README.md — currently shows `git submodule init` / `git submodule update`. Update to recommend recursive fetch (see snippet below).
- Dockerfile (Dockerfile:20) — `RUN git clone -b ${VUE_BRANCH} --single-branch ${VUE_REPO}` clones non-recursively. Either add `--recurse-submodules` to the clone or run `git submodule update --init --recursive` after cloning.
- .github/workflows/build-windows-executable-app.yaml — contains `git submodule update --init THIRDPARTY` (non-recursive). Change it to use `--recursive` or ensure the checkout step fetches submodules.
- Check other CI/workflow files (e.g., .github/workflows/build-docker-images.yml) and any Docker build contexts to ensure submodule contents are available to the build.
- Note: local .git/config currently has `recurseSubmodules = false` — don't rely on contributor-local config; enforce recursion in CI and docs.
Suggested commands to add/update:

```bash
# CI step (example):
git submodule sync --recursive
git submodule update --init --recursive

# Local dev (README):
git clone --recurse-submodules <repo-url>
# or if already cloned:
git submodule update --init --recursive
```
Please update README, Dockerfile, and CI workflow(s) (paths above) to ensure submodules are fetched recursively.
🤖 Prompt for AI Agents
In openms-streamlit-vue-component (README.md, Dockerfile line ~20,
.github/workflows/build-windows-executable-app.yaml and other CI workflow files
referenced) the project and CI docs/steps currently fetch submodules
non-recursively which can leave required files out of builds; update README to
recommend cloning with --recurse-submodules (and include the alternative git
submodule update --init --recursive for already-cloned repos), modify the
Dockerfile clone step to either add --recurse-submodules or run git submodule
sync --recursive && git submodule update --init --recursive immediately after
cloning, change the workflow step in
.github/workflows/build-windows-executable-app.yaml (and any other CI workflows
like build-docker-images.yml) to perform git submodule sync --recursive and git
submodule update --init --recursive (or configure the checkout action to fetch
submodules), and search other CI/Docker build contexts to replace non-recursive
submodule commands with the recursive variants so CI and container builds never
rely on contributor local config.
```diff
+        for ms_level in [1, 2]:
 
-            # Downsample iteratively
-            heatmap = downsample_heatmap(heatmap, max_datapoints=size)
-            # Store compressed version
+            relevant_heatmap = heatmap[heatmap['MSLevel'] == ms_level].drop(columns=['MSLevel'])
 
             # Store full sized version
             file_manager.store_data(
-                dataset_id, f'ms1_{descriptor}_heatmap_{size}', heatmap
+                dataset_id, f'ms{ms_level}_{descriptor}_heatmap', relevant_heatmap
             )
```
💡 Verification agent
🧩 Analysis chain
Guard against missing MSLevel column and empty partitions
If MSLevel isn’t present (backward compatibility, older caches) the current code raises a KeyError on the filter. Also, empty per-level partitions should be handled explicitly to avoid downstream issues.
```diff
-    for ms_level in [1, 2]:
-
-        relevant_heatmap = heatmap[heatmap['MSLevel'] == ms_level].drop(columns=['MSLevel'])
+    for ms_level in [1, 2]:
+        if 'MSLevel' in heatmap.columns:
+            relevant_heatmap = heatmap.loc[heatmap['MSLevel'] == ms_level].drop(columns=['MSLevel'])
+        else:
+            # Backward compatibility: treat all as MS1, make MS2 empty
+            relevant_heatmap = heatmap.copy() if ms_level == 1 else heatmap.iloc[0:0].copy()
```

Additionally, consider short-circuiting when relevant_heatmap is empty to avoid unnecessary work (the compression loop will be skipped already, but this improves clarity).
🏁 Script executed:

```bash
#!/bin/bash
set -euo pipefail

echo "Repository root: $(pwd)"
echo

echo "Checking for src/parse/deconv.py and nearby context..."
if [ -f src/parse/deconv.py ]; then
  echo "=== src/parse/deconv.py (first 240 lines) ==="
  nl -ba src/parse/deconv.py | sed -n '1,240p'
else
  echo "src/parse/deconv.py not found. Searching for files named deconv.py..."
  rg -n "deconv.py" || true
fi
echo

echo "Searching the repo for occurrences of MSLevel, heatmap, and relevant_heatmap..."
rg -n "MSLevel" || true
echo
rg -n "relevant_heatmap" || true
echo
rg -n "heatmap" || true
```

Length of output: 506
🏁 Script executed:

```bash
#!/bin/bash
set -euo pipefail

echo "Repository root: $(pwd)"
echo

FILE="src/parse/deconv.py"
echo "Looking for $FILE..."
if [ -f "$FILE" ]; then
  echo "=== $FILE (first 240 lines) ==="
  sed -n '1,240p' "$FILE" || true
else
  echo "$FILE not found. Searching for deconv.py..."
  rg -n "deconv.py" || true
fi
echo

echo "Searching the repo for occurrences of MSLevel, relevant_heatmap, and heatmap..."
rg -n "MSLevel" || true
echo
rg -n "relevant_heatmap" || true
echo
rg -n "heatmap" || true
```

Length of output: 293913
🏁 Script executed:

```bash
#!/bin/bash
set -euo pipefail

FILE="src/parse/masstable.py"
echo "Showing $FILE (first 400 lines):"
if [ -f "$FILE" ]; then
  sed -n '1,400p' "$FILE" || true
else
  echo "$FILE not found"
fi
echo

echo "Searching for getMSSignalDF and MSLevel occurrences:"
rg -n "def getMSSignalDF|getMSSignalDF|MSLevel" || true
```

Length of output: 80766
Guard MSLevel usage in src/parse/deconv.py (add defensive check and skip empty partitions)
The pipeline normally adds MSLevel in src/parse/masstable.py (parseFLASHDeconvOutput / getMSSignalDF), but older caches or external inputs may omit it — add a defensive check and skip empty per-level partitions to avoid KeyError and needless storage.
Files to update
- src/parse/deconv.py — loop that partitions heatmap by MSLevel (around the `for ms_level in [1, 2]:` block).
- (reference) src/parse/masstable.py — MSLevel is set there (parseFLASHDeconvOutput / getMSSignalDF, around lines ~170–180), so the guard is only for backward compatibility.
Suggested change (replace the existing per-level loop with this):
```diff
-    for ms_level in [1, 2]:
-
-        relevant_heatmap = heatmap[heatmap['MSLevel'] == ms_level].drop(columns=['MSLevel'])
-
-        # Store full sized version
-        file_manager.store_data(
-            dataset_id, f'ms{ms_level}_{descriptor}_heatmap', relevant_heatmap
-        )
+    for ms_level in [1, 2]:
+        # Defensive: some inputs/caches may not have MSLevel (backward compatibility)
+        if 'MSLevel' in heatmap.columns:
+            relevant_heatmap = heatmap.loc[heatmap['MSLevel'] == ms_level]
+            if 'MSLevel' in relevant_heatmap.columns:
+                relevant_heatmap = relevant_heatmap.drop(columns=['MSLevel'])
+        else:
+            # Treat all data as MS1; leave MS2 empty
+            relevant_heatmap = heatmap.copy() if ms_level == 1 else heatmap.iloc[0:0].copy()
+
+        # Short-circuit empty partitions to avoid storing/processing them
+        if relevant_heatmap.empty:
+            continue
+
+        # Store full sized version
+        file_manager.store_data(
+            dataset_id, f'ms{ms_level}_{descriptor}_heatmap', relevant_heatmap
+        )
```
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```diff
-for ms_level in [1, 2]:
-
-    relevant_heatmap = heatmap[heatmap['MSLevel'] == ms_level].drop(columns=['MSLevel'])
-
-    # Store full sized version
-    file_manager.store_data(
-        dataset_id, f'ms{ms_level}_{descriptor}_heatmap', relevant_heatmap
-    )
+for ms_level in [1, 2]:
+    # Defensive: some inputs/caches may not have MSLevel (backward compatibility)
+    if 'MSLevel' in heatmap.columns:
+        relevant_heatmap = heatmap.loc[heatmap['MSLevel'] == ms_level]
+        if 'MSLevel' in relevant_heatmap.columns:
+            relevant_heatmap = relevant_heatmap.drop(columns=['MSLevel'])
+    else:
+        # Treat all data as MS1; leave MS2 empty
+        relevant_heatmap = heatmap.copy() if ms_level == 1 else heatmap.iloc[0:0].copy()
+
+    # Short-circuit empty partitions to avoid storing/processing them
+    if relevant_heatmap.empty:
+        continue
+
+    # Store full sized version
+    file_manager.store_data(
+        dataset_id, f'ms{ms_level}_{descriptor}_heatmap', relevant_heatmap
+    )
```
🤖 Prompt for AI Agents
In src/parse/deconv.py around lines 25 to 33, the loop that partitions heatmap
by MSLevel must be made defensive: first check if the heatmap has an 'MSLevel'
column and skip the per-level partitioning entirely if it's missing; inside the
loop, after selecting rows for each ms_level, skip storing when the resulting
relevant_heatmap is empty (continue) to avoid KeyError and needless storage;
ensure you only call file_manager.store_data for non-empty partitions and do not
change behavior otherwise.
```diff
+        elif (component in ['Deconvolved MS1 Heatmap', 'Deconvolved MS2 Heatmap']):
+            selection = 'heatmap_deconv' if '1' in component else 'heatmap_deconv2'
+            if selection in selection_store:
                 data['deconv_heatmap_df'] = render_heatmap(
                     additional_data['deconv_heatmap_df'],
-                    selection_store['heatmap_deconv'],
+                    selection_store[selection],
                     additional_data['dataset'], component
                 )
-        elif (component == 'Raw MS1 Heatmap'):
-            if 'heatmap_raw' in selection_store:
+        elif (component == ['Raw MS1 Heatmap', 'Raw MS2 Heatmap']):
```
🛠️ Refactor suggestion
Make MS-level selection detection robust
Using '1' in component to infer MS1 vs MS2 is brittle. Prefer an explicit check (e.g., component contains 'MS1 ' vs 'MS2 ') or a small helper.
```diff
-        elif (component in ['Deconvolved MS1 Heatmap', 'Deconvolved MS2 Heatmap']):
-            selection = 'heatmap_deconv' if '1' in component else 'heatmap_deconv2'
+        elif component in ['Deconvolved MS1 Heatmap', 'Deconvolved MS2 Heatmap']:
+            selection = 'heatmap_deconv' if 'MS1 ' in component else 'heatmap_deconv2'
             if selection in selection_store:
                 data['deconv_heatmap_df'] = render_heatmap(
                     additional_data['deconv_heatmap_df'],
                     selection_store[selection],
                     additional_data['dataset'], component
                 )
-        elif (component == ['Raw MS1 Heatmap', 'Raw MS2 Heatmap']):
-            selection = 'heatmap_raw' if '1' in component else 'heatmap_raw2'
+        elif component in ['Raw MS1 Heatmap', 'Raw MS2 Heatmap']:
+            selection = 'heatmap_raw' if 'MS1 ' in component else 'heatmap_raw2'
             if selection in selection_store:
                 data['raw_heatmap_df'] = render_heatmap(
                     additional_data['raw_heatmap_df'],
                     selection_store[selection],
                     additional_data['dataset'], component
                 )
```

Optionally, add a guard to avoid KeyError if additional_data keys are missing:
```python
if selection in selection_store and 'deconv_heatmap_df' in additional_data:
    ...
```
🤖 Prompt for AI Agents
In src/render/update.py around lines 113 to 121, the MS-level detection is
brittle (using "'1' in component" and an incorrect "elif (component == ['Raw MS1
Heatmap', 'Raw MS2 Heatmap'])" list comparison) and can raise KeyError when
additional_data keys are missing; replace the fragile check with an explicit
MS-level test (e.g., check if 'MS1' in component or 'MS2' in component or
implement a small helper that returns 'heatmap_deconv' vs 'heatmap_deconv2'),
fix the elif to compare strings/containment correctly, and add guards like
"selection in selection_store and 'deconv_heatmap_df' in additional_data" before
calling render_heatmap; apply the same changes to the analogous block at lines
121-129.
Fix raw heatmap branch: equality to list is always False
The condition uses equality against a list, so the branch never executes. This breaks Raw MS1/MS2 heatmap updates.
```diff
-        elif (component == ['Raw MS1 Heatmap', 'Raw MS2 Heatmap']):
+        elif (component in ['Raw MS1 Heatmap', 'Raw MS2 Heatmap']):
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```diff
 elif (component in ['Deconvolved MS1 Heatmap', 'Deconvolved MS2 Heatmap']):
     selection = 'heatmap_deconv' if '1' in component else 'heatmap_deconv2'
     if selection in selection_store:
         data['deconv_heatmap_df'] = render_heatmap(
             additional_data['deconv_heatmap_df'],
             selection_store[selection],
             additional_data['dataset'], component
         )
-elif (component == ['Raw MS1 Heatmap', 'Raw MS2 Heatmap']):
+elif (component in ['Raw MS1 Heatmap', 'Raw MS2 Heatmap']):
```
🤖 Prompt for AI Agents
In src/render/update.py around lines 113 to 121, the Raw heatmap branch uses
equality against a list (component == ['Raw MS1 Heatmap', 'Raw MS2 Heatmap']) so
it never executes; change the condition to check membership (e.g. if component
in ['Raw MS1 Heatmap', 'Raw MS2 Heatmap']) and mirror the deconvolved branch's
selection logic by setting selection = 'heatmap' if '1' in component else
'heatmap2', then if selection in selection_store call render_heatmap with the
appropriate raw heatmap data key (e.g. additional_data['raw_heatmap_df']) and
assign the result into data['raw_heatmap_df'] (ensuring you pass
selection_store[selection], additional_data['dataset'], and component in the
same argument order).