Refactor PTM visualization by tonywu1999 · Pull Request #69 · Vitek-Lab/MSstatsBioNet

tonywu1999 · 2026-02-25T15:55:49Z

Motivation and Context

The PR refactors PTM (post-translational modification) visualization in the network visualization module to represent PTM sites as discrete visual elements in Cytoscape graphs. The goal is clearer depiction of PTM sites as child nodes attached to protein parents (with invisible compound containers), plus edge-level PTM overlap tooltips, and safe embedding of data into generated JavaScript.

Detailed Changes

Node generation / structure
- createNodeElements now:
  - pre-computes which protein ids have PTM Site rows, so each compound container is emitted only once
  - emits an invisible compound container node for each protein that has PTM Site entries (id: protein id + "__compound__")
  - emits each protein node once, assigned to its compound via the parent field when a compound exists
  - emits one child node per PTM site (id: protein id + "__ptm__" + site) with parent_protein and parent (compound) fields and node_type = 'ptm'
  - emits one PTM attachment edge per site (id: protein id + "__ptm_edge__" + site) with edge_type = 'ptm_attachment' and category = 'ptm_attachment'
PTM overlap detection and edge consolidation
- Added .calculatePTMOverlapAggregated(edges, nodes) to aggregate overlapping PTM site names per consolidated edge_key (source-target-interaction)
- consolidateEdges(edges, nodes = NULL) now accepts nodes, invokes PTM overlap aggregation when nodes provided, and propagates ptm_overlap text into consolidated edge rows
Edge generation and styling
- createEdgeElements now incorporates consolidated ptm_overlap into an escaped tooltip field and includes styling via getEdgeStyle
- Edge payloads include edge_type, category, tooltip, color, line_style, arrow_shape, width
Cytoscape / JS safety and runtime behavior
- Added escape_js_string() helper to escape backslashes, single quotes, CR/LF for safe embedding into single-quoted JavaScript literals
- JavaScript generation (via the generateCytoscapeConfig path) now expects the new node/edge payload fields (parent, parent_protein, compound ids, ptm_overlap tooltip). The diff summary also references a PTM repositioning routine that runs after layoutstop and arranges PTM child nodes in a bottom arc around their parent node.
Miscellaneous
- getRelationshipProperties extended to keep PTM-relevant configs untouched; getEdgeStyle reused for new edge_type values
- createNodeElements and createEdgeElements ensure escaping of strings when embedding into JS element literals
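
The escaping rule described for escape_js_string() can be illustrated with a small sketch. It is written in JavaScript rather than R so it can be run next to the generated output, and it is not the package's implementation. The names escapeForSingleQuotedLiteral and nodeLiteral are hypothetical, and the handling of null/undefined is an assumption. The only behavior taken from the description is that backslashes, single quotes, and CR/LF are escaped.

```javascript
// Hypothetical mirror of the escaping described for escape_js_string():
// make a value safe to place between single quotes in generated JS source.
function escapeForSingleQuotedLiteral(value) {
  // Treating null/undefined as an empty string is an assumption, not confirmed by the PR.
  if (value === null || value === undefined) return '';
  return String(value)
    .replace(/\\/g, '\\\\') // backslashes first, so later escapes are not re-escaped
    .replace(/'/g, "\\'")   // a bare single quote would terminate the literal
    .replace(/\r/g, '\\r')  // carriage return
    .replace(/\n/g, '\\n'); // line feed
}

// Build an element literal by string concatenation, as the R code does.
function nodeLiteral(id, label) {
  return "{ data: { id: '" + escapeForSingleQuotedLiteral(id) +
         "', label: '" + escapeForSingleQuotedLiteral(label) + "' } }";
}
```

Escaping backslashes before quotes matters: in the reverse order, the backslash added in front of each quote would itself be doubled and the quote would end the literal again.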

Unit Tests Added or Modified

No new tests targeting PTM-specific functionality were added.
Existing tests (tests/testthat/test-visualizeNetworksWithHTML.R) cover:
- mapLogFCToColor, getRelationshipProperties, consolidateEdges (general consolidation behavior), getEdgeStyle, createNodeElements (basic node emission without PTM coverage), createEdgeElements (basic edge emission/assert fields), generateCytoscapeConfig, and style/layout conversion helpers.
Missing tests (not present in current test suite):
- Compound node creation and correct parent assignment for protein nodes
- Emission of PTM child nodes and PTM attachment edges (IDs, parent_protein, node_type, category)
- .calculatePTMOverlapAggregated aggregation correctness for various delimiter formats and multi-row node Site values
- Propagation of ptm_overlap into consolidated edges and escaping in tooltip text
- escape_js_string correctness across backslashes, single quotes, CR/LF, and empty/null inputs
- Post-layout JavaScript PTM repositioning logic and resulting coordinates / non-overlap behavior

Coding Guideline Violations / Risks

Test coverage gap (significant): Complex PTM rendering logic (hierarchical compound nodes, PTM child emission, PTM-attachment edges, overlap aggregation, and JS escape/tooltip behavior) lacks direct unit tests. Codecov indicates patch coverage ~74.66% with 37 changed lines untested and identifies R/visualizeNetworksWithHTML.R as missing coverage lines.
Maintainability risk: Non-trivial layout/reposition algorithm (JS added to run on layoutstop) and tooltip wiring are not validated by automated tests, increasing risk of regressions.
Documentation gap (minor): Internal aggregation and PTM-edge semantics would benefit from clearer docstrings or comments describing expected Site delimiters and edge_key matching assumptions.

coderabbitai · 2026-02-25T15:56:05Z

📝 Walkthrough

Walkthrough

Adds PTM-aware rendering to the HTML network exporter: emits compound containers, protein and PTM child nodes, ptm_attachment edges with tooltip data, PTM-specific Cytoscape styles, and JS that repositions PTM nodes around their parent after layout.

Changes

Cohort / File(s)	Summary
Core PTM generation & payloads `R/visualizeNetworksWithHTML.R`	Pre-computes PTM sites, emits invisible compound containers, per-protein nodes (with optional parent compound linkage), PTM child nodes and ptm_attachment edges; includes parent_protein and compound_id in node data and ptm_overlap in edge data.
Cytoscape styles & config `R/visualizeNetworksWithHTML.R`	Adds node_type = 'ptm' styling, invisible compound styling, and a new 'ptm_attachment' edge style (dotted); expands generateCytoscapeConfig payloads to include PTM-related fields.
JavaScript generation & utilities `R/visualizeNetworksWithHTML.R`	Introduces escape_js_string helper, embeds PTM tooltip text safely, emits JS to reposition PTM nodes post-layout in a bottom-arc distribution around parent proteins.
Layout/positioning logic `R/visualizeNetworksWithHTML.R`	Post-layout reposition routine executed after layoutstop to distribute multiple PTMs around parent protein; computes angles and updates positions via Cytoscape JS.

Sequence Diagram

sequenceDiagram
    participant R as Data Processor (R)
    participant Config as Cytoscape Config (JSON)
    participant Browser as Render Engine (JS)
    participant Layout as Layout Engine (Cytoscape)

    R->>Config: Emit protein nodes, compound containers, PTM nodes, edges (with ptm metadata)
    Config->>Browser: Load nodes/edges and style rules
    Browser->>Layout: Run Cytoscape layout
    Layout->>Browser: Fire layoutstop event
    Browser->>Browser: Compute arc positions for PTMs around parent protein
    Browser->>Layout: Update PTM node positions (position override)
    Browser->>Browser: Render final network with positioned PTMs
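
The "compute arc positions" step in the diagram can be sketched as a pure function. This is a minimal illustration, not the routine from the PR: the name bottomArcPositions, the radius, and the spread angle are all assumptions, since the conversation does not state the values used. It also assumes Cytoscape model coordinates, where y grows downward, so an angle of PI/2 points straight down from the parent.

```javascript
// Hypothetical helper: positions for `count` PTM nodes on an arc below a parent.
// With y growing downward, an angle of PI/2 points straight down.
function bottomArcPositions(center, count, radius, spread) {
  var positions = [];
  for (var i = 0; i < count; i++) {
    // A single node sits directly below; several are spaced evenly over `spread`.
    var t = count === 1 ? 0.5 : i / (count - 1);
    var angle = Math.PI / 2 - spread / 2 + t * spread;
    positions.push({
      x: center.x + radius * Math.cos(angle),
      y: center.y + radius * Math.sin(angle)
    });
  }
  return positions;
}
```

In a page that uses Cytoscape, positions like these could be applied inside a layoutstop handler by calling position({ x: ..., y: ... }) on each PTM node, using the parent node's position as center.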

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

feature(export-png): Add functionality to export networks as high quality PNGs #62: Overlapping changes to PTM tooltip generation, JS escaping logic, and embedding ptm_overlap metadata.
Feature ptm analysis #56: Related PTM-aware node/edge handling and propagation of PTM metadata into Cytoscape output.
Feature html viz #53: Changes to generateCytoscapeConfig that also add PTM-aware nodes, edges, and JS integration.

Suggested labels

Review effort 3/5

Poem

🐇 In code I hop where PTMs hide,

I tuck them round their protein side,
Invisible tubs and dotted ties,
Arcs of sparkle beneath the skies,
A tiny network, neatly styled!

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Description check	⚠️ Warning	The description is entirely missing—only an empty template was provided with no content filled in for Motivation, Changes, Testing, or checklist items.	Provide a complete description covering the motivation for PTM visualization refactoring, detailed list of changes made, testing performed, and mark the pre-review checklist items as complete.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title accurately summarizes the main change: refactoring of PTM visualization logic within the R file, though it lacks specific detail about the scope of improvements.
Docstring Coverage	✅ Passed	No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

github-actions · 2026-02-25T15:56:53Z

Failed to generate code suggestions for PR

codecov-commenter · 2026-02-25T15:58:02Z

Codecov Report

❌ Patch coverage is 72.54902% with 42 lines in your changes missing coverage. Please review.
✅ Project coverage is 59.91%. Comparing base (7ff5f09) to head (c5d4a50).

Files with missing lines	Patch %	Lines
R/visualizeNetworksWithHTML.R	72.54%	42 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##            devel      #69      +/-   ##
==========================================
+ Coverage   58.76%   59.91%   +1.14%     
==========================================
  Files           7        7              
  Lines        1414     1559     +145     
==========================================
+ Hits          831      934     +103     
- Misses        583      625      +42

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@R/visualizeNetworksWithHTML.R`:
- Around line 311-315: Duplicate PTM node/edge IDs are being produced when
constructing IDs from protein + site (variables node_elements, ptm_elements,
emitted_proteins, emitted_compounds); before appending any PTM-related node or
edge, compute the PTM id (e.g., paste protein and site) and check it against a
new deduplication set (e.g., emitted_ptm_ids) and only append if not already
present, adding the id to emitted_ptm_ids after append; apply the same guard
where PTM edges are created (the code paths that push to ptm_elements and
node_elements and where emitted_proteins/emitted_compounds are updated) to
eliminate duplicates across rows and within rows.

ℹ️ Review info

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7ff5f09 and 4ff6f5d.

📒 Files selected for processing (1)

R/visualizeNetworksWithHTML.R

coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@R/visualizeNetworksWithHTML.R`:
- Around line 311-415: The review asks for regression tests covering the new PTM
emission and layout branches in visualizeNetworksWithHTML.R: add testthat cases
that exercise has_ptm_sites / needs_compound logic and deduplication by
emitted_compounds, emitted_ptm_nodes, and emitted_ptm_edges; specifically create
tests for (1) repeated identical sites within one row, (2) identical sites
across multiple rows for the same protein id, (3) rows with multiple distinct
sites, and (4) multiple PTM siblings to verify compound parent assignment and
unique PTM node/edge ids; call the function (the wrapper that returns
node_elements/ptm_elements) with crafted nodes data frames and assert the
returned vector contains the expected compound node id
(paste0(id,'__compound__')), unique ptm node ids (paste0(id,'__ptm__',site)),
and single attachment edges per site, and add these tests to testthat suite so
lines covered in has_ptm_sites, the for-loop PTM emission, and dedupe branches
are exercised.
- Around line 707-709: The selector construction using string interpolation of
parentId is unsafe; instead select candidate nodes and filter by their data
value to avoid selector-special-character issues: replace the
cy.nodes('[parent_protein = "' + parentId + '"]') usage with a safer approach
that first grabs nodes (e.g., cy.nodes() or cy.nodes('[parent_protein]')) and
then .filter(...) comparing node.data('parent_protein') === parentId to produce
siblings, keeping the subsequent idx = siblings.indexOf(ptmNode) and total =
siblings.length logic unchanged.

ℹ️ Review info

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4ff6f5d and c5d4a50.

📒 Files selected for processing (1)

R/visualizeNetworksWithHTML.R

coderabbitai · 2026-02-25T21:54:47Z

+    node_elements    <- c()
+    ptm_elements     <- c()
+    emitted_proteins <- c()
+    emitted_compounds <- c()
+    emitted_ptm_nodes <- c()
+    emitted_ptm_edges <- c()
+
+    # Pre-compute which protein ids have at least one PTM site row,
+    # so we know upfront whether a compound wrapper is needed
+    has_ptm_sites <- if ("Site" %in% names(nodes)) {
+        ids_with_sites <- unique(nodes$id[!is.na(nodes$Site) & trimws(nodes$Site) != ""])
+        ids_with_sites
+    } else {
+        c()
+    }
+
+    for (i in seq_len(nrow(nodes))) {
+        row      <- nodes[i, ]
+        color    <- node_colors[i]
+        has_site <- "Site" %in% names(nodes) && !is.na(row$Site) && trimws(row$Site) != ""
+
+        display_label <- if (label_column == "hgncName" && !is.na(row$hgncName) && row$hgncName != "") {
+            row$hgncName
        } else {
-            row['id']
+            row$id
        }

-        paste0("{ data: { id: '", row['id'], "', label: '", display_label, "', color: '", row['color'], "' } }")
-    })
+        needs_compound <- row$id %in% has_ptm_sites
+        compound_id    <- paste0(row$id, "__compound__")
+
+        # Emit invisible compound container node once per protein that has PTM children
+        if (needs_compound && !(compound_id %in% emitted_compounds)) {
+            node_elements <- c(node_elements,
+                               paste0("{ data: { id: '", escape_js_string(compound_id),
+                                      "', node_type: 'compound' } }")
+            )
+            emitted_compounds <- c(emitted_compounds, compound_id)
+        }
+
+        # Emit protein node once, assigning it to the compound if one exists
+        if (!(row$id %in% emitted_proteins)) {
+            parent_field <- if (needs_compound) {
+                paste0(", parent: '", escape_js_string(compound_id), "'")
+            } else {
+                ""
+            }
+            node_elements <- c(node_elements,
+                               paste0("{ data: { id: '", escape_js_string(row$id),
+                                      "', label: '", escape_js_string(display_label),
+                                      "', color: '", color,
+                                      "', node_type: 'protein'",
+                                      parent_field,
+                                      " } }")
+            )
+            emitted_proteins <- c(emitted_proteins, row$id)
+        }
+
+        # Emit one PTM child node + attachment edge per individual site
+        if (has_site) {
+            sites <- trimws(unlist(strsplit(as.character(row$Site), "[_,;|]")))
+            sites <- unique(sites[sites != ""])
+
+            for (site in sites) {
+                ptm_node_id <- paste0(row$id, "__ptm__", site)
+                safe_ptm_id <- escape_js_string(ptm_node_id)
+                safe_parent <- escape_js_string(row$id)
+                safe_site   <- escape_js_string(site)
+
+                # PTM node also belongs to the same compound container
+                if (!(ptm_node_id %in% emitted_ptm_nodes)) {
+                    ptm_elements <- c(ptm_elements,
+                                      paste0("{ data: { id: '", safe_ptm_id,
+                                             "', label: '", safe_site,
+                                             "', color: '", color,
+                                             "', parent_protein: '", safe_parent,
+                                             "', parent: '", escape_js_string(compound_id), "'",
+                                             ", node_type: 'ptm' } }")
+                    )
+                    emitted_ptm_nodes <- c(emitted_ptm_nodes, ptm_node_id)
+                }
+
+                ptm_edge_id_raw <- paste0(row$id, "__ptm_edge__", site)
+                if (!(ptm_edge_id_raw %in% emitted_ptm_edges)) {
+                    ptm_edge_id <- escape_js_string(ptm_edge_id_raw)
+                    ptm_elements <- c(ptm_elements,
+                                      paste0("{ data: { id: '", ptm_edge_id,
+                                             "', source: '", safe_parent,
+                                             "', target: '", safe_ptm_id,
+                                             "', edge_type: 'ptm_attachment',",
+                                             " category: 'ptm_attachment',",
+                                             " interaction: '',",
+                                             " color: '", color, "',",
+                                             " line_style: 'dotted',",
+                                             " arrow_shape: 'none',",
+                                             " width: 1.5,",
+                                             " tooltip: '' } }")
+                    )
+                    emitted_ptm_edges <- c(emitted_ptm_edges, ptm_edge_id_raw)
+                }
+            }
+        }
+    }
+
+    return(c(node_elements, ptm_elements))
 }


⚠️ Potential issue | 🟠 Major

Please add regression tests for the new PTM emission/layout paths.

This PR adds non-trivial branching here (compound wrapping, per-site dedup, and post-layout PTM positioning), and this file still has uncovered changed lines in the patch report. Add targeted tests for repeated sites (within/across rows), multi-site rows, and multi-PTM sibling placement.

I can draft testthat cases for these scenarios if you want.

Also applies to: 693-728
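
The element ids such tests would assert can be written down as a small sketch. It uses JavaScript only to state the expected ids. Real tests would be testthat cases calling the R function. The helper name expectedPtmElements is hypothetical, null/undefined stand in for NA, and the splitting rule (delimiters _ , ; |, trim, drop empties, de-duplicate across rows) is taken from the diff above.

```javascript
// Hypothetical mirror of the site handling in the diff, used to derive expected ids.
function expectedPtmElements(rows) {
  var compounds = [], ptmNodes = [], ptmEdges = [];
  function addOnce(list, id) { if (list.indexOf(id) === -1) list.push(id); }

  rows.forEach(function (row) {
    // Rows with a missing or blank Site produce no compound and no PTM elements.
    if (row.Site === null || row.Site === undefined) return;
    var raw = String(row.Site).trim();
    if (raw === '') return;
    addOnce(compounds, row.id + '__compound__');

    raw.split(/[_,;|]/)
      .map(function (s) { return s.trim(); })
      .filter(function (s) { return s !== ''; })
      .forEach(function (site) {
        addOnce(ptmNodes, row.id + '__ptm__' + site);
        addOnce(ptmEdges, row.id + '__ptm_edge__' + site);
      });
  });
  return { compounds: compounds, ptmNodes: ptmNodes, ptmEdges: ptmEdges };
}

var expected = expectedPtmElements([
  { id: 'P1', Site: 'S15_S15' },  // repeated site within one row
  { id: 'P1', Site: 'S15; T20' }, // same site in another row, plus a new one
  { id: 'P2', Site: '' }          // no sites, so no compound container
]);
```

For this input, P1 gets one compound container, two PTM nodes, and two attachment edges, while P2 gets none. That matches scenarios (1) to (3) from the review.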

coderabbitai · 2026-02-25T21:54:47Z

+            var siblings = cy.nodes('[parent_protein = \"' + parentId + '\"]');
+            var idx      = siblings.indexOf(ptmNode);
+            var total    = siblings.length;


⚠️ Potential issue | 🟡 Minor

Avoid interpolating raw parentId into the selector string.

If an ID contains selector-significant characters, sibling lookup can fail or mis-select. Prefer filtering by data value instead of string-building selectors.

Proposed fix

- var siblings = cy.nodes('[parent_protein = \"' + parentId + '\"]');
+ var siblings = ptmNodes.filter(function(n) {
+     return n.data('parent_protein') === parentId;
+ });
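
A runnable sketch of why filtering by data value is more robust than building a selector string. It uses stand-in node objects and plain arrays so that it runs without Cytoscape. fakeNode, siblingsOf, and the sample ids are hypothetical. In the real page, ptmNodes would be a Cytoscape collection, which also supports filter with a function and indexOf.

```javascript
// Stand-in for a Cytoscape node: only the data(key) accessor is modelled.
function fakeNode(data) {
  return { data: function (key) { return data[key]; } };
}

// Select PTM siblings by comparing data values, never by building a selector string.
function siblingsOf(ptmNodes, parentId) {
  return ptmNodes.filter(function (n) {
    return n.data('parent_protein') === parentId;
  });
}

// An id that would corrupt '[parent_protein = "' + parentId + '"]'.
var trickyId = 'P1"] , [id = "x';
var ptmNodes = [
  fakeNode({ id: trickyId + '__ptm__S15', parent_protein: trickyId }),
  fakeNode({ id: trickyId + '__ptm__T20', parent_protein: trickyId }),
  fakeNode({ id: 'P2__ptm__S3', parent_protein: 'P2' })
];

var siblings = siblingsOf(ptmNodes, trickyId);
var idx = siblings.indexOf(ptmNodes[1]); // index of this PTM among its siblings
var total = siblings.length;             // sibling count for the parent
```

Because the comparison is a strict equality on the stored value, quotes and brackets in an id have no special meaning, and the idx and total logic that follows stays unchanged.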

tonywu1999 added 3 commits February 25, 2026 08:35

first attempt for PTM site as a node

a2f6eac

change PTM site color

b4b84d6

place child PTM nodes adjacent to parent nodes and add compound nodes

4ff6f5d

coderabbitai Bot reviewed Feb 25, 2026

coderabbit comment

c5d4a50

coderabbitai Bot reviewed Feb 25, 2026

tonywu1999 merged commit 0263a37 into devel Feb 25, 2026
4 checks passed

tonywu1999 deleted the refactor-ptm branch February 25, 2026 21:58

This was referenced Feb 26, 2026

Refactor viz #72

Merged

refactor(visualizeNetworks): Refactor exportNetworktoHTML and previewNetwork to use cytoscapeNetwork function #73

Merged

coderabbitai Bot mentioned this pull request Apr 16, 2026

Feat delete #96

Merged
