fix: emit CALLS edges for module-scope code (closes #284) by michael-denyer · Pull Request #285 · tirth8205/code-review-graph

michael-denyer · 2026-04-15T00:36:49Z

Summary

Closes #284.

_extract_calls and 4 sibling helpers gated CALLS edge emission on enclosing_func being set, so module-scope calls (top-level script glue, CLI entrypoints, if __name__ == '__main__' blocks, Jupyter/Databricks notebook cells) produced zero CALLS edges. Any function invoked only from those contexts was flagged as dead by find_dead_code.

Notebooks were hit hardest: PR #69 added node + IMPORTS_FROM extraction, but every cell is module-scope by definition, so notebooks emitted no CALLS edges at all — making the dead-code detector's notebook coverage vacuous.

What changed

Parser (5 emission sites): when enclosing_func is None, attribute the CALLS edge to the File node instead of dropping it. Matches the existing convention used by _extract_value_references and CONTAINS edges.

| Site | Language(s) |
| --- | --- |
| `_extract_calls` (the main path) | Python, JS, TS, generic |
| Elixir call path | Elixir |
| JSX component invocation | TSX/JSX |
| Solidity `emit` | Solidity |
| R call path | R |

Downstream fix in detect_entry_points: without filtering, a script's module-scope calls would attribute to the script's own File node, making script-only callees look "called by the script" and hiding them from flow analysis. Added get_all_call_targets(include_file_sources=False) so detect_entry_points excludes File-sourced CALLS. Implementation joins against nodes.kind = 'File' rather than pattern-matching source_qualified so future changes to File-node naming can't silently miscategorize edges.
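
As a rough illustration of the join-based filter, here is a sketch against an in-memory SQLite database. The table layout (`nodes(qualified_name, kind)`, `edges(kind, source_qualified, target_qualified)`), the `target_qualified` column name, the default value of `include_file_sources`, and the free-function signature taking a connection are all assumptions made for the example; only the parameter name, `source_qualified`, and the `nodes.kind = 'File'` join come from the description above.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create a tiny graph: script.py calls run_job at module scope; run_job calls helper."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE nodes (qualified_name TEXT PRIMARY KEY, kind TEXT NOT NULL);
        CREATE TABLE edges (
            kind TEXT NOT NULL,
            source_qualified TEXT NOT NULL,
            target_qualified TEXT NOT NULL
        );
        """
    )
    conn.executemany(
        "INSERT INTO nodes VALUES (?, ?)",
        [
            ("script.py", "File"),
            ("script.py::run_job", "Function"),
            ("script.py::helper", "Function"),
        ],
    )
    conn.executemany(
        "INSERT INTO edges VALUES (?, ?, ?)",
        [
            ("CALLS", "script.py", "script.py::run_job"),           # module-scope call
            ("CALLS", "script.py::run_job", "script.py::helper"),   # function-scope call
        ],
    )
    return conn


def get_all_call_targets(
    conn: sqlite3.Connection, include_file_sources: bool = True
) -> set[str]:
    """Return the targets of CALLS edges, optionally ignoring File-sourced edges."""
    query = """
        SELECT DISTINCT e.target_qualified
        FROM edges AS e
        LEFT JOIN nodes AS n ON n.qualified_name = e.source_qualified
        WHERE e.kind = 'CALLS'
    """
    if not include_file_sources:
        # Decide by the source node's kind, not by the shape of its name.
        query += " AND COALESCE(n.kind, '') != 'File'"
    return {row[0] for row in conn.execute(query)}


conn = build_demo_db()
all_targets = get_all_call_targets(conn)
non_file_targets = get_all_call_targets(conn, include_file_sources=False)
print(sorted(all_targets))       # → ['script.py::helper', 'script.py::run_job']
print(sorted(non_file_targets))  # → ['script.py::helper']
```

With File-sourced edges excluded, `run_job` has no remaining callers and so can still be treated as an entry point, while the unfiltered view keeps it out of the dead-code report.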

End-to-end verification

Real-world repro: a Databricks notebook (production inference pipeline) calling Predict.extract_data_from_sample_ids().

Before:

>>> CodeParser().parse_file(Path('ML_wpredict_apply_v1.0.ipynb'))
nodes: 1, edges: 3 (all IMPORTS_FROM, zero CALLS)
>>> find_dead_code(...) → flags extract_data_from_sample_ids, extract_data_from_files

After:

nodes: 1, edges: 17 (3 IMPORTS_FROM, 14 CALLS)
>>> find_dead_code(...) → no longer flags either method

Tests

5 new tests, all passing:

test_parser.py::test_module_scope_calls_attributed_to_file — bare .py script
test_parser.py::test_module_scope_calls_in_notebook — .ipynb file
test_flows.py::test_detect_entry_points_module_scope_caller_is_still_root — flow analysis treats File-sourced CALLS correctly
test_refactor.py::test_module_scope_caller_prevents_dead_code_flag — end-to-end parse → store → find_dead_code
test_refactor.py::test_if_main_block_caller_prevents_dead_code_flag — same for __main__ block

Full impacted suite: 318 tests green (316 passed, 2 xpassed), 0 failures (parser, refactor, flows, multilang, notebook).

$ uv run pytest tests/test_parser.py tests/test_refactor.py tests/test_notebook.py tests/test_multilang.py tests/test_flows.py
================== 316 passed, 2 xpassed, 1 warning ===================

Test plan

Parser emits CALLS edges from module-scope code (Python .py)
Parser emits CALLS edges from notebook cells (.ipynb)
detect_entry_points excludes File-sourced CALLS so script-only callees remain roots
find_dead_code does not flag module-scope-called functions
find_dead_code does not flag __main__-block-called functions
No regressions in existing parser/refactor/flows/multilang/notebook tests

Not addressed (scope kept tight)

Reviewed all dead-code-related PRs/issues (#104, #108, #154, #158, #160, #247, #249) — none address module-scope CALLS emission. The other 4 helper sites (Elixir, JSX, Solidity, R) had the same gating shape so are fixed in the same PR for consistency, even though the original repro only needed the Python path.

The parser gated CALLS edge emission on `enclosing_func` being set, so calls made from module scope (top-level script glue, CLI entrypoints, `if __name__ == "__main__"` blocks, and Jupyter/Databricks notebook cells) produced zero CALLS edges. Any function invoked only from those contexts was flagged as dead by `find_dead_code`, even when the function was the entire reason the script existed.

Notebooks are particularly affected because every cell is module-scope by definition, so the existing notebook parser (PR tirth8205#69) emitted nodes and IMPORTS_FROM edges but no CALLS edges, making the dead-code detector's notebook coverage vacuous.

Fix: when `enclosing_func` is None, attribute the CALLS edge to the File node instead of dropping it. Matches the existing convention used by `_extract_value_references` and CONTAINS edges. Applied to all 5 gated emission sites: generic Python/JS/TS path, JSX components, Elixir, Solidity `emit`, and R.

Downstream: `detect_entry_points` now filters File-sourced CALLS via `get_all_call_targets(include_file_sources=False)` so script-only callees remain detectable as entry points (otherwise `run_job()` called from `script.py` module scope would look "called" by `script.py` and disappear from flow analysis).

Verified end-to-end against a Databricks `.ipynb` that calls `Predict.extract_data_from_sample_ids()` from cell-level code: edge count went from 0 to 14 CALLS edges, and `find_dead_code` no longer flags the method.

Tests:
- `test_module_scope_calls_attributed_to_file`: bare `.py` script
- `test_module_scope_calls_in_notebook`: `.ipynb` file
- `test_detect_entry_points_module_scope_caller_is_still_root`: flow analysis treats File-sourced CALLS correctly
- `test_module_scope_caller_prevents_dead_code_flag`: end-to-end parse → store → find_dead_code
- `test_if_main_block_caller_prevents_dead_code_flag`: same for `__main__` block

michael-denyer wants to merge 1 commit into tirth8205:main from michael-denyer:fix/module-scope-calls
