feat: louvain community detection for module boundary analysis #133
carlos-alm merged 1 commit into main
Add `codegraph communities` command that runs Louvain clustering on the dependency graph, compares discovered communities against directory structure, and surfaces architectural drift (split/merge candidates, drift score). Supports file-level (default) and function-level modes, configurable resolution, and drift-only output. Integrated into stats, MCP, and programmatic API.
Impact: 9 functions changed, 8 affected
Claude finished @carlos-alm's task

PR Review: Community Detection Feature
Comprehensive review complete using the strict codegraph framework.

Phase 1: Root Cause Analysis ✅
Problem Being Solved: This PR addresses the need to detect natural module boundaries in codebases by identifying tightly coupled groups of files/functions that may not align with the current directory structure. This helps identify architectural drift and suggests refactoring opportunities.
Approach: Uses the Louvain community detection algorithm on the dependency graph to discover natural clusters, then compares these communities against directory structure to identify split/merge candidates and calculate a drift score.
Risk Assessment: LOW - This is a genuine analysis feature that adds value without modifying core parsing or graph building logic. No symptoms are being hidden.

Phase 2: Configuration Change Detection ✅
Result: PASS - No high-risk configuration changes detected:

Phase 3: Test Integrity ✅
Result: PASS - Strong test coverage with no red flags:

Phase 4: Code Quality and Security Review ✅
Architecture & Integration: EXCELLENT
Security: SECURE
Code Quality: HIGH
Performance: EFFICIENT
Minor Observations:

Root Cause Analysis
Problem Being Solved: Identifying architectural drift between logical code dependencies and directory structure to guide refactoring decisions.
Final Recommendation
Greptile Summary
This PR adds Louvain community detection to analyze module boundaries and architectural drift. The implementation is well-structured with comprehensive test coverage (13 tests, 710 total passing).
Key strengths:
Technical details:
The code follows project conventions, has no security issues, and integrates cleanly into CLI, MCP, and programmatic APIs.
Confidence Score: 5/5
Important Files Changed
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
CLI[CLI: codegraph communities] --> CommData[communitiesData]
Stats[CLI: codegraph stats] --> CommSummary[communitySummaryForStats]
MCP[MCP Server] --> CommData
API[Programmatic API] --> CommData
CommData --> BuildGraph[buildGraphologyGraph]
CommSummary --> CommData
BuildGraph --> DB[(SQLite DB)]
BuildGraph --> FileLevel{Mode?}
FileLevel -->|file-level| FileNodes[nodes: files<br/>edges: imports]
FileLevel -->|function-level| FnNodes[nodes: functions<br/>edges: calls]
FileNodes --> GraphObj[Graphology Graph]
FnNodes --> GraphObj
GraphObj --> Louvain[Louvain Algorithm]
Louvain --> Communities[Community Assignments]
Communities --> DirAnalysis[Directory Analysis]
DirAnalysis --> Split[Split Candidates]
DirAnalysis --> Merge[Merge Candidates]
DirAnalysis --> DriftScore[Drift Score 0-100]
Communities --> Output[JSON Output]
Split --> Output
Merge --> Output
DriftScore --> Output
Last reviewed commit: cc28daa
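
The directory-analysis stage of the flowchart above (community assignments in, split candidates, merge candidates, and a 0-100 drift score out) can be illustrated with a minimal sketch. The function name `analyzeDrift`, the input shape, and in particular the scoring formula are assumptions made for illustration; the PR does not spell out how `src/communities.js` computes these values.

```javascript
// Hypothetical sketch of the directory-analysis step. The real module may
// use different data shapes and a different drift formula.

// Directory part of a POSIX-style path ('.' when there is no slash).
function dirOf(file) {
  const idx = file.lastIndexOf('/');
  return idx === -1 ? '.' : file.slice(0, idx);
}

// assignments: { [filePath]: communityId }, e.g. the output of a Louvain run.
function analyzeDrift(assignments) {
  const dirToCommunities = new Map(); // directory -> Set of community ids
  const communityToDirs = new Map(); // community id -> Set of directories

  for (const [file, community] of Object.entries(assignments)) {
    const dir = dirOf(file);
    if (!dirToCommunities.has(dir)) dirToCommunities.set(dir, new Set());
    if (!communityToDirs.has(community)) communityToDirs.set(community, new Set());
    dirToCommunities.get(dir).add(community);
    communityToDirs.get(community).add(dir);
  }

  // A directory whose files land in several communities may be worth splitting.
  const splitCandidates = [...dirToCommunities]
    .filter(([, communities]) => communities.size > 1)
    .map(([dir, communities]) => ({
      dir,
      communities: [...communities].sort((a, b) => a - b),
    }));

  // A community spanning several directories suggests they belong together.
  const mergeCandidates = [...communityToDirs]
    .filter(([, dirs]) => dirs.size > 1)
    .map(([community, dirs]) => ({ community, dirs: [...dirs].sort() }));

  // Assumed scoring: mean of the two misalignment ratios, scaled to 0-100.
  const splitRatio = dirToCommunities.size
    ? splitCandidates.length / dirToCommunities.size
    : 0;
  const mergeRatio = communityToDirs.size
    ? mergeCandidates.length / communityToDirs.size
    : 0;
  const driftScore = Math.round(((splitRatio + mergeRatio) / 2) * 100);

  return { splitCandidates, mergeCandidates, driftScore };
}

// Usage with made-up file paths and community ids.
const report = analyzeDrift({
  'src/parser/lex.js': 0,
  'src/parser/ast.js': 0,
  'src/db/query.js': 1,
  'src/db/cache.js': 2,
  'src/cache/store.js': 2,
});
console.log(JSON.stringify(report, null, 2));
```

In this made-up input, `src/db` is flagged as a split candidate because its files fall into two communities, and community 2 is flagged as a merge candidate because it spans `src/db` and `src/cache`.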
Update README, CLAUDE.md, BACKLOG, titan-paradigm, recommended-practices, and CLI/MCP examples to reflect today's merged PRs: complexity metrics (#130/#139), Louvain community detection (#133/#134), and manifesto rule engine (#138). Updates MCP tool count from 21 to 24 (25 in multi-repo), marks backlog items 6/11/21/22 as done, and adds real CLI output examples.
* fix: strict type validation for threshold values in complexity queries
Replace loose `!= null` checks with `typeof === 'number' && Number.isFinite()`
to prevent `Number("")`, `Number(null)`, and `Number(true)` from silently
coercing into valid SQL values. Add integration test verifying exceeds
arrays and summary.aboveWarn are correctly computed.
Addresses Greptile review feedback on #136.
Impact: 2 functions changed, 3 affected
* docs: add complexity, communities, and manifesto to all docs
Update README, CLAUDE.md, BACKLOG, titan-paradigm, recommended-practices,
and CLI/MCP examples to reflect today's merged PRs: complexity metrics
(#130/#139), Louvain community detection (#133/#134), and manifesto rule
engine (#138). Updates MCP tool count from 21 to 24 (25 in multi-repo),
marks backlog items 6/11/21/22 as done, and adds real CLI output examples.
* fix: remove redundant condition in paginate guard clauses
When limit === undefined, limit !== 0 is always true — the && check
was dead code. Simplified to just check limit === undefined.
Impact: 2 functions changed, 18 affected
* docs: update dogfood report with fix statuses
All 4 bugs now fixed (PR #117 merged, #116 closed via reverse-dep
cascade). 3 of 4 suggestions addressed. MCP tool counts updated
18→23 / 19→24. Rating upgraded 7/10 → 9/10 post-fix.
* fix: rename misleading test to match actual behavior
Test was named "handles non-numeric thresholds gracefully" but only
validated baseline exceeds/aboveWarn with valid thresholds. Actual
non-numeric threshold tests exist separately. Renamed to "produces
correct exceeds and aboveWarn with valid thresholds".
* fix: update stale MCP tool count in dogfood skill (21→24)
* feat: add complexity analysis for Python, Go, Rust, Java, C#, Ruby, PHP
Parameterize the complexity algorithm to support all 10 languages instead
of just JS/TS/TSX. Add per-language COMPLEXITY_RULES, HALSTEAD_RULES, and
COMMENT_PREFIXES with three else-if detection patterns (else-wraps-if,
explicit elif, alternative field). Guard against tree-sitter keyword leaf
tokens that share node type names with their parent constructs.
Impact: 4 functions changed, 4 affected
---------
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
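
The "strict type validation for threshold values" commit above replaces a loose `!= null` check with `typeof === 'number' && Number.isFinite()`. A minimal sketch of the difference follows; the helper names `isLooseThreshold`, `isStrictThreshold`, and `resolveThreshold` are hypothetical and are not taken from the codegraph source.

```javascript
// Hypothetical helpers illustrating the loose vs. strict threshold check.

// Loose check: anything except null/undefined passes, so a later Number(value)
// can coerce '' or true into a numeric threshold without any warning.
function isLooseThreshold(value) {
  return value != null;
}

// Strict check: only real, finite numbers are accepted.
function isStrictThreshold(value) {
  return typeof value === 'number' && Number.isFinite(value);
}

// Fall back to a default when the configured value is not a usable number.
function resolveThreshold(value, fallback) {
  return isStrictThreshold(value) ? value : fallback;
}

// The coercions named in the commit message:
console.log(Number(''), Number(null), Number(true)); // 0 0 1

for (const candidate of ['', true, '10', NaN, Infinity, 0, 15]) {
  console.log(
    String(candidate) === '' ? "''" : String(candidate),
    'loose:', isLooseThreshold(candidate),
    'strict:', isStrictThreshold(candidate),
  );
}
```

Only `0` and `15` pass the strict check here; the loose check accepts every candidate in the list, including `''` and `true`.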
Summary
* `codegraph communities` command that runs Louvain clustering on the dependency graph, compares discovered communities against directory structure, and surfaces architectural drift (split/merge candidates, drift score)
* File-level (default) and function-level (`--functions`) modes with configurable resolution parameter
* Integrated into `codegraph stats`, MCP server (`communities` tool), and programmatic API exports

Details
New file:
* `src/communities.js` (~200 lines): core module with `communitiesData()`, `communitySummaryForStats()`, and `communities()` CLI display function. Uses `graphology` + `graphology-communities-louvain` (pure JS, ~50KB).

CLI options:
* `--functions`: function-level instead of file-level
* `--resolution <n>`: Louvain resolution (default 1.0, higher = more communities)
* `--drift`: show only drift analysis
* `-T`, `-j`, `-d`: standard flags

Drift analysis:

Dependencies:
* `graphology` ^0.25.4, `graphology-communities-louvain` ^2.0.2 (regular deps)

Test plan
* `communities`, `--functions`, `--drift -T`, `--resolution 1.5 -j`, `stats -T`
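
To illustrate why a higher `--resolution` tends to yield more communities, here is a small sketch of the generalized modularity score that Louvain-style algorithms typically optimize, Q = Σ_c [L_c / m − γ · (d_c / 2m)²], where γ is the resolution. This is a standalone illustration for an undirected, unweighted graph without self-loops. It does not call `graphology-communities-louvain`, and the library's internals may differ; the `modularity` function and the toy graph are assumptions.

```javascript
// Illustrative only: scores a fixed partition, it does not search for one.
// edges: array of [u, v] pairs; assignments: { [node]: communityId }.
function modularity(edges, assignments, resolution = 1) {
  const m = edges.length;
  if (m === 0) return 0;

  const internal = new Map(); // community -> number of edges inside it (L_c)
  const degree = new Map(); // community -> sum of node degrees (d_c)
  const bump = (map, key, by) => map.set(key, (map.get(key) || 0) + by);

  for (const [u, v] of edges) {
    const cu = assignments[u];
    const cv = assignments[v];
    bump(degree, cu, 1);
    bump(degree, cv, 1);
    if (cu === cv) bump(internal, cu, 1);
  }

  let q = 0;
  for (const [community, d] of degree) {
    q += (internal.get(community) || 0) / m - resolution * (d / (2 * m)) ** 2;
  }
  return q;
}

// Toy graph: two triangles joined by a single bridge edge (c-d).
const edges = [
  ['a', 'b'], ['b', 'c'], ['a', 'c'],
  ['d', 'e'], ['e', 'f'], ['d', 'f'],
  ['c', 'd'],
];
const split = { a: 0, b: 0, c: 0, d: 1, e: 1, f: 1 }; // one community per triangle
const merged = { a: 0, b: 0, c: 0, d: 0, e: 0, f: 0 }; // everything together

for (const resolution of [0.2, 1.0]) {
  console.log(
    'resolution', resolution,
    'split', modularity(edges, split, resolution).toFixed(3),
    'merged', modularity(edges, merged, resolution).toFixed(3),
  );
}
```

On this toy graph the two-community partition scores higher at resolution 1.0, while the single merged community scores higher at 0.2, which matches the "higher = more communities" note on the CLI option.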