diff --git a/.claude/skills/mo-hooks/SKILL.md b/.claude/skills/mo-hooks/SKILL.md
index 5053eae..e3b45a1 100644
--- a/.claude/skills/mo-hooks/SKILL.md
+++ b/.claude/skills/mo-hooks/SKILL.md
@@ -9,13 +9,21 @@ Set up Git hooks so README_AI.md files automatically update when code changes ar
## How It Works
+**Architecture**: Thin wrapper shell script → Python logic (auto-upgradeable via pip)
+
When a developer commits code changes:
-1. **post-commit hook** detects which directories were affected
-2. `codeindex affected --json` analyzes the change scope
-3. `codeindex scan` regenerates README_AI.md for affected directories
-4. Updated README_AI.md files are auto-committed
+1. Shell wrapper skips doc-only commits (loop guard) and activates the venv
+2. Delegates to `codeindex hooks run post-commit` (Python)
+3. `codeindex affected --json` analyzes the change scope
+4. `codeindex scan` regenerates structural README_AI.md for affected directories
+5. Updated README_AI.md files are auto-committed
+
+**Key**: The hook only updates structural content. AI blockquote descriptions
+(module purpose) are not regenerated per commit; run `codeindex scan-all`
+to refresh them.
-This keeps documentation always in sync with code — zero manual effort.
+**Upgrade**: `pip install --upgrade ai-codeindex` auto-updates hook logic.
+Reinstalling hooks after a package upgrade is usually unnecessary.
## Prerequisites
@@ -137,6 +145,16 @@ codeindex hooks uninstall post-commit
codeindex hooks uninstall --all
```
+## Upgrading Hooks
+
+```bash
+# Usually NOT needed — pip upgrade auto-updates Python logic
+pip install --upgrade ai-codeindex
+
+# Only if release notes say "reinstall hooks":
+codeindex hooks install post-commit --force
+```
+
## Troubleshooting
| Problem | Solution |
@@ -146,6 +164,7 @@ codeindex hooks uninstall --all
| Hook too slow | Set `mode: async` in .codeindex.yaml hooks config |
| Want manual control | Set `mode: prompt` — shows notification, you decide when to update |
| Virtual env not found | Ensure `.venv/` or `venv/` exists at project root |
+| Old hook with AI prompts | Run `codeindex hooks install post-commit --force` to upgrade to the thin wrapper |
## Advanced: CLAUDE.md Integration
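
The loop guard from step 1 of "How It Works" can be illustrated in Python. This is only a sketch: according to the diff, the real guard lives in the shell wrapper, and its exact rule is not shown here. The function name `is_doc_only_commit` and the rule that "doc-only" means every changed path is named `README_AI.md` are assumptions made for illustration.

```python
from pathlib import PurePosixPath

DOC_FILENAME = "README_AI.md"  # file name taken from the docs in this diff


def is_doc_only_commit(changed_paths: list[str]) -> bool:
    """Return True when every changed path is a README_AI.md file.

    Hypothetical rule: an empty change list is not treated as doc-only.
    """
    if not changed_paths:
        return False
    # Git reports paths with forward slashes, so PurePosixPath is sufficient.
    return all(PurePosixPath(p).name == DOC_FILENAME for p in changed_paths)
```

The changed paths of the latest commit could be collected with `git diff-tree --no-commit-id --name-only -r HEAD`. Skipping the hook when this function returns True prevents the auto-commit from triggering itself again.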
diff --git a/.claude/skills/mo-index/SKILL.md b/.claude/skills/mo-index/SKILL.md
index 709dd49..abf0f56 100644
--- a/.claude/skills/mo-index/SKILL.md
+++ b/.claude/skills/mo-index/SKILL.md
@@ -43,19 +43,21 @@ codeindex list-dirs
### Step 4: Index Directories
-**All directories (recommended - structural documentation, works immediately):**
+**All directories (recommended):**
```bash
+# When ai_command is configured, automatically includes AI enrichment (Phase 2)
codeindex scan-all
+
+# Disable AI enrichment
+codeindex scan-all --no-ai
```
**Single directory:**
```bash
codeindex scan ./src/module
-```
-**AI-enhanced mode (requires ai_command in config):**
-```bash
-codeindex scan-all --ai
+# Full AI-generated README for a single directory (requires --ai)
+codeindex scan ./src/module --ai
codeindex scan ./src/module --ai --dry-run # Preview AI prompt
```
@@ -110,8 +112,9 @@ Done! Your project is now indexed:
| Mode | Command | Description |
|------|---------|-------------|
-| Structural (default) | `codeindex scan-all` | Fast, no AI needed, works immediately |
-| AI-enhanced | `codeindex scan-all --ai` | Richer docs, requires ai_command config |
+| Auto (default) | `codeindex scan-all` | Structural + AI enrichment if ai_command configured |
+| No AI | `codeindex scan-all --no-ai` | Structural only, skip AI enrichment |
+| Single dir AI | `codeindex scan ./dir --ai` | Full AI-generated README for one directory |
## Configuration Reference
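
The mode table above, together with the changelog entry on `--ai` / `--no-ai` mutual exclusion, suggests a small decision rule for `scan-all`. The sketch below is not the project's code: the function name `resolve_scan_all_ai` is hypothetical, and raising an error for an explicit `--ai` without a configured `ai_command` is an assumption, since the diff does not say how that case is handled.

```python
def resolve_scan_all_ai(ai: bool, no_ai: bool, ai_command: str) -> bool:
    """Decide whether scan-all should run Phase 2 AI enrichment."""
    if ai and no_ai:
        # The changelog describes a clear error when both flags are used.
        raise ValueError("--ai and --no-ai cannot be used together")
    if no_ai:
        return False  # structural only
    configured = bool(ai_command.strip())
    if ai and not configured:
        # Assumption: an explicit --ai without ai_command is rejected.
        raise ValueError("--ai requires ai_command to be configured")
    # Auto mode: enrichment runs only when ai_command is configured.
    return configured
```

With this rule, a plain `codeindex scan-all` behaves as "Auto (default)" in the table: enrichment is on when `ai_command` is set and off otherwise.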
diff --git a/CHANGELOG.md b/CHANGELOG.md
index 3bd91b4..0f6c34d 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -7,6 +7,43 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
## [Unreleased]
+## [0.23.0] - 2026-03-12
+
+### Added
+
+- **AI-Enhanced Module Descriptions** (Epic #25, Stories 25.1–25.3): Redefine `--ai` mode from full AI takeover to structural + AI micro-enhancement.
+ - **Blockquote description support**: `extract_module_description()` now reads `> description` as Strategy 0 (highest priority)
+ - **AI enrichment module** (`src/codeindex/enricher.py`): Generates one-line functional descriptions per module using symbol names + file names
+ - **Concise prompt design**: ~200-400 tokens per directory, ≤30 char output, 10-20x cheaper than old `--ai` mode
+ - **Batch AI calls**: Groups multiple directories per AI invocation to reduce overhead
+ - **scan-all auto-AI**: Automatically enables Phase 2 AI enrichment when `ai_command` is configured (no `--ai` flag needed)
+ - **`--no-ai` opt-out**: Explicitly disable AI enrichment for structural-only output
+ - **`--ai` / `--no-ai` mutual exclusion**: Clear error when both flags are used
+
+- **Post-commit hook thin wrapper** (Issue #30 fix): Redesigned hook architecture for maintainability.
+ - **Thin shell wrapper** (~30 lines): Only handles loop guard + venv activation
+ - **Python logic** via `codeindex hooks run post-commit`: All business logic in upgradeable Python
+ - **Upgrade path**: `pip install --upgrade ai-codeindex` auto-updates hook behavior (no reinstall needed)
+ - **No custom AI prompts**: Uses `codeindex scan` pipeline, eliminating commit changelog noise
+
+### Fixed
+
+- **AI mode commit changelog noise** (#30): Post-commit hook no longer injects git diff into custom AI prompts. Uses standard `codeindex scan` pipeline instead.
+- **Enricher prompt accuracy**: Improved from "20 chars/brief" to "30 chars/concise" for better description quality.
+
+### Changed
+
+- **`--no-ai` flag**: Changed from hidden/deprecated to active opt-out flag for scan-all.
+- **Post-commit hook generation**: `_generate_post_commit_script()` now generates thin wrapper delegating to Python.
+- **Documentation**: All hook and scan-all docs rewritten for AI CLI agent readers.
+
+### Technical Details
+
+- **New module**: `src/codeindex/enricher.py` — AI enrichment with `enrich_directory()` and `_enrich_directories_with_ai()`
+- **New CLI subcommand**: `codeindex hooks run post-commit` — Python-side post-commit logic
+- **New tests**: 15 tests (7 scan-all auto-AI + 8 post-commit hook)
+- **Total tests**: 1532 passed
+
## [0.22.2] - 2026-03-08
### Added
diff --git a/CLAUDE.md b/CLAUDE.md
index 7e52efa..79d2cfb 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -607,7 +607,7 @@ codeindex status
## 📈 Version History
-**Current version**: v0.22.2
+**Current version**: v0.23.0
For complete version history, see:
- **[CHANGELOG.md](CHANGELOG.md)** - Detailed changes for each version
diff --git a/README.md b/README.md
index 3d232c0..7eab632 100644
--- a/README.md
+++ b/README.md
@@ -160,8 +160,14 @@ codeindex scan-all
# Scan a single directory
codeindex scan ./src/auth
-# AI-enhanced documentation (requires ai_command in config)
-codeindex scan-all --ai
+# When ai_command is configured, auto-enables AI module descriptions
+codeindex scan-all
+
+# Disable AI enrichment (structural only)
+codeindex scan-all --no-ai
+
+# Full AI-generated README for a single directory
+codeindex scan ./src/auth --ai
# Preview AI prompt without executing
codeindex scan ./src/auth --ai --dry-run
@@ -428,7 +434,7 @@ See [Release Automation Guide](docs/development/QUICK_START_RELEASE.md) for deta
## Roadmap
-**Current version**: v0.22.2
+**Current version**: v0.23.0
**Recent milestones**:
- v0.22.2 — Auto-update CLAUDE.md on `pip upgrade`, `/codeindex-update-guide` skill
diff --git a/docs/guides/advanced-usage.md b/docs/guides/advanced-usage.md
index 533efcf..8f36760 100644
--- a/docs/guides/advanced-usage.md
+++ b/docs/guides/advanced-usage.md
@@ -7,11 +7,11 @@
The simplest way to scan entire projects:
```bash
-# Structural documentation (default, no AI needed)
+# When ai_command is configured, auto-enables AI enrichment
codeindex scan-all
-# AI-enhanced documentation
-codeindex scan-all --ai
+# Disable AI enrichment (structural only)
+codeindex scan-all --no-ai
# Custom timeout per directory
codeindex scan-all --timeout 180
@@ -20,6 +20,8 @@ codeindex scan-all --timeout 180
codeindex scan-all --workers 4
```
+> When `ai_command` is configured, `scan-all` runs in two phases: Phase 1 generates structural README_AI.md for all directories, Phase 2 uses AI to add a short functional description (`> description`) to each non-leaf directory. Cost: ~300-800 tokens input, ~20-50 tokens output per directory. Use `--no-ai` to skip Phase 2.
+
### Traditional Parallel with xargs
For fine-grained control, use traditional parallel scanning:
@@ -468,17 +470,20 @@ claude /mo:arch "Where is authentication implemented?"
## Tips & Tricks
-### 1. Selective AI Enhancement
+### 1. AI Enhancement Strategies
```bash
-# Generate structural docs for everything (fast)
+# Option A: Auto AI enrichment (default when ai_command configured)
codeindex scan-all
-# Then enhance critical modules with AI
-codeindex scan ./src/core --ai
-codeindex scan ./src/auth --ai
+# Option B: Selective full AI README for critical modules only
+codeindex scan-all --no-ai # Structural docs only
+codeindex scan ./src/core --ai # Full AI README for core module
+codeindex scan ./src/auth --ai # Full AI README for auth module
```
+> **Tip**: When `ai_command` is configured, `scan-all` automatically adds concise AI-generated module descriptions (e.g., "支付网关(微信、支付宝)", roughly "Payment gateway (WeChat, Alipay)") at low cost (~$0.5-2 for 2000+ directories). Use `scan --ai` only when you need a full AI-written README for a specific directory.
+
### 2. Conditional Scanning
Only scan if README_AI.md is missing:
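
The per-directory figures quoted in the `scan-all` note earlier in this file (~300-800 input tokens, ~20-50 output tokens) can be turned into a rough total for a project. This is a back-of-the-envelope sketch: the function name `estimate_tokens` is hypothetical, and the estimate ignores batching and any per-call overhead, neither of which is quantified in the diff.

```python
def estimate_tokens(
    num_dirs: int,
    input_range: tuple[int, int] = (300, 800),   # per-directory input, from the note
    output_range: tuple[int, int] = (20, 50),    # per-directory output, from the note
) -> dict[str, tuple[int, int]]:
    """Return (low, high) token totals for enriching num_dirs directories."""
    return {
        "input": (num_dirs * input_range[0], num_dirs * input_range[1]),
        "output": (num_dirs * output_range[0], num_dirs * output_range[1]),
    }


estimate = estimate_tokens(2000)
print(estimate)  # {'input': (600000, 1600000), 'output': (40000, 100000)}
```

Only non-leaf directories receive a description in Phase 2, so `num_dirs` should count those directories rather than every directory in the project.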
diff --git a/docs/guides/getting-started.md b/docs/guides/getting-started.md
index 5a8fb85..62e5fd4 100644
--- a/docs/guides/getting-started.md
+++ b/docs/guides/getting-started.md
@@ -83,7 +83,7 @@ Generate documentation for a single directory:
# Structural documentation (default, no AI needed)
codeindex scan ./src/auth
-# AI-enhanced documentation (requires ai_command in config)
+# Full AI-generated README for a single directory (requires ai_command)
codeindex scan ./src/auth --ai
```
@@ -115,15 +115,18 @@ Scan all directories at once:
```bash
# Structural documentation for entire project
+# When ai_command is configured, automatically includes AI module descriptions
codeindex scan-all
-# AI-enhanced documentation for entire project
-codeindex scan-all --ai
+# Disable AI enrichment (structural only)
+codeindex scan-all --no-ai
# Traditional parallel scanning
codeindex list-dirs | xargs -P 4 -I {} codeindex scan {}
```
+> **Note**: When `ai_command` is configured in `.codeindex.yaml`, `scan-all` automatically runs a two-phase pipeline: Phase 1 generates structural README_AI.md, Phase 2 adds a short AI-generated functional description (`> ...`) to each non-leaf directory. Use `--no-ai` to skip Phase 2. This is different from `scan --ai`, which uses AI to generate the entire README for a single directory.
+
### 6. Symbol Indexes (v0.1.2+)
Generate project-wide indexes for navigation:
diff --git a/docs/guides/git-hooks-integration.md b/docs/guides/git-hooks-integration.md
index 0694fb7..eff4f5d 100644
--- a/docs/guides/git-hooks-integration.md
+++ b/docs/guides/git-hooks-integration.md
@@ -171,39 +171,39 @@ All checks passed!
### Post-commit Hook
-**Purpose**: Automatic documentation updates
+**Purpose**: Automatic structural documentation updates
+
+**Architecture** (v0.23.0+): Thin wrapper pattern
+- Shell script (~30 lines): loop guard + venv activation
+- Python logic via `codeindex hooks run post-commit`: all business logic
+- **Upgrade path**: `pip install --upgrade ai-codeindex` automatically updates hook behavior (no need to reinstall hooks)
**Features**:
- Analyzes commit changes (`codeindex affected`)
-- Updates README_AI.md for affected directories
+- Runs `codeindex scan` for affected directories (structural regeneration)
- Creates follow-up commit with updates
- Avoids infinite loops (skips doc-only commits)
+- No custom AI prompts — uses standard codeindex scan pipeline
**Workflow**:
```
Code Change Commit
↓
-Post-commit Hook Triggered
+Shell wrapper (loop guard + venv)
+ ↓
+codeindex hooks run post-commit (Python)
↓
-Analyze: Which directories changed?
+codeindex affected --json → affected directories
↓
-Update: README_AI.md files
+codeindex scan for each affected directory → structural README_AI.md update
↓
Auto-commit: "docs: auto-update README_AI.md for "
```
-**Example Output**:
-```
-📝 Post-commit: Analyzing changes...
- Update level: full
- Found 2 directory(ies) to check
-
-→ Updating src/codeindex/README_AI.md
- Invoking AI CLI...
- ✓ Updated via AI
-
-✓ Post-commit hook completed
-```
+> **Note**: The post-commit hook only updates structural content. AI-generated
+> module descriptions (blockquotes) are not regenerated on every commit because
+> they describe module purpose, which rarely changes. Run `codeindex scan-all`
+> (with `ai_command` configured) to refresh AI descriptions.
### Pre-push Hook
@@ -522,14 +522,34 @@ fi
**Note**: Manual edits will be lost if you reinstall with `--force`.
-### Hook Versioning
+### Hook Architecture and Upgrades
+
+**Thin wrapper pattern** (v0.23.0+):
+
+Post-commit hooks use a thin shell wrapper that delegates to Python:
+
+```
+.git/hooks/post-commit (shell, ~30 lines)
+ → loop guard (skip doc-only commits)
+ → activate venv
+ → codeindex hooks run post-commit ← Python logic
+
+codeindex hooks run post-commit (Python, in cli_hooks.py)
+ → codeindex affected --json
+ → codeindex scan for each affected dir
+ → git add + git commit
+```
-Hooks are marked with `# codeindex-managed hook` comment.
+**Upgrade behavior**:
+- `pip install --upgrade ai-codeindex` → Python logic auto-updates, no hook reinstall needed
+- `codeindex hooks install --force` → only needed if shell wrapper itself changes (rare)
+- Hooks are marked with `# codeindex-managed hook` comment
-To update hooks to latest version:
+**To update hooks to latest version**:
```bash
-# Reinstall all hooks
+# Usually not needed (Python logic auto-updates via pip)
+# Only if instructed by release notes:
codeindex hooks install --all --force
```
@@ -597,5 +617,5 @@ A: Hooks persist across branches (stored in `.git/hooks/`, not tracked by Git).
---
-**Last Updated**: 2026-02-13
-**Status**: Production Ready (v0.17.2)
+**Last Updated**: 2026-03-12
+**Status**: Production Ready (v0.23.0)
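
The guide states that hooks are marked with a `# codeindex-managed hook` comment. A minimal sketch of checking for that marker before deciding whether a reinstall with `--force` would overwrite a hand-written hook is shown below. The helper name `is_codeindex_managed` is hypothetical and is not part of the codeindex API.

```python
from pathlib import Path

MANAGED_MARKER = "# codeindex-managed hook"  # marker comment quoted in the guide


def is_codeindex_managed(hook_path: Path) -> bool:
    """Return True if the hook file is readable and contains the marker comment."""
    try:
        text = hook_path.read_text(encoding="utf-8", errors="replace")
    except OSError:
        # Missing file, directory, or unreadable path: treat as not managed.
        return False
    return MANAGED_MARKER in text
```

In a standard repository layout the path to check would be `.git/hooks/post-commit`, as shown in the architecture diagram of this guide.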
diff --git a/docs/planning/README.md b/docs/planning/README.md
index 6c08c42..2b863e4 100644
--- a/docs/planning/README.md
+++ b/docs/planning/README.md
@@ -1,7 +1,7 @@
# Planning Index
**Last Updated**: 2026-02-20
-**Current Version**: v0.22.2
+**Current Version**: v0.23.0
---
diff --git a/docs/planning/ROADMAP.md b/docs/planning/ROADMAP.md
index 53debf8..2a79570 100644
--- a/docs/planning/ROADMAP.md
+++ b/docs/planning/ROADMAP.md
@@ -1,7 +1,7 @@
# codeindex Strategic Roadmap
**Last Updated**: 2026-03-06
-**Current Version**: v0.22.2
+**Current Version**: v0.23.0
**Vision**: Universal Code Parser - Best-in-class multi-language AST parser for AI-assisted development
**Positioning**: Focused on code parsing and structured data extraction, not AI analysis
@@ -534,4 +534,4 @@
**Next Review**: 2026-03-31
**Maintained By**: @dreamlx
**Last Updated**: 2026-03-06
-**Current Version**: v0.22.2
+**Current Version**: v0.23.0
diff --git a/pyproject.toml b/pyproject.toml
index a278361..8b5abe9 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -1,6 +1,6 @@
[project]
name = "ai-codeindex"
-version = "0.22.2"
+version = "0.23.0"
description = "AI-native code indexing tool for large codebases"
readme = "README.md"
requires-python = ">=3.10"
diff --git a/src/codeindex/README_AI.md b/src/codeindex/README_AI.md
index ee6e202..a3529fb 100644
--- a/src/codeindex/README_AI.md
+++ b/src/codeindex/README_AI.md
@@ -1,261 +1,1690 @@
-
+
-# README_AI.md - codeindex
+# codeindex
## Overview
-- **Files**: 42
-- **Symbols**: 320+
-- **Subdirectories**: 2
-
-## Subdirectories
-
-- **extractors/** - Framework-specific route extractors.
-- **parsers/** - Java language parser module.
+- **Files**: 86
+- **Symbols**: 547
## Files
-- __init__.py
-- **adaptive_config.py** - AdaptiveSymbolsConfig
-- **adaptive_selector.py** - AdaptiveSymbolSelector
-- **ai_helper.py** - aggregate_parse_results
-- **cli.py** - main
-- cli_common.py
-- **cli_config.py** - init (with interactive wizard), status, list_dirs, _print_post_init_message
-- **cli_config_commands.py** - config command group, explain command
-- **cli_docs.py** - docs, show_ai_guide
-- **cli_hooks.py** - HookStatus, HookManager, generate_hook_script
-- **cli_parse.py** - parse
-- **cli_scan.py** - _process_directory_with_smartwriter, scan, scan_all
-- **cli_symbols.py** - extract_module_purpose, index, symbols
-- **cli_tech_debt.py** - _find_source_files, _analyze_files, _format_and_output
-- **config.py** - SymbolsConfig, GroupingConfig, SemanticConfig
-- **config_help.py** - CONFIG_PARAMS, show_full_config_help, explain_parameter, get_current_config_value
-- **directory_tree.py** - DirectoryNode, DirectoryTree, walk_directory
-- **docstring_processor.py** - DocstringProcessor
-- **errors.py** - ErrorCode, ErrorInfo, create_error_response
-- __init__.py
-- **spring.py** - SpringRouteExtractor
-- **thinkphp.py** - ThinkPHPRouteExtractor
-- **file_classifier.py** - FileSizeCategory, FileSizeAnalysis, FileSizeClassifier
-- **framework_detect.py** - RouteInfo, ModelInfo, FrameworkInfo
-- **hierarchical.py** - DirectoryInfo, build_directory_hierarchy, create_processing_batches
-- **incremental.py** - UpdateLevel, FileChange, ChangeAnalysis
-- **init_wizard.py** - WizardResult, PARSER_PACKAGES, CLAUDE_MD_MARKER_START, CLAUDE_MD_MARKER_END, CLAUDE_MD_SECTION, check_parser_installed, get_parser_install_guidance, detect_languages, detect_frameworks, infer_include_patterns, infer_exclude_patterns, calculate_parallel_workers, calculate_batch_size, count_files, inject_claude_md, has_claude_md_injection, run_interactive_wizard, generate_config_yaml, create_codeindex_md
-- **invoker.py** - InvokeResult, clean_ai_output, validate_markdown_output
-- **parallel.py** - BatchResult, parse_files_parallel, scan_directories_parallel
-- **parser.py** - CallType, Call, Symbol, Import, Inheritance, ParseResult, JAVA_LANG_CLASSES, build_alias_map, resolve_alias, _extract_python_calls_from_tree, _extract_decorator_calls, _extract_python_calls, _parse_python_call, _determine_python_call_type, _extract_call_name, _is_simple_decorator, _extract_decorator_name, _count_arguments, _extract_package_namespace, _strip_generic_type, _resolve_java_type, _build_java_import_map, _extract_java_inheritances, _extract_type_from_node, _get_parser, _get_node_text, parse_file, _build_java_static_import_map, _resolve_java_static_import, _parse_java_method_call, _parse_java_constructor_call, _extract_java_calls, _extract_java_calls_from_tree, _extract_php_calls_from_tree, _extract_php_calls, _parse_php_function_call, _parse_php_member_call, _parse_php_scoped_call, _parse_php_object_creation
-- __init__.py
-- **java_parser.py** - is_java_file, get_java_parser, parse_java_file
-- **route_extractor.py** - ExtractionContext, RouteExtractor
-- **route_registry.py** - RouteExtractorRegistry
-- **scanner.py** - ScanResult, get_language_extensions, is_pass_through, should_exclude
-- **semantic_extractor.py** - DirectoryContext, BusinessSemantic, SimpleDescriptionGenerator
-- **smart_writer.py** - WriteResult, SmartWriter, determine_level
-- **symbol_index.py** - SymbolEntry, GlobalSymbolIndex
-- **symbol_scorer.py** - ScoringContext, SymbolImportanceScorer
-- **tech_debt.py** - DebtSeverity, DebtIssue, DebtAnalysisResult
-- **tech_debt_formatters.py** - ReportFormatter, ConsoleFormatter, MarkdownFormatter
-- **writer.py** - WriteResult, format_symbols_for_prompt, format_imports_for_prompt
-
-
-**Improved Init Setup Flow (CLAUDE.md Injection & Post-Init Message)**:
-- **init_wizard.py**: `CLAUDE_MD_SECTION` template updated with structured first-time setup and daily usage sections — includes steps to review `.codeindex.yaml`, run `codeindex scan-all`, and optionally install post-commit hooks; removed "Full command reference: see CODEINDEX.md" line
-- **cli_config.py**: `_print_post_init_message()` revised to a 3-step flow: (1) Review `.codeindex.yaml` to verify include/exclude patterns, (2) Run `codeindex scan-all` to generate documentation indexes, (3) Run `codeindex status` to check coverage; removed separate AI-enhanced documentation section
-
-**CLAUDE.md Injection Feature**:
-- **init_wizard.py**: `CLAUDE_MD_MARKER_START`, `CLAUDE_MD_MARKER_END` constants and `CLAUDE_MD_SECTION` template for marker-based idempotent injection
-- **init_wizard.py**: `inject_claude_md()` function — creates CLAUDE.md if missing, prepends section if no markers found, replaces between markers if already injected
-- **init_wizard.py**: `has_claude_md_injection()` helper to check for existing injection
-- **init_wizard.py**: `WizardResult` gains `inject_claude_md` (default `True`) and `claude_md_injected` fields
-- **init_wizard.py**: Interactive wizard expanded from 5 to 6 steps — new Step 5/6 asks "Inject codeindex instructions into CLAUDE.md?"
-- **cli_config.py**: Non-interactive mode (`--yes`) always calls `inject_claude_md()` as safe default
-- **cli_config.py**: Interactive mode calls `inject_claude_md()` when `result.inject_claude_md` is True
-- **cli_config.py**: Both paths report injection in console output (`✓ Injected: CLAUDE.md`)
-
-**Post-Commit Hook Auto-Update & Auto-Commit (Epic 19 Story 19.3)**:
-- **cli_hooks.py**: Removed `set -e` from post-commit hook to prevent aborting on non-fatal errors
-- **cli_hooks.py**: Hook now `cd`s to repo root before activating venv or running commands
-- **cli_hooks.py**: Hook iterates affected directories, runs `codeindex scan` for each with existing README_AI.md, and auto-commits updated files with `--no-verify`
-- **cli_hooks.py**: Skips directories without README_AI.md (warns user to run `codeindex scan` first)
-- **init_wizard.py**: `create_codeindex_md()` now includes "Auto-Update Hooks" section documenting `codeindex hooks install post-commit`, `codeindex hooks status`, and `.codeindex.yaml` hooks configuration
+### __init__.py
+_codeindex - AI-native code indexing tool for large codebases
-**Parser Detection + Init Wizard Updates (Epic 19 Stories 19.4 + 19.2)**:
-- **init_wizard.py**: New `PARSER_PACKAGES` mapping, `check_parser_installed()`, and `get_parser_install_guidance()` functions for detecting installed tree-sitter parsers and generating install commands
-- **cli_config.py**: Non-interactive mode (`--yes`) now checks for missing parsers and warns with install guidance; new `_print_post_init_message()` helper provides unified post-init next steps for both interactive and non-interactive modes
+Usage:
+ codeindex scan # Scan a directory and generate README_AI.md
+ co_
-**Java Tech-Debt Improvements (Epic 19 Stories 19.6a + 19.6b)**:
-- **cli_tech_debt.py**: Auto-enables `--recursive` for Java projects instead of showing a post-hoc hint; removes the empty-directory hint code in favor of proactive behavior
-- **tech_debt.py**: `_analyze_noise_breakdown()` is now language-aware via a `file_type` parameter; Java getter/setter methods (`get*`/`set*`) are no longer counted as noise, recognizing them as standard JavaBeans convention
+### adaptive_config.py
+_Adaptive symbols configuration.
-**Pass-Through Directory Skipping (Epic 19 Story 19.5)**:
-- **scanner.py**: New `is_pass_through()` function detects directories with no code files and a single subdirectory
-- **directory_tree.py**: `get_processing_order()` filters out pass-through directories to avoid redundant README_AI.md generation in deep structures (e.g., Java Maven `src/main/java/com/...`)
-- **directory_tree.py**: `print_tree()` switched from `print()` to `logger.debug()` for proper logging
+This module defines the configuration structure for adaptive symbol extraction,
+which allows dynamically adjusting th_
-**Reversed Scan Defaults (Epic 19 - v0.16.0+)**:
-- **Structural mode is now the default** for both `scan` and `scan-all` commands
-- New `--ai` opt-in flag replaces the previous AI-by-default behavior
-- `DEFAULT_AI_COMMAND` in `config.py` changed to empty string (AI requires explicit configuration)
-- Deprecated flags (`--fallback`, `--no-ai`) are hidden but still functional with deprecation warnings
-- `--dry-run` now requires `--ai` flag (validates at command level)
-- `--ai` validates that `ai_command` is configured in `.codeindex.yaml`
-- Error messages and fallback text updated to reflect new terminology ("structural fallback")
+**class** `class AdaptiveSymbolsConfig`
+> Configuration for adaptive symbol extraction.
-**Dynamic Version Resolution**:
-- `__init__.py` now uses `importlib.metadata.version("ai-codeindex")` instead of a hardcoded version string
-- Falls back to `"0.0.0-dev"` when package metadata is unavailable
-- `pyproject.toml` is the single source of truth for version numbers
+ Adaptive symbol extraction adjusts the number of
-**Language Support Clarification**:
-- **Supported Languages**: Documentation now correctly reflects that only Python, PHP, and Java have full parser support
-- `config_help.py`: Updated language parameter to show "python, php, java (fully supported with parsers)"
-- `init_wizard.py`: Commented out detection patterns for languages without parser support (JavaScript, TypeScript, Go, Rust, Ruby) with note "detection only, no parser yet"
+### adaptive_selector.py
+_Adaptive symbol selector for dynamic symbol limit calculation.
-**Implementation Details**:
-- Detection logic preserved for future language support expansion
-- Clear distinction between "fully supported" (with parser) and "planned support" (detection only)
-- YAML examples updated to show Python, PHP, and Java configuration
+This module implements the core algorithm for adaptive symbol extraction,
+which adjust_
-**Java scan-all, Tech Debt, and Getter/Setter Scoring Fixes**:
-- **scanner.py**: Replaced per-language if/elif extension checks with unified `get_language_extensions()` call in both `scan_directory()` and `find_all_directories()`, making language support extensible via a single map
-- **cli_tech_debt.py**: Added Java-aware hint when tech-debt analysis on a non-recursive scan returns empty results for Java-configured projects
-- **symbol_scorer.py**: Java files no longer penalize getter/setter method names (`get*`, `set*`, `is*`, `has*`) since these follow standard JavaBeans convention; penalty retained for other languages
+**class** `class AdaptiveSymbolSelector`
+> Selects appropriate symbol limit based on file size.
+ This selector implements a tiered approach
-**Commit ``**:
+**Methods:**
+- `def calculate_limit(self, file_lines: int, total_symbols: int) -> int`
+- `def _determine_size_category(self, lines: int) -> str`
+- `def _apply_constraints(self, limit: int, total_symbols: int) -> int`
-Changed files:
-- `smart_writer.py`
+### ai_helper.py
+_AI enhancement helper functions (Epic 4 Story 4.1).
+This module provides reusable functions for AI enhancement operations,
+eliminating code duplicati_
-**Commit ``**:
+**Functions:**
+- `def aggregate_parse_results(
+ parse_results: list[ParseResult],
+ path: Path,
+) -> ParseResult`
-Changed files:
-- `smart_writer.py`
+### cli.py
+_CLI entry point for codeindex.
+This module serves as the main entry point for the codeindex CLI tool.
+It imports and registers commands from speciali_
-**Commit ``**:
+**Functions:**
+- `def main()`
-Changed files:
-- `parser.py`
-- `scanner.py`
+### cli_common.py
+_Common utilities for CLI modules.
+This module provides shared resources used across all CLI command modules,
+such as the Rich console instance for fo_
-**Commit ``**:
+### cli_config.py
+_CLI commands for configuration and project status.
-Changed files:
-- `smart_writer.py`
+This module provides commands for initializing configuration files,
+checking indexing status, and _
+
+**Functions:**
+- `def _print_post_init_message()`
+- `def init(force: bool, yes: bool, quiet: bool, help_config: bool)`
+- `def status(root: Path)`
+- `def list_dirs(root: Path)`
+
+### cli_config_commands.py
+_CLI commands for configuration help and explanation (Epic 15 Story 15.3).
+
+This module provides commands for:
+- Explaining individual configuration pa_
+
+**Functions:**
+- `def config()`
+- `def explain(parameter: str)`
+
+### cli_docs.py
+_Documentation CLI commands for codeindex._
+
+**Functions:**
+- `def docs()`
+- `def show_ai_guide()`
+
+### cli_hooks.py
+_Git Hooks management module for codeindex.
+
+Epic 6, P3.1: Automate Git Hooks installation and management.
+
+This module provides:
+- HookManager: Manage_
+
+**class** `class HookStatus(Enum)`
+> Status of a Git hook.
+
+**class** `class HookManager`
+> Manage Git hooks for codeindex.
+
+**Methods:**
+- `def _find_git_repo(self) -> Path`
+- `def install_hook(
+ self, hook_name: str, backup: bool = True, force: bool = False
+ ) -> bool`
+- `def uninstall_hook(
+ self, hook_name: str, restore_backup: bool = True
+ ) -> bool`
+- `def list_all_hooks(self) -> dict[str, HookStatus]`
+
+**Functions:**
+- `def generate_hook_script(
+ hook_name: str, config: Optional[dict] = None
+) -> str`
+- `def _generate_pre_commit_script(config: dict) -> str`
+- `def _generate_post_commit_script(config: dict) -> str`
+- `def _generate_pre_push_script(config: dict) -> str`
+- `def backup_existing_hook(hook_path: Path) -> Path`
+- `def detect_existing_hooks(hooks_dir: Path) -> list[str]`
+- `def install_hook(hook_name: str, repo_path: Optional[Path] = None) -> bool`
+- `def uninstall_hook(hook_name: str, repo_path: Optional[Path] = None) -> bool`
+- `def run_post_commit_hook() -> int`
+
+_... and 5 more symbols_
+
+### cli_parse.py
+_CLI parse command - Parse a single source file and output JSON.
+
+Epic 12, Story 12.1: Single File Parse Command
+This module provides the 'parse' comma_
+
+**Functions:**
+- `def parse(file_path: str)`
+
+### cli_scan.py
+_CLI commands for scanning directories and generating README files.
+
+This module provides the core scanning functionality, including single directory
+s_
+
+**Functions:**
+- `def _validate_scan_args(fallback: bool, dry_run: bool, ai: bool, quiet: bool) -> None`
+- `def _validate_and_resolve_path(path: Path, output: str) -> Path`
+- `def _load_and_prepare_config(
+ ai: bool,
+ parallel: int | None,
+ docstring_mode: str | None,
+) -> tuple[Config, DocstringProcessor | None]`
+- `def _scan_and_parse_directory(
+ path: Path, config: Config, quiet: bool, output: str
+) -> list | None`
+- `def _output_scan_json(parse_results: list) -> None`
+- `def _generate_structural_readme(
+ path: Path,
+ parse_results: list,
+ config: Config,
+ docstring_processor: DocstringProcessor | None,
+ quiet: bool,
+ show_cost: bool,
+) -> None`
+- `def _generate_ai_readme(
+ path: Path,
+ parse_results: list,
+ config: Config,
+ dry_run: bool,
+ quiet: bool,
+ timeout: int,
+) -> None`
+- `def _validate_scanall_args(fallback: bool, quiet: bool) -> None`
+- `def _load_scanall_config(
+ root: Path,
+ output: str,
+ parallel: int | None,
+ docstring_mode: str | None,
+) -> tuple[Config, DocstringProcessor | None]`
+- `def _output_scanall_json(root: Path, config: Config) -> None`
+- `def _build_and_print_tree(root: Path, config: Config, quiet: bool) -> DirectoryTree`
+- `def _process_directories_parallel(
+ dirs: list[Path],
+ tree: DirectoryTree,
+ config: Config,
+ docstring_processor: DocstringProcessor | None,
+ quiet: bool,
+ show_cost: bool,
+ ai: bool = False,
+ timeout: int = 120,
+) -> None`
+- `def _enrich_directories_with_ai(
+ dirs: list[Path],
+ tree: DirectoryTree,
+ config: Config,
+ quiet: bool,
+ timeout: int,
+) -> None`
+- `def _process_directory_with_smartwriter(
+ dir_path: Path,
+ tree: DirectoryTree,
+ config: Config,
+ docstring_processor=None,
+) -> tuple[Path, bool, str, int]`
+- `def scan(
+ path: Path,
+ ai: bool,
+ dry_run: bool,
+ fallback: bool,
+ quiet: bool,
+ timeout: int,
+ parallel: int | None,
+ docstring_mode: str | None,
+ show_cost: bool,
+ output: str,
+)`
+
+_... and 1 more symbols_
+
+### cli_symbols.py
+_CLI commands for symbol indexing and dependency analysis.
+
+This module provides commands for generating project-wide indices
+and analyzing code depend_
+
+**Functions:**
+- `def extract_module_purpose(
+ dir_path: Path,
+ config: Config,
+ output_file: str = "README_AI.md"
+) -> str`
+- `def index(root: Path, output: str)`
+- `def symbols(root: Path, output: str, quiet: bool)`
+- `def affected(since: str, until: str, as_json: bool)`
+
+### cli_tech_debt.py
+_CLI commands for technical debt analysis.
+
+This module provides the tech-debt command for analyzing technical debt
+in a directory, including file size_
+
+**Functions:**
+- `def _find_source_files(
+ path: Path, recursive: bool, languages: list[str] | None = None
+) -> list[Path]`
+- `def _analyze_files(
+ files: list[Path],
+ detector: TechDebtDetector,
+ reporter: TechDebtReporter,
+ show_progress: bool,
+) -> list[dict]`
+- `def _get_file_type(file_path: Path) -> str`
+- `def _create_scorer(parse_result, file_type: str) -> SymbolImportanceScorer`
+- `def _analyze_single_file(
+ file_path: Path,
+ parse_result,
+ detector: TechDebtDetector,
+ reporter: TechDebtReporter,
+ test_smell_detector,
+) -> list[dict]`
+- `def _collect_test_smells(file_path: Path, parse_result, test_smell_detector) -> list[dict]`
+- `def _handle_parse_error(file_path: Path, error: str, show_progress: bool)`
+- `def _handle_analysis_error(file_path: Path, error: Exception, show_progress: bool)`
+- `def _format_and_output(
+ report: TechDebtReport,
+ format: str,
+ output: Path | None,
+ quiet: bool,
+ test_smells: list[dict] | None = None,
+ target_path: Path | None = None,
+) -> None`
+- `def tech_debt(path: Path, format: str, output: Path | None, recursive: bool, quiet: bool)`
+
+### config.py
+_Configuration management for codeindex._
+
+**class** `class SymbolsConfig`
+> Configuration for symbol extraction.
+
+**class** `class GroupingConfig`
+> Configuration for file grouping.
+
+**class** `class SemanticConfig`
+> Configuration for semantic extraction.
+
+**class** `class IndexingConfig`
+> Configuration for smart indexing.
+
+**class** `class IncrementalConfig`
+> Configuration for incremental updates.
+
+**class** `class DocstringConfig`
+> Configuration for docstring extraction (Epic 9).
+
+ Supports AI-powered docstring extraction and n
+
+**class** `class PostCommitConfig`
+> Configuration for post-commit Git hook.
+
+ Modes:
+ - auto: Smart detection (≤2 dirs = sync, >2
+
+**class** `class HooksConfig`
+> Configuration for Git hooks (Story 6).
+
+**class** `class Config`
+> Configuration for codeindex.
+
+### config_help.py
+_Configuration help documentation for codeindex (Epic 15 Story 15.3).
+
+This module provides comprehensive help documentation for all .codeindex.yaml
+co_
+
+**Functions:**
+- `def show_full_config_help() -> None`
+- `def _show_param_section(param_name: str) -> None`
+- `def explain_parameter(
+ param_name: str, current_value: Optional[any] = None, cpu_count: Optional[int] = None
+) -> int`
+
+### directory_tree.py
+_Directory tree structure for hierarchical indexing._
+
+**class** `class DirectoryNode`
+> A node in the directory tree.
+
+**class** `class DirectoryTree`
+> Pre-scanned directory tree for determining index levels.
+
+ This enables two-pass indexing:
+ 1.
+
+**Methods:**
+- `def _build_tree(self)`
+- `def _scan_directory_structure(self)`
+- `def _add_intermediate_directories(self)`
+- `def _establish_relationships(self)`
+- `def print_tree(self, max_depth: int = 3)`
+
+### docstring_processor.py
+_Docstring Processor - AI-powered documentation extraction.
+
+Story 9.1: Docstring Processor Core
+
+This module provides AI-powered docstring extraction _
+
+**class** `class DocstringProcessor`
+> AI-powered docstring extraction and normalization.
+
+ Uses external AI CLI (Claude, GPT-4, etc.) t
+**Methods:**
+- `def process_file(
+ self, file_path: Path, symbols: list[Symbol]
+ ) -> dict[str, str]`
+- `def _should_process(self, docstring: str) -> bool`
+- `def _should_use_ai(self, docstring: str) -> bool`
+- `def _contains_non_ascii(self, text: str) -> bool`
+- `def _process_simple(self, symbols: list[Symbol]) -> dict[str, str]`
+- `def _process_with_ai(
+ self,
+ file_path: Path,
+ symbols_to_process: list[Symbol],
+ all_symbols: list[Symbol],
+ ) -> dict[str, str]`
+- `def _generate_prompt(self, file_path: Path, symbols: list[Symbol]) -> str`
+- `def _call_ai(self, prompt: str) -> str`
+- `def _parse_ai_response(self, response: str) -> dict[str, str]`
+- `def _fallback_extract(self, docstring: str) -> str`
 
-**Commit ``**:
+### enricher.py
+_AI enrichment module for generating one-line module descriptions.
 
-Changed files:
-- `file_classifier.py`
-- `tech_debt.py`
+Epic 25: Instead of AI generating entire README_AI.md files, AI only
+produces a sho_
+**Functions:**
+- `def extract_symbol_summary(parse_results: list[ParseResult]) -> str`
+- `def extract_summary_from_readme(readme_path: Path) -> str`
+- `def build_enrich_prompt(
+    dir_name: str,
+    symbol_summary: str,
+    parent_name: str = "",
+) -> str`
+- `def inject_blockquote(readme_path: Path, description: str) -> None`
+- `def should_enrich(level: str) -> bool`
 
-**Commit ``**:
+### errors.py
+_Error codes and structures for JSON output.
 
-Changed files:
-- `parser.py`
+Story 4: Structured error handling for machine-readable errors._
+**class** `class ErrorCode(str, Enum)`
+> Error codes for command-level errors.
 
-**Commit ``**:
+**class** `class ErrorInfo`
+> Structured error information for JSON output.
 
-Changed files:
-- `tech_debt.py`
+**Methods:**
+- `def to_dict(self) -> dict`
+**Functions:**
+- `def create_error_response(
+    error: ErrorInfo,
+    results: Optional[list] = None,
+) -> dict`
 
-**Commit ``**:
+### __init__.py
+_Framework-specific route extractors.
 
-Changed files:
-- `parser.py`
+This package contains route extractors for different frameworks.
+Each extractor implements the RouteExtractor in_
+### spring.py
+_Spring Framework route extractor.
 
-**Commit ``**:
+Extracts REST routes from Spring controllers using annotations.
 
-Changed files:
-- `objc_association.py`
+Spring routing via annotations:
+- Controller: @Res_
+
+**class** `class SpringRouteExtractor`
+> Route extractor for Spring Framework.
+
+ Extracts REST API routes from Spring controllers by analy
+
+**Methods:**
+- `def extract_routes(self, result: ParseResult) -> list[RouteInfo]`
+- `def _get_mapping_info(self, annotation_name: str) -> tuple[str, str] | None`
+- `def _extract_path_from_annotation(self, arguments: dict | str) -> str`
+- `def _build_path(self, prefix: str, path: str) -> str`
+
+### thinkphp.py
+_ThinkPHP route extractor.
+
+Extracts routes from ThinkPHP framework controllers using convention-based routing.
+
+ThinkPHP routing convention:
+- URL: /m_
+
+**class** `class ThinkPHPRouteExtractor(RouteExtractor)`
+> Route extractor for ThinkPHP framework.
+
+ ThinkPHP uses convention-based routing where:
+ - Con
+
+**Methods:**
+- `def can_extract(self, context: ExtractionContext) -> bool`
+- `def extract_routes(self, context: ExtractionContext) -> list[RouteInfo]`
+- `def _extract_description(self, symbol) -> str`
+
+### file_classifier.py
+_Unified file size classification system (Epic 4 Story 4.2).
+
+This module provides a unified approach to file size classification,
+replacing hard-coded_
+
+**class** `class FileSizeCategory(Enum)`
+> File size categories for classification.
+
+**class** `class FileSizeAnalysis`
+> Result of file size analysis.
+
+ Attributes:
+ category: File size category (enum)
+ f
+
+**class** `class FileSizeClassifier`
+> Unified file size classifier for all modules.
+
+ This classifier provides consistent file size det
+
+**Methods:**
+- `def classify(self, parse_result: ParseResult) -> FileSizeAnalysis`
+- `def is_super_large(self, parse_result: ParseResult) -> bool`
+- `def is_large(self, parse_result: ParseResult) -> bool`
+
+### framework_detect.py
+_Framework detection and pattern extraction for PHP projects._
+
+**class** `class RouteInfo`
+> Information about a route.
+
+**class** `class ModelInfo`
+> Information about a model.
+
+**class** `class FrameworkInfo`
+> Detected framework information.
+
+**Functions:**
+- `def detect_framework(root: Path) -> FrameworkType`
+- `def extract_thinkphp_routes(
+ parse_results: list[ParseResult],
+ module_name: str,
+) -> list[RouteInfo]`
+- `def extract_thinkphp_models(
+ parse_results: list[ParseResult],
+) -> list[ModelInfo]`
+- `def analyze_thinkphp_project(
+ root: Path,
+ parse_results_by_dir: dict[Path, list[ParseResult]],
+) -> FrameworkInfo`
+- `def format_framework_info(info: FrameworkInfo, max_routes: int = 20) -> str`
+
+### hierarchical.py
+_Bottom-up hierarchical processing for codeindex._
+
+**class** `class DirectoryInfo`
+> Information about a directory in the hierarchy.
+
+**Functions:**
+- `def build_directory_hierarchy(
+ directories: List[Path],
+) -> Tuple[Dict[Path, DirectoryInfo], List[Path]]`
+- `def create_processing_batches(
+ dir_info: Dict[Path, DirectoryInfo], max_workers: int
+) -> List[List[Path]]`
+- `def process_directory_batch(
+ batch: List[Path],
+ config: Config,
+ use_fallback: bool = False,
+ quiet: bool = False,
+ timeout: int = 120,
+ root_path: Path = None,
+) -> Dict[Path, bool]`
+- `def process_normal(
+ path: Path,
+ config: Config,
+ use_fallback: bool,
+ quiet: bool,
+ timeout: int,
+ root_path: Path = None,
+) -> bool`
+- `def process_with_children(
+ path: Path, config: Config, use_fallback: bool, quiet: bool, timeout: int
+) -> bool`
+- `def scan_directories_hierarchical(
+ root: Path,
+ config: Config,
+ max_workers: int = 8,
+ use_fallback: bool = True,
+ quiet: bool = False,
+ timeout: int = 120
+) -> bool`
+- `def _handle_no_directories(quiet: bool)`
+- `def _build_and_populate_hierarchy(directories: List[Path], config: Config, quiet: bool)`
+- `def _create_and_report_batches(dir_info_map: dict, max_workers: int, quiet: bool) -> tuple[list, int]`
+- `def _process_all_batches(
+ batches: list,
+ dir_info_map: dict,
+ config: Config,
+ use_fallback: bool,
+ quiet: bool,
+ timeout: int,
+ root: Path
+) -> int`
+- `def generate_enhanced_fallback_readme(
+ dir_path: Path,
+ parse_results: list,
+ child_readmes: List[Path],
+ output_file: str = "README_AI.md"
+)`
+
+### hooks.py
+_Post-install hook for automatic CLAUDE.md updates.
+
+Epic #25, Story #26: Implement post-install hook that updates
+~/.claude/CLAUDE.md with latest code_
+
+**Functions:**
+- `def _extract_version_from_file(file_path: Path) -> Optional[str]`
+- `def _inject_core_guide(file_path: Path, version: str) -> bool`
+- `def _is_ci_environment() -> bool`
+- `def post_install_update_guide() -> None`
+
+### incremental.py
+_Incremental update logic for codeindex.
+
+This module analyzes git changes and determines which directories
+need README_AI.md updates based on configur_
+
+**class** `class UpdateLevel(Enum)`
+> Update decision levels.
+
+**class** `class FileChange`
+> Represents a changed file.
+
+**class** `class ChangeAnalysis`
+> Analysis result of git changes.
+
+**Methods:**
+- `def to_dict(self) -> dict`
+
+**Functions:**
+- `def run_git_command(args: list[str], cwd: Path | None = None) -> str`
+- `def filter_code_files(
+ changes: list[FileChange],
+ languages: list[str],
+) -> list[FileChange]`
+- `def analyze_changes(
+ config: Config,
+ since: str = "HEAD~1",
+ until: str = "HEAD",
+ cwd: Path | None = None,
+) -> ChangeAnalysis`
+- `def should_update_project_index(analysis: ChangeAnalysis, config: Config) -> bool`
+
+### init_wizard.py
+_Interactive Setup Wizard for codeindex (Epic 15 Story 15.1).
+
+This module provides an intelligent, interactive setup wizard that:
+- Auto-detects proje_
+
+**class** `class WizardResult`
+> Result of running the interactive wizard.
+
+**Functions:**
+- `def check_parser_installed(language: str) -> bool`
+- `def detect_languages(project_dir: Path, max_scan_files: int = 1000) -> List[str]`
+- `def detect_frameworks(project_dir: Path, languages: List[str]) -> List[str]`
+- `def infer_include_patterns(project_dir: Path) -> List[str]`
+- `def infer_exclude_patterns(project_dir: Path) -> List[str]`
+- `def calculate_parallel_workers(file_count: int, cpu_count: Optional[int] = None) -> int`
+- `def calculate_batch_size(file_count: int) -> int`
+- `def count_files(project_dir: Path, patterns: List[str]) -> int`
+- `def inject_claude_md(project_dir: Path) -> Path`
+- `def has_claude_md_injection(project_dir: Path) -> bool`
+- `def run_interactive_wizard(project_dir: Path) -> WizardResult`
+- `def generate_config_yaml(result: WizardResult, project_dir: Path) -> str`
+- `def create_codeindex_md(project_dir: Path) -> Path`
+
+### invoker.py
+_AI CLI invoker - calls external AI CLI tools._
+
+**class** `class InvokeResult`
+> Result of invoking AI CLI.
+
+**Functions:**
+- `def clean_ai_output(output: str) -> str`
+- `def validate_markdown_output(output: str) -> bool`
+- `def format_prompt(
+ dir_path: Path,
+ files_info: str,
+ symbols_info: str,
+ imports_info: str,
+) -> str`
+- `def invoke_ai_cli(
+ command_template: str,
+ prompt: str,
+ timeout: int = 120,
+ dry_run: bool = False,
+) -> InvokeResult`
+- `def invoke_ai_cli_stdin(
+ command: str,
+ prompt: str,
+ timeout: int = 120,
+ dry_run: bool = False,
+) -> InvokeResult`
+
+### objc_association.py
+_Objective-C header/implementation file association utilities (Story 3.2).
+
+This module provides utilities to associate .h and .m files for the same cl_
+
+**class** `class ObjCFilePair`
+> Represents an associated .h/.m file pair.
+
+ Attributes:
+ class_name: Name of the Objective
+
+**Functions:**
+- `def find_objc_pairs(directory: Path) -> list[ObjCFilePair]`
+- `def parse_objc_pair(
+ header_file: Path | None = None,
+ implementation_file: Path | None = None,
+) -> ObjCFilePair`
+- `def merge_objc_results(pair: ObjCFilePair) -> ParseResult`
+- `def calculate_association_accuracy(pairs: list[ObjCFilePair]) -> float`
+
+### parallel.py
+_Parallel processing utilities for codeindex._
+
+**class** `class BatchResult`
+> Result of processing a batch of files.
+
+**Functions:**
+- `def parse_files_parallel(
+ files: List[Path],
+ config: Config,
+ quiet: bool = False
+) -> list[ParseResult]`
+- `def scan_directories_parallel(
+ directories: List[Path],
+ config: Config,
+ quiet: bool = False
+) -> List[Path]`
+
+### parser.py
+_Multi-language AST parser using tree-sitter.
+
+Epic 13: Parser Modularization - Phase 3
+This module serves as the unified entry point for all language _
+
+**class** `class CallType(Enum)`
+> Call type enumeration (Epic 11).
+
+ Distinguishes between different types of function/method calls
+
+**class** `class Call`
+> Function/method call relationship (Epic 11).
+
+ Represents caller → callee relationships for knowl
+
+**class** `class Symbol`
+> Represents a code symbol (class, function, etc.).
+
+**class** `class Import`
+> Represents an import statement (extended for LoomGraph).
+
+ Attributes:
+ module: Module nam
+
+**class** `class Inheritance`
+> Class inheritance information for knowledge graph construction.
+
+ Represents parent-child relatio
+
+**class** `class Annotation`
+> Represents a code annotation/decorator (e.g., Java @RestController).
+
+ Story 7.1.2.1: Annotation
+
+**class** `class ParseResult`
+> Result of parsing a file (extended for LoomGraph).
+
+ Attributes:
+ path: File path
+
+
+**Methods:**
+- `def to_dict(self) -> dict`
+- `def to_dict(self) -> dict`
+- `def to_dict(self) -> dict`
+- `def to_dict(self) -> dict`
+- `def to_dict(self) -> dict`
+- `def to_dict(self) -> dict`
+
+**Functions:**
+- `def _get_parser(language: str) -> Parser | None`
+- `def parse_file(path: Path, language: str | None = None) -> ParseResult`
+
+_... and 2 more symbols_
+
+### __init__.py
+_Language-specific parser modules.
+
+This package contains modular parsers for different programming languages.
+Each language has its own parser module _
+
+### base.py
+_Base class for language parsers.
+
+This module defines the abstract interface that all language-specific parsers must implement.
+It provides a consiste_
+
+**class** `class BaseLanguageParser(ABC)`
+> Abstract base class for language-specific parsers.
+
+ All language parsers (Python, PHP, Java, etc
+
+**Methods:**
+- `def parse(self, path: Path)`
+
+### __init__.py
+_Java language parser.
+
+This module provides the JavaParser class that implements Java-specific
+symbol extraction, import resolution, inheritance extra_
+
+**class** `class JavaParser(BaseLanguageParser)`
+> Java language parser.
+
+ Extracts symbols, imports, calls, and inheritances from Java source code.
+
+**Methods:**
+- `def extract_symbols(self, tree: Tree, source_bytes: bytes) -> list`
+- `def extract_imports(self, tree: Tree, source_bytes: bytes) -> list`
+- `def extract_calls(
+ self, tree: Tree, source_bytes: bytes, symbols: list, imports: list
+ ) -> list`
+- `def extract_inheritances(self, tree: Tree, source_bytes: bytes) -> list`
+- `def parse(self, path: Path)`
+
+**Functions:**
+- `def is_java_file(path: str) -> bool`
+- `def parse_java_file(file_path: str, content: str)`
+
+### calls.py
+_Call extraction for Java parser.
+
+This module provides functions to extract method calls, constructor calls,
+and call relationships from Java source c_
+
+**Functions:**
+- `def extract_calls(
+ tree: Tree,
+ source_bytes: bytes,
+ symbols: list,
+ imports: list[Import],
+ namespace: str = "",
+ import_map: dict[str, str] | None = None
+) -> list[Call]`
+- `def _parse_method_call(
+ node: Node,
+ source_bytes: bytes,
+ caller: str,
+ import_map: dict[str, str],
+ static_import_map: dict[str, str],
+ namespace: str,
+ parent_map: dict[str, str]
+) -> Optional[Call]`
+- `def _parse_constructor_call(
+ node: Node,
+ source_bytes: bytes,
+ caller: str,
+ import_map: dict[str, str],
+ namespace: str
+) -> Optional[Call]`
+- `def _extract_calls_recursive(
+ node: Node,
+ source_bytes: bytes,
+ caller: str,
+ import_map: dict[str, str],
+ static_import_map: dict[str, str],
+ namespace: str,
+ parent_map: dict[str, str]
+) -> list[Call]`
+- `def _extract_calls_from_tree(
+ tree: Tree,
+ source_bytes: bytes,
+ inheritances: list[Inheritance],
+ namespace: str,
+ import_map: dict[str, str]
+) -> list[Call]`
+
+### imports.py
+_Import extraction for Java parser.
+
+This module provides functions to extract import statements and build import
+mappings from Java source code using _
+
+**Functions:**
+- `def extract_imports(tree: Tree, source_bytes: bytes) -> list[Import]`
+- `def build_import_map(root: Node, source_bytes: bytes) -> dict[str, str]`
+- `def build_static_import_map(root: Node, source_bytes: bytes) -> dict[str, str]`
+- `def resolve_static_import(
+ method_name: str,
+ static_import_map: dict[str, str]
+) -> str | None`
+- `def _parse_java_import(node: Node, source_bytes: bytes) -> Import | None`
+
+### inheritance.py
+_Inheritance extraction for Java parser.
+
+This module provides functions to extract class and interface inheritance
+relationships from Java source code_
+
+**Functions:**
+- `def extract_inheritances(
+ tree: Tree,
+ source_bytes: bytes,
+ namespace: str = "",
+ import_map: dict[str, str] | None = None
+) -> list[Inheritance]`
+- `def extract_package(tree: Tree, source_bytes: bytes) -> str`
+
+### symbols.py
+_Symbol extraction for Java parser.
+
+This module provides functions to extract symbols (classes, interfaces, enums,
+records, methods, fields, construct_
+
+**Functions:**
+- `def extract_symbols(
+ tree: Tree,
+ source_bytes: bytes,
+ namespace: str = "",
+ import_map: Optional[dict[str, str]] = None,
+ inheritances: Optional[list[Inheritance]] = None
+) -> list[Symbol]`
+- `def extract_module_docstring(tree: Tree, source_bytes: bytes) -> str`
+- `def _strip_generic_type(type_name: str) -> str`
+- `def _extract_package_namespace(class_full_name: str) -> str`
+- `def _resolve_java_type(
+ short_name: str,
+ namespace: str,
+ import_map: dict[str, str]
+) -> str`
+- `def _extract_java_modifiers(node: Node, source_bytes: bytes) -> list[str]`
+- `def _build_java_signature(modifiers: list[str], *parts: str) -> str`
+- `def _extract_java_annotations(node: Node, source_bytes: bytes) -> list[Annotation]`
+- `def _parse_annotation_arguments(arg_list_node: Node, source_bytes: bytes) -> dict[str, str]`
+- `def _find_child_by_type(node: Node, type_name: str) -> Node | None`
+- `def _extract_java_docstring(node: Node, source_bytes: bytes) -> str`
+- `def _parse_java_method(node: Node, source_bytes: bytes, class_name: str = "") -> Symbol`
+- `def _parse_java_constructor(node: Node, source_bytes: bytes, class_name: str) -> Symbol`
+- `def _parse_java_field(node: Node, source_bytes: bytes, class_name: str = "") -> list[Symbol]`
+- `def _extract_java_inheritances(
+ node: Node,
+ source_bytes: bytes,
+ child_name: str,
+ package_namespace: str,
+ import_map: dict[str, str]
+) -> list[Inheritance]`
+
+_... and 5 more symbols_
+
+### java_parser.py
+_Java language parser (backward compatibility shim).
+
+DEPRECATED: This module exists for backward compatibility only.
+New code should import from codei_
+
+### __init__.py
+_Objective-C language parser.
+
+This module provides the ObjCParser class that implements Objective-C-specific
+symbol extraction, import resolution, inh_
+
+**class** `class ObjCParser(BaseLanguageParser)`
+> Objective-C language parser.
+
+ Extracts symbols, imports, calls, and inheritances from Objective-
+
+**Methods:**
+- `def _preprocess_source(self, source_bytes: bytes) -> bytes`
+- `def parse(self, path: Path)`
+- `def extract_symbols(self, tree: Tree, source_bytes: bytes) -> list`
+- `def extract_imports(self, tree: Tree, source_bytes: bytes) -> list`
+- `def extract_calls(
+ self, tree: Tree, source_bytes: bytes, symbols: list, imports: list
+ ) -> list`
+- `def extract_inheritances(self, tree: Tree, source_bytes: bytes) -> list`
+
+### calls.py
+_Call extraction for Objective-C parser.
+
+This module provides functions to extract function/method calls from Objective-C code.
+
+Note: Call graph extr_
+
+**Functions:**
+- `def extract_calls(
+ tree: Tree, source_bytes: bytes, symbols: list, imports: list
+) -> list`
+
+### imports.py
+_Import extraction for Objective-C parser.
+
+This module provides functions to extract import statements from Objective-C:
+- #import "LocalFile.h" (loca_
+
+**Functions:**
+- `def extract_imports(tree: Tree, source_bytes: bytes) -> list`
+- `def _extract_import(node, source_bytes: bytes) -> Import | None`
+
+### inheritance.py
+_Inheritance extraction for Objective-C parser.
+
+This module provides functions to extract inheritance relationships:
+- Class inheritance (superclass)
+_
+
+**Functions:**
+- `def extract_inheritances(tree: Tree, source_bytes: bytes) -> list`
+- `def _extract_interface_inheritance(node, source_bytes: bytes) -> list[Inheritance]`
+- `def _extract_protocol_inheritance(node, source_bytes: bytes) -> list[Inheritance]`
+
+### symbols.py
+_Symbol extraction for Objective-C parser.
+
+This module provides functions to extract symbols from Objective-C source code:
+- @interface declarations (_
+
+**Functions:**
+- `def extract_symbols(tree: Tree, source_bytes: bytes) -> list`
+- `def _extract_interface(node, source_bytes: bytes) -> list[Symbol]`
+- `def _extract_implementation(node, source_bytes: bytes) -> list[Symbol]`
+- `def _extract_protocol(node, source_bytes: bytes) -> list[Symbol]`
+- `def _extract_declarations(
+ decl_list_node, source_bytes: bytes, class_name: str
+) -> list[Symbol]`
+- `def _extract_method(
+ node, source_bytes: bytes, class_name: str
+) -> Symbol | None`
+- `def _extract_property(
+ node, source_bytes: bytes, class_name: str
+) -> Symbol | None`
+- `def _build_interface_signature(
+ node, source_bytes: bytes, class_name: str
+) -> str`
+- `def _build_method_signature(
+ node, source_bytes: bytes, is_class_method: bool
+) -> str`
+
+### __init__.py
+_PHP language parser.
+
+This module provides the PhpParser class that implements PHP-specific
+symbol extraction, import resolution, inheritance extracti_
+
+**class** `class PhpParser(BaseLanguageParser)`
+> PHP language parser.
+
+ Extracts symbols, imports, calls, and inheritances from PHP source code.
+
+
+**Methods:**
+- `def extract_symbols(self, tree: Tree, source_bytes: bytes) -> list`
+- `def extract_imports(self, tree: Tree, source_bytes: bytes) -> list`
+- `def extract_calls(
+ self, tree: Tree, source_bytes: bytes, symbols: list, imports: list
+ ) -> list`
+- `def extract_inheritances(self, tree: Tree, source_bytes: bytes) -> list`
+- `def parse(self, path)`
+- `def _parse_namespace(self, node, source_bytes: bytes) -> str`
+
+### calls.py
+_Call extraction for PHP parser.
+
+This module provides functions to extract function/method call relationships
+from PHP source code using tree-sitter._
+
+**Functions:**
+- `def extract_calls(
+ tree: Tree, source_bytes: bytes, symbols: list, imports: list
+) -> list`
+- `def _parse_namespace(node, source_bytes: bytes) -> str`
+- `def _parse_use_for_map(node, source_bytes: bytes) -> list[Import]`
+- `def _extract_class_inheritances(
+ node,
+ source_bytes: bytes,
+ namespace: str,
+ use_map: dict[str, str],
+ inheritances: list
+) -> None`
+- `def _extract_calls_from_tree(
+ tree,
+ source_bytes: bytes,
+ imports: list[Import],
+ inheritances: list,
+ namespace: str,
+ use_map: dict[str, str]
+) -> list[Call]`
+- `def _extract_calls_from_node(
+ node: Node,
+ source_bytes: bytes,
+ caller: str,
+ use_map: dict[str, str],
+ namespace: str,
+ parent_map: dict[str, str],
+ current_class: str
+) -> list[Call]`
+- `def _parse_function_call(
+ node: Node,
+ source_bytes: bytes,
+ caller: str,
+ use_map: dict[str, str],
+ namespace: str
+) -> Optional[Call]`
+- `def _parse_member_call(
+ node: Node,
+ source_bytes: bytes,
+ caller: str,
+ use_map: dict[str, str],
+ namespace: str,
+ current_class: str
+) -> Optional[Call]`
+- `def _parse_scoped_call(
+ node: Node,
+ source_bytes: bytes,
+ caller: str,
+ use_map: dict[str, str],
+ namespace: str,
+ parent_map: dict[str, str],
+ current_class: str
+) -> Optional[Call]`
+- `def _parse_object_creation(
+ node: Node,
+ source_bytes: bytes,
+ caller: str,
+ use_map: dict[str, str],
+ namespace: str
+) -> Optional[Call]`
+
+### imports.py
+_Import extraction for PHP parser.
+
+This module provides functions to extract import/use statements and
+require/include statements from PHP source code_
+
+**Functions:**
+- `def extract_imports(tree: Tree, source_bytes: bytes) -> list`
+- `def _parse_use(node, source_bytes: bytes) -> list[Import]`
+- `def _parse_include(node, source_bytes: bytes) -> Import | None`
+
+### inheritance.py
+_Inheritance extraction for PHP parser.
+
+This module provides functions to extract class inheritance relationships
+(extends and implements) from PHP so_
+
+**Functions:**
+- `def extract_inheritances(tree: Tree, source_bytes: bytes) -> list`
+- `def _parse_namespace(node, source_bytes: bytes) -> str`
+- `def _parse_use_for_map(node, source_bytes: bytes) -> list[Import]`
+- `def _parse_class_inheritances(
+ node,
+ source_bytes: bytes,
+ namespace: str,
+ use_map: dict[str, str],
+ inheritances: list[Inheritance]
+) -> None`
+
+### symbols.py
+_Symbol extraction for PHP parser.
+
+This module provides functions to extract symbols (classes, functions, methods, properties)
+from PHP source code us_
+
+**Functions:**
+- `def extract_symbols(tree: Tree, source_bytes: bytes) -> list`
+- `def _extract_docstring(node, source_bytes: bytes) -> str`
+- `def _parse_phpdoc_text(text: str) -> str`
+- `def _parse_function(node, source_bytes: bytes, class_name: str = "") -> Symbol`
+- `def _parse_method(node, source_bytes: bytes, class_name: str) -> Symbol`
+- `def _parse_property(node, source_bytes: bytes, class_name: str) -> Symbol`
+- `def _parse_class(
+ node,
+ source_bytes: bytes,
+ namespace: str = "",
+ use_map: dict[str, str] | None = None,
+ inheritances: list[Inheritance] | None = None
+) -> list[Symbol]`
+- `def _parse_namespace(node, source_bytes: bytes) -> str`
+- `def _parse_use(node, source_bytes: bytes) -> list`
+
+### __init__.py
+_Python language parser.
+
+This module provides the PythonParser class that implements Python-specific
+symbol extraction, import resolution, and call re_
+
+**class** `class PythonParser(BaseLanguageParser)`
+> Python language parser.
+
+ Extracts symbols (classes, functions, methods), imports, inheritances,
+
+
+**Methods:**
+- `def extract_symbols(self, tree: Tree, source_bytes: bytes) -> list`
+- `def extract_imports(self, tree: Tree, source_bytes: bytes) -> list`
+- `def extract_calls(
+ self, tree: Tree, source_bytes: bytes, symbols: list, imports: list
+ ) -> list`
+- `def extract_inheritances(self, tree: Tree, source_bytes: bytes) -> list`
+- `def parse(self, path: Path)`
+
+### calls.py
+_Call relationship extraction for Python parser.
+
+This module provides functions to extract function/method call relationships
+from Python source code _
+
+**Functions:**
+- `def extract_calls(
+ tree: Tree, source_bytes: bytes, symbols: list, imports: list, inheritances: list
+) -> list`
+- `def _build_alias_map(imports: list[Import]) -> dict[str, str]`
+- `def _resolve_alias(callee: str, alias_map: dict[str, str]) -> str`
+- `def _determine_python_call_type(func_node: Node, source_bytes: bytes) -> CallType`
+- `def _extract_call_name(func_node: Node, source_bytes: bytes) -> str`
+- `def _parse_python_call(
+ node: Node,
+ source_bytes: bytes,
+ caller: str,
+ alias_map: dict[str, str],
+ parent_map: dict[str, str],
+) -> Optional[Call]`
+- `def _extract_python_calls(
+ node: Node,
+ source_bytes: bytes,
+ context: str,
+ alias_map: dict[str, str],
+ parent_map: dict[str, str],
+) -> list[Call]`
+- `def _is_simple_decorator(decorator_node: Node) -> bool`
+- `def _extract_decorator_name(decorator_node: Node, source_bytes: bytes) -> str`
+- `def _extract_decorator_calls(
+ node: Node, source_bytes: bytes, context: str
+) -> list[Call]`
+- `def _extract_python_calls_from_tree(
+ tree: Tree,
+ source_bytes: bytes,
+ imports: list[Import],
+ inheritances: list[Inheritance],
+) -> list[Call]`
+
+### imports.py
+_Import extraction for Python parser.
+
+This module provides functions to extract import statements from Python source code
+using tree-sitter._
+
+**Functions:**
+- `def extract_imports(tree: Tree, source_bytes: bytes) -> list`
+- `def _parse_import(node: Node, source_bytes: bytes) -> list[Import]`
+
+### inheritance.py
+_Inheritance extraction for Python parser.
+
+This module provides functions to extract class inheritance relationships
+from Python source code using tre_
+
+**Functions:**
+- `def extract_inheritances(tree: Tree, source_bytes: bytes) -> list`
+- `def _extract_class_inheritances(
+ node: Node,
+ source_bytes: bytes,
+ parent_class: str,
+ inheritances: list[Inheritance],
+) -> None`
+
+### symbols.py
+_Symbol extraction for Python parser.
+
+This module provides functions to extract symbols (classes, functions, methods)
+from Python source code using tr_
+
+**Functions:**
+- `def extract_symbols(tree: Tree, source_bytes: bytes) -> list`
+- `def extract_module_docstring(tree: Tree, source_bytes: bytes) -> str`
+- `def _extract_docstring(node: Node, source_bytes: bytes) -> str`
+- `def _parse_function(
+ node: Node,
+ source_bytes: bytes,
+ class_name: str = "",
+ decorators: list[str] | None = None,
+) -> Symbol`
+- `def _parse_class(
+ node: Node,
+ source_bytes: bytes,
+ parent_class: str = "",
+ inheritances: list[Inheritance] | None = None,
+) -> list[Symbol]`
+
+### __init__.py
+_Swift language parser.
+
+This module provides the SwiftParser class that implements Swift-specific
+symbol extraction, import resolution, and inheritanc_
+
+**class** `class SwiftParser(BaseLanguageParser)`
+> Swift language parser.
+
+ Extracts symbols (classes, structs, enums, protocols, functions, methods
+
+**Methods:**
+- `def extract_symbols(self, tree: Tree, source_bytes: bytes) -> list`
+- `def extract_imports(self, tree: Tree, source_bytes: bytes) -> list`
+- `def extract_calls(
+ self, tree: Tree, source_bytes: bytes, symbols: list, imports: list
+ ) -> list`
+- `def extract_inheritances(self, tree: Tree, source_bytes: bytes) -> list`
+
+### calls.py
+_Call extraction for Swift parser.
+
+This module provides functions to extract call relationships from Swift source code.
+
+TODO: Phase 3 implementation _
+
+**Functions:**
+- `def extract_calls(
+ tree: Tree, source_bytes: bytes, symbols: list, imports: list
+) -> list`
+
+### imports.py
+_Import extraction for Swift parser.
+
+This module provides functions to extract import statements from Swift source code._
+
+**Functions:**
+- `def extract_imports(tree: Tree, source_bytes: bytes) -> list`
+
+### inheritance.py
+_Inheritance extraction for Swift parser.
+
+This module provides functions to extract inheritance relationships from Swift
+source code, including class _
+
+**Functions:**
+- `def extract_inheritances(tree: Tree, source_bytes: bytes) -> list`
+- `def _extract_type_inheritances(node: Node, source_bytes: bytes) -> list`
+- `def _extract_parent_type_from_specifier(node: Node, source_bytes: bytes) -> str | None`
+- `def _extract_extension_inheritances(node: Node, source_bytes: bytes) -> list`
+- `def _extract_parent_types(node: Node, source_bytes: bytes) -> list[str]`
+
+### symbols.py
+_Symbol extraction for Swift parser.
+
+This module provides functions to extract symbols (classes, structs, enums,
+protocols, functions, methods, proper_
+
+**Functions:**
+- `def extract_symbols(tree: Tree, source_bytes: bytes) -> list`
+- `def _build_docstring_map(node: Node, source_bytes: bytes) -> dict[int, str]`
+- `def _is_doc_comment(comment_text: str) -> bool`
+- `def _clean_docstring(comment_text: str) -> str`
+- `def _extract_generic_parameters(node: Node) -> str`
+- `def _extract_access_modifier(node: Node) -> str`
+- `def _extract_class(node: Node, source_bytes: bytes, docstring: str = "") -> list[Symbol]`
+- `def _extract_struct(node: Node, source_bytes: bytes, docstring: str = "") -> list[Symbol]`
+- `def _extract_enum(node: Node, source_bytes: bytes, docstring: str = "") -> list[Symbol]`
+- `def _extract_protocol(node: Node, source_bytes: bytes, docstring: str = "") -> list[Symbol]`
+- `def _extract_extension(
+ node: Node, source_bytes: bytes, docstring: str = ""
+) -> list[Symbol]`
+- `def _extract_function(
+ node: Node, source_bytes: bytes, docstring: str = ""
+) -> Symbol | None`
+- `def _extract_class_methods(
+ class_name: str, body_node: Node, source_bytes: bytes
+) -> list[Symbol]`
+- `def _extract_property(
+ class_name: str, node: Node, source_bytes: bytes
+) -> Symbol | None`
+- `def _find_property_name(node: Node) -> str | None`
+
+_... and 2 more symbols_
+
+### __init__.py
+_TypeScript/JavaScript language parser.
+
+This module provides parsing for TypeScript (.ts), TSX (.tsx),
+JavaScript (.js), and JSX (.jsx) files using tr_
+
+**class** `class TypeScriptParser(BaseLanguageParser)`
+> TypeScript/JavaScript language parser.
+
+ Handles .ts, .tsx, .js, .jsx files using the appropriate
+
+**Methods:**
+- `def extract_symbols(self, tree: Tree, source_bytes: bytes) -> list`
+- `def extract_imports(self, tree: Tree, source_bytes: bytes) -> list`
+- `def extract_calls(
+ self, tree: Tree, source_bytes: bytes, symbols: list, imports: list
+ ) -> list`
+- `def extract_inheritances(self, tree: Tree, source_bytes: bytes) -> list`
+- `def parse(self, path: Path)`
+
+**Functions:**
+- `def is_typescript_file(path: str) -> bool`
+
+### calls.py
+_Call extraction for TypeScript/JavaScript parser.
+
+This module provides functions to extract function and method call relationships
+from TypeScript/Ja_
+
+**Functions:**
+- `def extract_calls(
+ tree: Tree, source_bytes: bytes, symbols: list, imports: list
+) -> list`
+- `def _extract_calls_from_node(
+ node: Node, source_bytes: bytes, caller: str, calls: list[Call]
+)`
+- `def _parse_call_expression(
+ node: Node, source_bytes: bytes, caller: str
+) -> Call | None`
+- `def _parse_new_expression(
+ node: Node, source_bytes: bytes, caller: str
+) -> Call | None`
+
+### imports.py
+_Import extraction for TypeScript/JavaScript parser.
+
+This module provides functions to extract import/export statements
+from TypeScript/JavaScript sou_
+
+**Functions:**
+- `def extract_imports(tree: Tree, source_bytes: bytes) -> list`
+- `def _parse_import_statement(
+ node: Node, source_bytes: bytes
+) -> list[Import]`
+- `def _parse_export_as_import(
+ node: Node, source_bytes: bytes
+) -> list[Import]`
+- `def _parse_require(
+ node: Node, source_bytes: bytes
+) -> Import | None`
+- `def _extract_string_content(node: Node, source_bytes: bytes) -> str`
+
+### inheritance.py
+_Inheritance extraction for TypeScript/JavaScript parser.
+
+This module provides functions to extract class and interface inheritance
+relationships from_
+
+**Functions:**
+- `def extract_inheritances(tree: Tree, source_bytes: bytes) -> list`
+- `def _extract_inheritances_from_node(
+ node: Node, source_bytes: bytes, inheritances: list[Inheritance]
+)`
+- `def _parse_class_heritage(
+ node: Node, source_bytes: bytes
+) -> tuple[str, list[str]]`
+- `def _strip_generic_type(type_name: str) -> str`
+
+### symbols.py
+_Symbol extraction for TypeScript/JavaScript parser.
+
+This module provides functions to extract symbols (classes, functions, methods,
+interfaces, enums_
+
+**Functions:**
+- `def extract_symbols(tree: Tree, source_bytes: bytes) -> list`
+- `def extract_module_docstring(tree: Tree, source_bytes: bytes) -> str`
+- `def _extract_node_symbols(
+ node: Node, source_bytes: bytes, class_name: str = ""
+) -> list[Symbol]`
+- `def _parse_function_declaration(
+ node: Node, source_bytes: bytes
+) -> Symbol | None`
+- `def _parse_class_declaration(
+ node: Node, source_bytes: bytes, abstract: bool = False
+) -> list[Symbol]`
+- `def _parse_class_heritage(
+ node: Node, source_bytes: bytes
+) -> tuple[str, list[str]]`
+- `def _parse_class_body(
+ node: Node, source_bytes: bytes, class_name: str
+) -> list[Symbol]`
+- `def _parse_method_definition(
+ node: Node, source_bytes: bytes, class_name: str
+) -> Symbol | None`
+- `def _parse_field_definition(
+ node: Node, source_bytes: bytes, class_name: str
+) -> Symbol | None`
+- `def _parse_interface_declaration(
+ node: Node, source_bytes: bytes
+) -> Symbol | None`
+- `def _parse_enum_declaration(
+ node: Node, source_bytes: bytes
+) -> Symbol | None`
+- `def _parse_type_alias(
+ node: Node, source_bytes: bytes
+) -> Symbol | None`
+- `def _parse_lexical_declaration(
+ node: Node, source_bytes: bytes
+) -> list[Symbol]`
+- `def _parse_variable_declarator(
+ node: Node, source_bytes: bytes, decl_keyword: str
+) -> Symbol | None`
+- `def _parse_namespace(
+ node: Node, source_bytes: bytes
+) -> Symbol | None`
+
+_... and 2 more symbols_
+
+### utils.py
+_Utility functions for language parsers.
+
+This module contains helper functions that are shared across multiple language parsers._
+
+**Functions:**
+- `def count_arguments(args_node: Node) -> Optional[int]`
+
+### route_extractor.py
+_Route extraction framework for multi-framework support.
+
+This module provides the abstract base class for route extractors and
+the extraction context _
+
+**class** `class ExtractionContext`
+> Context for route extraction.
+
+ Provides all necessary information for a route extractor to analy
+
+**class** `class RouteExtractor(ABC)`
+> Abstract base class for framework-specific route extractors.
+
+ Each framework (ThinkPHP, Laravel,
+
+### route_registry.py
+_Route extractor registry for framework-agnostic route extraction.
+
+This module provides a registry to store and retrieve route extractors
+for differen_
+
+**class** `class RouteExtractorRegistry`
+> Registry for route extractors.
+
+ Stores and manages route extractors for different frameworks.
+
+
+**Methods:**
+- `def register(self, extractor: RouteExtractor) -> None`
+- `def has_extractor(self, framework: str) -> bool`
+- `def list_frameworks(self) -> list[str]`
+
+### scanner.py
+_Directory scanner for codeindex._
+
+**class** `class ScanResult`
+> Result of scanning a directory.
+
+**Functions:**
+- `def is_pass_through(dir_path: Path, config: Config) -> bool`
+- `def should_exclude(path: Path, exclude_patterns: list[str], base_path: Path) -> bool`
+- `def scan_directory(
+ path: Path,
+ config: Config,
+ base_path: Path | None = None,
+ recursive: bool = True
+) -> ScanResult`
+- `def find_all_directories(root: Path, config: Config) -> list[Path]`
+
+### semantic_extractor.py
+_Business Semantic Extractor
+
+Story 4.4: Extract business semantics from directory structure
+Task 4.4.5: KISS Universal Description Generator
+
+This mod_
+
+**class** `class DirectoryContext`
+> Context information about a directory
+
+ Used to collect information for semantic extraction.
+
+**class** `class BusinessSemantic`
+> Business semantic information
+
+ Extracted description of what a directory does.
+
+**class** `class SimpleDescriptionGenerator`
+> Universal description generator: zero assumptions, zero semantic understanding
+ Only extracts ob
-**Commit ``**:
+**class** `class SemanticExtractor`
+> Extract business semantics from directory context
-Changed files:
-- `cli_tech_debt.py`
+ Supports two modes:
+ - Heuristic mode: KIS
+**Methods:**
+- `def generate(self, context: DirectoryContext) -> str`
+- `def _extract_path_context(self, path: str) -> str`
+- `def _analyze_symbol_pattern(self, symbols: List[str]) -> str`
+- `def _pluralize(self, suffix: str) -> str`
+- `def _extract_entity_names(self, symbols: List[str]) -> List[str]`
+- `def extract_directory_semantic(
+ self,
+ context: DirectoryContext
+ ) -> BusinessSemantic`
+- `def _heuristic_extract(self, context: DirectoryContext) -> BusinessSemantic`
+- `def _ai_extract(self, context: DirectoryContext) -> BusinessSemantic`
+- `def _build_ai_prompt(self, context: DirectoryContext) -> str`
+- `def _parse_ai_response(self, response: str) -> BusinessSemantic`
-**Commit ``**:
+### skill_helpers.py
+_Skill helpers for codeindex-update-guide.
-Changed files:
-- `tech_debt.py`
+Epic #25, Story #27: Provides helper functions for the
+/codeindex-update-guide skill to analyze projects, g_
+**Functions:**
+- `def detect_project_languages(project_path: Path) -> list[str]`
+- `def detect_codeindex_config(project_path: Path) -> Optional[dict]`
+- `def detect_loomgraph_integration(project_path: Path) -> bool`
+- `def generate_version_diff(old_version: str, new_version: str) -> str`
+- `def generate_language_table_diff(old_languages: list[str], new_languages: list[str]) -> str`
+- `def generate_suggestions(profile: dict, current_version: str) -> list[str]`
+- `def apply_updates(
+ file_path: Path,
+ updates: list[dict],
+ select_all: bool = False,
+ selected_indices: Optional[list[int]] = None,
+) -> bool`
+- `def create_backup(file_path: Path) -> Optional[Path]`
+- `def rollback_from_backup(file_path: Path, backup_path: Path) -> bool`
-**Commit ``**:
+### smart_writer.py
+_Smart README writer — backward-compatibility shim.
-Changed files:
-- `cli.py`
-- `cli_tech_debt.py`
-- `tech_debt_formatters.py`
-- `test_smells.py`
+All implementation has moved to the writers/ package.
+This module re-exports the public API so exi_
+### symbol_index.py
+_Global symbol index generator for PROJECT_SYMBOLS.md._
-**Commit ``**:
+**class** `class SymbolEntry`
+> A symbol entry in the global index.
-Changed files:
-- `cli_tech_debt.py`
-- `tech_debt_formatters.py`
+**class** `class GlobalSymbolIndex`
+> Generates a global symbol index (PROJECT_SYMBOLS.md) for a project.
+ Collects all classes, funct
-**Commit ``**:
+**Methods:**
+- `def collect_symbols(self, quiet: bool = False) -> dict`
+- `def generate_index(self, output_file: str = "PROJECT_SYMBOLS.md") -> Path`
+- `def _group_by_type(self) -> dict[str, list[SymbolEntry]]`
-Changed files:
-- `cli_scan.py`
+### symbol_scorer.py
+_Symbol importance scoring system.
+This module provides functionality to score symbols based on their importance,
+helping to prioritize which symbols _
-**Commit ``**:
+**class** `class ScoringContext`
+> Scoring context for symbols.
-Changed files:
-- `cli_scan.py`
+ Attributes:
+ framework: The framework being used (e.g., 'th
+**class** `class SymbolImportanceScorer`
+> Score symbols by importance for inclusion in documentation.
-**Commit ``**:
+ This scorer evaluates symbols acros
-Changed files:
-- `parser.py`
+**Methods:**
+- `def _score_visibility(self, symbol: Symbol) -> float`
+- `def _score_semantics(self, symbol: Symbol) -> float`
+- `def _score_documentation(self, symbol: Symbol) -> float`
+- `def _score_complexity(self, symbol: Symbol) -> float`
+- `def _score_naming_pattern(self, symbol: Symbol) -> float`
+- `def score(self, symbol: Symbol) -> float`
+### tech_debt.py
+_Technical debt detection for code analysis.
-**Commit ``**:
+This module provides tools to detect and analyze technical debt in codebases,
+including file size issues,_
-Changed files:
-- `cli_tech_debt.py`
-- `directory_tree.py`
-- `hierarchical.py`
-- `tech_debt.py`
+**class** `class DebtSeverity(IntEnum)`
+> Severity levels for technical debt issues.
+ Lower values indicate higher severity (CRITICAL is m
-**Commit ``**:
+**class** `class DebtIssue`
+> Represents a technical debt issue detected in code.
-Changed files:
-- `hooks.py`
+ Attributes:
+ severity: The severity
+**class** `class DebtAnalysisResult`
+> Result of analyzing a file for technical debt.
-**Commit ``**:
+ Attributes:
+ issues: List of detected tec
-Changed files:
-- `hooks.py`
+**class** `class SymbolOverloadAnalysis`
+> Analysis result of symbol overload detection.
+ Attributes:
+ total_symbols: Total number o
-**Commit ``**:
+**class** `class FileReport`
+> Report for a single file's technical debt analysis.
-Changed files:
-- `skill_helpers.py`
+ Attributes:
+
---
-
-## Recent Changes
-
-**Commit ``**:
-
-Changed files:
-- `skill_helpers.py`
+_Content truncated due to size limit. See individual module README files for details._
diff --git a/src/codeindex/cli_hooks.py b/src/codeindex/cli_hooks.py
index 50d3601..41371f1 100644
--- a/src/codeindex/cli_hooks.py
+++ b/src/codeindex/cli_hooks.py
@@ -9,7 +9,10 @@
- Detect and merge with existing hooks
"""
+import json
+import logging
import shutil
+import subprocess
from datetime import datetime
from enum import Enum
from pathlib import Path
@@ -19,6 +22,8 @@
from .cli_common import console
+logger = logging.getLogger(__name__)
+
class HookStatus(Enum):
"""Status of a Git hook."""
@@ -349,21 +354,14 @@ def _generate_post_commit_script(config: dict) -> str: # noqa: E501
return """#!/bin/zsh
# codeindex-managed hook
# Post-commit hook for codeindex
-# Smart incremental update based on change analysis
-
-# Colors
-RED='\\033[0;31m'
-GREEN='\\033[0;32m'
-YELLOW='\\033[0;33m'
-CYAN='\\033[0;36m'
-NC='\\033[0m'
+# Thin wrapper — all logic in Python (auto-updated via pip)
# Avoid infinite loop: skip if last commit only contains README_AI.md
LAST_COMMIT_FILES=$(git diff-tree --no-commit-id --name-only -r HEAD)
NON_DOC_FILES=$(echo "$LAST_COMMIT_FILES" | \\
grep -v "README_AI.md" | grep -v "PROJECT_INDEX.md" || true)
if [ -z "$NON_DOC_FILES" ]; then
- exit 0 # Only doc files changed, skip to avoid loop
+ exit 0
fi
# Set up working directory
@@ -377,100 +375,8 @@ def _generate_post_commit_script(config: dict) -> str: # noqa: E501
source "$REPO_ROOT/venv/bin/activate"
fi
-echo "\\n${CYAN}📝 Post-commit: Analyzing changes...${NC}"
-
-# Check if codeindex is available
-if ! command -v codeindex &> /dev/null; then
- echo "${YELLOW}⚠ codeindex not found, skipping auto-update${NC}"
- exit 0
-fi
-
-# Get change analysis as JSON
-ANALYSIS=$(codeindex affected --json 2>/dev/null || echo '{"level": "skip"}')
-
-# Extract level from JSON
-LEVEL=$(echo "$ANALYSIS" | python3 -c \\
- "import sys, json; print(json.load(sys.stdin).get('level', 'skip'))" \\
- 2>/dev/null || echo "skip")
-
-if [ "$LEVEL" = "skip" ]; then
- echo "${GREEN}✓ Changes below threshold, skipping update${NC}"
- exit 0
-fi
-
-echo " Update level: ${YELLOW}${LEVEL}${NC}"
-
-# Get affected directories
-AFFECTED_DIRS=$(echo "$ANALYSIS" | python3 -c "
-import sys, json
-data = json.load(sys.stdin)
-for d in data.get('affected_dirs', []):
- print(d)
-" 2>/dev/null || true)
-
-if [ -z "$AFFECTED_DIRS" ]; then
- echo "${GREEN}✓ No directories need updating${NC}"
- exit 0
-fi
-
-DIR_COUNT=$(echo "$AFFECTED_DIRS" | wc -l | tr -d ' ')
-echo " Found ${DIR_COUNT} directory(ies) to update"
-
-# Update README_AI.md for each affected directory
-UPDATED_LIST=/tmp/codeindex_updated_$$
-rm -f "$UPDATED_LIST"
-
-while IFS= read -r dir; do
- [ -z "$dir" ] && continue
-
- README_PATH="$REPO_ROOT/$dir/README_AI.md"
-
- # Skip if no README_AI.md exists (not indexed yet)
- if [ ! -f "$README_PATH" ]; then
- echo " ${YELLOW}⚠ $dir: not indexed (run 'codeindex scan $dir' first)${NC}"
- continue
- fi
-
- echo " ${CYAN}→ Updating $dir/README_AI.md${NC}"
- if codeindex scan "$dir" 2>&1 | tail -1; then
- echo " ${GREEN}✓ $dir: updated${NC}"
- echo "$README_PATH" >> "$UPDATED_LIST"
- else
- echo " ${YELLOW}⚠ $dir: scan failed, skipping${NC}"
- fi
-done <<< "$AFFECTED_DIRS"
-
-# Auto-commit updated README_AI.md files
-if [ -f "$UPDATED_LIST" ]; then
- UPDATED_COUNT=$(wc -l < "$UPDATED_LIST" | tr -d ' ')
-
- if [ "$UPDATED_COUNT" -gt 0 ]; then
- # Stage updated files
- while IFS= read -r readme; do
- git add "$readme"
- done < "$UPDATED_LIST"
-
- # Check if there are actual staged changes
- if git diff --cached --quiet; then
- echo " ${GREEN}✓ README_AI.md already up to date${NC}"
- else
- echo "\\n${CYAN}→ Committing ${UPDATED_COUNT} updated README_AI.md file(s)...${NC}"
-
- COMMIT_HASH=$(git rev-parse --short HEAD)
- git commit --no-verify -m "docs: auto-update README_AI.md for ${COMMIT_HASH}
-
-Updated by post-commit hook based on code changes.
-Update level: ${LEVEL}"
-
- echo "${GREEN}✓ README_AI.md updates committed${NC}"
- fi
- fi
-
- rm -f "$UPDATED_LIST"
-fi
-
-echo "\\n${GREEN}✓ Post-commit hook completed${NC}\\n"
-exit 0
+# Delegate to Python (upgradeable via pip)
+codeindex hooks run post-commit 2>/dev/null || true
"""
@@ -575,6 +481,87 @@ def uninstall_hook(hook_name: str, repo_path: Optional[Path] = None) -> bool:
return manager.uninstall_hook(hook_name, restore_backup=True)
+def run_post_commit_hook() -> int:
+    """Execute post-commit hook logic in Python.
+
+    This is called by the thin wrapper shell script via
+    `codeindex hooks run post-commit`. All logic lives here so that
+    `pip install --upgrade` automatically updates the behavior.
+
+    Returns:
+        Exit code (0 = success)
+    """
+    # Step 1: Get affected directories
+    try:
+        result = subprocess.run(
+            ["codeindex", "affected", "--json"],
+            capture_output=True, text=True, timeout=30,
+        )
+        if result.returncode != 0:
+            return 0  # Silently skip on error
+
+        analysis = json.loads(result.stdout)
+    except (subprocess.TimeoutExpired, json.JSONDecodeError, FileNotFoundError):
+        return 0
+
+    level = analysis.get("level", "skip")
+    affected_dirs = analysis.get("affected_dirs", [])
+
+    if level == "skip" or not affected_dirs:
+        return 0
+
+    # Step 2: Run codeindex scan for each affected directory
+    repo_root = Path.cwd()
+    updated_readmes: list[str] = []
+
+    for dir_path in affected_dirs:
+        readme_path = repo_root / dir_path / "README_AI.md"
+        if not readme_path.exists():
+            continue
+
+        try:
+            scan_result = subprocess.run(
+                ["codeindex", "scan", dir_path, "--quiet"],
+                capture_output=True, text=True, timeout=120,
+            )
+            if scan_result.returncode == 0:
+                updated_readmes.append(str(readme_path))
+        except (subprocess.TimeoutExpired, FileNotFoundError):
+            continue
+
+    if not updated_readmes:
+        return 0
+
+    # Step 3: Stage and commit updated README_AI.md files
+    try:
+        for readme in updated_readmes:
+            subprocess.run(["git", "add", readme], capture_output=True, timeout=10)
+
+        # Check if there are actual staged changes
+        diff_result = subprocess.run(
+            ["git", "diff", "--cached", "--quiet"],
+            capture_output=True, timeout=10,
+        )
+        if diff_result.returncode == 0:
+            return 0  # No actual changes
+
+        commit_hash = subprocess.run(
+            ["git", "rev-parse", "--short", "HEAD"],
+            capture_output=True, text=True, timeout=10,
+        ).stdout.strip()
+
+        subprocess.run(
+            ["git", "commit", "--no-verify", "-m",
+             f"docs: auto-update README_AI.md for {commit_hash}\n\n"
+             f"Updated by post-commit hook.\nUpdate level: {level}"],
+            capture_output=True, timeout=30,
+        )
+    except (subprocess.TimeoutExpired, FileNotFoundError):
+        pass
+
+    return 0
+
+
# ============================================================================
# CLI Commands
# ============================================================================
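Reviewer sketch (not part of the patch): the decision logic at the top of `run_post_commit_hook` can be read as a pure function over the JSON that `codeindex affected --json` prints. The version below is a minimal illustration under stated assumptions: `dirs_to_rescan` and `indexed_dirs` are hypothetical names, and the only JSON keys assumed are `level` and `affected_dirs`, which are the ones the patch reads. The `"minor"` level in the demo is a made-up value.

```python
import json


def dirs_to_rescan(affected_json: str, indexed_dirs: set[str]) -> list[str]:
    """Return affected directories that are already indexed.

    Hypothetical helper: mirrors the early-return rules of the hook,
    where any malformed or "skip" analysis means "do nothing".
    """
    try:
        analysis = json.loads(affected_json)
    except json.JSONDecodeError:
        return []  # unreadable analysis: skip silently, like the hook
    if not isinstance(analysis, dict):
        return []
    if analysis.get("level", "skip") == "skip":
        return []
    # Only directories that already have a README_AI.md are rescanned.
    return [d for d in analysis.get("affected_dirs", []) if d in indexed_dirs]


if __name__ == "__main__":
    sample = json.dumps({"level": "minor", "affected_dirs": ["src/a", "src/b"]})
    print(dirs_to_rescan(sample, {"src/a"}))
```

Keeping this part free of `subprocess` calls would make the skip rules unit-testable without a Git repository; whether that split is worth it for this patch is a judgment call.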
@@ -763,6 +750,25 @@ def uninstall(hook_name: Optional[str], uninstall_all: bool, keep_backup: bool):
raise click.Abort()
+@hooks.command("run")
+@click.argument("hook_name")
+def run_hook(hook_name: str):
+    """Run hook logic (called by thin wrapper scripts).
+
+    This is not intended for direct user invocation.
+    The shell hook script delegates to this command so that
+    hook logic can be updated via pip install --upgrade.
+
+    Example: codeindex hooks run post-commit
+    """
+    if hook_name == "post-commit":
+        exit_code = run_post_commit_hook()
+        raise SystemExit(exit_code)
+    else:
+        console.print(f"[yellow]No run handler for hook: {hook_name}[/yellow]")
+        raise SystemExit(0)
+
+
@hooks.command()
def status():
"""Show status of Git hooks."""
diff --git a/src/codeindex/cli_scan.py b/src/codeindex/cli_scan.py
index cf145c9..62cfc13 100644
--- a/src/codeindex/cli_scan.py
+++ b/src/codeindex/cli_scan.py
@@ -370,20 +370,18 @@ def _generate_ai_readme(
# ========== Helper functions for scan_all (validation and config) ==========
-def _validate_scanall_args(fallback: bool, no_ai: bool, quiet: bool) -> None:
+def _validate_scanall_args(fallback: bool, quiet: bool) -> None:
"""Validate scan_all command arguments.
Args:
fallback: Deprecated --fallback flag
- no_ai: Deprecated --no-ai flag
quiet: --quiet flag
"""
# Handle deprecated flags
- if fallback or no_ai:
+ if fallback:
if not quiet:
- flag_name = "--fallback" if fallback else "--no-ai"
console.print(
- f"[yellow]Warning: {flag_name} is deprecated. "
+ "[yellow]Warning: --fallback is deprecated. "
"Structural mode is now the default. "
"This flag will be removed in a future version.[/yellow]"
)
@@ -545,6 +543,8 @@ def _process_directories_parallel(
docstring_processor: DocstringProcessor | None,
quiet: bool,
show_cost: bool,
+ ai: bool = False,
+ timeout: int = 120,
) -> None:
"""Process directories in parallel using SmartWriter.
@@ -555,6 +555,8 @@ def _process_directories_parallel(
docstring_processor: Optional docstring processor
quiet: Suppress progress messages
show_cost: Show token cost information
+ ai: Enable AI enrichment (Phase 2)
+ timeout: AI CLI timeout in seconds
"""
# ========== Phase 1: SmartWriter parallel generation ==========
if not quiet:
@@ -601,6 +603,106 @@ def _process_directories_parallel(
f"(~${estimated_cost:.4f})[/dim]"
)
+    # ========== Phase 2: AI enrichment (--ai flag) ==========
+    if ai and config.ai_command:
+        _enrich_directories_with_ai(dirs, tree, config, quiet, timeout)
+
+
+def _enrich_directories_with_ai(
+    dirs: list[Path],
+    tree: DirectoryTree,
+    config: Config,
+    quiet: bool,
+    timeout: int,
+) -> None:
+    """Phase 2: Enrich non-leaf directories with AI-generated descriptions.
+
+    For each overview/navigation directory, extracts symbol names and file names,
+    sends a minimal prompt to AI, and injects the one-line description as a
+    blockquote in the directory's README_AI.md.
+
+    Args:
+        dirs: All directories that were processed in Phase 1
+        tree: DirectoryTree for level information
+        config: Configuration (must have ai_command)
+        quiet: Suppress progress messages
+        timeout: AI CLI timeout in seconds
+    """
+    from .enricher import (
+        build_enrich_prompt,
+        extract_summary_from_readme,
+        inject_blockquote,
+        should_enrich,
+    )
+    from .invoker import invoke_ai_cli
+
+    # Filter to only enrichable directories
+    enrich_dirs = [d for d in dirs if should_enrich(tree.get_level(d))]
+
+    if not enrich_dirs:
+        return
+
+    if not quiet:
+        console.print(
+            f"\n[bold]🤖 Phase 2: AI enrichment "
+            f"({len(enrich_dirs)} directories)...[/bold]"
+        )
+
+    enriched_count = 0
+    for dir_path in enrich_dirs:
+        # Extract summary from the README_AI.md already generated in Phase 1
+        readme_path = dir_path / config.output_file
+        summary = extract_summary_from_readme(readme_path)
+
+        if not summary:
+            # Fallback: use child directory names
+            child_dirs = tree.get_children(dir_path)
+            child_names = [d.name for d in child_dirs]
+            if child_names:
+                summary = f"Subdirectories: {', '.join(child_names)}"
+
+        if not summary:
+            continue
+
+        # Include parent directory name for context
+        parent_name = dir_path.parent.name if dir_path.parent != dir_path else ""
+        prompt = build_enrich_prompt(dir_path.name, summary, parent_name)
+
+        # Invoke AI CLI
+        invoke_result = invoke_ai_cli(config.ai_command, prompt, timeout=timeout)
+
+        if not invoke_result.success:
+            if not quiet:
+                console.print(f"[yellow]⚠[/yellow] {dir_path.name}: AI error")
+            continue
+
+        # Clean AI output (strip whitespace, quotes, markdown)
+        description = invoke_result.output.strip()
+        description = description.strip('"\'`')
+        # Remove markdown formatting if AI added any
+        if description.startswith("# ") or description.startswith("> "):
+            description = description.lstrip("#> ").strip()
+        # Truncate if too long
+        if len(description) > 80:
+            description = description[:77] + "..."
+
+        if not description:
+            continue
+
+        # Inject into README_AI.md
+        readme_path = dir_path / config.output_file
+        if readme_path.exists():
+            inject_blockquote(readme_path, description)
+            enriched_count += 1
+            if not quiet:
+                console.print(f"[green]✓[/green] {dir_path.name}: {description}")
+
+    if not quiet:
+        console.print(
+            f"[dim]→ Phase 2 complete: "
+            f"{enriched_count}/{len(enrich_dirs)} enriched[/dim]"
+        )
+
# ========== Helper functions for scan_all (extracted from nested functions) ==========
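Reviewer sketch (not part of the patch): the output clean-up inside `_enrich_directories_with_ai` is easy to isolate. The function below restates those steps as a standalone helper so the edge cases can be checked in isolation. `clean_description` and `max_len` are hypothetical names; the 80-character limit and the stripped characters are taken from the hunk above.

```python
def clean_description(raw: str, max_len: int = 80) -> str:
    """Normalize one line of AI output (hypothetical helper).

    Steps, in the same order as the patch: trim whitespace, trim wrapping
    quotes/backticks, drop a leading markdown heading or blockquote marker,
    then truncate with an ellipsis.
    """
    text = raw.strip().strip('"\'`')
    if text.startswith("# ") or text.startswith("> "):
        # str.lstrip takes a character set, so this removes any leading
        # run of '#', '>' and spaces, not just one marker.
        text = text.lstrip("#> ").strip()
    if len(text) > max_len:
        text = text[: max_len - 3] + "..."
    return text


if __name__ == "__main__":
    print(clean_description('  "Payment gateway"  '))
    print(clean_description("> Route extraction framework"))
```

One thing this makes visible: the prompt asks for 30 characters or less, while truncation only starts at 80, so the limit here acts as a safety net rather than enforcing the prompt's target.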
@@ -745,8 +847,11 @@ def scan(
@click.option("--root", type=click.Path(exists=True, file_okay=False, path_type=Path), default=".")
@click.option("--parallel", "-p", type=int, help="Override parallel workers")
@click.option("--timeout", default=120, help="Timeout per directory in seconds")
-@click.option("--ai", is_flag=True, help="Enable AI-enhanced documentation (requires ai_command in config)")
-@click.option("--no-ai", is_flag=True, hidden=True, help="[Deprecated] Structural mode is now the default")
+@click.option(
+ "--ai", is_flag=True,
+ help="Enable AI enrichment (auto-detected when ai_command is configured)",
+)
+@click.option("--no-ai", is_flag=True, help="Disable automatic AI enrichment")
@click.option("--fallback", is_flag=True, hidden=True, help="[Deprecated] Structural mode is now the default")
@click.option("--quiet", "-q", is_flag=True, help="Minimal output")
@click.option("--hierarchical", "-h", is_flag=True, help="Use hierarchical processing (bottom-up)")
@@ -783,18 +888,45 @@ def scan_all(
):
"""Scan all project directories for README_AI.md generation.
- By default, generates structural documentation without AI.
- Use --ai to enable AI-enhanced documentation.
+ By default, auto-detects ai_command in config and enables AI enrichment.
+ Use --no-ai to disable AI enrichment.
"""
     # Determine root path first (needed for config loading)
     root = Path.cwd() if root is None else root
     # Validate arguments
-    _validate_scanall_args(fallback, no_ai, quiet)
+    _validate_scanall_args(fallback, quiet)
+
+    # --ai and --no-ai are mutually exclusive
+    if ai and no_ai:
+        console.print("[red]Error: --ai and --no-ai are mutually exclusive[/red]")
+        raise SystemExit(1)
     # Load configuration
     config, docstring_processor = _load_scanall_config(root, output, parallel, docstring_mode)
+    # Compute effective AI flag:
+    # - --no-ai explicitly passed → disable
+    # - --ai explicitly passed → enable (requires ai_command)
+    # - Neither → auto-enable if ai_command is configured
+    if no_ai:
+        effective_ai = False
+    elif ai:
+        if not config.ai_command:
+            console.print("[red]Error: --ai requires ai_command in .codeindex.yaml[/red]")
+            console.print(" Add ai_command to your config, for example:")
+            console.print(' ai_command: \'claude -p "{prompt}" --allowedTools "Read"\'')
+            raise SystemExit(1)
+        effective_ai = True
+    else:
+        # Auto-detect: enable if ai_command is configured
+        effective_ai = bool(config.ai_command)
+        if effective_ai and not quiet:
+            console.print(
+                "[dim]→ ai_command detected, AI enrichment enabled "
+                "(use --no-ai to disable)[/dim]"
+            )
+
+
# Use hierarchical processing if requested
if hierarchical:
if not quiet:
@@ -807,7 +939,7 @@ def scan_all(
root,
config,
config.parallel_workers,
- not ai, # fallback parameter
+ not effective_ai, # fallback parameter
quiet,
timeout
)
@@ -825,5 +957,8 @@ def scan_all(
dirs = tree.get_processing_order()
- # Process directories in parallel
- _process_directories_parallel(dirs, tree, config, docstring_processor, quiet, show_cost)
+ # Process directories in parallel (Phase 1: structural, Phase 2: AI enrich if enabled)
+ _process_directories_parallel(
+ dirs, tree, config, docstring_processor, quiet, show_cost,
+ ai=effective_ai, timeout=timeout,
+ )
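Reviewer sketch (not part of the patch): the `--ai` / `--no-ai` resolution added to `scan_all` amounts to a small truth table. The function below restates it without Click or console output so the three cases can be checked directly. `resolve_ai_mode` is a hypothetical name, and raising `ValueError` stands in for the `SystemExit(1)` paths in the patch.

```python
from __future__ import annotations


def resolve_ai_mode(ai: bool, no_ai: bool, ai_command: str | None) -> bool:
    """Return True when Phase 2 AI enrichment should run (hypothetical helper)."""
    if ai and no_ai:
        raise ValueError("--ai and --no-ai are mutually exclusive")
    if no_ai:
        return False  # explicit opt-out always wins
    if ai:
        if not ai_command:
            raise ValueError("--ai requires ai_command in .codeindex.yaml")
        return True  # explicit opt-in, config is present
    # Neither flag given: auto-enable only when ai_command is configured.
    return bool(ai_command)


if __name__ == "__main__":
    print(resolve_ai_mode(False, False, "claude -p"))
    print(resolve_ai_mode(False, True, "claude -p"))
```

Note the behavior change this implies for existing users: a project that already has `ai_command` configured will now invoke the AI CLI on a plain `scan-all` unless `--no-ai` is passed.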
diff --git a/src/codeindex/enricher.py b/src/codeindex/enricher.py
new file mode 100644
index 0000000..63dee01
--- /dev/null
+++ b/src/codeindex/enricher.py
@@ -0,0 +1,182 @@
+"""AI enrichment module for generating one-line module descriptions.
+
+Epic 25: Instead of AI generating entire README_AI.md files, AI only
+produces a short functional description (~30 chars) per module.
+This is injected as a blockquote line after the title in README_AI.md.
+
+Cost: ~300-800 tokens input, ~20-50 tokens output per directory.
+"""
+
+import re
+from pathlib import Path
+
+from .parser import ParseResult
+
+# Maximum symbol names to include per file in the prompt
+_MAX_SYMBOLS_PER_FILE = 5
+
+# Maximum total files to include in the prompt
+_MAX_FILES = 15
+
+
+def extract_symbol_summary(parse_results: list[ParseResult]) -> str:
+ """Extract a compact summary of file names + symbol names for AI prompt.
+
+ Uses short names (no class:: prefix) to keep prompt compact.
+
+ Args:
+ parse_results: Parsed file results for a directory
+
+ Returns:
+ Compact string like "ImageController.php: uploadAvatar, login; User.php: getUserInfo"
+ """
+ if not parse_results:
+ return ""
+
+ parts = []
+ for result in parse_results[:_MAX_FILES]:
+ if result.error:
+ continue
+ filename = result.path.name
+ # Use short names: "MyClass::method" → "method", "MyClass" stays
+ symbol_names = []
+ for s in result.symbols[:_MAX_SYMBOLS_PER_FILE]:
+ short = s.name.split("::")[-1].split(".")[-1]
+ symbol_names.append(short)
+ if symbol_names:
+ parts.append(f"{filename}: {', '.join(symbol_names)}")
+ else:
+ parts.append(filename)
+
+ return "; ".join(parts)
+
+
+def extract_summary_from_readme(readme_path: Path) -> str:
+    """Extract a compact summary from an existing README_AI.md.
+
+    Reads the file listing and subdirectory names from a structural
+    README_AI.md to build a summary for the AI prompt. This avoids
+    re-scanning the directory when Phase 1 already generated the README.
+
+    Args:
+        readme_path: Path to an existing README_AI.md
+
+    Returns:
+        Compact summary string, or empty string if unreadable
+    """
+    if not readme_path.exists():
+        return ""
+
+    try:
+        content = readme_path.read_text(encoding="utf-8", errors="replace")
+    except OSError:
+        # Unreadable file (permissions, directory, I/O error): no summary
+        return ""
+
+    parts = []
+
+    # Extract subdirectory names
+    for m in re.finditer(r'\*\*(\w[\w/]*)/\*\*\s*-\s*(.+?)$', content, re.MULTILINE):
+        dir_name = m.group(1)
+        desc = m.group(2).strip()
+        # Drop stats-only descriptions like "48 files | 386 symbols",
+        # keeping just the directory name
+        if re.match(r'^\d+ files', desc):
+            parts.append(dir_name)
+        else:
+            parts.append(f"{dir_name}: {desc[:60]}")
+
+    # Extract file names with their key symbols
+    for m in re.finditer(r'\*\*(\w[\w.]*\.[\w.]+)\*\*\s*-\s*(\w[\w, ]*)', content, re.MULTILINE):
+        filename = m.group(1)
+        symbols = m.group(2).strip()
+        parts.append(f"{filename}: {symbols[:50]}")
+
+    # Cap the summary at 20 entries to keep the prompt small
+    return "; ".join(parts[:20])
+
+
+def build_enrich_prompt(
+    dir_name: str,
+    symbol_summary: str,
+    parent_name: str = "",
+) -> str:
+    """Build a minimal prompt for AI to generate a one-line module description.
+
+    Args:
+        dir_name: Directory name (e.g., "SmallProgramApi")
+        symbol_summary: Output of extract_symbol_summary() or extract_summary_from_readme()
+        parent_name: Parent directory name for context (e.g., "Application")
+
+    Returns:
+        Prompt string for AI CLI
+    """
+    context = f"Directory: {dir_name}\n"
+    if parent_name:
+        context += f"Parent: {parent_name}\n"
+    context += f"Contents: {symbol_summary}\n"
+
+    return (
+        context
+        + "\n"
+        "Based ONLY on the file names and symbol names above, write a concise "
+        "functional description of this module (30 chars or less). "
+        "Describe WHAT it does, not HOW. "
+        "Examples: '会员等级、积分、权益卡管理', 'Payment gateway (Alipay/WeChat)', "
+        "'物流配送与运费计算'. "
+        "Output ONLY the description text. No quotes, no markdown, no explanation. "
+        "Do NOT invent features not evidenced by the symbol names."
+    )
+
+
+def inject_blockquote(readme_path: Path, description: str) -> None:
+    """Inject or replace a blockquote description line in README_AI.md.
+
+    Inserts `> description` after the first `# Title` line.
+    If a blockquote already exists after the title, it is replaced.
+
+    Args:
+        readme_path: Path to README_AI.md file
+        description: The one-line description to inject
+    """
+    content = readme_path.read_text(encoding="utf-8", errors="replace")
+    lines = content.split("\n")
+
+    # Find the title line (first line starting with "# ")
+    title_idx = None
+    for i, line in enumerate(lines):
+        if line.startswith("# "):
+            title_idx = i
+            break
+
+    if title_idx is None:
+        # No title found: prepend blockquote at top
+        lines.insert(0, f"> {description}")
+        readme_path.write_text("\n".join(lines), encoding="utf-8")
+        return
+
+    # Check if next non-empty line is already a blockquote
+    next_content_idx = title_idx + 1
+    while next_content_idx < len(lines) and lines[next_content_idx].strip() == "":
+        next_content_idx += 1
+
+    if next_content_idx < len(lines) and lines[next_content_idx].startswith(">"):
+        # Replace existing blockquote
+        lines[next_content_idx] = f"> {description}"
+    else:
+        # Insert after title
+        lines.insert(title_idx + 1, f"> {description}")
+
+    readme_path.write_text("\n".join(lines), encoding="utf-8")
+
+
+def should_enrich(level: str) -> bool:
+    """Determine if a directory at this level should get AI enrichment.
+
+    Only overview and navigation levels benefit from AI descriptions.
+    Detailed (leaf) levels have enough symbol information already.
+
+    Args:
+        level: One of "overview", "navigation", "detailed"
+    """
+    return level in ("overview", "navigation")
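Review note: the insert-or-replace rule in `inject_blockquote` can be checked without touching the filesystem. The sketch below is not part of the patch; it assumes a hypothetical string-only helper, `inject_blockquote_text`, that mirrors the same steps (find the first `# ` title, skip blank lines, replace an existing `>` line or insert a new one directly under the title).

```python
from typing import Optional


def inject_blockquote_text(content: str, description: str) -> str:
    """Hypothetical string-only variant of inject_blockquote (illustration only)."""
    lines = content.split("\n")
    quote = f"> {description}"

    # Locate the first "# Title" line, if any.
    title_idx: Optional[int] = next(
        (i for i, line in enumerate(lines) if line.startswith("# ")), None
    )
    if title_idx is None:
        # No title: the blockquote goes at the very top.
        return "\n".join([quote, *lines])

    # Skip blank lines between the title and the next content line.
    nxt = title_idx + 1
    while nxt < len(lines) and lines[nxt].strip() == "":
        nxt += 1

    if nxt < len(lines) and lines[nxt].startswith(">"):
        lines[nxt] = quote  # replace the existing description
    else:
        lines.insert(title_idx + 1, quote)  # insert directly under the title
    return "\n".join(lines)


readme = "# tests\n\n## Overview\n"
once = inject_blockquote_text(readme, "Test suite")
twice = inject_blockquote_text(once, "Parser and CLI tests")
print(once)
print(twice)
```

Running the helper twice leaves a single blockquote line under the title, which is the idempotence the hook relies on when `scan-all` is re-run.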
diff --git a/src/codeindex/templates/claude_md_core.md b/src/codeindex/templates/claude_md_core.md
index f7081b5..8efd81a 100644
--- a/src/codeindex/templates/claude_md_core.md
+++ b/src/codeindex/templates/claude_md_core.md
@@ -14,17 +14,25 @@ codeindex init # Generate the .codeindex.yaml config file
#### 2. Scanning and documentation generation
```bash
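-# Scan a single directory (generates README_AI.md)
+# Scan a single directory (generates a structural README_AI.md)
codeindex scan ./src/auth
# Scan all directories (SmartWriter mode)
codeindex scan-all
-# Preview the prompt (without running AI)
-codeindex scan ./src/auth --dry-run
+# When ai_command is configured, one-line AI module descriptions are enabled automatically
+# Phase 1: generate structural README_AI.md files
+# Phase 2: AI generates a blockquote functional description for each non-leaf directory
+codeindex scan-all
+
+# Disable AI enrichment (structural docs only)
+codeindex scan-all --no-ai
+
+# Full AI generation for a single directory (AI writes the entire README content)
+codeindex scan ./src/auth --ai
-# No-AI mode (structural docs only)
-codeindex scan ./src/auth --fallback
+# Preview the AI prompt (without running it)
+codeindex scan ./src/auth --ai --dry-run
# Generate JSON output (for tool integration)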
codeindex scan ./src --output json
@@ -89,11 +97,12 @@ codeindex status
1. **When initializing a project**:
```bash
codeindex init
- codeindex scan-all # Generate the full index
+ codeindex scan-all # Generate the full index (AI enrichment runs automatically once ai_command is configured)
```
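2. **Day-to-day development**:
- - Use Git Hooks for automatic updates (mode: auto)
+ - Install Git Hooks for automatic updates: `codeindex hooks install post-commit`
+ - Hooks use a thin wrapper design, so `pip install --upgrade ai-codeindex` updates the hook logic automatically
- After large changes, run `codeindex affected` manually to check the impact scope
3. **AI Code usage**: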
diff --git a/src/codeindex/writers/utils.py b/src/codeindex/writers/utils.py
index 3ddc81f..cf6a64e 100644
--- a/src/codeindex/writers/utils.py
+++ b/src/codeindex/writers/utils.py
@@ -42,6 +42,7 @@ def extract_module_description(
"""Extract brief description from a child module's README.
Strategies (in order):
+ 0. AI-generated blockquote description (> description) — highest priority
1. Parse structured stats (Files/Symbols) + class names from README_AI.md
2. Find first free-text line (non-header, non-list)
3. Fallback to "Module directory"
@@ -53,6 +54,13 @@ def extract_module_description(
     try:
         content = readme_path.read_text(encoding="utf-8", errors="replace")
+        # Strategy 0: AI-generated blockquote description (Epic 25)
+        bq_match = re.search(r'^>[ \t]*(\S[^\n]*)$', content, re.MULTILINE)
+        if bq_match:
+            desc = bq_match.group(1).strip()
+            if desc:
+                return desc
+
         # Strategy 1: Structured info from codeindex output
         files_match = re.search(r'\*\*Files\*\*:\s*(\d+)', content)
         symbols_match = re.search(r'\*\*Symbols\*\*:\s*(\d+)', content)
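Review note: the Strategy 0 pattern matches the first line starting with `>` anywhere in the README, not only the line under the title. The regenerated `tests/README_AI.md` in this same diff renders class docstrings as `> ...` lines, so a detailed README without an AI description could return a class docstring as the module description. Whether this matters depends on which READMEs reach `extract_module_description`. The sketch below reproduces the effect and compares it with a hypothetical stricter helper, `title_blockquote`, that only accepts a blockquote directly after the title; the helper and the sample strings are assumptions for illustration, not part of the patch.

```python
import re
from typing import Optional

# Pattern copied from Strategy 0 in the hunk above.
STRATEGY_0 = re.compile(r'^>[ \t]*(\S[^\n]*)$', re.MULTILINE)


def title_blockquote(content: str) -> Optional[str]:
    """Hypothetical stricter lookup: a blockquote directly after the first title."""
    lines = content.split("\n")
    for i, line in enumerate(lines):
        if line.startswith("# "):
            for nxt in lines[i + 1:]:
                if nxt.strip() == "":
                    continue  # skip blank lines under the title
                if nxt.startswith(">"):
                    return nxt[1:].strip() or None
                return None  # first content line is not a blockquote
            return None
    return None


# A detailed README with a class docstring but no AI description.
detailed = "# tests\n\n## Files\n\n**class** `class TestEdgeCases`\n> Test edge cases.\n"
# A README enriched with an AI description under the title.
enriched = "# tests\n> Route extractor tests\n\n## Files\n"

loose = STRATEGY_0.search(detailed)
print(loose.group(1) if loose else None)
print(title_blockquote(detailed))
print(title_blockquote(enriched))
```

If detailed-level READMEs are never passed to this function, the looser pattern is harmless; otherwise anchoring the match to the title would keep the fallback strategies reachable.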
diff --git a/tests/README_AI.md b/tests/README_AI.md
index 4c9eb96..2d9105d 100644
--- a/tests/README_AI.md
+++ b/tests/README_AI.md
@@ -1,264 +1,1548 @@
-
+
-# README_AI.md - tests
+# tests
## Overview
-- **Files**: 76
-- **Symbols**: 1340+
-- **Subdirectories**: 1
-
-## Subdirectories
-
-- **extractors/** - _Tests for route extractors._
+- **Files**: 119
+- **Symbols**: 2062
## Files
-- __init__.py
-- **conftest.py** - create_mock_symbol, create_mock_parse_result, mock_config
-- __init__.py
-- **test_spring.py** - TestBasicRouteExtraction, TestMultipleRoutes, TestControllerAnnotation
-- **test_thinkphp.py** - TestThinkPHPRouteExtractor, test_extract_routes_only_public_methods
-- **test_thinkphp_description.py** - TestThinkPHPDescriptionExtraction
-- **test_adaptive_config.py** - TestAdaptiveSymbolsConfig, TestConfigurationValidation, TestExpectedDefaults
-- **test_adaptive_selector.py** - TestAdaptiveSymbolSelectorBase, TestSizeCategoryDetermination, TestConstraintApplication
-- **test_ai_helper.py** - test_aggregate_multiple_parse_results, test_aggregate_single_parse_result, test_aggregate_empty_parse_results
-- **test_backward_compatibility.py** - TestBackwardCompatibility
-- **test_claude_md_injection.py** - TestInjectClaudeMd, TestHasClaudeMdInjection. Unit tests for CLAUDE.md injection functions (inject_claude_md / has_claude_md_injection) from codeindex.init_wizard. TestInjectClaudeMd validates: file creation when missing, section content (markers, instructions, codeindex scan-all reference), prepending to existing files with content preservation, idempotent replacement between markers, empty/whitespace-only file handling, return path. TestHasClaudeMdInjection validates: False when no file exists, False when no marker present, True after injection, True when marker exists in manually written content.
-- **test_cli_docstring_options.py** - TestDocstringCLIOptions, TestDocstringHelp
-- **test_cli_hooks.py** - TestHookManager, TestHookGeneration, TestBackupAndRestore
-- **test_cli_json.py** - TestScanJSONOutput, TestScanAllJSONOutput, TestJSONErrorHandling
-- **test_cli_parse.py** - TestCliParse. Tests `parse` command: basic functionality (Python/PHP/Java JSON output, help text, version compatibility), JSON format validation (required fields, symbols structure, optional fields, round-trip, consistency with scan), error handling (file not found, unsupported language, syntax errors, empty files, permission denied), framework features (ThinkPHP routes, Spring annotations, inheritance), and performance benchmarks (<0.5s for small files, <3s for large files).
-- **test_cli_scan_defaults_bdd.py** - BDD tests for CLI scan default behavior (Epic 19 Story 19.1). Tests that scan/scan-all defaults to structural mode (no AI required) and AI is opt-in via --ai flag. Validates scan command behavior: without --ai (SmartWriter structural output, no AI CLI invocation), with --ai (AI CLI invocation when ai_command configured, error when not configured), --fallback deprecation warning, --dry-run requiring --ai flag. Also validates scan-all command: without --ai (all directories processed with SmartWriter), with --ai (AI CLI invoked for directories), --fallback deprecation. Uses Click CliRunner with InProjectDir context manager for temporary project directory isolation.
-- **test_cli_tech_debt.py** - TestTechDebtCommand, TestTechDebtIntegration, sample_files
-- **test_config_adaptive.py** - TestSymbolsConfigAdaptive, TestConfigLoadingAdaptive, TestConfigurationMerging
-- **test_dataclass_structure.py** - TestInheritanceDataclass, TestImportWithAlias, TestParseResultWithInheritances
-- **test_directory_tree.py** - _create_test_structure, test_directory_tree_build, test_directory_tree_levels
-- **test_docstring_config.py** - TestDocstringConfig, TestBackwardCompatibility
-- **test_docstring_processor.py** - TestDocstringProcessor
-- **test_error_handling.py** - TestCommandLevelErrors, TestErrorObjectStructure, TestFileLevelErrors
-- **test_file_classifier.py** - test_classify_tiny_file, test_classify_small_file, test_classify_medium_file
-- **test_help_system_bdd.py** - BDD tests for Enhanced Help System (Epic 15 Story 15.3). Tests help command execution and output validation: parameter descriptions, range information, structured formatting (tables/headers), terminal readability, configuration examples, YAML syntax examples, current value display from .codeindex.yaml, and warning messages. Fixtures: help_context (command execution state), cli_available (CLI availability check). Step functions handle config creation with specific parameter values, CPU core mocking, command execution with output capture, and comprehensive output assertions (text matching, formatting validation, example presence, exit code verification).
-- **test_hooks_config.py** - TestHooksConfig, TestPostCommitConfig
-- **test_init_wizard_bdd.py** - BDD tests for Interactive Setup Wizard (Epic 15 Story 15.1). Tests wizard execution flow with full implementation integration: language detection (detect_languages), framework detection (detect_frameworks), pattern inference (infer_include_patterns, infer_exclude_patterns), performance optimization (calculate_parallel_workers, calculate_batch_size), configuration generation (generate_config_yaml), CODEINDEX.md creation (create_codeindex_md), and CLAUDE.md injection (inject_claude_md, has_claude_md_injection). Validates end-to-end wizard workflow from project scanning to config file creation with accurate values and zero-input automation support. Includes CLAUDE.md injection BDD steps: existing file handling, injection request/skip, marker validation, content preservation, idempotent updates. Mocks os.cpu_count() unconditionally (returns 16) for consistent parallel_workers calculation across all CI environments (e.g., GitHub macOS runners with fewer cores).
-- **test_java_annotations.py** - TestClassAnnotations, TestMethodAnnotations, TestFieldAnnotations
-- **test_java_calls.py** - TestBasicMethodCalls, TestConstructorCalls, TestStaticImportResolution, TestFullQualifiedNameCalls, TestMethodReferences, TestProjectInternalFiltering, TestEdgeCases, TestAnnotationBasedCalls. Validates fully-qualified caller names (e.g., com.example.UserService.createUser) for accurate call graph analysis, ensuring package.Class.method format in all assertions.
-- **test_java_edge_cases.py** - TestNestedClasses, TestComplexGenerics, TestLongSignatures
-- **test_java_error_recovery.py** - TestSyntaxErrors, TestIncompleteDeclarations, TestMalformedGenerics
-- **test_java_generic_bounds.py** - TestSingleExtendsBound, TestMultipleBounds, TestWildcardBounds
-- **test_java_inheritance.py** - TestBasicInheritance, TestGenericTypes, TestImportResolution, TestNestedClasses, TestRealWorldFrameworks, TestEdgeCases, TestAdvancedJavaInheritance. Auto-generated from test_generator/specs/java.yaml. Tests Java inheritance extraction: basic extends/implements, generic type parameters, import resolution (explicit, java.lang implicit, same-package, FQN), nested/static nested classes, real-world framework patterns (Spring Boot, JPA, Lombok), edge cases (enum, record, annotation), and advanced patterns (diamond inheritance, sealed classes, wildcard extends, multiple interface extends).
-- **test_java_lambda.py** - TestSimpleLambda, TestLambdaWithParameters, TestBlockLambda
-- **test_java_lombok.py** - TestBasicLombokAnnotations, TestConstructorAnnotations, TestBuilderAnnotation
-- **test_java_module.py** - TestBasicModuleDeclaration, TestRequiresDirective, TestExportsDirective
-- **test_java_parser.py** - TestJavaParserBasics, TestJavaSymbolExtraction, TestJavaImports
-- **test_java_spring.py** - TestSpringControllerLayer, TestSpringServiceLayer, TestSpringRepositoryLayer
-- **test_java_throws.py** - TestSingleThrows, TestMultipleThrows, TestGenericThrows
-- **test_json_output.py** - TestSymbolSerialization, TestImportSerialization, TestParseResultSerialization
-- **test_loomgraph_integration.py** - TestLoomGraphJSONFormat, TestLoomGraphDataMapping, TestLoomGraphRealWorldExample
-- **test_parallel_scan.py** - TestParallelScanning, TestParallelPerformance, TestParallelCorrectness
-- **test_parser.py** - test_parse_simple_function, test_parse_class_with_methods, test_parse_imports
-- **test_parser_detection.py** - TestCheckParserInstalled, TestParserInstallGuidance, TestInitWizardPostMessage. Tests for parser installation detection (Epic 19 Story 19.4). TestCheckParserInstalled validates check_parser_installed function: Python/Java/PHP parsers detected as installed, unknown languages (cobol) and unsupported languages (rust) report not installed. TestParserInstallGuidance validates get_parser_install_guidance: all installed parsers return empty missing list, mixed installed/missing languages separate correctly, empty language list returns empty guidance. TestInitWizardPostMessage validates updated post-init messages (Story 19.2): post-init suggests scan-all command, guides user to review .codeindex.yaml config and run scan-all, generated config has no active ai_command (AI is opt-in). Uses Click CliRunner with tmp_path for CLI integration tests.
-- **test_php_comment_extraction.py** - TestPHPCommentExtraction
-- **test_php_docstring_extraction.py** - TestPHPDocstringExtraction, TestSmartWriterIntegration, TestPHPParserEdgeCases
-- **test_php_import_alias.py** - TestPHPImportAliasBasic, TestPHPImportAliasGroupImports, TestPHPImportAliasNamespace
-- **test_php_inheritance.py** - TestPHPInheritanceBasic, TestPHPInheritanceNamespace, TestPHPInheritanceModifiers, TestPHPInheritanceEdgeCases, TestPHPInheritanceGroupImports, TestPHPInheritanceRealWorld, TestPHPInheritanceAdvanced. Auto-generated from test_generator/specs/php.yaml. Tests PHP inheritance extraction: basic extends/implements, namespace resolution (use statements, aliases, group imports), class modifiers (abstract, final), edge cases (no inheritance, empty file, multiple classes, inheritance chains), real-world patterns (Laravel Model, Symfony Controller), and advanced patterns (interface extends, traits, constructor inheritance, readonly classes, PHP 8.1 enums).
-- **test_php_loomgraph_integration.py** - TestPHPLoomGraphJSONFormat, TestPHPLoomGraphRealWorld, TestPHPLoomGraphEdgeCases
-- **test_project_index_semantic.py** - TestExtractModulePurpose, TestProjectIndexGeneration, TestBDDProjectIndex
-- **test_python_calls.py** - TestBasicFunctionCalls, TestMethodCalls, TestConstructorCalls, TestAliasResolution, TestDecoratorCalls, TestProjectInternalFiltering, TestEdgeCases. Tests document AST-level classification decisions: module.function() classified as METHOD (requires semantic analysis for FUNCTION), local variable methods preserved as-is (type inference out of scope), constructors formatted with .__init__ suffix for semantic clarity.
-- **test_python_docstring_description.py** - TestPythonDocstringDescription
-- **test_python_import_alias.py** - TestImportAsBasic, TestFromImportAs, TestComplexScenarios
-- **test_python_inheritance.py** - TestSingleInheritance, TestMultipleInheritance, TestNoInheritance, TestNestedClassInheritance, TestGenericInheritance, TestComplexScenarios, TestEdgeCases, TestAdvancedInheritance. Auto-generated from test_generator/specs/python.yaml. Tests Python inheritance extraction: single/multiple inheritance, qualified names, no inheritance, nested classes (external parent, deeply nested), generic types (Generic[T], List[str], multiple type params), complex scenarios (mixed inheritance, chains, methods), edge cases (empty file, no classes, comments, object parent), and advanced patterns (ABC, dataclass, enum, exception hierarchy, mixins, diamond inheritance, Protocol, metaclass, decorators).
-- **test_route_extractor.py** - TestExtractionContext, TestRouteExtractor
-- **test_route_info.py** - TestRouteInfoLineNumber
-- **test_route_registry.py** - TestFrameworkExtractor, AnotherFrameworkExtractor, TestRouteExtractorRegistry
-- **test_route_table_description.py** - TestRouteTableDescription
-- **test_route_table_display.py** - TestRouteTableDisplay
-- **test_scanner_passthrough.py** - TestIsPassThrough, TestDirectoryTreePassthrough. Tests for pass-through directory skipping (Epic 19 Story 19.5). TestIsPassThrough validates the is_pass_through function: single subdir with no code files detected as pass-through, directories with code files not skipped, multiple subdirs not pass-through (navigation value), empty dirs not pass-through, excluded subdirs not counted, non-code files ignored, Java extension checking. TestDirectoryTreePassthrough validates DirectoryTree integration: Java Maven structure skips intermediate directories (src/main/java/com/zcyl/gateway/ reduces to leaf only), Python flat structures with code files never skipped, directories with multiple subdirs kept for navigation value, deep pass-through chains all skipped (com/zcyl/service/impl/).
-- **test_semantic_extractor.py** - TestDirectoryContext, TestBusinessSemantic, TestSemanticExtractor
-- **test_smart_writer.py** - _create_mock_parse_result, test_smart_writer_overview_level, test_smart_writer_navigation_level
-- **test_smart_writer_adaptive.py** - TestSmartWriterAdaptiveDisabled, TestSmartWriterAdaptiveEnabled, TestRealWorldScenarios
-- **test_smart_writer_docstring.py** - TestSmartWriterDocstringIntegration, TestSmartWriterProcessorFactory
-- **test_smart_writer_integration.py** - TestSmartWriterRouteIntegration
-- **test_smart_writer_semantic.py** - TestSemanticConfig, TestSmartWriterSemanticIntegration, TestBDDSemanticIntegration
-- **test_story_4_4_integration.py** - TestStory44EndToEnd, TestStory44PerformanceValidation
-- **test_symbol_overload.py** - TestSymbolOverloadAnalysis, TestSymbolCountDetection, TestNoiseRatioDetection
-- **test_symbol_scorer.py** - TestSymbolScorerBase, TestVisibilityScoring, TestSemanticScoring
-- **test_tech_debt_bdd.py** - tech_debt_detector_fixture, symbol_scorer_fixture, file_with_lines
-- **test_tech_debt_detector.py** - TestDebtSeverity, TestDebtIssue, TestDebtAnalysisResult
-- **test_tech_debt_formatters.py** - TestConsoleFormatter, TestMarkdownFormatter, TestJSONFormatter
-- **test_tech_debt_java.py** - TestJavaAutoRecursive, TestJavaNoiseAnalysis, TestJavaScorerBoost. Tests for Java-specific tech-debt improvements (Epic 19 Story 19.6). TestJavaAutoRecursive validates auto-enabling recursive scanning for Java projects: Java config auto-enables recursive for nested files, explicit --recursive still works, non-Java projects do not auto-recurse, Java hint message removed. TestJavaNoiseAnalysis validates language-aware noise analysis: Java getters/setters not counted as noise, _analyze_noise_breakdown skips Java getters, Python noise analysis unchanged (get_xxx/set_xxx still noise), PHP noise analysis unchanged. TestJavaScorerBoost validates Java getter/setter scoring: Java getter scores >= 30.0 threshold, Java setter scores >= 30.0 threshold, Python getter scoring unchanged.
-- **test_tech_debt_reporter.py** - TestFileReport, TestTechDebtReport, TestTechDebtReporter
-- **test_thinkphp_route_extractor.py** - TestThinkPHPRouteLineNumbers, test_extract_routes_only_public_methods
-- **test_windows_path_optimization.py** - TestRelativePathOptimization, TestAbsolutePathBackwardCompatibility, TestPathLengthReduction, TestWindowsCrossDrive, TestSymlinkHandling, TestPathSeparatorNormalization, TestExistingScannerBehavior. Tests Windows path length optimization (Issue #8): verifies relative path usage reduces path lengths, maintains backward compatibility with absolute paths, handles cross-drive scenarios, symlink resolution, and cross-platform path separator normalization.
-
-
-**test_claude_md_injection.py**: New test suite for CLAUDE.md injection functions (inject_claude_md / has_claude_md_injection) from codeindex.init_wizard. TestInjectClaudeMd (8 tests) validates: file creation when CLAUDE.md is missing, created file contains codeindex section with markers and instructions (README_AI.md reference, codeindex status, codeindex scan-all), prepending section to existing CLAUDE.md with original content preserved, idempotent replacement between markers (no duplication on re-run), idempotent update preserves surrounding content, return path correctness, empty file handling, whitespace-only file handling. TestHasClaudeMdInjection (4 tests) validates: returns False when no CLAUDE.md exists, returns False when file exists without marker, returns True after inject_claude_md runs, returns True when marker manually present. Imports CLAUDE_MD_MARKER_END, CLAUDE_MD_MARKER_START, CLAUDE_MD_SECTION, has_claude_md_injection, inject_claude_md from codeindex.init_wizard.
-
-**test_init_wizard_bdd.py**: Added CLAUDE.md injection BDD step definitions. New imports: CLAUDE_MD_MARKER_END, CLAUDE_MD_MARKER_START, has_claude_md_injection, inject_claude_md from codeindex.init_wizard. New step functions: existing_claude_md (given existing CLAUDE.md), claude_md_with_injection (given pre-injected CLAUDE.md), request_claude_md_injection (when injection requested), skip_claude_md_injection (when injection skipped), claude_md_created_with_section (then file created), has_codeindex_marker (then markers present), has_readme_instruction (then README_AI.md instruction present), claude_md_has_section (then section present), original_content_preserved (then existing content kept), exactly_one_section (then single section count), markers_content_updated (then content between markers valid), no_claude_md (then file does not exist). Mocks os.cpu_count() unconditionally for all project sizes to ensure consistent parallel_workers calculation across CI environments with varying core counts.
-
-**test_parser_detection.py**: Test suite for parser installation detection (Epic 19 Story 19.4). TestCheckParserInstalled validates check_parser_installed function from codeindex.init_wizard: Python/Java/PHP parsers detected as installed, unknown languages (cobol) report not installed, unsupported languages without parser mapping (rust) report not installed. TestParserInstallGuidance validates get_parser_install_guidance: all installed parsers (python, java, php) return empty missing list, mixed installed/missing languages correctly separate into installed and missing lists, empty language list returns empty guidance. TestInitWizardPostMessage validates updated post-init messages (Story 19.2): post-init output suggests scan-all command, guides user to review .codeindex.yaml config and run scan-all, generated config has empty ai_command (AI is opt-in). Uses Click CliRunner with tmp_path for CLI integration tests, imports from codeindex.init_wizard and codeindex.cli.
-
-**test_tech_debt_java.py**: New test suite for Java-specific tech-debt improvements (Epic 19 Story 19.6). TestJavaAutoRecursive validates auto-enabling recursive scanning for Java projects via CLI: Java config auto-enables recursive for deeply nested files, explicit --recursive still works for non-Java, non-Java projects do not auto-recurse, old "Try adding --recursive" hint message removed. TestJavaNoiseAnalysis validates language-aware noise analysis in TechDebtDetector: Java getters/setters not counted as noise (no high severity low_quality_symbols issue), _analyze_noise_breakdown skips Java getters/setters, Python and PHP noise analysis unchanged. TestJavaScorerBoost validates SymbolImportanceScorer Java getter/setter scoring above 30.0 threshold. Imports from codeindex.config.Config, codeindex.parser.ParseResult/Symbol, codeindex.symbol_scorer.ScoringContext/SymbolImportanceScorer, codeindex.tech_debt.TechDebtDetector.
-
-**test_scanner_passthrough.py**: New test suite for pass-through directory skipping (Epic 19 Story 19.5). Tests the is_pass_through function and DirectoryTree integration for skipping directories that have no code files and exactly one subdirectory. Validates behavior for Java Maven deep directory structures (src/main/java/com/zcyl/), Python flat structures, navigation-value directories with multiple subdirs, excluded subdirectory handling, non-code file ignoring, and deep pass-through chain skipping. Imports from codeindex.config.Config, codeindex.scanner.is_pass_through, and codeindex.directory_tree.DirectoryTree.
-
-**test_cli_scan_defaults_bdd.py**: New BDD test suite for CLI scan default behavior (Epic 19 Story 19.1). Tests the reversed scan defaults where structural mode (SmartWriter) is the default and AI is opt-in via --ai flag. Covers scan command scenarios: default structural output without AI, AI invocation with --ai flag, error handling when ai_command is missing, --fallback deprecation warning, --dry-run requiring --ai. Also covers scan-all command scenarios: default SmartWriter processing, AI opt-in for all directories, --fallback deprecation. Uses Click CliRunner, pytest_bdd scenarios from features/cli_scan_defaults.feature, and InProjectDir context manager for project directory isolation.
-
-
-**Commit ``**:
-
-Changed files:
-- `test_smart_writer_enriched.py`
-
-
-**Commit ``**:
-
-Changed files:
-- `test_smart_writer_enriched.py`
+### __init__.py
+
+### conftest.py
+_Shared test fixtures and utilities for codeindex tests._
+
+**Functions:**
+- `def create_mock_symbol(
+ name: str = "test_function",
+ kind: str = "function",
+ signature: str = "def test_function():",
+ docstring: str = "A test function",
+ line_start: int = 1,
+ line_end: int = 10,
+) -> Symbol`
+- `def create_mock_parse_result(
+ file_path: str = "test.php",
+ file_lines: int = 300,
+ symbol_count: int = 15,
+ class_name: str | None = None,
+ methods_per_class: int = 0,
+ imports: list[Import] | None = None,
+ functions_count: int | None = None,
+ method_lines: int = 15,
+) -> ParseResult`
+- `def mock_config()`
+- `def symbol_scorer(mock_config)`
+
+### __init__.py
+_Tests for route extractors._
+### test_spring.py
+_Tests for Spring Framework route extractor.
-**Commit ``**:
+Story 7.2: Spring Route Extraction
+Tests extraction of Spring REST routes from controllers:
+- @RestContro_
-Changed files:
-- `test_typescript_integration.py`
-- `test_typescript_parser.py`
+**class** `class TestBasicRouteExtraction`
+> Test basic Spring route extraction.
+**class** `class TestMultipleRoutes`
+> Test multiple routes in one controller.
-**Commit ``**:
+**class** `class TestControllerAnnotation`
+> Test @Controller vs @RestController.
-Changed files:
-- `conftest.py`
-- `test_file_classifier.py`
-- `test_tech_debt_detector.py`
+**class** `class TestEdgeCases`
+> Test edge cases.
+**class** `class TestLineNumbers`
+> Test route line number extraction.
-**Commit ``**:
+**Methods:**
+- `def test_get_mapping(self, tmp_path)`
+- `def test_post_mapping(self, tmp_path)`
+- `def test_put_mapping(self, tmp_path)`
+- `def test_delete_mapping(self, tmp_path)`
+- `def test_crud_controller(self, tmp_path)`
+- `def test_rest_controller(self, tmp_path)`
+- `def test_controller_annotation(self, tmp_path)`
+- `def test_no_controller_annotation(self, tmp_path)`
+- `def test_empty_controller(self, tmp_path)`
+- `def test_method_without_mapping(self, tmp_path)`
-Changed files:
-- `test_parser_swift_poc.py`
+_... and 1 more symbols_
+### test_thinkphp.py
+_Tests for ThinkPHP route extractor (Epic 6, Task 2.3)._
-**Commit ``**:
+**class** `class TestThinkPHPRouteExtractor`
+> Test ThinkPHP route extractor with new architecture.
-Changed files:
-- `test_parser_swift_properties.py`
+**Methods:**
+- `def test_framework_name(self)`
+- `def test_can_extract_in_controller_directory(self)`
+- `def test_can_extract_in_non_controller_directory(self)`
+- `def test_extract_routes_with_line_numbers(self)`
+- `def test_extract_routes_multiple_controllers(self)`
+- `def test_extract_routes_only_public_methods(self)`
+- `def test_extract_routes_skip_magic_methods(self)`
+- `def test_extract_routes_no_controller_class(self)`
+- `def test_extract_routes_with_parse_error(self)`
+### test_thinkphp_description.py
+_Tests for ThinkPHP route extractor description extraction (Epic 6, P2, Task 3.3)._
-**Commit ``**:
+**class** `class TestThinkPHPDescriptionExtraction`
+> Test ThinkPHP route extractor extracts descriptions from docstrings.
-Changed files:
-- `test_parser_swift_protocols.py`
+**Methods:**
+- `def test_extract_description_from_method_docstring(self)`
+- `def test_extract_description_truncates_long_text(self)`
+- `def test_extract_description_empty_for_no_docstring(self)`
+- `def test_extract_description_from_multiple_methods(self)`
+
+### broken.py
+_Parse error: Syntax error in source file_
+
+### complete.py
+
+**class** `class Parent`
+> Parent class
+
+**class** `class Child(Parent)`
+> Child class
+
+**Methods:**
+- `def method(self)`
+
+**Functions:**
+- `def add(x, y)`
+
+### simple.py
+**class** `class Calculator`
+> Simple calculator
-**Commit ``**:
+**Methods:**
+- `def multiply(self, a: int, b: int) -> int`
-Changed files:
-- `test_parser_swift_inheritance.py`
+**Functions:**
+- `def add(a: int, b: int) -> int`
+### file1.py
-**Commit ``**:
+**Functions:**
+- `def func1()`
-Changed files:
-- `test_parser_swift_docstrings.py`
+### file2.py
+**Functions:**
+- `def func2()`
-**Commit ``**:
+### file4.py
-Changed files:
-- `test_parser_swift_signatures.py`
+**Functions:**
+- `def func4()`
+### file3.py
-**Commit ``**:
+**Functions:**
+- `def func3()`
-Changed files:
-- `test_parser_swift_integration.py`
+### test_hierarchy_simple.py
+_Test fixture generator for hierarchical processing.
+This script creates a simple directory hierarchy with Python files for testing
+the hierarchical s_
-**Commit ``**:
+### test_adaptive_config.py
+_Tests for adaptive symbols configuration._
-Changed files:
-- `test_parser_swift_generics.py`
+**class** `class TestAdaptiveSymbolsConfig`
+> Test AdaptiveSymbolsConfig data class.
+**class** `class TestConfigurationValidation`
+> Test configuration validation logic.
-**Commit ``**:
+**Methods:**
+- `def test_default_config_exists(self)`
+- `def test_default_config_disabled_by_default(self)`
+- `def test_default_config_has_thresholds(self)`
+- `def test_default_config_has_limits(self)`
+- `def test_default_thresholds_are_increasing(self)`
+- `def test_default_limits_are_increasing(self)`
+- `def test_default_min_max_symbols(self)`
+- `def test_custom_config_initialization(self)`
+- `def test_config_with_partial_overrides(self)`
+- `def test_thresholds_should_be_positive(self)`
+- `def test_limits_should_be_positive(self)`
+- `def test_min_symbols_should_be_positive(self)`
+- `def test_max_symbols_should_be_reasonable(self)`
-Changed files:
-- `test_parser_swift_property_wrappers.py`
+_... and 6 more symbols_
+### test_adaptive_selector.py
+_Tests for AdaptiveSymbolSelector._
-**Commit ``**:
+**class** `class TestAdaptiveSymbolSelectorBase`
+> Test basic AdaptiveSymbolSelector functionality.
-Changed files:
-- `test_tech_debt_ios.py`
+**class** `class TestSizeCategoryDetermination`
+> Test file size category determination.
+**Methods:**
+- `def test_selector_initialization_with_default_config(self)`
+- `def test_selector_initialization_with_custom_config(self)`
+- `def test_calculate_limit_returns_int(self)`
+- `def test_calculate_limit_returns_positive(self)`
+- `def test_calculate_limit_not_exceed_total_symbols(self)`
+- `def test_tiny_file_category(self)`
+- `def test_tiny_boundary_99_lines(self)`
+- `def test_small_file_category(self)`
+- `def test_medium_file_category(self)`
+- `def test_large_file_category(self)`
+- `def test_xlarge_file_category(self)`
+- `def test_huge_file_category(self)`
+- `def test_mega_file_category(self)`
-**Commit ``**:
+_... and 22 more symbols_
-Changed files:
-- `test_parser_objc_basic.py`
+### test_ai_helper.py
+_Unit tests for AI enhancement helper functions (Epic 4 Story 4.1)._
+**Functions:**
+- `def test_aggregate_multiple_parse_results()`
+- `def test_aggregate_single_parse_result()`
+- `def test_aggregate_empty_parse_results()`
+- `def test_aggregate_preserves_symbol_order()`
-**Commit ``**:
+### test_backward_compatibility.py
+_Backward compatibility tests for Epic 6 refactoring (Task 2.5).
-Changed files:
-- `test_objc_association_utils.py`
-- `test_parser_objc_association.py`
+These tests verify that the new route extractor architecture maintains
+100% compatibi_
+**class** `class TestBackwardCompatibility`
+> Verify new architecture is 100% compatible with old implementation.
-**Commit ``**:
+**Methods:**
+- `def test_extract_thinkphp_routes_still_works(self)`
+- `def test_smart_writer_generates_same_output_structure(self, tmp_path)`
+- `def test_route_table_format_unchanged(self, tmp_path)`
-Changed files:
-- `test_parser_objc_categories.py`
+### test_call_integration.py
+_Story 11.4: Integration & JSON Output Tests
+Tests for call relationship extraction integration with ParseResult,
+JSON serialization, and CLI integrat_
+
+**class** `class TestJSONSerialization`
+> AC1: JSON Output Format (3 tests)
+
+**class** `class TestParseResultIntegration`
+> AC2: ParseResult Integration (3 tests)
+
+**class** `class TestBackwardCompatibility`
+> AC3: Backward Compatibility (2 tests)
+
+**class** `class TestJSONRoundTrip`
+> AC4: JSON Round-Trip (2 tests)
+
+**class** `class TestLanguageConsistency`
+> AC5: Cross-Language Consistency (2 tests)
+
+**Methods:**
+- `def test_basic_json_structure(self, tmp_path)`
+- `def test_multiple_calls_json(self, tmp_path)`
+- `def test_dynamic_call_json(self, tmp_path)`
+- `def test_calls_field_exists(self, tmp_path)`
+- `def test_empty_calls_for_no_calls(self, tmp_path)`
+- `def test_calls_populated_correctly(self, tmp_path)`
+- `def test_existing_fields_unchanged(self, tmp_path)`
+- `def test_to_dict_includes_all_fields(self, tmp_path)`
+- `def test_call_from_dict(self)`
+- `def test_json_serialization_round_trip(self, tmp_path)`
+
+_... and 2 more symbols_
+
+### test_claude_md_injection.py
+_Unit tests for CLAUDE.md injection (inject_claude_md / has_claude_md_injection)._
+
+**class** `class TestInjectClaudeMd`
+> Tests for inject_claude_md().
+
+**class** `class TestHasClaudeMdInjection`
+> Tests for has_claude_md_injection().
+
+**Methods:**
+- `def test_creates_file_when_missing(self, project_dir)`
+- `def test_created_file_has_section(self, project_dir)`
+- `def test_prepends_to_existing_file(self, project_dir)`
+- `def test_idempotent_replace_between_markers(self, project_dir)`
+- `def test_idempotent_preserves_surrounding_content(self, project_dir)`
+- `def test_returns_path(self, project_dir)`
+- `def test_empty_existing_file(self, project_dir)`
+- `def test_whitespace_only_file(self, project_dir)`
+- `def test_returns_false_when_no_file(self, project_dir)`
+- `def test_returns_false_when_no_marker(self, project_dir)`
+- `def test_returns_true_after_injection(self, project_dir)`
+- `def test_returns_true_with_marker_in_existing(self, project_dir)`
+
+**Functions:**
+- `def project_dir(tmp_path)`
+
+### test_cli_debt_scan.py
+_Integration tests for debt-scan CLI command.
+
+Tests the end-to-end functionality of the debt-scan command including:
+- JSON output format
+- Console ou_
+
+**class** `class TestDebtScanCLI`
+> Test debt-scan command integration.
+
+**class** `class TestDebtScanEdgeCases`
+> Test edge cases and error handling.
+
+**class** `class TestDebtScanIntegrationWithRealProject`
+> Test debt-scan on the codeindex project itself.
+
+**Methods:**
+- `def test_debt_scan_json_output(self, tmp_path)`
+- `def test_debt_scan_console_output(self, tmp_path)`
+- `def test_debt_scan_detects_test_smells(self, tmp_path)`
+- `def test_debt_scan_detects_giant_files(self, tmp_path)`
+- `def test_debt_scan_empty_directory(self, tmp_path)`
+- `def test_debt_scan_recursive_flag(self, tmp_path)`
+- `def test_nonexistent_path(self)`
+- `def test_file_instead_of_directory(self, tmp_path)`
+- `def test_scan_own_tests_directory(self)`
+
+### test_cli_docstring_options.py
+_Tests for CLI docstring options.
+
+Story 9.3: Configuration & CLI (CLI commands)
+
+Tests:
+- --docstring-mode option (off|hybrid|all-ai)
+- --show-cost fl_
+
+**class** `class TestDocstringCLIOptions`
+> Test CLI options for docstring extraction.
+
+**class** `class TestDocstringHelp`
+> Test CLI help text for docstring options.
+
+**Methods:**
+- `def test_docstring_mode_option_accepted(self, tmp_path)`
+- `def test_docstring_mode_off(self, tmp_path)`
+- `def test_docstring_mode_hybrid_creates_processor(self, tmp_path, monkeypatch)`
+- `def test_show_cost_flag(self, tmp_path)`
+- `def test_cli_overrides_config(self, tmp_path, monkeypatch)`
+- `def test_invalid_docstring_mode_rejected(self, tmp_path)`
+- `def test_docstring_mode_all_ai(self, tmp_path, monkeypatch)`
+- `def test_docstring_mode_help_shown(self)`
+- `def test_show_cost_help_shown(self)`
+
+### test_cli_hooks.py
+_Tests for Git Hooks CLI module (Epic 6, P3.1, Task 4.1-4.5)._
+
+**class** `class TestHookManager`
+> Test HookManager class.
+
+**class** `class TestHookGeneration`
+> Test hook script generation.
+
+**Methods:**
+- `def test_init_with_repo_path(self, tmp_path)`
+- `def test_init_detects_git_repo(self, tmp_path)`
+- `def test_get_hook_status_not_exists(self, tmp_path)`
+- `def test_get_hook_status_exists_codeindex(self, tmp_path)`
+- `def test_get_hook_status_exists_custom(self, tmp_path)`
+- `def test_install_hook(self, tmp_path)`
+- `def test_install_hook_with_backup(self, tmp_path)`
+- `def test_uninstall_hook(self, tmp_path)`
+- `def test_uninstall_hook_restores_backup(self, tmp_path)`
+- `def test_list_all_hooks_status(self, tmp_path)`
+- `def test_generate_pre_commit_hook(self)`
+- `def test_generate_post_commit_hook(self)`
+- `def test_generate_hook_with_config(self)`
+
+_... and 7 more symbols_
+
+### test_cli_json.py
+_Tests for CLI JSON output.
+
+Story 2 & 3: JSON Output for scan and scan-all commands
+
+Tests:
+- --output json option for scan command
+- --output markdow_
+
+**class** `class TestScanJSONOutput`
+> Test scan command with --output json.
+
+**class** `class TestScanAllJSONOutput`
+> Test scan-all command with --output json.
+
+**class** `class TestJSONErrorHandling`
+> Test error handling in JSON output.
+
+**Methods:**
+- `def test_scan_default_output_markdown(self, tmp_path)`
+- `def test_scan_output_json_to_stdout(self, tmp_path)`
+- `def test_scan_output_json_includes_all_fields(self, tmp_path)`
+- `def test_scan_output_json_with_chinese(self, tmp_path)`
+- `def test_scan_all_output_json(self, tmp_path)`
+- `def test_scan_all_output_markdown_default(self, tmp_path)`
+- `def test_scan_json_with_parse_error(self, tmp_path)`
+
+### test_cli_parse.py
+
+**class** `class TestCliParse`
+> CLI parse command tests
+
+**Methods:**
+- `def test_parse_python_file_json_output(self)`
+- `def test_parse_php_file_json_output(self)`
+- `def test_parse_java_file_json_output(self)`
+- `def test_parse_help_text(self)`
+- `def test_parse_version_compatible(self)`
+- `def test_json_all_required_fields(self)`
+- `def test_json_symbols_structure(self)`
+- `def test_json_optional_fields(self)`
+- `def test_json_round_trip(self)`
+- `def test_json_format_consistency(self)`
+- `def test_parse_file_not_found(self)`
+- `def test_parse_unsupported_language(self)`
+- `def test_parse_syntax_error_file(self)`
+- `def test_parse_empty_file(self)`
+
+_... and 6 more symbols_
+
+### test_cli_scan_defaults_bdd.py
+_BDD tests for CLI scan default behavior (Epic 19 Story 19.1).
+
+Tests that scan/scan-all defaults to structural mode (no AI required)
+and AI is opt-in _
+
+**Functions:**
+- `def cli_runner()`
+- `def scan_context(tmp_path)`
+- `def project_with_python_files(scan_context)`
+- `def valid_config(scan_context)`
+- `def ai_command_configured(scan_context)`
+- `def ai_command_not_configured(scan_context)`
+- `def run_scan_without_ai(cli_runner, scan_context)`
+- `def run_scan_with_ai(cli_runner, scan_context)`
+- `def run_scan_with_fallback(cli_runner, scan_context)`
+- `def run_scan_with_dry_run_and_ai(cli_runner, scan_context)`
+- `def run_scan_with_dry_run_no_ai(cli_runner, scan_context)`
+- `def run_scan_all_without_ai(cli_runner, scan_context)`
+- `def run_scan_all_with_ai(cli_runner, scan_context)`
+- `def run_scan_all_with_fallback(cli_runner, scan_context)`
+- `def output_generated_with_smartwriter(scan_context)`
+
+_... and 10 more symbols_
+
+### test_cli_tech_debt.py
+_Integration tests for tech-debt CLI command._
+
+**class** `class TestTechDebtCommand`
+> Test tech-debt CLI command.
+
+**class** `class TestTechDebtIntegration`
+> Integration tests for full tech-debt workflow.
+
+**Methods:**
+- `def test_tech_debt_command_exists(self)`
+- `def test_analyze_directory_console_format(self, sample_files)`
+- `def test_analyze_directory_markdown_format(self, sample_files)`
+- `def test_analyze_directory_json_format(self, sample_files)`
+- `def test_write_output_to_file(self, sample_files, tmp_path)`
+- `def test_invalid_format_option(self, sample_files)`
+- `def test_nonexistent_directory(self)`
+- `def test_empty_directory(self, tmp_path)`
+- `def test_recursive_option(self, tmp_path)`
+- `def test_detect_multiple_issues(self, tmp_path)`
+- `def test_report_quality_scores(self, sample_files)`
+- `def test_console_output_has_colors(self, sample_files)`
+
+**Functions:**
+- `def sample_files(tmp_path)`
+
+### test_config_adaptive.py
+_Tests for adaptive symbols configuration loading in Config._
+
+**class** `class TestSymbolsConfigAdaptive`
+> Test SymbolsConfig with adaptive_symbols field.
+
+**class** `class TestConfigLoadingAdaptive`
+> Test Config loading with adaptive_symbols configuration.
+
+**class** `class TestConfigurationMerging`
+> Test configuration merging logic (user + defaults).
+
+**class** `class TestBackwardCompatibility`
+> Test backward compatibility with existing configs.
+
+**Methods:**
+- `def test_symbols_config_has_adaptive_field(self)`
+- `def test_symbols_config_default_adaptive_disabled(self)`
+- `def test_symbols_config_uses_default_adaptive_config(self)`
+- `def test_load_config_without_adaptive(self)`
+- `def test_load_config_with_adaptive_enabled(self)`
+- `def test_load_config_with_custom_thresholds(self)`
+- `def test_load_config_with_custom_limits(self)`
+- `def test_load_config_with_partial_overrides(self)`
+- `def test_load_config_with_min_max_symbols(self)`
+- `def test_merge_preserves_user_thresholds(self)`
+- `def test_merge_preserves_user_limits(self)`
+
+_... and 2 more symbols_
+
+### test_dataclass_structure.py
+_Tests for data structure definitions.
+
+Epic 10, Story 10.3: LoomGraph Integration - Data Structures
+Tests for new data classes: Inheritance
+Tests for _
+
+**class** `class TestInheritanceDataclass`
+> Test Inheritance data class.
+
+**class** `class TestImportWithAlias`
+> Test Import data class with alias field.
+
+**class** `class TestParseResultWithInheritances`
+> Test ParseResult with inheritances field.
+
+**Methods:**
+- `def test_inheritance_creation(self)`
+- `def test_inheritance_equality(self)`
+- `def test_inheritance_to_dict(self)`
+- `def test_inheritance_different_instances(self)`
+- `def test_import_with_alias(self)`
+- `def test_import_without_alias(self)`
+- `def test_import_from_with_alias(self)`
+- `def test_import_to_dict_with_alias(self)`
+- `def test_import_to_dict_without_alias(self)`
+- `def test_parse_result_with_inheritances(self)`
+- `def test_parse_result_empty_inheritances(self)`
+- `def test_parse_result_to_dict_with_inheritances(self)`
+
+_... and 3 more symbols_
+
+### test_directory_tree.py
+_Tests for the directory tree module._
+
+**Functions:**
+- `def _create_test_structure(base: Path)`
+- `def test_directory_tree_build()`
+- `def test_directory_tree_levels()`
+- `def test_directory_tree_children()`
+- `def test_directory_tree_processing_order()`
+- `def test_directory_tree_empty()`
+- `def test_directory_tree_deep_nesting()`
+
+### test_docstring_config.py
+_Tests for docstring configuration.
+
+Story 9.3: Configuration & CLI
+
+Tests:
+- Parse docstrings config from YAML
+- Default values (mode=off, backward co_
+
+**class** `class TestDocstringConfig`
+> Test docstring configuration.
+
+**class** `class TestBackwardCompatibility`
+> Test backward compatibility.
+
+**Methods:**
+- `def test_default_config(self)`
+- `def test_parse_docstring_config_from_yaml(self, tmp_path)`
+- `def test_parse_docstring_config_all_ai_mode(self, tmp_path)`
+- `def test_docstring_config_defaults_to_off(self, tmp_path)`
+- `def test_invalid_docstring_mode(self, tmp_path)`
+- `def test_docstring_ai_command_inherits_from_global(self, tmp_path)`
+- `def test_cost_limit_default(self)`
+- `def test_custom_cost_limit(self, tmp_path)`
+- `def test_old_config_without_docstrings_section(self, tmp_path)`
+- `def test_existing_config_fields_unchanged(self, tmp_path)`
+
+### test_docstring_processor.py
+_Tests for DocstringProcessor module.
+
+Story 9.1: Docstring Processor Core
+
+Tests:
+- Hybrid mode: Simple extraction without AI
+- Hybrid mode: AI for co_
+
+**class** `class TestDocstringProcessor`
+> Test DocstringProcessor class.
+
+**Methods:**
+- `def test_init_with_hybrid_mode(self)`
+- `def test_init_with_all_ai_mode(self)`
+- `def test_hybrid_mode_simple_extraction(self)`
+- `def test_hybrid_mode_ai_for_complex(self)`
+- `def test_all_ai_mode(self)`
+- `def test_batch_processing(self)`
+- `def test_fallback_on_ai_failure(self)`
+- `def test_cost_tracking(self)`
+- `def test_json_parsing_malformed(self)`
+- `def test_empty_docstring_handling(self)`
+- `def test_should_use_ai_logic(self)`
+- `def test_fallback_extract(self)`
+
+### test_enricher.py
+_Tests for AI enrichment module (Epic 25, Story 25.2).
+
+Tests the prompt construction and blockquote injection logic.
+AI invocation is mocked — we only_
+
+**class** `class TestExtractSymbolSummary`
+> Extract symbol names + file names for AI prompt input.
+
+**class** `class TestExtractSummaryFromReadme`
+> Extract summary from existing README_AI.md files.
+
+**class** `class TestBuildEnrichPrompt`
+> Build the minimal prompt for AI one-line description.
+
+**Methods:**
+- `def test_extracts_from_parse_results(self)`
+- `def test_empty_results(self)`
+- `def test_limits_symbols_per_file(self)`
+- `def test_extracts_subdirectories(self, tmp_path)`
+- `def test_extracts_file_symbols(self, tmp_path)`
+- `def test_missing_file_returns_empty(self, tmp_path)`
+- `def test_limits_entries(self, tmp_path)`
+- `def test_includes_dir_name(self)`
+- `def test_includes_symbol_summary(self)`
+- `def test_constrains_output_length(self)`
+- `def test_includes_parent_name(self)`
+- `def test_anti_hallucination_instruction(self)`
+
+_... and 9 more symbols_
+
+### test_enricher_integration.py
+_Integration tests for AI enrichment in scan-all pipeline (Epic 25, Story 25.3).
+
+Tests the end-to-end flow: scan-all --ai generates structural READMEs_
+
+**class** `class TestEnrichmentIntegration`
+> Test the enrichment flow as it would work in scan-all --ai.
+
+**Methods:**
+- `def test_full_enrich_flow(self, tmp_path)`
+- `def test_enrich_only_non_leaf_directories(self)`
+- `def test_enrich_prompt_from_parse_results(self)`
+- `def test_re_enrich_updates_description(self, tmp_path)`
+
+### test_error_handling.py
+_Tests for error handling in JSON output mode.
+
+Story 4: Structured error handling for JSON output.
+
+Tests:
+- Command-level errors (directory not found_
+
+**class** `class TestCommandLevelErrors`
+> Test command-level errors return structured JSON.
+
+**class** `class TestErrorObjectStructure`
+> Test error object has correct structure.
+
+**class** `class TestFileLevelErrors`
+> Test file-level errors are properly recorded.
+
+**Methods:**
+- `def test_scan_directory_not_found_json(self, tmp_path)`
+- `def test_scan_all_no_config_json(self, tmp_path)`
+- `def test_scan_empty_directory_json(self, tmp_path)`
+- `def test_error_object_has_required_fields(self, tmp_path)`
+- `def test_parse_error_recorded_in_result(self, tmp_path)`
+- `def test_mixed_success_and_error_files(self, tmp_path)`
+
+### test_file_classifier.py
+_Unit tests for file size classifier (Epic 4 Story 4.2)._
+
+**Functions:**
+- `def test_classify_tiny_file()`
+- `def test_classify_small_file()`
+- `def test_classify_medium_file()`
+- `def test_classify_large_file()`
+- `def test_classify_super_large_by_lines()`
+- `def test_classify_super_large_by_symbols()`
+- `def test_classify_super_large_by_both()`
+- `def test_is_super_large_convenience_method()`
+- `def test_is_large_convenience_method()`
+- `def test_edge_case_exactly_at_threshold()`
+- `def test_edge_case_just_over_threshold()`
+
+### test_help_system_bdd.py
+_BDD tests for Enhanced Help System (Epic 15 Story 15.3)._
+
+**Functions:**
+- `def help_context()`
+- `def cli_available()`
+- `def config_exists_with_param(tmp_path, help_context, param, value)`
+- `def system_cpu_cores(help_context, cores, monkeypatch)`
+- `def run_command(help_context, command)`
+- `def output_contains(help_context, text)`
+- `def output_contains_param_description(help_context, param)`
+- `def output_contains_range(help_context, range_text, param)`
+- `def output_has_formatting(help_context)`
+- `def output_readable(help_context)`
+- `def output_has_examples(help_context)`
+- `def output_has_yaml_examples(help_context)`
+- `def exit_code_matches(help_context, code)`
+- `def output_has_current_value(help_context, value)`
+- `def output_has_warning(help_context, warning)`
+
+### test_hook_post_commit.py
+_Tests for post-commit hook: thin wrapper + Python logic (Epic 25).
+
+The post-commit hook should:
+1. Be a thin shell wrapper calling `codeindex hooks r_
+
+**class** `class TestThinWrapperScript`
+> The generated hook script should be a thin wrapper.
+
+**class** `class TestRunPostCommitHook`
+> Python-side post-commit logic.
+
+**Methods:**
+- `def test_calls_codeindex_hooks_run(self)`
+- `def test_no_custom_ai_prompt(self)`
+- `def test_still_has_loop_guard(self)`
+- `def test_disabled_config(self)`
+- `def test_has_codeindex_marker(self)`
+
+### test_hooks.py
+_Unit tests for codeindex post-install hooks.
+
+Epic #25, Story #26: Post-install Hook Implementation
+Tests follow TDD Red-Green-Refactor cycle._
+
+**class** `class TestExtractVersion`
+> Tests for version extraction from CLAUDE.md.
+
+**class** `class TestInjectCoreGuide`
+> Tests for core guide injection.
+
+**class** `class TestCIEnvironmentDetection`
+> Tests for CI environment detection.
+
+**Methods:**
+- `def test_extract_version_from_valid_marker(self, tmp_path)`
+- `def test_extract_version_returns_none_if_no_marker(self, tmp_path)`
+- `def test_extract_version_handles_missing_file(self, tmp_path)`
+- `def test_extract_version_handles_malformed_marker(self, tmp_path)`
+- `def test_inject_core_guide_new_content(self, tmp_path)`
+- `def test_inject_core_guide_idempotent(self, tmp_path)`
+- `def test_inject_core_guide_preserves_surrounding_content(self, tmp_path)`
+- `def test_is_ci_environment_github_actions(self)`
+- `def test_is_ci_environment_gitlab_ci(self)`
+- `def test_is_ci_environment_jenkins(self)`
+- `def test_is_ci_environment_circle_ci(self)`
+- `def test_is_ci_environment_generic_ci(self)`
+
+_... and 7 more symbols_
+
+### test_hooks_config.py
+_Tests for hooks configuration.
+
+Story 6: Git Hooks performance optimization - Config support._
+
+**class** `class TestHooksConfig`
+> Test hooks configuration loading and defaults.
+
+**class** `class TestPostCommitConfig`
+> Test PostCommitConfig specifics.
+
+**Methods:**
+- `def test_default_hooks_config(self)`
+- `def test_hooks_config_disabled_mode(self)`
+- `def test_hooks_config_async_mode(self)`
+- `def test_hooks_config_sync_mode(self)`
+- `def test_hooks_config_prompt_mode(self)`
+- `def test_hooks_config_custom_threshold(self)`
+- `def test_hooks_config_custom_log_file(self)`
+- `def test_config_loads_hooks_from_yaml(self, tmp_path)`
+- `def test_config_with_disabled_hooks(self, tmp_path)`
+- `def test_config_with_prompt_mode(self, tmp_path)`
+- `def test_valid_modes(self)`
+- `def test_auto_mode_is_default(self)`
+- `def test_max_dirs_sync_default(self)`
+
+_... and 1 more symbols_
+
+### test_hooks_integration.py
+_Integration tests for codeindex post-install hooks.
+
+Epic #25, Story #26: Post-install Hook Implementation
+These tests verify end-to-end behavior of t_
+
+**class** `class TestRealEnvironment`
+> Integration tests with real environment simulation.
+
+**class** `class TestUpgradeScenario`
+> Test upgrade from old version to new version.
+
+**class** `class TestMultipleUpgrades`
+> Test multiple consecutive upgrades.
+
+**Methods:**
+- `def test_hook_updates_claude_md_in_real_environment(self, tmp_path)`
+- `def test_upgrade_from_old_version(self, tmp_path)`
+- `def test_multiple_upgrades_idempotent(self, tmp_path)`
+
+**Functions:**
+- `def cleanup_after_test()`
+
+### test_init_wizard_bdd.py
+_BDD tests for Interactive Setup Wizard (Epic 15 Story 15.1)._
+
+**Functions:**
+- `def wizard_context()`
+- `def project_directory(tmp_path)`
+- `def no_config_exists(project_dir)`
+- `def create_python_files(project_dir, wizard_context)`
+- `def create_python_php_files(project_dir, wizard_context)`
+- `def create_java_spring_files(project_dir, wizard_context)`
+- `def create_src_lib_dirs(project_dir, wizard_context)`
+- `def create_node_git_dirs(project_dir, wizard_context)`
+- `def create_project_with_files(project_dir, wizard_context, file_count, monkeypatch)`
+- `def run_interactive_wizard(wizard_context, project_dir, monkeypatch)`
+- `def select_language(wizard_context, language)`
+- `def accept_default_includes(wizard_context)`
+- `def disable_git_hooks(wizard_context)`
+- `def select_multiple_languages(wizard_context, lang1, lang2)`
+- `def accept_default_patterns(wizard_context)`
+
+_... and 55 more symbols_
+
+### test_integration_swift_objc.py
+_Integration tests for mixed Swift/Objective-C projects (Story 3.5).
+
+This test file validates end-to-end parsing of mixed language projects:
+- Parsing_
+
+**class** `class TestMixedProjectParsing`
+> Test parsing mixed Swift/Objective-C projects.
+
+**class** `class TestFileAssociationAccuracy`
+> Test .h/.m file association accuracy.
+
+**class** `class TestRealisticProjectStructure`
+> Test realistic mixed project structures.
+
+**class** `class TestPerformance`
+> Test parsing performance with realistic file counts.
+
+**class** `class TestEdgeCases`
+> Test edge cases in mixed projects.
+
+**Methods:**
+- `def test_parse_swift_and_objc_together(self, tmp_path)`
+- `def test_bridging_header_with_swift(self, tmp_path)`
+- `def test_high_association_accuracy(self, tmp_path)`
+- `def test_category_association_accuracy(self, tmp_path)`
+- `def test_nested_directory_structure(self, tmp_path)`
+- `def test_multiple_classes_per_file(self, tmp_path)`
+- `def test_parse_many_files_quickly(self, tmp_path)`
+- `def test_mixed_naming_conventions(self, tmp_path)`
+- `def test_empty_implementation_files(self, tmp_path)`
+
+### test_java_annotations.py
+_Tests for Java annotation extraction.
+
+Story 7.1.2.1: Annotation Extraction
+Tests cover class, method, and field annotations with various argument pat_
+
+**class** `class TestClassAnnotations`
+> Test annotation extraction from Java classes.
+
+**class** `class TestMethodAnnotations`
+> Test annotation extraction from Java methods.
+
+**class** `class TestFieldAnnotations`
+> Test annotation extraction from Java fields.
+
+**class** `class TestAnnotationEdgeCases`
+> Test edge cases and special annotation patterns.
+
+**Methods:**
+- `def test_simple_class_annotation(self, tmp_path, )`
+- `def test_class_annotation_with_string_argument(self, tmp_path, )`
+- `def test_multiple_class_annotations(self, tmp_path, )`
+- `def test_simple_method_annotation(self, tmp_path, )`
+- `def test_method_annotation_with_path(self, tmp_path, )`
+- `def test_multiple_method_annotations(self, tmp_path, )`
+- `def test_field_annotation(self, tmp_path, )`
+- `def test_field_annotation_with_arguments(self, tmp_path, )`
+- `def test_annotation_with_array_argument(self, tmp_path, )`
+- `def test_no_annotations(self, tmp_path, )`
+- `def test_annotation_formatting_in_prompt(self, tmp_path, )`
+
+### test_java_calls.py
+_Epic 11 Story 11.2: Java Call Extraction Tests
+
+Test suite for Java call relationship extraction using tree-sitter.
+Following TDD approach: Write test_
+
+**class** `class TestBasicMethodCalls`
+> AC1: Basic Method Calls (6 tests)
+
+**class** `class TestConstructorCalls`
+> AC2: Constructor Calls (5 tests)
+
+**class** `class TestStaticImportResolution`
+> AC3: Static Import Resolution (4 tests)
+
+**Methods:**
+- `def test_instance_method_call(self, tmp_path)`
+- `def test_static_method_call(self, tmp_path)`
+- `def test_method_chaining(self, tmp_path)`
+- `def test_method_call_with_generics(self, tmp_path)`
+- `def test_interface_method_call(self, tmp_path)`
+- `def test_super_method_call(self, tmp_path)`
+- `def test_direct_instantiation(self, tmp_path)`
+- `def test_constructor_with_arguments(self, tmp_path)`
+- `def test_anonymous_class_instantiation(self, tmp_path)`
+- `def test_inner_class_instantiation(self, tmp_path)`
+- `def test_generic_constructor(self, tmp_path)`
+- `def test_static_import_method(self, tmp_path)`
+
+_... and 19 more symbols_
+
+### test_java_edge_cases.py
+_Tests for Java parser edge cases and boundary conditions.
+
+Story 7.1.3.2: Edge Case Tests
+Tests parser robustness with special scenarios:
+- Nested cla_
+
+**class** `class TestNestedClasses`
+> Test nested and inner class declarations.
+
+**class** `class TestComplexGenerics`
+> Test complex generic type combinations.
+
+**class** `class TestLongSignatures`
+> Test very long method signatures.
+
+**class** `class TestSpecialCharacters`
+> Test Unicode and special characters.
+
+**Methods:**
+- `def test_static_nested_class(self, tmp_path)`
+- `def test_inner_class(self, tmp_path)`
+- `def test_multiple_nested_levels(self, tmp_path)`
+- `def test_anonymous_class(self, tmp_path)`
+- `def test_nested_generics(self, tmp_path)`
+- `def test_wildcard_combinations(self, tmp_path)`
+- `def test_multiple_type_parameters_with_bounds(self, tmp_path)`
+- `def test_many_parameters(self, tmp_path)`
+- `def test_long_generic_signature(self, tmp_path)`
+- `def test_unicode_class_name(self, tmp_path)`
+- `def test_dollar_sign_in_name(self, tmp_path)`
+
+_... and 23 more symbols_
+
+### test_java_error_recovery.py
+_Tests for Java parser error recovery capabilities.
+
+Story 7.1.3.3: Error Recovery
+Tests parser's ability to handle and recover from errors:
+- Syntax e_
+
+**class** `class TestSyntaxErrors`
+> Test parser behavior with syntax errors.
+
+**class** `class TestIncompleteDeclarations`
+> Test incomplete or partial declarations.
+
+**class** `class TestMalformedGenerics`
+> Test malformed generic declarations.
+
+**class** `class TestInvalidModifiers`
+> Test invalid modifier combinations.
+
+**class** `class TestPartialFiles`
+> Test partial or cut-off files.
+
+**Methods:**
+- `def test_missing_semicolon(self, tmp_path)`
+- `def test_missing_closing_brace(self, tmp_path)`
+- `def test_unmatched_parentheses(self, tmp_path)`
+- `def test_incomplete_class_declaration(self, tmp_path)`
+- `def test_incomplete_method_signature(self, tmp_path)`
+- `def test_incomplete_generic_declaration(self, tmp_path)`
+- `def test_unmatched_angle_brackets(self, tmp_path)`
+- `def test_invalid_generic_bounds(self, tmp_path)`
+- `def test_conflicting_access_modifiers(self, tmp_path)`
+- `def test_invalid_modifier_order(self, tmp_path)`
+
+_... and 18 more symbols_
+
+### test_java_generic_bounds.py
+_Tests for Java generic bounds parsing.
+
+Story 7.1.2.2: Generic Bounds
+Tests parsing of generic type bounds in Java:
+- Single extends bound: Test single extends bound in generics.
+
+**class** `class TestMultipleBounds`
+> Test multiple bounds with & operator.
+
+**class** `class TestWildcardBounds`
+> Test wildcard bounds (? extends/super).
+
+**class** `class TestNestedGenerics`
+> Test nested generic types with bounds.
+
+**class** `class TestEdgeCases`
+> Test edge cases and boundary conditions.
+
+**Methods:**
+- `def test_class_with_single_extends_bound(self, tmp_path)`
+- `def test_interface_with_extends_bound(self, tmp_path)`
+- `def test_method_with_extends_bound(self, tmp_path)`
+- `def test_class_with_multiple_bounds(self, tmp_path)`
+- `def test_method_with_multiple_bounds(self, tmp_path)`
+- `def test_method_with_extends_wildcard(self, tmp_path)`
+- `def test_method_with_super_wildcard(self, tmp_path)`
+- `def test_return_type_with_wildcard(self, tmp_path)`
+- `def test_nested_generic_bounds(self, tmp_path)`
+- `def test_complex_nested_bounds(self, tmp_path)`
+
+_... and 3 more symbols_
+
+### test_java_inheritance.py
+_Tests for Java inheritance extraction.
+
+Auto-generated from: test_generator/specs/java.yaml
+Generator: test_generator/generator.py
+Template: test_gene_
+
+**class** `class TestBasicInheritance`
+> Test basic Java inheritance extraction.
+
+**class** `class TestGenericTypes`
+> Test inheritance with generic types.
+
+**class** `class TestImportResolution`
+> Test import resolution for fully qualified names.
+
+**Methods:**
+- `def test_single_inheritance_extends(self, tmp_path)`
+- `def test_multiple_interfaces_implements(self, tmp_path)`
+- `def test_extends_and_implements_combined(self, tmp_path)`
+- `def test_interface_extends_interface(self, tmp_path)`
+- `def test_abstract_class_inheritance(self, tmp_path)`
+- `def test_no_inheritance(self, tmp_path)`
+- `def test_generic_single_type_parameter(self, tmp_path)`
+- `def test_generic_multiple_type_parameters(self, tmp_path)`
+- `def test_generic_bounded_type(self, tmp_path)`
+- `def test_generic_in_implements(self, tmp_path)`
+- `def test_import_explicit(self, tmp_path)`
+- `def test_java_lang_implicit_import(self, tmp_path)`
+
+_... and 21 more symbols_
+
+### test_java_lambda.py
+_Tests for Java lambda expressions parsing.
+
+Story 7.1.2.4: Lambda Expressions
+Tests parsing of Java 8+ lambda expressions and method references:
+- Sim_
+
+**class** `class TestSimpleLambda`
+> Test simple lambda expressions.
+
+**class** `class TestLambdaWithParameters`
+> Test lambda expressions with explicit parameters.
+
+**class** `class TestBlockLambda`
+> Test lambda expressions with statement blocks.
+
+**class** `class TestMethodReference`
+> Test method reference expressions.
+
+**class** `class TestLambdaWithStreams`
+> Test lambda expressions with Stream API.
+
+**Methods:**
+- `def test_lambda_in_variable_assignment(self, tmp_path)`
+- `def test_lambda_as_method_parameter(self, tmp_path)`
+- `def test_lambda_with_typed_parameter(self, tmp_path)`
+- `def test_lambda_with_multiple_parameters(self, tmp_path)`
+- `def test_lambda_with_block_body(self, tmp_path)`
+- `def test_lambda_with_multiple_statements(self, tmp_path)`
+- `def test_static_method_reference(self, tmp_path)`
+- `def test_instance_method_reference(self, tmp_path)`
+- `def test_constructor_reference(self, tmp_path)`
+- `def test_lambda_in_stream_filter(self, tmp_path)`
+
+_... and 6 more symbols_
+
+### test_java_lombok.py
+_Tests for Java Lombok annotation support.
+
+Story 7.1.3.4: Lombok Support
+Tests parser's ability to handle Lombok annotations:
+- @Data, @Getter, @Sette_
+
+**class** `class TestBasicLombokAnnotations`
+> Test basic Lombok annotations.
+
+**class** `class TestConstructorAnnotations`
+> Test Lombok constructor annotations.
+
+**class** `class TestBuilderAnnotation`
+> Test @Builder annotation.
+
+**class** `class TestUtilityAnnotations`
+> Test utility Lombok annotations.
+
+**Methods:**
+- `def test_data_annotation(self, tmp_path)`
+- `def test_getter_setter_annotations(self, tmp_path)`
+- `def test_field_level_lombok_annotations(self, tmp_path)`
+- `def test_all_args_constructor(self, tmp_path)`
+- `def test_no_args_constructor(self, tmp_path)`
+- `def test_required_args_constructor(self, tmp_path)`
+- `def test_multiple_constructor_annotations(self, tmp_path)`
+- `def test_builder_annotation(self, tmp_path)`
+- `def test_builder_with_data(self, tmp_path)`
+- `def test_tostring_annotation(self, tmp_path)`
+- `def test_equals_and_hashcode_annotation(self, tmp_path)`
+
+_... and 15 more symbols_
+
+### test_java_module.py
+_Tests for Java module system parsing (Java 9+).
+
+Story 7.1.2.5: Module System
+Tests parsing of Java Platform Module System (JPMS) features:
+- module-i_
+
+**class** `class TestBasicModuleDeclaration`
+> Test basic module declaration.
+
+**class** `class TestRequiresDirective`
+> Test requires directive for module dependencies.
+
+**class** `class TestExportsDirective`
+> Test exports directive for package visibility.
+
+**class** `class TestOpensDirective`
+> Test opens directive for reflection access.
+
+**Methods:**
+- `def test_simple_module_declaration(self, tmp_path)`
+- `def test_module_with_javadoc(self, tmp_path)`
+- `def test_requires_single_module(self, tmp_path)`
+- `def test_requires_multiple_modules(self, tmp_path)`
+- `def test_requires_transitive(self, tmp_path)`
+- `def test_requires_static(self, tmp_path)`
+- `def test_exports_single_package(self, tmp_path)`
+- `def test_exports_multiple_packages(self, tmp_path)`
+- `def test_exports_to_specific_modules(self, tmp_path)`
+- `def test_opens_single_package(self, tmp_path)`
+- `def test_opens_to_specific_modules(self, tmp_path)`
+
+_... and 13 more symbols_
+
+### test_java_parser.py
+_Unit tests for Java parser.
+
+This module tests the Java language parser using tree-sitter-java.
+Tests follow TDD methodology: RED → GREEN → REFACTOR.
+_
+
+**class** `class TestJavaParserBasics`
+> Test basic Java parsing functionality.
+
+**class** `class TestJavaSymbolExtraction`
+> Test Java symbol extraction (classes, methods, fields).
+
+**class** `class TestJavaImports`
+> Test Java import statement extraction.
+
+**Methods:**
+- `def test_java_file_detection(self)`
+- `def test_parser_initialization(self)`
+- `def test_parse_simple_class(self)`
+- `def test_parse_interface(self)`
+- `def test_parse_enum(self)`
+- `def test_parse_syntax_error(self)`
+- `def test_extract_class_name(self)`
+- `def test_extract_methods(self)`
+- `def test_extract_method_signature(self)`
+- `def test_extract_fields(self)`
+- `def test_extract_constructor(self)`
+
+**Functions:**
+- `def load_fixture(filename: str) -> str`
+
+_... and 16 more symbols_
+
+### test_java_spring.py
+_Tests for Java Spring Framework parsing.
+
+Story 7.1.3.1: Spring Test Suite
+Comprehensive tests for Spring Boot project parsing, including:
+- Controlle_
+
+**class** `class TestSpringControllerLayer`
+> Test Spring MVC Controller layer parsing.
+
+**class** `class TestSpringServiceLayer`
+> Test Spring Service layer parsing.
+
+**class** `class TestSpringRepositoryLayer`
+> Test Spring Data JPA Repository parsing.
+
+**class** `class TestSpringEntityLayer`
+> Test JPA Entity parsing.
+
+**Methods:**
+- `def test_rest_controller_annotation(self)`
+- `def test_request_mapping_methods(self)`
+- `def test_parameter_annotations(self)`
+- `def test_service_annotation(self)`
+- `def test_autowired_annotation(self)`
+- `def test_service_methods(self)`
+- `def test_repository_annotation(self)`
+- `def test_query_annotation(self)`
+- `def test_entity_annotations(self)`
+- `def test_jpa_field_annotations(self)`
-**Commit ``**:
+**Functions:**
+- `def load_spring_fixture(filename: str) -> Path`
-Changed files:
-- `test_parser_objc_bridging.py`
+_... and 12 more symbols_
+
+### test_java_throws.py
+_Tests for Java throws declarations parsing.
+
+Story 7.1.2.3: Throws Declarations
+Tests parsing of throws clauses in Java method signatures:
+- Single ex_
+
+**class** `class TestSingleThrows`
+> Test single exception in throws clause.
+
+**class** `class TestMultipleThrows`
+> Test multiple exceptions in throws clause.
+
+**class** `class TestGenericThrows`
+> Test generic exceptions in throws clause.
+
+**class** `class TestThrowsVariants`
+> Test various throws clause variants.
+
+**class** `class TestThrowsWithAnnotations`
+> Test throws clause combined with annotations.
+
+**Methods:**
+- `def test_method_throws_single_exception(self, tmp_path)`
+- `def test_constructor_throws_exception(self, tmp_path)`
+- `def test_interface_method_throws(self, tmp_path)`
+- `def test_method_throws_multiple_exceptions(self, tmp_path)`
+- `def test_throws_with_generic_method(self, tmp_path)`
+- `def test_method_throws_generic_exception(self, tmp_path)`
+- `def test_method_throws_bounded_generic(self, tmp_path)`
+- `def test_throws_with_full_package_name(self, tmp_path)`
+- `def test_method_without_throws(self, tmp_path)`
+- `def test_abstract_method_with_throws(self, tmp_path)`
+
+_... and 4 more symbols_
+
+### test_json_output.py
+_Tests for JSON output serialization._
+
+**class** `class TestSymbolSerialization`
+> Test Symbol.to_dict() method.
+
+**class** `class TestImportSerialization`
+> Test Import.to_dict() method.
+
+**class** `class TestParseResultSerialization`
+> Test ParseResult.to_dict() method.
+
+**class** `class TestJSONCompatibility`
+> Test JSON compatibility.
+
+**Methods:**
+- `def test_symbol_to_dict_basic(self)`
+- `def test_symbol_to_dict_empty_docstring(self)`
+- `def test_symbol_to_dict_method(self)`
+- `def test_import_to_dict_basic(self)`
+- `def test_import_to_dict_no_names(self)`
+- `def test_parse_result_to_dict_basic(self)`
+- `def test_parse_result_to_dict_with_error(self)`
+- `def test_parse_result_to_dict_empty_symbols(self)`
+- `def test_parse_result_to_dict_multiple_symbols(self)`
+- `def test_parse_result_to_dict_path_conversion(self)`
+- `def test_parse_result_to_dict_php_namespace(self)`
+_... and 2 more symbols_
+
+### test_lazy_loading.py
+_Test lazy loading of language parsers.
+
+This test verifies that parsers are only imported when needed,
+not at module load time._
+
+**Functions:**
+- `def test_parser_module_does_not_import_all_languages()`
+- `def test_get_parser_lazy_loads_python_only(tmp_path)`
+- `def test_get_parser_caches_parsers()`
+- `def test_get_parser_unsupported_language()`
+- `def test_parse_file_with_missing_language_dependency(tmp_path)`
+- `def test_multiple_languages_can_be_used_sequentially(tmp_path)`
+
+### test_loomgraph_integration.py
+_Integration tests for LoomGraph compatibility.
+
+Epic 10: LoomGraph Integration - MVP Validation
+Tests that codeindex JSON output meets LoomGraph requi_
+
+**class** `class TestLoomGraphJSONFormat`
+> Test JSON format compatibility with LoomGraph.
+
+**class** `class TestLoomGraphDataMapping`
+> Test data mapping rules from DATA_CONTRACT.md.
+
+**class** `class TestLoomGraphRealWorldExample`
+> Test with realistic code example.
+
+**class** `class TestLoomGraphEdgeCases`
+> Test edge cases that LoomGraph should handle.
+
+**Methods:**
+- `def test_complete_python_example(self, tmp_path)`
+- `def test_json_serializable(self, tmp_path)`
+- `def test_inheritance_extraction(self, tmp_path)`
+- `def test_import_alias_extraction(self, tmp_path)`
+- `def test_nested_class_inheritance(self, tmp_path)`
+- `def test_symbol_to_entity_mapping(self, tmp_path)`
+- `def test_inheritance_to_relation_mapping(self, tmp_path)`
+- `def test_import_to_relation_mapping(self, tmp_path)`
+- `def test_django_like_model(self, tmp_path)`
+- `def test_empty_file(self, tmp_path)`
+- `def test_syntax_error_file(self, tmp_path)`
+
+_... and 2 more symbols_
+
+### test_objc_association_utils.py
+_Test suite for Objective-C file association utilities (Story 3.2).
+
+This test file validates the association and merging utilities:
+- find_objc_pairs(_
+
+**class** `class TestFindObjCPairs`
+> Test finding .h/.m file pairs in directories.
+
+**class** `class TestParseObjCPair`
+> Test parsing associated .h/.m file pairs.
+
+**class** `class TestMergeObjCResults`
+> Test merging header and implementation results.
+
+**Methods:**
+- `def test_find_complete_pairs(self, tmp_path)`
+- `def test_find_header_only(self, tmp_path)`
+- `def test_find_implementation_only(self, tmp_path)`
+- `def test_find_mixed_pairs(self, tmp_path)`
+- `def test_empty_directory(self, tmp_path)`
+- `def test_nonexistent_directory(self)`
+- `def test_parse_complete_pair(self, tmp_path)`
+- `def test_parse_header_only(self, tmp_path)`
+- `def test_parse_implementation_only(self, tmp_path)`
+- `def test_parse_nonexistent_files(self, tmp_path)`
+- `def test_merge_complete_pair(self, tmp_path)`
+- `def test_merge_header_only(self, tmp_path)`
+
+_... and 7 more symbols_
+
+### test_parallel_scan.py
+_Tests for parallel directory scanning.
+
+Story 7.1.4.2: Parallel Directory Scanning
+Verifies that scan-all correctly uses ThreadPoolExecutor to process_
+
+**class** `class TestParallelScanning`
+> Test parallel directory scanning functionality.
+
+**class** `class TestParallelPerformance`
+> Test performance improvements from parallel scanning.
+
+**class** `class TestParallelCorrectness`
+> Test correctness of parallel scanning results.
+
+**class** `class TestParallelEdgeCases`
+> Test edge cases in parallel scanning.
+
+**Methods:**
+- `def test_scan_all_uses_threadpool(self, tmp_path)`
+- `def test_parallel_workers_configuration(self, tmp_path)`
+- `def test_parallel_workers_override(self, tmp_path)`
+- `def test_parallel_faster_than_sequential(self, multi_dir_project)`
+- `def test_all_directories_processed(self, tmp_path)`
+- `def test_parallel_no_race_conditions(self, tmp_path)`
+- `def test_single_directory_still_works(self, tmp_path)`
+- `def test_more_workers_than_directories(self, tmp_path)`
+- `def test_zero_parallel_workers_uses_default(self, tmp_path)`
+
+### test_parser.py
+_Tests for the Python parser._
+
+**Functions:**
+- `def test_parse_simple_function()`
+- `def test_parse_class_with_methods()`
+- `def test_parse_imports()`
+- `def test_parse_module_docstring()`
+- `def test_parse_nonexistent_file()`
+- `def test_parse_php_class_with_inheritance()`
+- `def test_parse_php_method_visibility()`
+- `def test_parse_php_properties()`
+- `def test_parse_php_function()`
+- `def test_parse_php_namespace()`
+- `def test_parse_php_use_statements()`
+- `def test_parse_php_group_use()`
+
+### test_parser_detection.py
+_Tests for parser installation detection (Epic 19 Story 19.4).
+
+Checks that the init wizard detects installed/missing tree-sitter parsers
+and provides _
+
+**class** `class TestCheckParserInstalled`
+> Tests for check_parser_installed function.
+
+**class** `class TestParserInstallGuidance`
+> Tests for parser installation guidance.
+
+**class** `class TestInitWizardPostMessage`
+> Tests for updated post-init messages (Story 19.2).
+
+**Methods:**
+- `def test_python_parser_installed(self)`
+- `def test_java_parser_installed(self)`
+- `def test_php_parser_installed(self)`
+- `def test_unknown_language_not_installed(self)`
+- `def test_unsupported_language_not_installed(self)`
+- `def test_all_parsers_installed_no_missing(self)`
+- `def test_missing_parser_shows_install_command(self)`
+- `def test_empty_languages_no_guidance(self)`
+- `def test_post_init_suggests_scan_all(self, tmp_path)`
+- `def test_post_init_mentions_review_config(self, tmp_path)`
+- `def test_generated_config_no_ai_command(self, tmp_path)`
+
+### test_parser_objc_association.py
+_Test suite for Objective-C header/implementation file association (Story 3.2).
+
+This test file validates .h/.m file pairing and symbol merging:
+- Same_
+
+**class** `class TestBasicAssociation`
+> Test basic .h/.m file association.
+
+**class** `class TestSymbolMerging`
+> Test symbol merging between .h and .m files.
+
+**class** `class TestMissingPairs`
+> Test handling of missing .h or .m files.
+
+**class** `class TestEdgeCases`
+> Test edge cases for file association.
+
+**Methods:**
+- `def test_parse_header_file_alone(self, tmp_path)`
+- `def test_parse_implementation_file_alone(self, tmp_path)`
+- `def test_header_and_implementation_separate_parsing(self, tmp_path)`
+- `def test_method_declarations_vs_definitions(self, tmp_path)`
+- `def test_properties_in_header_only(self, tmp_path)`
+- `def test_header_without_implementation(self, tmp_path)`
+- `def test_implementation_without_header(self, tmp_path)`
+- `def test_multiple_classes_in_same_file(self, tmp_path)`
+- `def test_different_class_names_in_h_and_m(self, tmp_path)`
+- `def test_import_statement_in_implementation(self, tmp_path)`
+
+### test_parser_objc_basic.py
+_Test suite for Objective-C parser basic functionality (Story 3.1).
+
+This test file validates Objective-C parser infrastructure:
+- @interface declarati_
+
+**class** `class TestInterfaceDeclarations`
+> Test @interface declaration parsing.
+
+**class** `class TestImplementationParsing`
+> Test @implementation parsing.
+
+**class** `class TestImportStatements`
+> Test import/include statement parsing.
+
+**class** `class TestInheritanceExtraction`
+> Test inheritance relationship extraction.
+
+**class** `class TestEdgeCases`
+> Test edge cases and error handling.
+
+**Methods:**
+- `def test_simple_interface(self, tmp_path)`
+- `def test_interface_with_properties(self, tmp_path)`
+- `def test_interface_with_instance_methods(self, tmp_path)`
+- `def test_interface_with_class_methods(self, tmp_path)`
+- `def test_simple_implementation(self, tmp_path)`
+- `def test_implementation_with_methods(self, tmp_path)`
+- `def test_implementation_with_class_methods(self, tmp_path)`
+- `def test_import_foundation(self, tmp_path)`
+- `def test_simple_inheritance(self, tmp_path)`
+- `def test_protocol_conformance(self, tmp_path)`
+
+_... and 3 more symbols_
+
+### test_parser_objc_bridging.py
+_Test suite for Objective-C bridging header detection (Story 3.4).
-**Commit ``**:
+This test file validates bridging header handling for Swift/Objective-C interop:
+- _
-Changed files:
-- `test_integration_swift_objc.py`
+**class** `class TestBridgingHeaderDetection`
+> Test bridging header file detection.
+**class** `class TestBridgingHeaderImports`
+> Test import extraction from bridging headers.
-**Commit ``**:
+**class** `class TestBridgingHeaderClasses`
+> Test class exposure in bridging headers.
-Changed files:
-- `test_parser_swift_protocols.py`
-
-
-**Commit ``**:
-
-Changed files:
-- `test_cli_debt_scan.py`
-- `test_test_smells.py`
-
-
-**Commit ``**:
-
-Changed files:
-- `test_cli_debt_scan.py`
-
-
-**Commit ``**:
-
-Changed files:
-- `test_hooks.py`
-- `test_hooks_integration.py`
-
-
-**Commit ``**:
-
-Changed files:
-- `test_hooks_integration.py`
+**class** `class TestBridgingHeaderEdgeCases`
+> Test edge cases for bridging headers.
+
+**Methods:**
+- `def test_detect_bridging_header_by_filename(self, tmp_path)`
+- `def test_project_bridging_header_pattern(self, tmp_path)`
+- `def test_regular_header_not_bridging(self, tmp_path)`
+- `def test_extract_multiple_imports(self, tmp_path)`
+- `def test_framework_import
---
-
-## Recent Changes
-
-**Commit ``**:
-
-Changed files:
-- `test_skill_helpers.py`
+_Content truncated due to size limit. See individual module README files for details._
diff --git a/tests/test_enricher.py b/tests/test_enricher.py
new file mode 100644
index 0000000..98cac7b
--- /dev/null
+++ b/tests/test_enricher.py
@@ -0,0 +1,208 @@
+"""Tests for AI enrichment module (Epic 25, Story 25.2).
+
+Tests the prompt construction and blockquote injection logic.
+AI invocation is mocked — we only test the structural parts.
+"""
+
+from pathlib import Path
+
+from codeindex.enricher import (
+    build_enrich_prompt,
+    extract_summary_from_readme,
+    extract_symbol_summary,
+    inject_blockquote,
+    should_enrich,
+)
+
+
+class TestExtractSymbolSummary:
+    """Extract symbol names + file names for AI prompt input."""
+
+    def test_extracts_from_parse_results(self):
+        from codeindex.parser import ParseResult, Symbol
+
+        results = [
+            ParseResult(
+                path=Path("ImageController.php"),
+                symbols=[
+                    Symbol(name="ImageController", kind="class", signature="class ImageController", line_start=1),
+                    Symbol(
+                        name="uploadAvatar", kind="method",
+                        signature="public function uploadAvatar()", line_start=10,
+                    ),
+                    Symbol(name="reason_img", kind="method", signature="public function reason_img()", line_start=20),
+                ],
+                imports=[],
+            ),
+            ParseResult(
+                path=Path("UserController.php"),
+                symbols=[
+                    Symbol(name="UserController", kind="class", signature="class UserController", line_start=1),
+                    Symbol(name="login", kind="method", signature="public function login()", line_start=5),
+                ],
+                imports=[],
+            ),
+        ]
+        summary = extract_symbol_summary(results)
+        assert "ImageController.php" in summary
+        assert "uploadAvatar" in summary
+        assert "UserController.php" in summary
+        assert "login" in summary
+
+    def test_empty_results(self):
+        summary = extract_symbol_summary([])
+        assert summary == ""
+
+    def test_limits_symbols_per_file(self):
+        """Should not include all symbols from huge files."""
+        from codeindex.parser import ParseResult, Symbol
+
+        symbols = [
+            Symbol(name=f"method_{i}", kind="method", signature=f"method_{i}()", line_start=i)
+            for i in range(100)
+        ]
+        results = [ParseResult(path=Path("Big.php"), symbols=symbols, imports=[])]
+        summary = extract_symbol_summary(results)
+        # Should be reasonably bounded, not 100 method names
+        assert summary.count("method_") <= 20
+
+
+class TestExtractSummaryFromReadme:
+    """Extract summary from existing README_AI.md files."""
+
+    def test_extracts_subdirectories(self, tmp_path):
+        readme = tmp_path / "README_AI.md"
+        readme.write_text(
+            "# App\n\n## Subdirectories\n"
+            "- **Pay/** - 34 files | 448 symbols\n"
+            "- **Vip/** - 会员管理 | 48 files\n"
+        )
+        summary = extract_summary_from_readme(readme)
+        assert "Pay" in summary
+        assert "Vip" in summary
+
+    def test_extracts_file_symbols(self, tmp_path):
+        readme = tmp_path / "README_AI.md"
+        readme.write_text(
+            "# Mod\n\n## Files\n"
+            "- **Pay.php** - Pay, placeOrder, refund\n"
+            "- **User.php** - User, login\n"
+        )
+        summary = extract_summary_from_readme(readme)
+        assert "Pay.php" in summary
+        assert "placeOrder" in summary
+
+    def test_missing_file_returns_empty(self, tmp_path):
+        summary = extract_summary_from_readme(tmp_path / "nonexistent.md")
+        assert summary == ""
+
+    def test_limits_entries(self, tmp_path):
+        readme = tmp_path / "README_AI.md"
+        lines = ["# Big\n\n## Files\n"]
+        for i in range(50):
+            lines.append(f"- **File{i}.php** - Class{i}, method{i}\n")
+        readme.write_text("".join(lines))
+        summary = extract_summary_from_readme(readme)
+        # Should be bounded
+        assert summary.count("File") <= 20
+
+
+class TestBuildEnrichPrompt:
+    """Build the minimal prompt for AI one-line description."""
+
+    def test_includes_dir_name(self):
+        prompt = build_enrich_prompt("SmallProgramApi", "ImageController.php: uploadAvatar, login")
+        assert "SmallProgramApi" in prompt
+
+    def test_includes_symbol_summary(self):
+        prompt = build_enrich_prompt("Pay", "Alipay.php: placeOrder; WechatPay.php: placeOrder")
+        assert "placeOrder" in prompt
+
+    def test_constrains_output_length(self):
+        """Prompt should instruct AI to keep description short."""
+        prompt = build_enrich_prompt("Vip", "CardBag, Integral, Membership")
+        assert "30" in prompt or "concise" in prompt.lower()
+
+    def test_includes_parent_name(self):
+        prompt = build_enrich_prompt("Pay", "Alipay, WechatPay", parent_name="Application")
+        assert "Application" in prompt
+
+    def test_anti_hallucination_instruction(self):
+        prompt = build_enrich_prompt("Mod", "file1, file2")
+        assert "NOT" in prompt or "ONLY" in prompt
+
+
+class TestInjectBlockquote:
+    """Inject blockquote description into existing README_AI.md."""
+
+    def test_inject_after_title(self, tmp_path):
+        readme = tmp_path / "README_AI.md"
+        readme.write_text(
+            "\n"
+            "\n"
+            "# Vip\n"
+            "\n"
+            "## Overview\n"
+            "- **Files**: 48\n"
+        )
+        inject_blockquote(readme, "会员等级管理、积分兑换、权益卡券")
+        content = readme.read_text()
+        assert "> 会员等级管理、积分兑换、权益卡券\n" in content
+        # Title should still be there
+        assert "# Vip\n" in content
+        # Overview should still be there
+        assert "## Overview" in content
+
+    def test_replace_existing_blockquote(self, tmp_path):
+        readme = tmp_path / "README_AI.md"
+        readme.write_text(
+            "# Vip\n"
+            "> 旧描述\n"
+            "\n"
+            "## Overview\n"
+        )
+        inject_blockquote(readme, "新描述")
+        content = readme.read_text()
+        assert "> 新描述\n" in content
+        assert "旧描述" not in content
+
+    def test_no_title_appends_at_top(self, tmp_path):
+        readme = tmp_path / "README_AI.md"
+        readme.write_text("## Overview\n- **Files**: 5\n")
+        inject_blockquote(readme, "描述")
+        content = readme.read_text()
+        assert "> 描述\n" in content
+
+    def test_preserves_rest_of_content(self, tmp_path):
+        readme = tmp_path / "README_AI.md"
+        original = (
+            "\n"
+            "\n"
+            "# Pay\n"
+            "\n"
+            "## Overview\n"
+            "- **Files**: 34\n"
+            "- **Symbols**: 448\n"
+            "\n"
+            "## Subdirectories\n"
+            "- **Business/** - 10 files\n"
+        )
+        readme.write_text(original)
+        inject_blockquote(readme, "支付网关(微信、支付宝、云支付)")
+        content = readme.read_text()
+        assert "## Subdirectories" in content
+        assert "**Business/**" in content
+        assert "**Files**: 34" in content
+
+
+class TestShouldEnrich:
+    """Determine if a directory should get AI enrichment."""
+
+    def test_overview_level_should_enrich(self):
+        assert should_enrich("overview") is True
+
+    def test_navigation_level_should_enrich(self):
+        assert should_enrich("navigation") is True
+
+    def test_detailed_level_should_not_enrich(self):
+        assert should_enrich("detailed") is False
diff --git a/tests/test_enricher_integration.py b/tests/test_enricher_integration.py
new file mode 100644
index 0000000..cdbcb11
--- /dev/null
+++ b/tests/test_enricher_integration.py
@@ -0,0 +1,96 @@
+"""Integration tests for AI enrichment in scan-all pipeline (Epic 25, Story 25.3).
+
+Tests the end-to-end flow: scan-all --ai generates structural READMEs
+then enriches non-leaf directories with AI-generated blockquote descriptions.
+"""
+
+from pathlib import Path
+
+from codeindex.enricher import (
+    build_enrich_prompt,
+    extract_symbol_summary,
+    inject_blockquote,
+    should_enrich,
+)
+
+
+class TestEnrichmentIntegration:
+    """Test the enrichment flow as it would work in scan-all --ai."""
+
+    def test_full_enrich_flow(self, tmp_path):
+        """Simulate: structural generation → AI enrich → parent reads description."""
+        # Step 1: Simulate structural README_AI.md (as SmartWriter would generate)
+        module_dir = tmp_path / "Vip"
+        module_dir.mkdir()
+        readme = module_dir / "README_AI.md"
+        readme.write_text(
+            "\n"
+            "\n"
+            "# Vip\n"
+            "\n"
+            "## Overview\n"
+            "- **Files**: 48\n"
+            "- **Symbols**: 386\n"
+            "\n"
+            "## Subdirectories\n"
+            "- **Controller/** - 12 files\n"
+        )
+
+        # Step 2: Simulate AI enrichment (inject blockquote)
+        inject_blockquote(readme, "会员等级管理、积分兑换、权益卡券")
+
+        # Step 3: Verify parent can read the description
+        from codeindex.writers.utils import extract_module_description
+
+        desc = extract_module_description(module_dir)
+        assert desc == "会员等级管理、积分兑换、权益卡券"
+
+        # Step 4: Verify structural content is preserved
+        content = readme.read_text()
+        assert "## Overview" in content
+        assert "**Files**: 48" in content
+        assert "## Subdirectories" in content
+
+    def test_enrich_only_non_leaf_directories(self):
+        """Enrichment should only apply to overview and navigation levels."""
+        assert should_enrich("overview") is True
+        assert should_enrich("navigation") is True
+        assert should_enrich("detailed") is False
+
+    def test_enrich_prompt_from_parse_results(self):
+        """Full flow: ParseResult → symbol summary → prompt."""
+        from codeindex.parser import ParseResult, Symbol
+
+        results = [
+            ParseResult(
+                path=Path("Pay.php"),
+                symbols=[
+                    Symbol(name="Pay", kind="class", signature="class Pay", line_start=1),
+                    Symbol(
+                        name="placeOrder", kind="method",
+                        signature="public function placeOrder()", line_start=10,
+                    ),
+                ],
+                imports=[],
+            ),
+        ]
+
+        summary = extract_symbol_summary(results)
+        prompt = build_enrich_prompt("Pay", summary)
+
+        assert "Pay" in prompt
+        assert "placeOrder" in prompt
+        # Prompt should ask for concise description
+        assert "30" in prompt or "concise" in prompt.lower()
+
+    def test_re_enrich_updates_description(self, tmp_path):
+        """Running enrich twice should update, not duplicate."""
+        readme = tmp_path / "README_AI.md"
+        readme.write_text("# Mod\n> 旧描述\n\n## Overview\n")
+
+        inject_blockquote(readme, "新描述")
+        content = readme.read_text()
+
+        assert content.count(">") == 1
+        assert "> 新描述" in content
+        assert "旧描述" not in content
diff --git a/tests/test_hook_post_commit.py b/tests/test_hook_post_commit.py
new file mode 100644
index 0000000..c8131bc
--- /dev/null
+++ b/tests/test_hook_post_commit.py
@@ -0,0 +1,108 @@
+"""Tests for post-commit hook: thin wrapper + Python logic (Epic 25).
+
+The post-commit hook should:
+1. Be a thin shell wrapper calling `codeindex hooks run post-commit`
+2. Python logic handles: affected dirs → codeindex scan → auto-commit
+3. No custom AI prompts, no git diff injection
+"""
+
+from unittest.mock import MagicMock, patch
+
+from codeindex.cli_hooks import (
+    _generate_post_commit_script,
+    generate_hook_script,
+    run_post_commit_hook,
+)
+
+
+class TestThinWrapperScript:
+    """The generated hook script should be a thin wrapper."""
+
+    def test_calls_codeindex_hooks_run(self):
+        """Hook script delegates to `codeindex hooks run post-commit`."""
+        script = _generate_post_commit_script({})
+        assert "codeindex hooks run post-commit" in script
+
+    def test_no_custom_ai_prompt(self):
+        """Hook script must not contain custom AI prompts."""
+        script = _generate_post_commit_script({})
+        assert "PROMPT" not in script
+        assert "Code Diff" not in script
+        assert "git diff HEAD" not in script
+
+    def test_still_has_loop_guard(self):
+        """Hook script still guards against infinite commit loops."""
+        script = _generate_post_commit_script({})
+        assert "README_AI.md" in script
+
+    def test_disabled_config(self):
+        """Disabled config generates exit-only script."""
+        script = _generate_post_commit_script({"auto_update": False})
+        assert "exit 0" in script
+
+    def test_has_codeindex_marker(self):
+        """Generated script contains codeindex marker for management."""
+        script = generate_hook_script("post-commit")
+        assert "codeindex-managed hook" in script
+
+
+class TestRunPostCommitHook:
+    """Python-side post-commit logic."""
+
+    @patch("codeindex.cli_hooks.subprocess.run")
+    def test_skips_when_no_affected_dirs(self, mock_run):
+        """No affected dirs → no scan, no commit."""
+        mock_run.return_value = MagicMock(
+            returncode=0,
+            stdout='{"level": "skip", "affected_dirs": []}',
+        )
+        result = run_post_commit_hook()
+        assert result == 0
+
+    @patch("codeindex.cli_hooks.Path.cwd")
+    @patch("codeindex.cli_hooks.subprocess.run")
+    def test_scans_affected_directories(self, mock_run, mock_cwd, tmp_path):
+        """Affected dirs → codeindex scan for each."""
+        # Create the README_AI.md so the check passes
+        auth_dir = tmp_path / "src" / "auth"
+        auth_dir.mkdir(parents=True)
+        (auth_dir / "README_AI.md").write_text("# Auth\n")
+        mock_cwd.return_value = tmp_path
+
+        diff_mock = MagicMock(returncode=1, stdout="")  # has changes
+        mock_run.side_effect = [
+            MagicMock(
+                returncode=0,
+                stdout='{"level": "minor", "affected_dirs": ["src/auth"]}',
+            ),
+            MagicMock(returncode=0, stdout=""),  # scan
+            MagicMock(returncode=0, stdout=""),  # git add
+            diff_mock,  # git diff --cached --quiet (1 = has changes)
+            MagicMock(returncode=0, stdout="abc123"),  # git rev-parse
+            MagicMock(returncode=0, stdout=""),  # git commit
+        ]
+
+        run_post_commit_hook()
+
+        # Should have called codeindex scan for the affected dir
+        scan_calls = [
+            c for c in mock_run.call_args_list
+            if "scan" in str(c)
+        ]
+        assert len(scan_calls) >= 1
+
+    @patch("codeindex.cli_hooks.subprocess.run")
+    def test_no_ai_prompt_in_scan(self, mock_run):
+        """Scan should use `codeindex scan`, not custom AI prompts."""
+        mock_run.return_value = MagicMock(
+            returncode=0,
+            stdout='{"level": "minor", "affected_dirs": ["src/mod"]}',
+        )
+
+        run_post_commit_hook()
+
+        # Check no call contains AI prompt keywords
+        for call in mock_run.call_args_list:
+            cmd = str(call)
+            assert "PROMPT" not in cmd
+            assert "Code Diff" not in cmd
diff --git a/tests/test_scanall_auto_ai.py b/tests/test_scanall_auto_ai.py
new file mode 100644
index 0000000..4d1dd5a
--- /dev/null
+++ b/tests/test_scanall_auto_ai.py
@@ -0,0 +1,165 @@
+"""Tests for scan-all auto-AI enrichment behavior (Epic 25).
+
+When ai_command is configured in .codeindex.yaml, scan-all should
+automatically enable Phase 2 AI enrichment. --no-ai disables it.
+"""
+
+from unittest.mock import patch
+
+import pytest
+from click.testing import CliRunner
+
+from codeindex.cli import main
+
+
+@pytest.fixture
+def cli_runner():
+    return CliRunner()
+
+
+@pytest.fixture
+def project_with_ai_command(tmp_path):
+    """Project with ai_command configured."""
+    config = tmp_path / ".codeindex.yaml"
+    config.write_text(
+        "codeindex: 1\n"
+        "ai_command: 'echo test'\n"
+        "languages:\n - python\n"
+        "include:\n - src/\n"
+    )
+    src = tmp_path / "src"
+    src.mkdir()
+    (src / "app.py").write_text("def main(): pass\n")
+    return tmp_path
+
+
+@pytest.fixture
+def project_without_ai_command(tmp_path):
+    """Project without ai_command configured."""
+    config = tmp_path / ".codeindex.yaml"
+    config.write_text(
+        "codeindex: 1\n"
+        "languages:\n - python\n"
+        "include:\n - src/\n"
+    )
+    src = tmp_path / "src"
+    src.mkdir()
+    (src / "app.py").write_text("def main(): pass\n")
+    return tmp_path
+
+
+class TestScanAllAutoAI:
+ """scan-all auto-detects ai_command and enables Phase 2."""
+
+ @patch("codeindex.cli_scan._enrich_directories_with_ai")
+ @patch("codeindex.cli_scan._process_directory_with_smartwriter")
+ def test_auto_enables_ai_when_ai_command_configured(
+ self, mock_smartwriter, mock_enrich, cli_runner, project_with_ai_command
+ ):
+ """When ai_command is in config, Phase 2 runs automatically."""
+ mock_smartwriter.return_value = (
+ project_with_ai_command / "src", True, "ok", 100
+ )
+ mock_enrich.return_value = None
+
+ result = cli_runner.invoke(
+ main, ["scan-all", "--root", str(project_with_ai_command)]
+ )
+
+ assert result.exit_code == 0
+ mock_enrich.assert_called_once()
+
+    @patch("codeindex.cli_scan._enrich_directories_with_ai")
+    @patch("codeindex.cli_scan._process_directory_with_smartwriter")
+    def test_no_ai_flag_disables_auto_enrichment(
+        self, mock_smartwriter, mock_enrich, cli_runner, project_with_ai_command
+    ):
+        """--no-ai explicitly disables Phase 2 even with ai_command."""
+        mock_smartwriter.return_value = (
+            project_with_ai_command / "src", True, "ok", 100
+        )
+
+        result = cli_runner.invoke(
+            main, ["scan-all", "--root", str(project_with_ai_command), "--no-ai"]
+        )
+
+        assert result.exit_code == 0
+        mock_enrich.assert_not_called()
+
+    @patch("codeindex.cli_scan._enrich_directories_with_ai")
+    @patch("codeindex.cli_scan._process_directory_with_smartwriter")
+    def test_no_enrichment_without_ai_command(
+        self, mock_smartwriter, mock_enrich, cli_runner, project_without_ai_command
+    ):
+        """Without ai_command in config, Phase 2 does not run."""
+        mock_smartwriter.return_value = (
+            project_without_ai_command / "src", True, "ok", 100
+        )
+
+        result = cli_runner.invoke(
+            main, ["scan-all", "--root", str(project_without_ai_command)]
+        )
+
+        assert result.exit_code == 0
+        mock_enrich.assert_not_called()
+
+    @patch("codeindex.cli_scan._enrich_directories_with_ai")
+    @patch("codeindex.cli_scan._process_directory_with_smartwriter")
+    def test_explicit_ai_flag_still_works(
+        self, mock_smartwriter, mock_enrich, cli_runner, project_with_ai_command
+    ):
+        """--ai flag is backward compatible (same as auto-detect)."""
+        mock_smartwriter.return_value = (
+            project_with_ai_command / "src", True, "ok", 100
+        )
+        mock_enrich.return_value = None
+
+        result = cli_runner.invoke(
+            main, ["scan-all", "--root", str(project_with_ai_command), "--ai"]
+        )
+
+        assert result.exit_code == 0
+        mock_enrich.assert_called_once()
+
+    def test_explicit_ai_without_ai_command_errors(
+        self, cli_runner, project_without_ai_command
+    ):
+        """--ai without ai_command still gives a clear error."""
+        result = cli_runner.invoke(
+            main, ["scan-all", "--root", str(project_without_ai_command), "--ai"]
+        )
+
+        assert result.exit_code != 0
+        assert "ai_command" in result.output
+
+    @patch("codeindex.cli_scan._enrich_directories_with_ai")
+    @patch("codeindex.cli_scan._process_directory_with_smartwriter")
+    def test_ai_and_no_ai_mutually_exclusive(
+        self, mock_smartwriter, mock_enrich, cli_runner, project_with_ai_command
+    ):
+        """--ai and --no-ai together should error."""
+        result = cli_runner.invoke(
+            main, ["scan-all", "--root", str(project_with_ai_command),
+                   "--ai", "--no-ai"]
+        )
+
+        assert result.exit_code != 0
+        assert "mutually exclusive" in result.output.lower() or "conflict" in result.output.lower()
+
+    @patch("codeindex.cli_scan._enrich_directories_with_ai")
+    @patch("codeindex.cli_scan._process_directory_with_smartwriter")
+    def test_auto_ai_shows_info_message(
+        self, mock_smartwriter, mock_enrich, cli_runner, project_with_ai_command
+    ):
+        """When auto-detecting AI, show an informational message."""
+        mock_smartwriter.return_value = (
+            project_with_ai_command / "src", True, "ok", 100
+        )
+        mock_enrich.return_value = None
+
+        result = cli_runner.invoke(
+            main, ["scan-all", "--root", str(project_with_ai_command)]
+        )
+
+        assert result.exit_code == 0
+        assert "--no-ai" in result.output  # Should mention how to disable
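Review note: taken together, the tests in `TestScanAllAutoAI` pin down a small decision table for Phase 2 (auto-enable when `ai_command` is set, `--no-ai` wins over auto-detect, `--ai` without `ai_command` is an error, and the two flags conflict). A minimal sketch of that table follows. The helper name `resolve_ai_enrichment` and its error messages are hypothetical and are not taken from `codeindex`; the real CLI presumably implements this inside its click command.

```python
from typing import Optional


def resolve_ai_enrichment(ai: bool, no_ai: bool, ai_command: Optional[str]) -> bool:
    """Hypothetical helper: decide whether Phase 2 AI enrichment should run.

    ai / no_ai mirror the --ai / --no-ai flags; ai_command mirrors the
    ai_command value from .codeindex.yaml (None or empty when not configured).
    """
    if ai and no_ai:
        # Both flags given: report a conflict rather than guessing.
        raise ValueError("--ai and --no-ai are mutually exclusive")
    if no_ai:
        # Explicit opt-out wins over auto-detection.
        return False
    if ai and not ai_command:
        # Explicit opt-in needs something to run.
        raise ValueError("--ai requires ai_command in .codeindex.yaml")
    # No flags (or --ai with a command): enable only if ai_command is set.
    return bool(ai_command)
```

Keeping the decision in one pure function like this would let the flag matrix be tested without `CliRunner` or mocks, though the tests above deliberately exercise the full CLI path instead.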
diff --git a/tests/writers/test_utils.py b/tests/writers/test_utils.py
index 7702fa9..419f0b2 100644
--- a/tests/writers/test_utils.py
+++ b/tests/writers/test_utils.py
@@ -92,6 +92,56 @@ def test_custom_output_file(self, tmp_path):
         desc = extract_module_description(tmp_path, output_file="INDEX.md")
         assert "1 files" in desc
+    def test_blockquote_description_highest_priority(self, tmp_path):
+        """Blockquote AI description should take priority over stats."""
+        (tmp_path / "README_AI.md").write_text(
+            "# Vip\n"
+            "> 会员等级管理、积分兑换、权益卡券\n"  # membership tiers, points redemption, benefit vouchers
+            "\n"
+            "## Overview\n"
+            "- **Files**: 48\n"
+            "- **Symbols**: 386\n"
+        )
+        desc = extract_module_description(tmp_path)
+        assert desc == "会员等级管理、积分兑换、权益卡券"
+
+    def test_blockquote_with_comment_header(self, tmp_path):
+        """Blockquote should work even with codeindex comment header."""
+        (tmp_path / "README_AI.md").write_text(
+            "\n"
+            "\n"
+            "# SmallProgramApi\n"
+            "> 小程序端API(用户登录、头像上传、商品浏览)\n"  # mini-program API (login, avatar upload, browsing)
+            "\n"
+            "## Overview\n"
+            "- **Files**: 124\n"
+        )
+        desc = extract_module_description(tmp_path)
+        assert desc == "小程序端API(用户登录、头像上传、商品浏览)"
+
+    def test_blockquote_not_present_falls_through(self, tmp_path):
+        """Without blockquote, existing strategies should still work."""
+        (tmp_path / "README_AI.md").write_text(
+            "# mod\n"
+            "## Overview\n"
+            "- **Files**: 3\n"
+            "- **Symbols**: 15\n"
+        )
+        desc = extract_module_description(tmp_path)
+        assert "3 files" in desc
+        assert "15 symbols" in desc
+
+    def test_blockquote_empty_ignored(self, tmp_path):
+        """Empty blockquote should be skipped."""
+        (tmp_path / "README_AI.md").write_text(
+            "# mod\n"
+            "> \n"
+            "## Overview\n"
+            "- **Files**: 5\n"
+        )
+        desc = extract_module_description(tmp_path)
+        assert "5 files" in desc
+
# --- collect_top_symbols ---