[BUG] Compilation extremely slow on projects with apm_modules dependencies #154

**Describe the bug**

`apm compile` takes a very long time (about 18 minutes in the run logged below) on projects that have APM dependencies installed under `apm_modules/`. The main cause appears to be that `apm_modules/` is not excluded from the directory scanning phase; an expensive pattern-matching loop then amplifies the cost.

**To Reproduce**

1. Create a project that depends on a large APM package (e.g., a team package that transitively pulls in many skills)
2. Run `apm install` to populate `apm_modules/`
3. Run `apm compile --verbose`
4. Observe that compilation takes minutes rather than seconds (about 18 minutes in the log below)

**Expected behavior**

Compilation should complete in seconds. The `apm_modules/` directory contains installed dependency source and should ideally be excluded from filesystem scanning during compilation, similar to how `node_modules/` is already excluded.

**Environment**
- OS: macOS
- Python Version: 3.13.11
- APM Version: 0.7.4
- VSCode Version: N/A (CLI only)

**Logs**

```
⚙️ Starting context compilation...
Compiling for AGENTS.md (VSCode/Copilot) - detected .github/ folder
Verbose mode: showing source attribution and optimizer analysis
⏱️  📊 Project Analysis: 8.6ms
⏱️  🎯 Instruction Processing: 821231.2ms
Analyzing project structure...
├─ 90 directories scanned (max depth: 6)
├─ 369 files analyzed across 18 file types
└─ 10 instruction patterns detected
...
Generated 1 AGENTS.md file
┌─ Context efficiency:    71.0%
└─ Generation time:       1095937ms

Placement Distribution
└─ .                              10 instructions from 10 sources
✅ Compilation completed successfully!
```

Project Analysis completes in ~9 ms, but Instruction Processing takes ~821 seconds (about 13.7 minutes). Total generation time is ~1,096 seconds (about 18 minutes). The bottleneck appears to be the pattern-matching phase working through the contents of `apm_modules/`.

**Additional context**

Some investigation into `src/apm_cli/compilation/context_optimizer.py` surfaced a few things that seem to be contributing — sharing in case it's helpful:

**1. `apm_modules/` not in default exclusion list**

`_analyze_project_structure()` hardcodes exclusions for `node_modules`, `__pycache__`, `.git`, `dist`, `build` — but not `apm_modules/`. When a project has large transitive dependencies (e.g., a team package pulling in multiple squads and skills), this adds hundreds of extra directories to the scan.

**2. O(n×m×k) pattern matching in `_find_matching_directories()`**

For each instruction pattern (n), the method iterates over every cached directory (m) and then every file in each directory (k), calling `_file_matches_pattern()` once per file. With the many extra directories contributed by `apm_modules/`, the number of calls grows quickly.
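
One possible way to avoid the nested loop, sketched under the assumption that each pattern's glob results are already available as a list of path strings (as item 3 suggests): group the matched files by parent directory in a single pass. The function name `group_matches_by_directory` is hypothetical and not part of APM.

```python
from collections import defaultdict
from pathlib import Path
from typing import Dict, Iterable


def group_matches_by_directory(matched_files: Iterable[str]) -> Dict[Path, int]:
    """Count matched files per parent directory in one pass.

    Hypothetical helper: `matched_files` stands in for the cached glob
    results of a single instruction pattern.
    """
    counts: Dict[Path, int] = defaultdict(int)
    for file_str in matched_files:
        # Each matched file is visited once; no per-directory rescan is needed.
        counts[Path(file_str).parent] += 1
    return dict(counts)


# The keys are the directories that contain at least one match for the pattern.
matches = group_matches_by_directory(["src/a.py", "src/b.py", "docs/c.py"])
```

With this shape the work per pattern is proportional to the number of matched files, not to directories × files, so extra directories that contain no matches cost nothing.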

**3. `Set[Path]` recreation on every match check**

`_file_matches_pattern()` converts cached glob results from `List[str]` to `Set[Path]` on every call rather than caching the converted set. This creates and discards large sets tens of thousands of times.
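
A minimal sketch of caching the converted set, assuming the glob cache maps each pattern to a `List[str]` as described above. `GlobSetCache` and its `conversions` counter are hypothetical names used only for illustration; the real method's signature may differ.

```python
from pathlib import Path
from typing import Dict, FrozenSet, List


class GlobSetCache:
    """Hypothetical helper: converts each pattern's glob results to a set once."""

    def __init__(self, glob_results: Dict[str, List[str]]) -> None:
        self._glob_results = glob_results  # pattern -> matched path strings
        self._sets: Dict[str, FrozenSet[Path]] = {}
        self.conversions = 0  # counts list-to-set conversions, for illustration

    def matches(self, file_path: Path, pattern: str) -> bool:
        cached = self._sets.get(pattern)
        if cached is None:
            # Build the set only on the first lookup for this pattern.
            cached = frozenset(Path(p) for p in self._glob_results.get(pattern, []))
            self._sets[pattern] = cached
            self.conversions += 1
        return file_path in cached


cache = GlobSetCache({"**/*.py": ["src/a.py", "src/b.py"]})
first = cache.matches(Path("src/a.py"), "**/*.py")
second = cache.matches(Path("src/c.py"), "**/*.py")
```

After the first lookup, each subsequent check is a single set membership test, regardless of how many times the method is called.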

**4. `os.walk` doesn't prune hardcoded exclusions**

`_analyze_project_structure()` uses `continue` to skip hardcoded exclusion directories when it encounters them, but it doesn't modify `dirs[:]` in place, so `os.walk()` still descends into those subtrees and visits everything beneath them.
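
For illustration, a sketch of the pruning idiom together with the suggested `apm_modules` exclusion from item 1. The names `DEFAULT_EXCLUDED_DIRS` and `walk_project` are hypothetical; the actual exclusion list lives in `context_optimizer.py` and may be structured differently.

```python
import os
from pathlib import Path

# Hypothetical exclusion set: the names listed in item 1, plus apm_modules.
DEFAULT_EXCLUDED_DIRS = frozenset(
    {"node_modules", "__pycache__", ".git", "dist", "build", "apm_modules"}
)


def walk_project(base_dir, excluded=DEFAULT_EXCLUDED_DIRS):
    """Yield (directory, filenames) pairs without descending into excluded dirs."""
    for root, dirs, files in os.walk(base_dir):
        # Slice assignment mutates the list os.walk holds, so the excluded
        # subtrees are never visited. This relies on the default topdown=True.
        dirs[:] = [d for d in dirs if d not in excluded]
        yield Path(root), files
```

Rebinding the name (`dirs = [...]`) would not work here; only the in-place form `dirs[:] = ...` affects which subdirectories `os.walk()` visits next.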

Just adding `apm_modules` to the default exclusion list would likely make the biggest difference. Happy to help with a PR if that'd be useful!
