perf: exclude apm_modules from compilation scanning and cache Set[Path] #157
Merged
danielmeppiel merged 3 commits into microsoft:main on Mar 4, 2026
Add apm_modules to the hardcoded exclusion list in _analyze_project_structure(), _should_exclude_subdir(), and _get_all_files() so that installed dependency trees are not scanned during compilation. Cache the Set[Path] conversion in _file_matches_pattern() to avoid recreating large sets on every call. Before: ~821s instruction processing, ~1096s total on a project with apm_modules. After: ~17s instruction processing, ~18s total (~62x improvement). Fixes microsoft#154
Pull request overview
This PR improves apm compile performance by reducing filesystem scanning work during context optimization and by reusing cached data structures for glob matching. It also hardens brace-expansion for applyTo patterns with multiple brace groups (fixing #153) and avoids scanning installed dependencies under apm_modules/ (fixing #154).
Changes:
- Exclude `apm_modules/` from project scanning in the optimizer (similar to `node_modules/`).
- Make `_expand_glob_pattern()` recursively expand multiple brace groups.
- Cache `Set[Path]` conversions for glob results to avoid repeated set construction.
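The recursive brace expansion mentioned above can be sketched roughly as follows. This is a minimal illustration, not the code from `context_optimizer.py`: the function name `expand_braces` and the regex helper are hypothetical, and the real `_expand_glob_pattern()` may handle edge cases differently.

```python
import re
from typing import List

# Hypothetical helper: matches the leftmost brace group that contains
# no further braces, e.g. "{py,ts}".
_BRACE_GROUP = re.compile(r"\{([^{}]*)\}")


def expand_braces(pattern: str) -> List[str]:
    """Expand every brace group in a glob pattern into plain glob patterns."""
    match = _BRACE_GROUP.search(pattern)
    if match is None:
        # Base case: no brace group left, the pattern is already plain.
        return [pattern]

    prefix = pattern[: match.start()]
    suffix = pattern[match.end():]
    expanded: List[str] = []
    for option in match.group(1).split(","):
        # Recurse so that any remaining groups in the suffix are expanded too.
        expanded.extend(expand_braces(prefix + option + suffix))
    return expanded


if __name__ == "__main__":
    for expanded_pattern in expand_braces("src/{api,cli}/**/*.{py,pyi}"):
        print(expanded_pattern)
```

Expanding one group per call and recursing on the result keeps the function short and makes the number of brace groups irrelevant, which is the property the #153 fix is after.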
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| `src/apm_cli/compilation/context_optimizer.py` | Adds `apm_modules` to exclusion logic; improves brace expansion; introduces set-caching for glob matches. |
| `tests/unit/compilation/test_context_optimizer.py` | Adds unit tests covering brace expansion, `apm_modules` exclusion, and set-cache reuse behavior. |
- Extract `DEFAULT_EXCLUDED_DIRNAMES` frozenset constant to eliminate duplication across `_analyze_project_structure`, `_should_exclude_subdir`, and `_get_all_files` (review comment 1)
- Use dedicated `_glob_set_cache: Dict[str, Set[Path]]` instead of overloading `_glob_cache` with `'_set_'` prefixed keys (review comment 2)
- Update `docs/cli-reference.md` and `docs/compilation.md` to list `apm_modules` in default exclusions (review comment 4)
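As a rough sketch of the first follow-up item, a single frozenset can back both the exclusion check and the directory walk. The constant's contents here are illustrative (only `node_modules` and `apm_modules` are confirmed by this PR), and `should_exclude_subdir` / `iter_project_files` are hypothetical stand-ins for the private helpers named above.

```python
import os
from pathlib import Path
from typing import Iterator

# Illustrative contents only: the real constant may list more directory names.
DEFAULT_EXCLUDED_DIRNAMES = frozenset({"node_modules", "apm_modules"})


def should_exclude_subdir(dirname: str) -> bool:
    """Return True when a directory name must not be descended into."""
    return dirname in DEFAULT_EXCLUDED_DIRNAMES


def iter_project_files(root: Path) -> Iterator[Path]:
    """Yield project files, skipping excluded dependency trees entirely."""
    for dirpath, dirnames, filenames in os.walk(root):
        # Assigning to the slice mutates the list os.walk iterates over,
        # so excluded directories are never visited (top-down walk).
        dirnames[:] = [d for d in dirnames if not should_exclude_subdir(d)]
        for filename in filenames:
            yield Path(dirpath) / filename
```

Pruning `dirnames` in place means the cost of an installed dependency tree is one name lookup rather than a full traversal, and every caller that shares the constant stays consistent.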
danielmeppiel approved these changes on Mar 4, 2026
sergio-sisternes-epam added a commit to sergio-sisternes-epam/apm that referenced this pull request on Mar 4, 2026
Description
`apm compile` scans the entire project tree during the optimization phase, including `apm_modules/`. On projects with large transitive dependencies, this causes Instruction Processing to take hundreds of seconds, because the O(n×m×k) pattern matching loop is amplified by hundreds of extra directories.

This PR:
- Adds `apm_modules` to the hardcoded exclusion list in `_analyze_project_structure()`, `_should_exclude_subdir()`, and `_get_all_files()`, consistent with the existing `node_modules` exclusion.
- Caches the `Set[Path]` conversion in `_file_matches_pattern()` to avoid recreating large sets on every call.

Fixes #154
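The set-caching idea can be illustrated with a small sketch. The class `GlobMatcher` and its method names are hypothetical; only the cache names `_glob_cache` and `_glob_set_cache` come from this PR, and the actual `_file_matches_pattern()` may be structured differently.

```python
import glob
from pathlib import Path
from typing import Dict, List, Set


class GlobMatcher:
    """Hypothetical matcher showing one Set[Path] per pattern, built once."""

    def __init__(self, base_dir: Path) -> None:
        self.base_dir = base_dir
        self._glob_cache: Dict[str, List[str]] = {}      # raw glob results
        self._glob_set_cache: Dict[str, Set[Path]] = {}  # converted sets

    def _matches_for(self, pattern: str) -> Set[Path]:
        cached = self._glob_set_cache.get(pattern)
        if cached is None:
            if pattern not in self._glob_cache:
                self._glob_cache[pattern] = glob.glob(
                    str(self.base_dir / pattern), recursive=True
                )
            # Build the set once; later calls reuse the same object.
            cached = {Path(p) for p in self._glob_cache[pattern]}
            self._glob_set_cache[pattern] = cached
        return cached

    def file_matches_pattern(self, file_path: Path, pattern: str) -> bool:
        # Membership test against the cached set instead of rebuilding it.
        return file_path in self._matches_for(pattern)
```

Without the second cache, every membership check would rebuild a set from the full glob result, so the conversion cost would be paid once per file and pattern pair rather than once per pattern.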
Type of change
Testing
New test classes (7 tests):
- `TestApmModulesExclusion` (4 tests):
  - `test_apm_modules_excluded_from_directory_cache`: verifies no `apm_modules` paths leak into the cache
  - `test_cache_size_unaffected_by_apm_modules`: asserts cache size reflects only project dirs
  - `test_os_walk_prunes_apm_modules`: confirms `_should_exclude_subdir()` flags `apm_modules`
  - `test_find_matching_dirs_ignores_apm_modules`: verifies pattern matching skips `apm_modules` contents
- `TestGlobCacheReuse` (1 test):
  - `test_set_path_cached_across_calls`: confirms `Set[Path]` is created once and reused
- `TestExpandGlobPattern` (1 additional test from the "[BUG] Compilation fails on applyTo patterns with multiple brace groups" #153 follow-up):
  - `test_three_brace_groups`: validates three nested brace groups expand correctly

All 43 tests pass.
Local benchmark on a project with transitive APM dependencies: