[BUG] Substring matching in default directory exclusions causes false positives #158

@sergio-sisternes-epam

Describe the bug

The default directory exclusion logic in _analyze_project_structure() uses substring matching (ignore in str(current_path)) to check if a path should be excluded. This can produce false positives when a directory name contains an exclusion token as a substring.

Examples of false positives:

  • docs/apm_modules_guide/ would be excluded because apm_modules is a substring of the path
  • src/rebuild/ would be excluded because build is a substring of rebuild
  • tools/node_modules_compat/ would be excluded because node_modules is a substring
  • lib/redistribution/ would be excluded because dist is a substring of redistribution

Affected code:

# context_optimizer.py, _analyze_project_structure()
if any(ignore in str(current_path) for ignore in DEFAULT_EXCLUDED_DIRNAMES):
    continue
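
A minimal sketch of how the check above misfires on the four example paths. The contents of DEFAULT_EXCLUDED_DIRNAMES below are an assumption limited to the tokens named in this issue (the real list may contain more entries), and is_excluded_by_substring is a hypothetical helper that only mirrors the affected expression.

```python
from pathlib import PurePosixPath

# Assumed subset of the real list: only the tokens mentioned in this issue.
DEFAULT_EXCLUDED_DIRNAMES = {"apm_modules", "build", "node_modules", "dist"}


def is_excluded_by_substring(path):
    """Hypothetical helper mirroring the affected check (substring test on the full path)."""
    return any(ignore in str(path) for ignore in DEFAULT_EXCLUDED_DIRNAMES)


# The four directories from the list of false positives above.
false_positives = [
    PurePosixPath("docs/apm_modules_guide"),
    PurePosixPath("src/rebuild"),
    PurePosixPath("tools/node_modules_compat"),
    PurePosixPath("lib/redistribution"),
]

for path in false_positives:
    # Each of these is reported as excluded even though no path component
    # equals an exclusion name.
    print(path, is_excluded_by_substring(path))
```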

To Reproduce

  1. Create a project with a directory whose name contains (but does not equal) an exclusion token, e.g. docs/apm_modules_guide/
  2. Place an .instructions.md file inside that directory
  3. Run apm compile --verbose
  4. The instruction is silently ignored because the directory is incorrectly excluded
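
Steps 1 and 2 can be scripted roughly as follows. This is a sketch only: the file name example.instructions.md and its body are hypothetical, and a real project may need additional files (and instruction front matter) that apm expects before apm compile will pick it up.

```python
import tempfile
from pathlib import Path


def make_repro_project(root):
    """Create docs/apm_modules_guide/ under root with one instruction file in it."""
    target_dir = Path(root) / "docs" / "apm_modules_guide"
    target_dir.mkdir(parents=True, exist_ok=True)
    # Hypothetical file name and body; only the .instructions.md suffix
    # comes from the issue text.
    instruction_file = target_dir / "example.instructions.md"
    instruction_file.write_text("Example instruction body.\n", encoding="utf-8")
    return instruction_file


if __name__ == "__main__":
    project_root = tempfile.mkdtemp(prefix="apm_repro_")
    created = make_repro_project(project_root)
    print(f"Created {created}; now run `apm compile --verbose` in {project_root}")
```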

Expected behavior

Only directories whose path components exactly match an exclusion name should be excluded. A path like docs/apm_modules_guide/ should not be excluded just because it contains apm_modules as a substring.

Suggested fix:

Replace substring matching with path-component matching:

relative_parts = current_path.relative_to(self.base_dir).parts
if any(part in DEFAULT_EXCLUDED_DIRNAMES for part in relative_parts):
    continue
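
The suggested check can be exercised in isolation as sketched below. The exclusion set is again an assumed subset of the real list, is_excluded_by_component is a hypothetical standalone helper (the actual fix would live inside _analyze_project_structure()), and /project stands in for self.base_dir.

```python
from pathlib import PurePosixPath

# Assumed subset of the real list: only the tokens mentioned in this issue.
DEFAULT_EXCLUDED_DIRNAMES = frozenset({"apm_modules", "build", "node_modules", "dist"})


def is_excluded_by_component(current_path, base_dir):
    """Hypothetical helper: exclude only when a whole path component matches."""
    relative_parts = current_path.relative_to(base_dir).parts
    return any(part in DEFAULT_EXCLUDED_DIRNAMES for part in relative_parts)


base_dir = PurePosixPath("/project")  # stand-in for self.base_dir

# Names that merely contain an exclusion token are kept.
print(is_excluded_by_component(base_dir / "docs/apm_modules_guide", base_dir))
print(is_excluded_by_component(base_dir / "lib/redistribution", base_dir))

# Exact component matches, at any depth, are still excluded.
print(is_excluded_by_component(base_dir / "apm_modules/pkg", base_dir))
print(is_excluded_by_component(base_dir / "src/build/out", base_dir))
```

Note that relative_to() raises ValueError when current_path is not located under base_dir, so this approach assumes the traversal only yields paths inside the project root.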

Environment:

Additional context

Discovered during Copilot code review on PR #157. This is a pre-existing pattern, not introduced by the performance fix, but adding apm_modules to the default exclusion list increases the number of directory names that can trigger a false positive. The _should_exclude_subdir() method already uses exact component matching (path.name in ...), which is correct; only _analyze_project_structure() uses the problematic substring approach.

Labels: bug, needs-triage