

@luandro luandro commented Dec 9, 2025

Fixes #95
Summary

This PR addresses the issue where content was not being updated correctly during incremental fetches due to path comparison inconsistencies in the page metadata cache.
Problem

The cache stored output paths in inconsistent formats depending on how they were generated:

- Relative paths without leading slash: docs/intro.md
- Relative paths with leading slash: /docs/intro.md
- Absolute paths: /home/user/comapeo-docs/docs/intro.md

This caused hasMissingOutputs() and path comparisons to fail when the same file was represented differently, leading to:

- Unnecessary regeneration of unchanged pages
- Missing updates when paths didn't match
- Duplicate entries in cache for the same file

Solution

Added a normalizePath() function that resolves every path to an absolute path anchored at PROJECT_ROOT:

- Paths starting with PROJECT_ROOT are returned unchanged
- Paths like /docs/intro.md are treated as project-relative (not filesystem absolute)
- Relative paths like docs/intro.md are resolved against PROJECT_ROOT
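
The three rules above can be illustrated with a minimal sketch. This is not the code from the PR: the fixed PROJECT_ROOT value is taken from the example path in the Problem section, the use of path.posix and the empty-string return for empty input are assumptions made so the example is deterministic.

```typescript
import * as path from "node:path";

// Assumption: a fixed, illustrative root. The real PROJECT_ROOT is exported
// by pageMetadataCache.ts and derived from the repository location.
const PROJECT_ROOT = "/home/user/comapeo-docs";

function normalizePath(p: string): string {
  // Assumption: empty input yields "" so callers can filter it out.
  if (!p) return "";

  // Collapse "./" and "../" segments before comparing.
  const clean = path.posix.normalize(p);

  // Rule 1: already inside PROJECT_ROOT, keep as is.
  if (clean === PROJECT_ROOT || clean.startsWith(PROJECT_ROOT + "/")) {
    return clean;
  }

  // Rules 2 and 3: strip any leading slash and resolve against PROJECT_ROOT,
  // so "/docs/intro.md" and "docs/intro.md" end up identical.
  return path.posix.resolve(PROJECT_ROOT, clean.replace(/^\/+/, ""));
}
```

With this sketch, all three formats listed in the Problem section map to /home/user/comapeo-docs/docs/intro.md, which is what makes string comparison between cache entries reliable.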

Changes

pageMetadataCache.ts:
    Added normalizePath() helper function
    Exported PROJECT_ROOT for use across modules
    Updated hasMissingOutputs() to normalize paths before checking
    Updated updatePageInCache() to normalize both existing and new paths
    Fixed empty outputPaths array handling (now triggers regeneration)
    Updated interface documentation

generateBlocks.ts:
    Import PROJECT_ROOT for consistent path resolution
    Replace process.cwd() with PROJECT_ROOT for S3 URL detection

New test file pathNormalization.test.ts:
    28 test cases covering all edge cases

Extended incrementalSync.test.ts:
    4 integration tests for path normalization scenarios

Test Plan

All 1516 tests pass
TypeScript compiles without errors
ESLint passes (warnings only, no errors)
Tested edge cases:
    Empty string/null paths → filtered out
    Paths with ../ or ./ segments → normalized
    Mixed path formats in cache → deduplicated
    Empty outputPaths array → triggers regeneration
    Old cache migration → works automatically

Backward Compatibility

Cache version unchanged at "1.0" (no breaking changes)
Old caches with relative paths are automatically migrated on next update
No manual intervention required

This addresses issue #95 where content might not be updated correctly
during incremental fetches due to path comparison inconsistencies.

Changes:
- Add normalizePath() helper that resolves all paths to absolute paths
  relative to PROJECT_ROOT for consistent comparison
- Update hasMissingOutputs() to use normalizePath when checking file existence
- Update updatePageInCache() to normalize paths before storage
- Fix hasMissingOutputs() to return true for empty outputPaths arrays,
  ensuring pages with no outputs get regenerated
- Add comprehensive test coverage in pathNormalization.test.ts
- Update existing tests to expect normalized absolute paths
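
As a rough illustration of the hasMissingOutputs() change described in this commit, the sketch below normalizes each path before the existence check and returns true for an empty list. The signature, the injectable exists parameter, the toAbsolute stand-in for normalizePath(), and the decision to filter empty entries before the length check are all assumptions, not the PR's actual code.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Assumption: fixed illustrative root; the real value comes from pageMetadataCache.ts.
const PROJECT_ROOT = "/home/user/comapeo-docs";

// Simplified stand-in for normalizePath().
function toAbsolute(p: string): string {
  const clean = path.posix.normalize(p);
  if (clean.startsWith(PROJECT_ROOT + "/")) return clean;
  return path.posix.resolve(PROJECT_ROOT, clean.replace(/^\/+/, ""));
}

// `exists` is injectable only so the sketch can run without touching the disk.
function hasMissingOutputs(
  outputPaths: Array<string | null | undefined> | undefined,
  exists: (file: string) => boolean = fs.existsSync,
): boolean {
  // Drop empty or null entries before checking anything.
  const paths = (outputPaths ?? []).filter(
    (p): p is string => typeof p === "string" && p.length > 0,
  );

  // No recorded outputs: nothing proves the page was generated, so regenerate.
  if (paths.length === 0) return true;

  // Any normalized path that is absent on disk counts as missing.
  return paths.some((p) => !exists(toAbsolute(p)));
}
```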
…ation

When merging existing paths with new paths in updatePageInCache,
existing paths from older caches (stored in non-normalized format)
were being added without normalization. This caused duplicate entries
when the same file was represented in different formats.

Changes:
- Normalize existing paths in updatePageInCache before merging
- Add test case verifying old-format paths are properly deduplicated
- Comment clarifies this handles migration from older cache formats
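
The merge step can be sketched as follows. mergeOutputPaths is a hypothetical helper name used only for illustration (the PR performs this inside updatePageInCache()), and toAbsolute is a simplified stand-in for normalizePath() with an assumed fixed root.

```typescript
import * as path from "node:path";

// Assumption: fixed illustrative root.
const PROJECT_ROOT = "/home/user/comapeo-docs";

// Simplified stand-in for normalizePath().
function toAbsolute(p: string): string {
  const clean = path.posix.normalize(p);
  if (clean.startsWith(PROJECT_ROOT + "/")) return clean;
  return path.posix.resolve(PROJECT_ROOT, clean.replace(/^\/+/, ""));
}

// Hypothetical helper: normalize BOTH the cached and the new paths, then
// deduplicate, so old-format entries collapse into their absolute form.
function mergeOutputPaths(existing: string[], incoming: string[]): string[] {
  const merged = new Set<string>();
  for (const p of existing.concat(incoming)) {
    if (!p) continue; // skip empty entries
    merged.add(toAbsolute(p));
  }
  return Array.from(merged);
}
```

Because the existing entries are normalized too, a cache written before this change is migrated the next time a page is updated, without a cache version bump.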
…andling

Phase 5-6 of issue #95 fix:
- Import PROJECT_ROOT in generateBlocks.ts for consistent path resolution
- Replace process.cwd() with PROJECT_ROOT for S3 URL path detection
- Add integration tests for path normalization edge cases:
  - Mixed path formats in cache (migration scenario)
  - Missing outputs detection regardless of format
  - Path deduplication across updates
  - Multi-language path consistency

The interface comment incorrectly stated paths were "relative to project root"
but they are now stored as absolute paths for consistency. This documentation
fix aligns the comment with the actual implementation.

github-actions bot commented Dec 9, 2025

🚀 Preview Deployment

Your documentation preview is ready!

Preview URL: https://pr-112.comapeo-docs.pages.dev

🔄 Content: Regenerated 5 pages from Notion (script changes detected)

💡 Tip: Add label fetch-all-pages to test with full content, or fetch-10-pages for broader coverage.

This preview will update automatically when you push new commits to this PR.


Built with commit 7090e3b

Ensures consistent path comparison in the needsProcessing check by
normalizing filePath before comparing with cached outputPaths. This
prevents edge cases where path format differences could cause
incorrect skip/process decisions.
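
A minimal sketch of that comparison, assuming the check boils down to asking whether a file path is already among a page's cached outputs. isCachedOutput is a hypothetical name, and toAbsolute is a simplified stand-in for normalizePath() with an assumed fixed root; the real needsProcessing logic is not shown in this PR description.

```typescript
import * as path from "node:path";

// Assumption: fixed illustrative root.
const PROJECT_ROOT = "/home/user/comapeo-docs";

// Simplified stand-in for normalizePath().
function toAbsolute(p: string): string {
  const clean = path.posix.normalize(p);
  if (clean.startsWith(PROJECT_ROOT + "/")) return clean;
  return path.posix.resolve(PROJECT_ROOT, clean.replace(/^\/+/, ""));
}

// Hypothetical helper: normalize both sides so the comparison does not
// depend on how either path happened to be written.
function isCachedOutput(filePath: string, cachedOutputPaths: string[]): boolean {
  const target = toAbsolute(filePath);
  return cachedOutputPaths.some((cached) => toAbsolute(cached) === target);
}
```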
@digidem digidem deleted a comment from chatgpt-codex-connector bot Dec 9, 2025
The normalizePath function now correctly distinguishes between:
1. Project-relative paths like /docs/intro.md (where /docs doesn't exist)
2. Genuine system paths like /tmp/foo or /etc/config (where parent exists)

Uses a filesystem check (fs.existsSync on the parent directory) to determine
whether an absolute path outside PROJECT_ROOT should be preserved as-is or
treated as project-relative.

Addresses external feedback that /tmp/foo was incorrectly rewritten to
PROJECT_ROOT/tmp/foo.

Paths like /a where the parent is "/" should be treated as project-relative,
not system paths. The root directory always exists, so we can't use its
existence as an indicator of a genuine system path.

Added check: if parent directory IS the root (path.parse(parentDir).root === parentDir),
don't trust fs.existsSync and treat as project-relative instead.

Added tests for:
- /a (single directory at root level)
- /foo/bar/baz.txt (nested paths where only root exists)
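
Putting the last two commits together, the refined decision could look roughly like the sketch below. It is an illustration under assumptions, not the merged code: PROJECT_ROOT is a fixed example value, path.posix is used for determinism, and the exists parameter is injectable only so the behaviour can be shown without depending on the local filesystem.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Assumption: fixed illustrative root.
const PROJECT_ROOT = "/home/user/comapeo-docs";

function normalizePath(
  p: string,
  exists: (dir: string) => boolean = fs.existsSync,
): string {
  if (!p) return "";
  const clean = path.posix.normalize(p);

  // Already inside the project: keep as is.
  if (clean === PROJECT_ROOT || clean.startsWith(PROJECT_ROOT + "/")) {
    return clean;
  }

  // Plain relative path: resolve against the project root.
  if (!path.posix.isAbsolute(clean)) {
    return path.posix.resolve(PROJECT_ROOT, clean);
  }

  // Absolute path outside the project. Treat it as a genuine system path
  // only if its parent directory exists AND that parent is not the root,
  // since "/" always exists and proves nothing.
  const parentDir = path.posix.dirname(clean);
  const parentIsRoot = path.posix.parse(parentDir).root === parentDir;
  if (!parentIsRoot && exists(parentDir)) {
    return clean;
  }

  // Otherwise treat it as project-relative (drop the leading slash).
  return path.posix.join(PROJECT_ROOT, clean.slice(1));
}
```

One consequence of this heuristic is that the result depends on the state of the filesystem at call time: the same input can normalize differently on two machines if a directory such as /docs happens to exist on one of them.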
@luandro luandro merged commit 5f5864c into main Dec 10, 2025
3 of 4 checks passed
@luandro luandro deleted the claude/investigate-issue-95-01BL5WJwWJnZx8j68XSQENE8 branch December 10, 2025 14:44
@github-actions

🧹 Preview Deployment Cleanup

The preview deployment for this PR has been cleaned up.

Preview URL was: https://pr-112.comapeo-docs.pages.dev


Note: Cloudflare Pages deployments follow automatic retention policies. Old previews are cleaned up automatically.


Development

Successfully merging this pull request may close these issues.

Some content is not being updated or skipped during fetch

3 participants