diff --git a/openspec/changes/governance-02-github-hierarchy-cache/TDD_EVIDENCE.md b/openspec/changes/governance-02-github-hierarchy-cache/TDD_EVIDENCE.md index e4b54c2c..6eb812c8 100644 --- a/openspec/changes/governance-02-github-hierarchy-cache/TDD_EVIDENCE.md +++ b/openspec/changes/governance-02-github-hierarchy-cache/TDD_EVIDENCE.md @@ -26,3 +26,11 @@ - `python3 -m py_compile scripts/sync_github_hierarchy_cache.py` → PASS - `python3 scripts/sync_github_hierarchy_cache.py --force` → generated `.specfact/backlog/github_hierarchy_cache.md` - Second `python3 scripts/sync_github_hierarchy_cache.py` run → `GitHub hierarchy cache unchanged (46 issues).` + +## Final scoped quality gates + +- Timestamp: `2026-04-09T22:04:18+02:00` +- Ruff: `PATH=/home/dom/git/nold-ai/specfact-cli/.venv/bin:$PATH PYTHONPATH=/home/dom/git/nold-ai/specfact-cli-worktrees/feature/governance-02-github-hierarchy-cache/src /home/dom/git/nold-ai/specfact-cli/.venv/bin/ruff check scripts/sync_github_hierarchy_cache.py tests/unit/scripts/test_sync_github_hierarchy_cache.py` → PASS +- basedpyright: `PATH=/home/dom/git/nold-ai/specfact-cli/.venv/bin:$PATH PYTHONPATH=/home/dom/git/nold-ai/specfact-cli-worktrees/feature/governance-02-github-hierarchy-cache/src /home/dom/git/nold-ai/specfact-cli/.venv/bin/basedpyright scripts/sync_github_hierarchy_cache.py tests/unit/scripts/test_sync_github_hierarchy_cache.py` → PASS (`0 errors, 0 warnings`) +- pytest: `hatch test -- tests/unit/scripts/test_sync_github_hierarchy_cache.py -q` → PASS (`15 passed`, after parity with modules-side script/tests) +- SpecFact code review: `hatch run specfact code review run --json --out .specfact/code-review.json scripts/sync_github_hierarchy_cache.py tests/unit/scripts/test_sync_github_hierarchy_cache.py` → PASS (`overall_verdict` PASS, `ci_exit_code` 0; low-severity DRY hints on icontract preconditions documented in `proposal.md`) diff --git a/openspec/changes/governance-02-github-hierarchy-cache/design.md 
b/openspec/changes/governance-02-github-hierarchy-cache/design.md index c0653ca4..b490b9e5 100644 --- a/openspec/changes/governance-02-github-hierarchy-cache/design.md +++ b/openspec/changes/governance-02-github-hierarchy-cache/design.md @@ -1,4 +1,4 @@ -## Context +# Context `specfact-cli` already maintains GitHub planning hierarchy through issue labels, parent-child links, and `openspec/CHANGE_ORDER.md`, but contributors still discover that structure by hitting the GitHub API manually. The new requirement is to make hierarchy lookup deterministic, cheap, and local: a generated markdown file under ignored `.specfact/backlog/` becomes the first source for parent Feature and Epic resolution, and the sync command is rerun only when the hierarchy changed. @@ -7,12 +7,14 @@ This is a cross-cutting governance change because it affects GitHub automation, ## Goals / Non-Goals **Goals:** + - Generate a deterministic markdown cache of Epic and Feature issues for this repository. - Include enough metadata for issue-parenting work without another GitHub lookup: issue number, title, short summary, labels, parent/child relationships, and issue URLs. - Make the sync fast on no-op runs by using a small fingerprint/state check before regenerating markdown. - Update repo guidance so contributors use the cache first and only rerun sync when needed. **Non-Goals:** + - Replacing GitHub as the authoritative source of issue hierarchy. - Caching every issue type or full issue bodies. - Synchronizing User Story issues into the cache in this first version. @@ -21,33 +23,43 @@ This is a cross-cutting governance change because it affects GitHub automation, ## Decisions ### Use `gh api graphql` as the sole upstream source + The script will query GitHub through `gh api graphql` so it can access issue type, labels, relationships, and brief body content in one supported path. This avoids scraping markdown or depending on REST endpoints that do not expose hierarchy fields consistently. 
Alternative considered: + - `gh issue list/view` JSON loops: simpler, but requires many calls and awkward relationship reconstruction. ### Split the sync into a lightweight fingerprint pass and a full render pass + The script will first fetch only the Epic and Feature issue identity set plus timestamps/relationship fingerprints, hash that data, and compare it with a local state file. If the fingerprint matches, the script exits successfully without rewriting markdown. If it differs, the script performs a fuller metadata query and regenerates the cache. Alternative considered: + - Always regenerate markdown: deterministic but wastes GitHub calls and makes local workflows slower. ### Store human-readable cache plus machine-readable state under ignored `.specfact/backlog` + The canonical human-facing output will be `.specfact/backlog/github_hierarchy_cache.md`. A companion state file, `.specfact/backlog/github_hierarchy_cache_state.json`, will hold the last fingerprint and generator metadata. Both files stay local and ignored by Git so the cache can be recreated freely without creating repository drift. Alternative considered: + - State embedded in markdown comments: workable, but couples machine state to user-facing output and complicates deterministic rendering. ### Render by deterministic section and sort order + The markdown will use fixed sections for Epics and Features, with issues sorted stably by type, then issue number. Relationship lists and labels will also be sorted deterministically so reruns only change the file when source metadata actually changes. Alternative considered: + - Preserve GitHub API order: easier, but can drift between runs and create noisy diffs. 
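The deterministic section and sort-order decision above can be sketched as a pure ordering function. This is a minimal illustration, not the script's real dataclass: the `Issue` type and its fields here are simplified stand-ins.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Issue:
    """Simplified stand-in for the script's HierarchyIssue (illustrative only)."""

    number: int
    issue_type: str  # "Epic" or "Feature"


def deterministic_order(issues: list[Issue]) -> list[Issue]:
    # Fixed type order first (Epics before Features), then ascending issue
    # number, so reruns only change the rendered file when source metadata
    # actually changes -- never because GitHub returned issues in a new order.
    type_rank = {"Epic": 0, "Feature": 1}
    return sorted(issues, key=lambda issue: (type_rank[issue.issue_type], issue.number))


issues = [Issue(490, "Feature"), Issue(485, "Epic"), Issue(486, "Feature")]
print([issue.number for issue in deterministic_order(issues)])  # → [485, 486, 490]
```

Because the sort key is total over `(type, number)`, the output is independent of input order, which is exactly what keeps reruns diff-free.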
### Keep instruction updates in repo-local governance files + The change will update `openspec/config.yaml` and `AGENTS.md` in this repo so the workflow explicitly says: consult the cache first, regenerate it when fresh planning metadata is needed, and avoid ad hoc GitHub lookups unless the cache is stale or missing. Alternative considered: + - Document the behavior only in the script help text: insufficient because agents and OpenSpec flows read governance files first. ## Risks / Trade-offs diff --git a/openspec/changes/governance-02-github-hierarchy-cache/proposal.md b/openspec/changes/governance-02-github-hierarchy-cache/proposal.md index edd72b3c..6738003a 100644 --- a/openspec/changes/governance-02-github-hierarchy-cache/proposal.md +++ b/openspec/changes/governance-02-github-hierarchy-cache/proposal.md @@ -1,3 +1,5 @@ +# Governance: GitHub hierarchy cache (specfact-cli) + ## Why OpenSpec and agent workflows still have to query GitHub ad hoc to rediscover Epics, Features, and their parent links before creating or syncing change issues. That is slow, expensive, and error-prone, especially now that planning hierarchy matters in both `specfact-cli` and `specfact-cli-modules`. @@ -5,17 +7,18 @@ OpenSpec and agent workflows still have to query GitHub ad hoc to rediscover Epi ## What Changes - Add a deterministic repo-local hierarchy cache generator for GitHub Epic and Feature issues. -- Persist a central markdown inventory under `openspec/` with issue number, title, brief summary, labels, and hierarchy relationships. -- Add a lightweight fingerprint/state check so the sync exits quickly when Epic and Feature metadata has not changed. -- Update governance instructions in `openspec/config.yaml` and `AGENTS.md` to consult the cached hierarchy inventory first and rerun the sync script when fresh data is needed. 
+- Persist a repo-local markdown hierarchy cache at `.specfact/backlog/github_hierarchy_cache.md` (ignored; not committed) with issue number, title, brief summary, labels, and hierarchy relationships, plus a companion fingerprint/state file `.specfact/backlog/github_hierarchy_cache_state.json` so the sync can exit quickly when Epic and Feature metadata has not changed. +- Update governance instructions in `openspec/config.yaml` and `AGENTS.md` to consult the cached hierarchy first and rerun `python scripts/sync_github_hierarchy_cache.py` when fresh data is needed. - Cover the script with tests so cache output and no-change behavior remain stable. ## Capabilities ### New Capabilities + - `github-hierarchy-cache`: Deterministic synchronization of GitHub Epic and Feature hierarchy metadata into a repo-local OpenSpec markdown cache for low-cost parent and planning lookups. ### Modified Capabilities + - `agile-feature-hierarchy`: Local governance workflows must be able to resolve current Epic and Feature planning metadata from the repo-local cache before performing manual GitHub lookups. ## Impact @@ -29,3 +32,7 @@ OpenSpec and agent workflows still have to query GitHub ad hoc to rediscover Epi - GitHub Issue: [#491](https://github.com/nold-ai/specfact-cli/issues/491) - Parent Feature: [#486](https://github.com/nold-ai/specfact-cli/issues/486) - Related Modules Change: `specfact-cli-modules/governance-03-github-hierarchy-cache` + +## Code review note (SpecFact dogfood) + +Icontract `@require` preconditions on `fetch_hierarchy_issues`, `render_cache_markdown`, and `sync_cache` intentionally use small, similarly shaped predicates. The code-review module may emit low-severity DRY / duplicate-shape hints for those helpers; that is accepted here because collapsing them would break icontract’s per-parameter argument binding (e.g. `**kwargs` predicates are not supported the same way). 
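The per-parameter precondition shape the code-review note above accepts can be illustrated without the icontract dependency. This sketch emulates the pattern only; the `require` stand-in below is not icontract's real implementation (icontract binds condition arguments by parameter name), and the predicate names are illustrative.

```python
from collections.abc import Callable
from functools import wraps


def require(predicate: Callable[..., bool], message: str):
    """Minimal stand-in for an icontract-style @require decorator.

    Each predicate receives the call's keyword arguments, which is why the
    script keeps one small, similarly shaped predicate per parameter instead
    of collapsing them into a single check.
    """

    def decorator(func):
        @wraps(func)
        def wrapper(**kwargs):
            if not predicate(**kwargs):
                raise ValueError(message)
            return func(**kwargs)

        return wrapper

    return decorator


def _require_repo_owner(*, repo_owner: str, repo_name: str) -> bool:
    _ = repo_name  # unused by this predicate, but part of the call signature
    return bool(repo_owner.strip())


def _require_repo_name(*, repo_owner: str, repo_name: str) -> bool:
    _ = repo_owner
    return bool(repo_name.strip())


@require(_require_repo_owner, "repo_owner must not be blank")
@require(_require_repo_name, "repo_name must not be blank")
def sync(*, repo_owner: str, repo_name: str) -> str:
    return f"{repo_owner}/{repo_name}"


print(sync(repo_owner="nold-ai", repo_name="specfact-cli"))  # → nold-ai/specfact-cli
```

Each predicate ignores the parameters it does not check, which is the duplicate-shape pattern the code-review module flags and this change accepts.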
diff --git a/openspec/changes/governance-02-github-hierarchy-cache/specs/github-hierarchy-cache/spec.md b/openspec/changes/governance-02-github-hierarchy-cache/specs/github-hierarchy-cache/spec.md index 01728b27..f81a5656 100644 --- a/openspec/changes/governance-02-github-hierarchy-cache/specs/github-hierarchy-cache/spec.md +++ b/openspec/changes/governance-02-github-hierarchy-cache/specs/github-hierarchy-cache/spec.md @@ -1,28 +1,34 @@ -## ADDED Requirements +# ADDED Requirements + +## Requirement: Repository hierarchy cache sync -### Requirement: Repository hierarchy cache sync The repository SHALL provide a deterministic sync mechanism that retrieves GitHub Epic and Feature issues for the current repository and writes a local hierarchy cache under ignored `.specfact/backlog/`. -#### Scenario: Generate hierarchy cache from GitHub metadata +### Scenario: Generate hierarchy cache from GitHub metadata + - **WHEN** the user runs the hierarchy cache sync script for the repository - **THEN** the script retrieves GitHub issues whose Type is `Epic` or `Feature` - **AND** writes a markdown cache under ignored `.specfact/backlog/` with each issue's number, title, URL, short summary, labels, and hierarchy relationships - **AND** the output ordering is deterministic across repeated runs with unchanged source data -#### Scenario: Represent hierarchy relationships in cache output +### Scenario: Represent hierarchy relationships in cache output + - **WHEN** a synced Epic or Feature has parent or child hierarchy links - **THEN** the markdown cache includes those relationships in normalized form - **AND** missing relationships are rendered as explicit empty or none values rather than omitted ambiguously -#### Scenario: Fast exit on unchanged hierarchy state +### Scenario: Fast exit on unchanged hierarchy state + - **WHEN** the script detects that the current Epic and Feature hierarchy fingerprint matches the last synced fingerprint - **THEN** it exits successfully without 
regenerating the markdown cache - **AND** it reports that no hierarchy update was required -### Requirement: Repository governance must use cache-first hierarchy lookup +## Requirement: Repository governance must use cache-first hierarchy lookup + Repository governance instructions SHALL direct contributors and agents to consult the local hierarchy cache before performing manual GitHub lookups for Epic or Feature parenting. -#### Scenario: Cache-first governance guidance +### Scenario: Cache-first governance guidance + - **WHEN** a contributor reads `AGENTS.md` or `openspec/config.yaml` for GitHub issue setup guidance - **THEN** the instructions tell them to consult the local hierarchy cache first - **AND** the instructions define when the sync script must be rerun to refresh stale hierarchy metadata diff --git a/openspec/changes/governance-02-github-hierarchy-cache/tasks.md b/openspec/changes/governance-02-github-hierarchy-cache/tasks.md index 5b85e0ba..ee49e6d4 100644 --- a/openspec/changes/governance-02-github-hierarchy-cache/tasks.md +++ b/openspec/changes/governance-02-github-hierarchy-cache/tasks.md @@ -17,4 +17,4 @@ ## 4. Verification - [x] 4.1 Re-run the targeted tests and record the passing run in `openspec/changes/governance-02-github-hierarchy-cache/TDD_EVIDENCE.md`. -- [ ] 4.2 Run the required repo quality gates for the touched scope, including code review JSON refresh if stale. +- [x] 4.2 Run the required repo quality gates for the touched scope, including code review JSON refresh if stale. diff --git a/openspec/config.yaml b/openspec/config.yaml index c89e95fd..91ba81c9 100644 --- a/openspec/config.yaml +++ b/openspec/config.yaml @@ -88,9 +88,12 @@ rules: - **Repository**: /, - **Last Synced Status**: ). - After creation, update proposal.md Source Tracking section with issue number, URL, repository, and status. 
- - Resolve Parent Feature or Epic from `.specfact/backlog/github_hierarchy_cache.md` first; this - cache is ephemeral local state and MUST NOT be committed. If it is missing or stale, rerun - `python scripts/sync_github_hierarchy_cache.py` before manual GitHub lookup. + - >- + Resolve Parent Feature or Epic from `.specfact/backlog/github_hierarchy_cache.md` first (regenerate + via `python scripts/sync_github_hierarchy_cache.py` when missing or stale). The cache is ephemeral + local state and MUST NOT be committed. **Pending:** until backlog commands read this cache + automatically, treat this as contributor/agent workflow (docs + local script), not enforced CLI + behavior for every `specfact backlog` path. - Source tracking: Only track public repos (specfact-cli, platform-frontend). Skip for internal repos (specfact-cli-internal) specs: @@ -160,9 +163,11 @@ rules: - Place this task after quality gates and documentation, before PR creation. - Include git workflow tasks: branch creation (first task), PR creation (last task) - For public-facing changes in public repos (specfact-cli, platform-frontend): - - Before GitHub issue creation or parent linking, consult `.specfact/backlog/github_hierarchy_cache.md`; - rerun `python scripts/sync_github_hierarchy_cache.py` when the cache is missing or stale. - Treat this cache as ephemeral local state, not a committed OpenSpec artifact. + - >- + Before GitHub issue creation or parent linking, consult `.specfact/backlog/github_hierarchy_cache.md`; + rerun `python scripts/sync_github_hierarchy_cache.py` when the cache is missing or stale. Treat this + cache as ephemeral local state, not a committed OpenSpec artifact. **Pending:** wire cache-first + lookup into backlog add/sync codepaths when the governance hierarchy-cache work lands end-to-end. 
- Include GitHub issue creation task with format: - title `[Change] ` - labels `enhancement` and `change-proposal` diff --git a/scripts/sync_github_hierarchy_cache.py b/scripts/sync_github_hierarchy_cache.py old mode 100644 new mode 100755 index 1803805d..5e4031d5 --- a/scripts/sync_github_hierarchy_cache.py +++ b/scripts/sync_github_hierarchy_cache.py @@ -19,52 +19,88 @@ DEFAULT_REPO_OWNER = "nold-ai" -DEFAULT_REPO_NAME = Path(__file__).resolve().parents[1].name +_SCRIPT_DIR = Path(__file__).resolve().parent + + +@beartype +@ensure( + lambda result: result is None or bool(str(result).strip()), + "parsed repository name must be non-blank when present", +) +def parse_repo_name_from_remote_url(url: str) -> str | None: + """Return the repository name segment from a Git remote URL, if parseable.""" + stripped = url.strip() + if not stripped: + return None + if stripped.startswith("git@"): + _, _, rest = stripped.partition(":") + path = rest + elif "://" in stripped: + host_and_path = stripped.split("://", 1)[1] + if "/" not in host_and_path: + return None + path = host_and_path.split("/", 1)[1] + else: + path = stripped + path = path.rstrip("/") + if path.endswith(".git"): + path = path[:-4] + segments = [segment for segment in path.split("/") if segment] + if not segments: + return None + return segments[-1] + + +@beartype +def _default_repo_name_from_git(script_dir: Path) -> str | None: + """Resolve the GitHub repository name from ``origin`` (works in worktrees).""" + try: + completed = subprocess.run( + ["git", "-C", str(script_dir), "config", "--get", "remote.origin.url"], + check=False, + capture_output=True, + text=True, + ) + except (FileNotFoundError, OSError): + return None + if completed.returncode != 0: + return None + return parse_repo_name_from_remote_url(completed.stdout) + + +_DEFAULT_REPO_NAME_FALLBACK = Path(__file__).resolve().parents[1].name +DEFAULT_REPO_NAME = _default_repo_name_from_git(_SCRIPT_DIR) or _DEFAULT_REPO_NAME_FALLBACK 
DEFAULT_OUTPUT_PATH = Path(".specfact") / "backlog" / "github_hierarchy_cache.md" DEFAULT_STATE_PATH = Path(".specfact") / "backlog" / "github_hierarchy_cache_state.json" SUPPORTED_ISSUE_TYPES = frozenset({"Epic", "Feature"}) +SUPPORTED_ISSUE_TYPES_ORDER: tuple[str, ...] = ("Epic", "Feature") _SUMMARY_SKIP_LINES = {"why", "scope", "summary", "changes", "capabilities", "impact"} +_GH_GRAPHQL_TIMEOUT_SEC = 120 -_FINGERPRINT_QUERY = """ -query($owner: String!, $name: String!, $after: String) { - repository(owner: $owner, name: $name) { - issues(first: 100, after: $after, states: [OPEN, CLOSED], orderBy: {field: CREATED_AT, direction: ASC}) { - pageInfo { hasNextPage endCursor } - nodes { - number - title - url - updatedAt - issueType { name } - labels(first: 100) { nodes { name } } - parent { number title url } - subIssues(first: 100) { nodes { number title url } } - } - } - } -} -""" - -_DETAIL_QUERY = """ -query($owner: String!, $name: String!, $after: String) { - repository(owner: $owner, name: $name) { - issues(first: 100, after: $after, states: [OPEN, CLOSED], orderBy: {field: CREATED_AT, direction: ASC}) { - pageInfo { hasNextPage endCursor } - nodes { + +@beartype +def _build_hierarchy_issues_query(*, include_body: bool) -> str: + """Return the shared GitHub GraphQL query, optionally including issue body text.""" + body_field = " bodyText\n" if include_body else "" + return f""" +query($owner: String!, $name: String!, $after: String) {{ + repository(owner: $owner, name: $name) {{ + issues(first: 100, after: $after, states: [OPEN, CLOSED], orderBy: {{field: CREATED_AT, direction: ASC}}) {{ + pageInfo {{ hasNextPage endCursor }} + nodes {{ number title url updatedAt - bodyText - issueType { name } - labels(first: 100) { nodes { name } } - parent { number title url } - subIssues(first: 100) { nodes { number title url } } - } - } - } -} -""" +{body_field} issueType {{ name }} + labels(first: 100) {{ nodes {{ name }} }} + parent {{ number title url }} + 
subIssues(first: 100) {{ nodes {{ number title url issueType {{ name }} }} }} + }} + }} + }} +}} +""".strip() @dataclass(frozen=True) @@ -131,28 +167,64 @@ def _parse_issue_link(node: Mapping[str, Any] | None) -> IssueLink | None: @beartype -def _parse_issue_node(node: Mapping[str, Any], *, include_body: bool) -> HierarchyIssue | None: - """Convert a GraphQL issue node to HierarchyIssue when supported.""" - issue_type_node = node.get("issueType") - issue_type_name = issue_type_node.get("name") if isinstance(issue_type_node, Mapping) else None - if issue_type_name not in SUPPORTED_ISSUE_TYPES: - return None +def _mapping_value(node: Mapping[str, Any], key: str) -> Mapping[str, Any] | None: + """Return a nested mapping value when present.""" + value = node.get(key) + return value if isinstance(value, Mapping) else None + + +@beartype +def _mapping_nodes(container: Mapping[str, Any] | None) -> list[Mapping[str, Any]]: + """Return a filtered list of mapping nodes from a GraphQL connection.""" + if container is None: + return [] + + raw_nodes = container.get("nodes") + if not isinstance(raw_nodes, list): + return [] + + return [item for item in raw_nodes if isinstance(item, Mapping)] - labels_container = node.get("labels") if isinstance(node.get("labels"), Mapping) else {} - label_nodes = labels_container.get("nodes") if isinstance(labels_container, Mapping) else [] - labels = sorted( - (str(item["name"]) for item in label_nodes if isinstance(item, Mapping) and item.get("name")), - key=str.lower, - ) - subissues_container = node.get("subIssues") if isinstance(node.get("subIssues"), Mapping) else {} - subissue_nodes = subissues_container.get("nodes") if isinstance(subissues_container, Mapping) else [] +@beartype +def _label_names(label_nodes: list[Mapping[str, Any]]) -> list[str]: + """Extract sorted label names from GraphQL label nodes.""" + names: list[str] = [] + for item in label_nodes: + name = item.get("name") + if name: + names.append(str(name)) + return 
sorted(names, key=str.lower) + + +@beartype +def _subissue_type_name(item: Mapping[str, Any]) -> str | None: + """Return sub-issue type name when present.""" + issue_type_node = _mapping_value(item, "issueType") + if issue_type_node and issue_type_node.get("name"): + return str(issue_type_node["name"]) + return None + + +@beartype +def _child_links(subissue_nodes: list[Mapping[str, Any]]) -> list[IssueLink]: + """Extract sorted child issue links from GraphQL subissue nodes (Epic/Feature only).""" children = [ IssueLink(number=int(item["number"]), title=str(item["title"]), url=str(item["url"])) for item in subissue_nodes - if isinstance(item, Mapping) and item.get("number") is not None + if item.get("number") is not None and _subissue_type_name(item) in SUPPORTED_ISSUE_TYPES ] children.sort(key=lambda item: item.number) + return children + + +@beartype +def _parse_issue_node(node: Mapping[str, Any], *, include_body: bool) -> HierarchyIssue | None: + """Convert a GraphQL issue node to HierarchyIssue when supported.""" + issue_type_node = _mapping_value(node, "issueType") + issue_type_name = str(issue_type_node["name"]) if issue_type_node and issue_type_node.get("name") else None + if issue_type_name not in SUPPORTED_ISSUE_TYPES: + return None summary = _extract_summary(str(node.get("bodyText", ""))) if include_body else "" return HierarchyIssue( @@ -160,11 +232,11 @@ def _parse_issue_node(node: Mapping[str, Any], *, include_body: bool) -> Hierarc title=str(node["title"]), url=str(node["url"]), issue_type=str(issue_type_name), - labels=labels, + labels=_label_names(_mapping_nodes(_mapping_value(node, "labels"))), summary=summary, updated_at=str(node["updatedAt"]), - parent=_parse_issue_link(node.get("parent") if isinstance(node.get("parent"), Mapping) else None), - children=children, + parent=_parse_issue_link(_mapping_value(node, "parent")), + children=_child_links(_mapping_nodes(_mapping_value(node, "subIssues"))), ) @@ -185,7 +257,22 @@ def 
_run_graphql_query(query: str, *, repo_owner: str, repo_name: str, after: st if after is not None: command.extend(["-F", f"after={after}"]) - completed = subprocess.run(command, check=False, capture_output=True, text=True) + try: + completed = subprocess.run( + command, + check=False, + capture_output=True, + text=True, + timeout=_GH_GRAPHQL_TIMEOUT_SEC, + ) + except subprocess.TimeoutExpired as exc: + detail = f"GitHub GraphQL subprocess timed out after {_GH_GRAPHQL_TIMEOUT_SEC}s" + out = (exc.stdout or "").strip() + err = (exc.stderr or "").strip() + if out or err: + detail = f"{detail}; stdout={out!r}; stderr={err!r}" + raise RuntimeError(detail) from exc + if completed.returncode != 0: raise RuntimeError(completed.stderr.strip() or completed.stdout.strip() or "GitHub GraphQL query failed") @@ -196,9 +283,36 @@ def _run_graphql_query(query: str, *, repo_owner: str, repo_name: str, after: st @beartype +def _is_not_blank(value: str) -> bool: + """Return whether a required CLI string value is non-blank.""" + return bool(value.strip()) + + +@beartype +def _all_supported_issue_types(result: list[HierarchyIssue]) -> bool: + """Return whether every issue has a supported issue type.""" + return all(issue.issue_type in SUPPORTED_ISSUE_TYPES for issue in result) + + +@beartype +def _require_repo_owner_for_fetch(*, repo_owner: str, repo_name: str, fingerprint_only: bool) -> bool: + _ = (repo_name, fingerprint_only) + return _is_not_blank(repo_owner) + + +@beartype +def _require_repo_name_for_fetch(*, repo_owner: str, repo_name: str, fingerprint_only: bool) -> bool: + _ = (repo_owner, fingerprint_only) + return _is_not_blank(repo_name) + + +@beartype +@require(_require_repo_owner_for_fetch, "repo_owner must not be blank") +@require(_require_repo_name_for_fetch, "repo_name must not be blank") +@ensure(_all_supported_issue_types, "Only Epic and Feature issues should be returned") def fetch_hierarchy_issues(*, repo_owner: str, repo_name: str, fingerprint_only: bool) -> 
list[HierarchyIssue]: """Fetch Epic and Feature issues from GitHub for the given repository.""" - query = _FINGERPRINT_QUERY if fingerprint_only else _DETAIL_QUERY + query = _build_hierarchy_issues_query(include_body=not fingerprint_only) issues: list[HierarchyIssue] = [] after: str | None = None @@ -243,6 +357,79 @@ def compute_hierarchy_fingerprint(issues: list[HierarchyIssue]) -> str: return hashlib.sha256(canonical_json.encode("utf-8")).hexdigest() +@beartype +def _group_issues_by_type(issues: list[HierarchyIssue]) -> dict[str, list[HierarchyIssue]]: + """Return issues grouped by supported type in deterministic order.""" + return { + issue_type: sorted((item for item in issues if item.issue_type == issue_type), key=lambda item: item.number) + for issue_type in SUPPORTED_ISSUE_TYPES_ORDER + } + + +@beartype +def _render_issue_block(issue: HierarchyIssue) -> list[str]: + """Render one issue block for the hierarchy cache.""" + parent_text = "none" + if issue.parent is not None: + parent_text = f"#{issue.parent.number} {issue.parent.title}" + + child_text = "none" + if issue.children: + child_text = ", ".join(f"#{child.number} {child.title}" for child in issue.children) + + label_text = ", ".join(sorted(issue.labels, key=str.lower)) if issue.labels else "none" + return [ + f"### #{issue.number} {issue.title}", + f"- URL: {issue.url}", + f"- Parent: {parent_text}", + f"- Children: {child_text}", + f"- Labels: {label_text}", + f"- Summary: {issue.summary or 'No summary provided.'}", + "", + ] + + +@beartype +def _render_issue_section(*, title: str, issues: list[HierarchyIssue]) -> list[str]: + """Render one section of grouped issues.""" + lines = [f"## {title}", ""] + if not issues: + lines.extend(["_None_", ""]) + return lines + + for issue in issues: + lines.extend(_render_issue_block(issue)) + return lines + + +@beartype +def _require_repo_full_name_for_render( + *, repo_full_name: str, issues: list[HierarchyIssue], generated_at: str, fingerprint: str +) -> bool: 
+ _ = (issues, generated_at, fingerprint) + return _is_not_blank(repo_full_name) + + +@beartype +def _require_generated_at_for_render( + *, repo_full_name: str, issues: list[HierarchyIssue], generated_at: str, fingerprint: str +) -> bool: + _ = (repo_full_name, issues, fingerprint) + return _is_not_blank(generated_at) + + +@beartype +def _require_fingerprint_for_render( + *, repo_full_name: str, issues: list[HierarchyIssue], generated_at: str, fingerprint: str +) -> bool: + _ = (repo_full_name, issues, generated_at) + return _is_not_blank(fingerprint) + + +@beartype +@require(_require_repo_full_name_for_render, "repo_full_name must not be blank") +@require(_require_generated_at_for_render, "generated_at must not be blank") +@require(_require_fingerprint_for_render, "fingerprint must not be blank") def render_cache_markdown( *, repo_full_name: str, @@ -251,10 +438,7 @@ def render_cache_markdown( fingerprint: str, ) -> str: """Render deterministic markdown for the hierarchy cache.""" - grouped = { - "Epic": sorted((item for item in issues if item.issue_type == "Epic"), key=lambda item: item.number), - "Feature": sorted((item for item in issues if item.issue_type == "Feature"), key=lambda item: item.number), - } + grouped = _group_issues_by_type(issues) lines = [ "# GitHub Hierarchy Cache", @@ -264,36 +448,15 @@ def render_cache_markdown( f"- Fingerprint: `{fingerprint}`", f"- Included Issue Types: `{', '.join(sorted(SUPPORTED_ISSUE_TYPES))}`", "", - "Use this file as the first lookup source for parent Epic or Feature relationships during OpenSpec and GitHub issue setup.", + ( + "Use this file as the first lookup source for parent Epic or Feature relationships " + "during OpenSpec and GitHub issue setup." 
+ ), "", ] for section_name, issue_type in (("Epics", "Epic"), ("Features", "Feature")): - lines.append(f"## {section_name}") - lines.append("") - if not grouped[issue_type]: - lines.append("_None_") - lines.append("") - continue - - for issue in grouped[issue_type]: - lines.append(f"### #{issue.number} {issue.title}") - lines.append(f"- URL: {issue.url}") - parent_text = "none" - if issue.parent is not None: - parent_text = f"#{issue.parent.number} {issue.parent.title}" - lines.append(f"- Parent: {parent_text}") - - if issue.children: - child_text = ", ".join(f"#{child.number} {child.title}" for child in issue.children) - else: - child_text = "none" - lines.append(f"- Children: {child_text}") - - label_text = ", ".join(sorted(issue.labels, key=str.lower)) if issue.labels else "none" - lines.append(f"- Labels: {label_text}") - lines.append(f"- Summary: {issue.summary or 'No summary provided.'}") - lines.append("") + lines.extend(_render_issue_section(title=section_name, issues=grouped[issue_type])) return "\n".join(lines).rstrip() + "\n" @@ -326,8 +489,24 @@ def _write_state( @beartype -@require(lambda repo_owner: bool(repo_owner.strip()), "repo_owner must not be blank") -@require(lambda repo_name: bool(repo_name.strip()), "repo_name must not be blank") +def _require_repo_owner_for_sync( + *, repo_owner: str, repo_name: str, output_path: Path, state_path: Path, force: bool = False +) -> bool: + _ = (repo_name, output_path, state_path, force) + return _is_not_blank(repo_owner) + + +@beartype +def _require_repo_name_for_sync( + *, repo_owner: str, repo_name: str, output_path: Path, state_path: Path, force: bool = False +) -> bool: + _ = (repo_owner, output_path, state_path, force) + return _is_not_blank(repo_name) + + +@beartype +@require(_require_repo_owner_for_sync, "repo_owner must not be blank") +@require(_require_repo_name_for_sync, "repo_name must not be blank") def sync_cache( *, repo_owner: str, @@ -337,27 +516,22 @@ def sync_cache( force: bool = False, ) -> 
SyncResult:
     """Sync the local hierarchy cache from GitHub."""
-    fingerprint_issues = fetch_hierarchy_issues(
+    state = _load_state(state_path)
+    detailed_issues = fetch_hierarchy_issues(
         repo_owner=repo_owner,
         repo_name=repo_name,
-        fingerprint_only=True,
+        fingerprint_only=False,
     )
-    fingerprint = compute_hierarchy_fingerprint(fingerprint_issues)
-    state = _load_state(state_path)
+    fingerprint = compute_hierarchy_fingerprint(detailed_issues)
     if not force and state.get("fingerprint") == fingerprint and output_path.exists():
         return SyncResult(
             changed=False,
-            issue_count=len(fingerprint_issues),
+            issue_count=len(detailed_issues),
             fingerprint=fingerprint,
             output_path=output_path,
         )
-    detailed_issues = fetch_hierarchy_issues(
-        repo_owner=repo_owner,
-        repo_name=repo_name,
-        fingerprint_only=False,
-    )
     generated_at = datetime.now(tz=UTC).replace(microsecond=0).isoformat().replace("+00:00", "Z")
     output_path.parent.mkdir(parents=True, exist_ok=True)
     output_path.write_text(
@@ -410,15 +584,9 @@ def main(argv: list[str] | None = None) -> int:
         force=bool(args.force),
     )
     if result.changed:
-        print(
-            f"Updated GitHub hierarchy cache with {result.issue_count} issues at {result.output_path}",
-            file=sys.stdout,
-        )
+        sys.stdout.write(f"Updated GitHub hierarchy cache with {result.issue_count} issues at {result.output_path}\n")
     else:
-        print(
-            f"GitHub hierarchy cache unchanged ({result.issue_count} issues).",
-            file=sys.stdout,
-        )
+        sys.stdout.write(f"GitHub hierarchy cache unchanged ({result.issue_count} issues).\n")
     return 0
diff --git a/tests/unit/scripts/test_sync_github_hierarchy_cache.py b/tests/unit/scripts/test_sync_github_hierarchy_cache.py
index 26c7dc64..251fc294 100644
--- a/tests/unit/scripts/test_sync_github_hierarchy_cache.py
+++ b/tests/unit/scripts/test_sync_github_hierarchy_cache.py
@@ -3,15 +3,28 @@
 from __future__ import annotations
 
 import importlib.util
+import subprocess
 import sys
+from functools import lru_cache
 from pathlib import Path
-from typing import Any
+from typing import Any, TypedDict
 
 import pytest
 
 
+class IssueOptions(TypedDict, total=False):
+    """Optional test issue fields."""
+
+    labels: list[str]
+    summary: str
+    parent: tuple[int, str]
+    children: list[tuple[int, str]]
+    updated_at: str
+
+
+@lru_cache(maxsize=1)
 def _load_script_module() -> Any:
-    """Load scripts/sync_github_hierarchy_cache.py as a Python module."""
+    """Load scripts/sync_github_hierarchy_cache.py as a Python module (cached for stable types)."""
     script_path = Path(__file__).resolve().parents[3] / "scripts" / "sync_github_hierarchy_cache.py"
     spec = importlib.util.spec_from_file_location("sync_github_hierarchy_cache", script_path)
     if spec is None or spec.loader is None:
@@ -29,33 +42,34 @@ def _make_issue(
     number: int,
     title: str,
     issue_type: str,
-    labels: list[str],
-    summary: str,
-    parent_number: int | None = None,
-    parent_title: str | None = None,
-    children: list[tuple[int, str]] | None = None,
-    updated_at: str = "2026-04-09T08:00:00Z",
+    options: IssueOptions | None = None,
 ) -> Any:
     """Create a HierarchyIssue instance for tests."""
+    issue_options = options or {}
+    children = issue_options.get("children", [])
     child_links = [
         module.IssueLink(number=child_number, title=child_title, url=f"https://example.test/issues/{child_number}")
-        for child_number, child_title in (children or [])
+        for child_number, child_title in children
     ]
+
     parent_link = None
-    if parent_number is not None and parent_title is not None:
+    parent = issue_options.get("parent")
+    if parent is not None:
+        parent_number, parent_title = parent
         parent_link = module.IssueLink(
             number=parent_number,
             title=parent_title,
             url=f"https://example.test/issues/{parent_number}",
         )
+
     return module.HierarchyIssue(
         number=number,
         title=title,
         url=f"https://example.test/issues/{number}",
         issue_type=issue_type,
-        labels=labels,
-        summary=summary,
-        updated_at=updated_at,
+        labels=issue_options.get("labels", []),
+        summary=issue_options.get("summary", ""),
+        updated_at=issue_options.get("updated_at", "2026-04-09T08:00:00Z"),
         parent=parent_link,
         children=child_links,
     )
@@ -70,19 +84,22 @@ def test_compute_hierarchy_fingerprint_is_order_independent() -> None:
         number=485,
         title="[Epic] Governance",
         issue_type="Epic",
-        labels=["openspec", "Epic"],
-        summary="Governance epic.",
-        children=[(486, "[Feature] Alignment")],
+        options={
+            "labels": ["openspec", "Epic"],
+            "summary": "Governance epic.",
+            "children": [(486, "[Feature] Alignment")],
+        },
     )
     feature = _make_issue(
         module,
         number=486,
         title="[Feature] Alignment",
         issue_type="Feature",
-        labels=["Feature", "openspec"],
-        summary="Alignment feature.",
-        parent_number=485,
-        parent_title="[Epic] Governance",
+        options={
+            "labels": ["Feature", "openspec"],
+            "summary": "Alignment feature.",
+            "parent": (485, "[Epic] Governance"),
+        },
     )
 
     first = module.compute_hierarchy_fingerprint([epic, feature])
@@ -94,12 +111,51 @@ def test_extract_summary_skips_heading_only_lines() -> None:
     """Summary extraction should skip markdown section headers."""
     module = _load_script_module()
+    extract_summary = module._extract_summary  # pylint: disable=protected-access
 
-    summary = module._extract_summary("## Why\n\nThis cache avoids repeated GitHub lookups.")
+    summary = extract_summary("## Why\n\nThis cache avoids repeated GitHub lookups.")
 
     assert summary == "This cache avoids repeated GitHub lookups."
+
+
+@pytest.mark.parametrize(
+    ("url", "expected"),
+    [
+        ("https://github.com/nold-ai/specfact-cli.git", "specfact-cli"),
+        ("git@github.com:nold-ai/specfact-cli.git", "specfact-cli"),
+        ("https://github.com/org/my-repo/", "my-repo"),
+    ],
+)
+def test_parse_repo_name_from_remote_url(url: str, expected: str) -> None:
+    """Remote URL tail parsing should yield the GitHub repository name."""
+    module = _load_script_module()
+    assert module.parse_repo_name_from_remote_url(url) == expected
+
+
+def test_parse_repo_name_from_remote_url_empty_returns_none() -> None:
+    """Blank remote URLs should not produce a repository name."""
+    module = _load_script_module()
+    assert module.parse_repo_name_from_remote_url("") is None
+    assert module.parse_repo_name_from_remote_url(" ") is None
+
+
+def test_default_repo_name_matches_git_origin_url() -> None:
+    """When ``remote.origin.url`` exists, DEFAULT_REPO_NAME must match its repository segment (worktrees)."""
+    module = _load_script_module()
+    scripts_dir = Path(__file__).resolve().parents[3] / "scripts"
+    completed = subprocess.run(
+        ["git", "-C", str(scripts_dir), "config", "--get", "remote.origin.url"],
+        check=False,
+        capture_output=True,
+        text=True,
+    )
+    if completed.returncode != 0 or not completed.stdout.strip():
+        pytest.skip("No git origin in this environment")
+    expected = module.parse_repo_name_from_remote_url(completed.stdout)
+    assert expected is not None
+    assert expected == module.DEFAULT_REPO_NAME
+
+
 def test_default_paths_use_ephemeral_specfact_backlog_cache() -> None:
     """Default cache files should live in ignored .specfact/backlog storage."""
     module = _load_script_module()
@@ -108,6 +164,20 @@ def test_default_paths_use_ephemeral_specfact_backlog_cache() -> None:
     assert str(module.DEFAULT_STATE_PATH) == ".specfact/backlog/github_hierarchy_cache_state.json"
 
 
+def test_child_links_include_only_epic_and_feature_subissues() -> None:
+    """Sub-issue GraphQL nodes should contribute children only when type is Epic or Feature."""
+    module = _load_script_module()
+    child_links = module._child_links(  # pylint: disable=protected-access
+        [
+            {"number": 1, "title": "Task", "url": "https://example.test/1", "issueType": {"name": "Task"}},
+            {"number": 2, "title": "Feat", "url": "https://example.test/2", "issueType": {"name": "Feature"}},
+            {"number": 3, "title": "Ep", "url": "https://example.test/3", "issueType": {"name": "Epic"}},
+            {"number": 4, "title": "Untyped", "url": "https://example.test/4"},
+        ]
+    )
+    assert [link.number for link in child_links] == [2, 3]
+
+
 def test_render_cache_markdown_groups_epics_and_features() -> None:
     """Rendered markdown should be deterministic and grouped by issue type."""
     module = _load_script_module()
@@ -118,19 +188,22 @@ def test_render_cache_markdown_groups_epics_and_features() -> None:
             number=486,
             title="[Feature] Alignment",
             issue_type="Feature",
-            labels=["openspec", "Feature"],
-            summary="Alignment feature.",
-            parent_number=485,
-            parent_title="[Epic] Governance",
+            options={
+                "labels": ["openspec", "Feature"],
+                "summary": "Alignment feature.",
+                "parent": (485, "[Epic] Governance"),
+            },
         ),
         _make_issue(
             module,
             number=485,
             title="[Epic] Governance",
             issue_type="Epic",
-            labels=["Epic", "openspec"],
-            summary="Governance epic.",
-            children=[(486, "[Feature] Alignment")],
+            options={
+                "labels": ["Epic", "openspec"],
+                "summary": "Governance epic.",
+                "children": [(486, "[Feature] Alignment")],
+            },
         ),
     ]
@@ -166,19 +239,24 @@ def test_sync_cache_skips_write_when_fingerprint_is_unchanged(monkeypatch: pytes
             number=485,
             title="[Epic] Governance",
             issue_type="Epic",
-            labels=["Epic"],
-            summary="Governance epic.",
+            options={
+                "labels": ["Epic"],
+                "summary": "Governance epic.",
+            },
         )
     ]
 
     def _fake_fetch(*, repo_owner: str, repo_name: str, fingerprint_only: bool) -> list[Any]:
         assert repo_owner == "nold-ai"
         assert repo_name == "specfact-cli"
-        assert fingerprint_only is True
+        assert fingerprint_only is False
         return issues
 
+    def _same_fingerprint(_: list[Any]) -> str:
+        return "same"
+
     monkeypatch.setattr(module, "fetch_hierarchy_issues", _fake_fetch)
-    monkeypatch.setattr(module, "compute_hierarchy_fingerprint", lambda _: "same")
+    monkeypatch.setattr(module, "compute_hierarchy_fingerprint", _same_fingerprint)
 
     result = module.sync_cache(
         repo_owner="nold-ai",
@@ -190,3 +268,129 @@ def _fake_fetch(*, repo_owner: str, repo_name: str, fingerprint_only: bool) -> l
     assert result.changed is False
     assert result.issue_count == 1
     assert output_path.read_text(encoding="utf-8") == "unchanged cache\n"
+
+
+def test_sync_cache_force_rewrites_when_fingerprint_unchanged(monkeypatch: pytest.MonkeyPatch, tmp_path: Path) -> None:
+    """sync_cache with force=True must rewrite output even when fingerprint matches state."""
+    module = _load_script_module()
+
+    output_path = tmp_path / "GITHUB_HIERARCHY_CACHE.md"
+    state_path = tmp_path / ".github_hierarchy_cache_state.json"
+    output_path.write_text("stale cache\n", encoding="utf-8")
+    state_path.write_text('{"fingerprint":"same"}', encoding="utf-8")
+
+    issues = [
+        _make_issue(
+            module,
+            number=485,
+            title="[Epic] Governance",
+            issue_type="Epic",
+            options={
+                "labels": ["Epic"],
+                "summary": "Governance epic.",
+            },
+        )
+    ]
+
+    def _fake_fetch(*, repo_owner: str, repo_name: str, fingerprint_only: bool) -> list[Any]:
+        assert repo_owner == "nold-ai"
+        assert repo_name == "specfact-cli"
+        assert fingerprint_only is False
+        return issues
+
+    monkeypatch.setattr(module, "fetch_hierarchy_issues", _fake_fetch)
+    monkeypatch.setattr(module, "compute_hierarchy_fingerprint", lambda _: "same")
+
+    result = module.sync_cache(
+        repo_owner="nold-ai",
+        repo_name="specfact-cli",
+        output_path=output_path,
+        state_path=state_path,
+        force=True,
+    )
+
+    assert result.changed is True
+    assert "# GitHub Hierarchy Cache" in output_path.read_text(encoding="utf-8")
+
+
+def test_sync_cache_propagates_graphql_failure(monkeypatch: pytest.MonkeyPatch, tmp_path: Path) -> None:
+    """RuntimeError from GitHub GraphQL should surface to callers."""
+    module = _load_script_module()
+
+    def _boom(_query: str, *, repo_owner: str, repo_name: str, **_kwargs: Any) -> Any:
+        assert repo_owner == "nold-ai"
+        assert repo_name == "specfact-cli"
+        raise RuntimeError("graphql failed")
+
+    monkeypatch.setattr(module, "_run_graphql_query", _boom)
+
+    with pytest.raises(RuntimeError, match="graphql failed"):
+        module.sync_cache(
+            repo_owner="nold-ai",
+            repo_name="specfact-cli",
+            output_path=tmp_path / "out.md",
+            state_path=tmp_path / "state.json",
+        )
+
+
+def test_sync_cache_malformed_state_regenerates_cache(monkeypatch: pytest.MonkeyPatch, tmp_path: Path) -> None:
+    """Invalid state JSON is treated as missing state and triggers a full sync."""
+    module = _load_script_module()
+
+    output_path = tmp_path / "GITHUB_HIERARCHY_CACHE.md"
+    state_path = tmp_path / ".github_hierarchy_cache_state.json"
+    state_path.write_text("{not-json", encoding="utf-8")
+
+    issues = [
+        _make_issue(
+            module,
+            number=485,
+            title="[Epic] Governance",
+            issue_type="Epic",
+            options={
+                "labels": ["Epic"],
+                "summary": "Governance epic.",
+            },
+        )
+    ]
+
+    fetch_calls = 0
+
+    def _fake_fetch(*, repo_owner: str, repo_name: str, fingerprint_only: bool) -> list[Any]:
+        nonlocal fetch_calls
+        fetch_calls += 1
+        assert repo_owner == "nold-ai"
+        assert repo_name == "specfact-cli"
+        assert fingerprint_only is False
+        return issues
+
+    monkeypatch.setattr(module, "fetch_hierarchy_issues", _fake_fetch)
+
+    result = module.sync_cache(
+        repo_owner="nold-ai",
+        repo_name="specfact-cli",
+        output_path=output_path,
+        state_path=state_path,
+    )
+
+    assert fetch_calls == 1
+    assert result.changed is True
+    assert "# GitHub Hierarchy Cache" in output_path.read_text(encoding="utf-8")
+
+
+def test_default_repo_name_falls_back_when_git_unavailable(monkeypatch: pytest.MonkeyPatch) -> None:
+    """If ``git`` is missing, DEFAULT_REPO_NAME must use the checkout directory fallback."""
+    _load_script_module.cache_clear()
+    sys.modules.pop("sync_github_hierarchy_cache", None)
+
+    def _no_git(*_args: Any, **_kwargs: Any) -> Any:
+        raise FileNotFoundError("git not found")
+
+    monkeypatch.setattr(subprocess, "run", _no_git)
+    module = _load_script_module()
+    script_path = Path(__file__).resolve().parents[3] / "scripts" / "sync_github_hierarchy_cache.py"
+    expected_fallback = script_path.resolve().parents[1].name
+    assert expected_fallback == module.DEFAULT_REPO_NAME
+
+    _load_script_module.cache_clear()
+    sys.modules.pop("sync_github_hierarchy_cache", None)