refactor: replace per-file GitHub API calls with ZIP archive downloads #2607

Merged
baktun14 merged 7 commits into main from refactor/template-fetching
Jan 29, 2026
Conversation

@baktun14 baktun14 commented Jan 29, 2026

Summary

  • Replaces hundreds of individual Octokit repos.getContent API calls per refresh cycle with 3 ZIP archive downloads from GitHub's CDN (not rate-limited)
  • Adds GitHubArchiveService that downloads and parses repo archives using yauzl, providing readFile() and listDirectory() via an ArchiveReader interface
  • Chain registry metadata now fetched from raw.githubusercontent.com instead of the GitHub API
  • Adds periodic template gallery refresh (default 15-minute TTL) via TemplateRefreshService, so new templates appear without redeploying the API
  • Extracts TemplateGalleryService into a shared DI singleton so the controller and refresh service operate on the same in-memory cache
  • Uses a setTimeout chain (not setInterval) to guarantee no overlapping refreshes and clean disposal on shutdown
  • Refresh is a no-op if GITHUB_PAT is not configured
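
The setTimeout-chain point can be illustrated with a minimal sketch. It assumes a hypothetical `RefreshLoop` class (not the PR's `TemplateRefreshService`), and the constructor parameters are illustrative. The next run is scheduled only after the previous one settles, which is what rules out overlapping refreshes:

```typescript
// Hypothetical sketch; class and parameter names are assumptions, not the PR's code.
type RefreshFn = () => Promise<void>;

class RefreshLoop {
  private timer: ReturnType<typeof setTimeout> | undefined;
  private started = false;
  private disposed = false;

  constructor(
    private readonly refresh: RefreshFn,
    private readonly intervalMs: number,
    private readonly onError: (error: unknown) => void = console.error
  ) {}

  /** True while a timer for the next run is pending. */
  get isScheduled(): boolean {
    return this.timer !== undefined;
  }

  start(): void {
    if (this.started || this.disposed) return;
    this.started = true;
    this.scheduleNext();
  }

  /** Cancels the pending timer and prevents any further scheduling. */
  dispose(): void {
    this.disposed = true;
    if (this.timer !== undefined) {
      clearTimeout(this.timer);
      this.timer = undefined;
    }
  }

  private scheduleNext(): void {
    this.timer = setTimeout(async () => {
      this.timer = undefined;
      try {
        await this.refresh();
      } catch (error) {
        // A failed refresh is reported but does not stop the chain.
        this.onError(error);
      }
      // Scheduled only after the run above has settled, so runs never overlap.
      if (!this.disposed) this.scheduleNext();
    }, this.intervalMs);
  }
}
```

With `setInterval`, a refresh that takes longer than the interval would start a second run while the first is still in flight; the chain above avoids that at the cost of the effective period being the interval plus the refresh duration.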

How the refresh works

  1. Every 15 minutes (configurable via TEMPLATE_REFRESH_INTERVAL_SECONDS), the refresh service calls TemplateGalleryService.refreshCache()
  2. refreshCache() rebuilds the cache file (SHA-based — skips ZIP download if nothing changed), then resets in-memory caches (#galleriesCache, #parsedTemplates) and clears the GitHubArchiveService archive cache to free old entries
  3. The next request reads the fresh cache file; requests during refresh continue serving the old in-memory data

New files

| File | Purpose |
| --- | --- |
| providers/template-gallery.provider.ts | Shared singleton TemplateGalleryService via DI |
| providers/template-refresh.provider.ts | Registers TemplateRefreshService as APP_INITIALIZER with DisposableRegistry |
| services/template-refresh/template-refresh.service.ts | setTimeout-chain periodic refresh with error logging and dispose cleanup |

Test plan

  • TypeScript type checking passes (npx tsc --noEmit)
  • All 73 template unit tests pass
  • Manual test: start API with GITHUB_PAT set, verify "TEMPLATE_REFRESH_REGISTERED" log appears, temporarily set TEMPLATE_REFRESH_INTERVAL_SECONDS=30 and confirm refresh logs appear

Summary by CodeRabbit

  • New Features

    • Archive-based repository retrieval for template content (new archive service)
    • Automatic periodic template cache refresh (configurable via env)
  • Improvements

    • Template fetching now uses archive reads for more reliable listings and file access
    • Service wiring updated to support cache/refresh lifecycle and simplified gallery sources (legacy linux-server removed)
  • Tests

    • Tests migrated to archive-based mocks and expanded for archive error scenarios

@baktun14 baktun14 requested a review from a team as a code owner January 29, 2026 17:51
coderabbitai bot commented Jan 29, 2026

Walkthrough

Adds a GitHubArchiveService to download and parse GitHub ZIP archives (yauzl), rewires TemplateFetcherService to use archive-based reads, adds TemplateRefreshService and providers, updates DI wiring and tests, and introduces TEMPLATE_REFRESH_INTERVAL_SECONDS / TEMPLATE_REFRESH_ENABLED env vars.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Package Dependencies: apps/api/package.json | Added runtime yauzl and dev deps @types/yauzl, yazl, @types/yazl. |
| GitHub Archive Service: apps/api/src/template/services/github-archive/github-archive.service.ts, apps/api/src/template/services/github-archive/github-archive.service.spec.ts | New service that downloads repo ZIPs, parses via yauzl, normalizes root prefix, builds in-memory file/dir maps, exposes ArchiveReader (readFile/listDirectory) with per-owner/repo/ref LRU cache; tests use in-memory ZIPs. |
| Template Fetcher Service: apps/api/src/template/services/template-fetcher/template-fetcher.service.ts, apps/api/src/template/services/template-fetcher/template-fetcher.service.spec.ts | Replaced Octokit anonymous/REST flows with archive-based reads via GitHubArchiveService; constructor gains archiveService; added isTemplateFile and clearArchiveCache(); tests migrated to archive mocks and updated expectations. |
| Template Gallery & Providers: apps/api/src/template/providers/template-gallery.provider.ts, apps/api/src/template/providers/template-refresh.provider.ts, apps/api/src/template/index.ts | Added TEMPLATE_GALLERY_SERVICE DI provider, APP_INITIALIZER wiring for TemplateRefreshService, and side-effect imports to register providers. |
| Template Refresh Service: apps/api/src/template/services/template-refresh/template-refresh.service.ts | New AppInitializer/Disposable that periodically calls TemplateGalleryService.refreshCache honoring TEMPLATE_REFRESH_INTERVAL_SECONDS and TEMPLATE_REFRESH_ENABLED; includes logging and disposal. |
| Template Gallery Changes: apps/api/src/template/services/template-gallery/template-gallery.service.ts, apps/api/src/template/services/template-gallery/template-gallery.service.spec.ts | Now constructs TemplateFetcherService with archive service when PAT is present; removed linux-server source from aggregation and adjusted tests. |
| Controller & Config: apps/api/src/template/controllers/template/template.controller.ts, apps/api/src/template/config/env.config.ts | TemplateController now injected with TEMPLATE_GALLERY_SERVICE; added TEMPLATE_REFRESH_INTERVAL_SECONDS and TEMPLATE_REFRESH_ENABLED to env schema and inferred config type. |
| Tests Updated: apps/api/src/template/services/template-fetcher/...spec.ts, apps/api/src/template/services/github-archive/...spec.ts | Tests migrated from Octokit mocks to archive/ArchiveReader-based mocks; added ZIP creation helpers and archive failure scenarios. |

Sequence Diagram

sequenceDiagram
    participant TFS as TemplateFetcherService
    participant GAS as GitHubArchiveService
    participant GitHub as GitHub (ZIP)
    participant Parser as yauzl Parser
    participant Cache as Archive Cache

    TFS->>GAS: getArchive(owner, repo, ref)
    GAS->>Cache: lookup cache key
    alt cache hit
        Cache-->>GAS: ArchiveReader
    else cache miss
        GAS->>GitHub: GET /{owner}/{repo}/archive/{ref}.zip
        GitHub-->>GAS: ZIP bytes
        GAS->>Parser: parse ZIP (yauzl)
        Parser-->>GAS: entries + root prefix
        GAS->>Cache: store ArchiveReader
    end
    GAS-->>TFS: ArchiveReader
    TFS->>GAS: readFile(path) / listDirectory(path)
    GAS-->>TFS: file contents / directory entries
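
The cache hit/miss branch in the diagram can be sketched as a bounded cache of in-flight promises. This is only an illustration: `PromiseLruCache`, `archiveCacheKey`, the key format, and the size limit are assumptions and are not taken from the PR's `GitHubArchiveService`.

```typescript
// Hypothetical sketch; names, key format, and limit are assumptions, not the PR's code.
class PromiseLruCache<V> {
  private readonly entries = new Map<string, Promise<V>>();

  constructor(private readonly maxEntries: number) {}

  get size(): number {
    return this.entries.size;
  }

  has(key: string): boolean {
    return this.entries.has(key);
  }

  getOrLoad(key: string, load: () => Promise<V>): Promise<V> {
    const cached = this.entries.get(key);
    if (cached) {
      // Cache hit: re-insert so the key becomes the most recently used
      // (a Map iterates in insertion order).
      this.entries.delete(key);
      this.entries.set(key, cached);
      return cached;
    }

    // Cache miss: store the pending promise so concurrent callers share one download.
    const pending = load();
    this.entries.set(key, pending);

    // Drop failed loads so the next caller retries instead of reusing a rejection.
    pending.catch(() => {
      if (this.entries.get(key) === pending) this.entries.delete(key);
    });

    // Evict the least recently used entry once the limit is exceeded.
    if (this.entries.size > this.maxEntries) {
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
    return pending;
  }
}

// One cache entry per owner/repo/ref combination.
function archiveCacheKey(owner: string, repo: string, ref: string): string {
  return `${owner}/${repo}@${ref}`;
}
```

Caching the promise rather than the resolved reader means two callers that ask for the same archive during a download wait on the same request.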

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Suggested reviewers

  • stalniy

Poem

🐇 I hopped through zipped midnight stacks,

Unzipped roots and traced the tracks,
Yauzl hummed while caches grew,
Templates found and queued anew,
A rabbit guards the archive racks.

🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
✅ Passed checks (2 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The PR title accurately reflects the main objective: replacing individual GitHub API calls with ZIP archive downloads. This is the primary change across multiple service files. |

codecov bot commented Jan 29, 2026

Codecov Report

❌ Patch coverage is 68.84422% with 62 lines in your changes missing coverage. Please review.
✅ Project coverage is 50.41%. Comparing base (13bfeeb) to head (1be91b8).
⚠️ Report is 5 commits behind head on main.
✅ All tests successful. No failed tests found.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| ...vices/template-refresh/template-refresh.service.ts | 12.90% | 23 Missing and 4 partials ⚠️ |
| .../services/github-archive/github-archive.service.ts | 87.64% | 11 Missing ⚠️ |
| ...rc/template/providers/template-refresh.provider.ts | 57.14% | 6 Missing ⚠️ |
| ...vices/template-fetcher/template-fetcher.service.ts | 85.00% | 6 Missing ⚠️ |
| ...rc/template/providers/template-gallery.provider.ts | 61.53% | 5 Missing ⚠️ |
| ...vices/template-gallery/template-gallery.service.ts | 44.44% | 5 Missing ⚠️ |
| apps/api/src/template/config/env.config.ts | 0.00% | 1 Missing ⚠️ |
| ...mplate/controllers/template/template.controller.ts | 50.00% | 1 Missing ⚠️ |

❌ Your patch status has failed because the patch coverage (68.84%) is below the target coverage (80.00%). You can increase the patch coverage or adjust the target coverage.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2607      +/-   ##
==========================================
+ Coverage   50.06%   50.41%   +0.35%     
==========================================
  Files        1025     1029       +4     
  Lines       29025    29186     +161     
  Branches     6626     6631       +5     
==========================================
+ Hits        14531    14715     +184     
+ Misses      14213    14076     -137     
- Partials      281      395     +114     

| Flag | Coverage Δ |
| --- | --- |
| api | 78.55% <68.84%> (-0.29%) ⬇️ |
| deploy-web | 32.23% <ø> (+0.39%) ⬆️ |
| log-collector | 75.35% <ø> (ø) |
| notifications | 87.94% <ø> (ø) |
| provider-console | 81.48% <ø> (ø) |
| provider-proxy | 84.35% <ø> (ø) |

| Files with missing lines | Coverage Δ |
| --- | --- |
| apps/api/src/template/config/env.config.ts | 66.66% <0.00%> (-33.34%) ⬇️ |
| ...mplate/controllers/template/template.controller.ts | 38.46% <50.00%> (-5.99%) ⬇️ |
| ...rc/template/providers/template-gallery.provider.ts | 61.53% <61.53%> (ø) |
| ...vices/template-gallery/template-gallery.service.ts | 94.95% <44.44%> (-4.16%) ⬇️ |
| ...rc/template/providers/template-refresh.provider.ts | 57.14% <57.14%> (ø) |
| ...vices/template-fetcher/template-fetcher.service.ts | 86.56% <85.00%> (-4.55%) ⬇️ |
| .../services/github-archive/github-archive.service.ts | 87.64% <87.64%> (ø) |
| ...vices/template-refresh/template-refresh.service.ts | 12.90% <12.90%> (ø) |

... and 51 files with indirect coverage changes

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

🤖 Fix all issues with AI agents
In `@apps/api/src/template/services/github-archive/github-archive.service.ts`:
- Around line 15-31: The `#cache` in getArchive grows unbounded by ref causing
potential memory bloat; replace the raw Map<string, Promise<ArchiveReader>> with
a bounded cache (e.g., an LRU or size-limited wrapper) so that when adding a new
entry from getArchive (the cacheKey created in getArchive and the promise from
`#downloadAndParse`) older entries are evicted once a max size or total memory
threshold is reached; ensure eviction also deletes from the underlying storage
so failed downloads still remove keys (preserve the existing try/catch that
deletes on error) and reference the existing symbols (`#cache`, getArchive,
`#downloadAndParse`, ArchiveReader) when implementing the bounded-cache
replacement.
- Around line 60-98: Normalize incoming paths in the `#createArchiveReader` so
leading "./" or "/" don't break lookups: inside readFile and listDirectory,
strip any leading "./" and leading "/" from the received path (and treat empty
or "." as ""), then build fullPath and dirPath by concatenating rootPrefix and
the normalized path with exactly one "/" separator (avoid double slashes), and
compute entryPath from the normalized path when pushing DirectoryEntry; update
both readFile and listDirectory logic to use this normalizedPath variable so all
archive lookups are consistent.

In `@apps/api/src/template/services/template-fetcher/template-fetcher.service.ts`:
- Around line 81-94: fetchFileContent currently throws for missing refs/files
and archive read failures which lets transient archive errors abort the whole
refresh; update TemplateFetcherService so fetchTemplatesFromReadme wraps calls
to fetchFileContent (and any calls to this.#archiveService.getArchive /
archive.readFile) in a try/catch, log the error, and return a safe fallback
(e.g., null or an empty category list) instead of letting the error propagate;
ensure fetchTemplatesFromReadme returns the safe fallback when fetchFileContent
fails so the overall template refresh remains resilient.
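
The path normalization requested for `readFile` and `listDirectory` could look roughly like the sketch below. The helper names `normalizeArchivePath` and `joinArchivePath` are hypothetical, and the root prefix format (for example `repo-main/`) is an assumption about how the archive's top-level folder is represented.

```typescript
// Hypothetical helpers; names and root-prefix format are assumptions, not the PR's code.

/** Strips leading "./" and "/" plus trailing slashes; "" and "." both mean the archive root. */
function normalizeArchivePath(path: string): string {
  const withoutLeading = path.trim().replace(/^(?:\.\/|\/)+/, "");
  if (withoutLeading === ".") return "";
  return withoutLeading.replace(/\/+$/, "");
}

/** Joins the archive root prefix and a caller-supplied path with exactly one "/". */
function joinArchivePath(rootPrefix: string, path: string): string {
  const root = rootPrefix.replace(/\/+$/, "");
  const relative = normalizeArchivePath(path);
  if (root === "") return relative;
  return relative === "" ? root : `${root}/${relative}`;
}
```

Under these assumptions, `joinArchivePath("repo-main/", "./src/")` and `joinArchivePath("repo-main", "/src")` both resolve to `repo-main/src`, so lookups stay consistent regardless of how the caller spelled the path.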

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

🤖 Fix all issues with AI agents
In `@apps/api/src/template/config/env.config.ts`:
- Around line 5-8: The env schema for TEMPLATE_REFRESH_INTERVAL_SECONDS uses
z.number(), which rejects string env values; update the schema in
apps/api/src/template/config/env.config.ts to use z.coerce.number() instead of
z.number() so environment strings (e.g., "900") are coerced to numbers, keeping
.optional() and .default(15 * 60) as-is and leaving other fields unchanged.

In `@apps/api/src/template/services/github-archive/github-archive.service.ts`:
- Around line 95-108: The extraction silently skips entries on
zipfile.openReadStream or readStream errors; update the handler in
github-archive.service.ts (the zipfile.openReadStream callback and the
readStream.on("error") handler) to capture and surface errors instead of just
calling zipfile.readEntry(): log the error with context (include relativePath
and the original error) via the service logger or push the error into a
collection (e.g., an extractionErrors array) so callers can inspect/report
missing files, then continue processing by calling zipfile.readEntry().
- Around line 43-55: The fetch in async `#downloadAndParse`(owner, repo, ref) can
hang; wrap the request with an AbortController timeout: create an
AbortController, set a timer (e.g., 30s) that calls controller.abort(), pass
controller.signal to fetch(url, { signal }), and clear the timer after fetch
completes; ensure you catch the abort error and rethrow a descriptive Error
(including url and timeout) so the caller fails fast instead of hanging in
`#downloadAndParse` and downstream in `#extractArchive`/`#createArchiveReader`.
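
The AbortController timeout suggested above can be sketched as follows. `fetchWithTimeout` and the injectable `fetchImpl` parameter are hypothetical (the injection is only there to make the helper easy to exercise without network access), and the sketch assumes a runtime with global `fetch` and `AbortController`, such as Node 18 or later.

```typescript
// Hypothetical helper; the name and the injectable fetchImpl are assumptions, not the PR's code.
type FetchLike = (url: string, init: { signal: AbortSignal }) => Promise<Response>;

async function fetchWithTimeout(url: string, timeoutMs: number, fetchImpl: FetchLike = fetch): Promise<Response> {
  const controller = new AbortController();
  // Abort the request if no response has arrived within timeoutMs.
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    return await fetchImpl(url, { signal: controller.signal });
  } catch (error) {
    if (controller.signal.aborted) {
      // Replace the generic abort error with one that names the URL and the limit.
      throw new Error(`Request to ${url} timed out after ${timeoutMs}ms`);
    }
    throw error;
  } finally {
    // Always clear the timer so it cannot fire after the request has settled.
    clearTimeout(timer);
  }
}
```

Note that the timer here is cleared as soon as the response headers arrive, so a stalled body download would not be covered; extending the timeout over the body read would be a separate decision.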
🧹 Nitpick comments (2)
apps/api/src/template/services/github-archive/github-archive.service.ts (1)

19-21: Consider injecting LoggerService for observability.

This service performs network I/O and archive parsing but has no logging. Injecting LoggerService would provide visibility into download times, parsing errors, and cache behavior, consistent with other services in the codebase.

apps/api/src/template/services/template-gallery/template-gallery.service.ts (1)

36-42: Consider injecting GitHubArchiveService for improved testability.

Creating new GitHubArchiveService() directly couples the service instantiation. While functional, injecting it as a dependency would make unit testing easier by allowing mock injection without modifying the constructor call site.

@@ -111,6 +111,7 @@
"tsyringe": "^4.10.0",
@github-actions github-actions bot Jan 29, 2026

🔄 Carefully review the package-lock.json diff

Resolve the comment if everything is ok

+ node_modules/@types/yauzl                                                                2.10.3  
+ node_modules/@types/yazl                                                                 3.3.0   
+ node_modules/yazl/node_modules/buffer-crc32                                              1.0.0   
+ node_modules/yazl                                                                        3.3.1   
- node_modules/git-semver-tags/node_modules/conventional-commits-filter                    5.0.0   
- node_modules/git-semver-tags/node_modules/conventional-commits-parser                    6.2.1   

stalniy
stalniy previously approved these changes Jan 29, 2026
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
apps/api/src/template/services/template-fetcher/template-fetcher.service.ts (1)

144-148: Replace Promise<any> with a proper return type.

The return type Promise<any> violates the coding guideline to never use type any. Based on the method logic, this should return the template type from processTemplate or null.

🔧 Suggested fix
   private async processTemplateSource(
     templateSource: TemplateSource,
     directoryItems: GithubDirectoryItem[],
     options: { includeConfigJson?: boolean }
-  ): Promise<any> {
+  ): Promise<Template | null> {

You may need to import the Template type if not already imported:

import type { Category, Template, TemplateSource } from "../../types/template.ts";

As per coding guidelines: **/*.{ts,tsx,js}: Never use type any or cast to type any. Always define the proper TypeScript types.

🤖 Fix all issues with AI agents
In `@apps/api/src/template/config/env.config.ts`:
- Around line 4-9: The TEMPLATE_REFRESH_ENABLED schema currently uses
z.coerce.boolean() which treats any non-empty string (e.g. "false") as true;
replace that with z.preprocess to explicitly map common string representations
("true","1","yes" => true; "false","0","no" => false) before validating with
z.boolean().optional().default(true), using z.preprocess(...) and leaving
TEMPLATE_REFRESH_INTERVAL_SECONDS and GITHUB_PAT unchanged; update the
TEMPLATE_REFRESH_ENABLED declaration to use this explicit string-to-boolean
preprocessing so environment values like "false" correctly become false.

In `@apps/api/src/template/services/github-archive/github-archive.service.ts`:
- Around line 30-43: The cache key in getArchive (cacheKey) currently ignores
the fileFilter so cached promises from a previous call (via this.#cache) can be
reused for different filters; update the caching strategy in getArchive to
include a stable identifier for fileFilter (e.g., append fileFilter?.toString()
or a provided filterId) to the cacheKey or maintain a nested cache keyed by
filter identity, then use that augmented key when calling this.#cache.get,
this.#cache.set and this.#cache.delete so each distinct filter/ref/owner/repo
combination gets its own cached promise from this.#downloadAndParse.
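
The boolean-coercion pitfall described for TEMPLATE_REFRESH_ENABLED comes from JavaScript's `Boolean("false")` evaluating to `true`. A dependency-free sketch of the explicit mapping is shown below; `parseBooleanEnv` is a hypothetical helper, and in the zod schema the same mapping would sit inside the `z.preprocess` callback. Throwing on unrecognized values is a choice made for this sketch, not something the review prescribes.

```typescript
// Hypothetical helper; not the PR's code. Mirrors the mapping suggested in the review.
function parseBooleanEnv(value: string | undefined, defaultValue: boolean): boolean {
  if (value === undefined || value.trim() === "") return defaultValue;

  switch (value.trim().toLowerCase()) {
    case "true":
    case "1":
    case "yes":
      return true;
    case "false":
    case "0":
    case "no":
      return false;
    default:
      // Fail loudly instead of silently treating an unknown string as true.
      throw new Error(`Invalid boolean value: "${value}"`);
  }
}

// The pitfall itself: any non-empty string is truthy, including "false".
const naiveCoercion = Boolean("false");
const explicitParse = parseBooleanEnv("false", true);
```

Here `naiveCoercion` is `true` while `explicitParse` is `false`, which is the difference the review comment is pointing at.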
🧹 Nitpick comments (4)
apps/api/src/template/services/template-gallery/template-gallery.service.spec.ts (1)

246-262: Align setup helper signature with test guidelines.

The helper should accept a single inline-typed parameter to support overrides and avoid shared state.

♻️ Suggested adjustment
-  function setup() {
+  function setup({ githubPAT = "test-pat" }: { githubPAT?: string } = {}) {
     const logger = mock<LoggerService>();
     const fsMock = mock<FileSystemApi>({
       access: jest.fn(() => Promise.reject(new Error("File not found")))
     });
     const dataFolderPath = "/data";

     const service = new TemplateGalleryService(logger, fsMock, {
-      githubPAT: "test-pat",
+      githubPAT,
       dataFolderPath
     });
As per coding guidelines: Use `setup` function instead of `beforeEach` in test files. The `setup` function must be at the bottom of the root `describe` block, accept a single parameter with inline type definition, avoid shared state, and not have a specified return type.
apps/api/src/template/services/github-archive/github-archive.service.spec.ts (1)

83-99: Align setup helper signature with test guidelines.

Adopt a single inline-typed parameter for the setup helper.

♻️ Suggested adjustment
-  function setup() {
-    const logger = mock<LoggerService>();
+  function setup({ logger = mock<LoggerService>() }: { logger?: LoggerService } = {}) {
     const service = new GitHubArchiveService(logger);
As per coding guidelines: Use `setup` function instead of `beforeEach` in test files. The `setup` function must be at the bottom of the root `describe` block, accept a single parameter with inline type definition, avoid shared state, and not have a specified return type.
apps/api/src/template/services/template-fetcher/template-fetcher.service.spec.ts (1)

472-483: Align setup helper signature with test guidelines.

The helper should accept a single inline-typed parameter to support overrides.

♻️ Suggested adjustment
-  function setup() {
+  function setup({ githubPAT = "test-pat" }: { githubPAT?: string } = {}) {
     const templateProcessor = mock<TemplateProcessorService>();
     const logger = mock<LoggerService>();
     const octokit = mockDeep<Octokit>();
     const archiveService = mock<GitHubArchiveService>();

     const service = new TemplateFetcherService(templateProcessor, logger, () => octokit, archiveService, {
-      githubPAT: "test-pat"
+      githubPAT
     });

     return { service, templateProcessor, logger, octokit, archiveService };
   }
As per coding guidelines: Use `setup` function instead of `beforeEach` in test files. The `setup` function must be at the bottom of the root `describe` block, accept a single parameter with inline type definition, avoid shared state, and not have a specified return type.
apps/api/src/template/services/template-fetcher/template-fetcher.service.ts (1)

124-133: Consider extracting the hardcoded chain-registry reference.

The "master" branch is hardcoded for the cosmos/chain-registry repository. If the default branch changes (as many repos have moved to "main"), this would silently fail.

Consider extracting this to a constant similar to REPOSITORIES for consistency and easier maintenance.

♻️ Suggested refactor
 export const REPOSITORIES = {
   "awesome-akash": {
     repoOwner: "akash-network",
     repoName: "awesome-akash",
     mainBranch: "master"
   },
   "cosmos-omnibus": {
     repoOwner: "akash-network",
     repoName: "cosmos-omnibus",
     mainBranch: "master"
+  },
+  "chain-registry": {
+    repoOwner: "cosmos",
+    repoName: "chain-registry",
+    mainBranch: "master"
   }
 };

Then use it in fetchChainRegistryData:

   private async fetchChainRegistryData(chainPath: string): Promise<GithubChainRegistryChainResponse> {
-    const archive = await this.#archiveService.getArchive("cosmos", "chain-registry", "master", isTemplateFile);
+    const { repoOwner, repoName, mainBranch } = REPOSITORIES["chain-registry"];
+    const archive = await this.#archiveService.getArchive(repoOwner, repoName, mainBranch, isTemplateFile);

@baktun14 baktun14 merged commit c9cc429 into main Jan 29, 2026
99 of 102 checks passed
@baktun14 baktun14 deleted the refactor/template-fetching branch January 29, 2026 22:59