Experiment: Bun #39
Conversation
Walkthrough

This pull request migrates the project toolchain from Node.js/pnpm to Bun as the primary runtime and package manager. Changes include updating the Docker base image from node:18-alpine to oven/bun:alpine and replacing all npm/pnpm commands with Bun equivalents across GitHub Actions workflows, documentation, and package.json scripts. The application code is refactored to accommodate the LangChain v0.x package structure and to replace the Express.js HTTP server with Bun's native Bun.serve API, including a custom streaming response handler. Package dependencies are reorganized, with LangChain split into separate modules (`@langchain/core`, `@langchain/community`, `@langchain/openai`, `@langchain/textsplitters`).

Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes

Key changes by component: Infrastructure & Configuration, Package & Dependencies, Source Code, Documentation.
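For orientation, here is a minimal sketch of the Bun.serve streaming pattern the walkthrough describes (the port and emitted tokens are illustrative assumptions, not the PR's exact code):

```js
// Sketch: wrap a token-producing callback in a web ReadableStream.
const createStreamingResponse = (run) =>
  new ReadableStream({
    async start(controller) {
      const encoder = new TextEncoder();
      try {
        // run() receives a callback that pushes each token into the stream.
        await run((token) => controller.enqueue(encoder.encode(token)));
        controller.close();
      } catch (error) {
        controller.error(error);
      }
    },
  });

Bun.serve({
  port: 3003, // assumed port, for illustration only
  async fetch() {
    const stream = createStreamingResponse(async (onToken) => {
      onToken("hello ");
      onToken("world");
    });
    return new Response(stream, { headers: { "Content-Type": "text/plain" } });
  },
});
```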
Pre-merge checks and finishing touches

❌ Failed checks (1 inconclusive)
✅ Passed checks (2 passed)
✨ Finishing touches
Actionable comments posted: 6
🧹 Nitpick comments (2)
Dockerfile (1)
49-52: The `sources` directory is required at runtime and cannot be removed.

The hardcoded paths in `src/documents.js` (e.g., `./sources/website/src/routes/docs/**/*.markdoc` and `./sources/references/**/*.md`) confirm that the `sources` directory is accessed at runtime. However, the optimization suggestion for `node_modules` remains valid: since the build bundles the code with `--external hnswlib-node`, only that native module (and its direct dependencies) is strictly needed at runtime. The full set of `node_modules` includes langchain's extensive optional peer dependencies that may not be used by the bundled application. If size optimization is a priority, consider analyzing the actual imports in the bundled `dist/main.js` and copying only the necessary dependencies.

src/embeddings.js (1)
40-42: Consider removing the `getOpenAIChat` indirection.

This function now simply wraps `createChatModel`. Looking at `src/main.js`, it is still used in the generic prompt endpoint. Consider whether the indirection is necessary, or if `createChatModel` should be exported directly instead.
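A sketch of the direct-export alternative, assuming `getOpenAIChat` is a pure pass-through (the environment variable names come from this repo; the exact options are illustrative):

```js
import { ChatOpenAI } from "@langchain/openai";

// Hypothetical simplification: export the factory itself and drop the shim.
export const createChatModel = (onToken) =>
  new ChatOpenAI({
    model: process.env._APP_ASSISTANT_OPENAI_MODEL ?? "gpt-4o",
    apiKey: process.env._APP_ASSISTANT_OPENAI_API_KEY,
    streaming: true,
    callbacks: [{ handleLLMNewToken: onToken }],
  });

// Callers in src/main.js would then import createChatModel directly.
```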
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (2)
- `bun.lock` is excluded by `!**/*.lock`
- `pnpm-lock.yaml` is excluded by `!**/pnpm-lock.yaml`
📒 Files selected for processing (12)
- .dockerignore
- .github/workflows/format.yml
- .github/workflows/lint.yml
- CONTRIBUTING.md
- Dockerfile
- README.md
- docker-compose.yml
- eslint.config.js
- package.json
- src/documents.js
- src/embeddings.js
- src/main.js
🧰 Additional context used
🧬 Code graph analysis (1)
src/main.js (1)
src/embeddings.js (5)
- prompt (51-54)
- getRagChain (48-57)
- getRagChain (48-57)
- getOpenAIChat (40-42)
- getOpenAIChat (40-42)
🔇 Additional comments (26)
.dockerignore (1)
1-47: LGTM! Comprehensive ignore patterns improve Docker build hygiene.

The expanded `.dockerignore` file is well-structured and covers all the necessary categories: documentation, build artifacts, alternative package manager lock files, tests, IDE files, logs, environment files, OS-specific files, CI configuration, and Docker-related files. This will reduce the Docker context size and prevent unnecessary files from being included in the image.

CONTRIBUTING.md (4)
17-22: LGTM! Documentation updated to reflect Bun toolchain.

The contributing guide has been correctly updated to replace pnpm commands with Bun equivalents, aligning with the broader toolchain migration in this PR.
32-32: LGTM! Development command updated.

The dev command now uses `bun run dev`, consistent with the Bun migration.
44-44: LGTM! Test command reference updated.

The reference to the test script correctly uses `bun run test`.
65-66: LGTM! Formatting and linting commands updated.

Both commands now use Bun, consistent with the toolchain migration across the repository.
src/documents.js (1)
3-4: No action required. The LangChain import paths are correct and compatible with the installed package versions.

The imports use the proper namespaced structure for LangChain v0.3:

- `@langchain/core@^0.3.25` is installed and exports `Document` from the `documents` subpath
- `@langchain/textsplitters@^0.1.0` is installed and exports `MarkdownTextSplitter`

Both imports and package versions are compatible.
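For reference, the two imports under discussion look like this (reconstructed from the comment above, not copied from the diff):

```js
import { Document } from "@langchain/core/documents";
import { MarkdownTextSplitter } from "@langchain/textsplitters";
```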
README.md (3)
11-25: LGTM! Clear migration to Bun tooling.

The installation instructions accurately reflect the migration from pnpm to Bun, with consistent command references and clear documentation source information.
31-37: LGTM! Accurate usage documentation.

The usage instructions correctly document the OpenAI configuration options (API key and model selection with the gpt-4o default) and update the dev command to use Bun.
41-45: Good addition! Helpful testing example.

The curl example provides a practical way for developers to test the server endpoint.
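Such a request might look like the following sketch (the endpoint path comes from comments later in this review; the port and payload shape are assumptions):

```sh
curl -X POST http://localhost:3003/v1/models/assistant/prompt \
  -H "Content-Type: application/json" \
  -d '{"prompt": "How do I create a new user?"}'
```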
eslint.config.js (1)
7-14: LGTM! Correct ESLint configuration for Bun runtime.

Adding `Bun` as a readonly global ensures ESLint recognizes Bun-specific APIs without flagging them as undefined, which aligns with the broader migration to the Bun runtime.

package.json (3)
8-14: LGTM! Complete script migration to Bun.

All npm scripts have been correctly updated to use Bun commands, including the `--watch` flag for development hot-reloading as a replacement for nodemon.
34-34: LGTM! Correct Bun-specific configuration.

The addition of `@types/bun` provides proper TypeScript support, and the `trustedDependencies` array correctly identifies `hnswlib-node` as a native module that requires build permissions in Bun.

Also applies to: 39-41
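For context, the relevant package.json shape would be roughly the following (a sketch; the version specifiers are assumptions):

```json
{
  "devDependencies": {
    "@types/bun": "^1.0.0"
  },
  "trustedDependencies": ["hnswlib-node"]
}
```

Bun blocks dependency lifecycle scripts by default, so listing `hnswlib-node` in `trustedDependencies` is what allows its native build step to run during `bun install`.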
19-30: LangChain v0.x migration is correctly implemented.

The codebase properly uses the modular LangChain v0.3.x architecture. While imports span both the monolithic `langchain` package (for `loadQAStuffChain` from `langchain/chains`) and the new modular packages (`@langchain/core`, `@langchain/community`, `@langchain/openai`, `@langchain/textsplitters`), this hybrid approach is supported in v0.3.x and all imports are correctly configured.

Dockerfile (4)
1-8: LGTM! Appropriate base image and dependencies.

The migration to `oven/bun:alpine` with necessary build tools (python3, make, g++, git) supports both native module compilation (`hnswlib-node`) and source-fetching operations.
14-16: LGTM! Correct dependency installation.

Using `--frozen-lockfile` ensures reproducible builds, and `--production` appropriately excludes development dependencies in the builder stage.
21-30: LGTM! Flexible build configuration.

The build arguments for Git and website URLs provide appropriate flexibility for different deployment environments while maintaining sensible defaults.
36-41: LGTM! Proper Bun build configuration.

The build flags are appropriate:

- `--target bun` ensures Bun runtime compatibility
- `--minify` reduces bundle size
- `--external hnswlib-node` correctly excludes the native module from bundling, allowing it to load at runtime
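Put together, the build step would resemble the following sketch (the entry point and output path are assumed from the `dist/main.js` reference earlier in this review):

```sh
bun build ./src/main.js --outfile dist/main.js --target bun --minify --external hnswlib-node
```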
10-10: LGTM! Consistent with Dockerfile and documentation.

Adding the `_APP_ASSISTANT_OPENAI_MODEL` environment variable allows runtime configuration of the OpenAI model, consistent with the Dockerfile default and README.md documentation.

src/embeddings.js (4)
1-5: LGTM on import updates.

The migration to scoped LangChain packages (`@langchain/community`, `@langchain/openai`, `@langchain/core`) aligns with LangChain's v0.3.x module structure.
11-24: LGTM on the new `createChatModel` helper.

Clean extraction of the ChatOpenAI instantiation logic with streaming support. The callback-based token handling via `handleLLMNewToken` is correctly configured.
26-35: LGTM on embeddings configuration update.

The change to `apiKey` aligns with the `@langchain/openai` v0.3.15 API, as confirmed in the official documentation.
48-56: No action needed.

The `ChatPromptTemplate` correctly uses `{context}` and `{question}` placeholders, which match the default expectations of `loadQAStuffChain`. The chain will convert `input_documents` to the `context` variable as intended.
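A minimal sketch of a template with these placeholders (the wording is illustrative; only the placeholder names come from this review):

```js
import { ChatPromptTemplate } from "@langchain/core/prompts";

// {context} receives the stuffed documents; {question} receives the user prompt.
const template = ChatPromptTemplate.fromMessages([
  ["system", "Answer using only the following context:\n\n{context}"],
  ["human", "{question}"],
]);
```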
16-29: LGTM on the streaming response factory.

Clean abstraction for creating streaming responses with proper error handling via `controller.error()`.
31-46: LGTM on CORS and preflight handling.

The CORS configuration is appropriately permissive for a development/internal service. Preflight requests are handled correctly.
68-72: LGTM on document retrieval update.

Using `retriever.invoke(prompt)` aligns with LangChain's updated retriever API (replacing the deprecated `getRelevantDocuments`).
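The old and new calls side by side, for reference (a sketch):

```js
// Deprecated API:
// const relevantDocuments = await retriever.getRelevantDocuments(prompt);

// Current API, as used in the PR:
const relevantDocuments = await retriever.invoke(prompt);
```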
116-123: Race condition: server accepts requests before retriever is initialized.
`Bun.serve` starts immediately and can accept requests while `initializeRetriever()` is still running. The `/v1/models/assistant/prompt` endpoint correctly checks for `!retriever`, but this code pattern is unusual; typically initialization completes before the server starts listening.
🔎 Proposed restructure
```diff
-Bun.serve({
-  port,
-  async fetch(req) {
+const startServer = async () => {
+  console.log("Initializing search index...");
+  try {
+    retriever = await initializeRetriever();
+    console.log("Search index initialized");
+  } catch (e) {
+    console.error("Failed to initialize search index:", e);
+    process.exit(1);
+  }
+
+  Bun.serve({
+    port,
+    async fetch(req) {
       // ... existing handler code ...
-  },
-});
-
-console.log(`Started server on port: ${port}`);
-console.log("Initializing search index...");
-try {
-  retriever = await initializeRetriever();
-  console.log("Search index initialized");
-} catch (e) {
-  console.error(e);
-}
+    },
+  });
+
+  console.log(`Started server on port: ${port}`);
+};
+
+startServer();
```
.github/workflows/format.yml (excerpt)

```diff
 - name: Setup Bun
   uses: oven-sh/setup-bun@v2
   with:
-    node-version: '18'
+    bun-version: latest

 - name: Install dependencies
-  run: npm install
+  run: bun install

 - name: Format
-  run: npm run format
+  run: bun run format
```
🧩 Analysis chain
🌐 Web query:
latest stable version of Bun runtime
💡 Result:
The latest stable Bun release is v1.3.5 (released December 17, 2025). [1]
To verify locally, run `bun --version`. [2]

Sources:
- Bun release notes for v1.3.5 (Dec 17, 2025). [1]
- Bun upgrade / version guide (how to check with `bun --version`). [2]
Pin the Bun version for reproducible CI builds.
Using `bun-version: latest` can lead to non-deterministic builds when new Bun versions are released. A breaking change or behavioral difference in a future Bun release could cause unexpected CI failures or inconsistent formatting results across different workflow runs. Pin to the latest stable version (currently v1.3.5).
🔎 Recommended fix to pin Bun version
```diff
   - name: Setup Bun
     uses: oven-sh/setup-bun@v2
     with:
-      bun-version: latest
+      bun-version: "1.3.5"

   - name: Install dependencies
     run: bun install

   - name: Format
     run: bun run format
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```yaml
- name: Setup Bun
  uses: oven-sh/setup-bun@v2
  with:
    bun-version: "1.3.5"

- name: Install dependencies
  run: bun install

- name: Format
  run: bun run format
```
🤖 Prompt for AI Agents
In .github/workflows/format.yml around lines 16 to 25, the workflow sets Bun to "latest", which makes CI non-deterministic; update the setup step to set with: bun-version: "1.3.5" (the current stable) instead of latest so the workflow installs a fixed Bun release and produces reproducible builds and formatting results.
.github/workflows/lint.yml (excerpt)

```diff
 - name: Setup Bun
   uses: oven-sh/setup-bun@v2
   with:
-    node-version: '18'
+    bun-version: latest

 - name: Install dependencies
-  run: npm install
+  run: bun install

 - name: Lint
-  run: npm run lint
+  run: bun run lint
```
Pin the Bun version for reproducible CI builds.
Using `bun-version: latest` can lead to non-deterministic builds when new Bun versions are released. A breaking change or behavioral difference in a future Bun release could cause unexpected CI failures or inconsistent linting results across different workflow runs.
🔎 Recommended fix to pin Bun version
```diff
   - name: Setup Bun
     uses: oven-sh/setup-bun@v2
     with:
-      bun-version: latest
+      bun-version: "1.1.38"

   - name: Install dependencies
     run: bun install

   - name: Lint
     run: bun run lint
```

Consider using the same pinned version across all workflows for consistency.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```yaml
- name: Setup Bun
  uses: oven-sh/setup-bun@v2
  with:
    bun-version: "1.1.38"

- name: Install dependencies
  run: bun install

- name: Lint
  run: bun run lint
```
🤖 Prompt for AI Agents
In .github/workflows/lint.yml around lines 16 to 25, the workflow currently uses bun-version: latest; change this to a pinned Bun version (use a concrete semver string) or reference a repository-wide variable/env that holds the pinned version, update all other GitHub workflows to use the same pinned value for consistency, and add a short note in README/CI docs to record the chosen Bun version and the process for bumping it (bump, test locally/CI, then update the pinned value).
Dockerfile (excerpt)

```dockerfile
ARG _APP_ASSISTANT_OPENAI_API_KEY
ENV _APP_ASSISTANT_OPENAI_API_KEY=${_APP_ASSISTANT_OPENAI_API_KEY}
```
🧩 Analysis chain
🏁 Script executed:
```bash
#!/bin/bash
echo "=== Checking OPENAI_API_KEY usage in scripts ==="
rg -n "OPENAI_API_KEY" --max-count=20

echo -e "\n=== Checking Dockerfile context around lines 30-35 ==="
head -n 50 Dockerfile | tail -n +25

echo -e "\n=== Checking prod stage (around line 54) ==="
tail -n +45 Dockerfile | head -n 20

echo -e "\n=== Checking fetch-sources script ==="
if [ -f "scripts/fetch-sources.sh" ]; then
  cat scripts/fetch-sources.sh
elif [ -f "scripts/fetch-sources.js" ]; then
  head -n 50 scripts/fetch-sources.js
else
  find . -name "*fetch-sources*" -type f 2>/dev/null | head -5
fi

echo -e "\n=== Checking for build-embeddings references ==="
rg -n "build-embeddings" --max-count=20
```

Repository: appwrite/assistant
Length of output: 2130
🏁 Script executed:
```bash
#!/bin/bash
echo "=== Checking if fetch-sources uses API key ==="
find . -name "fetch-sources*" -type f 2>/dev/null | xargs cat 2>/dev/null | head -50

echo -e "\n=== Checking if API key is actually used in runtime ==="
cat -n src/embeddings.js | head -40

echo -e "\n=== Checking package.json build scripts ==="
grep -A 5 "scripts" package.json | head -20
```

Repository: appwrite/assistant
Length of output: 2083
Remove the OpenAI API key from build arguments; it's only needed at runtime.

The API key is passed as a build `ARG` and then set as an `ENV`, but it's never actually consumed during the build process. Embeddings are generated at runtime in `src/embeddings.js`, not during the Docker image build. The build-time `ARG` is unnecessary and could expose the key in build history.

Recommendation: Remove lines 33-34 and rely only on the API key being provided at container runtime (which is how it's already reset in the prod stage on line 54).
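Supplying the key only when the container starts might look like this sketch (the image name is assumed from the repository):

```sh
docker run -e _APP_ASSISTANT_OPENAI_API_KEY="sk-..." appwrite/assistant
```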
🤖 Prompt for AI Agents
In the Dockerfile around lines 33-34, remove the build ARG and ENV for _APP_ASSISTANT_OPENAI_API_KEY because the key is not used during the image build and should only be provided at runtime; delete the two lines declaring ARG _APP_ASSISTANT_OPENAI_API_KEY and ENV _APP_ASSISTANT_OPENAI_API_KEY=${_APP_ASSISTANT_OPENAI_API_KEY} so the secret is not leaked into build history, and rely on the existing runtime provision (as done in the prod stage at line 54).
src/main.js (excerpt)

```js
const body = await req.json();
const { prompt } = body;
```
Missing error handling for malformed JSON body.
If `req.json()` fails (e.g., invalid JSON or missing body), the server will throw an unhandled exception. Consider wrapping it in a try-catch.
🔎 Proposed fix
```diff
- const body = await req.json();
- const { prompt } = body;
+ let body;
+ try {
+   body = await req.json();
+ } catch {
+   return new Response("Invalid JSON body", {
+     status: 400,
+     headers: corsHeaders,
+   });
+ }
+ const { prompt } = body;
+
+ if (!prompt) {
+   return new Response("Missing 'prompt' field", {
+     status: 400,
+     headers: corsHeaders,
+   });
+ }
```

🤖 Prompt for AI Agents
In src/main.js around lines 65 to 66, the call to await req.json() has no error handling, so a malformed or missing JSON body will throw and crash the request; wrap the parsing in a try-catch, on parse errors return a 400 response with a clear error message (e.g., "Invalid JSON body"), and after a successful parse validate that prompt exists and is a string, returning 400 if missing or invalid before proceeding.
src/main.js (excerpt)

```js
const stream = createStreamingResponse(async (onToken) => {
  const chain = await getRagChain(onToken, SYSTEM_PROMPT);
  await chain.invoke({
    input_documents: relevantDocuments,
    question: prompt,
  });

  if (sources.size > 0) {
    onToken("\n\nSources:\n");
    for (const sourceUrl of sources) {
      onToken("- " + sourceUrl + "\n");
    }
  }
});

return new Response(stream, {
  headers: { ...corsHeaders, "Content-Type": "text/plain" },
});
```
Streaming errors are not propagated to the HTTP response.
If `getRagChain` or `chain.invoke` throws, `controller.error(error)` is called, but this may not translate to a proper HTTP error for the client; the response stream may just terminate abruptly. Consider catching errors within the callback and emitting an error message to the stream before closing, or returning a non-streaming error response for critical failures.
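One way to do the former, as a sketch (names are reused from the excerpt above; the error text is illustrative):

```js
const stream = createStreamingResponse(async (onToken) => {
  try {
    const chain = await getRagChain(onToken, SYSTEM_PROMPT);
    await chain.invoke({
      input_documents: relevantDocuments,
      question: prompt,
    });
  } catch (error) {
    // Surface a readable message in-stream instead of aborting the response.
    onToken("\n\n[error] Failed to generate a response. Please try again.");
    console.error(error);
  }
});
```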
src/main.js (excerpt)

```js
const body = await req.json();
const { prompt } = body;
```
The same JSON parsing vulnerability exists here.

The generic prompt endpoint also lacks error handling for malformed JSON, similar to the assistant endpoint. Apply the same fix.
🤖 Prompt for AI Agents
In src/main.js around lines 96-97, the code directly calls await req.json() and destructures prompt without handling malformed JSON or missing fields; wrap the JSON parsing and destructuring in a try/catch, return a 400 Bad Request with a clear error message when parsing fails or when prompt is absent or invalid, and only proceed when prompt is a non-empty string (matching the fix applied to the assistant endpoint).
What does this PR do?
(Provide a description of what this PR does.)
Test Plan
(Write your test plan here. If you changed any code, please provide us with clear instructions on how you verified your changes work.)
Related PRs and Issues
(If this PR is related to any other PR or resolves any issue or related to any issue link all related PR and issues here.)
Have you read the Contributing Guidelines on issues?
(Write your answer here.)
Summary by CodeRabbit
New Features
- Configurable OpenAI model via the `_APP_ASSISTANT_OPENAI_MODEL` environment variable (default `gpt-4o`)

Chores