Merged
13 changes: 13 additions & 0 deletions README.md
@@ -124,6 +124,7 @@ Create `.env` from `.env.example` and fill required values:
- `DATABASE_URL`
- `NEXT_PUBLIC_CLERK_PUBLISHABLE_KEY`
- `CLERK_SECRET_KEY`
- `BLOB_READ_WRITE_TOKEN` (Vercel Blob read/write token)
- `OPENAI_API_KEY`
- `INNGEST_EVENT_KEY` (a placeholder value is fine for local development)

@@ -137,6 +138,18 @@ Optional integrations:
- `LANGCHAIN_TRACING_V2`, `LANGCHAIN_API_KEY`, `LANGCHAIN_PROJECT`
- `DEBUG_PERF` (`1` or `true`) to enable dev perf logs for middleware and key auth/dashboard APIs

### 2.1) Configure Vercel Blob Storage

Vercel Blob stores uploaded documents. Both **public** and **private** stores are supported; the upload logic detects which access mode the store uses and adapts accordingly.

1. In the Vercel dashboard, go to **Storage → Blob → Create Store**.
2. Choose either **Public** or **Private** access. Both work:
- **Public** stores produce URLs the browser can load directly (faster for previews).
- **Private** stores keep files behind authentication; the app proxies content through `/api/documents/[id]/content` and `/api/files/[id]` so previews still work.
3. Generate a **Read/Write token** for the store and add it as `BLOB_READ_WRITE_TOKEN` in your environment (`.env` locally, or Vercel Project Settings for deploys).
4. Redeploy so the token is available at build and runtime.
5. Verify the setup: sign in, open the Employer Upload page, upload a small PDF, and confirm that `/api/upload-local` returns a `vercel-storage.com` URL without errors.
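The check in step 5 can also be scripted. A minimal sketch (the helper name `looksLikeBlobUrl` is ours, not part of the app) that validates the URL returned by `/api/upload-local`:

```typescript
// Hypothetical helper mirroring the manual check in step 5: does the URL
// returned by /api/upload-local point at Vercel Blob storage?
function looksLikeBlobUrl(raw: string): boolean {
  try {
    const { protocol, hostname } = new URL(raw);
    return (
      protocol === "https:" &&
      (hostname === "vercel-storage.com" ||
        hostname.endsWith(".vercel-storage.com"))
    );
  } catch {
    // Relative paths (e.g. /api/files/42) are not blob URLs.
    return false;
  }
}
```

If this returns `false` for a fresh upload, a likely cause is a missing or invalid `BLOB_READ_WRITE_TOKEN` in the deployment environment.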

### 3) Start database and apply schema

```bash
11 changes: 9 additions & 2 deletions docs/deployment.md
@@ -51,7 +51,7 @@ docker compose --env-file .env --profile dev up

1. Import repository into Vercel.
2. Configure managed PostgreSQL (Vercel Postgres, Neon, Supabase, etc.).
3. Set `DATABASE_URL` and app environment variables.
3. Set `DATABASE_URL`, `BLOB_READ_WRITE_TOKEN`, and the other app environment variables.
4. Deploy with Vercel defaults.
5. Apply schema once:

@@ -65,6 +65,12 @@ Optional integrations:
- LangSmith for tracing
- Sidecar (deploy separately and set `SIDECAR_URL`)

### Verifying Blob uploads on Vercel

1. After deploy, sign in to the Employer portal and open `/employer/upload`.
2. Upload a small PDF or DOCX; the `/api/upload-local` response should include a `vercel-storage.com` URL.
3. Open that URL in a new browser tab. The file should download directly, confirming Blob access end to end.

## Option 3: VPS self-hosted (Node + reverse proxy)

1. Install Node.js 18+, pnpm, Nginx, and PostgreSQL with pgvector.
@@ -89,7 +95,8 @@ Optional: Run the sidecar separately and point `SIDECAR_URL` to it.
| `CLERK_SECRET_KEY` | Yes | Clerk secret key |
| `OPENAI_API_KEY` | Yes | OpenAI API key |
| `INNGEST_EVENT_KEY` | Yes (prod) | Inngest event key for background jobs |
| `UPLOADTHING_TOKEN` | Optional | UploadThing for cloud storage |
| `BLOB_READ_WRITE_TOKEN` | Yes (Vercel) | Required for Vercel Blob uploads |
| `UPLOADTHING_TOKEN` | Optional | UploadThing legacy uploader |
| `SIDECAR_URL` | Optional | Sidecar URL for reranking and Graph RAG |
| `TAVILY_API_KEY` | Optional | Web search for analysis |
| `AZURE_DOC_INTELLIGENCE_*` | Optional | OCR for scanned PDFs |
8 changes: 8 additions & 0 deletions drizzle/0002_vercel_blob.sql
@@ -0,0 +1,8 @@
ALTER TABLE "file_uploads"
ADD COLUMN IF NOT EXISTS "storage_provider" varchar(64) NOT NULL DEFAULT 'database',
ADD COLUMN IF NOT EXISTS "storage_url" varchar(1024),
ADD COLUMN IF NOT EXISTS "storage_pathname" varchar(1024),
ADD COLUMN IF NOT EXISTS "blob_checksum" varchar(128);

ALTER TABLE "file_uploads"
ALTER COLUMN "file_data" DROP NOT NULL;
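After running the migration, the new columns can be confirmed with a standard `information_schema` query (a verification sketch; assumes direct SQL access to the database):

```sql
-- Confirm the new columns exist and that file_data is now nullable.
SELECT column_name, is_nullable, column_default
FROM information_schema.columns
WHERE table_name = 'file_uploads'
  AND column_name IN ('storage_provider', 'storage_url',
                      'storage_pathname', 'blob_checksum', 'file_data');
```

`file_data` should report `is_nullable = YES`, and `storage_provider` should show the `'database'` default.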
3 changes: 3 additions & 0 deletions next.config.ts
@@ -82,6 +82,9 @@ const config: NextConfig = {
"@img/sharp-libvips-linuxmusl-x64",
"@img/sharp-libvips-linux-x64",
"pdf-lib",
"jszip",
"readable-stream",
"mammoth",
],
};

1 change: 1 addition & 0 deletions package.json
@@ -72,6 +72,7 @@
"@tiptap/starter-kit": "^3.20.0",
"@uploadthing/react": "^7.3.3",
"@vercel/analytics": "^1.6.1",
"@vercel/blob": "^2.3.0",
"cheerio": "^1.2.0",
"class-variance-authority": "^0.7.1",
"clsx": "*",
45 changes: 45 additions & 0 deletions pnpm-lock.yaml

Some generated files are not rendered by default.

3 changes: 2 additions & 1 deletion scripts/ensure-pgvector.mjs
@@ -1,4 +1,5 @@
import 'dotenv/config';
import dotenv from "dotenv";
dotenv.config();
import postgres from "postgres";

const url = process.env.DATABASE_URL;
3 changes: 2 additions & 1 deletion scripts/test-trend-search.ts
@@ -75,7 +75,8 @@ Running pipeline (plan → search → synthesize)…
}
*/

import "dotenv/config";
import dotenv from "dotenv";
dotenv.config();

// Skip the full env validation so we don't need DB/Clerk/Inngest keys
process.env.SKIP_ENV_VALIDATION = "true";
91 changes: 91 additions & 0 deletions src/app/api/documents/[id]/content/route.ts
@@ -0,0 +1,91 @@
import { NextResponse } from "next/server";
import { eq } from "drizzle-orm";
import { auth } from "@clerk/nextjs/server";
import { db } from "~/server/db";
import { document } from "~/server/db/schema";
import { isPrivateBlobUrl, fetchBlob } from "~/server/storage/vercel-blob";

const EXTENSION_TO_MIME: Record<string, string> = {
".pdf": "application/pdf",
".png": "image/png",
".jpg": "image/jpeg",
".jpeg": "image/jpeg",
".gif": "image/gif",
".webp": "image/webp",
".tiff": "image/tiff",
".tif": "image/tiff",
".bmp": "image/bmp",
".docx": "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
".xlsx": "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
".pptx": "application/vnd.openxmlformats-officedocument.presentationml.presentation",
".txt": "text/plain",
".csv": "text/csv",
".html": "text/html",
".md": "text/markdown",
};

function inferMime(name: string): string {
const match = /(\.[a-z0-9]+)(?:\?|#|$)/i.exec(name);
return (match?.[1] && EXTENSION_TO_MIME[match[1].toLowerCase()]) ?? "application/octet-stream";
}

interface RouteParams {
params: Promise<{ id: string }>;
}

export async function GET(_request: Request, { params }: RouteParams) {
try {
const { userId } = await auth();
if (!userId) {
return NextResponse.json({ error: "Unauthorized" }, { status: 401 });
}

const { id } = await params;
const docId = Number.parseInt(id, 10);
if (Number.isNaN(docId)) {
return NextResponse.json({ error: "Invalid document ID" }, { status: 400 });
}

const [doc] = await db
.select({ url: document.url, title: document.title })
.from(document)
.where(eq(document.id, docId));

if (!doc) {
return NextResponse.json({ error: "Document not found" }, { status: 404 });
}

if (!isPrivateBlobUrl(doc.url)) {
return NextResponse.redirect(doc.url, { status: 307 });
}

const blobRes = await fetchBlob(doc.url);
if (!blobRes.ok) {
return NextResponse.json(
{ error: "Failed to retrieve document from storage" },
{ status: 502 },
);
}

const mimeType =
blobRes.headers.get("content-type") ?? inferMime(doc.title);

return new NextResponse(blobRes.body, {
status: 200,
headers: {
"Content-Type": mimeType,
...(blobRes.headers.get("content-length")
? { "Content-Length": blobRes.headers.get("content-length")! }
: {}),
"Content-Disposition": `inline; filename="${encodeURIComponent(doc.title)}"; filename*=UTF-8''${encodeURIComponent(doc.title)}`,
"Cache-Control": "private, max-age=3600",
},
});
} catch (error) {
console.error("Error serving document content:", error);
return NextResponse.json(
{ error: "Failed to serve document", details: error instanceof Error ? error.message : "Unknown error" },
{ status: 500 },
);
}
}
7 changes: 7 additions & 0 deletions src/app/api/fetchDocument/route.ts
Expand Up @@ -4,6 +4,7 @@ import { document, users, fileUploads } from "../../../server/db/schema/base";
import { eq, inArray } from "drizzle-orm";
import { validateRequestBody, UserIdSchema } from "~/lib/validation";
import { auth } from '@clerk/nextjs/server';
import { isPrivateBlobUrl } from "~/server/storage/vercel-blob";

/** Extract file id from /api/files/{id} URL so we can look up mimeType from file_uploads */
const FILE_API_ID_REGEX = /\/api\/files\/(\d+)/;
@@ -104,8 +105,14 @@ export async function POST(request: Request) {
const mimeType = mimeFromFile
?? inferMimeFromName(doc.title)
?? inferMimeFromName(doc.url);

const url = isPrivateBlobUrl(doc.url)
? `/api/documents/${Number(doc.id)}/content`
: doc.url;

return {
...doc,
url,
id: Number(doc.id),
companyId: Number(doc.companyId),
...(mimeType && { mimeType }),
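The URL rewrite in `fetchDocument` above can be isolated as a pure helper. A sketch, with two loud assumptions: `isPrivateBlobUrl`'s real implementation lives in `~/server/storage/vercel-blob` and is not shown in this diff, and the `.private.blob.` hostname check below is purely illustrative:

```typescript
// Illustrative stand-in for isPrivateBlobUrl; the real check lives in
// ~/server/storage/vercel-blob and may use different criteria.
function isPrivateBlobUrlSketch(url: string): boolean {
  try {
    return new URL(url).hostname.includes(".private.blob.");
  } catch {
    return false; // relative URLs are never blob URLs
  }
}

// Mirrors the rewrite in fetchDocument: private blob URLs are replaced
// with the authenticated proxy route so browser previews still work.
function clientUrlFor(doc: { id: number; url: string }): string {
  return isPrivateBlobUrlSketch(doc.url)
    ? `/api/documents/${doc.id}/content`
    : doc.url;
}
```

Public blob URLs pass through unchanged, so the browser loads them directly; only private ones take the proxy round trip.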