[Workers] add best practices guide#28295

Merged
elithrar merged 11 commits into production from workers-best-practices-guide
Feb 12, 2026
Conversation

@elithrar
Collaborator

Adds a new "Workers Best Practices" page covering 21 rules across 7 sections, with code examples for each. Closes #28291.

The guide is informed by common issues from cloudflare/workers-sdk and cloudflare/cloudflare-docs over the past year, the code-review skill's common-patterns reference, and the existing Durable Objects and Workflows best practices guides.

Covers the 9 explicitly requested topics — compatibility dates, nodejs_compat, observability/tracing with sampling, waitUntil and Tail Workers, streaming, wrangler secret, wrangler types, bindings over REST APIs, Queues/Workflows for background work — plus 12 more from issue/PR research:

  • Use wrangler.jsonc for non-secret config
  • Set cache headers explicitly
  • Service bindings for Worker-to-Worker RPC
  • Do not store request-scoped state in global scope
  • Keep startup fast (lazy init)
  • Always await or waitUntil Promises
  • Handle nullable binding returns
  • Prefer Workers-compatible libraries
  • Web Crypto for secure token generation + timingSafeEqual
  • Explicit error handling over passThroughOnException
  • Test with @cloudflare/vitest-pool-workers (with nodejs_compat gotcha)

Format mirrors "Rules of Durable Objects" and "Rules of Workflows" — each rule has concise prose followed by a TypeScriptExample or WranglerConfig showing the correct pattern (and anti-pattern where useful).

All code examples validated against latest @cloudflare/workers-types and wrangler config schema via the code-review skill. 0 critical/high/medium issues.

New files:

  • src/content/docs/workers/best-practices/index.mdx
  • src/content/docs/workers/best-practices/workers-best-practices.mdx

@github-actions
Contributor

This pull request requires reviews from CODEOWNERS as it changes files that match the following patterns:

| Pattern | Owners |
| --- | --- |
| /src/content/docs/workers/ | @cloudflare/workers-docs, @GregBrimble, @irvinebroque, @mikenomitch, @korinne, @WalshyDev, @cloudflare/deploy-config, @cloudflare/pcx-technical-writing, @cloudflare/wrangler, @mattietk, @cloudflare/dev-plat-leads |

@elithrar elithrar marked this pull request as ready for review February 12, 2026 19:11
Co-authored-by: irvinebroque <irvinebroque@users.noreply.github.com>
@ask-bonk
Contributor

ask-bonk bot commented Feb 12, 2026

The change looks clean. Here is what I did:

  • Removed the entire "Use Tail Workers for durable logging" section (lines 649-695), including its prose, WranglerConfig block, and TypeScriptExample code block.
  • Added a link to the Workers Observability overview (/workers/observability/) as a general reference, so readers are pointed toward the first-class observability tooling instead.

This addresses @irvinebroque's feedback: Tail Workers should not be recommended as a best practice for the general case. Workers Observability (Logs + Traces) is the right recommendation, and that is already covered by the preceding section. Tail Workers are better discussed in the context of Workers for Platforms.

github run


</TypeScriptExample>

### Do not use passThroughOnException as error handling
Contributor

/bonk please add this to the reference docs on the `passThroughOnException()` API, if we do not already call this out. If you are able to, do this in a separate pull request.

….mdx

Co-authored-by: Brendan Irvine-Broque <birvine-broque@cloudflare.com>
@elithrar
Collaborator Author

/bonk address this feedback:

Co-authored-by: elithrar <elithrar@users.noreply.github.com>
@ask-bonk
Contributor

ask-bonk bot commented Feb 12, 2026

All four changes have been applied:

  1. Removed the "Keep startup fast" section — the entire section (heading, prose, and TypeScriptExample code block) has been deleted. The "Code patterns" section now goes directly from "Do not store request-scoped state in global scope" to "Always await or waitUntil your Promises".

  2. Floating promises — added lint rule recommendations — added a paragraph recommending no-floating-promises for both ESLint (@typescript-eslint/no-floating-promises) and oxlint (typescript/no-floating-promises), with a bash block showing a one-liner for each.

  3. Crypto — added Node.js crypto note — added a line after the Web Crypto intro paragraph noting that node:crypto is also fully supported with nodejs_compat, linking to /workers/runtime-apis/nodejs/crypto/.

  4. Secrets — switched from .dev.vars to .env — changed the prose to recommend .env (with an explicit reminder to add it to .gitignore), removed the false claim about automatic version control exclusion, added a link to the Environment variables docs, and updated the bash comment accordingly.
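To make the floating-promise point in item 2 concrete, here is a minimal sketch of the pattern that `no-floating-promises` is meant to catch, next to one way to fix it. `FakeContext` and `recordAnalytics` are hypothetical stand-ins so the snippet runs outside the Workers runtime; in a real Worker, `ctx` would be the `ExecutionContext` passed to the handler.

```typescript
// Hypothetical stand-in for the ExecutionContext the Workers runtime provides.
class FakeContext {
    readonly pending: Promise<unknown>[] = [];

    waitUntil(promise: Promise<unknown>): void {
        this.pending.push(promise);
    }
}

// Hypothetical background task (for example, sending an analytics event).
async function recordAnalytics(path: string): Promise<string> {
    return `recorded ${path}`;
}

// Anti-pattern: the promise is neither awaited nor handed to waitUntil.
// This is the kind of call a no-floating-promises rule reports.
function handleFloating(_ctx: FakeContext, path: string): string {
    recordAnalytics(path); // floating promise
    return "ok";
}

// Fix: hand the promise to waitUntil so the runtime can keep the Worker
// alive until the background work settles.
function handleTracked(ctx: FakeContext, path: string): string {
    ctx.waitUntil(recordAnalytics(path));
    return "ok";
}
```

Both handlers return the same value, which is why the bug is easy to miss: the only observable difference is whether the runtime knows about the outstanding work.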

github run

@ask-bonk
Contributor

ask-bonk bot commented Feb 12, 2026

Now I see the issue clearly. Let me look at the concat example more carefully — the pipeline async IIFE at line 242-253 is a fire-and-forget promise. The response is returned immediately at line 256, but the pipeline promise is still running in the background pumping data through the stream. If the Worker runtime cleans up after the response is sent, that pipeline could be terminated before it finishes.

Here are my suggestions for the review thread:


Good catch. The concat example at lines 228-263 actually has this exact bug right now — the pipeline async IIFE is a floating promise. The fetch handler returns the Response with the readable side of the stream, but the background work that pumps data into the writable side is not kept alive by anything. If the runtime decides the request is "done" after the response is returned, the pipeline can be killed mid-stream.

Here are three suggestions for improving this section:

1. Fix the existing concat example to use ctx.waitUntil

The pipeline promise on line 242 should be wrapped in ctx.waitUntil(pipeline) to ensure the runtime keeps the Worker alive until all chunks are piped and the writable side is closed. The function signature needs ctx: ExecutionContext added. This is the most impactful change because the current code is subtly broken.

2. Add a callout bridging the streaming and waitUntil sections

After the concat example (around line 265), add a note like:

When you return a Response with a stream body but still need to write to that stream after returning, you must use ctx.waitUntil() to keep the Worker alive. Without it, the runtime may terminate your Worker before the stream is fully written, causing the client to receive a truncated response. See the waitUntil section below.

This explicitly connects the two concepts that are currently presented as independent topics.

3. Add a dedicated waitUntil + streams example in the waitUntil section

The current waitUntil section only shows simple fire-and-forget use cases (analytics, cache writes). Adding a streaming example would demonstrate the more critical use case — where waitUntil is not just an optimization but is required for correctness. Something like:

async fetch(request: Request, env: Env, ctx: ExecutionContext): Promise<Response> {
    const { readable, writable } = new TransformStream();

    // The pipeline writes to the stream after the response is returned.
    // Without waitUntil, the Worker can be terminated before the stream is complete.
    ctx.waitUntil(
        (async () => {
            const writer = writable.getWriter();
            for (const chunk of await getChunks(env)) {
                await writer.write(new TextEncoder().encode(chunk));
            }
            await writer.close();
        })()
    );

    return new Response(readable, {
        headers: { "Content-Type": "text/plain" },
    });
}

This makes the point that waitUntil is not just for "nice to have" background work — it is structurally required when you return a stream whose writer outlives the fetch handler's return.


The most important of these is #1 — the existing concat example is actively misleading without ctx.waitUntil. Suggestions #2 and #3 reinforce the concept for readers who might not connect the two sections on their own.

github run

@irvinebroque
Contributor

/bonk please review the content in this pull request, extract out the topics and guidance, and then perform a thorough audit of the existing developer docs, to identify specific topics that are covered here, but are not addressed within our existing developer docs. Do not yet attempt to address these, just leave a comment here on this pull request that details your findings, and provides a prioritized list of recommendations for content additions and modifications to the existing developer docs

@ask-bonk
Contributor

ask-bonk bot commented Feb 12, 2026

Docs Coverage Audit: Workers Best Practices Guide

I performed a thorough audit of the existing developer docs against every topic covered in this best practices guide. Below is a summary of what the guide covers that is not adequately addressed in the existing docs, followed by a prioritized list of recommended content actions.


Methodology

I extracted 21 distinct topics/rules from the guide across 7 sections, then searched the full src/content/docs/ tree (Workers, Queues, Workflows, Hyperdrive, Durable Objects, Cache, and relevant partials) for each topic. For each, I assessed whether existing docs (a) contain the same guidance, (b) explain the "why", (c) have comparable examples, and (d) present it prescriptively vs. as pure reference.


Topics with GOOD existing coverage (no action needed)

These topics are already well-documented in existing reference and product docs. The best practices guide provides a useful consolidated view but does not introduce net-new guidance:

  • Use Durable Objects for WebSockets — DO websockets best practices page is among the most thorough in the docs (712 lines)
  • Use Hyperdrive for external database connections — connection-lifecycle.mdx already has good/bad examples and the "create client per request" pattern
  • Use service bindings for Worker-to-Worker communication — service-bindings/index.mdx already calls out zero overhead, no cost, and has full examples
  • Use Queues and Workflows for async work — product docs describe capabilities well
  • Keep compatibility date current — compatibility-dates.mdx covers both "set to today" and "periodically update" thoroughly
  • Test with @cloudflare/vitest-pool-workers — testing docs are extensive, including the nodejs_compat auto-injection pitfall
  • Use wrangler.jsonc for non-secret config — already recommended in wrangler/configuration.mdx line 25

Topics with SIGNIFICANT gaps — Priority actions

P0: Critical inconsistencies and contradictions

1. .env vs .dev.vars — docs actively contradict the new best practice

  • The guide recommends .env exclusively; the shared partial secrets-in-dev.mdx (used in 4+ pages) says "choose either .dev.vars or .env"
  • The Prisma tutorial (using-prisma-postgres-with-workers.mdx, line 93) states "Cloudflare Workers do not support .env files" — this is factually incorrect
  • 8+ tutorials use .dev.vars exclusively; zero tutorials use .env
  • The Vite plugin docs only reference .dev.vars
  • Action: Update secrets-in-dev.mdx partial to recommend .env as the default. Fix the Prisma tutorial's incorrect claim. Update tutorials incrementally.

2. passThroughOnException — messaging conflict between docs

  • The guide says "do not use passThroughOnException as error handling" (line 801)
  • The errors page (errors.mdx, lines 331-367) actively recommends it under "Go to origin on error" without caveats
  • The context API page presents it neutrally
  • Action: Add caveats to both errors.mdx and context.mdx noting that explicit error handling is preferred for most use cases, and that passThroughOnException is primarily a migration tool.

3. wrangler types vs hand-written Env — 32 pages teach the anti-pattern

  • The guide says "Do not hand-write your Env interface" — this is the only place in the docs with that warning
  • 32 pages across Workers docs show hand-written interface Env with no disclaimer
  • The most-visited config page (environment-variables.mdx, line 37) teaches hand-written Env as the standard approach
  • The wrangler-typegen.mdx partial exists but is only used in 4 pages (Workflows, Vectorize, AI Gateway) — not used in any Workers binding docs (KV, R2, D1, Durable Objects)
  • Action: Add the wrangler-typegen partial to KV, R2, D1, and Durable Objects binding docs. Update environment-variables.mdx to recommend wrangler types. Add a note to hand-written Env examples that they are illustrative.

P1: Missing guidance not covered anywhere

4. console.error/console.warn severity mapping — undocumented

  • The guide states these "appear at the correct severity level in the Workers Observability dashboard"
  • This mapping is not documented in workers-logs.mdx, console.mdx, or any observability doc
  • Action: Document the console method → severity level mapping in either workers-logs.mdx or console.mdx. This is factual API behavior that users need.

5. Structured JSON logging — inconsistent examples in observability docs

  • The workers-logs "Best Practices" section (line 87) recommends structured JSON
  • But the "Custom logs" section (line 148) in the same page uses unstructured string interpolation as its primary example
  • Action: Update the custom logs examples in workers-logs.mdx to use structured JSON, consistent with the best practices recommendation.
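As a rough illustration of findings 4 and 5 together, the sketch below contrasts string interpolation with a structured JSON entry, and routes each entry through the console method that matches its level. `formatLog` and `log` are hypothetical helper names, not part of any Workers API, and the field names are made up for the example.

```typescript
type LogLevel = "log" | "warn" | "error";

// Hypothetical helper: serialize a message plus arbitrary fields into one JSON line.
// The explicit message is spread last so a stray "message" field cannot replace it.
function formatLog(message: string, fields: Record<string, unknown> = {}): string {
    return JSON.stringify({ ...fields, message });
}

// Hypothetical helper: emit the entry through the console method matching its level,
// so warnings and errors are not flattened into plain console.log output.
function log(level: LogLevel, message: string, fields: Record<string, unknown> = {}): string {
    const line = formatLog(message, fields);
    console[level](line);
    return line;
}

// Unstructured (hard to filter on individual values):
//   console.log(`user ${userId} failed checkout with ${status}`);
// Structured (each value is its own field):
log("error", "checkout failed", { userId: "u_123", status: 502 });
```

Keeping the serialization in one helper makes it easier to keep field names consistent across a codebase, which is most of the benefit of structured logs.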

6. Prefer Workers-compatible libraries — no general guidance exists

  • The guide mentions this topic. The existing docs explain what the runtime supports (nodejs_compat, polyfills, conditional exports) and document workarounds (module aliasing), but there is no prescriptive guidance on how to evaluate or choose compatible libraries
  • Action: Consider adding a brief guide or section on evaluating library compatibility (check for workerd export condition, prefer Web APIs, test early).

7. Set cache headers explicitly — no best-practice guidance

  • Existing cache docs explain the mechanics extensively, but no page prescriptively says "always set cache headers on Worker responses" or explains consequences of not doing so
  • Action: Consider adding to the best practices guide or as a callout in how-the-cache-works.mdx.

8. Handle nullable binding returns — implicit but never called out

  • The guide's code examples show null checks (e.g., if (!object)), and API references document null returns
  • But no page explicitly prescribes "always handle null from binding reads" as a pattern
  • Action: Lower priority, but a brief callout in the bindings index page would help.
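The pattern in finding 8 might look like the following sketch, which treats a missing key as an expected case instead of assuming a value is always present. `KVLike` is a minimal stand-in for the part of a KV binding used here (a `get` that resolves to `string | null`), and `readGreeting` and `LookupResult` are hypothetical names.

```typescript
// Minimal stand-in for the part of a KV binding this sketch uses.
interface KVLike {
    get(key: string): Promise<string | null>;
}

// Hypothetical result shape, used instead of a Response to keep the sketch small.
interface LookupResult {
    status: number;
    body: string;
}

// Hypothetical handler logic: a missing key becomes a 404 rather than a crash
// or an accidental "null" string in the response body.
async function readGreeting(kv: KVLike, key: string): Promise<LookupResult> {
    const value = await kv.get(key);
    if (value === null) {
        return { status: 404, body: "Not found" };
    }
    return { status: 200, body: value };
}
```

The same shape applies to other bindings whose reads can come back empty, such as the `if (!object)` check the guide's examples use.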

P2: Getting started guide gaps

9. Getting started guide is silent on several best practices

  • get-started/guide.mdx does not mention: compatibility dates, nodejs_compat, wrangler types, secrets/.env, or observability
  • The "Next steps" section links to examples, tutorials, bindings, testing, and pricing — but not to observability or the new best practices page
  • Action: Add a "Next steps" link to the best practices guide and observability docs. Consider brief mentions of nodejs_compat and observability in the guide itself.

10. Observability framing — no "enable before production" recommendation

  • The guide says "Production Workers without observability are a black box. Enable logs and traces before you deploy to production."
  • No existing observability doc or getting-started page frames it this way — they explain how but not when/why you should
  • New Workers have observability enabled by default, but this is not called out as a "do not disable this" recommendation
  • Action: Add a brief "recommended for production" note to the observability overview or workers-logs page.

P3: Minor inconsistencies

11. 100:: placeholder DNS record — best practices guide teaches a pattern the custom domains page says to migrate away from

  • The guide (line 171) suggests adding AAAA 100:: for routes without an origin
  • custom-domains.mdx (lines 162-208) recommends migrating away from 100:: to Custom Domains
  • Action: Align messaging. Either recommend Custom Domains as the primary approach, or note that 100:: is a legacy workaround.

12. Wrangler configuration reference is incomplete for observability

  • wrangler/configuration.mdx only documents the flat observability structure, not the newer nested observability.traces / observability.logs sub-blocks used by traces and OTel export docs
  • Action: Update the config reference to document the full nested structure.

13. Compatibility dates example uses hardcoded old date

  • compatibility-dates.mdx line 35 uses "2022-04-05" instead of $today, inconsistent with 80+ other config examples
  • Action: Replace with $today.

14. ctx destructuring warning not cross-linked from context.mdx

  • The "Illegal invocation" from destructuring ctx is documented in errors.mdx (lines 91-136) but not mentioned or linked from context.mdx where developers first encounter waitUntil
  • Action: Add a brief warning or cross-link in context.mdx.
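A short sketch of why destructuring `ctx` fails, as described in finding 14. `FakeExecutionContext` is a hypothetical stand-in whose `waitUntil` depends on `this`; it is not the real runtime object, and the error it throws differs in wording from the runtime's "Illegal invocation", but the cause (a method called without its receiver) is the same.

```typescript
// Hypothetical stand-in for ExecutionContext: waitUntil relies on `this`.
class FakeExecutionContext {
    private readonly pending: Promise<unknown>[] = [];

    waitUntil(promise: Promise<unknown>): void {
        this.pending.push(promise);
    }

    pendingCount(): number {
        return this.pending.length;
    }
}

const ctx = new FakeExecutionContext();

// Anti-pattern: destructuring detaches the method from ctx, so `this` is lost.
const { waitUntil } = ctx;
let detachedError: unknown = null;
try {
    waitUntil(Promise.resolve());
} catch (err) {
    detachedError = err; // the real runtime reports "Illegal invocation" here
}

// Safe: call the method on ctx directly, or bind it before passing it around.
ctx.waitUntil(Promise.resolve());
const boundWaitUntil = ctx.waitUntil.bind(ctx);
boundWaitUntil(Promise.resolve());
```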

15. Environments page does not warn about accidental root deployments

  • The guide warns "do not deploy without specifying an environment" if you don't intend to use the root Worker
  • The environments page shows npx wrangler deploy to the root config without cautioning that this creates a separate Worker
  • Action: Add a note to wrangler/environments.mdx.

Summary — top 5 actions by impact

| # | Action | Impact | Effort |
| --- | --- | --- | --- |
| 1 | Fix .env vs .dev.vars inconsistency across docs (update partial, fix Prisma tutorial) | High (currently contradicts best practice) | Medium |
| 2 | Add wrangler-typegen partial to binding docs + update environment-variables.mdx | High (32 pages teach anti-pattern) | Medium |
| 3 | Add passThroughOnException caveats to errors.mdx and context.mdx | High (messaging conflict) | Low |
| 4 | Document console severity mapping + fix structured logging examples | Medium (undocumented behavior) | Low |
| 5 | Add best practices + observability links to getting started "Next steps" | Medium (new user onboarding gap) | Low |

@ask-bonk
Contributor

ask-bonk bot commented Feb 12, 2026

Done. I've posted a detailed audit comment on PR #28295: #28295 (comment)

Here's a summary of what I found:

7 topics in the best practices guide are already well-covered in existing docs (DO WebSockets, Hyperdrive, service bindings, Queues/Workflows, compatibility dates, Vitest testing, wrangler.jsonc).

15 gaps were identified, organized by priority:

  • P0 (Critical inconsistencies): The .env vs .dev.vars recommendation contradicts the existing secrets-in-dev.mdx partial and 8+ tutorials (one Prisma tutorial falsely claims Workers don't support .env). The passThroughOnException guidance conflicts with the errors page. 32 pages teach hand-written Env interfaces contrary to the wrangler types recommendation.

  • P1 (Missing guidance): Console severity mapping is undocumented, structured logging examples are inconsistent, no general guidance exists for choosing Workers-compatible libraries, and no best-practice entry exists for cache headers.

  • P2 (Getting started gaps): The getting started guide mentions none of these best practices and doesn't link to observability or the best practices page.

  • P3 (Minor inconsistencies): The 100:: DNS placeholder conflicts with custom domains migration guidance, the wrangler config reference is stale for observability, and the ctx destructuring warning isn't cross-linked from context.mdx.

github run

Co-authored-by: irvinebroque <irvinebroque@users.noreply.github.com>
@ask-bonk
Contributor

ask-bonk bot commented Feb 12, 2026

The change addresses both pieces of feedback:

  1. Removed the misleading first line — the old text ("Connecting directly to a remote database from a Worker adds hundreds of milliseconds of latency per request") implied that any Worker-to-database connection is inherently slow, which is not true. The new text leads with the actionable recommendation instead.

  2. Made the guidance direct — the section now opens with "Always use Hyperdrive when connecting to a remote PostgreSQL or MySQL database from a Worker." The benefits (connection pooling, eliminating per-request handshake costs, query caching) are still explained, but as justification rather than as the lead.

github run

elithrar and others added 2 commits February 12, 2026 15:13
….mdx

Co-authored-by: Brendan Irvine-Broque <birvine-broque@cloudflare.com>
Co-authored-by: elithrar <elithrar@users.noreply.github.com>
@ask-bonk
Contributor

ask-bonk bot commented Feb 12, 2026

The {name}-{env} on line 119 is inside backticks, so that's fine. All other braces are inside code blocks. No MDX issues.

Here is a summary of the change:

Expanded the "Use Queues and Workflows" section to address @irvinebroque's feedback that it lacked guidance on when to use one vs the other. The section now includes:

  1. "Use Queues when" — explains Queues as a message broker for fan-out, buffering/batching, and simple single-step background jobs.
  2. "Use Workflows when" — explains Workflows as a durable execution engine for multi-step processes, long-running tasks with pause/resume, and complex conditional logic.
  3. "Use both together" — describes the common pattern of Queues feeding into Workflows for high-throughput + complex processing.
  4. Updated code example — now shows a branching pattern where a simple order uses a Queue and a complex order uses a Workflow, making the distinction concrete.
  5. Added "For more information" links to both product docs.
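The branching pattern described in item 4 could be sketched roughly as follows. This is not the example from the guide: `QueueLike` and `WorkflowLike` are minimal stand-ins for the binding methods used (`send` and `create`), and the `Order` shape, the binding names, and the `isComplex` criterion are all hypothetical. In a real project the environment type would come from `wrangler types` rather than being written by hand.

```typescript
// Minimal stand-ins for the parts of the Queue and Workflow bindings used here.
interface QueueLike<T> {
    send(body: T): Promise<void>;
}
interface WorkflowLike<P> {
    create(options: { params: P }): Promise<{ id: string }>;
}

// Hypothetical order shape, for illustration only.
interface Order {
    id: string;
    itemCount: number;
    needsApproval: boolean;
}

// Hand-written only to keep the sketch self-contained.
interface OrderEnv {
    ORDER_QUEUE: QueueLike<Order>;
    ORDER_WORKFLOW: WorkflowLike<Order>;
}

// Hypothetical criterion for what counts as a "complex" order.
function isComplex(order: Order): boolean {
    return order.needsApproval || order.itemCount > 10;
}

async function dispatchOrder(order: Order, env: OrderEnv): Promise<"queue" | "workflow"> {
    if (isComplex(order)) {
        // Multi-step or long-running processing: start a durable Workflow instance.
        await env.ORDER_WORKFLOW.create({ params: order });
        return "workflow";
    }
    // Simple single-step job: enqueue a message for a consumer to handle.
    await env.ORDER_QUEUE.send(order);
    return "queue";
}
```

Returning which path was taken is only there to make the sketch easy to exercise; a real handler would more likely return a `Response`.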

This fills a significant gap — the research confirmed that the only existing Queues-vs-Workflows comparison table in the entire docs is buried in the Agents SDK API reference, which most developers would never find.

github run

@elithrar

This comment was marked as outdated.

@ask-bonk
Contributor

ask-bonk bot commented Feb 12, 2026

Bonk workflow was cancelled.

View workflow run

@elithrar
Collaborator Author

elithrar commented Feb 12, 2026

Reviewed the current state against all PR comments:

  • All 10 actionable review comments are incorporated: Tail Workers removed in favor of Workers Observability, .env over .dev.vars, no hardcoded compatibility date, Node.js crypto mentioned, no-floating-promises lint rules added, Hyperdrive rewritten to lead with "always use", bindings qualified, Queues vs Workflows guidance added, startup section removed, passThroughOnException caveat present.
  • Fixed the concat example (lines 228-263) — added ctx: ExecutionContext to the signature and wrapped the pipeline promise in ctx.waitUntil(pipeline) so the Worker stays alive until all chunks are piped through. Without this, the runtime could terminate the Worker after returning the response but before the stream is fully written.
  • The broader docs audit bonk posted (.env vs .dev.vars inconsistency across 8+ tutorials, 32 pages with hand-written Env, passThroughOnException messaging conflict in errors.mdx) is out of scope for this PR but worth tracking as follow-ups.

@elithrar elithrar enabled auto-merge (squash) February 12, 2026 20:46

[Workers Static Assets](/workers/static-assets/) is the recommended way to deploy static sites, single-page applications, and full-stack apps on Cloudflare. If you are starting a new project, use Workers instead of Pages. Pages continues to work, but new features and optimizations are focused on Workers.

For a purely static site, point `assets.directory` at your build output. No Worker script is needed. For a full-stack app, add a `main` entry point and an `ASSETS` binding to serve static files alongside your API.
Contributor

/bonk please evaluate whether you recommend that this section also include guidance around run_worker_first. https://developers.cloudflare.com/workers/static-assets/binding/#run_worker_first

@elithrar elithrar merged commit 13b6cd8 into production Feb 12, 2026
9 checks passed
@elithrar elithrar deleted the workers-best-practices-guide branch February 12, 2026 20:49
@ask-bonk

This comment was marked as outdated.

@ask-bonk

This comment was marked as outdated.

Labels

product:workers Related to Workers product size/l

Development

Successfully merging this pull request may close these issues.

new: Workers Best Practices guide

7 participants