Replace token-based CSRF with Sec-Fetch-Site header protection #2689
Datasette previously relied on the asgi-csrf library to guard POST forms with a `ds_csrftoken` cookie and matching hidden form field. This commit replaces that mechanism with an inline ASGI middleware (CrossOriginProtectionMiddleware) that inspects the browser-set `Sec-Fetch-Site` and `Origin` headers - the approach described in Filippo Valsorda's research (https://words.filippo.io/csrf/) and implemented in Go 1.25's `http.CrossOriginProtection`.

The new middleware rejects unsafe-method requests whose Sec-Fetch-Site is anything other than `same-origin` or `none`, with a Host-vs-Origin fallback for pre-2023 browsers. Non-browser clients (curl, API scripts) send neither header and are passed through - CSRF is a browser-only attack.

Defense-in-depth is preserved through Datasette's existing `SameSite=Lax` default on `ds_actor` and `ds_messages` cookies. This works identically on HTTPS, HTTP, and localhost.

Changes:
- Drop `asgi-csrf` dependency from pyproject.toml
- Add CrossOriginProtectionMiddleware in datasette/app.py
- Remove the `ds_csrftoken` cookie and `csrftoken` hidden form fields from the six built-in templates
- Make `csrftoken()` in templates a no-op returning `""` for backward compatibility with custom templates and plugins
- Remove the `skip_csrf` plugin hook (no longer needed - browser JSON POSTs get Sec-Fetch-Site: same-origin; non-browser clients pass through unchanged)
- Update `csrf_error.html` to show the middleware's `reason` string
- `csrftoken_from=` on the test helper becomes a no-op, so existing tests keep working unchanged
- Update CSRF-specific tests and add tests/test_csrf_middleware.py covering all five algorithm branches
- Rewrite the CSRF section of the docs
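The decision logic described in this commit message can be sketched roughly as follows. This is an illustration only, not the actual CrossOriginProtectionMiddleware code: the function name `cross_origin_rejection_reason`, the plain-dict header argument, the reason strings, and the choice of GET/HEAD/OPTIONS as the safe methods are all assumptions.

```python
from urllib.parse import urlsplit

# Assumption: methods treated as "safe" and never checked.
SAFE_METHODS = {"GET", "HEAD", "OPTIONS"}


def cross_origin_rejection_reason(method, headers):
    """Return None to allow the request, or a reason string to reject it.

    `headers` is a dict mapping lower-case header names to string values
    (a simplification of the ASGI header list, for illustration).
    """
    # Safe methods are not expected to change state, so they pass.
    if method.upper() in SAFE_METHODS:
        return None

    # Modern browsers: trust the browser-set Sec-Fetch-Site header.
    sec_fetch_site = headers.get("sec-fetch-site")
    if sec_fetch_site is not None:
        if sec_fetch_site in ("same-origin", "none"):
            return None
        return f"Sec-Fetch-Site was {sec_fetch_site!r}"

    # Neither header present: a non-browser client such as curl.
    origin = headers.get("origin")
    if origin is None:
        return None

    # Older browsers: fall back to comparing Origin against Host.
    if urlsplit(origin).netloc == headers.get("host"):
        return None
    return "Origin does not match Host"
```

A caller would return a 403 response carrying the reason string whenever the function returns something other than `None`.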
Adds a new section to docs/upgrade_guide.md covering the replacement of the token-based CSRF mechanism with Sec-Fetch-Site header protection, including what plugin authors can remove, the removal of the skip_csrf hook, and the updated csrf_error.html template context.
Codecov Report❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #2689 +/- ##
==========================================
+ Coverage 90.79% 90.81% +0.01%
==========================================
Files 55 56 +1
Lines 8517 8553 +36
==========================================
+ Hits 7733 7767 +34
- Misses 784 786 +2
Older plugins call request.scope["csrftoken"]() directly from Python and would raise KeyError after the switch to header-based CSRF protection. Reintroduce the scope value as a per-request random string so those plugins keep working, without reviving the ds_csrftoken cookie or treating the value as a security primitive.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
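One way such a compatibility shim might look, as a sketch only. The wrapper class name `LegacyCsrfTokenScope` and the 32-byte token length are hypothetical; the commit message only states that `scope["csrftoken"]` is a callable yielding a per-request random string.

```python
import asyncio
import secrets


class LegacyCsrfTokenScope:
    """Hypothetical ASGI wrapper that restores scope["csrftoken"].

    The value is a fresh random string per request. It is not stored in a
    cookie and is never validated, so it is not a security primitive.
    """

    def __init__(self, app):
        self.app = app

    async def __call__(self, scope, receive, send):
        if scope["type"] == "http":
            token = secrets.token_urlsafe(32)
            # Copy the scope so the caller's dict is left untouched.
            scope = dict(scope, csrftoken=lambda: token)
        await self.app(scope, receive, send)


async def _demo():
    seen = []

    async def app(scope, receive, send):
        # Legacy plugin style: call the scope value directly.
        seen.append(scope["csrftoken"]())

    wrapped = LegacyCsrfTokenScope(app)
    await wrapped({"type": "http"}, None, None)
    await wrapped({"type": "http"}, None, None)
    return seen


tokens = asyncio.run(_demo())
```

Each request gets its own token, so legacy callers no longer hit `KeyError`, but nothing compares the value against a cookie or form field.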
Bearer tokens are not ambient browser credentials - a cross-origin page cannot cause the browser to attach a target site's bearer token unless the attacker's JavaScript already possesses it. Allowing unsafe-method cross-site requests that carry an explicit Bearer token restores the documented write-API behavior for JavaScript clients on other origins. The exemption is deliberately narrow: it covers only the Bearer scheme (case-insensitive), not Basic or Digest, since those can be browser-managed and are CSRF-relevant.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
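The narrow Bearer exemption could be checked along these lines. This is a hedged sketch: the helper name `has_bearer_token` is hypothetical, and requiring a non-empty token after the scheme is an assumption rather than something the commit message states.

```python
def has_bearer_token(authorization):
    """Return True if an Authorization header value uses the Bearer scheme.

    The scheme comparison is case-insensitive. Basic and Digest are
    deliberately not exempted because browsers can manage those
    credentials themselves.
    """
    if not authorization:
        return False
    scheme, _, credentials = authorization.strip().partition(" ")
    # Assumption: an empty token after the scheme does not count.
    return scheme.lower() == "bearer" and bool(credentials.strip())
```

A middleware using this helper would skip the cross-origin check only when it returns `True`, and apply the normal header checks otherwise.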
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The skip_csrf hook is gone but pluggy silently ignores unknown hookimpls, so legacy plugins still load - they just lose their cross-origin bypass. Expand the upgrade guide with safe replacement patterns (bearer tokens, signed URLs, body-carried non-ambient credentials) and add a regression test that a legacy skip_csrf hookimpl still registers cleanly.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- def test_canned_query_form_csrf_hidden_field(
-     canned_write_client, query_name, expect_csrf_hidden_field
- ):
+ def test_canned_query_form_has_no_csrf_hidden_field(canned_write_client, query_name):
Let's delete this test entirely.
  def test_vary_header(canned_write_client):
-     # These forms embed a csrftoken so they should be served with Vary: Cookie
+     # CSRF is now header-based so no Vary: Cookie is needed on write-form pages
      assert "vary" not in canned_write_client.get("/data").headers
-     assert "Cookie" == canned_write_client.get("/data/update_name").headers["vary"]
+     assert "vary" not in canned_write_client.get("/data/update_name").headers
Actually rename to test_canned_query_pages_no_vary_header
@pytest.fixture
def ds():
    return Datasette(memory=True)
Moving this into conftest.py
response = await ds.client.post(
    "/-/messages",
    data={"message": "hello", "message_class": "info"},
    headers={"sec-fetch-site": "none"},
)
assert response.status_code != 403
Let's use @pytest.mark.parametrize to reduce boilerplate.
def test_legacy_csrftoken_template_helper_renders(
    restore_working_directory, tmpdir_factory
):
    from tests.fixtures import make_app_client

    templates = tmpdir_factory.mktemp("templates")
    (templates / "csrftoken_form.html").write_text(
        "CSRFTOKEN:{{ csrftoken() }}:END", "utf-8"

@pytest.mark.asyncio
async def test_bearer_auth_scheme_case_insensitive():
    from datasette.app import CrossOriginProtectionMiddleware
I do not like this class living in datasette.app.
Moving it to datasette/csrf.py.
@pytest.fixture
def ds(bare_ds):
    return bare_ds
Claude Code explanation:
Root cause: test_legacy_skip_csrf_hookimpl_does_not_break_loading registered a plugin with a skip_csrf hookimpl on the shared datasette.plugins.pm.
Pluggy lazily creates a _HookCaller on the hook namespace when a hookimpl first registers, and that caller stays there after unregister(), so dir(pm.hook) in test_docs.py::test_plugin_hooks_are_documented then sees a skip_csrf hook with no matching heading in docs/plugin_hooks.rst.
Fix: use a throwaway pluggy.PluginManager("datasette") instead of the shared pm, so the registration doesn't leak.
* Upgrade to latest Datasette CSRF mechanism, refs #6 Refs simonw/datasette#2689
Token-based CSRF is now legacy: browsers have sent Sec-Fetch-Site on every request since Safari 16.4 (2023), and Go 1.25 promoted this pattern to stdlib via net/http.CrossOriginProtection (Aug 2025). Adds a new CSRF subsection with 7 checklist items covering safe-method bypass, Sec-Fetch-Site enforcement, Origin fallback, and SameSite as defense-in-depth.

Sources:
- https://words.filippo.io/csrf/ (Filippo Valsorda, Aug 2025)
- https://pkg.go.dev/net/http#CrossOriginProtection (Go 1.25 stdlib)
- https://developer.mozilla.org/en-US/docs/Web/HTTP/Reference/Headers/Sec-Fetch-Site
- https://web.dev/articles/fetch-metadata
- simonw/datasette#2689 (Datasette adoption, Apr 2026)
<input type="hidden" name="csrftoken" value="{{ csrftoken() }}"> in the templates - they are no longer needed.
def skip_csrf(datasette, scope): plugin hook defined in datasette/hookspecs.py and its documentation and tests.

📚 Documentation preview 📚: https://datasette--2689.org.readthedocs.build/en/2689/