feat(files): per-volume in-app policy enforcement #197
pkosiec
left a comment
Great work! Overall LGTM. Please see my small comments on the proposal and here; they might affect some method naming, etc.
Disclaimer: unfortunately I wasn't able to test it; I'll run it next week 👍 Unless you could record a short, raw demo video showing how the policies work?
Also, I think it would be worth getting approval, at least for the RFC, from Fabian / Mario, who were much more involved in the files discussions. Thanks!
Pull request overview
This PR adds an in-app, per-volume authorization layer to the Files plugin via composable FilePolicy functions, and updates the plugin so HTTP routes execute using the service principal while enforcing access decisions through policies (user identity taken from x-forwarded-user).
Changes:
- Introduces `FilePolicy` types/helpers (`policy.*` combinators, `PolicyDeniedError`, READ/WRITE action sets) and wires policy checks into HTTP handlers and the programmatic Volume API.
- Switches route execution to service-principal-first and enforces user identity presence (401) + policy decisions (403) before upstream Databricks calls.
- Updates tests and documentation to reflect the new policy model and default `publicRead()` behavior.
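The enforcement order in the list above (identity check first, then the policy decision, then the upstream call) can be sketched roughly as follows. This is a minimal illustration, not the plugin's code: `authorize`, `Decision`, and the shapes of the action, resource, and user values are assumptions made for the example. Only the `x-forwarded-user` header, the 401/403 statuses, and the read-only `publicRead()` default come from the overview.

```typescript
// Hypothetical shapes for illustration; the real types live in policy.ts.
type Action = "list" | "read" | "upload" | "delete";
type Resource = { volume: string; path: string };
type User = { id: string };
type Policy = (action: Action, resource: Resource, user: User) => boolean;

type Decision =
  | { status: 401; reason: string }
  | { status: 403; reason: string }
  | { status: 200; user: User };

// Assumed default: reads allowed, writes denied (mirrors publicRead()).
const publicRead: Policy = (action) => action === "list" || action === "read";

// Decide whether a request may proceed to the upstream Databricks call.
function authorize(
  headers: Record<string, string | undefined>,
  action: Action,
  resource: Resource,
  policy: Policy = publicRead,
): Decision {
  // Step 1: no user identity means 401, before any policy is consulted.
  const id = headers["x-forwarded-user"]?.trim();
  if (!id) {
    return { status: 401, reason: "missing x-forwarded-user" };
  }
  // Step 2: the policy decides; a denial maps to 403.
  const user: User = { id };
  if (!policy(action, resource, user)) {
    return { status: 403, reason: `policy denied ${action} on ${resource.path}` };
  }
  // Step 3: only now would the service principal call the upstream API.
  return { status: 200, user };
}
```

Checking identity before the policy keeps the two failure modes distinguishable to clients: 401 means "we do not know who you are", 403 means "we know, and the volume's policy says no".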
Reviewed changes
Copilot reviewed 17 out of 17 changed files in this pull request and generated 12 comments.
| File | Description |
|---|---|
| packages/appkit/src/plugins/files/policy.ts | Adds policy types, helpers, combinators, and PolicyDeniedError. |
| packages/appkit/src/plugins/files/plugin.ts | Enforces policies on routes and SDK API; defaults volumes to publicRead(); removes user-context gating. |
| packages/appkit/src/plugins/files/types.ts | Adds policy?: FilePolicy to VolumeConfig and updates public API docs. |
| packages/appkit/src/plugins/files/index.ts | Re-exports policy symbols from the files plugin barrel. |
| packages/appkit/src/index.ts | Exposes policy at top-level @databricks/appkit. |
| packages/appkit/src/context/* | Removes isInUserContext and introduces getCurrentUserId usage for caching/identity. |
| packages/appkit/src/plugins/files/tests/policy.test.ts | Unit tests for policy helpers/combinators. |
| packages/appkit/src/plugins/files/tests/plugin.test.ts | Adds policy enforcement tests and updates behavior expectations (SP execution). |
| packages/appkit/src/plugins/files/tests/plugin.integration.test.ts | Updates integration tests for SP execution + default publicRead() denying writes. |
| apps/dev-playground/server/index.ts | Configures dev-playground volume with policy.allowAll(). |
| docs/docs/plugins/files.md | Documents the new policy model and enforcement semantics. |
| docs/docs/api/appkit/Variable.policy.md | Adds API reference page for policy. |
| docs/docs/api/appkit/index.md | Adds policy to API index. |
| docs/docs/api/appkit/typedoc-sidebar.ts | Adds policy to Typedoc sidebar. |
| docs/static/appkit-ui/styles.gen.css | Updates selection styling using color-mix. |
Signed-off-by: Atila Fassina <atila@fassina.eu>
Add upload denyAll denial test for HTTP handler (closes coverage gap on all 10 route handlers) and asUser() write-path tests for delete and upload to verify policy enforcement on the programmatic API.
Co-authored-by: Isaac
Signed-off-by: Atila Fassina <atila@fassina.eu>
Malformed Content-Length headers (e.g. "abc", "123abc", "-1") were parsed via parseInt, which silently accepts partial matches and negative values. This could bypass size-gating policies. Now strictly validates with /^\d+$/ and returns 400 for invalid values.
Co-authored-by: Isaac
Signed-off-by: Atila Fassina <atila@fassina.eu>
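A minimal sketch of the stricter parsing this commit describes. The helper name `parseContentLength`, its return shape, the treatment of a missing header, and the safe-integer guard are assumptions for illustration; only the `/^\d+$/` check and the 400 status come from the commit message.

```typescript
// Hypothetical helper; not the plugin's actual function.
type ContentLength =
  | { ok: true; bytes: number | undefined }
  | { ok: false; status: 400 };

function parseContentLength(raw: string | undefined): ContentLength {
  // Assumption: an absent header is allowed and means "size unknown".
  if (raw === undefined) return { ok: true, bytes: undefined };
  // Digits only: rejects "abc", "123abc", "-1", "" and " 12".
  if (!/^\d+$/.test(raw)) return { ok: false, status: 400 };
  const bytes = Number(raw);
  // Assumption: also reject digit strings too large to represent exactly.
  if (!Number.isSafeInteger(bytes)) return { ok: false, status: 400 };
  return { ok: true, bytes };
}

// For contrast, parseInt accepts values a size gate should never see:
// parseInt("123abc", 10) === 123 and parseInt("-1", 10) === -1.
```

Returning a tagged result rather than throwing lets a route handler map the failure straight to a 400 response before any size-based policy is evaluated.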
The function was removed on main in feat(files): per-volume in-app policy enforcement (#197) since the files plugin no longer needed it. The telemetry interceptor needs it to set execution.context span attributes.
Signed-off-by: Pawel Kosiec <pawel.kosiec@databricks.com>
* feat: add execution context attributes to telemetry spans
  Add `execution.context` and `caller.id` span attributes to the telemetry interceptor, allowing traces to distinguish OBO (user) from service principal code paths.
* fix: preserve OTel context across async generator boundary in executeStream
  The TelemetryInterceptor spans were orphaned because OTel lost the parent HTTP span context when crossing into the async generator. Capture context.active() before the generator and restore it with context.with() inside, so plugin.execute spans appear as children of the HTTP request trace.
* feat: add execution.obo_dev_fallback span attribute for OBO dev mode
  When asUser() is called in dev mode without x-forwarded-access-token, the telemetry span now includes execution.obo_dev_fallback: true to distinguish intended OBO calls from regular service principal calls. Uses OTel context key + thin proxy pattern to carry the flag without mutable state; the flag is scoped automatically per execution and concurrent-safe. Also documents telemetry span attributes in execution-context.md.
* test: add coverage for asUser() dev fallback proxy and OTel context preservation
  Add tests for previously uncovered behaviors:
  - asUser() dev fallback Proxy wraps methods correctly and sets isDevOboFallback context
  - EXCLUDED_FROM_PROXY methods bypass OBO fallback wrapping
  - executeStream preserves parent OTel context across async generator boundary
  - isDevOboFallback() returns false outside proxy context
* fix: use getCurrentUserId() for caller.id span attribute
  The caller.id attribute was using context.userKey, which is a cache key and not always the real user ID. The analytics plugin passes "global" for SP queries, so traces showed caller.id: "global" instead of the actual service principal ID. Now uses getCurrentUserId(), which always returns the real identity.
* docs: document telemetry span attributes and interceptor chain requirement
  Add telemetry span attributes table to execution-context.md and note that execute()/executeStream() is required for automatic instrumentation. Update custom-plugins.md to link telemetry attributes from the execution interceptors bullet.
* feat: add db.user to lakebase spans and clarify arrow-result route
  Add db.user attribute to lakebase.query telemetry spans so traces show which PostgreSQL role executed the query. Also add a comment clarifying that the arrow-result route intentionally bypasses the interceptor chain (it's a data download, not a query execution).
* fix: re-add isInUserContext() removed by files policy refactor
  The function was removed on main in feat(files): per-volume in-app policy enforcement (#197) since the files plugin no longer needed it. The telemetry interceptor needs it to set execution.context span attributes.
---------
Signed-off-by: Pawel Kosiec <pawel.kosiec@databricks.com>
Summary
- `FilePolicy`, a function `(action, resource, user) → boolean`, gates every file operation before the Databricks API call. Ships with built-in helpers (`policy.publicRead()`, `policy.allowAll()`, `policy.denyAll()`) and combinators (`all`, `any`, `not`) for composition.
- User identity is taken from `x-forwarded-user` and passed to the policy function.
- Volumes without an explicit policy default to `publicRead()` (read-only). A startup warning encourages setting an explicit policy.
Motivation
Previously the files plugin relied entirely on OBO token forwarding and Unity Catalog grants. This meant:
Policies close this gap with a composable, per-volume authorization layer evaluated before any API call.
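To make the composition idea concrete, here is a minimal sketch of how helpers and combinators with these names could be written. It is not the code in `policy.ts`: the field names on `FileResource` and `FilePolicyUser`, the action names, and the `isAdmin` and `smallUploads` policies are assumptions for illustration, and only synchronous policies are shown even though the tests mention async ones.

```typescript
// Assumed shapes: the PR names these types but does not list their fields.
type FileAction = "list" | "read" | "upload" | "delete";
interface FileResource { volume: string; path: string; size?: number }
interface FilePolicyUser { id: string }

type FilePolicy = (
  action: FileAction,
  resource: FileResource,
  user: FilePolicyUser,
) => boolean;

const READ_ACTIONS: ReadonlySet<FileAction> = new Set<FileAction>(["list", "read"]);

const policy = {
  allowAll: (): FilePolicy => () => true,
  denyAll: (): FilePolicy => () => false,
  // Read-only: allow read actions, deny everything else.
  publicRead: (): FilePolicy => (action) => READ_ACTIONS.has(action),
  // all: every policy must allow; any: at least one must allow.
  all: (...ps: FilePolicy[]): FilePolicy => (a, r, u) => ps.every((p) => p(a, r, u)),
  any: (...ps: FilePolicy[]): FilePolicy => (a, r, u) => ps.some((p) => p(a, r, u)),
  not: (p: FilePolicy): FilePolicy => (a, r, u) => !p(a, r, u),
};

// Hypothetical custom policies.
const isAdmin: FilePolicy = (_action, _resource, user) => user.id === "admin";
const smallUploads: FilePolicy = (action, resource) =>
  action === "upload" && (resource.size ?? Infinity) <= 1024 * 1024;

// Anyone may read; admins may do anything; anyone may upload up to 1 MiB.
const volumePolicy = policy.any(policy.publicRead(), isAdmin, smallUploads);
```

Because each policy is a plain function, a per-volume rule like `volumePolicy` stays testable without a running server or a Databricks workspace.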
Key changes
| File | Change |
|---|---|
| policy.ts (new) | `FileAction`, `FileResource`, `FilePolicyUser` types · `PolicyDeniedError` · `policy` namespace with `publicRead()`, `allowAll()`, `denyAll()`, `all()`, `any()`, `not()` |
| plugin.ts | Removed `isInUserContext`/OBO gating · Added `_enforcePolicy()`, `_checkPolicy()`, `_extractUser()` · `_executeOrThrow()` for proper 4xx propagation |
| types.ts | `policy?: FilePolicy` on `VolumeConfig` · Updated JSDoc to reflect SP-first execution model |
| index.ts | Exports `policy` from top-level barrel |
| dev-playground | `policy.allowAll()` on the default volume |
| Docs | `files.md` with permission model, built-in policies, combinators, custom policies · Generated API reference for `policy` namespace |
Test coverage
- `policy.test.ts`: all helpers, async policies, `PolicyDeniedError`, edge cases
- `plugin.test.ts`: route-level 401/403 responses, SDK API with user/SP identity, upload size in resource, cache invalidation under SP model
Test plan
- `pnpm dev`
This pull request was AI-assisted by Isaac.