fix(infra) custom Cosmos data-plane role permits database creation#62
Merged
Conversation
The built-in 'Cosmos DB Built-in Data Contributor' role granted in PR #60 has dataActions: - readMetadata - sqlDatabases/containers/* - sqlDatabases/containers/items/* Notably MISSING: sqlDatabases/write — meaning the role cannot create databases. The first end-to-end run of '--ensure-cosmos-containers' against the live Phase 1 deploy (pinwiz-shared-dev-20260503122213) hit Forbidden (403); SubStatus 5300 at POST /dbs for exactly that reason. Diagnostic confirmed by reading the role definition's dataActions directly via 'az cosmosdb sql role definition show'. Verified the issue isn't credential-source ordering by forcing AZURE_TOKEN_CREDENTIALS=azureclicredential and seeing the same 403. Verified control-plane works (subscription Owner inheritance covers ARM) by 'az cosmosdb sql database create' succeeding from the same shell. Fix replaces the built-in role assignment with a custom sqlRoleDefinition resource named 'PinWiz Developer Data Contributor' whose dataActions are readMetadata plus the account-wide wildcard Microsoft.DocumentDB/databaseAccounts/sqlDatabases/* — covering database create/replace/delete + container CRUD + item CRUD + query + change feed in one declaration. This is the Microsoft-documented Cosmos pattern for self-sufficient post-deploy bootstrap. The ARM 'Cosmos DB Operator' alternative was rejected because Operator's scope extends to operations like account delete (broader than the developer needs) and splits the auth model across two planes. Bicep manages role assignments by name via guid() derived from the roleDefinitionId — since the role ID changes, the new assignment has a different name from the old one, leaving the original built-in assignment from PR #60 orphaned. It's a strict subset of the new custom role and is harmless; can be cleaned manually with 'az cosmosdb sql role assignment delete' if desired. Re-deploy: pwsh ./infra/scripts/Deploy-SharedResources.ps1 -Environment dev After the re-deploy, '--ensure-cosmos-containers' succeeds end-to-end against deployed Cosmos and is once again the canonical post-deploy smoke-test PR #57 designed it to be. No .NET code changed; tests still 503/503 passing. Build clean. Pre-push self-audit: /local-review (0 critical / 2 minor / 8 categories clean — both minor are over-broad wildcard and missing what-if golden-file, both deferrable for a single-developer personal dev environment) plus 7-item mechanical checklist (all pass).
16 tasks
jkeeley2073
added a commit
that referenced
this pull request
May 4, 2026
…rapperArmSdk feat(infra) ARM-backed CosmosBootstrapper supersedes broken PR #62 RBAC
7 tasks
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Fixes the bug surfaced when running
dotnet run --project src/PinballWizard.Cli -- --ensure-cosmos-containersagainst the live Phase 1 deploypinwiz-shared-dev-20260503122213for the first time. The smoke-test returnedForbidden (403); SubStatus 5300atPOST /dbs(database creation).Root cause
The built-in
Cosmos DB Built-in Data Contributorrole assigned in PR #60 has thesedataActions:Notably missing:
Microsoft.DocumentDB/databaseAccounts/sqlDatabases/write. The data-plane SDK callCreateDatabaseIfNotExistsAsyncrequires that action to create databases.This is a known Cosmos RBAC gap — the built-in Contributor role is documented as "operations on existing data and entities; doesn't allow creating, deleting, or modifying entities like databases or containers."
Diagnostic chain
az-authenticated principal Object ID matches thedeveloperObjectIdininfra/main-shared.dev.local.bicepparam(fb4fdb3e-…). MATCH.az cosmosdb sql role assignment list. PRESENT.AZURE_TOKEN_CREDENTIALS=azureclicredential. STILL 403.dataActionsdirectly viaaz cosmosdb sql role definition show ... --id 0…02. CONFIRMEDsqlDatabases/writeis missing.az cosmosdb sql database createsucceeded from the same shell (createdpinwizdatabase manually for the immediate workaround).Fix
Replace the built-in role assignment in
infra/modules/shared.bicepwith a customsqlRoleDefinitionsresource namedPinWiz Developer Data ContributorwhosedataActionsare:Microsoft.DocumentDB/databaseAccounts/readMetadataMicrosoft.DocumentDB/databaseAccounts/sqlDatabases/*— covers database create/replace/delete + container CRUD + item CRUD + query + change feed in one declarationThe role assignment now references
cosmosDeveloperRole.idinstead of the built-in role ID. After re-deploy,--ensure-cosmos-containerswill succeed end-to-end against deployed Cosmos.Why custom role over
Cosmos DB Operator(rejected alternative)The ARM
Cosmos DB Operatorbuilt-in role would also grant database/container management, but:Re-deploy + verify
Test Plan
dotnet build PinballWizard.slnx-> 0 warnings, 0 errorsdotnet test PinballWizard.slnx-> 503 / 503 passing (unchanged from main; this PR adds no .NET code)bicep build infra/main-shared.bicepclean (CI gate)Out of Scope
name(aguid()derived from theroleDefinitionId), so changing the role ID produces a different assignment name. The new assignment is created; the old built-in assignment from PR feat(infra) grant Cosmos data-plane RBAC to developer principal in Bicep #60 is left orphaned. It's a strict subset of the new custom role and is harmless. Manual cleanup if desired:sqlDatabases/*wildcard to specific actions (write/read/delete+containers/*+containers/items/*). Reviewer flagged asminor; deferred for personal dev environment. The wildcard addsthroughputSettings/*(RU/s changes on shared-throughput databases) which the developer doesn't currently need but doesn't actively harm.sqlDatabases/writeaction offline. Deferred — would be a separate test-infra PR.appsettings.jsonCosmos:Containerssection. I started to add this thinking it was needed, then reverted after re-readingCosmosOptions.cs:64-68— the property defaults already declaremachines(PK/manufacturer) andingestion_sources(PK/partitionKey). Adding redundant config to appsettings would have created a new code-vs-config drift surface.Checklist
docs/adr/— N/A (this is a fix to PR feat(infra) grant Cosmos data-plane RBAC to developer principal in Bicep #60's RBAC implementation, not a new architectural choice; the choice of "data-plane RBAC over ARM Operator" is already locked by PR feat(infra) grant Cosmos data-plane RBAC to developer principal in Bicep #60's spirit)README.mdand/ordocs/are updated in the same PR — N/A (the playbook in the session-handoff memory already references--ensure-cosmos-containers; the change here makes that documentation accurate)~/.claude/projects/c--projects-PinballWizard/memory/is now stale, it has been updated or removed in the same PR — handoff memory updated separatelyTODO/FIXME/ commented-out code committed<NoWarn>without a comment explaining why and the removal criterionPre-push self-audit
Step 0 —
/local-review(qualitative)/local-reviewand addressed every critical finding before pushsqlDatabases/*wildcard is broader than strictly needed (addsthroughputSettings/*privilege the developer doesn't use). Acceptable for single-developer personal dev environment; tighten on next visit if desired.sqlDatabases/writeaction offline. Future test-infra PR.Step 1 — Mechanical checklist
*Optionsproperty has at least one real getter call insrc/— N/A (no .NET code changed)shared.bicep; reviewer confirmed this is correct because Cosmos data-plane uses a separate ARM namespace and no built-in Cosmos data role fitscatch { }— N/A (no .NET code changed)ISourceScraper? — N/Agit log -1 --format='%an <%ae>'shows personal noreply, not work email