Implement user authentication for OSA with ORCiD as the primary identity provider using custom JWT-based session management.
Background
OSA needs authentication for:
- Researchers depositing data
- Curators reviewing submissions
- Administrators managing the system
Why ORCiD?
- ORCiD is the de facto identifier for researchers - 18M+ researchers already have one
- No password management - Eliminates password storage, reset flows, MFA enrollment, brute-force protection
- Institutional SSO built-in - Researchers can sign into ORCiD with university credentials via eduGAIN
- Attribution-ready - ORCiD IDs are already used for scientific attribution
This follows the pattern used by DataONE, Zenodo, and other scientific repositories.
Why Custom JWT Auth (not Supabase Auth)?
We initially planned to use Supabase Auth (GoTrue), but found the self-hosting experience problematic:
- Requires special supabase/postgres image with pre-configured schemas
- Complex migration and initialization requirements
- Additional container to maintain
- Overkill for our current needs (ORCiD-only auth)
The custom JWT approach is simpler because:
- Minimal dependencies - Just the PyJWT library and standard Postgres
- Full control - We own the token lifecycle
- Simpler stack - No additional auth container
- SAML later - Can add via python3-saml when institutional SSO is needed
Architecture
┌─────────────────────────────────────────────────────────────────┐
│                          OSA Instance                           │
│                                                                 │
│  ┌─────────────┐                            ┌───────────────┐   │
│  │   OSA Web   │                            │   Postgres    │   │
│  │  (Next.js)  │───────────────────────────▶│               │   │
│  └──────┬──────┘                            │  users        │   │
│         │                                   │  identities   │   │
│         │ API calls with JWT                │  refresh_tkns │   │
│         ▼                                   │               │   │
│  ┌─────────────┐      JWT issue/verify      │               │   │
│  │   OSA API   │◀──────────────────────────▶│               │   │
│  │  (FastAPI)  │                            └───────────────┘   │
│  └──────┬──────┘                                                │
│         │                                                       │
└─────────┼───────────────────────────────────────────────────────┘
          │ OAuth 2.0
          ▼
    ┌───────────┐
    │   ORCiD   │
    │  (OAuth)  │
    └───────────┘
User Flow
User clicks "Sign in with ORCiD"
│
▼
OSA Web redirects to OSA API /auth/login
│
▼
OSA API redirects to ORCiD OAuth authorization endpoint
│
▼
User authenticates at ORCiD, grants consent
│
▼
ORCiD redirects to OSA API /auth/callback with authorization code
│
▼
OSA API exchanges code for ORCiD tokens, retrieves user info
│
▼
OSA API creates/updates user in database
│
▼
OSA API generates JWT access token + refresh token
│
▼
OSA API redirects to OSA Web with tokens in URL fragment
│
▼
OSA Web stores tokens in localStorage, user is logged in
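The two redirects in this flow that OSA API builds itself can be sketched as follows. This is a minimal illustration, not the planned implementation: the function names, the example client ID, the example hosts, and the use of a `state` parameter are assumptions; the authorization endpoint and the `/authenticate` scope are the ones ORCiD documents for sign-in.

```python
import secrets
from urllib.parse import urlencode

# ORCiD's public OAuth 2.0 authorization endpoint.
ORCID_AUTHORIZE_URL = "https://orcid.org/oauth/authorize"


def build_authorize_url(client_id: str, redirect_uri: str, state: str) -> str:
    """URL that /auth/login would redirect the browser to."""
    params = {
        "client_id": client_id,
        "response_type": "code",
        "scope": "/authenticate",
        "redirect_uri": redirect_uri,
        "state": state,  # assumption: CSRF guard, checked again in /auth/callback
    }
    return f"{ORCID_AUTHORIZE_URL}?{urlencode(params)}"


def build_web_redirect(web_url: str, access_token: str, refresh_token: str) -> str:
    """URL that /auth/callback would redirect to, with tokens in the fragment."""
    fragment = urlencode(
        {"access_token": access_token, "refresh_token": refresh_token}
    )
    return f"{web_url}#{fragment}"


# Hypothetical values for illustration only.
state = secrets.token_urlsafe(16)
login_url = build_authorize_url(
    "APP-EXAMPLE", "https://osa.example.org/auth/callback", state
)
web_url = build_web_redirect("https://osa.example.org/login/done", "jwt", "opaque")
```

Putting the tokens in the fragment rather than the query string keeps them out of server logs and Referer headers, since browsers do not send the fragment to the server.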
Functional Requirements
Authentication Flow
- Users authenticate via ORCiD OAuth 2.0
- Backend exchanges authorization code for ORCiD tokens
- Backend retrieves user info (ORCiD ID, display name) from ORCiD API
- New users are created in the database on first login
- Existing users are looked up by ORCiD ID
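The lookup-or-create step above might look like the sketch below. The table layout is hypothetical (only the table names `users` and `identities` come from the architecture diagram), and SQLite stands in for Postgres so the example runs on its own.

```python
import sqlite3
import uuid

# Hypothetical schema; the real column layout is not specified in this issue.
SCHEMA = """
CREATE TABLE users (id TEXT PRIMARY KEY, display_name TEXT);
CREATE TABLE identities (
    provider    TEXT NOT NULL,
    external_id TEXT NOT NULL,
    user_id     TEXT NOT NULL REFERENCES users(id),
    PRIMARY KEY (provider, external_id)
);
"""


def find_or_create_user(db: sqlite3.Connection, orcid_id: str, display_name: str) -> str:
    """Return the internal user id for an ORCiD ID, creating the user on first login."""
    row = db.execute(
        "SELECT user_id FROM identities WHERE provider = 'orcid' AND external_id = ?",
        (orcid_id,),
    ).fetchone()
    if row is not None:
        # Returning user: refresh the display name in case it changed at ORCiD.
        db.execute("UPDATE users SET display_name = ? WHERE id = ?", (display_name, row[0]))
        db.commit()
        return row[0]
    # First login: create the user and link the ORCiD identity to it.
    user_id = str(uuid.uuid4())
    db.execute("INSERT INTO users (id, display_name) VALUES (?, ?)", (user_id, display_name))
    db.execute(
        "INSERT INTO identities (provider, external_id, user_id) VALUES ('orcid', ?, ?)",
        (orcid_id, user_id),
    )
    db.commit()
    return user_id


db = sqlite3.connect(":memory:")
db.executescript(SCHEMA)
first = find_or_create_user(db, "0000-0001-2345-6789", "Ada Example")
second = find_or_create_user(db, "0000-0001-2345-6789", "Ada B. Example")
```

Keeping the ORCiD ID in a separate `identities` row, keyed by provider, is one way to leave room for additional providers later without changing the `users` table.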
Session Management
- Access tokens: Short-lived JWTs (1 hour) containing user claims
- Refresh tokens: Long-lived opaque tokens (7 days) stored in database
- Tokens are rotated on each refresh (new refresh token issued)
- Logout revokes the refresh token
JWT Claims
sub: User's internal UUID
orcid_id: User's ORCiD identifier (e.g., 0000-0001-2345-6789)
aud: Audience claim (value: authenticated)
exp: Expiration timestamp
iat: Issued at timestamp
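To make the token contents concrete, here is a standard-library sketch of issuing and verifying an HS256 JWT carrying these claims. It is a stand-in written for illustration, not the planned code path: the issue intends to use PyJWT, where the equivalent calls would be `jwt.encode(claims, secret, algorithm="HS256")` and `jwt.decode(token, secret, algorithms=["HS256"], audience="authenticated")`. The function names and the one-hour default are assumptions taken from the session requirements.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Any, Dict, Optional


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(secret: bytes, signing_input: str) -> bytes:
    return hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()


def issue_access_token(secret: bytes, user_id: str, orcid_id: str,
                       now: Optional[int] = None, ttl: int = 3600) -> str:
    """Build a signed JWT with the claims listed above."""
    now = int(time.time()) if now is None else now
    header = {"alg": "HS256", "typ": "JWT"}
    claims = {"sub": user_id, "orcid_id": orcid_id, "aud": "authenticated",
              "iat": now, "exp": now + ttl}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    return f"{signing_input}.{_b64url(_sign(secret, signing_input))}"


def verify_access_token(secret: bytes, token: str,
                        now: Optional[int] = None) -> Dict[str, Any]:
    """Return the claims, or raise ValueError if the token is not acceptable."""
    now = int(time.time()) if now is None else now
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("malformed token")
    header_b64, claims_b64, signature_b64 = parts
    expected = _sign(secret, f"{header_b64}.{claims_b64}")
    # Constant-time comparison of the HMAC-SHA256 signature.
    if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
        raise ValueError("bad signature")
    if json.loads(_b64url_decode(header_b64)).get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    claims = json.loads(_b64url_decode(claims_b64))
    if claims.get("aud") != "authenticated":
        raise ValueError("wrong audience")
    if now >= claims["exp"]:
        raise ValueError("token expired")
    return claims


secret = b"\x01" * 32  # placeholder 256-bit key for the example only
token = issue_access_token(secret, "3f2b-example-uuid", "0000-0001-2345-6789", now=1000)
claims = verify_access_token(secret, token, now=1001)
```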
API Protection
- Protected endpoints require a valid JWT in the Authorization: Bearer <token> header
- Invalid/expired tokens return 401 Unauthorized
- Optional authentication supported for public endpoints
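These rules can be sketched independently of the web framework. In the snippet below, `authenticate`, `extract_bearer_token`, and the `verify` callback are hypothetical names; in the real API this logic would most likely live in a FastAPI dependency. One assumption to flag: a token that is present but invalid is rejected even on optional-auth endpoints.

```python
from typing import Callable, Dict, Mapping, Optional, Tuple

Claims = Dict[str, object]


def extract_bearer_token(headers: Mapping[str, str]) -> Optional[str]:
    """Return the token from an 'Authorization: Bearer <token>' header, if any."""
    # Assumes header names are already normalized to this capitalization.
    value = headers.get("Authorization", "")
    scheme, _, token = value.partition(" ")
    if scheme.lower() != "bearer" or not token.strip():
        return None
    return token.strip()


def authenticate(headers: Mapping[str, str],
                 verify: Callable[[str], Claims],
                 required: bool = True) -> Tuple[int, Optional[Claims]]:
    """Return (HTTP status, claims); claims is None for anonymous or rejected requests."""
    token = extract_bearer_token(headers)
    if token is None:
        # No credentials: reject protected endpoints, let public ones through.
        return (401, None) if required else (200, None)
    try:
        return 200, verify(token)
    except ValueError:
        # Invalid or expired token.
        return 401, None


def fake_verify(token: str) -> Claims:
    """Test double for JWT verification: accepts exactly one token."""
    if token != "good-token":
        raise ValueError("invalid token")
    return {"sub": "user-1"}


ok = authenticate({"Authorization": "Bearer good-token"}, fake_verify)
missing = authenticate({}, fake_verify)
public = authenticate({}, fake_verify, required=False)
```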
Frontend Session
- Tokens stored in localStorage
- Automatic token refresh before expiry
- Session persists across page refreshes
- Logout clears local storage and revokes backend token
Non-Functional Requirements
Security
- JWT signed with HS256 using server-side secret (256+ bits)
- Refresh tokens stored as SHA-256 hashes (never plaintext)
- Token family tracking to detect refresh token theft
- HTTPS required in production
Extensibility
- Identity provider abstraction to support future providers
- SAML support can be added later via python3-saml library
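One way the identity provider abstraction could be expressed is a small structural interface that the ORCiD integration, and later a SAML one, would each implement. Everything named here (`IdentityProvider`, `ExternalIdentity`, `StubProvider`, `handle_callback`) is hypothetical and only illustrates the shape.

```python
from dataclasses import dataclass
from typing import Dict, Protocol


@dataclass(frozen=True)
class ExternalIdentity:
    provider: str       # e.g. "orcid"
    external_id: str    # e.g. an ORCiD ID
    display_name: str


class IdentityProvider(Protocol):
    """What the auth endpoints need from any provider."""

    name: str

    def authorization_url(self, state: str) -> str: ...

    def exchange_code(self, code: str) -> ExternalIdentity: ...


class StubProvider:
    """Test double; a real ORCiD provider would call ORCiD's token endpoint here."""

    name = "stub"

    def authorization_url(self, state: str) -> str:
        return f"https://idp.example.org/authorize?state={state}"

    def exchange_code(self, code: str) -> ExternalIdentity:
        return ExternalIdentity("stub", f"user-for-{code}", "Stub User")


# Registry the login and callback endpoints would consult by provider name.
PROVIDERS: Dict[str, IdentityProvider] = {"stub": StubProvider()}


def handle_callback(provider_name: str, code: str) -> ExternalIdentity:
    """Provider-independent callback logic: resolve the provider, exchange the code."""
    return PROVIDERS[provider_name].exchange_code(code)


identity = handle_callback("stub", "abc123")
```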
Acceptance Criteria
Out of Scope (Separate Issues)
Dependencies
References