livMatS/dserver-development-stack

dserver-development-stack

Docker Compose stack for developing dserver and the dtool-lookup-webapp.

Overview

This repository provides a complete development environment for:

  • dserver - REST API for registering, looking up, and searching dtool dataset metadata
  • dtool-lookup-webapp - Vue.js web frontend for searching datasets
  • MinIO - S3-compatible object storage for datasets
  • PostgreSQL - SQL database for dserver admin metadata
  • MongoDB - NoSQL database for dataset search and retrieval

Prerequisites

  • Docker and Docker Compose
  • Git

Installation

1. Clone the repository with submodules

git clone --recursive git@github.com:your-org/dserver-development-stack.git
cd dserver-development-stack

If you already cloned without --recursive, initialize the submodules:

git submodule update --init --recursive

2. Start the stack

docker compose up -d

On first run, this will:

  • Build the Docker images
  • Create a Python virtual environment with all dependencies
  • Generate JWT keys for authentication
  • Initialize the PostgreSQL and MongoDB databases
  • Create the MinIO bucket for datasets
  • Start all services

3. Verify the services are running

docker compose ps

All services should show as "healthy" or "running".
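Because the first run builds images and a virtual environment, a script that depends on the stack may want to block until a service answers. The following is a minimal sketch, assuming curl is installed on the host and that any HTTP response (even an error status) means the service is up. The helper name wait_for_http is made up for this example and is not part of the stack.

```shell
#!/bin/sh
# wait_for_http URL [TRIES]: poll URL once per second until it answers.
# Hypothetical helper for illustration only.
wait_for_http() {
    url=$1
    tries=${2:-60}
    while [ "$tries" -gt 0 ]; do
        # Without -f, curl exits 0 for any HTTP response, including 401 or 404.
        if curl -s -o /dev/null --max-time 2 "$url"; then
            return 0
        fi
        tries=$((tries - 1))
        sleep 1
    done
    echo "timed out waiting for $url" >&2
    return 1
}

# Example: wait_for_http http://localhost:5000/ && echo "dserver is answering"
```

This only checks that the port speaks HTTP; the status column of docker compose ps remains the reference for container health.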

Services

| Service | Port | Description |
|---|---|---|
| dserver | 5000 | REST API for dataset metadata (includes token generator) |
| webapp | 8080 | Vue.js frontend |
| minio | 9000 (API), 9001 (Console) | S3-compatible storage |
| postgres | 5432 | PostgreSQL database |
| mongo | 27017 | MongoDB database |

Usage

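Access the services

With the stack running, each service is reachable on localhost at the port listed under Services, for example the dserver API at http://localhost:5000, the webapp at http://localhost:8080, and the MinIO console at http://localhost:9001.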

Create a test dataset

To create a sample dataset and index it in dserver:

docker compose --profile indexer run --rm indexer /scripts/create-test-dataset.sh

Index existing datasets from S3

If you have datasets in the MinIO bucket, index them with:

docker compose --profile indexer run --rm indexer /scripts/index-datasets.sh

Push datasets from the command line

You can push datasets directly to the MinIO S3 storage from your host machine using the dtool command line tool.

Prerequisites

Install dtool with S3 support:

pip install dtool-s3

Configure dtool

Copy the provided configuration file to your dtool config directory, creating the directory first if it does not exist:

mkdir -p ~/.config/dtool
cp dtool.json ~/.config/dtool/dtool.json

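Or set the variables in the environment. The variable names embed the bucket name, and bash and other POSIX shells reject export for names that contain a hyphen ("not a valid identifier"), so pass them to a single command through env instead:

env 'DTOOL_S3_ENDPOINT_dtool-bucket=http://localhost:9000' \
    'DTOOL_S3_ACCESS_KEY_ID_dtool-bucket=minioadmin' \
    'DTOOL_S3_SECRET_ACCESS_KEY_dtool-bucket=minioadmin' \
    'DTOOL_S3_DISABLE_BUCKET_VERSIONING_dtool-bucket=true' \
    dtool ls s3://dtool-bucket/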

Create and push a dataset

  1. Create a proto dataset:
dtool create my-dataset
  2. Add data to the dataset:
cp some-file.txt my-dataset/data/
  3. Freeze the dataset:
dtool freeze my-dataset
  4. Copy the dataset to the S3 storage:
dtool cp my-dataset s3://dtool-bucket/
  5. Index the dataset in dserver (so it appears in the webapp):
docker compose --profile indexer run --rm indexer /scripts/index-datasets.sh
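The five steps above can be chained so that a failure at any step stops the rest. This is only a sketch, assuming dtool is configured as described earlier and that the script is run from the repository root so docker compose finds the stack. The function name push_dataset is hypothetical.

```shell
#!/bin/sh
# push_dataset NAME FILE: create, fill, freeze and upload a dataset, then re-index.
# Hypothetical wrapper around the commands listed in the steps above.
push_dataset() {
    name=$1
    file=$2
    # Each command runs only if the previous one succeeded.
    dtool create "$name" &&
    cp "$file" "$name/data/" &&
    dtool freeze "$name" &&
    dtool cp "$name" s3://dtool-bucket/ &&
    docker compose --profile indexer run --rm indexer /scripts/index-datasets.sh
}

# Example: push_dataset my-dataset some-file.txt
```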

List datasets on S3

dtool ls s3://dtool-bucket/

Fetch a dataset from S3

dtool cp s3://dtool-bucket/<uuid> ./local-copy/

Access datasets via dserver (without backend credentials)

The dtool-dserver storage broker allows you to access datasets through dserver without requiring direct S3/Azure credentials.

Prerequisites

Install dtool-dserver:

pip install dtool-dserver

Configure access

You can configure dtool-dserver using either environment variables or the dtool.json config file:

Option 1: Using dtool.json (recommended)

Copy the provided configuration file to your dtool config directory and update the token:

cp dtool.json ~/.config/dtool/dtool.json

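Then edit ~/.config/dtool/dtool.json and replace your-jwt-token-here with an actual token. Alternatively, the following commands fetch a token and write a complete configuration file in one step, overwriting any existing file: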

# Get a JWT token from dserver's token endpoint
TOKEN=$(curl -s -X POST http://localhost:5000/auth/token \
  -H "Content-Type: application/json" \
  -d '{"username": "admin"}' | jq -r '.token')

# Update the config file
cat > ~/.config/dtool/dtool.json <<EOF
{
  "DTOOL_S3_ENDPOINT_dtool-bucket": "http://localhost:9000",
  "DTOOL_S3_ACCESS_KEY_ID_dtool-bucket": "minioadmin",
  "DTOOL_S3_SECRET_ACCESS_KEY_dtool-bucket": "minioadmin",
  "DTOOL_S3_DISABLE_BUCKET_VERSIONING_dtool-bucket": true,
  "DSERVER_DEFAULT_BASE_URI": "s3://dtool-bucket",
  "DSERVER_TOKEN": "$TOKEN"
}
EOF
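If you would rather keep the copied file and only swap in the token, the placeholder can be replaced in place. A minimal sketch, assuming the file still contains the literal string your-jwt-token-here; the function name set_dserver_token is made up for this example.

```shell
#!/bin/sh
# set_dserver_token FILE TOKEN: replace the placeholder in FILE with TOKEN.
# Hypothetical helper for illustration only.
set_dserver_token() {
    file=$1
    token=$2
    # JWTs consist of base64url characters and dots, so "|" is a safe delimiter.
    sed "s|your-jwt-token-here|$token|" "$file" > "$file.tmp" &&
        mv "$file.tmp" "$file"
}

# Example: set_dserver_token ~/.config/dtool/dtool.json "$TOKEN"
```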

Option 2: Using environment variables

# Get a JWT token from dserver's token endpoint
export DSERVER_TOKEN=$(curl -s -X POST http://localhost:5000/auth/token \
  -H "Content-Type: application/json" \
  -d '{"username": "admin"}' | jq -r '.token')

# Configure default base URI for short URIs (optional)
export DSERVER_DEFAULT_BASE_URI="s3://dtool-bucket"

Using dserver:// URIs

Three URI formats are supported:

Short format (when DSERVER_DEFAULT_BASE_URI is set):

dtool ls dserver://localhost:5000/
dtool cp dserver://localhost:5000/<uuid> ./local-copy/
dtool cp my-dataset dserver://localhost:5000/

Alias format (for multiple data sources):

export DSERVER_BASE_URI_ALIASES='{"main": "s3://dtool-bucket", "archive": "s3://archive-bucket"}'
dtool ls dserver://localhost:5000/main/
dtool cp dserver://localhost:5000/main/<uuid> ./local-copy/

Full format (always works, no configuration needed):

dtool ls dserver://localhost:5000/s3/dtool-bucket/
dtool cp dserver://localhost:5000/s3/dtool-bucket/<uuid> ./local-copy/
dtool cp my-dataset dserver://localhost:5000/s3/dtool-bucket/
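Judging from the examples, the full format is the base URI with "://" replaced by "/" and appended to the dserver host. The sketch below builds such a URI. The mapping is inferred from the examples above rather than taken from the dtool-dserver documentation, and the function name dserver_full_uri is hypothetical.

```shell
#!/bin/sh
# dserver_full_uri HOST BASE_URI [UUID]
# Turn a base URI such as s3://dtool-bucket into the full dserver:// form.
dserver_full_uri() {
    host=$1
    base=${2%/}            # drop one trailing slash, if any
    scheme=${base%%://*}   # e.g. "s3"
    rest=${base#*://}      # e.g. "dtool-bucket"
    printf 'dserver://%s/%s/%s/%s\n' "$host" "$scheme" "$rest" "${3:-}"
}

dserver_full_uri localhost:5000 s3://dtool-bucket
# prints dserver://localhost:5000/s3/dtool-bucket/
```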

Benefits:

  • No need for S3/Azure credentials on client machines
  • Centralized access control through dserver
  • Automatic dataset registration on upload

Get an authentication token

The token generator is integrated into dserver as a plugin (dserver-dummy-token-generator). For development, it accepts any username/password combination:

curl -X POST http://localhost:5000/auth/token \
  -H "Content-Type: application/json" \
  -d '{"username": "admin", "password": "any"}'

Use the returned token in the Authorization: Bearer <token> header.

Note: The token endpoint is at /auth/token (the plugin mounts at /auth and the token generation endpoint is /token).
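When debugging authentication it can help to look inside a token. A JWT is three base64url-encoded fields separated by dots, and the middle one is the payload. The sketch below decodes it without verifying the signature. It assumes a base64 command that accepts -d (as in GNU coreutils). Which claims the dummy token generator actually emits is not specified here, and the function name jwt_payload is made up for this example.

```shell
#!/bin/sh
# jwt_payload TOKEN: print the decoded (unverified) payload of a JWT.
# Runs in a subshell, so its variables do not leak into the caller.
jwt_payload() (
    # Take the second dot-separated field and map base64url to plain base64.
    payload=$(printf '%s' "$1" | cut -d. -f2 | tr '_-' '/+')
    # base64url drops the trailing "=" padding; restore it for base64 -d.
    while [ $(( ${#payload} % 4 )) -ne 0 ]; do
        payload="${payload}="
    done
    printf '%s' "$payload" | base64 -d
)

# Example: jwt_payload "$TOKEN" | jq .
```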

Access dserver API with authentication

TOKEN=$(curl -s -X POST http://localhost:5000/auth/token \
  -H "Content-Type: application/json" \
  -d '{"username": "admin"}' | jq -r '.token')

curl -H "Authorization: Bearer $TOKEN" http://localhost:5000/config/info

Development

Installed packages (editable mode)

The following packages are installed in editable mode, so changes to the code are reflected immediately:

  • dtoolcore - Core dtool library
  • dtool-s3 - S3 storage backend for dtool
  • dtool-dserver - dserver storage broker for accessing datasets via dserver
  • dservercore - dserver core application
  • dserver-search-plugin-mongo - MongoDB search plugin
  • dserver-retrieve-plugin-mongo - MongoDB retrieve plugin
  • dserver-dependency-graph-plugin - Dependency graph extension
  • dserver-signed-url-plugin - Signed URL generation for direct S3/Azure access
  • dserver-dummy-token-generator - Development-only JWT token generator

Rebuilding the virtual environment

If you add new dependencies or want to rebuild:

docker compose down
docker volume rm dserver-development-stack_dserver_venv
docker compose up -d

Viewing logs

# All services
docker compose logs -f

# Specific service
docker compose logs -f dserver

Stopping the stack

docker compose down

To also remove the data volumes:

docker compose down -v

Configuration

Environment variables

Key environment variables are set in docker-compose.yml. The main ones are:

| Variable | Description |
|---|---|
| SQLALCHEMY_DATABASE_URI | PostgreSQL connection string |
| SEARCH_MONGO_URI | MongoDB URI for search plugin |
| RETRIEVE_MONGO_URI | MongoDB URI for retrieve plugin |
| JWT_PRIVATE_KEY_FILE | Path to JWT private key |
| JWT_PUBLIC_KEY_FILE | Path to JWT public key |
| DTOOL_S3_ENDPOINT_dtool-bucket | MinIO endpoint for the dtool bucket |

S3/MinIO Configuration

The stack creates a bucket named dtool-bucket on MinIO. To access datasets from outside the Docker network (e.g., from the host), add this to your /etc/hosts:

127.0.0.1 dserver-minio-alias
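To avoid adding the same line twice when a setup script is re-run, the entry can be appended only when it is missing. A minimal sketch, with the hosts file path passed as an argument so it can be tried on a scratch file first; the function name ensure_hosts_entry is hypothetical.

```shell
#!/bin/sh
# ensure_hosts_entry FILE IP NAME: append "IP NAME" to FILE unless NAME is present.
# Hypothetical helper for illustration only.
ensure_hosts_entry() {
    file=$1
    ip=$2
    name=$3
    # -F treats NAME as a literal string, -w requires a whole-word match.
    if ! grep -qwF -- "$name" "$file"; then
        printf '%s %s\n' "$ip" "$name" >> "$file"
    fi
}

# Example (needs write access to /etc/hosts, i.e. root):
# ensure_hosts_entry /etc/hosts 127.0.0.1 dserver-minio-alias
```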

Submodules

This repository includes the following submodules:

| Submodule | Description |
|---|---|
| dtoolcore | Core Python API for managing datasets |
| dtool-s3 | S3 storage backend for dtool |
| dtool-azure | Azure storage backend for dtool |
| dtool-cli | Command-line interface for dtool |
| dtool-dserver | Storage broker for accessing datasets via dserver |
| dservercore | dserver Flask application |
| dserver-search-plugin-mongo | MongoDB search plugin |
| dserver-retrieve-plugin-mongo | MongoDB retrieve plugin |
| dserver-dependency-graph-plugin | Dependency graph extension |
| dserver-signed-url-plugin | Signed URL generation plugin |
| dserver-dummy-token-generator | Development-only JWT token generator plugin |
| dtool-lookup-webapp | Vue.js web frontend |
| dserver-client-js | JavaScript/TypeScript client library |

Troubleshooting

dserver won't start

Check the logs:

docker compose logs dserver

Common issues:

  • Database not ready: Wait for postgres/mongo healthchecks
  • Missing search/retrieve plugin: Ensure the venv was built correctly

Webapp build errors

The webapp build may fail because of ESLint configuration issues with newer Node.js versions. Check the logs:

docker compose logs webapp

Permission issues

If you encounter permission issues with volumes, check that the Docker user has access to the mounted directories.

License

See the LICENSE file for details.
