FHIR-AgentBench

This repository contains the code and dataset for FHIR-AgentBench, a benchmark for evaluating LLM agents that answer clinical questions over FHIR-formatted EHR data.

📁 Project Structure

FHIR-AgentBench/
├── scripts/                          # Bash scripts for data setup, agent inference, and evaluation
├── agent/                            # Multiple agent implementations
├── tools/                            # Tools for agents
├── utils/                            # Utility modules
├── config.py                         # Configuration settings and constants
├── config.yml                        # YAML configuration file
├── create_db.py                      # Creates database for Q&A conversion to FHIR
├── create_question_answer_dataset.py # Creates Q&A dataset from EHRSQL
├── create_question_fhir_dataset.py   # Creates FHIR-compatible question dataset
├── evaluation_metrics.py             # Main evaluation script
├── fhir_client.py                    # FHIR client for Google Cloud Healthcare API
├── run_agent.py                      # Main script to run agents on datasets
├── question_fixes_complete.json      # Hard-coded question fixes
├── value_mapping_valid_natural.json  # Natural language value mappings
├── requirements.txt                  # Python package dependencies
└── images/                           # Documentation images

🚀 Getting Started

Prerequisites

  • Install required packages:
    # Create a conda environment
    conda create -n fhir-agentbench python=3.11
    conda activate fhir-agentbench
    
    # Install dependencies
    pip install -r requirements.txt

Data Preparation

1. Upload the MIMIC-IV FHIR data to a GCP FHIR store

  • Download MIMIC-IV Clinical Database Demo on FHIR from PhysioNet and extract the .gz files.
  • Create a GCP account, then in the Google Cloud Console search for FHIR Viewer.
  • Click Browser on the left, then Create dataset (see the dataset creation screenshot).
  • Next, click Create data store to prepare for the data upload (see the datastore creation screenshot).
  • For Configure your FHIR store, select R4 as the FHIR Version. Keep the other settings at their defaults and click Create.
  • Separately, in Cloud Storage, upload the unzipped folder containing the MIMIC-IV FHIR data (*.ndjson) to a bucket.
  • Back in the FHIR store, click Actions in the upper right and choose Import (see the FHIR data store import screenshot).
  • Select the folder you uploaded. Under FHIR Import Settings, choose Resource for Content Structure. Click Import and grant permissions if prompted (see the FHIR import settings screenshot).
  • Open the Import operation to confirm success. It usually completes in about 10 minutes.
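If you prefer the CLI over the console, the same import can be done with gcloud. A sketch, where the store/dataset/location IDs and the `gs://` path are placeholders for the values you used above:

```shell
# Import the uploaded NDJSON files into the FHIR store via the CLI.
# All <...> values are placeholders for your own project's names.
gcloud healthcare fhir-stores import gcs <YOUR_STORE_ID> \
    --dataset=<YOUR_DATASET_ID> \
    --location=<YOUR_LOCATION> \
    --gcs-uri="gs://<YOUR_BUCKET>/<FOLDER>/*.ndjson" \
    --content-structure=resource
```

`--content-structure=resource` corresponds to choosing Resource for Content Structure in the console.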

2. Enable APIs and authenticate with gcloud

You can enable the required APIs and verify access using the gcloud CLI. This is often the fastest way to confirm your setup before running code.

  1. Log in

    # Authenticate with your Google account
    gcloud auth login
    
    # Set up Application Default Credentials (ADC)
    gcloud auth application-default login --no-launch-browser
  2. Check or set the current project and project number

    # List all available projects to find your PROJECT_ID
    gcloud projects list
    # Set the quota project for ADC (to handle billing and quotas)
    gcloud auth application-default set-quota-project <YOUR_PROJECT_ID>
    
    # Set the default project for gcloud CLI
    gcloud config set project <YOUR_PROJECT_ID>
    # Get the current project ID and project number
    PROJECT_ID="$(gcloud config get-value project)"
    PROJECT_NUMBER="$(gcloud projects describe "$PROJECT_ID" --format="value(projectNumber)")"
    
    # Print them for confirmation
    echo "$PROJECT_ID"
    echo "$PROJECT_NUMBER"
  3. Enable required APIs

    # Enable the Cloud Healthcare API (for FHIR, DICOM, HL7v2 resources)
    gcloud services enable healthcare.googleapis.com --project="$PROJECT_ID"
    
    # Enable the Cloud Asset API (needed for dataset and store discovery)
    gcloud services enable cloudasset.googleapis.com --project="$PROJECT_ID"
    
    # Enable the Cloud Resource Manager API (needed for project and resource management)
    gcloud services enable cloudresourcemanager.googleapis.com --project="$PROJECT_ID"
    
    # Enable the Service Usage API (needed to enable and check other APIs)
    gcloud services enable serviceusage.googleapis.com --project="$PROJECT_ID"
  4. Automatically discover dataset, FHIR store, and location

    # Find the dataset ID and location
    read DATASET_ID LOCATION <<<$(gcloud asset search-all-resources \
    --scope="projects/$PROJECT_NUMBER" \
    --asset-types="healthcare.googleapis.com/Dataset" \
    --format="value(name.basename(), location)")
    
    echo "LOCATION=$LOCATION"
    echo "DATASET_ID=$DATASET_ID"
    
    # Find the FHIR store ID
    STORE_ID="$(gcloud healthcare fhir-stores list \
    --dataset="$DATASET_ID" --location="$LOCATION" --project="$PROJECT_ID" \
    --format="value(name.basename())")"
    
    echo "STORE_ID=$STORE_ID"
  5. Grant IAM permissions to your user (if not already granted)

    # Get the current logged-in user
    USER="$(gcloud config get-value account)"
    
    # Grant FHIR resource read access
    gcloud healthcare datasets add-iam-policy-binding "$DATASET_ID" \
    --location="$LOCATION" --project="$PROJECT_ID" \
    --member="user:$USER" \
    --role="roles/healthcare.fhirResourceReader"
    
    # Grant FHIR store viewer access
    gcloud healthcare datasets add-iam-policy-binding "$DATASET_ID" \
    --location="$LOCATION" --project="$PROJECT_ID" \
    --member="user:$USER" \
    --role="roles/healthcare.fhirStoreViewer"
  6. Project configuration

    Create a file named config.yml in the project root:

    OPENAI_API_KEY: "your-api-key"
    GEMINI_API_KEY: "your-api-key"
    FHIR_CONFIG:
       PROJECT_ID: "your-gcp-project-id"
       LOCATION: "your-fhir-dataset-location"
       DATASET_ID: "your-dataset-id"
       STORE_ID: "fhir-store-id (usually the same as dataset_id)"
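    For reference, these four FHIR_CONFIG values together identify the Cloud Healthcare API FHIR endpoint the client talks to. A minimal sketch of the documented URL layout, with placeholder values (fhir_client.py does the real work):

    ```python
    # Hypothetical values mirroring the FHIR_CONFIG block in config.yml.
    FHIR_CONFIG = {
        "PROJECT_ID": "my-project",
        "LOCATION": "us-central1",
        "DATASET_ID": "mimic-fhir",
        "STORE_ID": "mimic-fhir",
    }

    # Documented Cloud Healthcare API FHIR endpoint layout.
    fhir_base_url = (
        "https://healthcare.googleapis.com/v1/"
        f"projects/{FHIR_CONFIG['PROJECT_ID']}/"
        f"locations/{FHIR_CONFIG['LOCATION']}/"
        f"datasets/{FHIR_CONFIG['DATASET_ID']}/"
        f"fhirStores/{FHIR_CONFIG['STORE_ID']}/fhir"
    )

    print(fhir_base_url)
    ```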

3. (Optional) Run the scripts to download and prepare the dataset:

If final_dataset/questions_answers_sql_fhir.csv already exists, you can skip this stage.

bash scripts/setup_data.sh
python create_question_answer_dataset.py
python create_question_fhir_dataset.py

🤖 Agent Execution

The project includes several agent implementations:

# Single-turn agents
bash scripts/run_single_turn_request_agent.sh       # Single-turn FHIR RESTful API generation and retrieval → Natural language reasoning
bash scripts/run_single_turn_resource_agent.sh      # Single-turn FHIR resource retrieval → Natural language reasoning
bash scripts/run_single_turn_code_resource_agent.sh # Single-turn FHIR resource retrieval → Code-based reasoning

# Multi-turn agents
bash scripts/run_multi_turn_resource_agent.sh       # Multi-turn/iterative resource retrieval → Natural language reasoning
bash scripts/run_multi_turn_code_resource_agent.sh  # Multi-turn/iterative resource retrieval → Code-based reasoning

To use open-source models locally with vLLM, start the vLLM server and set base_url to http://localhost:<port>/v1.

CUDA_VISIBLE_DEVICES=<gpu_ids> python -m vllm.entrypoints.openai.api_server \
    --model <model> \
    --load-format safetensors \
    --max-model-len 32768 \
    --tensor-parallel-size <num_gpus> \
    --port <port> \
    --enable-auto-tool-choice \
    --tool-call-parser llama3_json
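As a sanity check of the base_url wiring: vLLM's OpenAI-compatible chat endpoint lives at `<base_url>/chat/completions`. A stdlib-only sketch of the request an agent would send (the port and model name below are placeholders):

```python
import json
from urllib.request import Request

# Placeholder base URL matching the vLLM server launched above.
base_url = "http://localhost:8000/v1"

payload = {
    "model": "meta-llama/Llama-3.1-8B-Instruct",  # placeholder model name
    "messages": [{"role": "user", "content": "How many patients are in the store?"}],
}

# OpenAI-style chat-completions request against the local server.
req = Request(
    f"{base_url}/chat/completions",
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)
print(req.full_url)
```

Actually sending the request (e.g. with `urllib.request.urlopen(req)`) requires the vLLM server to be running.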

📊 Evaluation

Run the following command to normalize, evaluate answers, and visualize performance (FHIR resource retrieval recall/precision, answer correctness):

python evaluation_metrics.py --input <agent_output_json_file_path>
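To make the retrieval metrics concrete, here is an illustrative sketch of recall/precision over retrieved FHIR resource IDs; evaluation_metrics.py is the authoritative implementation, and the function and IDs below are hypothetical:

```python
def retrieval_precision_recall(retrieved: set[str], gold: set[str]) -> tuple[float, float]:
    """Precision and recall of an agent's retrieved FHIR resources vs. gold."""
    if not retrieved or not gold:
        return 0.0, 0.0
    hits = len(retrieved & gold)  # resources that are both retrieved and gold
    return hits / len(retrieved), hits / len(gold)

# Example: 3 resources retrieved, 2 of which are among the 4 gold resources.
precision, recall = retrieval_precision_recall(
    {"Patient/1", "Observation/7", "Condition/9"},
    {"Patient/1", "Observation/7", "Observation/8", "Condition/2"},
)
print(precision, recall)  # precision = 2/3, recall = 2/4
```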

Authorship

FHIR-AgentBench is a joint research effort between Verily Life Sciences, Korea Advanced Institute of Science & Technology (KAIST), and Massachusetts Institute of Technology (MIT).
