A simple Python script to download dataset files from FlowRepository.org. FlowRepository's built-in ZIP download is often broken, so this tool downloads each file individually instead.
If you have a FlowRepository dataset ID (like FR-FCM-Z2SS), this script will:
- Log in to FlowRepository with your account
- Find all the files in that dataset
- Download them to a folder on your computer
- Skip files you've already downloaded (safe to re-run if interrupted)
- Python 3.6+ — Download Python if you don't have it
- A FlowRepository account — Create a free account if you don't have one
- Download or clone this repository:

      git clone https://github.com/YOUR_USERNAME/flowRep.git
      cd flowRep

- Create a conda environment and install dependencies:

      conda create -n flowrep python=3.10 -y
      conda activate flowrep
      pip install -r requirements.txt

What is conda? Conda is a tool that creates isolated Python environments so packages don't conflict with each other. If you don't have conda installed, get it from Miniconda (the lightweight option) or Anaconda.
Every time you open a new terminal to use this script, activate the environment first:
conda activate flowrep
Just run the script with no arguments and it will walk you through everything:
python download_flowrepo.py
You'll be prompted for:
- The dataset ID (e.g., FR-FCM-Z2SS)
- Where to save the files
- Your FlowRepository email and password
For more control, you can pass arguments directly:
python download_flowrepo.py FR-FCM-Z2SS
python download_flowrepo.py FR-FCM-Z2SS --output-dir ./my_data
python download_flowrepo.py FR-FCM-Z2SS --email you@example.com
| Option | Description |
|---|---|
| dataset_id | The dataset ID from FlowRepository (e.g., FR-FCM-Z2SS) |
| --output-dir | Folder to save files to (default: uses the dataset ID as folder name) |
| --email | Your FlowRepository email (will be prompted if not provided) |
| --password | Your FlowRepository password (will be prompted if not provided) |
| --max-retries | Number of times to retry a failed download (default: 3) |
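For readers curious how options like these are typically wired up, here is a minimal sketch using Python's standard `argparse` module. It only mirrors the table above; the function names `build_parser` and `resolve_output_dir` are hypothetical, and the actual script may parse its arguments differently.

```python
import argparse


def build_parser():
    """Build a parser that mirrors the options table (illustrative only)."""
    parser = argparse.ArgumentParser(
        description="Download dataset files from FlowRepository."
    )
    # The dataset ID is optional so that running with no arguments
    # can fall back to interactive prompts.
    parser.add_argument("dataset_id", nargs="?", default=None,
                        help="Dataset ID, e.g. FR-FCM-Z2SS")
    parser.add_argument("--output-dir", default=None,
                        help="Folder to save files to (default: the dataset ID)")
    parser.add_argument("--email", default=None,
                        help="FlowRepository email (prompted if omitted)")
    parser.add_argument("--password", default=None,
                        help="FlowRepository password (prompted if omitted)")
    parser.add_argument("--max-retries", type=int, default=3,
                        help="Number of times to retry a failed download")
    return parser


def resolve_output_dir(args):
    """Use --output-dir if given, otherwise the dataset ID as folder name."""
    return args.output_dir or args.dataset_id
```

Calling `build_parser().parse_args(["FR-FCM-Z2SS", "--max-retries", "5"])` would yield a namespace with `dataset_id`, `output_dir`, `email`, `password`, and `max_retries` attributes.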
The first time you enter your credentials, you'll be asked if you want to save them for next time. If you choose yes, they are stored in ~/.flowrepo_credentials.
To clear saved credentials, simply delete that file:
rm ~/.flowrepo_credentials
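As a rough illustration of how saving and clearing credentials can work, the sketch below stores them as JSON in a file readable only by its owner. The JSON layout, the `0o600` permissions, and the function names are assumptions made for this example; the real format of `~/.flowrepo_credentials` is not documented here and may differ.

```python
import json
import os
from pathlib import Path

# Path taken from the text above; the file format below is an assumption.
DEFAULT_PATH = Path.home() / ".flowrepo_credentials"


def save_credentials(email, password, path=DEFAULT_PATH):
    """Write credentials as JSON, creating the file with owner-only access."""
    fd = os.open(str(path), os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        json.dump({"email": email, "password": password}, fh)


def load_credentials(path=DEFAULT_PATH):
    """Return (email, password), or None if the file is missing or unreadable."""
    path = Path(path)
    if not path.exists():
        return None
    try:
        data = json.loads(path.read_text(encoding="utf-8"))
        return data["email"], data["password"]
    except (ValueError, KeyError, TypeError):
        # Corrupt or unexpected content: behave as if nothing was saved.
        return None


def clear_credentials(path=DEFAULT_PATH):
    """Delete the saved credentials; equivalent to the rm command above."""
    try:
        os.remove(str(path))
    except FileNotFoundError:
        pass
```

Whatever the actual format, a file containing a password is worth keeping out of shared folders and version control.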
- Login: The script logs in to FlowRepository using your email and password via their web login form.
- Resolve dataset: It converts the dataset ID (e.g., FR-FCM-Z2SS) to an internal experiment number.
- Find files: It scrapes the download page to find links to each individual file (FCS files, attachments, etc.).
- Download: Each file is downloaded one at a time, with automatic retries on failure. Files that already exist are skipped.
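To make the "find files" step more concrete, here is a small sketch of collecting file links from a page's HTML using only the standard library. The class `LinkCollector`, the function `find_file_links`, and the choice to filter by file suffix are all assumptions for illustration; the markup FlowRepository actually serves, and how the script matches links in it, may be quite different.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href:
            # Turn relative links into absolute URLs.
            self.links.append(urljoin(self.base_url, href))


def find_file_links(html, base_url, suffixes=(".fcs",)):
    """Return unique links whose URL ends with one of the (lowercase) suffixes."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    found = []
    for url in collector.links:
        if url.lower().endswith(tuple(suffixes)) and url not in found:
            found.append(url)
    return found
```

Matching on suffix is the simplest possible filter; it would miss links that carry query strings or that point to attachments with other extensions, so a real scraper would likely match on the link's path or surrounding markup instead.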
"Could not find login form" — FlowRepository may have changed their website. Please open an issue.
"Login may have failed" — Double-check your email and password. Make sure you can log in at flowrepository.org in your browser.
"No files found to download" — The dataset may be empty, private, or the ID may be incorrect. Verify the dataset exists by visiting https://flowrepository.org/id/YOUR-DATASET-ID in your browser.
Downloads keep failing — Try increasing retries: --max-retries 5. If the site is slow, you can just re-run the script — it will skip files that were already downloaded successfully.
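The retry and skip behavior described above can be sketched as follows. This is a simplified illustration, not the script's actual code: `fetch` is a hypothetical callable that takes a URL and returns the file's bytes, `max_retries` is treated here as the total number of attempts, and writing to a temporary `.part` file first is an added assumption so that an interrupted download is never mistaken for a finished one.

```python
import os
import time


def download_with_retries(fetch, url, dest_path, max_retries=3, delay=1.0):
    """Download url to dest_path; return 'skipped' or 'downloaded'."""
    # A non-empty file from an earlier run counts as already downloaded.
    if os.path.exists(dest_path) and os.path.getsize(dest_path) > 0:
        return "skipped"

    tmp_path = dest_path + ".part"
    last_error = None
    for attempt in range(1, max_retries + 1):
        try:
            data = fetch(url)
            with open(tmp_path, "wb") as fh:
                fh.write(data)
            # Rename only after a complete write, so partial files
            # never appear under the final name.
            os.replace(tmp_path, dest_path)
            return "downloaded"
        except Exception as exc:
            last_error = exc
            if attempt < max_retries:
                # Wait a little longer after each failure.
                time.sleep(delay * attempt)
    raise RuntimeError(
        f"failed after {max_retries} attempts: {url}"
    ) from last_error
```

Because completed files are skipped, re-running after a failure only repeats the downloads that did not finish.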