ijustsnapped/flowRep

FlowRepository Dataset Downloader

A simple Python script to download dataset files from FlowRepository.org. FlowRepository's built-in ZIP download is often broken, so this tool downloads each file individually instead.

What does this do?

If you have a FlowRepository dataset ID (like FR-FCM-Z2SS), this script will:

  1. Log in to FlowRepository with your account
  2. Find all the files in that dataset
  3. Download them to a folder on your computer
  4. Skip files you've already downloaded (safe to re-run if interrupted)

Prerequisites

Installation

  1. Download or clone this repository:

    git clone https://github.com/YOUR_USERNAME/flowRep.git
    cd flowRep
    
  2. Create a conda environment and install dependencies:

    conda create -n flowrep python=3.10 -y
    conda activate flowrep
    pip install -r requirements.txt
    

    What is conda? Conda is a tool that creates isolated Python environments so packages don't conflict with each other. If you don't have conda installed, get it from Miniconda (the lightweight option) or Anaconda.

  3. Every time you open a new terminal to use this script, activate the environment first:

    conda activate flowrep
    

Usage

Easy mode (interactive)

Just run the script with no arguments and it will walk you through everything:

    python download_flowrepo.py

You'll be prompted for:

  • The dataset ID (e.g., FR-FCM-Z2SS)
  • Where to save the files
  • Your FlowRepository email and password
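The prompts above could be gathered along the following lines. This is only a sketch: the function name `collect_settings`, the prompt wording, and the fallback to a folder named after the dataset ID when the answer is empty are assumptions, not the script's actual code.

```python
import getpass


def collect_settings(ask=input, ask_secret=getpass.getpass):
    """Prompt for the values that interactive mode asks about.

    `ask` and `ask_secret` are injectable so the function can be exercised
    without a terminal; by default they read from the keyboard.
    """
    dataset_id = ask("Dataset ID (e.g., FR-FCM-Z2SS): ").strip()
    # Assumption: an empty answer means "use a folder named after the dataset".
    output_dir = ask(f"Output folder [{dataset_id}]: ").strip() or dataset_id
    email = ask("FlowRepository email: ").strip()
    # getpass.getpass hides the password while it is typed.
    password = ask_secret("FlowRepository password: ")
    return {
        "dataset_id": dataset_id,
        "output_dir": output_dir,
        "email": email,
        "password": password,
    }
```

Calling `collect_settings()` with no arguments prompts on the terminal. Passing replacement callables makes it easy to check the behaviour in a test.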

Command-line mode

For more control, you can pass arguments directly:

    python download_flowrepo.py FR-FCM-Z2SS
    python download_flowrepo.py FR-FCM-Z2SS --output-dir ./my_data
    python download_flowrepo.py FR-FCM-Z2SS --email you@example.com

All options

Option          Description
dataset_id      The dataset ID from FlowRepository (e.g., FR-FCM-Z2SS)
--output-dir    Folder to save files to (default: uses the dataset ID as folder name)
--email         Your FlowRepository email (will be prompted if not provided)
--password      Your FlowRepository password (will be prompted if not provided)
--max-retries   Number of times to retry a failed download (default: 3)
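For readers curious how such an interface is typically declared, here is a minimal `argparse` sketch that mirrors the options listed above. The option names and defaults come from this README. The help strings, the `build_parser` name, and making `dataset_id` optional (so that running with no arguments can fall back to interactive mode) are assumptions about the real script.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(
        prog="download_flowrepo.py",
        description="Download dataset files from FlowRepository.",
    )
    # nargs="?" makes the positional optional, leaving room for interactive mode.
    parser.add_argument("dataset_id", nargs="?", help="Dataset ID, e.g. FR-FCM-Z2SS")
    parser.add_argument("--output-dir", default=None, help="Folder to save files to")
    parser.add_argument("--email", default=None, help="FlowRepository email")
    parser.add_argument("--password", default=None, help="FlowRepository password")
    parser.add_argument("--max-retries", type=int, default=3,
                        help="Number of times to retry a failed download")
    return parser


# Parse an explicit argument list instead of sys.argv so the example is repeatable.
args = build_parser().parse_args(["FR-FCM-Z2SS", "--max-retries", "5"])
# The output folder defaults to the dataset ID when --output-dir is not given.
output_dir = args.output_dir or args.dataset_id
print(args.dataset_id, output_dir, args.max_retries)  # FR-FCM-Z2SS FR-FCM-Z2SS 5
```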

Saved credentials

The first time you enter your credentials, you'll be asked if you want to save them for next time. If you choose yes, they are stored in ~/.flowrepo_credentials.

To clear saved credentials, simply delete that file:

    rm ~/.flowrepo_credentials
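This README does not describe the format of the credentials file. The sketch below shows one way a script could save and reload credentials. It assumes a small JSON file holding the password in plain text and created with owner-only permissions; the function names and the JSON layout are hypothetical and may not match what the script does.

```python
import json
import os
from pathlib import Path

# Path taken from the README; the JSON layout below is an assumption.
CREDENTIALS_PATH = Path.home() / ".flowrepo_credentials"


def save_credentials(email, password, path=CREDENTIALS_PATH):
    # Mode 0o600 (owner read/write only) applies when the file is first created.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump({"email": email, "password": password}, handle)


def load_credentials(path=CREDENTIALS_PATH):
    """Return (email, password), or None if nothing usable is saved."""
    try:
        with open(path, encoding="utf-8") as handle:
            data = json.load(handle)
    except (FileNotFoundError, json.JSONDecodeError):
        return None
    if not isinstance(data, dict):
        return None
    return data.get("email"), data.get("password")
```

Under this design, deleting the file simply makes `load_credentials` return `None`, so the script would prompt again on the next run.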

How it works

  1. Login — The script logs in to FlowRepository using your email and password via their web login form
  2. Resolve dataset — It converts the dataset ID (e.g., FR-FCM-Z2SS) to an internal experiment number
  3. Find files — It scrapes the download page to find links to each individual file (FCS files, attachments, etc.)
  4. Download — Each file is downloaded one at a time, with automatic retries on failure. Files that already exist are skipped.
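Step 3 can be illustrated with the standard library alone. The sketch below collects every link from a page and keeps those that look like data files. The HTML structure of FlowRepository's download page is not documented here, so the `.fcs` suffix filter, the `find_file_links` name, and the sample markup are assumptions for illustration only.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the absolute URL of every <a href="..."> on a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href:
            # Resolve relative links against the page they were found on.
            self.links.append(urljoin(self.base_url, href))


def find_file_links(html, base_url, suffixes=(".fcs",)):
    """Return unique links whose URL ends with one of `suffixes` (case-insensitive)."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    unique = []
    for url in collector.links:
        if url.lower().endswith(suffixes) and url not in unique:
            unique.append(url)
    return unique


# Hypothetical page fragment, not real FlowRepository markup.
sample = '<a href="/files/a.fcs">a</a> <a href="/about">About</a> <a href="/files/a.fcs">again</a>'
print(find_file_links(sample, "https://example.org/experiments/1/download"))
```

Attachments with other extensions would need extra entries in `suffixes`, or a filter based on the link path instead of the file extension.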

Troubleshooting

"Could not find login form" — FlowRepository may have changed their website. Please open an issue.

"Login may have failed" — Double-check your email and password. Make sure you can log in at flowrepository.org in your browser.

"No files found to download" — The dataset may be empty, private, or the ID may be incorrect. Verify the dataset exists by visiting https://flowrepository.org/id/YOUR-DATASET-ID in your browser.

Downloads keep failing — Try increasing retries: --max-retries 5. If the site is slow, you can just re-run the script — it will skip files that were already downloaded successfully.
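The retry and skip behaviour that makes re-running safe can be sketched as follows. This is not the script's code. The `fetch` callable (anything that takes a URL and returns bytes), the `.part` temporary file, the growing delay between attempts, and treating `max_retries` as the total number of attempts are all assumptions made for the example.

```python
import time
from pathlib import Path


def download_with_retries(url, dest, fetch, max_retries=3, delay=1.0):
    """Return "skipped" or "downloaded"; raise RuntimeError once attempts run out."""
    dest = Path(dest)
    # A non-empty file from an earlier run is treated as already downloaded.
    if dest.exists() and dest.stat().st_size > 0:
        return "skipped"
    dest.parent.mkdir(parents=True, exist_ok=True)
    # Write to a temporary name first so an interrupted download
    # never leaves a partial file under the final name.
    partial = dest.with_name(dest.name + ".part")
    last_error = None
    for attempt in range(1, max_retries + 1):
        try:
            partial.write_bytes(fetch(url))
            partial.replace(dest)
            return "downloaded"
        except OSError as error:  # urllib's URLError is a subclass of OSError
            last_error = error
            if attempt < max_retries:
                time.sleep(delay * attempt)  # wait a little longer each time
    raise RuntimeError(f"giving up on {url} after {max_retries} attempts") from last_error
```

Because finished files are only ever renamed into place, a second run sees them as complete and skips them, which matches the re-run advice above.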
