This repository provides scripts to run Hail workflows on the Imperial College High Performance Computing (HPC) cluster, specifically for:
- Converting multiple single-sample gVCF files into a multi-sample VDS (Variant Dataset) using the Hail Combiner.
Clone the repository and move into it:

```bash
git clone https://your.repo.url/here.git
cd Hail  # or the name of your cloned directory
```

This repository uses a predefined Conda environment (`conda_env.yml`) to ensure all dependencies are consistent. The environment must be created on the login node of the Imperial HPC.
See the Imperial HPC Conda guide if needed.
```bash
# Enable conda in your shell (adjust the path if needed)
eval "$(~/anaconda3/bin/conda shell.bash hook)"

# Remove existing environment (if it exists)
conda env remove -n hail

# Create the environment from the provided file
conda env create --file conda_env.yml
```
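Before submitting any jobs, it may be worth confirming on the login node that the environment built correctly. This quick check assumes the environment is named `hail` as above:

```bash
# Activate the new environment and confirm that Hail imports and reports its version.
conda activate hail
python -c "import hail as hl; print(hl.version())"
```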
This section describes how to convert a list of single-sample gVCF files into a Hail Variant Dataset (VDS).

Create a text file where each line is the absolute path to a `.gvcf.gz` file you want to include in the multi-sample dataset.
Example (`my_gvcf_list.txt`):

```
/rds/general/project/example/data/sample01.gvcf.gz
/rds/general/project/example/data/sample02.gvcf.gz
```
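If all your gVCFs sit under one directory, a `find` one-liner can generate this list for you. The directory and file suffix below mirror the example paths and should be adjusted to your project:

```bash
# Collect the absolute path of every gVCF under the data directory into the list file.
find /rds/general/project/example/data -name '*.gvcf.gz' | sort > my_gvcf_list.txt
```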
Edit the `set_variables.sh` file to define:
- File paths (gVCF list, output VDS, logs, etc.)
- Runtime parameters (threads, memory, etc.)
Each variable is documented in the file to help guide configuration.
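For orientation, here is a hypothetical sketch of what such a file might contain; the actual variable names and defaults are the ones documented in `set_variables.sh` itself:

```bash
# Hypothetical example values; the real variable names are documented
# in set_variables.sh and may differ.
GVCF_LIST=/rds/general/project/example/my_gvcf_list.txt    # list of input gVCFs (see above)
OUTPUT_VDS=/rds/general/project/example/output/cohort.vds  # destination for the combined VDS
LOG_DIR=/rds/general/project/example/logs                  # directory for job and Hail logs
N_THREADS=8                                                # CPU threads requested for the job
MEMORY=64gb                                                # memory requested for the job
```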
Use the provided submission script to run the pipeline. You must pass the absolute path to your `set_variables.sh` file:

```bash
bash scripts/submit_gVCF_to_VDS.sh /full/path/to/set_variables.sh
```
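Assuming the script submits a batch job to the cluster's PBS scheduler (as its name suggests), you can check the job's status afterwards:

```bash
# List your queued and running jobs on the Imperial HPC.
qstat -u $USER
```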