benjacobs123456/cam_eu_scRNAseq

Single-cell RNA seq analysis of Multiple Sclerosis CSF & blood

Preamble

  • This directory contains scripts used to analyse scRNAseq data from the Cambridge and TUM cohorts.
  • Single-cell sequencing data were generated using 10X 5' and VDJ single-cell sequencing technology from CSF and PBMC samples from people with Multiple Sclerosis and from controls with other neurological diseases.
  • All analyses were run on the Cambridge Slurm HPC.
  • Unless specified, all scripts were run in R/4.0.3 or R/4.1.0.
  • You can access the paper in Cell Reports Medicine here.
  • Raw data (Seurat objects containing 5' gene expression and VDJ receptor sequencing data for B and T cells) can be downloaded via Zenodo here.
  • Code contributors: Ben Jacobs and Christiane Gasperi
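Because the scripts were run under specific R versions, it can help to check the active version before submitting jobs. The following is a minimal sketch, not part of the repository: the function name `r_version_ok` is hypothetical, and it only accepts the two versions named above, whereas some scripts may need a different one.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only for the R versions named in this README.
r_version_ok() {
  case "$1" in
    4.0.3|4.1.0) return 0 ;;
    *) return 1 ;;
  esac
}

# Only query R if Rscript is on the PATH (e.g. after loading the R module).
if command -v Rscript >/dev/null 2>&1; then
  version="$(Rscript -e 'cat(as.character(getRversion()))')"
  if r_version_ok "$version"; then
    echo "R ${version} matches a version used for these scripts"
  else
    echo "warning: R ${version} differs from R/4.0.3 and R/4.1.0" >&2
  fi
fi
```

A mismatch is reported as a warning rather than an error, since the README only says "unless specified".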

Code

Deconvolution and basic QC

sbatch gex_deconvolution_qc_step1.sh

This script runs Rscript gex_deconvolution_qc.R, which performs the following QC steps on each batch of GEX data:

  • Filtering by RNA count & MT%
  • Ambient RNA correction with SoupX
  • Doublet identification
  • Per-batch normalisation with SCTransform

Integration

sbatch integration_icelake.sh

This script runs Rscript integration.R, which integrates the post-QC datasets using Harmony.

UMAP

sbatch umap_icelake.sh

This script runs Rscript umap.R which performs UMAP and Louvain clustering across a range of parameters.
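The QC, integration and UMAP steps run in that order, so one way to submit them together is to chain the jobs with Slurm dependencies. This is a sketch under assumptions rather than how the authors ran the pipeline: the function `submit_chain` and the `DRY_RUN` variable are hypothetical, while `--parsable` and `--dependency=afterok:` are standard `sbatch` options. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Sketch: submit stages so that each one waits for the previous job to succeed.
# Prints the sbatch commands by default; set DRY_RUN=0 to actually submit.
# The body runs in a subshell, so its variables do not leak into the caller.
submit_chain() (
  prev=""
  n=0
  for script in "$@"; do
    dep=""
    # From the second stage on, require the previous job to exit with code 0.
    if [ -n "$prev" ]; then
      dep="--dependency=afterok:${prev}"
    fi
    if [ "${DRY_RUN:-1}" = "1" ]; then
      n=$((n + 1))
      jobid="JOB${n}"   # placeholder ID, used only in the dry-run output
      echo "sbatch --parsable${dep:+ $dep} ${script}"
    else
      # --parsable prints "jobid" or "jobid;cluster"; keep the part before ';'.
      jobid="$(sbatch --parsable $dep "$script")" || exit 1
      jobid="${jobid%%;*}"
      echo "submitted ${script} as job ${jobid}"
    fi
    prev="$jobid"
  done
)

submit_chain gex_deconvolution_qc_step1.sh integration_icelake.sh umap_icelake.sh
```

With `afterok`, a failed QC job leaves the later stages pending instead of running them on incomplete input.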

Cluster biomarkers

sbatch cluster_biomarkers_icelake.sh

This script runs Rscript find_cluster_biomarkers_and_update_pheno.R, which does the following:

  • Cleans phenotypes
  • Makes some plots exploring clustering
  • Compares annotations of cell types across different methods (SingleR, Azimuth, Celltypist)
  • Calculates cluster-specific biomarkers

CellTypist annotation

sbatch celltypist.sh
  • Splits clusters with Rscript celltypist_prep.R, then runs CellTypist on each cluster.

Update cluster IDs

sbatch update_clusters.sh
  • Runs Rscript update_cluster_labels.R to update cluster IDs.

DE & DA

To run DE and DA using the broad clusters:

sbatch de_icelake.sh
  • Runs DE tests with Rscript de_da_tests_phenotypes.R
  • Summary plots are then made with Rscript de_summary_plots.R

GSEA

Then to run GSEA on the broad clusters with those DE results:

sbatch gsea.sh

Pathway analysis

sbatch pathway_analysis.sh

Cell-cell communication

sbatch ccc_per_sample.sh
  • Runs LIANA on a per-sample basis
  • Rscript ccc_overall.R then combines and explores these results

Prepare for Immcantation/Dandelion

These scripts prepare the VDJ data for QC with dandelion:

Rscript dandelion_preparation.R TCR
Rscript dandelion_preparation.R BCR
Rscript dandelion_preparation2.R TCR
Rscript dandelion_preparation2.R BCR
Rscript make_dandelion_metafile.R
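The five commands above follow a regular pattern (two preparation scripts, each run for TCR and BCR, then the metafile script), so they can also be written as a loop. A minimal sketch: the function `prepare_dandelion` and the `DRY_RUN` variable are hypothetical, and the commands are only printed unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Sketch: run the dandelion preparation scripts in the order listed above.
# Prints the commands by default; set DRY_RUN=0 to actually run them.
prepare_dandelion() (
  # Print the command in dry-run mode, otherwise run it and stop on failure.
  run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
      echo "$*"
    else
      "$@" || exit 1
    fi
  }
  for script in dandelion_preparation.R dandelion_preparation2.R; do
    for receptor in TCR BCR; do
      run Rscript "$script" "$receptor"
    done
  done
  # The metafile is built last, after both receptor types are prepared.
  run Rscript make_dandelion_metafile.R
)

prepare_dandelion
```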

And then to run dandelion pre-processing:

sbatch dandelion_bcr.sh
sbatch dandelion_tcr.sh

Filter GEX based on VDJ QC

These scripts then filter the GEX data based on the QC'd VDJ data:

sbatch dandelion_filtering_tcr.sh
sbatch dandelion_filtering.sh

Analyse VDJ data

Rscript bcr_analysis.R
Rscript tcr_analysis.R

eQTL analysis

./eQTL_analysis/MasterScript.sh contains the eQTL pipeline and refers to scripts in ./eQTL_analysis/.
