ProteO-Linker is a web-based protein expression analysis tool. It uses a Shiny interface built in R to examine data generated on Olink's proteomics platform. Users can upload their data and generate custom plots of protein expression.
Olink's Proximity Extension Assay (PEA) uses a high-throughput panel of antibodies conjugated to oligonucleotide barcodes to target proteins of interest. Antibody binding events are counted per barcode using Illumina NGS to estimate the abundance of each protein in a sample, giving a measure of target protein expression.
-
Clone the repository
git clone https://github.com/collaborativebioinformatics/Proteolinker.git
cd Proteolinker
-
Install dependencies
Make sure you have R (≥ 4.0) and Shiny installed.
Install required packages from within R:
install.packages(c("shiny", "ggplot2", "dplyr", "tidyr", "readr", "OlinkAnalyze", "gridExtra", "purrr", "DT", "plotly", "BiocManager"))
BiocManager::install("ComplexHeatmap")  # ComplexHeatmap is a Bioconductor package; grid already ships with R
-
Launch the app
From R or RStudio, run:
shiny::runApp("app.R")
-
Upload data
- Select your Olink NPX or parquet file(s)
- The app will validate input files and provide a summary of Sample/Assay failures or warnings.
- Select a LOD filtering algorithm (optional)
- Upload existing differential expression (DE) results, or a sample manifest on which to base the DE analysis
-
Explore results
- Run pathway enrichment analysis
- Generate datafiles and various plots
- Download plots (PNG) or results (CSV/PDF)
The goal of ProteO-Linker is to provide a straightforward workflow for examining protein expression data generated by Olink assays. The tool performs quality checks, applies filtering, and supports downstream analysis such as differential expression and pathway enrichment. Users can visualize results through multiple plot types and export figures or tables for reporting. The objective is to lower the barrier for exploratory analysis of Olink data while keeping the workflow modular and easy to adapt.
ProteO-Linker implements a stepwise pipeline:
-
Quality Control (QC) of Input Files
- For each uploaded parquet file, two checks are performed:
- Verify that the file is complete and correctly formatted
- Report failed or warned samples/assays
- The app reports a QC status summary for each file
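As a rough illustration of the second check, the sketch below tallies passed, warned, and failed rows per sample using base R only. The column names `SampleID`, `Assay`, and `QC_Warning`, the status labels `PASS`/`WARN`/`FAIL`, and the helper `summarise_qc` are assumptions made for this example. They are not taken from the ProteO-Linker source, and real Olink exports may name these fields differently.

```r
# Hypothetical long-format NPX table: one row per sample/assay pair.
npx <- data.frame(
  SampleID   = c("S1", "S1", "S2", "S2", "S3", "S3"),
  Assay      = c("IL6", "TNF", "IL6", "TNF", "IL6", "TNF"),
  QC_Warning = c("PASS", "PASS", "WARN", "PASS", "FAIL", "FAIL"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: count rows per sample in each QC state.
summarise_qc <- function(df) {
  # Fixing the levels keeps a column for every state, even if it never occurs.
  status <- factor(df$QC_Warning, levels = c("PASS", "WARN", "FAIL"))
  counts <- table(df$SampleID, status)
  data.frame(
    SampleID = rownames(counts),
    n_pass   = as.integer(counts[, "PASS"]),
    n_warn   = as.integer(counts[, "WARN"]),
    n_fail   = as.integer(counts[, "FAIL"]),
    stringsAsFactors = FALSE
  )
}

qc_summary <- summarise_qc(npx)
print(qc_summary)
```

The same tabulation could be grouped by `Assay` instead of `SampleID` to flag problematic assays rather than problematic samples.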
-
LOD Filtering
- Two filtering methods are available:
- Fixed LOD (values provided by Olink)
- Standard deviation–based LOD (customizable, current default is 3 SD from mean)
- If filtering is applied, a volcano plot of all 6 sample controls is generated, showing which proteins fall above and below the LOD
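To make the standard deviation-based option concrete, here is a minimal sketch in base R. It assumes the LOD for each assay is the mean of that assay's control readings plus `n_sd` standard deviations. This definition, the example values, and the column names are assumptions for illustration, and the app's actual calculation may differ.

```r
# Hypothetical control readings (NPX) for two assays.
controls <- data.frame(
  Assay = rep(c("IL6", "TNF"), each = 3),
  NPX   = c(1, 2, 3, 0, 0.5, 1),
  stringsAsFactors = FALSE
)

n_sd <- 3  # number of SDs above the mean; 3 is the default named above

# Assumed definition: per-assay LOD = mean + n_sd * SD of the control readings.
ctrl_mean <- tapply(controls$NPX, controls$Assay, mean)
ctrl_sd   <- tapply(controls$NPX, controls$Assay, sd)
lod       <- ctrl_mean + n_sd * ctrl_sd

# Hypothetical sample measurements to be flagged against the LOD.
samples <- data.frame(
  SampleID = c("S1", "S1", "S2", "S2"),
  Assay    = c("IL6", "TNF", "IL6", "TNF"),
  NPX      = c(6.2, 1.5, 4.0, 2.5),
  stringsAsFactors = FALSE
)

# Look up each row's LOD by assay name and flag values above it.
samples$LOD       <- as.numeric(lod[samples$Assay])
samples$above_lod <- samples$NPX > samples$LOD
print(samples)
```

Exposing `n_sd` as a parameter mirrors the "customizable" threshold described in the list. A fixed-LOD variant would replace the computed `lod` vector with the values supplied by Olink.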
-
Differential Expression (DE) Analysis
- Users may upload a list of differentially expressed proteins from a prior DE analysis
- Alternatively, the app can run the DE analysis itself; in that case users upload a manifest defining the comparison groups (e.g., male vs. female, treated vs. untreated)
- DE analysis will use a t-test
- DE results can be visualized with volcano plots (protein name, fold-change, p-value)
- These outputs will be used as input for pathway enrichment
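The manifest-driven path can be sketched as a per-protein two-sample t-test. The example below uses base R's `t.test` (Welch's variant by default) and computes the log2 fold change as a difference of group means, on the assumption that NPX values are on a log2 scale. The manifest layout, group labels, example values, and the helper `test_protein` are hypothetical and may not match the app's implementation.

```r
# Hypothetical manifest mapping each sample to a comparison group.
manifest <- data.frame(
  SampleID = paste0("S", 1:6),
  Group    = rep(c("treated", "untreated"), each = 3),
  stringsAsFactors = FALSE
)

# Hypothetical NPX matrix: rows are proteins, columns are samples.
npx <- rbind(
  IL6 = c(8.1, 7.9, 8.3, 5.0, 5.2, 4.9),
  TNF = c(3.0, 3.4, 2.9, 3.1, 3.3, 3.0)
)
colnames(npx) <- manifest$SampleID

# Hypothetical helper: test one protein and return volcano-plot coordinates.
test_protein <- function(values, groups) {
  a  <- values[groups == "treated"]
  b  <- values[groups == "untreated"]
  tt <- t.test(a, b)  # Welch two-sample t-test (R's default)
  # NPX is assumed to be log2-scaled, so a difference of means is a log2 fold change.
  c(log2_fc = mean(a) - mean(b), p_value = tt$p.value)
}

# apply() returns one column per protein; transpose so proteins are rows.
de <- as.data.frame(t(apply(npx, 1, test_protein, groups = manifest$Group)))
de$neg_log10_p <- -log10(de$p_value)
print(de)
```

The `log2_fc` and `neg_log10_p` columns are the x and y coordinates of a volcano plot. With many proteins, a multiple-testing correction such as `p.adjust(de$p_value, method = "BH")` would normally be applied before calling proteins significant.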
-
Pathway Enrichment
- Enrichment analysis highlights pathways associated with significant proteins
- Enrichment will use GSEA; additional methods (FDR, Fisher’s test) are planned for future development
-
Visualization
- Histograms/heatmaps of protein expression linked to KEGG, GO terms, Reactome, or all three
- Interactive volcano plots for DE analysis
- Volcano plots visualizing LOD filtering results
-
Outputs
- PNG plots for visualization
- CSV files of selected proteins, samples, or enriched pathways
- Future development: optional PDF report summarizing results
We plan to enhance the differential expression analysis by incorporating multivariate approaches and expanding the selection of pathway enrichment methods. More broadly, this is one of several analytical pipelines we aim to develop further.
Additional ideas include:
- Multi-omics integration: Combine proteomics with RNA-Seq or genomics data to uncover regulatory mechanisms.
- Protein-protein interaction (PPI) network analysis: Visualize and analyze protein interaction networks to identify hub proteins that may serve as key regulators within biological pathways.
- Biomarker signature discovery: Apply machine learning techniques to identify panels of proteins that can predict or diagnose disease states or treatment responses.
- Kimberly Walker, HGSC-BCM (Team Leader)
- Tomilola Aderupoko, Hood College, MD
- Vijetha Balakundi, McLean Hospital, Boston | Broad Institute of MIT and Harvard, Cambridge
- Neda Ghohabi Esfahani, Department of Bioengineering, Northeastern University, Boston, MA
- Michael Muchow, DMV Petri Dish
- Aniket Naik, University of Pittsburgh
- Qiaoyan Wang, HGSC-BCM
| Abbreviation | Definition |
|---|---|
| DE | Differential Expression |
| FDR | False Discovery Rate |
| GO | Gene Ontology |
| GSEA | Gene Set Enrichment Analysis |
| KEGG | Kyoto Encyclopedia of Genes and Genomes |
| LOD | Limit of Detection |
| LOQ | Limit of Quantification |
| NGS | Next-Generation Sequencing |
| NPX | Normalized Protein eXpression |
| PEA | Proximity Extension Assay |
| PPI | Protein-Protein Interaction |
| SD | Standard Deviation |
This software was created as part of an NCBI codeathon, a hackathon-style event focused on rapid innovation. While we encourage you to explore and adapt this code, please be aware that NCBI does not provide ongoing support for it.
