Merged
23 commits
fa354dd
15 raw data - update from Rob Moritz
jessegmeyerlab Oct 5, 2023
583db4d
Update metadata.yaml
jessegmeyerlab Oct 5, 2023
b7af9a3
Update 01.abstract.md
jessegmeyerlab Oct 5, 2023
c372736
Update 02.introduction.md
jessegmeyerlab Oct 6, 2023
591b916
Update 03.biochemistry-basics.md
jessegmeyerlab Oct 6, 2023
c1f1b85
Update 15.raw-data-analysis.md
jessegmeyerlab Oct 6, 2023
e17ba73
Update 02.introduction.md
jessegmeyerlab Oct 6, 2023
db33d76
Update 04.experiment-types.md
jessegmeyerlab Oct 6, 2023
47dc332
Update 05.protein-extraction.md
jessegmeyerlab Oct 6, 2023
9b74ff5
Update 05.protein-extraction.md
jessegmeyerlab Oct 6, 2023
0eb7a95
Update 06.proteolysis.md
jessegmeyerlab Oct 6, 2023
1402f7f
Update 07.peptide-quantification.md
jessegmeyerlab Oct 6, 2023
104f016
Update 08.enrichment.md
jessegmeyerlab Oct 6, 2023
5abfbde
Update 09.peptide-purification.md
jessegmeyerlab Oct 6, 2023
771c20c
Update 10.liquid-chromatography.md
jessegmeyerlab Oct 6, 2023
11cdd6b
Update 11.peptide-ionization.md
jessegmeyerlab Oct 6, 2023
dfdc619
Update 12.mass-spectrometers.md
jessegmeyerlab Oct 6, 2023
47fa660
Update 13.Peptide-Fragmentation.md
jessegmeyerlab Oct 6, 2023
daaca50
Update 14.Data-Acquisition.md
jessegmeyerlab Oct 6, 2023
fb8ab20
Update 14.Data-Acquisition.md
jessegmeyerlab Oct 6, 2023
b9a83f4
Update and rename 17biological-interpretation.md to 17.biological-int…
jessegmeyerlab Oct 6, 2023
7462213
Update metadata.yaml
jessegmeyerlab Oct 6, 2023
7df17d5
Update 11.peptide-ionization.md
jessegmeyerlab Oct 6, 2023
4 changes: 2 additions & 2 deletions content/01.abstract.md
@@ -1,8 +1,8 @@
## Abstract {.page_break_before}

Proteomics is the large scale study of protein structure and function from biological systems.
Proteomics is the large scale study of protein structure and function from biological systems through protein identification and quantification.
"Shotgun proteomics" or "bottom-up proteomics" is the prevailing strategy, in which proteins are hydrolyzed into peptides that are analyzed by mass spectrometry.
Proteomics studies can be applied to diverse studies ranging from simple protein identification to studies of protein-protein interactions, absolute and relative protein quantification, post-translational modifications, and protein stability.
Proteomics studies can be applied to diverse studies ranging from simple protein identification to studies of proteoforms, protein-protein interactions, absolute and relative protein quantification, post-translational modifications, and protein stability.
To enable this range of different experiments, there are diverse strategies for proteome analysis.
The nuances of how proteomic workflows differ may be difficult to understand for new practitioners.
Here, we provide a comprehensive tutorial of different proteomics methods.
30 changes: 19 additions & 11 deletions content/02.introduction.md
@@ -1,14 +1,19 @@
## Introduction {.page_break_before}

Proteomics is the large scale study of protein structure and function.
Proteomics is the large-scale study of protein structure and function.
Proteins are translated from mRNAs that are transcribed from the genome.
Although the genome encodes potential cellular functions and states, the study of proteins is necessary to truly understand biology.
Currently, proteomic studies are facilitated by mass spectrometry, although alternative methods are being developed.
Although the genome encodes potential cellular functions and states, the study of proteins in all their forms is necessary to truly understand biology.

Currently, proteomics can be performed with various methods.
Alternative methods based on affinity interactions of antibodies or DNA aptamers have been developed, namely Somascan and Olink.
There are also nascent methods such as nanopores that are under development and not yet applicable to whole proteomes.
Another approach uses parallel immobilization of peptides with total internal reflection microscopy and sequential Edman degradation [@DOI:10.1038/nbt.4278].
However, by far the most common method for proteomics is based on mass spectrometry with liquid chromatography.

Modern proteomics started around the year 1990 with the introduction of soft ionization methods that enabled, for the first time, transfer of large biomolecules into the gas phase without destroying them [@DOI:10.1126/science.2675315; @DOI:10.1002/rcm.1290020802].
Shortly afterward, the first computer algorithm for matching peptides to a database was introduced [@PMID:24226387].
Another major milestone, which allowed identification of over 1000 proteins, was actually an improvement to chromatography [@DOI:10.1021/ac010617e].
As the volume of data exploded, methods for statistical analysis transitioned use from the wild west to modern informatics based on statistical models [@DOI:10.1021/ac0341261] and the false discovery rate [@DOI:https://doi.org/10.1038/nmeth1019].
As the volume of data exploded, methods for statistical analysis transitioned from the wild west to modern informatics based on statistical models [@DOI:10.1021/ac0341261] and the false discovery rate [@DOI:10.1038/nmeth1019].
<!-- Todo: figure 1: major milestones in proteomics technology since 1990 -->

Two strategies of mass spectrometry-based proteomics differ fundamentally by whether proteins are cleaved into peptides before analysis: "top-down" and "bottom-up".
@@ -22,17 +27,20 @@ However, due to myriad analytical challenges, the depth of protein coverage that
In this tutorial we focus on the bottom-up proteomics workflow.
The most common version of this workflow is generally comprised of the following steps.
First, proteins in a biological sample must be extracted.
Usually this is done by denaturing and solubilizing the proteins while disrupting DNA and tissue.
Next, proteins are hydrolyzed into peptides, usually using a protease like trypsin.
Peptides from proteome hydrolysis must be purified.
Most often this is done with reversed phase chromatography cartridges or tips.
The peptides are then almost always separated by liquid chromatography before they are ionized and introduced into a mass spectrometer.
Usually this is done by denaturing and solubilizing the proteins while mechanically disrupting DNA and tissue to minimize interference in the analysis procedures.
Next, proteins are hydrolyzed into peptides, usually using a protease like trypsin, which produces basic C-terminal amino acids to aid in fragment ion series production during tandem mass spectrometry.
Peptides from proteome hydrolysis must be purified; most often this is done with reversed phase chromatography cartridges or tips.
The peptides are then almost always separated by liquid chromatography before they are ionized and introduced into a mass spectrometer, although recent reports describe LC-free proteomics by direct infusion [@DOI:10.1038/s41592-020-00999-z; @DOI:10.1021/acs.analchem.2c02249; @DOI:10.1101/2023.06.26.546628].
The mass spectrometer then collects precursor and fragment ion data from those peptides.
The data analysis is usually the rate limiting step.
Peptides must be identified, and proteins are inferred and quantities are assigned.
Peptides must be identified, and proteins are inferred, and quantities are assigned.
Changes in proteins across conditions are determined with statistical tests, and results must be interpreted in the context of the relevant biology.
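
As a hedged illustration of the proteolysis step described above, the following minimal Python sketch performs an in-silico tryptic digest. It assumes the commonly used simplification that trypsin cleaves C-terminal to lysine (K) or arginine (R) except when the next residue is proline (P). The function name `tryptic_digest`, the `missed_cleavages` parameter, and the example sequence are illustrative assumptions rather than part of the manuscript.

```python
def tryptic_digest(sequence, missed_cleavages=0):
    """Return peptides from a simplified in-silico tryptic digest.

    Assumed rule: cleave after K or R unless the next residue is P.
    """
    fragments = []
    start = 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            fragments.append(sequence[start:i + 1])
            start = i + 1
    # Keep any C-terminal remainder that does not end in K or R.
    if start < len(sequence):
        fragments.append(sequence[start:])

    # Join adjacent fragments to model up to `missed_cleavages` skipped sites.
    peptides = []
    for i in range(len(fragments)):
        last = min(i + missed_cleavages, len(fragments) - 1)
        for j in range(i, last + 1):
            peptides.append("".join(fragments[i:j + 1]))
    return peptides


# Hypothetical example sequence; the R in "QRP" is not cleaved because P follows it.
print(tryptic_digest("MKTAYIAKQRPLVR"))  # ['MK', 'TAYIAK', 'QRPLVR']
```

Every peptide produced this way ends in K or R, apart from a possible C-terminal remainder, which is the basic C-terminal residue property mentioned above.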

There are many variations on this workflow. The wide variety of experimental goals that are achievable with proteomics technology leads to a wide variety of potential proteomics workflows. Even choice is important and every choice will affect the results. In this tutorial, we cover all of the required steps in detail to serve as a tutorial for new proteomics practioners. There are 16 sections in total:
There are many variations to this workflow.
The wide variety of experimental goals that are achievable with proteomics technology leads to a wide variety of potential proteomics workflows.
Every choice is important, and every choice will affect the results.
In this tutorial, we cover all the required steps in detail to serve as a comprehensive overview for new proteomics practitioners.
There are 16 sections in total:

1. Biochemistry basics
2. Types of experiments
14 changes: 8 additions & 6 deletions content/03.biochemistry-basics.md
@@ -5,9 +5,10 @@
Proteins are large biomolecules or biopolymers made up of amino acids which are linked by peptide bonds.
They perform various functions in living organisms ranging from having structural roles to functional involvement in cellular signaling and the catalysis of chemical reactions (enzymes).
Proteins are made up of 20 different amino acids (not counting pyrrolysine and selenocysteine, which only occur in specific organisms) and their sequence is encoded in their corresponding genes.
The human genome encodes more than 20,000 different proteins.
Each protein is present at a different abundances.
Previous studies have shown that the concentration range of proteins can span over a range of at least seven orders of magnitude to up to 20 000 000 copies per cell and that their distribution is tissue-specific [@DOI:10.1038/msb.2011.82;@DOI:10.1016/j.cell.2020.08.036].
The human genome encodes approximately 19,778 predicted canonical proteins [@PMID:36318223].
Each protein is present at a different abundance depending on the cell type.
Previous studies have shown that the concentration range of proteins can span over a range of at least seven orders of magnitude to up to 20 000 000 copies per cell, and that their distribution is tissue-specific [@DOI:10.1038/msb.2011.82;@DOI:10.1016/j.cell.2020.08.036].
Protein concentrations can span more than 10 orders of magnitude in human blood, while a few proteins make up most of the protein by weight in this fluid, making blood and plasma some of the most difficult matrices for proteomics.
Due to genetic variations, as well as alternative splicing and post-translational modifications, multiple different proteoforms can be produced from one single gene (**Figure 1**) [@DOI:10.1038/nmeth.2369; @DOI:10.1038/s41587-023-01714-x].

![**Proteome Complexity.**
@@ -29,10 +30,10 @@ The most commonly studied and biologically relevant post-translational modificat
Post-translational modification of a protein can alter its function, activity, structure, location and interactions.
PTMs alter signal transduction pathways and gene expression control [@PMID:28656226] regulation of apoptosis [@PMID:23088365; @PMID:11368354] by phosphorylation.
Ubiquitination regulates protein degradation [@PMID:16738015], SUMOylation regulates chromatin structure, DNA repair, transcription, cell-cycle progression [@PMID:26601932; @PMID:29079793], and palmitoylation regulates maintenance of the structural organization of exosome-like extracellular vesicle membranes by [@PMID:30251702].
Glycosylation is a ubiquitous modification that regulates a variety of T cell functions, such as cellular migration, T cell receptor signalling, cell survival, and apoptosis [@PMID:22288421; @PMID:18846099].
Glycosylation is a ubiquitous modification that regulates a variety of T cell functions, such as cellular migration, T cell receptor signaling, cell survival, and apoptosis [@PMID:22288421; @PMID:18846099].
Deregulation of PTMs is linked to cellular stress and diseases [@doi:10.1038/s41570-020-00223-8].

Several non-MS methods exist to study PTMs, including in vitro PTM reaction tests with radioactive isotope-labelled substrates, western blot with PTM-specific antibodies, and peptide and protein arrays [@PMID:11062466; @PMID:12323352].
Several non-MS methods exist to study PTMs, including in vitro PTM reaction tests with radioactive isotope-labelled substrates, western blot with PTM-specific antibodies and superbinders, and peptide and protein arrays [@PMID:11062466; @PMID:12323352; @PMID:35613471].
While effective, these approaches have many limitations, such as inefficiency and difficulty in producing pan-specific antibodies.
MS-based proteomics approaches are currently the predominant tool for identifying and quantifying changes in PTMs.

@@ -50,6 +51,7 @@ The amino acid chain's folding: α-helix, β-sheet or turn.
- Tertiary structure:
The three-dimensional structure of the protein.

- Quarternary structure:
- Quaternary structure:
The structure of several protein molecules/subunits in one complex.

Of recent note, the development of AlphaFold has enabled high-accuracy prediction of the three-dimensional structures of all human proteins and of proteins from many hundreds of other species, enabling the understanding of protein fold and its relationship to function [@PMID:34265844; @PMID:37732824].
4 changes: 2 additions & 2 deletions content/04.experiment-types.md
@@ -57,8 +57,8 @@ The common steps in a XL-MS workflow are as follows [@DOI:10.1021/acs.analchem.7
2. Add a cross-linking reagent to covalently connect adjacent protein regions (such as disuccinimidyl sulfoxide, DSSO) [@doi:10.1021/jasms.9b00085]
3. Proteolysis to produce peptides
4. MS/MS data collection
5. Identify cross-linked peptide pairs using special software (i.e. pLink [@DOI:10.1038/nmeth.2099])
6. Generate cross-link maps for structural modeling
5. Identify cross-linked peptide pairs using special software (e.g. pLink [@DOI:10.1038/nmeth.2099], KOJAK [@PMID:25812159; @PMID:36629399])
6. Generate cross-link maps for structural modeling and visualization [@PMID:27302480; @PMID:30525651]

#### Hydrogen deuterium exchange mass spectrometry (HDX-MS)
HDX-MS works by detecting changes in peptide mass due to exchange of amide hydrogens of the protein backbone with deuterium from D2O [@doi:10.1038/s41592-019-0459-y].
12 changes: 6 additions & 6 deletions content/05.protein-extraction.md
@@ -34,15 +34,15 @@ AVOID the use of tween-20, triton-X, NP-40, and PEGs as these compounds are chal
For non-denaturing buffer conditions, which preserve tertiary and quaternary protein structures, additional additives may not be necessary for successful extraction and to prevent proteolysis or PTM modifications throughout the extraction process.
Protease, phosphatase and deubiquitinase inhibitors are optional additives in less denaturing conditions or in experiments focused on specific post-translational modifications.
Keep in mind that protease inhibitors may impact digestion conditions and will need to be diluted or removed prior to trypsin addition.
For extraction of DNA or RNA binding proteins, addition of a small amount of nuclease or benzonase might be useful for degradation of any bound nucleic acids and result in a more consistent digestion [@PMID:23792921].
For extraction of DNA or RNA binding proteins, addition of a small amount of nuclease or benzonase is useful for degradation of any bound nucleic acids and results in a more consistent digestion [@PMID:23792921].

### Mechanical or Sonic Disruption
#### Cell lysis
One typical lysis buffer is 8 M urea in 100 mM Tris, pH 8.5; the pH is based on optimum trypsin activity [@PMID:25664860].
Small mammalian cell pellets and exosomes will lyse almost instantly upon addition of denaturing buffer.
If non-denaturing conditions are desired, osmotic swelling and subsequent shearing or sonication can be applied [@DOI:10.1080/10826068.2020.1728696].
Efficiency of extraction and degradation of nucleic acids can be improved using various sonication methods: 1) probe sonicator with ice; 2) water bath sonicator with ice or cooling; 3) bioruptor® sonication device 4) Adaptive focused acoustics (AFA®) [@PMID:21060726].
Key to these additional lysis techniques are to keep the temperature of the sample from rising significantly which can cause proteins to aggregate or degrade.
Key to these additional lysis techniques is to keep the temperature of the sample from rising significantly which can cause proteins to aggregate or degrade.
Some cell types may require additional force for effective lysis (see below).
For cells with cell walls (i.e. bacteria or yeast), lysozyme is often added in the lysis buffer.
Any added protein will be present in downstream results, however, so excessive addition of lysozyme is to be avoided unless tagged protein purification will occur.
@@ -58,9 +58,9 @@ Cryo-fractionators homogenize samples in special bags that are frozen in liquid
After homogenization, samples can be sonicated by one of the methods above to fragment DNA and increase solubilization of proteins.

### Measuring the efficiency of protein extraction
Following protein extraction, samples should be centrifuged (10-14,000 g for 10-30 min depending on sample type) to remove debris and any unlysed material prior to determinining protein concentration.
Following protein extraction, samples should be centrifuged (10-14,000 g for 10-30 min depending on sample type) to remove debris and any unlysed material prior to determining protein concentration.
The amount of remaining insoluble material should be noted throughout an experiment as a large change may indicate protein extraction issues.
Protein concentration can be calculated using a number of assays or tools [@PMID:18429326; @PMID:12703310]; generally absorbance measuremnts are facile, fast and affordable, such as Bradford or BCA assays.
Protein concentration can be calculated using a number of assays or tools [@PMID:18429326; @PMID:12703310]; generally absorbance measurements are facile, fast and affordable, such as Bradford or BCA assays.
Protein can also be estimated by tryptophan fluorescence, which has the benefit of not consuming sample [@DOI:10.1021/ac504689z].
A nanodrop UV spectrophotometer may be used to measure absorbance at UV280.
Consistency in this method is important as each method will have inherent bias and error [@PMID:26342307; @PMID:30234128].
@@ -71,7 +71,7 @@ Typically, disulfide bonds in proteins are reduced and alkylated prior to proteo
This allows better access to all residues during proteolysis and removes the crosslinked peptides created by S-S inter peptide linkages.
There are a variety of reagent options for these steps.
For reduction, the typical agents used are 5-15 mM concentration of tris(2-carboxyethyl)phosphine hydrochloride (TCEP-HCl), dithiothreitol (DTT), or 2-mercaptoethanol (2BME).
TCEP-HCl is an efficient reducing agent, but it also significantly lowers sample pH, which can be abated by increasing sample buffer concentration or resuspending TCEP-HCl in an appropriate buffer system (i.e 1M HEPES pH 7.5).
TCEP-HCl is an efficient reducing agent, but it also significantly lowers sample pH, which can be abated by increasing sample buffer concentration or resuspending TCEP-HCl in an appropriate buffer system (i.e. 1M HEPES pH 7.5).
Following the reducing step, a slightly higher 10-20mM concentration of alkylating agent such as chloroacetamide/iodoacetamide or n-ethyl maleimide is used to cap the free thiols [@PMID:29019370; @PMID:15351294; @PMID:28539326].
In order to monitor which cysteine residues are linked or modified in a protein, it is also possible to alkylate free cysteines with one reagent, reduce di-sulfide bonds (or other cysteine modifications) and alkylate with a different reagent [@PMID:32132231; @PMID:28445428; @PMID:23074338].
Alkylation reactions are generally carried out in the dark at room temperature to avoid excessive off-target alkylation of other amino acids.
@@ -91,7 +91,7 @@ Any small-molecule removal protocol should be tested for efficiency prior to imp

### Protein quantification
After proteins are isolated from the sample matrix, they are often quantified.
Protein quantification is important to assess the yeild of an extraction procedure, and to adjust the scale of the downstream processing steps to match the amount of protein.
Protein quantification is important to assess the yield of an extraction procedure, and to adjust the scale of the downstream processing steps to match the amount of protein.
For example, when purifying peptides, the amount of sorbent should match the amount of material to be bound.
Presently, there is a wide variety of techniques to quantitate the amount of protein present in a given sample.
These methods can be broadly divided into three types as follows: