Proteogenomics is an approach to tumor profiling that combines next-generation DNA and RNA sequencing (NGS), a high-throughput methodology that enables rapid sequencing of the base pairs in DNA or RNA samples, with mass spectrometry-based proteomics to provide deep, unbiased quantification of proteins and post-translational modifications such as phosphorylation.
Proteogenomics helps build a clearer molecular profile of human cell types, which in turn improves understanding of their roles in normal physiology and disease. It has also contributed significantly to genome (re-)annotation, in which novel coding sequences (CDS) are identified and confirmed.
Using this approach, the researchers were able to propose more precise diagnostics for known treatment targets, identify new tumor susceptibilities for translation into treatments for aggressive tumors, and implicate new mechanisms involved in breast cancer treatment resistance.
Proteogenomics combines laboratory techniques for next-generation DNA and RNA sequencing with mass spectrometry-based analysis for deep, unbiased quantification of proteins and protein modifications in cancer cells, along with computational methods for integrated analysis of this data.
Such proteogenomic approaches have been extensively applied to study cancers by investigators at the National Cancer Institute’s Clinical Proteomic Tumor Analysis Consortium (NCI-CPTAC), a national effort to accelerate the understanding of the molecular basis of cancer through the application of large-scale proteome and genome analysis, or proteogenomics.
Launched in 2011, CPTAC pioneered the integrated proteogenomic analysis of colorectal, breast, and ovarian cancer to reveal new insights into these cancer types, such as identification of proteomic-centric subtypes, prioritization of driver mutations by correlative analysis of copy number alterations and protein abundance, and understanding cancer-relevant pathways through post-translational modifications.
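To make the idea of correlating copy number alterations with protein abundance more concrete, here is a minimal Python sketch. It is not CPTAC's pipeline: the gene names, the toy values, and the function names are all hypothetical, and it uses a plain Pearson correlation where a real analysis would also handle missing values and multiple-testing correction.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # constant input: correlation is undefined, treated as 0 here
    return cov / sqrt(var_x * var_y)


def rank_genes_by_cna_protein_correlation(cna, protein):
    """Return (gene, r) pairs sorted from highest to lowest correlation.

    Both arguments map gene -> list of per-tumor values, with tumors
    listed in the same order in both tables (an assumption of this sketch).
    """
    shared_genes = sorted(set(cna) & set(protein))
    scored = [(gene, pearson(cna[gene], protein[gene])) for gene in shared_genes]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Made-up values for four tumors; GENE_A and GENE_B are placeholders.
cna = {
    "GENE_A": [-1.0, 0.0, 1.0, 2.0],
    "GENE_B": [-1.0, 0.0, 1.0, 2.0],
}
protein = {
    "GENE_A": [-0.9, 0.1, 1.1, 1.9],   # protein level tracks copy number
    "GENE_B": [0.5, -0.5, 0.5, -0.5],  # protein level does not track copy number
}

ranked = rank_genes_by_cna_protein_correlation(cna, protein)
for gene, r in ranked:
    print(f"{gene}\t{r:.3f}")
```

In this toy example, a gene whose protein abundance rises and falls with its copy number ranks first, which is the kind of signal that could be used to prioritize candidate drivers for follow-up.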
“Importantly, our analysis included identification of phosphorylation and acetylation, protein modifications that reveal information about the activity of individual proteins. Protein acetylation had not been profiled in breast cancer before. These new approaches promise biological insights into difficult to treat breast cancers and the ability to dissect response heterogeneity,” said co-corresponding author Matthew Ellis, MD, BChir, BSc., Ph.D., FRCP, breast cancer oncologist and professor, and director of Baylor College of Medicine’s Lester and Sue Smith Breast Center, McNair Scholar at Baylor and Susan G. Komen Scholar.
Simultaneously analyzing changes in the genetic code and the resulting alterations in terms of protein function provides a much more complete picture of what is going on inside breast cancer tumors than analyzing each component in isolation.
More precise data
The researchers’ initial proteogenomic analysis of breast cancer using residual samples from The Cancer Genome Atlas provided proof-of-principle that proteogenomics represented an advance in breast cancer profiling.
The current study represents a major step forward in that it included tissue samples that were collected using protocols that specifically preserve protein modifications, analyzed many more samples, carried out genomic and proteomic characterization on exactly the same tissue fragments, and added protein acetylation profiling to protein phosphorylation, DNA, and RNA measurements.
Proteogenomic analytical techniques have matured substantially in recent years, and those cutting-edge approaches were applied to this dataset.
The researchers completed proteogenomic analyses of 122 treatment-naïve primary breast cancer samples. Their measurements generated a tremendous amount of data ‒ about 38,000 protein phosphorylation sites and almost 10,000 protein acetylation sites per tumor, as well as whole-exome and RNA sequencing ‒ necessitating advanced computational methods for analyzing and integrating the information.
“Complex analyses like these are now routinely being performed on large-scale proteogenomic data sets, and we are developing tools to automate the process,” noted D.R. Mani, Ph.D., a co-corresponding author and principal computational scientist at Broad.
“We describe here proteogenomic characterization of the largest set to date of breast cancer samples that were purposefully collected for these types of analyses, maximizing the fidelity and accuracy of the results,” Ellis explained.
“Each tumor cell has literally hundreds of genomic changes. Mostly we don’t understand their significance either clinically or biologically. The approach we illustrate enables a deeper and more complete understanding of each individual’s breast cancer,” he added.
Identifying drug targets
For example, the analyses revealed that some subtypes of breast cancer have certain targetable enzymes called kinases that are more heavily phosphorylated than in other cancers, suggesting greater activity and therefore targetability.
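The comparison described above can be sketched roughly in Python. This is only an illustration under stated assumptions: the phosphosite names, tumor IDs, subtype labels, and the function name are hypothetical, the intensities are assumed to be log2-scaled, and the study's actual statistical method is not described here. A real analysis would also account for protein abundance and test for significance.

```python
from statistics import mean


def subtype_phospho_differences(intensities, subtypes, subtype):
    """Mean log2 phosphosite intensity in one subtype minus the mean in all other tumors.

    intensities: phosphosite -> {tumor_id: log2 intensity}
    subtypes:    tumor_id -> subtype label
    Sites with no measurements inside or outside the subtype are skipped.
    """
    differences = {}
    for site, per_tumor in intensities.items():
        inside = [v for t, v in per_tumor.items() if subtypes.get(t) == subtype]
        outside = [v for t, v in per_tumor.items() if subtypes.get(t) != subtype]
        if inside and outside:
            differences[site] = mean(inside) - mean(outside)
    return differences


# Made-up data: two tumors per hypothetical subtype, two hypothetical kinase sites.
subtypes = {"T1": "A", "T2": "A", "T3": "B", "T4": "B"}
intensities = {
    "KINASE_X_S10": {"T1": 2.0, "T2": 3.0, "T3": 0.0, "T4": 1.0},
    "KINASE_Y_T20": {"T1": 1.0, "T2": 1.0, "T3": 1.0, "T4": 1.0},
}

diffs = subtype_phospho_differences(intensities, subtypes, "A")
# Sites with the largest positive difference are more heavily phosphorylated in subtype A.
top_sites = sorted(diffs, key=diffs.get, reverse=True)
print(top_sites, diffs)
```

A large positive difference for a kinase site in one subtype is the sort of pattern the article describes as suggesting greater kinase activity, though on its own it does not establish targetability.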
These analyses included recently identified drug targets such as CDK4/6 and its regulatory context, as well as programmed cell death receptors and ligands that are the targets of new immunotherapy drugs.
The integrated analyses also identified new sets of estrogen receptor-positive (ER+) breast cancers that could be treated with these immunotherapy agents. This is significant because the agents are currently restricted to estrogen receptor-negative (ER-) disease.
Additional analyses yielded entirely new insights into the metabolic vulnerabilities of ER+ and ER- breast cancer.
“Our global analysis of the acetylproteome, the first in breast tumors, exposed new details of breast cancer subtype-specific metabolism,” said co-corresponding author Steven A. Carr, Ph.D., director of proteomics at Broad.
Improving diagnosis and treatment
The researchers hope that their findings will motivate breast cancer scientists to explore the therapeutic or diagnostic potential of the new biological alterations they have identified in this study. They also are optimistic that their findings will encourage an effort to translate proteogenomics into a cancer-profiling approach that can be used routinely in the clinic to improve diagnosis and treatment.
“We believe that proteogenomics approaches will continue to help us to identify new candidate therapeutic targets, better understand the immune landscape of breast and other cancers, gain insights into response and resistance, and ultimately progress toward our goal of personalized cancer care,” noted co-corresponding author Michael Gillette, M.D., Ph.D., a pulmonary and critical care physician at Massachusetts General Hospital and senior group leader in proteomics at Broad.
“The science is powerful and exciting, but in the end, it is what we can deliver to the patient that makes it important,” Gillette concluded.
 Madugundu AK, Na CH, Nirujogi RS, Renuse S, Kim KP, Burns KH, Wilks C, Langmead B, Ellis SE, Collado-Torres L, Halushka MK, Kim MS, Pandey A. Integrated Transcriptomic and Proteomic Analysis of Primary Human Umbilical Vein Endothelial Cells. Proteomics. 2019 Aug;19(15):e1800315. doi: 10.1002/pmic.201800315. Epub 2019 Jun 26. PMID: 30983154; PMCID: PMC6812510.
 Ang MY, Low TY, Lee PY, Wan Mohamad Nazarie WF, Guryev V, Jamal R. Proteogenomics: From next-generation sequencing (NGS) and mass spectrometry-based proteomics to precision medicine. Clin Chim Acta. 2019 Nov;498:38-46. doi: 10.1016/j.cca.2019.08.010. Epub 2019 Aug 14. PMID: 31421119.
 Krug K, Jaehnig EJ, Satpathy S, Ellis MJ, Gillette MA. Proteogenomic Landscape of Breast Cancer Tumorigenesis and Targeted Therapy. Cell. Published November 18, 2020. doi: 10.1016/j.cell.2020.10.036.
Featured Image: Baylor College of Medicine, Lester and Sue Smith Breast Center. Photo courtesy: © 2020 Baylor College of Medicine. Used with permission.