A new study that analyzed protein levels in 2,002 primary tumors from 14 tissue-based cancer types identified 11 distinct molecular subtypes, providing systematic knowledge that greatly expands a searchable online database that has become a go-to platform for cancer data analysis by users all over the world.
The University of ALabama at Birmingham CANcer Data Analysis Portal, or UALCAN, was developed and released for public use in 2017 as an easy-to-use portal for comprehensive cancer data analysis, including transcriptome, epigenetics and proteomics. UALCAN has received nearly 920,000 site visits from researchers in more than 100 countries, and it has been cited more than 2,750 times.
“UALCAN is an attempt to distribute comprehensive cancer data to researchers and clinicians in an easy-to-use format to make discoveries and find needles in the haystack,” said Sooryanarayana Varambally, Ph.D., a professor in the UAB Department of Pathology's Division of Molecular and Cellular Pathology and director of the Translational Oncology Research Program at UAB. “Cancer detection, diagnosis, treatment and research require a global team effort, and the vast amount of data involved requires a way to analyze and interpret it.”
Cancer is a complex disease, and its initiation, progression and metastasis, the spread to distant organs, involve dynamic molecular changes in each type of cancer. Individual cancer patients show variations apart from some common genomic events.
In the new study, Varambally worked with longtime collaborator Chad Creighton, Ph.D., of Baylor College of Medicine, Houston, Texas. Creighton led the proteomic study, published in Nature Communications, “Proteogenomic characterization of 2002 human cancers reveals pan-cancer molecular subtypes and associated pathways.” It expands on two earlier proteome studies published in 2019 and 2021.
Previously, the team performed RNA transcript analysis, and provided the data to researchers through UALCAN, to identify pathways used by the many forms of cancer to aid growth, spread, and aggressiveness. With this recent study, the team conducted and incorporated large-scale proteomics analysis. The data and results provide new ideas for further research and possible therapeutic interventions.
The proteome is the complement of proteins expressed in a cell or tissue, and it can be quantified thanks to recent technological advances in mass spectrometry. In cells, DNA makes mRNA, and mRNA makes protein, processes known as the central dogma of molecular biology. Proteins are major functional parts of cells, and are essential in cell metabolism, structure, growth, signaling and movement.
Cancers represented in the UALCAN proteomic data set include breast, colorectal, stomach, glioblastoma, head and neck, liver, lung adenocarcinoma, lung squamous, ovarian, pancreatic, pediatric brain, prostate, renal and uterine cancers. The number of tumors for each cancer type in the study ranged from 76 to 230, with an average of 143. Interestingly, the pan-cancer, proteome-based subtypes found in the current study cut across tumor lineages.
The compendium proteomic data set came from 17 individual studies. Corresponding multi-omics data were available for most of these tumors, including mRNA levels, small somatic mutations and insertions/deletions in DNA, and somatic DNA copy number alterations.
Overall, the researchers found that the protein expression of genes across tumors broadly correlates with corresponding mRNA levels or copy number changes. However, there were some notable exceptions.
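For readers who want a concrete picture of what "broadly correlates" means here, the following is a minimal sketch of a per-gene correlation between protein and mRNA levels across tumors. It is an illustration only and is not taken from the study: the gene names, the values, the helper names `pearson` and `per_gene_correlation`, and the choice of Pearson correlation are all assumptions made for this example.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")  # correlation is undefined for a constant series
    return cov / sqrt(var_x * var_y)


def per_gene_correlation(protein, mrna):
    """Correlate protein and mRNA levels gene by gene.

    Both arguments map a gene name to a list of measurements, one per tumor,
    with tumors in the same order in both tables (an assumption of this sketch).
    """
    return {
        gene: pearson(protein[gene], mrna[gene])
        for gene in protein
        if gene in mrna
    }


# Hypothetical toy data: four tumors, two made-up genes.
protein = {"GENE_A": [1.0, 2.0, 3.0, 4.0], "GENE_B": [4.0, 1.0, 3.0, 2.0]}
mrna = {"GENE_A": [2.0, 4.0, 6.0, 8.0], "GENE_B": [1.0, 2.0, 3.0, 4.0]}

correlations = per_gene_correlation(protein, mrna)
for gene, r in sorted(correlations.items()):
    print(f"{gene}: r = {r:.2f}")
```

In this toy example, GENE_A stands in for a gene whose protein level tracks its mRNA level closely, while GENE_B stands in for one of the exceptions, where the two measurements diverge.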
They identified 11 distinct proteome-based pan-cancer subtypes, called s1 to s11, that could provide insight into the pathways and dysregulated processes in tumors that make them cancerous. Each subtype spanned several tissue-based cancer types, although the s11 subtype was specific to brain tumors, spanning glioblastomas and pediatric brain tumors.
Each subtype expressed specific gene classes, some of which had been seen in an earlier, less comprehensive proteomic study. Three subtypes showed novel gene classes: the s7 subtype with ‘axon guidance’ and ‘frizzled binding’ genes, the s10 subtype with ‘DNA repair’ and ‘chromatin regulation’ genes, and the s11 subtype with ‘synapse,’ ‘dendrite’ and ‘axon’ genes.
At the DNA level, the study detailed differences among the proteome-based subtypes in overall copy number alteration of genes. It also found that somatic alteration of genes in a pathway was associated with higher pathway activity, as inferred from proteome or transcriptome data.
“The results of our study provide a framework for understanding the molecular landscape of cancers at the protein level, to integrate and compare the data with other molecular correlates of cancers,” Varambally said. “The associated data sets and gene-level associations represent a resource for the research community, including helping to identify candidate genes for functional studies and further development of candidates as diagnostic markers or therapeutic targets for a particular subset of cancers.
“Furthermore, this study reinforces the notion that a comprehensive survey of cancers should be conducted at the protein level, although historically expression profiling of tumors has been mostly limited to the level of RNA transcripts. Many of the analyses are based on this cutting-edge cancer data analysis platform, which is consistently updated based on requests from users or experts. The team is indebted to the support and encouragement of researchers who use this platform to make discoveries that make a difference in cancer research.”
Some of the large data sets in UALCAN are generated by consortia such as The Cancer Genome Atlas, or TCGA, and the Clinical Proteomic Tumor Analysis Consortium, or CPTAC, of the National Cancer Institute.
Accurate targeting of cancer requires the identification of individual or subclass-specific genomic and molecular alterations. To help cancer researchers perform various analyses in order to better understand these large data sets, Darshan Shimoga Chandrashekar, Ph.D., led the development of the UALCAN portal under Varambally’s supervision. Updates to this ever-evolving portal have recently been published in Neoplasia.
The UALCAN initiative and its ongoing development include contributions from a team of experts including bioinformaticians, computer scientists, statisticians, cancer biologists, pathologists and oncologists. “It’s a collaborative scientific approach to empowering the global cancer research team to treat cancer,” Varambally said.
Support came from National Institutes of Health grants CA125123 and CA118948 and U.S. Department of Defense grant W81XWH-19-1-0588.
Co-authors of this study are Yiqun Zhang and Fengju Chen, Baylor College of Medicine, and Chandrashekar, UAB Department of Pathology, Division of Molecular and Cellular Pathology.
Pathology is a department in the Marnix E. Heersink School of Medicine at UAB. Varambally is a senior scientist at the O’Neal Comprehensive Cancer Center and the Informatics Institute at UAB, and he is co-director of the Cancer Biology theme of the UAB Graduate Biomedical Sciences program. He holds an adjunct position at the Michigan Center for Translational Pathology at the University of Michigan, Ann Arbor.