TY - JOUR
T1 - Development of a gene expression panel for the prediction of protein abundances in cancer cell lines
AU - Lee, Gunhee
AU - Chung, Yeun Jun
AU - Lee, Minho
N1 - Publisher Copyright: © 2021 Bentham Science Publishers.
PY - 2021
Y1 - 2021
N2 - Background: Because mRNA expression is easier to quantify than protein abundance, many studies have used it as a proxy for protein levels. However, the mRNA expression of a gene is not known to correlate strongly with the abundance of its protein products, because protein levels are regulated by complex mechanisms ranging from translation to post-translational modification. Methods: In this study, we developed models that predict protein levels from mRNA expression levels, using transcriptome data and reverse phase protein array (RPPA)-based protein levels in pan-cancer cell lines. To predict the abundance of a protein, we used the RNA expression of the corresponding gene together with the RNA expression levels of a particular set of other genes. By applying support vector regression, we identified a 47-gene expression panel that contributes to improved prediction performance, along with optimal subsets of the panel specific to each protein species. Results and Conclusion: Our final prediction models doubled the number of proteins whose expression is predictable (r > 0.7). Although the weaknesses of RPPA impose some limitations on our models, we expect that these prediction models and the panel can be widely used in the future to infer protein abundances.
KW - Cancer cell line
KW - Gene expression
KW - Prediction model
KW - Protein abundance
KW - Reverse phase protein array
KW - Support vector regression
UR - https://www.scopus.com/pages/publications/85117170295
U2 - 10.2174/1574893616666210517162530
DO - 10.2174/1574893616666210517162530
M3 - Article
AN - SCOPUS:85117170295
SN - 1574-8936
VL - 16
SP - 846
EP - 854
JO - Current Bioinformatics
JF - Current Bioinformatics
IS - 6
ER -