
Clustering analysis of microRNA and mRNA expression data from TCGA using maximum edge-weighted matching algorithms

dc.contributor.author: Ding, Lizhong
dc.contributor.author: Feng, Zheyun
dc.contributor.author: Bai, Yongsheng
dc.date.accessioned: 2019-11-26T13:51:00Z
dc.date.available: 2019-11-26T13:51:00Z
dc.date.issued: 2019-08-05
dc.identifier.citation: BMC Medical Genomics. 2019 Aug 05;12(1):117
dc.identifier.uri: https://doi.org/10.1186/s12920-019-0562-z
dc.identifier.uri: https://hdl.handle.net/2027.42/152220
dc.description.abstract: Abstract

Background: microRNA (miRNA) is a short RNA (~22 nt) that regulates gene expression at the posttranscriptional level. Aberrant miRNA expression could affect the target mRNAs involved in cancer-related signaling pathways. We conduct a clustering analysis of miRNA and mRNA using expression data from The Cancer Genome Atlas (TCGA). We combine two graph-theoretic matching algorithms, the Hungarian algorithm and the blossom algorithm. Data analysis is done in the R and Python programming languages.

Methods: We first quantify the edge weights of miRNA-mRNA pairs by combining their expression correlation coefficient in tumor (T_CC) and their correlation coefficient in normal (N_CC). We then introduce a bipartite graph partition procedure to identify cluster candidates. Specifically, we propose six weight formulas to quantify the change of miRNA-mRNA expression T_CC relative to N_CC, and apply traditional hierarchical clustering to subjectively evaluate the different weight formulas of miRNA-mRNA pairs. Among these six weight formulas, we choose the optimal one, which we define as the integrated mean value weights, to represent the connections between miRNAs and mRNAs. The Hungarian algorithm and the blossom algorithm are then applied to the miRNA-mRNA bipartite graph to passively determine the clusters. The combination of the Hungarian and blossom algorithms is dubbed the maximum weighted merger method (MWMM).

Results: MWMM identifies clusters of different sizes that meet two criteria: the mathematical criterion that internal connections inside a cluster are relatively denser than external connections outside the cluster, and the biological criterion that intra-cluster Gene Ontology (GO) term similarities are larger than inter-cluster GO term similarities. MWMM is developed using breast invasive carcinoma (BRCA) as the training data set, but can also be applied to data sets for other cancer types. Compared with other algorithms, MWMM shows an advantage in GO term similarity in most cancer types.

Conclusions: miRNAs and mRNAs that are likely to be affected by common underlying causal factors in cancer can be clustered by the MWMM approach. These clusters can potentially be used as candidate biomarkers for different cancer types and provide clues for targets of precision medicine in cancer treatment.
dc.title: Clustering analysis of microRNA and mRNA expression data from TCGA using maximum edge-weighted matching algorithms
dc.type: Article (en_US)
dc.description.bitstreamurl: https://deepblue.lib.umich.edu/bitstream/2027.42/152220/1/12920_2019_Article_562.pdf
dc.language.rfc3066: en
dc.rights.holder: The Author(s).
dc.date.updated: 2019-11-26T13:51:02Z
dc.owningcollname: Interdisciplinary and Peer-Reviewed
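
The abstract describes weighting miRNA-mRNA edges from T_CC and N_CC and then running maximum-weight matching on the resulting bipartite graph. The following is a minimal Python sketch of that one step only, under stated assumptions: the record does not give the six weight formulas, so `edge_weight` uses a hypothetical stand-in (the absolute change in correlation), the correlation values are made-up toy numbers, and `max_weight_assignment` is a textbook Hungarian algorithm rather than the authors' code. The blossom step and the merging of matches into MWMM clusters are not shown.

```python
def edge_weight(t_cc, n_cc):
    """Hypothetical stand-in weight (NOT one of the paper's six formulas):
    absolute change of the correlation in tumor relative to normal."""
    return abs(t_cc - n_cc)


def max_weight_assignment(weights):
    """Hungarian algorithm (potentials form) on a rows <= cols weight matrix.

    Returns (total_weight, match), where match[i] is the column index
    assigned to row i in a maximum-weight matching covering every row.
    """
    n, m = len(weights), len(weights[0])
    assert n <= m, "expects at least as many columns as rows"
    # Maximizing weight is the same as minimizing the negated weight.
    cost = [[-w for w in row] for row in weights]
    inf = float("inf")
    u = [0.0] * (n + 1)      # row potentials (1-based)
    v = [0.0] * (m + 1)      # column potentials (1-based)
    p = [0] * (m + 1)        # p[j]: row matched to column j, 0 if free
    way = [0] * (m + 1)      # back-pointers along the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [inf] * (m + 1)
        used = [False] * (m + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], inf, 0
            for j in range(1, m + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(m + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:   # reached a free column: path is complete
                break
        while j0:            # flip the matching along the augmenting path
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    match = [None] * n
    for j in range(1, m + 1):
        if p[j]:
            match[p[j] - 1] = j - 1
    total = sum(weights[i][match[i]] for i in range(n))
    return total, match


# Toy correlation matrices (illustrative values only, not TCGA data):
# rows are 3 hypothetical miRNAs, columns are 4 hypothetical mRNAs.
T_CC = [[-0.8, 0.1, 0.3, -0.2],
        [0.2, -0.7, 0.0, 0.4],
        [0.1, 0.3, -0.6, -0.5]]
N_CC = [[0.1, 0.2, 0.2, -0.1],
        [0.3, 0.1, 0.1, 0.3],
        [0.0, 0.2, 0.2, 0.4]]

weights = [[edge_weight(t, n) for t, n in zip(t_row, n_row)]
           for t_row, n_row in zip(T_CC, N_CC)]
total, match = max_weight_assignment(weights)
for mirna, mrna in enumerate(match):
    print(f"miRNA {mirna} <-> mRNA {mrna} (weight {weights[mirna][mrna]:.2f})")
```

Each miRNA is paired with the mRNA whose correlation changes most between tumor and normal, subject to no mRNA being used twice. On real data, an established solver such as SciPy's `linear_sum_assignment` or NetworkX's `max_weight_matching` would be the more practical choice than a hand-written routine.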

