Improving accuracy of protein contact prediction using balanced network deconvolution
dc.contributor.author | Sun, Hai‐ping | en_US |
dc.contributor.author | Huang, Yan | en_US |
dc.contributor.author | Wang, Xiao‐fan | en_US |
dc.contributor.author | Zhang, Yang | en_US |
dc.contributor.author | Shen, Hong‐bin | en_US |
dc.date.accessioned | 2015-03-05T18:24:25Z | |
dc.date.available | 2016-05-10T20:26:27Z | en |
dc.date.issued | 2015-03 | en_US |
dc.identifier.citation | Sun, Hai‐ping ; Huang, Yan; Wang, Xiao‐fan ; Zhang, Yang; Shen, Hong‐bin (2015). "Improving accuracy of protein contact prediction using balanced network deconvolution." Proteins: Structure, Function, and Bioinformatics 83(3): 485-496. | en_US |
dc.identifier.issn | 0887-3585 | en_US |
dc.identifier.issn | 1097-0134 | en_US |
dc.identifier.uri | https://hdl.handle.net/2027.42/110720 | |
dc.description.abstract | The residue contact map is essential for determining a protein's three‐dimensional structure. However, most current contact prediction methods based on residue co‐evolution suffer from high false‐positive rates caused by indirect, transitive contacts (i.e., residues A–B and B–C are in contact, but A–C are not). Building on the work of Feizi et al. (Nat Biotechnol 2013; 31:726–733), which demonstrated a general network model for distinguishing direct dependencies by network deconvolution, this study presents a new balanced network deconvolution (BND) algorithm that identifies an optimized dependency matrix without limiting the eigenvalue range in the applied network systems. The algorithm was used to filter the contact predictions of five widely used co‐evolution methods. In tests on proteins from three benchmark datasets, namely the 9th Critical Assessment of protein Structure Prediction (CASP9), CASP10, and the PSICOV (precise structural contact prediction using sparse inverse covariance estimation) database, BND improves medium‐ and long‐range contact predictions at the L/5 cutoff by 55.59% and 47.68%, respectively, without additional central processing unit cost. The improvement is statistically significant, with a P‐value < 5.93 × 10⁻³ in Student's t‐test. A further comparison with the ab initio structure predictions in the CASP experiments showed that the usefulness of current co‐evolution‐based contact prediction for three‐dimensional structure modeling depends on the number of homologous sequences available in the sequence databases. BND can be used as a general contact refinement method and is freely available at: http://www.csbio.sjtu.edu.cn/bioinf/BND/. Proteins 2015; 83:485–496. © 2014 Wiley Periodicals, Inc. | en_US |
dc.publisher | Wiley Periodicals, Inc. | en_US |
dc.subject.other | predictor | en_US |
dc.subject.other | protein structure prediction | en_US |
dc.subject.other | residue contact map | en_US |
dc.subject.other | residue co‐evolution | en_US |
dc.subject.other | transitive noise | en_US |
dc.subject.other | filter | en_US |
dc.title | Improving accuracy of protein contact prediction using balanced network deconvolution | en_US |
dc.type | Article | en_US |
dc.rights.robots | IndexNoFollow | en_US |
dc.subject.hlbsecondlevel | Biological Chemistry | en_US |
dc.subject.hlbtoplevel | Science | en_US |
dc.description.peerreviewed | Peer Reviewed | en_US |
dc.description.bitstreamurl | http://deepblue.lib.umich.edu/bitstream/2027.42/110720/1/prot24744.pdf | |
dc.description.bitstreamurl | http://deepblue.lib.umich.edu/bitstream/2027.42/110720/2/prot24744-sup-0001-suppinfo.pdf | |
dc.identifier.doi | 10.1002/prot.24744 | en_US |
dc.identifier.source | Proteins: Structure, Function, and Bioinformatics | en_US |
dc.identifier.citedreference | Kinch L, Yong Shi S, Cong Q, Cheng H, Liao Y, Grishin NV. CASP9 assessment of free modeling target predictions. Proteins 2011; 79 ( Suppl 10 ): 59 – 73. | en_US |
dc.identifier.citedreference | de Juan D, Pazos F, Valencia A. Emerging methods in protein co‐evolution. Nat Rev Genet 2013; 14: 249 – 261. | en_US |
dc.identifier.citedreference | Berenger F, Zhou Y, Shrestha R, Zhang KY. Entropy‐accelerated exact clustering of protein decoys. Bioinformatics 2011; 27: 939 – 945. | en_US |
dc.identifier.citedreference | Berenger F, Shrestha R, Zhou Y, Simoncini D, Zhang KY. Durandal: fast exact clustering of protein decoys. J Comput Chem 2012; 33: 471 – 474. | en_US |
dc.identifier.citedreference | Kajan L, Hopf TA, Kalas M, Marks DS, Rost B. FreeContact: fast and free software for protein contact prediction from residue co‐evolution. BMC Bioinformatics 2014; 15: 85. | en_US |
dc.identifier.citedreference | Chiu DK, Kolodziejczak T. Inferring consensus structure from nucleic acid sequences. Comput Appl Biosci 1991; 7: 347 – 352. | en_US |
dc.identifier.citedreference | Dunn SD, Wahl LM, Gloor GB. Mutual information without the influence of phylogeny or entropy dramatically improves residue contact prediction. Bioinformatics 2008; 24: 333 – 340. | en_US |
dc.identifier.citedreference | Feizi S, Marbach D, Medard M, Kellis M. Network deconvolution as a general method to distinguish direct dependencies in networks. Nat Biotechnol 2013; 31: 726 – 733. | en_US |
dc.identifier.citedreference | Jones DT, Buchan DW, Cozzetto D, Pontil M. PSICOV: precise structural contact prediction using sparse inverse covariance estimation on large multiple sequence alignments. Bioinformatics 2012; 28: 184 – 190. | en_US |
dc.identifier.citedreference | Morcos F, Pagnani A, Lunt B, Bertolino A, Marks DS, Sander C, Zecchina R, Onuchic JN, Hwa T, Weigt M. Direct‐coupling analysis of residue coevolution captures native contacts across many protein families. Proc Natl Acad Sci USA 2011; 108: E1293 – E1301. | en_US |
dc.identifier.citedreference | Baldassi C, Zamparo M, Feinauer C, Procaccini A, Zecchina R, Weigt M, Pagnani A. Fast and accurate multivariate Gaussian modeling of protein families: predicting residue contacts and protein‐interaction partners. PloS One 2014; 9: e92721. | en_US |
dc.identifier.citedreference | Ezkurdia I, Grana O, Izarzugaza JM, Tress ML. Assessment of domain boundary predictions and the prediction of intramolecular contacts in CASP8. Proteins 2009; 77 ( S9 ): 196 – 209. | en_US |
dc.identifier.citedreference | Wigner EP. Random matrices in physics. SIAM Rev 1967; 9: 1 – 23. | en_US |
dc.identifier.citedreference | Monastyrskyy B, D'Andrea D, Fidelis K, Tramontano A, Kryshtafovych A. Evaluation of residue‐residue contact prediction in CASP10. Proteins 2014; 82 ( Suppl 2 ): 138 – 153. | en_US |
dc.identifier.citedreference | Karthikraja V, Suresh A, Lulu S, Kangueane U, Kangueane P. Types of interfaces for homodimer folding and binding. Bioinformation 2009; 4: 101. | en_US |
dc.identifier.citedreference | Tai CH, Bai H, Taylor TJ, Lee B. Assessment of template‐free modeling in CASP10 and ROLL. Proteins 2013; 82 ( Suppl 2 ): 57 – 83. | en_US |
dc.identifier.citedreference | Xu D, Zhang Y. Ab initio protein structure assembly using continuous structure fragments and optimized knowledge‐based force field. Proteins 2012; 80: 1715 – 1735. | en_US |
dc.identifier.citedreference | Zhang Y. I‐TASSER server for protein 3D structure prediction. BMC Bioinformatics 2008; 9: 40. | en_US |
dc.identifier.citedreference | Roy A, Kucukural A, Zhang Y. I‐TASSER: a unified platform for automated protein structure and function prediction. Nat Protocols 2010; 5: 725 – 738. | en_US |
dc.identifier.citedreference | Roy A, Yang J, Zhang Y. COFACTOR: an accurate comparative algorithm for structure‐based protein function annotation. Nucleic Acids Res 2012; 40 (Web Server issue): W471 – W477. | en_US |
dc.identifier.citedreference | Zhang J, Wang Q, Barz B, He Z, Kosztin I, Shang Y, Xu D. MUFOLD: a new solution for protein 3D structure prediction. Proteins 2010; 78: 1137 – 1152. | en_US |
dc.identifier.citedreference | Cheng J, Baldi P. Improved residue contact prediction using support vector machines and a large feature set. BMC Bioinformatics 2007; 8: 113. | en_US |
dc.identifier.citedreference | Tegge AN, Wang Z, Eickholt J, Cheng J. NNcon: improved protein contact map prediction using 2D‐recursive neural networks. Nucleic Acids Res 2009; 37 (Web Server issue): W515 – W518. | en_US |
dc.identifier.citedreference | Marbach D, Costello JC, Küffner R, Vega NM, Prill RJ, Camacho DM, Allison KR, Kellis M, Collins JJ, Stolovitzky G. Wisdom of crowds for robust gene network inference. Nat Methods 2012; 9: 796 – 804. | en_US |
dc.identifier.citedreference | Newman ME. Scientific collaboration networks. II. Shortest paths, weighted networks, and centrality. Phys Rev E 2001; 64: 016132. | en_US |
dc.identifier.citedreference | Di Lena P, Fariselli P, Margara L, Vassura M, Casadio R. Fast overlapping of protein contact maps by alignment of eigenvectors. Bioinformatics 2010; 26: 2250 – 2258. | en_US |
dc.identifier.citedreference | Yang J, Jang R, Zhang Y, Shen HB. High‐accuracy prediction of transmembrane inter‐helix contacts and application to GPCR 3D structure modeling. Bioinformatics 2013; 29: 2579 – 2587. | en_US |
dc.identifier.citedreference | Wu S, Zhang Y. A comprehensive assessment of sequence‐based and template‐based methods for protein contact prediction. Bioinformatics 2008; 24: 924 – 931. | en_US |
dc.identifier.citedreference | Vassura M, Margara L, Di Lena P, Medri F, Fariselli P, Casadio R. Reconstruction of 3D structures from protein contact maps. IEEE/ACM Trans Comput Biol Bioinform 2008; 5: 357 – 367. | en_US |
dc.identifier.citedreference | Nugent T, Jones DT. Predicting transmembrane helix packing arrangements using residue contacts and a force‐directed algorithm. PLoS Comput Biol 2010; 6: e1000714. | en_US |
dc.identifier.citedreference | Taylor WR, Jones DT, Sadowski MI. Protein topology from predicted residue contacts. Protein Sci 2012; 21: 299 – 305. | en_US |
dc.identifier.citedreference | Gromiha MM, Selvaraj S. Inter‐residue interactions in protein folding and stability. Prog Biophys Mol Biol 2004; 86: 235 – 277. | en_US |
dc.identifier.citedreference | Schlessinger A, Punta M, Rost B. Natively unstructured regions in proteins identified from contact predictions. Bioinformatics 2007; 23: 2376 – 2384. | en_US |
dc.identifier.citedreference | Izarzugaza JM, Vazquez M, del Pozo A, Valencia A. wKinMut: an integrated tool for the analysis and interpretation of mutations in human protein kinases. BMC Bioinformatics 2013; 14: 345. | en_US |
dc.identifier.citedreference | Göbel U, Sander C, Schneider R, Valencia A. Correlated mutations and residue contacts in proteins. Proteins 1994; 18: 309 – 317. | en_US |
dc.identifier.citedreference | Olmea O, Valencia A. Improving contact predictions by the combination of correlated mutations and other sources of sequence information. Fold Des 1997; 2: S25 – S32. | en_US |
dc.owningcollname | Interdisciplinary and Peer-Reviewed |
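The abstract builds on the network deconvolution of Feizi et al., which removes transitive effects such as an apparent A–C contact induced by A–B and B–C. The block below is a minimal sketch of that original closed-form deconvolution, not of the BND algorithm itself, whose formulation is not given in this record. It assumes a symmetric score matrix whose eigenvalues are all greater than -1 and that observed scores follow G_obs = G_dir + G_dir^2 + G_dir^3 + ...; the function name `network_deconvolution` and the toy matrices are hypothetical.

```python
import numpy as np


def network_deconvolution(g_obs: np.ndarray) -> np.ndarray:
    """Sketch of closed-form network deconvolution: G_dir = G_obs (I + G_obs)^-1.

    Assumption: g_obs is (nearly) symmetric and every eigenvalue is > -1,
    so the division below is well defined.
    """
    # Symmetrize to guard against small numerical asymmetries.
    sym = (g_obs + g_obs.T) / 2.0
    eigvals, eigvecs = np.linalg.eigh(sym)
    # Each observed eigenvalue maps to a direct one: l_dir = l_obs / (1 + l_obs).
    dir_vals = eigvals / (1.0 + eigvals)
    # Rebuild the matrix from the rescaled spectrum.
    return (eigvecs * dir_vals) @ eigvecs.T


# Toy chain A-B-C: A-B and B-C are direct contacts, A-C is not.
g_dir_true = np.array([
    [0.0, 0.3, 0.0],
    [0.3, 0.0, 0.3],
    [0.0, 0.3, 0.0],
])
# Observed scores include all indirect paths: G_dir (I - G_dir)^-1.
g_obs_toy = g_dir_true @ np.linalg.inv(np.eye(3) - g_dir_true)
g_dir_est = network_deconvolution(g_obs_toy)
```

In this toy case the observed matrix carries a spurious A–C score, and the deconvolved matrix recovers the direct-only chain. According to the abstract, BND differs in that it places no limit on the eigenvalue range.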