Improving accuracy of protein contact prediction using balanced network deconvolution

dc.contributor.author: Sun, Hai‐ping
dc.contributor.author: Huang, Yan
dc.contributor.author: Wang, Xiao‐fan
dc.contributor.author: Zhang, Yang
dc.contributor.author: Shen, Hong‐bin
dc.date.accessioned: 2015-03-05T18:24:25Z
dc.date.available: 2016-05-10T20:26:27Z
dc.date.issued: 2015-03
dc.identifier.citation: Sun, Hai‐ping; Huang, Yan; Wang, Xiao‐fan; Zhang, Yang; Shen, Hong‐bin (2015). "Improving accuracy of protein contact prediction using balanced network deconvolution." Proteins: Structure, Function, and Bioinformatics 83(3): 485-496.
dc.identifier.issn: 0887-3585
dc.identifier.issn: 1097-0134
dc.identifier.uri: https://hdl.handle.net/2027.42/110720
dc.description.abstract: Residue contact map is essential for protein three‐dimensional structure determination. But most of the current contact prediction methods based on residue co‐evolution suffer from high false‐positives as introduced by indirect and transitive contacts (i.e., residues A–B and B–C are in contact, but A–C are not). Built on the work by Feizi et al. (Nat Biotechnol 2013; 31:726–733), which demonstrated a general network model to distinguish direct dependencies by network deconvolution, this study presents a new balanced network deconvolution (BND) algorithm to identify optimized dependency matrix without limit on the eigenvalue range in the applied network systems. The algorithm was used to filter contact predictions of five widely used co‐evolution methods. On the test of proteins from three benchmark datasets of the 9th critical assessment of protein structure prediction (CASP9), CASP10, and PSICOV (precise structural contact prediction using sparse inverse covariance estimation) database experiments, the BND can improve the medium‐ and long‐range contact predictions at the L/5 cutoff by 55.59% and 47.68%, respectively, without additional central processing unit cost. The improvement is statistically significant, with a P‐value < 5.93 × 10⁻³ in the Student's t‐test. A further comparison with the ab initio structure predictions in CASPs showed that the usefulness of the current co‐evolution‐based contact prediction to the three‐dimensional structure modeling relies on the number of homologous sequences existing in the sequence databases. BND can be used as a general contact refinement method, which is freely available at: http://www.csbio.sjtu.edu.cn/bioinf/BND/. Proteins 2015; 83:485–496. © 2014 Wiley Periodicals, Inc.
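
The network deconvolution idea that the abstract builds on can be sketched as follows. This is a minimal illustration of the closed-form deconvolution described by Feizi et al., not the BND algorithm itself, whose balancing step is not specified in this record. It assumes a symmetric score matrix and models the observed matrix as the sum of all path powers of the direct matrix, G_obs = G_dir + G_dir^2 + ..., which gives G_dir = G_obs (I + G_obs)^-1. The function name `deconvolve` and the example matrix are illustrative assumptions; `beta` is an optional bound on the eigenvalues of the result.

```python
import numpy as np


def deconvolve(g_obs, beta=None):
    """Closed-form network deconvolution sketch (after Feizi et al., 2013).

    Assumes g_obs is a symmetric matrix of pairwise scores. If beta is
    given (0 < beta < 1), g_obs is first rescaled so that the eigenvalues
    of the returned matrix lie within [-beta, beta].
    """
    g = np.asarray(g_obs, dtype=float)
    g = (g + g.T) / 2.0  # enforce symmetry so eigh applies

    eigvals, eigvecs = np.linalg.eigh(g)

    if beta is not None:
        # Largest scaling factor that keeps every deconvolved eigenvalue
        # inside [-beta, beta].
        bounds = []
        if eigvals.max() > 0:
            bounds.append(beta / ((1.0 - beta) * eigvals.max()))
        if eigvals.min() < 0:
            bounds.append(-beta / ((1.0 + beta) * eigvals.min()))
        alpha = min(bounds) if bounds else 1.0
        eigvals = alpha * eigvals

    # Invert the geometric series eigenvalue by eigenvalue:
    # lambda_dir = lambda_obs / (1 + lambda_obs)
    dir_vals = eigvals / (1.0 + eigvals)
    return (eigvecs * dir_vals) @ eigvecs.T


# Toy example (hypothetical scores): residues A-B and B-C are coupled
# directly, A-C only through B.
g_dir = np.array([[0.0, 0.4, 0.0],
                  [0.4, 0.0, 0.4],
                  [0.0, 0.4, 0.0]])
g_obs = g_dir @ np.linalg.inv(np.eye(3) - g_dir)  # adds transitive signal
g_rec = deconvolve(g_obs)                          # removes it again
```

In the toy example the observed A–C score is nonzero purely because of the A–B and B–C couplings, and deconvolution returns it to zero. Without `beta`, the formula requires every eigenvalue of the observed matrix to be greater than −1, which is the kind of eigenvalue-range restriction the abstract says BND is designed to avoid.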
dc.publisher: Wiley Periodicals, Inc.
dc.subject.other: predictor
dc.subject.other: protein structure prediction
dc.subject.other: residue contact map
dc.subject.other: residue co‐evolution
dc.subject.other: transitive noise
dc.subject.other: filter
dc.title: Improving accuracy of protein contact prediction using balanced network deconvolution
dc.type: Article
dc.rights.robots: IndexNoFollow
dc.subject.hlbsecondlevel: Biological Chemistry
dc.subject.hlbtoplevel: Science
dc.description.peerreviewed: Peer Reviewed
dc.description.bitstreamurl: http://deepblue.lib.umich.edu/bitstream/2027.42/110720/1/prot24744.pdf
dc.description.bitstreamurl: http://deepblue.lib.umich.edu/bitstream/2027.42/110720/2/prot24744-sup-0001-suppinfo.pdf
dc.identifier.doi: 10.1002/prot.24744
dc.identifier.source: Proteins: Structure, Function, and Bioinformatics
dc.identifier.citedreference: Kinch L, Yong Shi S, Cong Q, Cheng H, Liao Y, Grishin NV. CASP9 assessment of free modeling target predictions. Proteins 2011; 79(Suppl 10): 59–73.
dc.identifier.citedreference: de Juan D, Pazos F, Valencia A. Emerging methods in protein co‐evolution. Nat Rev Genet 2013; 14: 249–261.
dc.identifier.citedreference: Berenger F, Zhou Y, Shrestha R, Zhang KY. Entropy‐accelerated exact clustering of protein decoys. Bioinformatics 2011; 27: 939–945.
dc.identifier.citedreference: Berenger F, Shrestha R, Zhou Y, Simoncini D, Zhang KY. Durandal: fast exact clustering of protein decoys. J Comput Chem 2012; 33: 471–474.
dc.identifier.citedreference: Kajan L, Hopf TA, Kalas M, Marks DS, Rost B. FreeContact: fast and free software for protein contact prediction from residue co‐evolution. BMC Bioinformatics 2014; 15: 85.
dc.identifier.citedreference: Chiu DK, Kolodziejczak T. Inferring consensus structure from nucleic acid sequences. Comput Appl Biosci 1991; 7: 347–352.
dc.identifier.citedreference: Dunn SD, Wahl LM, Gloor GB. Mutual information without the influence of phylogeny or entropy dramatically improves residue contact prediction. Bioinformatics 2008; 24: 333–340.
dc.identifier.citedreference: Feizi S, Marbach D, Medard M, Kellis M. Network deconvolution as a general method to distinguish direct dependencies in networks. Nat Biotechnol 2013; 31: 726–733.
dc.identifier.citedreference: Jones DT, Buchan DW, Cozzetto D, Pontil M. PSICOV: precise structural contact prediction using sparse inverse covariance estimation on large multiple sequence alignments. Bioinformatics 2012; 28: 184–190.
dc.identifier.citedreference: Morcos F, Pagnani A, Lunt B, Bertolino A, Marks DS, Sander C, Zecchina R, Onuchic JN, Hwa T, Weigt M. Direct‐coupling analysis of residue coevolution captures native contacts across many protein families. Proc Natl Acad Sci USA 2011; 108: E1293–E1301.
dc.identifier.citedreference: Baldassi C, Zamparo M, Feinauer C, Procaccini A, Zecchina R, Weigt M, Pagnani A. Fast and accurate multivariate Gaussian modeling of protein families: predicting residue contacts and protein‐interaction partners. PLoS One 2014; 9: e92721.
dc.identifier.citedreference: Ezkurdia I, Grana O, Izarzugaza JM, Tress ML. Assessment of domain boundary predictions and the prediction of intramolecular contacts in CASP8. Proteins 2009; 77(S9): 196–209.
dc.identifier.citedreference: Wigner EP. Random matrices in physics. SIAM Rev 1967; 9: 1–23.
dc.identifier.citedreference: Monastyrskyy B, D'Andrea D, Fidelis K, Tramontano A, Kryshtafovych A. Evaluation of residue‐residue contact prediction in CASP10. Proteins 2014; 82(Suppl 2): 138–153.
dc.identifier.citedreference: Karthikraja V, Suresh A, Lulu S, Kangueane U, Kangueane P. Types of interfaces for homodimer folding and binding. Bioinformation 2009; 4: 101.
dc.identifier.citedreference: Tai CH, Bai H, Taylor TJ, Lee B. Assessment of template‐free modeling in CASP10 and ROLL. Proteins 2013; 82(Suppl 2): 57–83.
dc.identifier.citedreference: Xu D, Zhang Y. Ab initio protein structure assembly using continuous structure fragments and optimized knowledge‐based force field. Proteins 2012; 80: 1715–1735.
dc.identifier.citedreference: Zhang Y. I‐TASSER server for protein 3D structure prediction. BMC Bioinformatics 2008; 9: 40.
dc.identifier.citedreference: Roy A, Kucukural A, Zhang Y. I‐TASSER: a unified platform for automated protein structure and function prediction. Nat Protocols 2010; 5: 725–738.
dc.identifier.citedreference: Roy A, Yang J, Zhang Y. COFACTOR: an accurate comparative algorithm for structure‐based protein function annotation. Nucleic Acids Res 2012; 40(Web Server issue): W471–W477.
dc.identifier.citedreference: Zhang J, Wang Q, Barz B, He Z, Kosztin I, Shang Y, Xu D. MUFOLD: a new solution for protein 3D structure prediction. Proteins 2010; 78: 1137–1152.
dc.identifier.citedreference: Cheng J, Baldi P. Improved residue contact prediction using support vector machines and a large feature set. BMC Bioinformatics 2007; 8: 113.
dc.identifier.citedreference: Tegge AN, Wang Z, Eickholt J, Cheng J. NNcon: improved protein contact map prediction using 2D‐recursive neural networks. Nucleic Acids Res 2009; 37(Web Server issue): W515–W518.
dc.identifier.citedreference: Marbach D, Costello JC, Küffner R, Vega NM, Prill RJ, Camacho DM, Allison KR, Kellis M, Collins JJ, Stolovitzky G. Wisdom of crowds for robust gene network inference. Nat Methods 2012; 9: 796–804.
dc.identifier.citedreference: Newman ME. Scientific collaboration networks. II. Shortest paths, weighted networks, and centrality. Phys Rev E 2001; 64: 016132.
dc.identifier.citedreference: Di Lena P, Fariselli P, Margara L, Vassura M, Casadio R. Fast overlapping of protein contact maps by alignment of eigenvectors. Bioinformatics 2010; 26: 2250–2258.
dc.identifier.citedreference: Yang J, Jang R, Zhang Y, Shen HB. High‐accuracy prediction of transmembrane inter‐helix contacts and application to GPCR 3D structure modeling. Bioinformatics 2013; 29: 2579–2587.
dc.identifier.citedreference: Wu S, Zhang Y. A comprehensive assessment of sequence‐based and template‐based methods for protein contact prediction. Bioinformatics 2008; 24: 924–931.
dc.identifier.citedreference: Vassura M, Margara L, Di Lena P, Medri F, Fariselli P, Casadio R. Reconstruction of 3D structures from protein contact maps. IEEE/ACM Trans Comput Biol Bioinform 2008; 5: 357–367.
dc.identifier.citedreference: Nugent T, Jones DT. Predicting transmembrane helix packing arrangements using residue contacts and a force‐directed algorithm. PLoS Comput Biol 2010; 6: e1000714.
dc.identifier.citedreference: Taylor WR, Jones DT, Sadowski MI. Protein topology from predicted residue contacts. Protein Sci 2012; 21: 299–305.
dc.identifier.citedreference: Gromiha MM, Selvaraj S. Inter‐residue interactions in protein folding and stability. Prog Biophys Mol Biol 2004; 86: 235–277.
dc.identifier.citedreference: Schlessinger A, Punta M, Rost B. Natively unstructured regions in proteins identified from contact predictions. Bioinformatics 2007; 23: 2376–2384.
dc.identifier.citedreference: Izarzugaza JM, Vazquez M, del Pozo A, Valencia A. wKinMut: an integrated tool for the analysis and interpretation of mutations in human protein kinases. BMC Bioinformatics 2013; 14: 345.
dc.identifier.citedreference: Göbel U, Sander C, Schneider R, Valencia A. Correlated mutations and residue contacts in proteins. Proteins 1994; 18: 309–317.
dc.identifier.citedreference: Olmea O, Valencia A. Improving contact predictions by the combination of correlated mutations and other sources of sequence information. Fold Des 1997; 2: S25–S32.
dc.owningcollname: Interdisciplinary and Peer-Reviewed

