Template‐based and free modeling of I‐TASSER and QUARK pipelines using predicted contact maps in CASP12

dc.contributor.authorZhang, Chengxin
dc.contributor.authorMortuza, S. M.
dc.contributor.authorHe, Baoji
dc.contributor.authorWang, Yanting
dc.contributor.authorZhang, Yang
dc.date.accessioned2018-03-07T18:24:25Z
dc.date.available2019-05-13T14:45:24Z
dc.date.issued2018-03
dc.identifier.citationZhang, Chengxin; Mortuza, S. M.; He, Baoji; Wang, Yanting; Zhang, Yang (2018). "Template‐based and free modeling of I‐TASSER and QUARK pipelines using predicted contact maps in CASP12." Proteins: Structure, Function, and Bioinformatics 86: 136-151.
dc.identifier.issn0887-3585
dc.identifier.issn1097-0134
dc.identifier.urihttps://hdl.handle.net/2027.42/142472
dc.description.abstractWe develop two complementary pipelines, “Zhang‐Server” and “QUARK”, based on I‐TASSER and QUARK pipelines for template‐based modeling (TBM) and free modeling (FM), and test them in the CASP12 experiment. The combination of I‐TASSER and QUARK successfully folds three medium‐size FM targets that have more than 150 residues, even though the interplay between the two pipelines still awaits further optimization. Newly developed sequence‐based contact prediction by NeBcon plays a critical role in enhancing the quality of models produced by the new pipelines, particularly for FM targets. The inclusion of NeBcon predicted contacts as restraints in the QUARK simulations results in an average TM‐score of 0.41 for the best in top five predicted models, which is 37% higher than that by the QUARK simulations without contacts. In particular, there are seven targets that are converted from non‐foldable to foldable (TM‐score >0.5) due to the use of contact restraints in the simulations. An additional feature in the current pipelines is the local structure quality prediction by ResQ, which provides a robust residue‐level modeling error estimation. Despite the success, significant challenges remain in ab initio modeling of multi‐domain proteins and folding of β‐proteins with complicated topologies bound by long‐range strand‐strand interactions. Improvements in domain boundary and long‐range contact prediction, as well as optimal use of the predicted contacts and multiple threading alignments, are critical to address these issues seen in the CASP12 experiment.
dc.publisherWiley Periodicals, Inc.
dc.subject.otherthreading
dc.subject.otherprotein structure prediction
dc.subject.othercontact prediction
dc.subject.otherresidue quality estimation
dc.subject.otherCASP12
dc.subject.otherab initio folding
dc.titleTemplate‐based and free modeling of I‐TASSER and QUARK pipelines using predicted contact maps in CASP12
dc.typeArticle
dc.rights.robotsIndexNoFollow
dc.subject.hlbsecondlevelBiological Chemistry
dc.subject.hlbtoplevelScience
dc.description.peerreviewedPeer Reviewed
dc.description.bitstreamurlhttps://deepblue.lib.umich.edu/bitstream/2027.42/142472/1/prot25414_am.pdf
dc.description.bitstreamurlhttps://deepblue.lib.umich.edu/bitstream/2027.42/142472/2/prot25414-sup-0001-suppinfo1.pdf
dc.description.bitstreamurlhttps://deepblue.lib.umich.edu/bitstream/2027.42/142472/3/prot25414.pdf
dc.identifier.doi10.1002/prot.25414
dc.identifier.sourceProteins: Structure, Function, and Bioinformatics
dc.identifier.citedreferenceZhang J, Zhang Y, Fernandez‐Fuentes N. A novel side‐chain orientation dependent potential derived from random‐walk reference state for protein fold selection and structure prediction. PLoS One. 2010; 5 ( 10 ): e15386
dc.identifier.citedreferenceZhang Y, Skolnick J. SPICKER: a clustering approach to identify near‐native protein folds. J Comput Chem. 2004; 25 ( 6 ): 865 – 871.
dc.identifier.citedreferenceXu D, Zhang Y. Improving the physical realism and structural accuracy of protein models by a two‐step atomic‐level energy minimization. Biophys J. 2011; 101 ( 10 ): 2525 – 2534.
dc.identifier.citedreferenceZhang J, Liang Y, Zhang Y. Atomic‐level protein structure refinement using fragment‐guided molecular dynamics conformation sampling. Structure. 2011; 19 ( 12 ): 1784 – 1795.
dc.identifier.citedreferenceZhang Y, Kihara D, Skolnick J. Local energy landscape flattening: parallel hyperbolic Monte Carlo sampling of protein folding. Proteins. 2002; 48 ( 2 ): 192 – 201.
dc.identifier.citedreferenceZhang Y, Kolinski A, Skolnick J. TOUCHSTONE II: a new approach to ab initio protein structure prediction. Biophys J. 2003; 85 ( 2 ): 1145 – 1164.
dc.identifier.citedreferenceWu ST, Skolnick J, Zhang Y. Ab initio modeling of small proteins by iterative TASSER simulations. BMC Biol. 2007; 5 ( 1 ): 17
dc.identifier.citedreferenceZhang Y, Skolnick J. TM‐align: a protein structure alignment algorithm based on the TM‐score. Nucleic Acids Res. 2005; 33 ( 7 ): 2302 – 2309.
dc.identifier.citedreferenceLi YQ, Zhang Y. REMO: a new protocol to refine full atomic protein models from C‐alpha traces by optimizing hydrogen‐bonding networks. Proteins. 2009; 76 ( 3 ): 665 – 674.
dc.identifier.citedreferenceZhang Y. I‐TASSER server for protein 3D structure prediction. BMC Bioinformatics. 2008; 9 ( 1 ): 40
dc.identifier.citedreferenceZhou HY, Skolnick J. GOAP: a generalized orientation‐dependent, all‐atom statistical potential for protein structure prediction. Biophys J. 2011; 101 ( 8 ): 2043 – 2052.
dc.identifier.citedreferenceShen MY, Sali A. Statistical potential for assessment and prediction of protein structures. Protein Sci. 2006; 15 ( 11 ): 2507 – 2524.
dc.identifier.citedreferenceZhang Y, Skolnick J. Scoring function for automated assessment of protein structure template quality. Proteins. 2004; 57 ( 4 ): 702 – 710.
dc.identifier.citedreferenceCheng JL, Baldi P. Improved residue contact prediction using support vector machines and a large feature set. BMC Bioinformatics. 2007; 8 ( 1 ): 113
dc.identifier.citedreferenceSeemayer S, Gruber M, Söding J. CCMpred—fast and precise prediction of protein residue‐residue contacts from correlated mutations. Bioinformatics. 2014; 30 ( 21 ): 3128 – 3130.
dc.identifier.citedreferenceKaján L, Hopf TA, Kalaš M, Marks DS, Rost B. FreeContact: fast and free software for protein contact prediction from residue co‐evolution. BMC Bioinformatics. 2014; 15 ( 1 ): 85
dc.identifier.citedreferenceYang J, Shen H‐B. An ensemble predictor by fusing multiple base predictors composed by both coevolution‐based and machine learning‐based approaches. Abstract of CASP11 experiment. http://www.predictioncenter.org/casp11/doc/CASP11_Abstracts.pdf; 2014. p 209 – 210.
dc.identifier.citedreferenceJones DT, Singh T, Kosciolek T, Tetchner S. MetaPSICOV: combining coevolution methods for accurate prediction of contacts and long range hydrogen bonding in proteins. Bioinformatics. 2015; 31 ( 7 ): 999 – 1006.
dc.identifier.citedreferenceYan RX, Xu D, Yang JY, Walker S, Zhang Y. A comparative assessment and analysis of 20 representative sequence alignment methods for protein structure prediction. Sci Rep. 2013; 3 ( 1 ):
dc.identifier.citedreferenceKryshtafovych A, Barbato A, Monastyrskyy B, Fidelis K, Schwede T, Tramontano A. Methods of model accuracy estimation can help selecting the best models from decoy sets: assessment of model accuracy estimations in CASP11. Proteins. 2016; 84: 349 – 369.
dc.identifier.citedreferenceZhang Y. I‐TASSER: Fully automated protein structure prediction in CASP8. Proteins. 2009; 77 ( S9 ): 100 – 113.
dc.identifier.citedreferenceWang S, Peng J, Ma J, Xu J. Protein secondary structure prediction using deep convolutional neural fields. Sci Rep. 2016; 6 ( 1 ):
dc.identifier.citedreferenceMagnan CN, Baldi P. SSpro/ACCpro 5: almost perfect prediction of protein secondary structure and relative solvent accessibility using profiles, machine learning and structural similarity. Bioinformatics. 2014; 30 ( 18 ): 2592 – 2597.
dc.identifier.citedreferenceTowns J, Cockerill T, Dahan M, et al. XSEDE: accelerating scientific discovery. Comput Sci Eng. 2014; 16 ( 5 ): 62 – 74.
dc.identifier.citedreferenceMariani V, Kiefer F, Schmidt T, Haas J, Schwede T. Assessment of template based protein structure predictions in CASP9. Proteins. 2011; 79 ( S10 ): 37 – 58.
dc.identifier.citedreferenceHuang YJP, Mao BC, Aramini JM, Montelione GT. Assessment of template‐based protein structure predictions in CASP10. Proteins. 2014; 82: 43 – 56.
dc.identifier.citedreferenceModi V, Xu QF, Adhikari S, Dunbrack RL. Assessment of template‐based modeling of protein structure in CASP11. Proteins. 2016; 84: 200 – 220.
dc.identifier.citedreferenceZhang Y. Progress and challenges in protein structure prediction. Curr Opin Struct Biol. 2008; 18 ( 3 ): 342 – 348.
dc.identifier.citedreferenceKinch LN, Li WL, Monastyrskyy B, Kryshtafovych A, Grishin NV. Evaluation of free modeling targets in CASP11 and ROLL. Proteins. 2016; 84: 51 – 66.
dc.identifier.citedreferenceMonastyrskyy B, D’andrea D, Fidelis K, Tramontano A, Kryshtafovych A. New encouraging developments in contact prediction: assessment of the CASP11 results. Proteins. 2016; 84: 131 – 144.
dc.identifier.citedreferenceWu ST, Szilagyi A, Zhang Y. Improving protein structure prediction using multiple sequence‐based contact predictions. Structure. 2011; 19 ( 8 ): 1182 – 1191.
dc.identifier.citedreferenceOvchinnikov S, Kim DE, Wang RYR, Liu Y, DiMaio F, Baker D. Improved de novo structure prediction in CASP11 by incorporating coevolution information into Rosetta. Proteins. 2016; 84: 67 – 75.
dc.identifier.citedreferenceRoy A, Kucukural A, Zhang Y. I‐TASSER: a unified platform for automated protein structure and function prediction. Nat Protoc. 2010; 5 ( 4 ): 725 – 738.
dc.identifier.citedreferenceYang J, Yan R, Roy A, Xu D, Poisson J, Zhang Y. The I‐TASSER Suite: protein structure and function prediction. Nat Methods. 2015; 12 ( 1 ): 7 – 8.
dc.identifier.citedreferenceXu D, Zhang Y. Ab initio protein structure assembly using continuous structure fragments and optimized knowledge‐based force field. Proteins. 2012; 80 ( 7 ): 1715 – 1735.
dc.identifier.citedreferenceXu D, Zhang Y. Toward optimal fragment generations for ab initio protein structure assembly. Proteins. 2013; 81 ( 2 ): 229 – 239.
dc.identifier.citedreferenceZhang Y. Interplay of I‐TASSER and QUARK for template‐based and ab initio protein structure prediction in CASP10. Proteins. 2014; 82: 175 – 187.
dc.identifier.citedreferenceZhang WX, Yang JY, He BJ, et al. Integration of QUARK and I‐TASSER for ab initio protein structure prediction in CASP11. Proteins. 2016; 84: 76 – 86.
dc.identifier.citedreferenceWu ST, Zhang Y. LOMETS: A local meta‐threading‐server for protein structure prediction. Nucleic Acids Res. 2007; 35 ( 10 ): 3375 – 3382.
dc.identifier.citedreferenceYang JY, Zhang WX, He BJ, et al. Template‐based protein structure prediction in CASP11 and retrospect of I‐TASSER in the last decade. Proteins. 2016; 84: 233 – 246.
dc.identifier.citedreferenceKinch L, Shi SY, Cong Q, Cheng H, Liao YX, Grishin NV. CASP9 assessment of free modeling target predictions. Proteins. 2011; 79 ( S10 ): 59 – 73.
dc.identifier.citedreferenceTai CH, Bai HJ, Taylor TJ, Lee B. Assessment of template‐free modeling in CASP10 and ROLL. Proteins. 2014; 82: 57 – 83.
dc.identifier.citedreferenceWeigt M, White RA, Szurmant H, Hoch JA, Hwa T. Identification of direct residue contacts in protein‐protein interaction by message passing. Proc Natl Acad Sci U S A. 2009; 106 ( 1 ): 67 – 72.
dc.identifier.citedreferenceJones DT, Buchan DWA, Cozzetto D, Pontil M. PSICOV: precise structural contact prediction using sparse inverse covariance estimation on large multiple sequence alignments. Bioinformatics. 2012; 28 ( 2 ): 184 – 190.
dc.identifier.citedreferenceKamisetty H, Ovchinnikov S, Baker D. Assessing the utility of coevolution‐based residue‐residue contact predictions in a sequence‐ and structure‐rich era. Proc Natl Acad Sci U S A. 2013; 110 ( 39 ): 15674 – 15679.
dc.identifier.citedreferenceCheng JL, Baldi P. Three‐stage prediction of protein beta‐sheets by neural networks, alignments and graph algorithms. Bioinformatics. 2005; 21 ( Suppl 1 ): I75 – I84.
dc.identifier.citedreferenceWu S, Zhang Y. A comprehensive assessment of sequence‐based and template‐based methods for protein contact prediction. Bioinformatics. 2008; 24 ( 7 ): 924 – 931.
dc.identifier.citedreferenceKosciolek T, Jones DT, Deane CM. De novo structure prediction of globular proteins aided by sequence variation‐derived contacts. PLoS One. 2014; 9 ( 3 ): e92197
dc.identifier.citedreferenceHe B, Mortuza SM, Wang Y, Shen HB, Zhang Y. NeBcon: protein contact map prediction using neural network training coupled with naive Bayes classifiers. Bioinformatics. 2017; 33 ( 15 ): 2296 – 2306.
dc.identifier.citedreferenceZhang Y. Protein structure prediction: when is it useful?. Curr Opin Struct Biol. 2009; 19 ( 2 ): 145 – 155.
dc.identifier.citedreferenceYang J, Wang Y, Zhang Y. ResQ: an approach to unified estimation of B‐factor and residue‐specific error in protein structure prediction. J Mol Biol. 2016; 428 ( 4 ): 693 – 701.
dc.identifier.citedreferenceXue Z, Xu D, Wang Y, Zhang Y. ThreaDom: extracting protein domain boundary information from multiple threading alignments. Bioinformatics. 2013; 29 ( 13 ): i247 – i256.
dc.owningcollnameInterdisciplinary and Peer-Reviewed

