Template‐based and free modeling of I‐TASSER and QUARK pipelines using predicted contact maps in CASP12

Zhang, Chengxin; Mortuza, S. M.; He, Baoji; Wang, Yanting; Zhang, Yang

dc.contributor.author	Zhang, Chengxin
dc.contributor.author	Mortuza, S. M.
dc.contributor.author	He, Baoji
dc.contributor.author	Wang, Yanting
dc.contributor.author	Zhang, Yang
dc.date.accessioned	2018-03-07T18:24:25Z
dc.date.available	2019-05-13T14:45:24Z	en
dc.date.issued	2018-03
dc.identifier.citation	Zhang, Chengxin; Mortuza, S. M.; He, Baoji; Wang, Yanting; Zhang, Yang (2018). "Template‐based and free modeling of I‐TASSER and QUARK pipelines using predicted contact maps in CASP12." Proteins: Structure, Function, and Bioinformatics 86: 136-151.
dc.identifier.issn	0887-3585
dc.identifier.issn	1097-0134
dc.identifier.uri	https://hdl.handle.net/2027.42/142472
dc.description.abstract	We develop two complementary pipelines, “Zhang‐Server” and “QUARK”, based on the I‐TASSER and QUARK pipelines for template‐based modeling (TBM) and free modeling (FM), and test them in the CASP12 experiment. The combination of I‐TASSER and QUARK successfully folds three medium‐size FM targets that have more than 150 residues, even though the interplay between the two pipelines still awaits further optimization. Newly developed sequence‐based contact prediction by NeBcon plays a critical role in enhancing the quality of the models produced by the new pipelines, particularly for FM targets. The inclusion of NeBcon‐predicted contacts as restraints in the QUARK simulations results in an average TM‐score of 0.41 for the best of the top five predicted models, which is 37% higher than that of the QUARK simulations without contacts. In particular, seven targets are converted from non‐foldable to foldable (TM‐score >0.5) by the use of contact restraints in the simulations. An additional feature of the current pipelines is the local structure quality prediction by ResQ, which provides a robust residue‐level estimate of modeling error. Despite this success, significant challenges remain in ab initio modeling of multi‐domain proteins and in folding β‐proteins with complicated topologies bound by long‐range strand‐strand interactions. Improvements in domain boundary and long‐range contact prediction, as well as optimal use of the predicted contacts and multiple threading alignments, are critical to addressing these issues seen in the CASP12 experiment.
dc.publisher	Wiley Periodicals, Inc.
dc.subject.other	threading
dc.subject.other	protein structure prediction
dc.subject.other	contact prediction
dc.subject.other	residue quality estimation
dc.subject.other	CASP12
dc.subject.other	ab initio folding
dc.title	Template‐based and free modeling of I‐TASSER and QUARK pipelines using predicted contact maps in CASP12
dc.type	Article	en_US
dc.rights.robots	IndexNoFollow
dc.subject.hlbsecondlevel	Biological Chemistry
dc.subject.hlbtoplevel	Science
dc.description.peerreviewed	Peer Reviewed
dc.description.bitstreamurl	https://deepblue.lib.umich.edu/bitstream/2027.42/142472/1/prot25414_am.pdf
dc.description.bitstreamurl	https://deepblue.lib.umich.edu/bitstream/2027.42/142472/2/prot25414-sup-0001-suppinfo1.pdf
dc.description.bitstreamurl	https://deepblue.lib.umich.edu/bitstream/2027.42/142472/3/prot25414.pdf
dc.identifier.doi	10.1002/prot.25414
dc.identifier.source	Proteins: Structure, Function, and Bioinformatics
dc.identifier.citedreference	Zhang J, Zhang Y, Fernandez‐Fuentes N. A novel side‐chain orientation dependent potential derived from random‐walk reference state for protein fold selection and structure prediction. PLoS One. 2010; 5 ( 10 ): e15386
dc.identifier.citedreference	Zhang Y, Skolnick J. SPICKER: a clustering approach to identify near‐native protein folds. J Comput Chem. 2004; 25 ( 6 ): 865 – 871.
dc.identifier.citedreference	Xu D, Zhang Y. Improving the physical realism and structural accuracy of protein models by a two‐step atomic‐level energy minimization. Biophys J. 2011; 101 ( 10 ): 2525 – 2534.
dc.identifier.citedreference	Zhang J, Liang Y, Zhang Y. Atomic‐level protein structure refinement using fragment‐guided molecular dynamics conformation sampling. Structure. 2011; 19 ( 12 ): 1784 – 1795.
dc.identifier.citedreference	Zhang Y, Kihara D, Skolnick J. Local energy landscape flattening: parallel hyperbolic Monte Carlo sampling of protein folding. Proteins. 2002; 48 ( 2 ): 192 – 201.
dc.identifier.citedreference	Zhang Y, Kolinski A, Skolnick J. TOUCHSTONE II: a new approach to ab initio protein structure prediction. Biophys J. 2003; 85 ( 2 ): 1145 – 1164.
dc.identifier.citedreference	Wu ST, Skolnick J, Zhang Y. Ab initio modeling of small proteins by iterative TASSER simulations. BMC Biol. 2007; 5 ( 1 ): 17
dc.identifier.citedreference	Zhang Y, Skolnick J. TM‐align: a protein structure alignment algorithm based on the TM‐score. Nucleic Acids Res. 2005; 33 ( 7 ): 2302 – 2309.
dc.identifier.citedreference	Li YQ, Zhang Y. REMO: a new protocol to refine full atomic protein models from C‐alpha traces by optimizing hydrogen‐bonding networks. Proteins. 2009; 76 ( 3 ): 665 – 674.
dc.identifier.citedreference	Zhang Y. I‐TASSER server for protein 3D structure prediction. BMC Bioinformatics. 2008; 9 ( 1 ): 40
dc.identifier.citedreference	Zhou HY, Skolnick J. GOAP: a generalized orientation‐dependent, all‐atom statistical potential for protein structure prediction. Biophys J. 2011; 101 ( 8 ): 2043 – 2052.
dc.identifier.citedreference	Shen MY, Sali A. Statistical potential for assessment and prediction of protein structures. Protein Sci. 2006; 15 ( 11 ): 2507 – 2524.
dc.identifier.citedreference	Zhang Y, Skolnick J. Scoring function for automated assessment of protein structure template quality. Proteins. 2004; 57 ( 4 ): 702 – 710.
dc.identifier.citedreference	Cheng JL, Baldi P. Improved residue contact prediction using support vector machines and a large feature set. BMC Bioinformatics. 2007; 8 ( 1 ): 113
dc.identifier.citedreference	Seemayer S, Gruber M, Söding J. CCMpred—fast and precise prediction of protein residue‐residue contacts from correlated mutations. Bioinformatics. 2014; 30 ( 21 ): 3128 – 3130.
dc.identifier.citedreference	Kaján L, Hopf TA, Kalaš M, Marks DS, Rost B. FreeContact: fast and free software for protein contact prediction from residue co‐evolution. BMC Bioinformatics. 2014; 15 ( 1 ): 85
dc.identifier.citedreference	Yang J, Shen H‐B. An ensemble predictor by fusing multiple base predictors composed by both coevolution‐based and machine learning‐based approaches. Abstract of CASP11 experiment. http://www.predictioncenter.org/casp11/doc/CASP11_Abstracts.pdf; 2014. p 209 – 210.
dc.identifier.citedreference	Jones DT, Singh T, Kosciolek T, Tetchner S. MetaPSICOV: combining coevolution methods for accurate prediction of contacts and long range hydrogen bonding in proteins. Bioinformatics. 2015; 31 ( 7 ): 999 – 1006.
dc.identifier.citedreference	Yan RX, Xu D, Yang JY, Walker S, Zhang Y. A comparative assessment and analysis of 20 representative sequence alignment methods for protein structure prediction. Sci Rep. 2013; 3 ( 1 ):
dc.identifier.citedreference	Kryshtafovych A, Barbato A, Monastyrskyy B, Fidelis K, Schwede T, Tramontano A. Methods of model accuracy estimation can help selecting the best models from decoy sets: assessment of model accuracy estimations in CASP11. Proteins. 2016; 84: 349 – 369.
dc.identifier.citedreference	Zhang Y. I‐TASSER: Fully automated protein structure prediction in CASP8. Proteins. 2009; 77 ( S9 ): 100 – 113.
dc.identifier.citedreference	Wang S, Peng J, Ma J, Xu J. Protein secondary structure prediction using deep convolutional neural fields. Sci Rep. 2016; 6 ( 1 ):
dc.identifier.citedreference	Magnan CN, Baldi P. SSpro/ACCpro 5: almost perfect prediction of protein secondary structure and relative solvent accessibility using profiles, machine learning and structural similarity. Bioinformatics. 2014; 30 ( 18 ): 2592 – 2597.
dc.identifier.citedreference	Towns J, Cockerill T, Dahan M, et al. XSEDE: accelerating scientific discovery. Comput Sci Eng. 2014; 16 ( 5 ): 62 – 74.
dc.identifier.citedreference	Mariani V, Kiefer F, Schmidt T, Haas J, Schwede T. Assessment of template based protein structure predictions in CASP9. Proteins. 2011; 79 ( S10 ): 37 – 58.
dc.identifier.citedreference	Huang YJP, Mao BC, Aramini JM, Montelione GT. Assessment of template‐based protein structure predictions in CASP10. Proteins. 2014; 82: 43 – 56.
dc.identifier.citedreference	Modi V, Xu QF, Adhikari S, Dunbrack RL. Assessment of template‐based modeling of protein structure in CASP11. Proteins. 2016; 84: 200 – 220.
dc.identifier.citedreference	Zhang Y. Progress and challenges in protein structure prediction. Curr Opin Struct Biol. 2008; 18 ( 3 ): 342 – 348.
dc.identifier.citedreference	Kinch LN, Li WL, Monastyrskyy B, Kryshtafovych A, Grishin NV. Evaluation of free modeling targets in CASP11 and ROLL. Proteins. 2016; 84: 51 – 66.
dc.identifier.citedreference	Monastyrskyy B, D’andrea D, Fidelis K, Tramontano A, Kryshtafovych A. New encouraging developments in contact prediction: assessment of the CASP11 results. Proteins. 2016; 84: 131 – 144.
dc.identifier.citedreference	Wu ST, Szilagyi A, Zhang Y. Improving protein structure prediction using multiple sequence‐based contact predictions. Structure. 2011; 19 ( 8 ): 1182 – 1191.
dc.identifier.citedreference	Ovchinnikov S, Kim DE, Wang RYR, Liu Y, DiMaio F, Baker D. Improved de novo structure prediction in CASP11 by incorporating coevolution information into Rosetta. Proteins. 2016; 84: 67 – 75.
dc.identifier.citedreference	Roy A, Kucukural A, Zhang Y. I‐TASSER: a unified platform for automated protein structure and function prediction. Nat Protoc. 2010; 5 ( 4 ): 725 – 738.
dc.identifier.citedreference	Yang J, Yan R, Roy A, Xu D, Poisson J, Zhang Y. The I‐TASSER Suite: protein structure and function prediction. Nat Methods. 2015; 12 ( 1 ): 7 – 8.
dc.identifier.citedreference	Xu D, Zhang Y. Ab initio protein structure assembly using continuous structure fragments and optimized knowledge‐based force field. Proteins. 2012; 80 ( 7 ): 1715 – 1735.
dc.identifier.citedreference	Xu D, Zhang Y. Toward optimal fragment generations for ab initio protein structure assembly. Proteins. 2013; 81 ( 2 ): 229 – 239.
dc.identifier.citedreference	Zhang Y. Interplay of I‐TASSER and QUARK for template‐based and ab initio protein structure prediction in CASP10. Proteins. 2014; 82: 175 – 187.
dc.identifier.citedreference	Zhang WX, Yang JY, He BJ, et al. Integration of QUARK and I‐TASSER for ab initio protein structure prediction in CASP11. Proteins. 2016; 84: 76 – 86.
dc.identifier.citedreference	Wu ST, Zhang Y. LOMETS: A local meta‐threading‐server for protein structure prediction. Nucleic Acids Res. 2007; 35 ( 10 ): 3375 – 3382.
dc.identifier.citedreference	Yang JY, Zhang WX, He BJ, et al. Template‐based protein structure prediction in CASP11 and retrospect of I‐TASSER in the last decade. Proteins. 2016; 84: 233 – 246.
dc.identifier.citedreference	Kinch L, Shi SY, Cong Q, Cheng H, Liao YX, Grishin NV. CASP9 assessment of free modeling target predictions. Proteins. 2011; 79 ( S10 ): 59 – 73.
dc.identifier.citedreference	Tai CH, Bai HJ, Taylor TJ, Lee B. Assessment of template‐free modeling in CASP10 and ROLL. Proteins. 2014; 82: 57 – 83.
dc.identifier.citedreference	Weigt M, White RA, Szurmant H, Hoch JA, Hwa T. Identification of direct residue contacts in protein‐protein interaction by message passing. Proc Natl Acad Sci U S A. 2009; 106 ( 1 ): 67 – 72.
dc.identifier.citedreference	Jones DT, Buchan DWA, Cozzetto D, Pontil M. PSICOV: precise structural contact prediction using sparse inverse covariance estimation on large multiple sequence alignments. Bioinformatics. 2012; 28 ( 2 ): 184 – 190.
dc.identifier.citedreference	Kamisetty H, Ovchinnikov S, Baker D. Assessing the utility of coevolution‐based residue‐residue contact predictions in a sequence‐ and structure‐rich era. Proc Natl Acad Sci U S A. 2013; 110 ( 39 ): 15674 – 15679.
dc.identifier.citedreference	Cheng JL, Baldi P. Three‐stage prediction of protein beta‐sheets by neural networks, alignments and graph algorithms. Bioinformatics. 2005; 21 ( Suppl 1 ): I75 – I84.
dc.identifier.citedreference	Wu S, Zhang Y. A comprehensive assessment of sequence‐based and template‐based methods for protein contact prediction. Bioinformatics. 2008; 24 ( 7 ): 924 – 931.
dc.identifier.citedreference	Kosciolek T, Jones DT, Deane CM. De novo structure prediction of globular proteins aided by sequence variation‐derived contacts. PLoS One. 2014; 9 ( 3 ): e92197
dc.identifier.citedreference	He B, Mortuza SM, Wang Y, Shen HB, Zhang Y. NeBcon: protein contact map prediction using neural network training coupled with naive Bayes classifiers. Bioinformatics. 2017; 33 ( 15 ): 2296 – 2306.
dc.identifier.citedreference	Zhang Y. Protein structure prediction: when is it useful?. Curr Opin Struct Biol. 2009; 19 ( 2 ): 145 – 155.
dc.identifier.citedreference	Yang J, Wang Y, Zhang Y. ResQ: an approach to unified estimation of B‐factor and residue‐specific error in protein structure prediction. J Mol Biol. 2016; 428 ( 4 ): 693 – 701.
dc.identifier.citedreference	Xue Z, Xu D, Wang Y, Zhang Y. ThreaDom: extracting protein domain boundary information from multiple threading alignments. Bioinformatics. 2013; 29 ( 13 ): i247 – i256.
dc.owningcollname	Interdisciplinary and Peer-Reviewed
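
The abstract reports results as TM‐scores and treats a target as foldable when TM‐score >0.5. As a rough illustration only, the sketch below evaluates the standard TM‐score sum for one fixed superposition and works out the without‐contacts average implied by the "37% higher" figure. The function names are hypothetical, the 0.5 Å floor on d0 is an assumption, the search over superpositions that the full TM‐score performs is omitted, and the implied baseline is derived arithmetic, not a number stated in the record.

```python
def tm_score_d0(target_length: int) -> float:
    """Length-dependent distance scale d0 (in Angstroms) used by TM-score."""
    if target_length <= 15:
        # Assumption: fall back to a small floor for very short chains.
        return 0.5
    d0 = 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8
    return max(0.5, d0)  # Assumption: floor d0 at 0.5 Angstroms.


def tm_score(distances, target_length: int) -> float:
    """TM-score sum for ONE given superposition.

    `distances` holds the distance (Angstroms) between each aligned
    residue pair. The full TM-score maximizes this value over
    superpositions; that search is not done here.
    """
    d0 = tm_score_d0(target_length)
    total = sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances)
    return total / target_length


def is_foldable(score: float) -> bool:
    """Foldable criterion quoted in the abstract: TM-score > 0.5."""
    return score > 0.5


def implied_baseline(with_contacts: float, relative_gain: float) -> float:
    """Baseline x such that with_contacts = x * (1 + relative_gain)."""
    return with_contacts / (1.0 + relative_gain)


if __name__ == "__main__":
    # Hypothetical 100-residue target with every residue 2.0 Angstroms off.
    score = tm_score([2.0] * 100, 100)
    print(f"TM-score: {score:.3f}, foldable: {is_foldable(score)}")
    # Average TM-score 0.41 described as 37% higher than without contacts.
    print(f"Implied baseline: {implied_baseline(0.41, 0.37):.2f}")
```

Under these assumptions the implied average without contact restraints is about 0.30, which is consistent with the abstract's statement that contacts moved several targets across the 0.5 threshold.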

Files in this item

Name: prot25414_am.pdf
Size: 2.939MB
Format: PDF

Name: prot25414-sup-0001-suppinfo1.pdf
Size: 75.35KB
Format: PDF

Name: prot25414.pdf
Size: 2.359MB
Format: PDF
