Template‐based and free modeling of I‐TASSER and QUARK pipelines using predicted contact maps in CASP12
dc.contributor.author | Zhang, Chengxin | |
dc.contributor.author | Mortuza, S. M. | |
dc.contributor.author | He, Baoji | |
dc.contributor.author | Wang, Yanting | |
dc.contributor.author | Zhang, Yang | |
dc.date.accessioned | 2018-03-07T18:24:25Z | |
dc.date.available | 2019-05-13T14:45:24Z | en |
dc.date.issued | 2018-03 | |
dc.identifier.citation | Zhang, Chengxin; Mortuza, S. M.; He, Baoji; Wang, Yanting; Zhang, Yang (2018). "Template‐based and free modeling of I‐TASSER and QUARK pipelines using predicted contact maps in CASP12." Proteins: Structure, Function, and Bioinformatics 86: 136-151. | |
dc.identifier.issn | 0887-3585 | |
dc.identifier.issn | 1097-0134 | |
dc.identifier.uri | https://hdl.handle.net/2027.42/142472 | |
dc.description.abstract | We develop two complementary pipelines, “Zhang‐Server” and “QUARK”, based on I‐TASSER and QUARK pipelines for template‐based modeling (TBM) and free modeling (FM), and test them in the CASP12 experiment. The combination of I‐TASSER and QUARK successfully folds three medium‐size FM targets that have more than 150 residues, even though the interplay between the two pipelines still awaits further optimization. Newly developed sequence‐based contact prediction by NeBcon plays a critical role in enhancing the quality of models generated by the new pipelines, particularly for FM targets. The inclusion of NeBcon‐predicted contacts as restraints in the QUARK simulations results in an average TM‐score of 0.41 for the best of the top five predicted models, which is 37% higher than that of the QUARK simulations without contacts. In particular, seven targets are converted from non‐foldable to foldable (TM‐score >0.5) by the use of contact restraints in the simulations. Another new feature of the current pipelines is the local structure quality prediction by ResQ, which provides a robust residue‐level estimation of modeling error. Despite this success, significant challenges remain in ab initio modeling of multi‐domain proteins and in folding of β‐proteins with complicated topologies bound by long‐range strand‐strand interactions. Improvements in domain boundary and long‐range contact prediction, as well as optimal use of the predicted contacts and multiple threading alignments, are critical to addressing these issues seen in the CASP12 experiment. | |
dc.publisher | Wiley Periodicals, Inc. | |
dc.subject.other | threading | |
dc.subject.other | protein structure prediction | |
dc.subject.other | contact prediction | |
dc.subject.other | residue quality estimation | |
dc.subject.other | CASP12 | |
dc.subject.other | ab initio folding | |
dc.title | Template‐based and free modeling of I‐TASSER and QUARK pipelines using predicted contact maps in CASP12 | |
dc.type | Article | en_US |
dc.rights.robots | IndexNoFollow | |
dc.subject.hlbsecondlevel | Biological Chemistry | |
dc.subject.hlbtoplevel | Science | |
dc.description.peerreviewed | Peer Reviewed | |
dc.description.bitstreamurl | https://deepblue.lib.umich.edu/bitstream/2027.42/142472/1/prot25414_am.pdf | |
dc.description.bitstreamurl | https://deepblue.lib.umich.edu/bitstream/2027.42/142472/2/prot25414-sup-0001-suppinfo1.pdf | |
dc.description.bitstreamurl | https://deepblue.lib.umich.edu/bitstream/2027.42/142472/3/prot25414.pdf | |
dc.identifier.doi | 10.1002/prot.25414 | |
dc.identifier.source | Proteins: Structure, Function, and Bioinformatics | |
dc.identifier.citedreference | Zhang J, Zhang Y, Fernandez‐Fuentes N. A novel side‐chain orientation dependent potential derived from random‐walk reference state for protein fold selection and structure prediction. PLoS One. 2010; 5 ( 10 ): e15386 | |
dc.identifier.citedreference | Zhang Y, Skolnick J. SPICKER: a clustering approach to identify near‐native protein folds. J Comput Chem. 2004; 25 ( 6 ): 865 – 871. | |
dc.identifier.citedreference | Xu D, Zhang Y. Improving the physical realism and structural accuracy of protein models by a two‐step atomic‐level energy minimization. Biophys J. 2011; 101 ( 10 ): 2525 – 2534. | |
dc.identifier.citedreference | Zhang J, Liang Y, Zhang Y. Atomic‐level protein structure refinement using fragment‐guided molecular dynamics conformation sampling. Structure. 2011; 19 ( 12 ): 1784 – 1795. | |
dc.identifier.citedreference | Zhang Y, Kihara D, Skolnick J. Local energy landscape flattening: parallel hyperbolic Monte Carlo sampling of protein folding. Proteins. 2002; 48 ( 2 ): 192 – 201. | |
dc.identifier.citedreference | Zhang Y, Kolinski A, Skolnick J. TOUCHSTONE II: a new approach to ab initio protein structure prediction. Biophys J. 2003; 85 ( 2 ): 1145 – 1164. | |
dc.identifier.citedreference | Wu ST, Skolnick J, Zhang Y. Ab initio modeling of small proteins by iterative TASSER simulations. BMC Biol. 2007; 5 ( 1 ): 17 | |
dc.identifier.citedreference | Zhang Y, Skolnick J. TM‐align: a protein structure alignment algorithm based on the TM‐score. Nucleic Acids Res. 2005; 33 ( 7 ): 2302 – 2309. | |
dc.identifier.citedreference | Li YQ, Zhang Y. REMO: a new protocol to refine full atomic protein models from C‐alpha traces by optimizing hydrogen‐bonding networks. Proteins. 2009; 76 ( 3 ): 665 – 674. | |
dc.identifier.citedreference | Zhang Y. I‐TASSER server for protein 3D structure prediction. BMC Bioinformatics. 2008; 9 ( 1 ): 40 | |
dc.identifier.citedreference | Zhou HY, Skolnick J. GOAP: a generalized orientation‐dependent, all‐atom statistical potential for protein structure prediction. Biophys J. 2011; 101 ( 8 ): 2043 – 2052. | |
dc.identifier.citedreference | Shen MY, Sali A. Statistical potential for assessment and prediction of protein structures. Protein Sci. 2006; 15 ( 11 ): 2507 – 2524. | |
dc.identifier.citedreference | Zhang Y, Skolnick J. Scoring function for automated assessment of protein structure template quality. Proteins. 2004; 57 ( 4 ): 702 – 710. | |
dc.identifier.citedreference | Cheng JL, Baldi P. Improved residue contact prediction using support vector machines and a large feature set. BMC Bioinformatics. 2007; 8 ( 1 ): 113 | |
dc.identifier.citedreference | Seemayer S, Gruber M, Söding J. CCMpred—fast and precise prediction of protein residue‐residue contacts from correlated mutations. Bioinformatics. 2014; 30 ( 21 ): 3128 – 3130. | |
dc.identifier.citedreference | Kaján L, Hopf TA, Kalaš M, Marks DS, Rost B. FreeContact: fast and free software for protein contact prediction from residue co‐evolution. BMC Bioinformatics. 2014; 15 ( 1 ): 85 | |
dc.identifier.citedreference | Yang J, Shen H‐B. An ensemble predictor by fusing multiple base predictors composed by both coevolution‐based and machine learning‐based approaches. Abstract of CASP11 experiment. http://www.predictioncenter.org/casp11/doc/CASP11_Abstracts.pdf; 2014. p 209 – 210. | |
dc.identifier.citedreference | Jones DT, Singh T, Kosciolek T, Tetchner S. MetaPSICOV: combining coevolution methods for accurate prediction of contacts and long range hydrogen bonding in proteins. Bioinformatics. 2015; 31 ( 7 ): 999 – 1006. | |
dc.identifier.citedreference | Yan RX, Xu D, Yang JY, Walker S, Zhang Y. A comparative assessment and analysis of 20 representative sequence alignment methods for protein structure prediction. Sci Rep. 2013; 3 ( 1 ): | |
dc.identifier.citedreference | Kryshtafovych A, Barbato A, Monastyrskyy B, Fidelis K, Schwede T, Tramontano A. Methods of model accuracy estimation can help selecting the best models from decoy sets: assessment of model accuracy estimations in CASP11. Proteins. 2016; 84: 349 – 369. | |
dc.identifier.citedreference | Zhang Y. I‐TASSER: Fully automated protein structure prediction in CASP8. Proteins. 2009; 77 ( S9 ): 100 – 113. | |
dc.identifier.citedreference | Wang S, Peng J, Ma J, Xu J. Protein secondary structure prediction using deep convolutional neural fields. Sci Rep. 2016; 6 ( 1 ): | |
dc.identifier.citedreference | Magnan CN, Baldi P. SSpro/ACCpro 5: almost perfect prediction of protein secondary structure and relative solvent accessibility using profiles, machine learning and structural similarity. Bioinformatics. 2014; 30 ( 18 ): 2592 – 2597. | |
dc.identifier.citedreference | Towns J, Cockerill T, Dahan M, et al. XSEDE: accelerating scientific discovery. Comput Sci Eng. 2014; 16 ( 5 ): 62 – 74. | |
dc.identifier.citedreference | Mariani V, Kiefer F, Schmidt T, Haas J, Schwede T. Assessment of template based protein structure predictions in CASP9. Proteins. 2011; 79 ( S10 ): 37 – 58. | |
dc.identifier.citedreference | Huang YJP, Mao BC, Aramini JM, Montelione GT. Assessment of template‐based protein structure predictions in CASP10. Proteins. 2014; 82: 43 – 56. | |
dc.identifier.citedreference | Modi V, Xu QF, Adhikari S, Dunbrack RL. Assessment of template‐based modeling of protein structure in CASP11. Proteins. 2016; 84: 200 – 220. | |
dc.identifier.citedreference | Zhang Y. Progress and challenges in protein structure prediction. Curr Opin Struct Biol. 2008; 18 ( 3 ): 342 – 348. | |
dc.identifier.citedreference | Kinch LN, Li WL, Monastyrskyy B, Kryshtafovych A, Grishin NV. Evaluation of free modeling targets in CASP11 and ROLL. Proteins. 2016; 84: 51 – 66. | |
dc.identifier.citedreference | Monastyrskyy B, D’andrea D, Fidelis K, Tramontano A, Kryshtafovych A. New encouraging developments in contact prediction: assessment of the CASP11 results. Proteins. 2016; 84: 131 – 144. | |
dc.identifier.citedreference | Wu ST, Szilagyi A, Zhang Y. Improving protein structure prediction using multiple sequence‐based contact predictions. Structure. 2011; 19 ( 8 ): 1182 – 1191. | |
dc.identifier.citedreference | Ovchinnikov S, Kim DE, Wang RYR, Liu Y, DiMaio F, Baker D. Improved de novo structure prediction in CASP11 by incorporating coevolution information into Rosetta. Proteins. 2016; 84: 67 – 75. | |
dc.identifier.citedreference | Roy A, Kucukural A, Zhang Y. I‐TASSER: a unified platform for automated protein structure and function prediction. Nat Protoc. 2010; 5 ( 4 ): 725 – 738. | |
dc.identifier.citedreference | Yang J, Yan R, Roy A, Xu D, Poisson J, Zhang Y. The I‐TASSER Suite: protein structure and function prediction. Nat Methods. 2015; 12 ( 1 ): 7 – 8. | |
dc.identifier.citedreference | Xu D, Zhang Y. Ab initio protein structure assembly using continuous structure fragments and optimized knowledge‐based force field. Proteins. 2012; 80 ( 7 ): 1715 – 1735. | |
dc.identifier.citedreference | Xu D, Zhang Y. Toward optimal fragment generations for ab initio protein structure assembly. Proteins. 2013; 81 ( 2 ): 229 – 239. | |
dc.identifier.citedreference | Zhang Y. Interplay of I‐TASSER and QUARK for template‐based and ab initio protein structure prediction in CASP10. Proteins. 2014; 82: 175 – 187. | |
dc.identifier.citedreference | Zhang WX, Yang JY, He BJ, et al. Integration of QUARK and I‐TASSER for ab initio protein structure prediction in CASP11. Proteins. 2016; 84: 76 – 86. | |
dc.identifier.citedreference | Wu ST, Zhang Y. LOMETS: A local meta‐threading‐server for protein structure prediction. Nucleic Acids Res. 2007; 35 ( 10 ): 3375 – 3382. | |
dc.identifier.citedreference | Yang JY, Zhang WX, He BJ, et al. Template‐based protein structure prediction in CASP11 and retrospect of I‐TASSER in the last decade. Proteins. 2016; 84: 233 – 246. | |
dc.identifier.citedreference | Kinch L, Shi SY, Cong Q, Cheng H, Liao YX, Grishin NV. CASP9 assessment of free modeling target predictions. Proteins. 2011; 79 ( S10 ): 59 – 73. | |
dc.identifier.citedreference | Tai CH, Bai HJ, Taylor TJ, Lee B. Assessment of template‐free modeling in CASP10 and ROLL. Proteins. 2014; 82: 57 – 83. | |
dc.identifier.citedreference | Weigt M, White RA, Szurmant H, Hoch JA, Hwa T. Identification of direct residue contacts in protein‐protein interaction by message passing. Proc Natl Acad Sci U S A. 2009; 106 ( 1 ): 67 – 72. | |
dc.identifier.citedreference | Jones DT, Buchan DWA, Cozzetto D, Pontil M. PSICOV: precise structural contact prediction using sparse inverse covariance estimation on large multiple sequence alignments. Bioinformatics. 2012; 28 ( 2 ): 184 – 190. | |
dc.identifier.citedreference | Kamisetty H, Ovchinnikov S, Baker D. Assessing the utility of coevolution‐based residue‐residue contact predictions in a sequence‐ and structure‐rich era. Proc Natl Acad Sci U S A. 2013; 110 ( 39 ): 15674 – 15679. | |
dc.identifier.citedreference | Cheng JL, Baldi P. Three‐stage prediction of protein beta‐sheets by neural networks, alignments and graph algorithms. Bioinformatics. 2005; 21 ( Suppl 1 ): I75 – I84. | |
dc.identifier.citedreference | Wu S, Zhang Y. A comprehensive assessment of sequence‐based and template‐based methods for protein contact prediction. Bioinformatics. 2008; 24 ( 7 ): 924 – 931. | |
dc.identifier.citedreference | Kosciolek T, Jones DT, Deane CM. De novo structure prediction of globular proteins aided by sequence variation‐derived contacts. PLoS One. 2014; 9 ( 3 ): e92197 | |
dc.identifier.citedreference | He B, Mortuza SM, Wang Y, Shen HB, Zhang Y. NeBcon: protein contact map prediction using neural network training coupled with naive Bayes classifiers. Bioinformatics. 2017; 33 ( 15 ): 2296 – 2306. | |
dc.identifier.citedreference | Zhang Y. Protein structure prediction: when is it useful?. Curr Opin Struct Biol. 2009; 19 ( 2 ): 145 – 155. | |
dc.identifier.citedreference | Yang J, Wang Y, Zhang Y. ResQ: an approach to unified estimation of B‐factor and residue‐specific error in protein structure prediction. J Mol Biol. 2016; 428 ( 4 ): 693 – 701. | |
dc.identifier.citedreference | Xue Z, Xu D, Wang Y, Zhang Y. ThreaDom: extracting protein domain boundary information from multiple threading alignments. Bioinformatics. 2013; 29 ( 13 ): i247 – i256. | |
dc.owningcollname | Interdisciplinary and Peer-Reviewed |