Overcoming sequence misalignments with weighted structural superposition

dc.contributor.authorKhazanov, Nickolay A.en_US
dc.contributor.authorDamm‐Ganamet, Kelly L.en_US
dc.contributor.authorQuang, Daniel X.en_US
dc.contributor.authorCarlson, Heather A.en_US
dc.date.accessioned2012-11-07T17:04:39Z
dc.date.available2014-01-07T14:51:08Zen_US
dc.date.issued2012-11en_US
dc.identifier.citationKhazanov, Nickolay A.; Damm‐Ganamet, Kelly L.; Quang, Daniel X.; Carlson, Heather A. (2012). "Overcoming sequence misalignments with weighted structural superposition." Proteins: Structure, Function, and Bioinformatics 80(11): 2523-2535. <http://hdl.handle.net/2027.42/94274>en_US
dc.identifier.issn0887-3585en_US
dc.identifier.issn1097-0134en_US
dc.identifier.urihttps://hdl.handle.net/2027.42/94274
dc.description.abstractAn appropriate structural superposition identifies similarities and differences between homologous proteins that are not evident from sequence alignments alone. We have coupled our Gaussian‐weighted RMSD (wRMSD) tool with a sequence aligner and seed extension (SE) algorithm to create a robust technique for overlaying structures and aligning sequences of homologous proteins (HwRMSD). HwRMSD overcomes errors in the initial sequence alignment that would normally propagate into a standard RMSD overlay. SE can generate a corrected sequence alignment from the improved structural superposition obtained by wRMSD. HwRMSD's robust performance and its superiority over standard RMSD are demonstrated over a range of homologous proteins. Its better overlay results in corrected sequence alignments with good agreement to HOMSTRAD. Finally, HwRMSD is compared to established structural alignment methods: FATCAT, secondary‐structure matching, combinatorial extension, and Dalilite. Most methods are comparable at placing residue pairs within 2 Å, but HwRMSD places many more residue pairs within 1 Å, providing a clear advantage. Such high accuracy is essential in drug design, where small distances can have a large impact on computational predictions. This level of accuracy is also needed to correct sequence alignments in an automated fashion, especially for omics‐scale analysis. HwRMSD can align homologs with low‐sequence identity and large conformational differences, cases where both sequence‐based and structural‐based methods may fail. The HwRMSD pipeline overcomes the dependency of structural overlays on initial sequence pairing and removes the need to determine the best sequence‐alignment method, substitution matrix, and gap parameters for each unique pair of homologs. Proteins 2012. © 2012 Wiley Periodicals, Inc.en_US
dc.publisherWiley Subscription Services, Inc., A Wiley Companyen_US
dc.subject.otherProtein Flexibilityen_US
dc.subject.otherStructure Overlayen_US
dc.subject.otherRMSDen_US
dc.subject.otherStructure Alignmenten_US
dc.subject.otherHomologen_US
dc.subject.otherSequence Alignmenten_US
dc.titleOvercoming sequence misalignments with weighted structural superpositionen_US
dc.typeArticleen_US
dc.rights.robotsIndexNoFollowen_US
dc.subject.hlbsecondlevelBiological Chemistryen_US
dc.subject.hlbtoplevelScienceen_US
dc.description.peerreviewedPeer Revieweden_US
dc.contributor.affiliationumDepartment of Medicinal Chemistry, 428 Church Street, University of Michigan, Ann Arbor, MI 48109‐1065en_US
dc.contributor.affiliationumDepartment of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan 48109‐2218en_US
dc.contributor.affiliationumDepartment of Medicinal Chemistry, University of Michigan, Ann Arbor, Michigan 48109‐1065en_US
dc.contributor.affiliationotherThe Cooper Union, New York, New York 10003en_US
dc.identifier.pmid22733542en_US
dc.description.bitstreamurlhttp://deepblue.lib.umich.edu/bitstream/2027.42/94274/1/PROT_24134_sm_SuppInfo2.pdf
dc.description.bitstreamurlhttp://deepblue.lib.umich.edu/bitstream/2027.42/94274/2/PROT_24134_sm_SuppInfo1.pdf
dc.description.bitstreamurlhttp://deepblue.lib.umich.edu/bitstream/2027.42/94274/3/24134_ftp.pdf
dc.identifier.doi10.1002/prot.24134en_US
dc.identifier.sourceProteins: Structure, Function, and Bioinformaticsen_US
dc.identifier.citedreferenceDeLano W. The PyMOL molecular graphics system. San Carlos, CA: DeLano Scientific; 2002.en_US
dc.identifier.citedreferenceDunbrack RL Jr. Sequence comparison and protein structure prediction. Curr Opin Struct Biol 2006; 16: 374 – 384.en_US
dc.identifier.citedreferenceSam V, Tai C, Garnier J, Gibrat J, Lee B, Munson P. Towards an automatic classification of protein structural domains based on structural similarity. BMC Bioinformatics 2008; 9: 74.en_US
dc.identifier.citedreferenceDamm KL, Carlson HA. Gaussian‐weighted RMSD superposition of proteins: a structural comparison for flexible proteins and predicted protein structures. Biophys J 2006; 90: 4558 – 4573.en_US
dc.identifier.citedreferenceKabsch W. A solution for the best rotation to relate two sets of vectors. Acta Crystallogr A 1976; 32: 922 – 923.en_US
dc.identifier.citedreferenceRice P, Longden I, Bleasby A. EMBOSS: The European molecular biology open software suite. Trends Genet 2000; 16: 276 – 277.en_US
dc.identifier.citedreferenceTai C‐H, Vincent J, Kim C, Lee B. SE: an algorithm for deriving sequence alignment from a pair of superimposed structures. BMC Bioinformatics 2009; 10: S4.en_US
dc.identifier.citedreferenceBerman HM, Westbrook J, Feng Z, Gilliland G, Bhat TN, Weissig H, Shindyalov IN, Bourne PE. The Protein Data Bank. Nucleic Acids Res 2000; 28: 235 – 242.en_US
dc.identifier.citedreferenceEMBL‐EBI. Help‐About Scoring Matrices, http://www.ebi.ac.uk/help/matrix.html; 2010.en_US
dc.identifier.citedreferenceEnroth C, Neujahr H, Schneider G, Lindqvist Y. The crystal structure of phenol hydroxylase in complex with FAD and phenol provides evidence for a concerted conformational change in the enzyme and its cofactor during catalysis. Structure 1998; 6: 605 – 617.en_US
dc.identifier.citedreferenceMesecar AD, Koshland DE. Sites of binding and orientation in a four‐location model for protein stereospecificity. IUBMB Life 2000; 49: 457 – 466.en_US
dc.identifier.citedreferenceYe Y, Godzik A. Flexible structure alignment by chaining aligned fragment pairs allowing twists. Bioinformatics 2003; 19: ii246 – ii255.en_US
dc.identifier.citedreferenceTheobald DL, Wuttke DS. THESEUS: maximum likelihood superpositioning and analysis of macromolecular structures. Bioinformatics 2006; 22: 2171 – 2172.en_US
dc.identifier.citedreferenceXu Z, Horwich AL, Sigler PB. The crystal structure of the asymmetric GroEL‐GroES‐(ADP)7 chaperonin complex. Nature 1997; 388: 741 – 750.en_US
dc.identifier.citedreferenceBraig K, Adams PD, Brünger AT. Conformational variability in the refined structure of the chaperonin GroEL at 2.8 Å resolution. Nat Struct Mol Biol 1995; 2: 1083 – 1094.en_US
dc.identifier.citedreferenceDitzel L, Löwe J, Stock D, Stetter K‐O, Huber H, Huber R, Steinbacher S. Crystal structure of the thermosome, the archaeal chaperonin and homolog of CCT. Cell 1998; 93: 125 – 138.en_US
dc.identifier.citedreferenceMichel G, Sauve V, Larocque R, Li Y, Matte A, Cygler M. The structure of the RlmB 23S rRNA methyltransferase reveals a new methyltransferase fold with a unique knot. Structure 2002; 10: 1303 – 1315.en_US
dc.identifier.citedreferenceNureki O, Shirouzu M, Hashimoto K, Ishitani R, Terada T, Tamakoshi M, Oshima T, Chijimatsu M, Takio K, Vassylyev DG, Shibata T, Inoue Y, Kuramitsu S, Yokoyama S. An enzyme with a deep trefoil knot for the active‐site architecture. Acta Crystallogr D Biol Crystallogr 2002; 58: 1129 – 1137.en_US
dc.identifier.citedreferenceBiopython, version 1.42, http://biopython.org; 2006.en_US
dc.identifier.citedreferencePrice SR, Evans PR, Nagai K. Crystal structure of the spliceosomal U2B″‐U2A′ protein complex bound to a fragment of U2 small nuclear RNA. Nature 1998; 394: 645 – 650.en_US
dc.identifier.citedreferenceMarino M, Braun L, Cossart P, Ghosh P. Structure of the InlB leucine‐rich repeats, a domain that triggers host cell invasion by the bacterial pathogen L. monocytogenes. Mol Cell 1999; 4: 1063 – 1072.en_US
dc.identifier.citedreferenceOwen DJ, Vallis Y, Pearse BM, McMahon HT, Evans PR. The structure and function of the beta 2‐adaptin appendage domain. EMBO J 2000; 19: 4216 – 4227.en_US
dc.identifier.citedreferenceTraub LM, Downs MA, Westrich JL, Fremont DH. Crystal structure of the alpha appendage of AP‐2 reveals a recruitment platform for clathrin‐coat assembly. Proc Natl Acad Sci USA 1999; 96: 8907 – 8912.en_US
dc.identifier.citedreferenceRost B. Twilight zone of protein sequence alignments. Protein Eng 1999; 12: 85 – 94.en_US
dc.identifier.citedreferenceElofsson A. A study on protein sequence alignment quality. Proteins Struct Funct Bioinformatics 2002; 46: 330 – 339.en_US
dc.identifier.citedreferenceBenson ML, Smith RD, Khazanov NA, Dimcheff B, Beaver J, Dresslar P, Nerothin J, Carlson HA. Binding MOAD, a high‐quality protein–ligand database. Nucleic Acids Res 2008; 36: D674 – D678.en_US
dc.identifier.citedreferenceGong W, O'Gara M, Blumenthal RM, Cheng X. Structure of PvuII DNA‐(cytosine N4) methyltransferase, an example of domain permutation and protein fold assignment. Nucleic Acids Res 1997; 25: 2702 – 2715.en_US
dc.identifier.citedreferenceHolm L, Sander C. Mapping the protein universe. Science 1996; 273: 595 – 602.en_US
dc.identifier.citedreferenceWatson JD, Laskowski RA, Thornton JM. Predicting protein function from sequence and structural data. Curr Opin Struct Biol 2005; 15: 275 – 284.en_US
dc.identifier.citedreferenceMarsden RL, Ranea JA, Sillero A, Redfern O, Yeats C, Maibaum M, Lee D, Addou S, Reeves GA, Dallman TJ, Orengo CA. Exploiting protein structure data to explore the evolution of protein function and biological complexity. Philos Trans R Soc Lond B Biol Sci 2006; 361: 425 – 440.en_US
dc.identifier.citedreferenceAndreeva A, Howorth D, Chandonia JM, Brenner SE, Hubbard TJ, Chothia C, Murzin AG. Data growth and its impact on the SCOP database: new developments. Nucleic Acids Res 2008; 36: D419 – D425.en_US
dc.identifier.citedreferenceGreene LH, Lewis TE, Addou S, Cuff A, Dallman T, Dibley M, Redfern O, Pearl F, Nambudiry R, Reid A, Sillitoe I, Yeats C, Thornton JM, Orengo CA. The CATH domain structure database: new protocols and classification levels give a more comprehensive resource for exploring evolution. Nucleic Acids Res 2007; 35: D291 – D297.en_US
dc.identifier.citedreferenceHolm L, Rosenstrom P. Dali server: conservation mapping in 3D. Nucleic Acids Res 2010; 38: W545 – W549.en_US
dc.identifier.citedreferenceBhaduri A, Pugalenthi G, Sowdhamini R. PASS2: an automated database of protein alignments organised as structural superfamilies. BMC Bioinformatics 2004; 5: 35.en_US
dc.identifier.citedreferenceWang Y, Addess KJ, Chen J, Geer LY, He J, He S, Lu S, Madej T, Marchler‐Bauer A, Thiessen PA, Zhang N, Bryant SH. MMDB: annotating protein sequences with Entrez's 3D‐structure database. Nucleic Acids Res 2007; 35: D298 – D300.en_US
dc.identifier.citedreferenceChandonia JM, Hon G, Walker NS, Lo Conte L, Koehl P, Levitt M, Brenner SE. The ASTRAL Compendium in 2004. Nucleic Acids Res 2004; 32: D189 – D192.en_US
dc.identifier.citedreferenceMizuguchi K, Deane CM, Blundell TL, Overington JP. HOMSTRAD: a database of protein structure alignments for homologous families. Protein Sci 1998; 7: 2469 – 2471.en_US
dc.identifier.citedreferenceSchmidt R, Altman RB, Gerstein M. LPFC: an internet library of protein family core structures. Protein Sci 1997; 6: 246 – 248.en_US
dc.identifier.citedreferenceOrengo CA, Thornton JM. Protein families and their evolution—a structural perspective. Annu Rev Biochem 2005; 74: 867 – 900.en_US
dc.identifier.citedreferenceValas R, Yang S, Bourne P. Nothing about protein structure classification makes sense except in the light of evolution. Curr Opin Struct Biol 2009; 19: 329 – 334.en_US
dc.identifier.citedreferenceKolodny R, Koehl P, Levitt M. Comprehensive evaluation of protein structure alignment methods: scoring by geometric measures. J Mol Biol 2005; 346: 1173 – 1188.en_US
dc.identifier.citedreferenceTaylor WR, Orengo CA. Protein structure alignment. J Mol Biol 1989; 208: 1 – 22.en_US
dc.identifier.citedreferenceGerstein M, Levitt M. Comprehensive assessment of automatic structural alignment against a manual standard, the scop classification of proteins. Protein Sci 1998; 7: 445 – 456.en_US
dc.identifier.citedreferenceSubbiah S, Laurents DV, Levitt M. Structural similarity of DNA‐binding domains of bacteriophage repressors and the globin core. Curr Biol 1993; 3: 141 – 148.en_US
dc.identifier.citedreferenceHolm L, Sander C. Protein structure comparison by alignment of distance matrices. J Mol Biol 1993; 233: 123 – 138.en_US
dc.identifier.citedreferenceKleywegt G. Use of non‐crystallographic symmetry in protein structure refinement. Acta Crystallogr D 1996; 52: 842 – 857.en_US
dc.identifier.citedreferenceShindyalov IN, Bourne PE. Protein structure alignment by incremental combinatorial extension (CE) of the optimal path. Protein Eng 1998; 11: 739 – 747.en_US
dc.identifier.citedreferenceKrissinel E, Henrick K. Secondary‐structure matching (SSM), a new tool for fast protein structure alignment in three dimensions. Acta Crystallogr D 2004; 60: 2256 – 2268.en_US
dc.identifier.citedreferenceMayr G, Domingues F, Lackner P. Comparative analysis of protein structure alignments. BMC Struct Biol 2007; 7: 50.en_US
dc.owningcollnameInterdisciplinary and Peer-Reviewed
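
The abstract above describes a Gaussian‐weighted RMSD (wRMSD) overlay that keeps misaligned or flexible residue pairs from distorting the fit. The following is a minimal sketch of that general idea, not the authors' implementation. It assumes the residue pairing is already given (the sequence aligner and seed extension steps are omitted), that each pair is weighted by exp(-d²/scale), and that the fit is repeated until the weighted RMSD stops changing. The function names, the `scale` default, and the convergence settings are hypothetical choices made for illustration.

```python
import numpy as np


def weighted_kabsch(mobile, target, weights):
    """Return (R, t) minimizing sum_i w_i * |R @ mobile_i + t - target_i|^2."""
    w = weights / weights.sum()
    mob_center = w @ mobile                # weighted centroids
    tgt_center = w @ target
    P = mobile - mob_center
    Q = target - tgt_center
    H = (P * w[:, None]).T @ Q             # 3x3 weighted covariance matrix
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so R is a proper rotation (det = +1).
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = tgt_center - R @ mob_center
    return R, t


def gaussian_weighted_fit(mobile, target, scale=5.0, max_iter=100, tol=1e-8):
    """Iteratively refit with Gaussian weights on the pairwise distances.

    `scale` (in squared length units) is an assumed constant: pairs that
    are far apart after a fit get weights close to zero in the next one.
    """
    weights = np.ones(len(mobile))         # first pass is a standard fit
    previous = None
    for _ in range(max_iter):
        R, t = weighted_kabsch(mobile, target, weights)
        fitted = mobile @ R.T + t
        d2 = np.sum((fitted - target) ** 2, axis=1)
        weights = np.exp(-d2 / scale)
        wrmsd = np.sqrt(np.sum(weights * d2) / np.sum(weights))
        if previous is not None and abs(previous - wrmsd) < tol:
            break
        previous = wrmsd
    return fitted, weights, wrmsd
```

Calling `gaussian_weighted_fit(mobile, target)` with two `(N, 3)` arrays of paired coordinates returns the superposed coordinates, the final per‐pair weights, and the weighted RMSD. Pairs that end up with near‐zero weight are candidates for misaligned or mobile regions, which is the kind of signal a later alignment‐correction step could use.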