Novel gene and gene model detection using a whole genome open reading frame analysis in proteomics

dc.contributor.author: Fermin, Damian
dc.contributor.author: Allen, Baxter B
dc.contributor.author: Blackwell, Thomas W
dc.contributor.author: Menon, Rajasree
dc.contributor.author: Adamski, Marcin
dc.contributor.author: Xu, Yin
dc.contributor.author: Ulintz, Peter
dc.contributor.author: Omenn, Gilbert S
dc.contributor.author: States, David J
dc.date.accessioned: 2015-08-07T17:26:04Z
dc.date.available: 2015-08-07T17:26:04Z
dc.date.issued: 2006-04-28
dc.identifier.citation: Genome Biology. 2006 Apr 28;7(4):R35
dc.identifier.uri: https://hdl.handle.net/2027.42/112345 [en_US]
dc.description.abstract: Background: Defining the location of genes and the precise nature of gene products remains a fundamental challenge in genome annotation. Interrogating tandem mass spectrometry data using genomic sequence provides an unbiased method to identify novel translation products. A six-frame translation of the entire human genome was used as the query database to search for novel blood proteins in the data from the Human Proteome Organization Plasma Proteome Project. Because this target database is orders of magnitude larger than the databases traditionally employed in tandem mass spectra analysis, careful attention to significance testing is required. Confidence of identification is assessed using our previously described Poisson statistic, which estimates the significance of multi-peptide identifications incorporating the length of the matching sequence, number of spectra searched and size of the target sequence database. Results: Applying a false discovery rate threshold of 0.05, we identified 282 significant open reading frames, each containing two or more peptide matches. There were 627 novel peptides associated with these open reading frames that mapped to a unique genomic coordinate placed within the start/stop points of previously annotated genes. These peptides matched 1,110 distinct tandem MS spectra. Peptides fell into four categories based upon where their genomic coordinates placed them relative to annotated exons within the parent gene. Conclusion: This work provides evidence for novel alternative splice variants in many previously annotated genes. These findings suggest that annotation of the genome is not yet complete and that proteomics has the potential to further add to our understanding of gene structures.
dc.title: Novel gene and gene model detection using a whole genome open reading frame analysis in proteomics
dc.type: Article [en_US]
dc.description.bitstreamurl: http://deepblue.lib.umich.edu/bitstream/2027.42/112345/1/13059_2006_Article_1303.pdf
dc.identifier.doi: 10.1186/gb-2006-7-4-r35 [en_US]
dc.language.rfc3066: en
dc.rights.holder: Fermin et al.
dc.date.updated: 2015-08-07T17:26:04Z
dc.owningcollname: Interdisciplinary and Peer-Reviewed
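
Illustrative note on the abstract: the search database described there is a six-frame translation of genomic sequence. The sketch below shows what such a translation can look like for a short DNA string. It is not the authors' pipeline. It assumes the standard genetic code, treats an open reading frame as any stretch between stop codons, and uses hypothetical function names (`six_frame_translation`, `open_reading_frames`) chosen only for this example.

```python
from itertools import product

# Standard genetic code, listed in TCAG order (an assumption of this sketch).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons only; unknown codons (e.g. with N) become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translation(dna):
    """Translate three forward and three reverse-complement reading frames."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


def open_reading_frames(protein, min_length=1):
    """Split a translated frame at stop codons ('*') into stop-to-stop segments."""
    return [seg for seg in protein.split("*") if len(seg) >= min_length]


if __name__ == "__main__":
    for name, protein in six_frame_translation("ATGGCCTAAGG").items():
        print(name, protein, open_reading_frames(protein))
```

Each of the six translated frames would then be split into candidate open reading frames, and it is sequences of this kind that a tandem MS search engine would be pointed at; the significance testing described in the abstract is a separate step that this sketch does not attempt.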

