HPeak: an HMM-based algorithm for defining read-enriched regions in ChIP-Seq data

Qin, Zhaohui S; Yu, Jianjun; Shen, Jincheng; Maher, Christopher A; Hu, Ming; Kalyana-Sundaram, Shanker; Yu, Jindan; Chinnaiyan, Arul M

dc.contributor.author	Qin, Zhaohui S
dc.contributor.author	Yu, Jianjun
dc.contributor.author	Shen, Jincheng
dc.contributor.author	Maher, Christopher A
dc.contributor.author	Hu, Ming
dc.contributor.author	Kalyana-Sundaram, Shanker
dc.contributor.author	Yu, Jindan
dc.contributor.author	Chinnaiyan, Arul M
dc.date.accessioned	2015-08-07T17:27:24Z
dc.date.available	2015-08-07T17:27:24Z
dc.date.issued	2010-07-02
dc.identifier.citation	BMC Bioinformatics. 2010 Jul 02;11(1):369
dc.identifier.uri	https://hdl.handle.net/2027.42/112381	en_US
dc.description.abstract	Background: Protein-DNA interaction constitutes a basic mechanism for the genetic regulation of target gene expression. Deciphering this mechanism has been a daunting task due to the difficulty in characterizing protein-bound DNA on a large scale. A powerful technique has recently emerged that couples chromatin immunoprecipitation (ChIP) with next-generation sequencing (ChIP-Seq). This technique provides a direct survey of the cistrome of transcription factors and other chromatin-associated proteins. To realize the full potential of this technique, increasingly sophisticated statistical algorithms have been developed to analyze the massive amount of data it generates. Results: Here we introduce HPeak, a Hidden Markov model (HMM)-based Peak-finding algorithm for analyzing ChIP-Seq data to identify protein-interacting genomic regions. In contrast to the majority of available ChIP-Seq analysis software packages, HPeak is a model-based approach allowing for rigorous statistical inference. This approach enables HPeak to accurately infer genomic regions enriched with sequence reads by assuming realistic probability distributions, in conjunction with a novel weighting scheme on the sequencing read coverage. Conclusions: Using biologically relevant data collections, we found that HPeak showed a higher prevalence of the expected transcription factor binding motifs in ChIP-enriched sequences relative to the control sequences when compared to other currently available ChIP-Seq analysis approaches. Additionally, in comparison to the ChIP-chip assay, ChIP-Seq provides higher resolution along with improved sensitivity and specificity of binding site detection. Additional file and the HPeak program are freely available at http://www.sph.umich.edu/csg/qin/HPeak.
dc.title	HPeak: an HMM-based algorithm for defining read-enriched regions in ChIP-Seq data
dc.type	Article	en_US
dc.description.bitstreamurl	http://deepblue.lib.umich.edu/bitstream/2027.42/112381/1/12859_2009_Article_3826.pdf
dc.identifier.doi	10.1186/1471-2105-11-369	en_US
dc.language.rfc3066	en
dc.rights.holder	Qin et al.
dc.date.updated	2015-08-07T17:27:24Z
dc.owningcollname	Interdisciplinary and Peer-Reviewed

Files in this item

Name: 12859_2009_Article_3826.pdf
Size: 1.095MB
Format: PDF
