Short sequence motifs, overrepresented in mammalian conserved non-coding sequences

dc.contributor.author: Minovitsky, Simon
dc.contributor.author: Stegmaier, Philip
dc.contributor.author: Kel, Alexander
dc.contributor.author: Kondrashov, Alexey S
dc.contributor.author: Dubchak, Inna
dc.date.accessioned: 2015-08-07T17:25:59Z
dc.date.available: 2015-08-07T17:25:59Z
dc.date.issued: 2007-10-18
dc.identifier.citation: BMC Genomics. 2007 Oct 18;8(1):378
dc.identifier.uri: https://hdl.handle.net/2027.42/112343 [en_US]
dc.description.abstract: Abstract
Background: A substantial fraction of non-coding DNA sequences of multicellular eukaryotes is under selective constraint. In particular, ~5% of the human genome consists of conserved non-coding sequences (CNSs). CNSs differ from other genomic sequences in their nucleotide composition and must play important functional roles, which mostly remain obscure.
Results: We investigated relative abundances of short sequence motifs in all human CNSs present in the human/mouse whole-genome alignments vs. three background sets of sequences: (i) weakly conserved or unconserved non-coding sequences (non-CNSs); (ii) near-promoter sequences (located between nucleotides -500 and -1500, relative to the start of transcription); and (iii) random sequences with the same nucleotide composition as that of CNSs. When compared to non-CNSs and near-promoter sequences, CNSs possess an excess of AT-rich motifs, often containing runs of identical nucleotides. In contrast, when compared to random sequences, CNSs contain an excess of GC-rich motifs which, however, lack CpG dinucleotides. Thus, the abundance of short sequence motifs in human CNSs, taken as a whole, is mostly determined by their overall compositional properties and not by overrepresentation of any specific short motifs. These properties are: (i) high AT-content of CNSs, (ii) a tendency, probably due to context-dependent mutation, of A's and T's to clump, (iii) presence of short GC-rich regions, and (iv) avoidance of CpG contexts, due to their hypermutability. Only a small number of the short motifs overrepresented in all human CNSs are similar to binding sites of transcription factors from the FOX family.
Conclusion: Human CNSs as a whole appear to be too broad a class of sequences to possess strong footprints of any short sequence-specific functions. Such footprints should be studied at the level of functional subclasses of CNSs, such as those which flank genes with a particular pattern of expression. Overall properties of CNSs are affected by patterns in mutation, suggesting that the selection which causes their conservation is not always very strong.
dc.title: Short sequence motifs, overrepresented in mammalian conserved non-coding sequences
dc.type: Article [en_US]
dc.description.bitstreamurl: http://deepblue.lib.umich.edu/bitstream/2027.42/112343/1/12864_2007_Article_1091.pdf
dc.identifier.doi: 10.1186/1471-2164-8-378 [en_US]
dc.language.rfc3066: en
dc.rights.holder: Minovitsky et al.
dc.date.updated: 2015-08-07T17:25:59Z
dc.owningcollname: Interdisciplinary and Peer-Reviewed
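
The comparison described in the Results portion of the abstract (relative abundance of short motifs in CNSs versus a background set, including random sequences with the same nucleotide composition) can be illustrated with a minimal Python sketch. This is not the authors' pipeline. The function names (`kmer_frequencies`, `composition_matched_random`, `enrichment`), the pooled-shuffle way of building a composition-matched background, the pseudocount, and the toy sequences are all assumptions made for illustration only.

```python
from collections import Counter
import random

NUCLEOTIDES = set("ACGT")


def kmer_frequencies(seqs, k):
    """Relative frequency of each k-mer, counted over overlapping windows."""
    counts = Counter()
    for seq in seqs:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            # Skip windows containing ambiguous characters such as N.
            if set(kmer) <= NUCLEOTIDES:
                counts[kmer] += 1
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()} if total else {}


def composition_matched_random(seqs, rng):
    """Hypothetical background: shuffle all pooled nucleotides and cut them
    back into pieces of the original lengths, so the overall nucleotide
    composition is preserved exactly."""
    cleaned = ["".join(c for c in s.upper() if c in NUCLEOTIDES) for s in seqs]
    pooled = list("".join(cleaned))
    rng.shuffle(pooled)
    out, pos = [], 0
    for s in cleaned:
        out.append("".join(pooled[pos:pos + len(s)]))
        pos += len(s)
    return out


def enrichment(foreground, background, k, pseudo=1e-6):
    """Ratio of foreground to background k-mer frequency.
    The pseudocount (an assumption) avoids division by zero."""
    fg = kmer_frequencies(foreground, k)
    bg = kmer_frequencies(background, k)
    return {m: (fg[m] + pseudo) / (bg.get(m, 0.0) + pseudo) for m in fg}


if __name__ == "__main__":
    # Toy sequences, not real CNS data.
    cns_like = ["AAAATTTTGCAGCC", "TTTTAAAAGGCTGC"]
    background = composition_matched_random(cns_like, random.Random(0))
    ratios = enrichment(cns_like, background, k=4)
    for motif, ratio in sorted(ratios.items(), key=lambda kv: -kv[1])[:5]:
        print(motif, round(ratio, 2))
```

A ratio above 1 marks a motif that is more frequent in the foreground than in the chosen background. As the abstract notes, which motifs appear enriched depends strongly on the background set, so a real analysis would compare against each background separately and apply a statistical test rather than rely on raw ratios.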

