Work Description

Title: Database of PWM scores across the D. melanogaster genome

Methodology
  • A collection of position weight matrices was scanned across the entire Drosophila genome using the FIMO program, and the resulting scores were then standardized; details of the methods are provided in https://doi.org/10.1101/516500
Description
  • Genome-wide predictions of all transcription factor binding sites in the D. melanogaster genome, developed for use in predicting the locations of Polycomb response elements, as described in https://doi.org/10.1101/516500
Creator
  • Khabiri, M.
  • Freddolino, P. L.
Depositor
  • petefred@umich.edu
Contact information
Discipline
ORSP grant number
  • AWD005923
Keyword
Date coverage
  • 2018-12-01
Citations to related material
  • Khabiri, M., & Freddolino, P. L. (2019). Genome-wide Prediction of Potential Polycomb Response Elements and their Functions. Preprint. BioRxiv, 516500. https://doi.org/10.1101/516500
Resource type
Last modified
  • 12/20/2022
Published
  • 12/20/2022
DOI
  • https://doi.org/10.7302/yb9e-aw67
License
To Cite this Work:
Khabiri, M., Freddolino, P. L. (2022). Database of PWM scores across the D. melanogaster genome [Data set], University of Michigan - Deep Blue Data. https://doi.org/10.7302/yb9e-aw67

Relationships

This work is not a member of any user collections.

Files (Count: 2; Size: 978 MB)

The included file contains information on all predicted binding sites for annotated transcription factors in the D. melanogaster genome. Detailed methods are provided in https://doi.org/10.1101/516500

In brief, position weight matrices (PWMs) were downloaded from the CIS-BP database or constructed from ChIP experiments in the modENCODE database, and scanned against the genome using the FIMO program from the MEME software suite. Scores against all possible genomic positions were obtained, and then normalized via calculation of robust z-scores.
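A robust z-score is commonly computed from the median and the median absolute deviation (MAD) rather than the mean and standard deviation. The following is a minimal sketch under that assumption, including the conventional 1.4826 scaling constant; the exact normalization used for this data set is described in the linked preprint and may differ. The function name robust_z_scores is illustrative only.

```python
import statistics
from typing import List, Sequence


def robust_z_scores(scores: Sequence[float]) -> List[float]:
    """Return (x - median) / (1.4826 * MAD) for each score.

    Assumption: this is the common textbook definition of a robust
    z-score, not necessarily the exact procedure used by the authors.
    """
    med = statistics.median(scores)
    # Median absolute deviation from the median.
    mad = statistics.median([abs(s - med) for s in scores])
    if mad == 0:
        raise ValueError("MAD is zero; robust z-scores are undefined")
    # 1.4826 makes the MAD comparable to a standard deviation
    # when the underlying scores are normally distributed.
    scale = 1.4826 * mad
    return [(s - med) / scale for s in scores]


# Made-up raw scores, for illustration only.
z = robust_z_scores([1.0, 2.0, 3.0, 4.0, 100.0])
print(z)
```

Because the median and MAD are insensitive to a small number of extreme values, a single very high raw score does not compress the standardized scores of the remaining positions.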

The data are provided as a GNU tar archive that has been compressed using the bzip2 program. A separate .dat file is present for each chromosome (as indicated by the chr*** portion of the file name). Each file consists of a tab-delimited table with the following columns:

TF_name -- the protein corresponding to the PWM being checked
chr -- the name of the chromosome on which the binding site lies
start -- the starting position of the binding site
end -- the ending position of the binding site
rzscore -- the robust z-score for the match

Only sites with robust z-scores of at least 2.3364 (corresponding to roughly the 99th percentile of a standard normal distribution) are included in the table.
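The table layout described above can be read with a few lines of Python. The following is a minimal sketch, not a tool distributed with the data. It assumes the columns appear in the order listed, that start and end are integers, and that a header row, if present, begins with TF_name. The names Site, read_sites and read_sites_from_archive are hypothetical, and the sample rows are made up.

```python
import csv
import io
import tarfile
from typing import Iterable, Iterator, NamedTuple

RZSCORE_CUTOFF = 2.3364  # inclusion threshold stated in the description


class Site(NamedTuple):
    tf_name: str
    chrom: str
    start: int
    end: int
    rzscore: float


def read_sites(lines: Iterable[str],
               min_rzscore: float = RZSCORE_CUTOFF) -> Iterator[Site]:
    """Yield sites from tab-delimited lines, keeping those at or above min_rzscore."""
    for row in csv.reader(lines, delimiter="\t"):
        # Skip blank lines and an optional header row (assumed format).
        if not row or row[0] == "TF_name":
            continue
        site = Site(row[0], row[1], int(row[2]), int(row[3]), float(row[4]))
        if site.rzscore >= min_rzscore:
            yield site


def read_sites_from_archive(path: str,
                            min_rzscore: float = RZSCORE_CUTOFF) -> Iterator[Site]:
    """Stream sites from every .dat member of a bzip2-compressed tar archive."""
    with tarfile.open(path, "r:bz2") as archive:
        for member in archive:
            if member.isfile() and member.name.endswith(".dat"):
                raw = archive.extractfile(member)
                text = io.TextIOWrapper(raw, encoding="utf-8")
                yield from read_sites(text, min_rzscore)


# Made-up rows, for illustration only.
sample = [
    "TF1\tchr2L\t100\t108\t3.10",
    "TF2\tchr2L\t200\t210\t2.50",
]
strong = list(read_sites(sample, min_rzscore=3.0))
print(strong)
```

Streaming the rows one at a time avoids loading a whole per-chromosome table into memory, and raising min_rzscore above the published cutoff gives a stricter subset of sites.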
