Date: 6 August, 2019

Dataset Title: Data for Compensatory trans-regulatory alleles minimizing variation in TDH3 expression are common within Saccharomyces cerevisiae

Dataset Creators: Wittkopp, Patricia J.; Metzger, Brian P.H.

Dataset Contact: Patricia Wittkopp, wittkopp@umich.edu

Funding: National Science Foundation (NSF), National Institutes of Health (NIH)

Research Overview: Flow cytometry was used to measure trans-regulatory effects on TDH3 promoter activity among 56 strains of the budding yeast Saccharomyces cerevisiae. Quantitative trait loci (QTL) mapping was used to identify loci with trans-regulatory effects on TDH3 promoter activity among four genetically distinct S. cerevisiae strains. These data were used to test models of natural selection acting on TDH3 trans-regulation.

Methodology: Detailed methodology can be found in the linked paper. Briefly, a YFP reporter under control of the TDH3 promoter from the BY laboratory strain of S. cerevisiae was inserted into the same genomic location in 56 S. cerevisiae strains. Differences in YFP fluorescence among strains were compared to neutral models of expression evolution to detect natural selection. For a subset of strains, expression quantitative trait locus mapping was performed via bulk segregant analysis.

Instrument and/or Software specifications: Flow cytometry data were analyzed on an Accuri C6 flow cytometer connected to an Intellicyt autosampler. FACS sorting was conducted on a FACS Canto II. Illumina sequencing was performed on a HiSeq 2000 using 125 bp paired-end sequencing at the University of Michigan Sequencing Core. Some packages required for analyzing flow cytometry data are not available via CRAN but can be found at https://bioconductor.org

Data description: Raw flow cytometry data are available from FlowRepository (https://flowrepository.org/; FR-FCM-ZYVQ), and sequencing data from NCBI’s dbSNP (https://www.ncbi.nlm.nih.gov/snp; PRJNA527754 and PRJNA527772).
Data deposited here represent processed forms of the data and the R code for creating the figures and performing the statistical tests outlined in the accompanying publication. This includes data deposited as supplementary material with the linked paper, as well as additional mapping information. The scripts should run after some modifications for the local environment.

Files contained here:

Functions.R - Custom R functions used in the analysis. Should be loaded into the R environment prior to running any analysis script.
Analysis.1-Data.Cleaning.R - R script used for cleaning and normalization of flow cytometry data.
Analysis.2-Figures.R - R script used for making figures in the attached paper that are not associated with QTL mapping.
Analysis.3-Mapping.R - R script used for identifying QTL and generating figures for the attached paper.

Folder/Plate Layouts - Meta information for flow cytometry data, including strains, treatments, and positions of samples.
    CIS.NATURAL.LAYOUT.txt - Meta information for flow cytometry data on naturally occurring promoter variants and new promoter mutations. Initially published with Metzger, B. P. H., Yuan, D. C., Gruber, J. D., Duveau, F. D., & Wittkopp, P. J. (2015). Selection on noise constrains variation in a eukaryotic promoter. Nature 521:344–347.
    TRANS.MUTATION.1.LAYOUT.txt - Meta information for flow cytometry data on new trans-regulatory mutations. Initially published with Metzger, B. P. H., Duveau, F., Yuan, D. C., Tryban, S., Yang, B., & Wittkopp, P. J. (2016). Contrasting Frequencies and Effects of cis- and trans-Regulatory Mutations Affecting Gene Expression. Mol. Biol. Evol. 33:1131–1146.
    TRANS.MUTATION.2.LAYOUT.txt - Meta information for flow cytometry data on new trans-regulatory mutations. Initially published with Metzger, B. P. H., Duveau, F., Yuan, D. C., Tryban, S., Yang, B., & Wittkopp, P. J. (2016). Contrasting Frequencies and Effects of cis- and trans-Regulatory Mutations Affecting Gene Expression. Mol. Biol. Evol. 33:1131–1146.
    TRANS.NATURAL.LAYOUT.txt - Meta information for flow cytometry data on naturally occurring trans-regulatory variation.

Folder/Processed Flow Data - Flow cytometry data after cleaning and normalization.
    FILTER.CIS.DATA.txt - Effects of naturally occurring promoter variants and new promoter mutations.
    FILTER.TRANS.1.DATA.txt - Effects of new trans-regulatory mutations.
    FILTER.TRANS.2.DATA.txt - Effects of new trans-regulatory mutations.
    FILTER.TRANS.N.DATA.txt - Effects of naturally occurring trans-regulatory mutations.

Folder/Strain Information - General strain information.
    RC6.hom.boot.nwk - Original phylogenetic tree published with MacLean, C. J., Metzger, B. P. H., Yang, J.-R. R., Ho, W.-C. C., Moyers, B., & Zhang, J. (2017). Deciphering the Genic Basis of Yeast Fitness Variation by Simultaneous Forward and Reverse Genetics. Mol. Biol. Evol. 34:2486–2502.
    Sc.nwk - Processed phylogenetic tree that contains only the strains used in the current analysis.
    NV.Strains.txt - Meta information mapping glycerol stock names to strain names.
    Strain.pTDH3.Haplotypes.txt - TDH3 promoter haplotype naturally found in each S. cerevisiae strain.

Folder/Fitness - Data used for estimating fitness effects of changes in TDH3 expression. Data originally published in Duveau, F., Hodgins-Davis, A., Metzger, B. P. H., Yang, B., Tryban, S., Walker, E. A., et al. (2018). Fitness effects of altering gene expression noise in Saccharomyces cerevisiae. eLife 7:1–33.
    Experiment_s.estimates_filtered.txt - Input data for fitness and expression.
    SUMMARY.DATA.EXPRESSION.2.txt - Processed data for expression.
    SUMMARY.DATA.FITNESS.txt - Processed data for fitness.

Folder/Mapping - Files needed for QTL mapping and intermediate files generated during this process.
    README.txt - Overview of mapping experiment samples and strains.
    SAMPLES.txt - Relationship between Illumina datasets and experiments.
    Layout.Mapping.txt - Strains, crosses, and sorts for each experiment.
    S288c.length - Length of individual S288c chromosomes.
    OVERLAP.TRIPLE.R - Overlap function for QTL.
    SNP.FILTER.R - SNP calling function.
    *.vcf - VCF files for each experiment identifying all called SNPs.
    *.peak.calls.5.txt - QTL peak information for each experiment using a |G'| cutoff of 5.
    *.peak.calls.10.txt - QTL peak information for each experiment using a |G'| cutoff of 10.

Related publication(s): Reference to be added once the paper is accepted.

Use and Access: This data set is made available under a Creative Commons Attribution-NonCommercial license (CC BY-NC 4.0).

To Cite Data: Wittkopp, P.J., Metzger, B.P.H. (2019). Data for Compensatory trans-regulatory alleles minimizing variation in TDH3 expression are common within Saccharomyces cerevisiae. University of Michigan Deep Blue Data Repository. https://doi.org/10.xxxxxxx
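
Example (illustrative only): The *.vcf files listed under Folder/Mapping can be inspected without the deposited scripts. The sketch below is not part of the deposit and is not the method used by SNP.FILTER.R. It assumes only that the files follow the standard tab-delimited VCF layout (CHROM, POS, ID, REF, ALT, ... as the leading columns). The function name read_vcf_fixed, the temporary demo file, and the chromosome names in it are hypothetical and made up for illustration.

```r
# Hypothetical helper (not part of the deposited scripts): read a few of the
# fixed VCF columns into a data frame using base R only.
read_vcf_fixed <- function(path) {
  lines <- readLines(path)
  # Header and meta-information lines start with "#"; keep only the records.
  body <- lines[!startsWith(lines, "#")]
  fields <- strsplit(body, "\t", fixed = TRUE)
  # Standard VCF column order: 1 = CHROM, 2 = POS, 4 = REF, 5 = ALT.
  # Everything is kept as character except POS, so that a REF or ALT
  # allele of "T" is never mistaken for the logical value TRUE.
  data.frame(
    CHROM = vapply(fields, `[`, character(1), 1),
    POS   = as.integer(vapply(fields, `[`, character(1), 2)),
    REF   = vapply(fields, `[`, character(1), 4),
    ALT   = vapply(fields, `[`, character(1), 5),
    stringsAsFactors = FALSE
  )
}

# Demo on a small made-up file; replace demo_path with the path to one of
# the deposited *.vcf files.
demo_path <- tempfile(fileext = ".vcf")
writeLines(c(
  "##fileformat=VCFv4.2",
  "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
  "chrI\t100\t.\tA\tG\t50\tPASS\t.",
  "chrI\t250\t.\tT\tC\t60\tPASS\t.",
  "chrII\t75\t.\tG\tA\t45\tPASS\t."
), demo_path)

demo <- read_vcf_fixed(demo_path)
# Number of called SNP records per chromosome.
print(table(demo$CHROM))
```

Reading the columns as text and converting only POS avoids R's automatic type conversion, which can otherwise turn a column of "T" alleles into logical values.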