Work Description

Title: Bioinformatic annotation of bacterial genes encoding SMR proteins
Open Access, Deposited

Attribute Value
Methodology
  • To gauge the distribution of SMR genes across diverse microbes, we evaluated bacterial genomes from the Joint Genome Institute’s curated set of ~1000 Genomic Encyclopedia of Bacteria and Archaea (GEBA) genomes. SMR genes were identified from GEBA genomes with HMMER 3.3.2 using a profile Hidden Markov Model (profile HMM) constructed for the SMR family (Pfam PF00893). Profile HMMs for each subtype (Gdx, Qac, polyamine transport, and lipid transport) were constructed from functionally annotated clusters in a sequence similarity network of reference SMR proteins, and SMR sequences were assigned to the subtype that corresponded to the lowest e-value calculated by HMMER. SMR sequences were annotated “other” if the e-value was >10^-20.
Description
  • This dataset includes text files (.csv files) for the bioinformatic annotation of SMR genes found in a dataset of phylogenetically diverse bacterial genomes. Bioinformatic analysis includes genome mining to identify SMR genes, prediction of the functional transporter subtype, and prediction of the direction of insertion in the bacterial membrane. Research overview: This bioinformatic dataset was prepared for a review on the structures, functions, and occurrence of Small Multidrug Resistance (SMR) transporters. It includes bioinformatic annotation of SMR genes identified in bacterial genomes from the Joint Genome Institute’s curated set of ~1000 Genomic Encyclopedia of Bacteria and Archaea (GEBA) genomes. The file GEBA_SMR_annotation.csv provides NCBI identification information (genome, species and chromosome information, locus tag, translation) and bioinformatic predictions of the SMR subtype and membrane insertion direction for each gene identified in the GEBA genome set. The file GEBA_SMR_species_table.csv has a separate entry for each species in the GEBA genome set, along with the bioinformatic prediction of SMR subtype and membrane insertion direction for each SMR gene identified in the genome of that species. The dataset was generated by Christian B. Macdonald and Randy B. Stockbridge (Department of Molecular, Cellular and Developmental Biology, University of Michigan, Ann Arbor, MI, 48019). Generation of this dataset was supported by National Institutes of Health grant R35-GM128768 to Randy B. Stockbridge. Use and access: This dataset is provided as two .csv files (comma-separated values) that can be read using any text editor or spreadsheet software such as Microsoft Excel.
Creator
Depositor
  • stockbr@umich.edu
Contact information
Discipline
Funding agency
  • National Science Foundation (NSF)
ORSP grant number
  • National Science Foundation (grant number 1845012) to R.B.S.
Citations to related material
  • Burata OE, Yeh TJ, Macdonald CB, Stockbridge RB. (2022). Still rocking in the structural era: a molecular overview of the Small Multidrug Resistance transporters. Journal of Biological Chemistry. In press.
Resource type
Curation notes
  • PACERDA 2024-02-20: changed funding agency at depositor's request.
Last modified
  • 02/20/2024
Published
  • 08/17/2022
DOI
  • https://doi.org/10.7302/0ynd-b343
License
To Cite this Work:
Stockbridge, R. B., Macdonald, C. B. (2022). Bioinformatic annotation of bacterial genes encoding SMR proteins [Data set], University of Michigan - Deep Blue Data. https://doi.org/10.7302/0ynd-b343


Files (Count: 3; Size: 449 KB)

Date: August 17, 2022
Dataset title: "Bioinformatic annotation of bacterial genes encoding SMR proteins"
Dataset creators: Randy B. Stockbridge and Christian B. Macdonald
Dataset contact: Randy B. Stockbridge, stockbr@umich.edu

Funding: National Science Foundation (grant number 1845012) to R.B.S.

Dataset description:
This dataset includes text files (.csv files) for the bioinformatic annotation of SMR genes found in a dataset of phylogenetically diverse bacterial genomes. Bioinformatic analysis includes genome mining to identify SMR genes, prediction of the functional transporter subtype, and prediction of the direction of insertion in the bacterial membrane.

Research overview:
This bioinformatic dataset was prepared for a review on the structures, functions, and occurrence of Small Multidrug Resistance (SMR) transporters. It includes bioinformatic annotation of SMR genes identified in bacterial genomes from the Joint Genome Institute's curated set of ~1000 Genomic Encyclopedia of Bacteria and Archaea (GEBA) genomes. The file GEBA_SMR_annotation.csv provides NCBI identification information (genome, species and chromosome information, locus tag, translation) and bioinformatic predictions of the SMR subtype and membrane insertion direction for each gene identified in the GEBA genome set. The file GEBA_SMR_species_table.csv has a separate entry for each species in the GEBA genome set, along with the bioinformatic prediction of SMR subtype and membrane insertion direction for each SMR gene identified in the genome of that species.

The dataset was generated by Christian B. Macdonald and Randy B. Stockbridge (Department of Molecular, Cellular and Developmental Biology, University of Michigan, Ann Arbor, MI, 48019). Generation of this dataset was supported by National Institutes of Health grant R35-GM128768 to Randy B. Stockbridge.

Use and access:
This dataset is provided as two .csv files (comma-separated values) that can be read using any text editor or spreadsheet software such as Microsoft Excel.
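
For scripted access, the files can also be read with a general-purpose language. The following is a minimal sketch in Python using only the standard library. The column names are not documented in this README, so the sketch prints the header row instead of assuming it, and the column passed to count_values is a hypothetical name supplied by the user after inspecting the header.

```python
import csv
import os
from collections import Counter


def load_rows(path):
    """Read a comma-separated file into a list of dicts keyed by the header row."""
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


def count_values(rows, column):
    """Tally how often each value occurs in one column.

    `column` must be a header name that actually exists in the file;
    the real header names are not stated in this README.
    """
    return Counter(row[column] for row in rows)


if __name__ == "__main__":
    # File name taken from the README; assumed to be in the working directory.
    path = "GEBA_SMR_annotation.csv"
    if os.path.exists(path):
        rows = load_rows(path)
        print(len(rows), "rows")
        # Inspect the header before relying on any column name.
        print("columns:", list(rows[0].keys()) if rows else [])
```

Once the header is known, a call such as count_values(rows, "<subtype column name>") would summarize how many genes fall into each predicted subtype.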

Methods:
SMR genes were identified from GEBA genomes with HMMER 3.3.2 using a profile Hidden Markov Model (profile HMM) constructed for the SMR family (Pfam PF00893). Profile HMMs for each subtype (Gdx, Qac, polyamine transport, and lipid transport) were constructed from functionally annotated clusters in a sequence similarity network of reference SMR proteins, and SMR sequences were assigned to the subtype that corresponded to the lowest e-value calculated by HMMER. SMR sequences were annotated "other" if the e-value was >10^-20.
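
The subtype assignment rule described above can be illustrated with a short Python sketch. This is not the authors' code. It assumes the per-subtype e-values for one sequence have already been extracted from the HMMER output into a dictionary, and it reads the threshold as applying to the lowest (best) e-value. The function name, the dictionary keys, and the handling of a sequence with no hits are all assumptions made for illustration.

```python
# Threshold from the Methods text: sequences whose e-value exceeds 10^-20
# are annotated "other".
OTHER_THRESHOLD = 1e-20


def assign_subtype(evalues, threshold=OTHER_THRESHOLD):
    """Pick the subtype with the lowest e-value, or "other" if it is too weak.

    evalues: dict mapping a subtype label (hypothetical keys such as "Gdx",
    "Qac", "polyamine", "lipid") to the e-value reported for that subtype's
    profile HMM against one sequence.
    """
    if not evalues:
        # Assumption: a sequence with no subtype hit at all is treated as "other".
        return "other"
    best = min(evalues, key=evalues.get)
    # Assumption: the cutoff is applied to the best (lowest) e-value.
    return best if evalues[best] <= threshold else "other"


if __name__ == "__main__":
    # Illustrative e-values only; these are not taken from the dataset.
    print(assign_subtype({"Gdx": 1e-35, "Qac": 1e-22}))
    print(assign_subtype({"Gdx": 1e-10, "Qac": 1e-5}))
```

In the first call the lowest e-value is below the cutoff, so its subtype is returned. In the second call no subtype reaches the cutoff, so the sequence falls into "other".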

