Work Description

Title: Equilibrium Path Sampling Data for Two Glycosynthetic Reactions of Thermatoga maratima Alpha-L-Fucosidase D224G
Open Access Deposited

http://creativecommons.org/licenses/by/4.0/
Attribute Value
Methodology
  • This data concerns equilibrium path sampling, a method of measuring the free energy along a specified reaction coordinate by dividing the reaction coordinate into discrete windows and sampling the relative frequencies with which unbiased simulations visit different parts of each window. The data was collected using the open-source software ATESA: https://github.com/team-mayes/atesa
Description
  • This project aimed to discover and analyze the molecular mechanism of synthesis of two particular fucosylated oligosaccharide products in a mutant enzyme, Thermotoga maritima alpha-L-fucosidase D224G, whose wild type performs the opposite reaction (cleavage of fucosyl glycosidic bonds). Discovery of the mechanism was performed using an unbiased simulation method known as aimless shooting, whereas analysis of the mechanism in terms of the energy profile was performed using a separate method known as equilibrium path sampling. The data here concerns the latter method.

  • The atesa_master.zip file contains the ATESA GitHub project, a Python program for automating transition path sampling with aimless shooting using Amber: https://github.com/team-mayes/atesa
Creator
Depositor
  • tburgin@umich.edu
Contact information
Discipline
Funding agency
  • National Science Foundation (NSF)
Keyword
Citations to related material
  • 10.1039/C8RE00240A
Resource type
Last modified
  • 11/04/2019
Published
  • 07/19/2019
Language
DOI
  • https://doi.org/10.7302/f2ht-xw04
License
To Cite this Work:
Burgin, T., Mayes, H. (2019). Equilibrium Path Sampling Data for Two Glycosynthetic Reactions of Thermatoga maratima Alpha-L-Fucosidase D224G [Data set]. University of Michigan - Deep Blue. https://doi.org/10.7302/f2ht-xw04

Relationships

Files (Count: 4; Size: 677 GB)

This is a brief documentation of the files stored in this directory. These files were produced in the course of research that resulted in the following publication:

Burgin, T. and Mayes, H. B. 2019. “Mechanism of oligosaccharide synthesis via a mutant GH29 fucosidase.” Reaction Chemistry & Engineering 4: 402-9

This project aimed to discover and analyze the molecular mechanism of synthesis of two particular fucosylated oligosaccharide products in a mutant enzyme whose wild type performs the opposite reaction (cleavage of fucosyl glycosidic bonds). Discovery of the mechanism was performed using an unbiased simulation method known as aimless shooting, whereas analysis of the mechanism in terms of the energy profile was performed using a separate method known as equilibrium path sampling (EPS). The data in this directory concerns the latter method.

Files in this directory are first grouped into two subdirectories: eps_working and alpha13_eps_working. Each of these directories represents the complete dataset for equilibrium path sampling for a particular reaction pathway. The former concerns the reaction mechanism wherein a bond is formed connecting the fucosyl donor C1 to the acceptor O4, whereas the latter concerns a very similar mechanism but connecting the donor C1 to the acceptor O3. Experimental work by Cobucci-Ponzano et al. (see above paper for full citation) had revealed that both mechanisms were possible with roughly equal probability.

Within each of these subdirectories, files are grouped according to name. A full description of EPS is beyond the scope of this document, but in short, the dataset consists of “forward” (“fwd”) and “backward” (“bwd”) trajectories (.nc files) beginning from a shared “initial” (“init”) coordinate file (.rst or .rst7). The input files for these simulations are named “eps[1-10].in” depending on the length of the simulation. End-point restart files for each trajectory are also retained. Each pair of fwd and bwd trajectories is checked for its value along the reaction coordinate detailed in the paper, and the file “eps_results.out” in each directory collects these results. These results can be translated into free energy profiles along the reaction coordinate using any number of methods, though this author would recommend Michael Shirts’s MBAR (multistate Bennett acceptance ratio) approach.
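
As a rough illustration of how visit frequencies within one window relate to free energy, the sketch below applies simple Boltzmann inversion to a histogram of reaction coordinate samples. This is not the MBAR approach recommended above, nor the authors' analysis code. The function name, the bin count, and the default temperature of 300 K are assumptions made for illustration. Because the layout of “eps_results.out” is not described here, the sketch starts from a plain list of reaction coordinate values rather than reading that file.

```python
import math

KB_KCAL_PER_MOL_K = 0.0019872041  # Boltzmann constant in kcal/(mol K)


def window_free_energy(rc_values, lower, upper, n_bins=5, temperature=300.0):
    """Boltzmann-invert a histogram of reaction coordinate samples in one window.

    Returns a list of (bin_center, free_energy) pairs in kcal/mol, shifted so
    that the most populated bin sits at zero. Bins with no samples get None.
    The temperature default is an illustrative assumption.
    """
    width = (upper - lower) / n_bins
    counts = [0] * n_bins
    for value in rc_values:
        if lower <= value <= upper:
            # Clamp so a sample exactly at the upper bound lands in the last bin.
            index = min(int((value - lower) / width), n_bins - 1)
            counts[index] += 1
    if sum(counts) == 0:
        raise ValueError("no samples fall inside the window")

    kt = KB_KCAL_PER_MOL_K * temperature
    max_count = max(counts)
    profile = []
    for i, count in enumerate(counts):
        center = lower + (i + 0.5) * width
        energy = -kt * math.log(count / max_count) if count > 0 else None
        profile.append((center, energy))
    return profile
```

A single window yields only relative free energies inside that window. Stitching adjacent windows into a full profile requires an additional step (for example matching overlapping regions, or a reweighting method such as MBAR) that this sketch does not attempt.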

The simulations in this study were conducted using the Amber 16 molecular simulations software package. As such, files are in Amber format. The corresponding Amber parameter/topology file for each of the coordinate/trajectory files is “ts_guess.prmtop”. A full inventory of the files is not included herein; however, the reader is directed to the “status.txt” file in each subdirectory. This file enumerates each of the independent “threads” of simulations in the following format:

[thread_name] acceptance ratio: X/Y, or Z%
Status: …

Here, thread_name is the base name of the thread, which comprises Y “moves”. These moves are named [thread_name]_[y] (for y from 1 to Y), and each move should have corresponding initial, forward, and backward trajectory/coordinate files, as described above, named [thread_name]_[y]_[init/fwd/bwd].[rst/rst7/nc]. The number of accepted trajectories X is the number of moves, out of the total Y, that contained at least one frame inside the target window of reaction coordinate values (again, see the paper for a more detailed description of and references for EPS). The status line indicates the last-known status of the thread, including any termination criteria that were met. If more detailed acceptance information is desired, a history file for each thread can be found in the “history” subdirectory.
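
For readers who want to check the files against “status.txt”, the following minimal sketch reads the acceptance-ratio lines in the format shown above and lists the base file names expected for each move. The function names and the regular expression are hypothetical. The sketch assumes that thread names contain no whitespace and that the square brackets in the format description are placeholders, not literal characters. File extensions are left off because they vary by file type (.rst, .rst7, or .nc).

```python
import re

# Matches lines such as "<thread_name> acceptance ratio: X/Y, or Z%".
# Assumes thread names contain no whitespace (an illustrative assumption).
STATUS_PATTERN = re.compile(
    r"^(?P<thread>\S+) acceptance ratio: (?P<accepted>\d+)/(?P<total>\d+)"
)


def parse_status(text):
    """Return {thread_name: (accepted, total)} from status.txt-style text."""
    threads = {}
    for line in text.splitlines():
        match = STATUS_PATTERN.match(line.strip())
        if match:
            threads[match.group("thread")] = (
                int(match.group("accepted")),
                int(match.group("total")),
            )
    return threads


def expected_move_files(thread_name, total_moves):
    """List extension-free base names [thread_name]_[y]_[init/fwd/bwd]."""
    return [
        f"{thread_name}_{y}_{kind}"
        for y in range(1, total_moves + 1)
        for kind in ("init", "fwd", "bwd")
    ]
```

Comparing the output of expected_move_files with a directory listing is one way to spot moves whose trajectory or coordinate files are missing.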

Finally, a log file for each subdirectory is included, “as.log”. This file contains a detailed log of the underlying work done by the script that was used to automate EPS.
