Work Description
Title: Data For: Achieving Reproducibility and Replicability of Molecular Dynamics and Monte Carlo Simulations Using the Molecular Simulation Design Framework (MoSDeF) (Open Access)
(2025). Data For: Achieving Reproducibility and Replicability of Molecular Dynamics and Monte Carlo Simulations Using the Molecular Simulation Design Framework (MoSDeF) [Data set], University of Michigan - Deep Blue Data. https://doi.org/10.7302/fdqw-jy80
Files (Count: 7; Size: 1000 GB)
Title | Original Upload | Last Modified | File Size | Access |
---|---|---|---|---|
lrc_shift_subproject_workspace.tar.zst | 2025-01-10 | 2025-01-10 | 275 GB | Open Access |
mdmc_ethanol_subproject_workspac...r.zst | 2025-01-10 | 2025-01-10 | 146 GB | Open Access |
methane_systemsize_subproject_wo...r.zst | 2025-01-10 | 2025-01-10 | 90.5 GB | Open Access |
waterspce_nist_subproject_worksp...r.zst | 2025-01-10 | 2025-01-10 | 12.7 GB | Open Access |
workspace.tar.zst | 2025-01-10 | 2025-01-10 | 478 GB | Open Access |
code.tar.zst | 2025-01-31 | 2025-03-23 | 14.4 MB | Open Access |
README.md | 2025-01-31 | 2025-03-23 | 6.02 KB | Open Access |
Data For: Achieving Reproducibility and Replicability of Molecular Dynamics and Monte Carlo Simulations Using the Molecular Simulation Design Framework (MoSDeF)
This work was funded by the National Science Foundation, Grant/Awards OAC-1835067, OAC-1835560,
OAC-1835593, OAC-1835612, OAC-1835613, OAC-1835630, OAC-1835713, OAC-1835874
Dataset Contact: [email protected]
Dataset Creators (by institution):
- Vanderbilt University
- Craven, Nicholas
- Gilmer, Justin
- Moore, Timothy
- Iacovella, Christopher
- Quach, Co
- McCabe, Clare [email protected]
- Cummings, Peter [email protected]
- University of Minnesota Twin Cities
- Singh, Ramanish
- Siepmann, J. [email protected]
- Wayne State University
- Crawford, Brad
- Dyukov, Maxim
- Potoff, Jeffrey [email protected]
- University of Notre Dame
- Marin-Rimoldi, Eliseo
- Smith, Ryan
- Defever, Ryan
- Maginn, Edward [email protected]
- Boise State University
- Fothergill, Jenny
- Jones, Chris
- Jankowski, Eric [email protected]
- University of Michigan
- Butler, Brandon
- Anderson, Joshua
- Glotzer, Sharon [email protected]
Key Points:
- We compare the simulation densities for 5 different molecules across 3 MD engines and 3 MC engines
- We evaluate the effects of different long range correction methods for MD and MC engines
- Other simulation implementation details are evaluated with high precision for their effects on density
- We compare simulations set up via MoSDeF, which is transferable across engines, to data generated across engines without MoSDeF: Michael Schappals et al. Journal of Chemical Theory and Computation 2017 13 (9), 4270-4280 DOI: 10.1021/acs.jctc.7b00489.
Research Overview:
Molecular simulations are complex numerical computations that require extensive knowledge and
understanding to set up. As such, replicating a simulation procedure can be difficult when the exact
procedures must be translated across methodologies or software, or parsed from the text of a
paper. The Molecular Simulation Design Framework (MoSDeF) simplifies these tasks by
allowing common workflows to be instantiated and used for a variety of engines, and by documenting the
exact input files, forcefield used, and run time parameters for each engine. Specifically,
HOOMD-blue, LAMMPS, GROMACS, Cassandra, MCCCS-MN, and GOMC are the simulation engines set up
through MoSDeF. Across these engines, relative errors of <0.1% are discernible even for increasingly
complex simulation procedures, and controlled experiments can be run to evaluate the
effects of small differences in methodologies that exist across the engines. We demonstrate the
utility of using these group-curated tools for simulations research by comparing the standard errors
herein to the standard errors found in previous literature.
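As a rough illustration of the two quantities mentioned above, the following minimal sketch computes the percent deviation of each engine's value from the cross-engine mean and the standard error of the mean. The function names and the input numbers are placeholders chosen for illustration; they are not values from this dataset, and the analysis scripts in the repository may compute these quantities differently.

```python
import math
import statistics


def relative_deviation_percent(values):
    """Percent deviation of each value from the mean of all values."""
    mean = statistics.fmean(values)
    return [100.0 * (v - mean) / mean for v in values]


def standard_error(values):
    """Standard error of the mean: sample standard deviation / sqrt(n)."""
    return statistics.stdev(values) / math.sqrt(len(values))


# Placeholder densities in arbitrary units; NOT results from this dataset.
placeholder_densities = {
    "engine_a": 100.00,
    "engine_b": 100.05,
    "engine_c": 99.95,
}

values = list(placeholder_densities.values())
for name, dev in zip(placeholder_densities, relative_deviation_percent(values)):
    print(f"{name}: {dev:+.3f}% from the cross-engine mean")
print(f"standard error of the mean: {standard_error(values):.4f}")
```

With these placeholder inputs every deviation stays below 0.1% in magnitude, which is the scale of difference the overview describes as discernible.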
Methodology:
The data were generated with 6 simulation engines, with the simulations conducted at 5 different universities.
- HOOMD-blue 4.0.0 -> University of Michigan
- LAMMPS 23Jun2022 -> Vanderbilt University
- GROMACS 3.0.5 -> Vanderbilt University
- Cassandra 1.2.5 -> University of Notre Dame
- MCCCS-MN 2020 -> University of Minnesota
- GOMC v2.75a -> Wayne State University
Data are collected in 5 separate workspaces: one for the main density data calculations across the
space and 4 for the subproject simulations that were performed to validate and dive deeper into
specific engine implementations. To obtain a copy of the simulation trajectories and calculated averages
used to generate figures, these workspace folders must be downloaded and placed in the correct
location in the code, which can be found in code.tar.zst in this archive and at
https://github.com/mosdef-hub/reproducibility_study
The following indicates where each directory in this archive should be installed relative to the GitHub
project structure.
- workspace -> reproducibility_study/reproducibility_project/workspace
- lrc_shift_subproject_workspace -> reproducibility_study/reproducibility_project/lrc_shift_subproject/workspace
- mdmc_ethanol_subproject_workspace -> reproducibility_study/reproducibility_project/mdmc_ethanol_subproject/workspace
- methane_systemsize_subproject_workspace -> reproducibility_study/reproducibility_project/methane_systemsize/workspace
- water_spce_nist_workspace -> reproducibility_study/reproducibility_project/waterspce_nist_subproject/workspace

Note that the single point energy data in spe_subproject is found on GitHub and does not need to be
installed separately.
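The mapping above can be written down as a small lookup table. This is a minimal sketch, assuming the project directory is spelled reproducibility_project as in the workflow paths of this README and that each archive extracts to a directory with the name shown on the left-hand side of the mapping. The names `WORKSPACE_DESTINATIONS`, `destination_for`, and `missing_workspaces` are hypothetical helpers, not part of the repository.

```python
from pathlib import Path

# Extracted directory name -> destination inside the cloned GitHub repository.
# Directory names are copied from the mapping in this README; adjust them if
# an archive extracts to a differently named directory.
PROJECT = Path("reproducibility_study") / "reproducibility_project"
WORKSPACE_DESTINATIONS = {
    "workspace": PROJECT / "workspace",
    "lrc_shift_subproject_workspace": PROJECT / "lrc_shift_subproject" / "workspace",
    "mdmc_ethanol_subproject_workspace": PROJECT / "mdmc_ethanol_subproject" / "workspace",
    "methane_systemsize_subproject_workspace": PROJECT / "methane_systemsize" / "workspace",
    "water_spce_nist_workspace": PROJECT / "waterspce_nist_subproject" / "workspace",
}


def destination_for(extracted_dir, base_dir="."):
    """Return where an extracted directory belongs, relative to base_dir.

    base_dir is the directory that contains the reproducibility_study clone.
    """
    return Path(base_dir) / WORKSPACE_DESTINATIONS[extracted_dir]


def missing_workspaces(base_dir="."):
    """List the workspace names whose destination directory does not exist yet."""
    return [
        name
        for name in WORKSPACE_DESTINATIONS
        if not destination_for(name, base_dir).is_dir()
    ]


if __name__ == "__main__":
    for name in WORKSPACE_DESTINATIONS:
        print(f"{name} -> {destination_for(name)}")
    print("not yet installed:", missing_workspaces())
```

Running the script from the directory that contains the clone prints each destination and reports which workspaces still need to be moved into place.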
This work used signac-flow to manage the project, which was broken into pieces and run on clusters
at each university. The core steps of this process are as follows.
1. Initialize job statepoints in the signac project. All jobs are then run from the signac CLI by navigating to reproducibility_project/src/engines/ENGINE_NAME and running the local project.py file. For more information on interfacing with signac and signac-flow, see https://docs.signac.io/en/latest/flow-project.html#the-flowproject
2. Generate molecular structures through scripts found in src/molecules/system_builder.py
3. Submit simulations to the cluster. This can be broken into multiple steps for some engines.
4. Run the data analysis with reproducibility_project/aggregate_data/analyze_data.py
5. Generate plots with reproducibility_project/aggregate_summary/plotting.py
6. Repeat steps 1-5 for each of the subprojects
   - reproducibility_project/lrc_shift_subproject
   - reproducibility_project/mdmc_ethanol_subproject
   - reproducibility_project/spe_subproject
   - reproducibility_project/methane_systemsize - There are multiple sub-subprojects within this data, looking at changes in the number of particles and in rcut.
   - reproducibility_project/waterspce_nist_subproject
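To make the first step of the workflow above more concrete, here is a minimal sketch of how a set of statepoints might be enumerated before being handed to signac. The engine labels follow the Methodology list, but the molecule names, the replica count, and the dictionary keys are hypothetical placeholders; the statepoints actually used in the study are defined in the repository and may look quite different.

```python
from itertools import product

# Engine labels follow the Methodology list. The molecule names, replica
# count, and dictionary keys are hypothetical placeholders for illustration.
ENGINES = ["hoomd", "lammps", "gromacs", "cassandra", "mcccs", "gomc"]
MOLECULES = ["molecule_a", "molecule_b"]
REPLICAS = range(2)


def build_statepoints(engines, molecules, replicas):
    """Return one statepoint dictionary per (engine, molecule, replica)."""
    return [
        {"engine": engine, "molecule": molecule, "replica": replica}
        for engine, molecule, replica in product(engines, molecules, replicas)
    ]


statepoints = build_statepoints(ENGINES, MOLECULES, REPLICAS)
print(f"{len(statepoints)} statepoints; first: {statepoints[0]}")

# Inside an already initialized signac project, each dictionary could then be
# registered as a job (requires signac to be installed):
#     import signac
#     project = signac.get_project()
#     for statepoint in statepoints:
#         project.open_job(statepoint).init()
```

Keeping the enumeration separate from the signac calls makes it easy to inspect the parameter grid before any job directories are created.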
Other packages can be found in the file environment.yml, and engine-specific installs in engines.yml. To install and activate the environment, run the following with miniconda or anaconda installed on your device.
git clone https://github.com/mosdef-hub/reproducibility_study.git
cd reproducibility_study
mamba env create -f environment.yml
conda activate mosdef-study38