Machine learning models for Si nanoparticle growth in nonthermal plasma

Raymond, Matt; Elvati, Paolo; Saldinger, Jacob C; Lin, Jonathan; Shi, Xuetao; Violi, Angela

Work Description

Attribute	Value
Methodology	The data are mechanism labels for the Molecular Dynamics (MD) simulation trajectories, assigned at the end of the allocated simulation time. The MD simulations were carried out using LAMMPS, and the atomic interactions were modeled using a classical all-atom reactive force field. The labels were assigned by computing the composition and the number of clusters in the system. Two atoms were assigned to the same cluster if their distance was less than twice their typical bond length, that is, less than 0.44, 0.32, and 0.148 nm for Si/Si, Si/H, and H/H pairs, respectively. If the trajectory outcome is non-sticking (more than one cluster), the label is "-1"; if the trajectory outcome is physisorption (one cluster, but no chemical bond formed), the label is "0"; if the trajectory outcome is chemisorption (one cluster, chemical bonds formed), the label is "1", "2", "3", or "4", where the numerical value corresponds to the number of new bonds formed. This approach is the same as in Shi, X., Elvati, P., Violi, A. (2021). On the growth of Si nanoparticles in non-thermal plasma: physisorption to chemisorption conversion. J. Phys. D. https://doi.org/10.1088/1361-6463/ac0b71
Description	Nanoparticles (NPs) formed in nonthermal plasmas (NTPs) can have unique properties and applications. However, modeling their growth in these environments presents significant challenges due to the non-equilibrium nature of NTPs, making them computationally expensive to describe. In this work, we address the challenges associated with accelerating the estimation of parameters needed for these models. Specifically, we explore how different machine learning models can be tailored to improve prediction outcomes. We apply these methods to reactive classical molecular dynamics data, which capture the processes associated with colliding silane fragments in NTPs. These reactions exemplify processes where qualitative trends are clear, but their quantification is challenging, hard to generalize, and requires time-consuming simulations. Our results demonstrate that good prediction performance can be achieved when appropriate loss functions are implemented and correct invariances are imposed. While the diversity of molecules used in the training set is critical for accurate prediction, our findings indicate that only a fraction (15-25%) of the energy and temperature sampling is required to achieve high levels of accuracy. This suggests a substantial reduction in computational effort is possible for similar systems.
Creator	Raymond, Matt; Elvati, Paolo; Saldinger, Jacob C; Lin, Jonathan; Shi, Xuetao; and Violi, Angela
Creator ORCID iD	https://orcid.org/0000-0001-6824-8692
Depositor	[email protected]
Depositor creator	true
Contact information	[email protected]
Discipline	Science
Funding agency	National Science Foundation (NSF); Other Funding Agency
Other Funding agency	US Army Research Office
ORSP grant number	W911NF-18-1-0240
Keyword	machine learning; molecular dynamics; nanoparticle; nonthermal plasma; silane; sticking coefficient
Citations to related material	Raymond, M., Elvati, P., Saldinger, J. C., Lin, J., Shi, X., & Violi, A. (2025). Machine learning models for Si nanoparticle growth in nonthermal plasma. Plasma Sources Science and Technology. https://doi.org/10.1088/1361-6595/adbae1 https://arxiv.org/abs/2501.00003
Resource type	Dataset
Last modified	04/08/2025
Published	04/08/2025
Language	English
DOI	https://doi.org/10.7302/5dt5-cy22
License	http://creativecommons.org/publicdomain/zero/1.0/

To Cite this Work:
Raymond, M., Elvati, P., Saldinger, J. C., Lin, J., Shi, X., Violi, A. (2025). Machine learning models for Si nanoparticle growth in nonthermal plasma [Data set], University of Michigan - Deep Blue Data. https://doi.org/10.7302/5dt5-cy22

Files (Count: 69; Size: 8.01 MB)

Title	Original Upload	Last Modified	File Size	Access
readme.txt	2025-03-20	2025-04-02	6.61 KB	Open Access
si2h6.si2h.csv	2025-03-06	2025-03-06	109 KB	Open Access
si2h6.si2h2.csv	2025-03-06	2025-03-06	110 KB	Open Access
si2h6.si2h4.csv	2025-03-06	2025-03-06	113 KB	Open Access
si2h6.si2h5.csv	2025-03-06	2025-03-06	113 KB	Open Access
si2h6.si2h6.csv	2025-03-06	2025-03-06	114 KB	Open Access
si2h6.sih2si.csv	2025-03-06	2025-03-06	110 KB	Open Access
si2h6.sih2sih.csv	2025-03-06	2025-03-06	111 KB	Open Access
si2h6.sih3si.csv	2025-03-06	2025-03-06	110 KB	Open Access
si2h6.sih3sih.csv	2025-03-06	2025-03-06	112 KB	Open Access
si4.si2h.csv	2025-03-06	2025-03-06	109 KB	Open Access
si4.si2h2.csv	2025-03-06	2025-03-06	109 KB	Open Access
si4.si2h4.csv	2025-03-06	2025-03-06	109 KB	Open Access
si4.si2h5.csv	2025-03-06	2025-03-06	109 KB	Open Access
si4.si2h6.csv	2025-03-06	2025-03-06	110 KB	Open Access
si4.sih2si.csv	2025-03-06	2025-03-06	109 KB	Open Access
si4.sih2sih.csv	2025-03-06	2025-03-06	109 KB	Open Access
si4.sih3si.csv	2025-03-06	2025-03-06	109 KB	Open Access
si4.sih3sih.csv	2025-03-06	2025-03-06	109 KB	Open Access
si29h18.si2h.csv	2025-03-06	2025-03-06	109 KB	Open Access
si29h18.si2h2.csv	2025-03-06	2025-03-06	109 KB	Open Access
si29h18.si2h4.csv	2025-03-06	2025-03-06	110 KB	Open Access
si29h18.si2h5.csv	2025-03-06	2025-03-06	111 KB	Open Access
si29h18.si2h6.csv	2025-03-06	2025-03-06	112 KB	Open Access
si29h18.sih.csv	2025-03-06	2025-03-06	109 KB	Open Access
si29h18.sih2.csv	2025-03-06	2025-03-06	110 KB	Open Access
si29h18.sih2si.csv	2025-03-06	2025-03-06	109 KB	Open Access
si29h18.sih2sih.csv	2025-03-06	2025-03-06	110 KB	Open Access
si29h18.sih3.csv	2025-03-06	2025-03-06	111 KB	Open Access
si29h18.sih3si.csv	2025-03-06	2025-03-06	110 KB	Open Access
si29h18.sih3sih.csv	2025-03-06	2025-03-06	110 KB	Open Access
si29h18.sih4.csv	2025-03-06	2025-03-06	113 KB	Open Access
si29h27.si2h.csv	2025-03-06	2025-03-06	110 KB	Open Access
si29h27.si2h2.csv	2025-03-06	2025-03-06	110 KB	Open Access
si29h27.si2h4.csv	2025-03-06	2025-03-06	111 KB	Open Access
si29h27.si2h5.csv	2025-03-06	2025-03-06	112 KB	Open Access
si29h27.si2h6.csv	2025-03-06	2025-03-06	113 KB	Open Access
si29h27.sih.csv	2025-03-06	2025-03-06	110 KB	Open Access
si29h27.sih2.csv	2025-03-06	2025-03-06	111 KB	Open Access
si29h27.sih2si.csv	2025-03-06	2025-03-06	110 KB	Open Access
si29h27.sih2sih.csv	2025-03-06	2025-03-06	110 KB	Open Access
si29h27.sih3.csv	2025-03-06	2025-03-06	112 KB	Open Access
si29h27.sih3si.csv	2025-03-06	2025-03-06	110 KB	Open Access
si29h27.sih3sih.csv	2025-03-06	2025-03-06	111 KB	Open Access
si29h27.sih4.csv	2025-03-06	2025-03-06	114 KB	Open Access
si29h31.si2h.csv	2025-03-06	2025-03-06	109 KB	Open Access
si29h31.si2h2.csv	2025-03-06	2025-03-06	110 KB	Open Access
si29h31.si2h4.csv	2025-03-06	2025-03-06	112 KB	Open Access
si29h31.si2h5.csv	2025-03-06	2025-03-06	113 KB	Open Access
si29h31.si2h6.csv	2025-03-06	2025-03-06	114 KB	Open Access
si29h31.sih.csv	2025-03-06	2025-03-06	110 KB	Open Access
si29h31.sih2.csv	2025-03-06	2025-03-06	111 KB	Open Access
si29h31.sih2si.csv	2025-03-06	2025-03-06	110 KB	Open Access
si29h31.sih2sih.csv	2025-03-06	2025-03-06	110 KB	Open Access
si29h31.sih3.csv	2025-03-06	2025-03-06	113 KB	Open Access
si29h31.sih3si.csv	2025-03-06	2025-03-06	110 KB	Open Access
si29h31.sih3sih.csv	2025-03-06	2025-03-06	111 KB	Open Access
si29h31.sih4.csv	2025-03-06	2025-03-06	114 KB	Open Access
si29h36.si2h.csv	2025-03-06	2025-03-06	109 KB	Open Access
si29h36.si2h2.csv	2025-03-06	2025-03-06	110 KB	Open Access
si29h36.si2h4.csv	2025-03-06	2025-03-06	112 KB	Open Access
si29h36.si2h5.csv	2025-03-06	2025-03-06	551 KB	Open Access
si29h36.si2h6.csv	2025-03-06	2025-03-06	555 KB	Open Access
si29h36.sih2si.csv	2025-03-06	2025-03-06	110 KB	Open Access
si29h36.sih2sih.csv	2025-03-06	2025-03-06	110 KB	Open Access
si29h36.sih3si.csv	2025-03-06	2025-03-06	110 KB	Open Access
si29h36.sih3sih.csv	2025-03-06	2025-03-06	111 KB	Open Access
readme.txt	2025-03-20	2025-03-20	6.61 KB	Open Access
readme.txt	2025-03-20	2025-03-20	6.61 KB	Open Access

Date: March 05, 2025

Dataset Title: Machine learning models for Si nanoparticle growth in nonthermal plasma

Dataset Contact: Angela Violi [email protected]

Dataset Creators:
Name: Matt Raymond
Email: [email protected]
Institution: University of Michigan Department of Electrical Engineering and Computer Science
ORCID: https://orcid.org/0000-0001-6824-8692

Name: Paolo Elvati
Email: [email protected]
Institution: University of Michigan Department of Mechanical Engineering
ORCID: https://orcid.org/0000-0002-6882-6023

Name: Jacob C. Saldinger
Email: [email protected]
Institution: University of Michigan Department of Chemical Engineering, Low Carbon Pathway Innovation at BP
ORCID: https://orcid.org/0000-0001-5005-614X

Name: Jonathan Lin
Email: [email protected]
Institution: University of Michigan Department of Electrical Engineering and Computer Science
ORCID: https://orcid.org/0009-0004-6381-4068

Name: Xuetao Shi
Email: [email protected]
Institution: University of Michigan Department of Mechanical Engineering, Dana-Farber Cancer Institute at Harvard
ORCID: https://orcid.org/0000-0001-6274-5495

Name: Angela Violi
Email: [email protected]
Institution: University of Michigan Departments of Electrical Engineering and Computer Science, Mechanical Engineering, and Chemical Engineering
ORCID: https://orcid.org/0000-0001-9517-668X

Funding: US Army Research Office MURI Grant No. W911NF-18-1-0240 and NSF ECO-CBET No. F059554.

Key Points:
- We simulated the collisions of silane nanoparticles of various sizes (which we call "clusters" and "impactors") using molecular dynamics with a reactive force field.
- "Clusters" are defined as Si2H6, Si4, and Si29Hy (y=18,27,31,36), while "impactors" are defined as Si2Hy (y=1-6) in all possible hydrogen distributions (e.g., for Si2H4, we simulated both H2Si*-*SiH2 and HSi**-SiH3).
- The dependence of sticking probabilities on temperature, on the H coverage of both silane impactors and cluster surfaces, and on the size of the cluster is modeled using various machine learning models.
- We find that machine learning models accurately predict sticking probabilities when trained on as little as 15% of these simulations, significantly reducing the number of simulations that must be run.

Research Overview:
Nanoparticles (NPs) formed in nonthermal plasmas (NTPs) can have unique properties and applications. However, modeling their growth in these environments presents significant challenges due to the non-equilibrium nature of NTPs, making them computationally expensive to describe. In this work, we address the challenges associated with accelerating the estimation of parameters needed for these models. Specifically, we explore how different machine learning models can be tailored to improve prediction outcomes. We apply these methods to reactive classical molecular dynamics data, which capture the processes associated with colliding silane fragments in NTPs. These reactions exemplify processes where qualitative trends are clear, but their quantification is challenging, hard to generalize, and requires time-consuming simulations. Our results demonstrate that good prediction performance can be achieved when appropriate loss functions are implemented and correct invariances are imposed. While the diversity of molecules used in the training set is critical for accurate prediction, our findings indicate that only a fraction (15-25%) of the energy and temperature sampling is required to achieve high levels of accuracy. This suggests a substantial reduction in computational effort is possible for similar systems.

Methodology:
The data are mechanism labels for the Molecular Dynamics (MD) simulation trajectories, assigned at the end of the allocated simulation time. The MD simulations were carried out using LAMMPS (Large-scale Atomic/Molecular Massively Parallel Simulator), and the atomic interactions were modeled using a classical all-atom reactive force field. The labels were assigned by computing the composition and the number of clusters in the system.

Two atoms were assigned to the same cluster if their distance was less than twice their typical bond length, that is, less than 0.44, 0.32, and 0.148 nm for Si/Si, Si/H, and H/H pairs, respectively.
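
The distance criterion above can be sketched as a small connected-components routine. This is a minimal illustration, not the authors' analysis code: it assumes the listed values (0.44, 0.32, 0.148 nm) are the cutoffs themselves, that positions are given in nm, and that periodic boundary conditions can be ignored. The function name `count_clusters` is hypothetical.

```python
import math
from itertools import combinations

# Pair cutoffs in nm, read from the text above (keys are sorted element pairs).
CUTOFF_NM = {("Si", "Si"): 0.44, ("H", "Si"): 0.32, ("H", "H"): 0.148}


def count_clusters(elements, positions):
    """Count groups of atoms connected by the pairwise distance cutoffs.

    Assumption: no periodic boundaries; every atom pair is checked (O(N^2)).
    """
    parent = list(range(len(elements)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(elements)), 2):
        pair = tuple(sorted((elements[i], elements[j])))
        if math.dist(positions[i], positions[j]) < CUTOFF_NM[pair]:
            parent[find(i)] = find(j)  # merge the two groups

    return len({find(i) for i in range(len(elements))})


# Illustrative geometry only: two nearby Si atoms and one distant H atom.
elements = ["Si", "Si", "H"]
positions = [(0.0, 0.0, 0.0), (0.235, 0.0, 0.0), (2.0, 0.0, 0.0)]
print(count_clusters(elements, positions))
```

For real trajectories, a neighbor list and minimum-image distances would likely be needed; both are omitted here for brevity.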

If the trajectory outcome is non-sticking (more than one cluster), the label is "-1";
If the trajectory outcome is physisorption (one cluster, but no chemical bond formed), the label is "0";
If the trajectory outcome is chemisorption (one cluster, chemical bonds formed), the label is "1", "2", "3", or "4", where the numerical value corresponds to the number of new bonds formed.
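
The three labeling rules above amount to a short mapping. The sketch below assumes the number of clusters and the number of newly formed bonds have already been computed elsewhere (this readme does not specify how new bonds are counted); the function name `trajectory_label` and its arguments are hypothetical.

```python
def trajectory_label(n_clusters: int, n_new_bonds: int) -> int:
    """Map a trajectory's end state to the label convention described above.

    -1    : non-sticking (more than one cluster)
     0    : physisorption (one cluster, no new chemical bond)
     1..4 : chemisorption (one cluster, value = number of new bonds)
    """
    if n_clusters < 1 or n_new_bonds < 0:
        raise ValueError("n_clusters must be >= 1 and n_new_bonds >= 0")
    if n_clusters > 1:
        return -1
    # One cluster: zero new bonds means physisorption, otherwise chemisorption.
    return n_new_bonds


print(trajectory_label(2, 0), trajectory_label(1, 0), trajectory_label(1, 3))
```

The text lists chemisorption labels only up to "4"; the sketch does not clamp larger bond counts, since the readme does not say how such cases would be handled.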

This approach is the same as in [2].

Instrument and/or Software specifications: NA

Files contained here:
The 66 CSV files correspond to the 66 molecular dynamics simulations run for this work, out of the 78 possible cluster-impactor combinations described above. We did not include the 12 combinations that were previously simulated in [2].
The title of each CSV file corresponds to the label of one of the MD trajectories, and the columns are as follows:
- Temperature: the system temperature. There are 5 temperatures (300K, 400K, 500K, 600K, and 900K).
- Configuration: the molecular orientation configurations. There are 5 configurations, denoted "2, 4, 6, 8, 10", for both impacting fragments, separated by an underscore "_". In total, there could be 5*5=25 configurations (although not all of them may be sampled).
- Velocity interval: the impact velocity interval percentages in terms of the CDF (cumulative distribution function). Typically, there are 200 velocity intervals, but they may be sub-sampled down to 40 velocity intervals for certain trajectories.
- Label: the trajectory outcome, as described in the "Methodology" section of this document.
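
As a rough illustration of how these columns might be used, the sketch below estimates a sticking fraction per temperature from one file. Several things are assumptions rather than facts about the dataset: that each CSV has a header row with the exact names `Temperature`, `Configuration`, `Velocity interval`, and `Label`; that "sticking" means any label of 0 or higher; and that all rows can be weighted equally. The function names are hypothetical, and the sample rows are synthetic.

```python
import csv
import io
from collections import defaultdict


def sticking_fraction_by_temperature(csv_file):
    """Return {temperature: fraction of trajectories with Label >= 0}."""
    counts = defaultdict(lambda: [0, 0])  # temperature -> [sticking, total]
    for row in csv.DictReader(csv_file):
        tally = counts[row["Temperature"]]
        tally[0] += int(row["Label"]) >= 0  # physisorption or chemisorption
        tally[1] += 1
    return {t: stuck / total for t, (stuck, total) in counts.items()}


def split_configuration(value):
    """Split a configuration such as "2_4" into its two orientation indices."""
    first, second = value.split("_")
    return int(first), int(second)


# Synthetic rows for illustration only; not taken from the dataset.
sample = io.StringIO(
    "Temperature,Configuration,Velocity interval,Label\n"
    "300,2_4,0.5,1\n"
    "300,2_4,1.0,-1\n"
    "900,6_10,0.5,0\n"
)
print(sticking_fraction_by_temperature(sample))
print(split_configuration("6_10"))
```

To read an actual file, an open file handle (for example `open("si4.si2h.csv", newline="")`) could be passed in place of `sample`, after checking the real header names.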

Below is an exhaustive list of the clusters and impactors evaluated in this work. Each interaction pairs one cluster with one impactor (one entry from each column).

| Cluster | Impactor |
|---------|----------|
| Si2H6 | SiH |
| Si4 | SiH2 |
| Si29H18 | SiH2-Si |
| Si29H27 | SiH3 |
| Si29H31 | SiH3-Si |
| Si29H36 | SiH3-SiH |
| | SiH4 |
| | SiH2-SiH |
| | Si2H |
| | Si2H2 |
| | Si2H4 |
| | Si2H5 |
| | Si2H6 |
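
The pairing described by this table can be enumerated to recover the file names. The sketch below assumes a naming pattern inferred from the file listing (lowercase `cluster.impactor.csv` with hyphens dropped), and assumes that the 12 combinations without a CSV are the Si2H6, Si4, and Si29H36 clusters paired with SiH, SiH2, SiH3, and SiH4, since those are the pairs absent from the listing. Neither point is stated explicitly in the readme.

```python
from itertools import product

CLUSTERS = ["Si2H6", "Si4", "Si29H18", "Si29H27", "Si29H31", "Si29H36"]
IMPACTORS = [
    "SiH", "SiH2", "SiH2-Si", "SiH3", "SiH3-Si", "SiH3-SiH", "SiH4",
    "SiH2-SiH", "Si2H", "Si2H2", "Si2H4", "Si2H5", "Si2H6",
]

# Pairs with no CSV in the file listing (inferred, see the note above).
EXCLUDED_CLUSTERS = {"Si2H6", "Si4", "Si29H36"}
EXCLUDED_IMPACTORS = {"SiH", "SiH2", "SiH3", "SiH4"}


def csv_name(cluster, impactor):
    """Build a file name such as 'si2h6.sih2si.csv' (assumed pattern)."""
    return f"{cluster}.{impactor}".lower().replace("-", "") + ".csv"


all_pairs = list(product(CLUSTERS, IMPACTORS))
included = [
    (c, i) for c, i in all_pairs
    if not (c in EXCLUDED_CLUSTERS and i in EXCLUDED_IMPACTORS)
]
file_names = [csv_name(c, i) for c, i in included]

print(len(all_pairs), len(included))
```

The two counts correspond to the 78 possible combinations and the 66 CSV files mentioned in the text.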

Related publication(s):
[1] Raymond, M., Elvati, P., Saldinger, J. C., Lin, J., Shi, X., Violi, A. (2025). Machine learning models for Si nanoparticle growth in nonthermal plasma. Plasma Sources Sci. Technol. https://doi.org/10.1088/1361-6595/adbae1
[2] Shi, X., Elvati, P., Violi, A. (2021). On the growth of Si nanoparticles in non-thermal plasma: physisorption to chemisorption conversion. J. Phys. D. https://doi.org/10.1088/1361-6463/ac0b71 (see also https://doi.org/10.7302/vd87-wm68)

Use and Access:
This data set is made available under the Creative Commons Public Domain Dedication (CC0 1.0).
