Data: Electrostatic simulation dataset for electrospray emitter geometry design
Author: Joshua Eckels
Email: eckelsjd@umich.edu
Reference: J.D. Eckels, C.B. Whittaker, B.A. Jorns, A.A. Gorodetsky, B. St. Peter, R.A. Dressler, “Simulation-based surrogate methodology of electric field for electrospray emitter geometry design and uncertainty quantification”, presented at the 37th International Electric Propulsion Conference, Boston, MA USA, June 19-23, 2022

Code repository for data processing scripts: https://github.com/eckelsjd/espet_pde.git

Software requirements:
Matlab R2021b with PDE toolbox
Python 3+

Instructions:
1. Unzip the "base_structure.zip" file. This gives the following format:
    /models - Trained feedforward neural network files (.onnx and .mat)
    /post - Sobol index results (.csv)
    /samples - Emitter geometry samples (.txt)
    /sims - Electrostatic simulation results files (.mat)
    /test - Test data for neural network (.mat)
    /train - Training data for neural network (.mat)
2. The "sim_train_X.zip" files contain the raw .mat simulation files that were used to train the neural network. Download and unzip all of these into the /sims folder.
3. The "sim_test_X.zip" files contain the raw .mat simulation files that compose the test set for the neural network. Download and unzip all of these into the /test/sims folder.
4. Visit the code repository (linked above) for documentation on how to load, view, and process the raw simulation data. A usage example is included in the linked code repository (open access).

Details on file contents:

/models/esi_surrogate.onnx - this file is in the Open Neural Network Exchange (ONNX) format for distributing trained neural network weights. It can be ported into most open-source frameworks and languages (Python, etc.)

/models/model_onnxnet.mat - this is the trained neural network that can be loaded directly into Matlab

/models/norm_data.mat - this is normalization data needed to generate predictions with the neural network.
See the code repository for a usage example.

/post - this directory contains .csv files for Sobol indices of the dataset. S1=first-order Sobol index, S2=second-order Sobol index, ST=total-order Sobol index. See the code repository for sensitivity analysis scripts, and see the conference paper for details on Sobol analysis.

/*/samples.txt - These text files contain raw samples of emitter geometry in a 5-column format (d, rc, alpha, h, ra). See the conference paper for the interpretation of the geometry parameters. See the code repository for the scripts that generated the samples.

/samples - Contains the geometry samples of the emitters used to train the neural network

/sims - Contains the raw .mat simulation files used to train the neural network

/train - Contains a compact form of the training dataset in 2 .mat files:
    train_dffnet_max.mat - Used to train a feedforward neural network with 5 inputs and 1 output, which is the peak electric field only
    train_geometry_cell.mat - Contains a Matlab cell array of the full electric field distribution on each training sample. This can be useful in future work to learn the full electric field distribution rather than just the peak
    - See the code repository for scripts that parse the /sims folder into these compact .mat files

/test - Contains the samples, sims, and compact .mat files for the testing dataset. These are all analogous to the training data described above. The /test folder has separate subfolders for the test set samples (test/samples) and simulations (test/sims)
    - This data is used to test the performance of the neural network on unseen emitter geometries
    - The "test_results.txt" file has the same layout as the "samples.txt" files, but with an additional column containing the relative percent error of the neural network on the test set.

"sampler_input.json" - This file is included mainly for completeness. It is used to extract the bounds of the dataset from which emitter geometry is sampled.
It is already included in the code repository.

Final notes:
- The file structure contained within these zip files is compatible with the Matlab and Python scripts in the code repository. In fact, the code repository has a similar file structure, into which this dataset can be copied as is.
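
Steps 1-3 of the instructions can also be scripted. The following is a minimal sketch, not part of the code repository: the function name `unpack_dataset` and its arguments are hypothetical, and it assumes that "base_structure.zip" extracts the listed folders directly into the chosen root and that the "sim_train_X.zip" and "sim_test_X.zip" archives hold the .mat files without a wrapping folder. Check the extracted layout against the folder list above before relying on it.

```python
import zipfile
from pathlib import Path


def unpack_dataset(download_dir, root):
    """Unpack the downloaded archives into `root` (hypothetical helper).

    Assumes all .zip files sit in `download_dir` and that the simulation
    archives contain .mat files at their top level.
    """
    download_dir, root = Path(download_dir), Path(root)

    # Step 1: base_structure.zip provides /models, /post, /samples, /sims, /test, /train.
    with zipfile.ZipFile(download_dir / "base_structure.zip") as zf:
        zf.extractall(root)

    # Steps 2 and 3: training simulations go to /sims, test simulations to /test/sims.
    targets = {
        "sim_train_*.zip": root / "sims",
        "sim_test_*.zip": root / "test" / "sims",
    }
    for pattern, dest in targets.items():
        dest.mkdir(parents=True, exist_ok=True)
        for archive in sorted(download_dir.glob(pattern)):
            with zipfile.ZipFile(archive) as zf:
                zf.extractall(dest)
    return root
```

For example, `unpack_dataset("downloads", "espet_data")` would build the folder structure under a directory named "espet_data"; both directory names are placeholders.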
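
The 5-column "samples.txt" layout (d, rc, alpha, h, ra) described above can be read with a few lines of standard-library Python. This is a sketch under stated assumptions, separate from the loaders in the code repository: the function name `read_samples` is hypothetical, and the code assumes one sample per line, no header row, and values separated by whitespace and/or commas. The actual delimiter is not stated here, so inspect a file first.

```python
import re
from pathlib import Path

# Column order as listed for the samples.txt files.
COLUMNS = ("d", "rc", "alpha", "h", "ra")


def read_samples(path, extra_columns=()):
    """Read a samples file into a list of dicts keyed by column name.

    `extra_columns` names any trailing columns beyond the five geometry
    parameters, e.g. the relative percent error in "test_results.txt".
    """
    names = COLUMNS + tuple(extra_columns)
    rows = []
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Assumption: whitespace and/or comma separated numeric values.
        values = [float(v) for v in re.split(r"[,\s]+", line.strip(","))]
        if len(values) != len(names):
            raise ValueError(
                f"expected {len(names)} columns, got {len(values)}: {line!r}"
            )
        rows.append(dict(zip(names, values)))
    return rows
```

Passing `extra_columns=("rel_error_pct",)` when reading "test_results.txt" would expose the additional error column under that (arbitrarily chosen) key.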