INTRODUCTION:

Thank you for your interest in these data. Contained here should be all code and data needed to regenerate the results of our manuscript ("Statistical Characterization of Emitter Fabrication over an Electrospray Array Thruster"), including the data-containing figures. We have included these in the interest of making our research as transparent and reproducible as possible. Each of the underlying pieces of code is thoroughly documented and commented internally. If you have questions about the code or cannot get it working, please reach out and I would be happy to help. My aim is for this to allow full transparency and reproducibility of my results. I hope that including these tools encourages others in the electrospray community to analyze emitter geometry thoroughly, to propagate it through model predictions, and to inform inferences about experimental data.

Sincerely,
Collin B. Whittaker
cbwhitt@umich.edu

Last updated: 08 Jul 2024

RESEARCH OVERVIEW:

This work represents an effort to characterize the emitter geometry of a porous conical-type electrospray array thruster extensively, rigorously, and procedurally. Data were collected at NASA Glenn Research Center in Cleveland, OH from 06-26 Sep 2022 and analyzed primarily in October/November of that year, with subsequent updates/improvements over the next two years to support publication. Data were collected by Collin B. Whittaker with help from Dr. Jon Mackey of NASA GRC, and data were analyzed by Collin B. Whittaker. This work was supported by a NASA Space Technology Graduate Research Opportunity (80NSSC21K1247).

METHODS:

A 576-emitter thruster, the AFET-P-003, was inspected using a coherence scanning interferometer, a form of white light interferometer. The 9 topographic maps produced are organized into NorthWest, North, NorthEast, West, Center, East, SouthWest, South, and SouthEast subregions. These topographic maps of the emitter geometry are divided to produce 543 individual sites ready for analysis. A geometric model of an emitter as a spherically-capped cone recessed within a circular aperture is fit to each site, with the parameters of this model describing the high-level features of the emitter geometry. We subsequently computed statistics over these parameters for the entire population of 543 sites to make inferences about the manufacturing process and to inform future modeling efforts.

More simply put, by shining a light at the thruster and then observing the interference pattern from its reflection, the shape of each of the microscopic needles that make up the thruster can be measured and a map of the height at each position produced. We divide this map into subsections that each contain a single needle. Because the needles are not manufactured perfectly (the surface may be bumpy, or their sides may be missing chunks), we extract key dimensions of each needle (e.g., how tall it is, how sharp it is) by fitting an idealized blunt cone geometry to the data and reading off the parameters of this best-fit shape. Once we have repeated this for every needle, we can compute statistics over the needles (such as the mean height or the standard deviation of how sharp they are), which represent variability (i.e., tolerances) in our manufacturing process.
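To make the fitting step concrete, below is a minimal, hypothetical sketch of the idea in Python: an axisymmetric spherically-capped cone is fit to synthetic height data by nonlinear least squares (SciPy's least_squares is my choice here for illustration). This is only a sketch of the approach, simplified to ignore the aperture/recess portion of the full model; the actual model and fitting interface are those of Code/EmissionModel/GeometryModel/geometry_model.py and ls_fit.py, and none of the names or parameter values below come from that code.

    # Hypothetical sketch only: the real model and fit live in geometry_model.py
    # and ls_fit.py; names and values here are illustrative, not the actual API.
    import numpy as np
    from scipy.optimize import least_squares

    def capped_cone_height(r, h_apex, alpha, R_c):
        # Axisymmetric spherically-capped cone, height vs. radial distance r.
        # h_apex: apex height; alpha: cone half-angle [rad];
        # R_c: radius of curvature of the spherical cap.
        # The cap meets the cone flank tangentially at r_t = R_c*cos(alpha).
        r_t = R_c * np.cos(alpha)
        r_cap = np.minimum(r, r_t)                       # clamp to keep sqrt real
        cap = h_apex - R_c + np.sqrt(R_c**2 - r_cap**2)  # spherical cap, r <= r_t
        h_t = h_apex - R_c + R_c * np.sin(alpha)         # height at tangent point
        cone = h_t - (r - r_t) / np.tan(alpha)           # conical flank, r > r_t
        return np.where(r <= r_t, cap, cone)

    def residuals(p, r, z):
        return capped_cone_height(r, *p) - z

    # Stand-in for one segmented site: synthetic heights with measurement noise
    rng = np.random.default_rng(0)
    r = np.linspace(0.0, 150e-6, 400)  # radial coordinate [m]
    z = capped_cone_height(r, 300e-6, np.deg2rad(30.0), 20e-6)
    z += 0.1e-6 * rng.standard_normal(r.size)

    # Initial guesses matter; the preprocessor supplies them in the real pipeline
    fit = least_squares(residuals, x0=[250e-6, np.deg2rad(25.0), 10e-6],
                        args=(r, z))
    h_apex, alpha, R_c = fit.x

    # With fits for all sites in hand, population statistics follow directly,
    # e.g. np.mean(...) and np.std(..., ddof=1) over the per-site parameters.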
ANALYSIS FLOW:

Below is an overview of the data analysis procedure, including the order in which code should be run to reproduce results. I have included all the intermediate files created for the analysis I performed, so it is possible to reproduce starting at essentially any stage.

-----
Code/TopoData_Preprocessor.m
- run individually on all 9 raw .datx files output by the Zygo profilometer
- each run produces a corresponding .datp file as the user segments the domain
VVVVV
Code/EmissionModel/GeometryModel/ls_fit.py
- called with all 9 .datp files as arguments (or called individually on all 9 files)
- I ran this from the command line and wrapped it in a batch script so that I could submit it as a job to a high performance computing cluster
- it took about 1.5 hours to process all 9 .datp's on two Intel Xeon Gold 6154 CPUs with a total of 36 cores (one node on our cluster), so a conventional quad-core machine would instead take tens of hours (essentially an entire day) to perform the same analysis; thus, high performance computing is recommended if you have access to it, but not essential
- produces 9 corresponding .datf files
VVVVV
Code/geom_extract.py
- called as is
- if you generate your own .datf files using ls_fit.py, you will need to change the names in the script to target the correct files
- these files also contain metadata about the fitting process itself, which is described more thoroughly in ls_fit.py
- this essentially concludes data analysis, producing a file that lists the geometry of every site in the array, but further plotting can be done
VVVVV
Various plotting files
- once the geometry information has been extracted from the .datp's using geom_extract.py, it is possible to run the plotting scripts and generate the figures
- note that several of these reference metadata about which sites were excluded from analysis, etc., so if you made different choices in segmenting the data this will manifest differently; I believe they are set up such that no modification of the code should be necessary, but I cannot guarantee it
-----

FILE INVENTORY:

Below appears a description of all the files included here.

--------------------
Python-spec-file.txt

A spec file for conda that can be used to pull the necessary packages into a new conda environment by running:

    conda create --name myenv --file Python-spec-file.txt

While this environment differs from that used to generate the results originally, as far as I can tell it reproduces them exactly, at least on the same platform. Note that it only includes those packages that were explicitly requested in building the environment, and not the related dependencies, etc. As a result, dependencies may have changed and could be subtly different. Feel free to reach out if you experience difficulties. You may need to modify the "prefix" to match your machine. For the MATLAB codes, anything R2023a or newer should work.

--------------------
README.txt

This readme file.

--------------------
Data/*.datx

Where referenced, these are the raw data files exported by the Zygo machine. They are simply large HDF5 files that encapsulate the results of performing a stitch on the machine.
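Since the internal layout of these HDF5 files is defined by Zygo rather than by this code base, a quick way to inspect one before relying on any dataset names is sketched here (the file name below is a placeholder, not one of the actual data files):

    # Hypothetical inspection snippet; "Data/example.datx" is a placeholder name.
    import h5py

    with h5py.File("Data/example.datx", "r") as f:
        f.visit(print)  # list every group/dataset path inside the file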
--------------------
Data/*.datp

These are the files exported by my preprocessing software (Code/TopoData_Preprocessor.m), which chops up the raw data into individual sites, computes some initial guesses for the geometry, and rewrites it into a format that the geometry-fitting code expects. They are HDF5 files with a format internal to the code base. Since the preprocessor uses a human in the loop to segment the data, the results will not be exactly the same between multiple users. The .datp HDF5 files contain metadata that fully describes how this was done by the user, which could be used to reproduce the results exactly. At this time, the preprocessor does not support taking one of these files as an input to modify the choices made, though I wish to incorporate this functionality in the future. In any case, I have included these files directly, so further analysis can still be reproduced exactly.

--------------------
Data/*.datf

These are the files exported by Code/EmissionModel/GeometryModel/ls_fit.py and contain information about the fits of the geometry model to each site, which are largely read by the plotting scripts. They are HDF5 files with a format internal to the code base.

--------------------
Code/EmissionModel

This folder is copied over from a local library that I maintain for my own research. The files included are those used for the paper.

--------------------
Code/EmissionModel/__init__.py

Just a Python package indicator file.

--------------------
Code/EmissionModel/GeometryModel/__init__.py

The same for the GeometryModel subpackage.

--------------------
Code/EmissionModel/GeometryModel/geometry_model.py

This module implements the geometry model of the paper (i.e., as a function in Python).

--------------------
Code/EmissionModel/GeometryModel/ls_fit.py

This module implements a standard interface for fitting the geometric model of the geometry_model.py module to a set of data formatted as exported by Code/TopoData_Preprocessor.m. This was the code used to generate the *.datf data files referenced by other code.

--------------------
Code/EmissionModel/GeometryModel/HPC_ls_fit.sh

This is a batch script for calling out to ls_fit.py on the University of Michigan's Great Lakes high performance computing cluster, which runs SLURM and is where the codes were actually run.

--------------------
Code/TopoData_Preprocessor.m

This script provides a graphical interface that allows a user to import data from a topographic measurement and then procedurally segment those data into different sites within an array. It supports a few different file formats (Zygo, Keyence, tab-delimited; see internal documentation) and has broader functionality than that used in the paper. Though worth only a single line in the manuscript, this code took a while to develop and represents a crucial part of the data analysis pipeline, which is why I have included it. If you make any of your own edits or improvements, please let me know and we can perhaps migrate it to a central GitHub repository or the like.

--------------------
Code/geom_extract.py

This script just reorganizes fit results from the 9 separate subregions into a collated representation for the entire array.
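For readers who want the gist without opening the script, a hedged sketch of the collation idea appears below. The subregion file names and the "params" dataset name are assumptions for illustration only, not the actual internal .datf format (see ls_fit.py for that):

    # Hypothetical sketch of collating per-subregion fits into one array-wide
    # table; the dataset and file names are assumed, not the real .datf layout.
    import h5py
    import numpy as np

    regions = ["NorthWest", "North", "NorthEast", "West", "Center",
               "East", "SouthWest", "South", "SouthEast"]

    per_region = []
    for region in regions:
        with h5py.File(f"Data/{region}.datf", "r") as f:
            per_region.append(f["params"][...])  # per-site best-fit parameters

    site_params = np.concatenate(per_region, axis=0)  # one row per site
    print(site_params.shape)  # expect (543, n_params) after exclusions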
--------------------
Code/dimsPlot.m

This function produces the schematic for site geometry included in the paper.

--------------------
Code/arrayProfilePlot.m

This script produces figures showing the raw topographic map for each subregion of the array along with numbered sites. It produces a figure like that shown for the SW region in the paper for each of the 9 regions.

--------------------
Code/fitCompPlot.m

This script produces figures like the fit comparison for site 097 included in the paper, producing one for every single site in the array to give a more complete record. The best-fit parameters are rendered on the figures themselves.

--------------------
Code/fullCornerPlot.m

This script produces the corner plot included in the paper. By adjusting the code, the plot can be extended to include additional parameters (e.g., r_b). It uses a predefined plotting function of mine, Code/cornerPlot.m.

--------------------
Code/cornerPlot.m

A MATLAB function that is a canned way of making a corner plot. It is thoroughly documented.

--------------------
Code/geomScatterPlot.m

This script produces the site-by-site scatter plots for different parameters shown in the paper. The script generates scatter plots for many more parameters than could be included in the paper.

--------------------
Code/convergencePlot.m

This script produces the convergence study for subsample size presented in the paper. It calculates additional convergence curves that are not plotted.
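As a rough illustration of what such a convergence study does (this is a generic, bootstrap-style sketch in Python on synthetic data, not a transcription of convergencePlot.m), one can draw subsamples of increasing size and watch the spread of a statistic shrink:

    # Generic subsample-size convergence sketch; synthetic data, illustrative only.
    import numpy as np

    rng = np.random.default_rng(1)
    heights = rng.normal(300e-6, 10e-6, size=543)  # stand-in for per-site heights

    sizes = np.arange(10, 541, 10)
    spread = []
    for n in sizes:
        # sample means over many random subsamples of size n
        means = [heights[rng.choice(543, size=n, replace=False)].mean()
                 for _ in range(200)]
        spread.append(np.std(means))
    # the spread of the subsample mean shrinks as n approaches the full
    # 543-site population, which is the sense in which the statistic converges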