Manganese in the sedimentary record has been interpreted by many as a powerful redox proxy for paleoenvironments, and yet very little work has been done to ensure that the manganese-rich minerals in the rock record are actually recording primary signals. In the accompanying manuscript, we present an in-depth characterization of the manganese mineralogy from two correlated regions recording the Transvaal Supergroup in South Africa with markedly different alteration histories to investigate if there can be post-depositional emplacement of manganese-rich minerals. The data uploaded here are X-ray absorption spectra of (1) manganese standard minerals that were useful in our analyses and (2) minerals from an important well-characterized sample that may be useful as comparative standards in future studies.
The relationships between words in a sentence often tell us more about the underlying semantic content of a document than the individual words themselves. Recent publications in natural language processing, specifically those using word embeddings, try to incorporate semantic aspects into their word vector representations by considering the context of words and how they are distributed in a document collection. In this work, we propose two novel algorithms, called Flexible Lexical Chain II and Fixed Lexical Chain II, that combine the semantic relations derived from lexical chains, prior knowledge from lexical databases, and the robustness of the distributional hypothesis in word embeddings into a single decoupled system. In short, our approach has three main contributions: (i) unsupervised techniques that fully integrate word embeddings and lexical chains; (ii) a more solid semantic representation that considers the latent relation between words in a document; and (iii) lightweight word embeddings models that can be extended to any natural language task. Knowledge-based systems that use natural language text can benefit from our approach to mitigate ambiguous semantic representations provided by traditional statistical approaches. The proposed techniques are tested against seven word embeddings algorithms using five different machine learning classifiers over six scenarios in the document classification task. Our results show that the integration of lexical chains and word embeddings representations sustains state-of-the-art results, even against more complex systems.
The writing samples included in this folder were collected as part of a longitudinal study in writing development published in Developing Writers in Higher Education: A Longitudinal Study (University of Michigan Press, 2019). Writing samples were chosen and uploaded by students as part of the study and come from lower- and upper-level courses. To learn more about this study, please see the epublication https://doi.org/10.3998/mpub.10079890.
The interviews included in this folder were conducted as part of a longitudinal study in writing development published in Developing Writers in Higher Education: A Longitudinal Study (University of Michigan Press, 2019). Interviews were conducted upon students' entry into the study (files labelled "entry") and exit from the study (files labelled "exit"). To learn more about this study, please see the epublication https://doi.org/10.3998/mpub.10079890 and the website https://www.developingwritersbook.org/pages/about/about-the-study/.
This data set is a collection of word similarity benchmarks (RG65, MEN3K, Wordsim 353, simlex999, SCWS, yp130, simverb3500) in their original format and converted into a cosine similarity scale.
In addition, we provide two Wikipedia dumps, from April 2010 and January 2018, each in its original format (raw words) and converted using the techniques described in the paper (MSSA, MSSA-D and MSSA-NR) (title in this repository), along with the word embeddings models for 300d and 1000d trained using a word2vec implementation. A readme.txt is provided with more details for each file.
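The conversion of benchmark scores to a cosine similarity scale mentioned above can be illustrated with a small sketch. The source does not state the exact mapping used, so this assumes a plain linear (min-max) rescaling from the benchmark's rating range to a target range; the function name `rescale_scores` and its parameters are hypothetical and not taken from the data set.

```python
def rescale_scores(scores, old_min, old_max, new_min=0.0, new_max=1.0):
    """Linearly map ratings from [old_min, old_max] to [new_min, new_max].

    Assumption: the benchmark's rating range is known in advance. The
    conversion actually used in the data set may differ from this one.
    """
    if old_max <= old_min:
        raise ValueError("old_max must be greater than old_min")
    span = old_max - old_min
    return [new_min + (s - old_min) * (new_max - new_min) / span for s in scores]


# Hypothetical ratings on a 0 to 10 scale, mapped to [0, 1] and to [-1, 1].
ratings = [0, 5, 10]
unit_scale = rescale_scores(ratings, 0, 10)
signed_scale = rescale_scores(ratings, 0, 10, new_min=-1.0, new_max=1.0)
```

Because the mapping is linear and increasing, it preserves the ranking of word pairs, so rank-based evaluation such as Spearman correlation is unaffected by the rescaling.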
SPSS is required to access the processed dataset in .sav format. Model output is provided as a Word document, and the Qualtrics survey instrument is included as PDF and .docx; the .docx version contains the survey logic and question numbers.
The work on accelerating authenticated boot for embedded systems resulted in an algorithm, written in Python, that performs the random address generation and cryptographic MAC calculation.
The Sampled Boot schemes implemented in this package allow a significant reduction of the time
needed to authenticate firmware images during startup, while still retaining a high degree of trust.
This is particularly useful for automotive applications in which startup time constraints make secure boot a time-prohibitive process. Citation for this dataset: Nasser, A., Gumise, W. (2019). Authenticated Boot Acceleration Algorithm [Code and data]. University of Michigan Deep Blue Data Repository. https://doi.org/10.7302/yeh1-1x17
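The general idea of combining random address generation with a MAC calculation can be sketched as follows. This is only an illustration under stated assumptions, not the scheme implemented in the package: it assumes fixed-size blocks, a seeded pseudo-random choice of block offsets, and HMAC-SHA256 as the MAC. All function names and parameters are hypothetical.

```python
import hashlib
import hmac
import random


def sample_offsets(image_len, block_size, num_blocks, seed):
    """Pick distinct block-aligned offsets pseudo-randomly (assumed scheme)."""
    total_blocks = image_len // block_size
    # random.Random is NOT cryptographically secure; it is used here only to
    # keep the sketch deterministic for a given seed.
    rng = random.Random(seed)
    picked = rng.sample(range(total_blocks), min(num_blocks, total_blocks))
    return sorted(index * block_size for index in picked)


def sampled_mac(image, key, offsets, block_size):
    """HMAC-SHA256 over the sampled blocks only, binding each block to its offset."""
    mac = hmac.new(key, digestmod=hashlib.sha256)
    for offset in offsets:
        mac.update(offset.to_bytes(8, "big"))
        mac.update(image[offset:offset + block_size])
    return mac.digest()


def verify_sampled(image, key, offsets, block_size, expected_tag):
    """Recompute the sampled MAC and compare it in constant time."""
    actual = sampled_mac(image, key, offsets, block_size)
    return hmac.compare_digest(actual, expected_tag)
```

Authenticating only a sample of blocks reduces the amount of data hashed at startup, at the cost of detecting a modification only when it falls inside a sampled block. That is the trade-off between boot time and degree of trust described above.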
This collection represents various raw data and analysis of cores extracted during the November 2008 mission of R/V Melville in the Santa Barbara Basin. The core included is the jumbo piston core MV0811-14JC. Core photos, physical properties and magnetic susceptibility from the multisensor track (MST), and the scanning X-ray fluorescence (XRF) data are included in the collection. Cruise DOI: 10.7284/903459
The research is funded by NSF OCE-1304327.
The data and scripts are intended to show that seizure onset dynamics and evoked responses change over the progression of epileptogenesis in this intrahippocampal tetanus toxin rat model. All tests explored in this study can be repeated with the data and scripts included in this repository. Dataset citation: Crisp, D.N., Cheung, W., Gliske, S.V., Lai, A., Freestone, D.R., Grayden, D.B., Cook, MJ., Stacey, W.C. (2019). Epileptogenesis modulates spontaneous and responsive brain state dynamics [Data set]. University of Michigan Deep Blue Data Repository. https://doi.org/10.7302/r6vg-9658
These data and scripts are meant to test and demonstrate seizure differentiation based on bifurcation theory. A zip file is included which contains real and simulated seizure waveforms, MATLAB scripts, and metadata. The MATLAB scripts allow for visual review validation and objective feature analysis. The file “README.txt” provides more detail about each individual file within the zip file. Data citation: Crisp, D.N., Saggio, M.L., Scott, J., Stacey, W.C., Nakatani, M., Gliske, S.F., Lin, J. (2019). Epidynamics: Navigating the map of seizure dynamics - Code & Data [Data set]. University of Michigan Deep Blue Data Repository. https://doi.org/10.7302/ejhy-5h41
This collection represents various raw data and analysis of cores extracted during the January 2009 mission of the research vessel Sproul in the Santa Barbara Basin. Cores included: box core SPR0901-04BC, box core SPR0901-unnamed, and Kasten core SPR0901-03KC. Core photos, physical properties and magnetic susceptibility from the multisensor track (MST), and the scanning X-ray fluorescence (XRF) data are included in the collection. Cruise DOI: 10.7284/901089
This research is funded by NSF-OCE 0752093.
The NASA MAVEN (Mars Atmosphere and Volatile Evolution) spacecraft, which is currently in orbit around Mars, has been taking monthly measurements of the speed and direction of the winds in the upper atmosphere of Mars, between about 140 and 240 km above the surface. The observed wind speeds and directions change with time and location, and sometimes fluctuate quickly. These measurements are compared to simulations from a computer model of the Mars atmosphere called M-GITM (Mars Global Ionosphere-Thermosphere Model), developed at the University of Michigan. This is the first comparison between direct measurements of the winds in the upper atmosphere of Mars and simulated winds, and it is important because it can help to identify which physical processes are acting on the observed winds. Some measured wind speeds or directions are similar to those predicted by the M-GITM model, but at other times there are large differences between the simulated and measured winds. The disagreements between wind observations and model simulations suggest that processes other than normal solar forcing may become relatively more important during these observations and alter the expected circulation pattern. Since the global circulation plays a role in the structure, variability, and evolution of the atmosphere, understanding the processes that drive the winds in the upper atmosphere of Mars provides key context for understanding how the atmosphere behaves as a whole system.
A basic version of the M-GITM code can be found on GitHub as follows:
About 30 Neutral Gas and Ion Mass Spectrometer (NGIMS) wind campaigns (of 5 to 10 orbits each) have been conducted by the MAVEN team (Benna et al., 2019). Five of these campaigns are selected for detailed study (Roeten et al., 2019). The Mars conditions for these five campaigns have been used to launch corresponding M-GITM code simulations, yielding 3-D neutral wind fields for comparison to these NGIMS wind observations. The M-GITM datacubes used to extract the zonal and meridional neutral winds, along the trajectory of each orbit path between 140 and 240 km, are provided in this Deep Blue Data archive. README files are provided for each datacube, detailing the contents of each file. A general README file is also provided that summarizes the inputs and outputs of the M-GITM code simulations for this study.
This is the flora-fauna lexical material obtained in the course of more general lexical and grammatical fieldwork on languages of central-eastern Mali (Dogon, Songhay, Bangime, Bozo). The spreadsheets in this work, duplicated in xlsx and csv formats, present our flora-fauna lexicons as of early 2019 for many languages of central-eastern Mali, and certain languages of southwestern Burkina Faso. The Malian data is in two spreadsheets (flora, fauna), while the Burkina data is in separate spreadsheets for flora, birds, fish, insects, lizards and snakes, and mammals. Please begin with the “readme” document.
Our project, mainly on Dogon languages of Mali, has branched out to Burkina Faso with emphasis on documentation of the most endangered languages. Tiefo-N was studied on an emergency basis since it was down to two aging competent speakers. For additional comments and links to a reference grammar, see the readme file.
The work on the Bangime language, spoken by the Bangande people, was carried out as part of a larger linguistic fieldwork project focused on Dogon languages. Bangime is confirmed as a language isolate with no demonstrable linguistic relatives—possibly the only such isolate in West Africa.
Jalkunan is an endangered language of the Mande family, spoken in the village cluster of Blédougou in southwestern Burkina Faso. The lexical work complements a published grammar with texts. See the readme for further information.
The research adheres to PRISMA-HARM recommendations for systematic reviews. The reproducible search strategies for all databases, the citation export files from all databases, and the eligibility screening decisions are included in the dataset.
The search data supports a literature review project on lifestyle therapies for the management of atrial fibrillation. The data included in the dataset are the reproducible search strategies (in docx) and the exported results of all citations from all databases (txt and ris files). These searches and exported result files contain all citations originating from the database searches that were considered for inclusion.