Computational Approaches Enabling Disparately Acquired Untargeted LC-MS Metabolomics Data Analysis
Habra, Hani
2022
Abstract
Metabolomics is the systematic study of small molecule metabolites that are substrates, intermediates, and products of cellular metabolism. Metabolomics assays performed using liquid chromatography-mass spectrometry (LC-MS) typically detect thousands of analytes, or features, each characterized by a mass-to-charge (m/z) ratio and a retention time (RT). The objective of untargeted metabolomics is to detect, quantify, and identify as many compounds as possible and to relate their abundances to phenotypic outcomes. Metabolite identification is a major bottleneck in metabolite profiling studies, with only a small percentage of observed features unambiguously identified in a typical experiment. A substantial proportion of the detected features consists of in-source adducts, fragments, isotopologues, complexes, and chemical and computational artifacts. Detecting and removing these redundancies is essential for improving statistical power in downstream analysis, as many metabolomics studies have limited sample sizes. LC-MS assays can be performed using a wide range of chromatographic conditions, instruments, and other analytical techniques. Differences in protocols between and within laboratories create numerous challenges for information transfer and meta-analysis, especially for unidentified compounds. This dissertation focuses on developing computational methods that enable the analysis of disparately acquired LC-MS metabolomics data and on demonstrating their benefits in compound identification and biomedical investigations. First, I describe Binner, a standalone application for annotating in-source adducts, fragments, complexes, and isotopologues derived from a common metabolite, thus facilitating the reduction of feature tables to a parsimonious expression of the detected metabolome. I highlight the unique capabilities of Binner, including its superior annotation performance compared to existing programs and its modules for facilitating the discovery of complex annotations.
Second, I describe metabCombiner, a software package for aligning metabolomics measurements acquired under similar, but non-identical, conditions, concatenating their values to generate merged feature tables. metabCombiner uses a spline-based modeling approach to project across substantial gaps in retention times and a weighted similarity score to match features corresponding to identical analytes. This package forms the basis for expanded sample size analyses as well as information transfer between protocols, instruments, and laboratories. I detail multiple applications in which compound identification rates in plasma, urine, and other specimens are improved by coupling disparate LC-MS alignment to novel experimental and computational elucidation approaches. A framework consisting of alignment and normalization steps for the removal of intra-batch, inter-batch, and inter-experiment variation in retention times and acquired signal was developed and applied to metabolomics studies of ALS and pregnancy. Subsequent statistical and bioinformatics approaches using partial correlation networks performed on the aligned, normalized datasets illustrate the benefits of combining datasets, despite major differences in experimental conditions. Together, these computational methods address numerous data analysis challenges and unlock new opportunities in the metabolomics field as well as other fields that utilize LC-MS for high-throughput measurements.
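The alignment idea described above (project retention times from one dataset onto another, then score candidate matches) can be illustrated with a minimal Python sketch. This is not the metabCombiner implementation or its API: the function names `project_rt` and `similarity`, the weights `w_mz` and `w_rt`, the score formula, and the example values are all assumptions made for illustration, and piecewise-linear interpolation between anchor pairs stands in for the spline fit.

```python
from bisect import bisect_right


def project_rt(rt_x, anchors):
    """Map a retention time from dataset X onto dataset Y's time scale.

    anchors: (rt_x, rt_y) pairs sorted by rt_x, assumed to be features
    already trusted to represent the same analyte in both datasets.
    Piecewise-linear interpolation is a stand-in for a smoothing spline.
    """
    xs = [a[0] for a in anchors]
    ys = [a[1] for a in anchors]
    # Clamp outside the anchored range instead of extrapolating.
    if rt_x <= xs[0]:
        return ys[0]
    if rt_x >= xs[-1]:
        return ys[-1]
    i = bisect_right(xs, rt_x)
    frac = (rt_x - xs[i - 1]) / (xs[i] - xs[i - 1])
    return ys[i - 1] + frac * (ys[i] - ys[i - 1])


def similarity(fx, fy, anchors, w_mz=100.0, w_rt=1.0):
    """Hypothetical weighted score in (0, 1]; higher means a closer match.

    Penalizes the m/z difference and the gap between the projected
    and the observed retention time, each scaled by its own weight.
    """
    mz_diff = abs(fx["mz"] - fy["mz"])
    rt_err = abs(project_rt(fx["rt"], anchors) - fy["rt"])
    return 1.0 / (1.0 + w_mz * mz_diff + w_rt * rt_err)


# Made-up anchor pairs and features, for illustration only.
anchors = [(1.0, 1.5), (5.0, 7.0), (10.0, 12.0)]
fx = {"mz": 180.0634, "rt": 3.0}
candidates = [
    {"mz": 180.0636, "rt": 4.2},  # close in m/z and in projected RT
    {"mz": 180.0634, "rt": 9.0},  # same m/z, but far from projected RT
]
best = max(candidates, key=lambda fy: similarity(fx, fy, anchors))
```

In this toy case the first candidate is selected, because its observed retention time lies near the projected value even though its m/z differs slightly. Weighting the two penalties separately lets the tolerance for mass error and retention time error be tuned independently.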
Subjects
metabolomics computational methods
Types
Thesis