Statistical and Computational Methods for Single Cell and Spatial Transcriptomics
Xi, Jingyue
2023
Abstract
Single-cell RNA sequencing (scRNA-seq) and Spatial Transcriptomics (ST) have become instrumental tools for understanding cellular dynamics and heterogeneity in disease-related tissues. As these technologies rapidly advance, many statistical and computational challenges arise in analyzing the data they produce, and relatively few methods address the challenges of upstream processing. This dissertation investigates computational challenges of upstream data analysis for scRNA-seq and ST, including quality control to identify and filter out droplets comprised of ambient RNAs, enabling high-resolution inference of ST data from a new submicrometer-resolution technology, and developing robust computational tools and pipelines capable of handling ST platforms at various resolutions.
Following a brief overview of scRNA-seq, ST technologies, and related challenges in Chapter I, Chapter II approaches from multiple angles the problem of distinguishing cell-containing droplets from cell-free droplets that mostly contain ambient RNAs in scRNA-seq data. By leveraging efficient randomization, manifold visualization, a statistical test tailored for sparse scRNA-seq data, and machine learning methods, we develop SiftCell, a suite of software tools to identify and visualize cell-containing and cell-free droplets in manifold space via randomization, to classify the two types of droplets, and to quantify the contribution of ambient RNAs for each droplet. We also develop the Sparse Quantile Aggregation Test (SQuAT), a statistical test designed to aggregate quantile-based summary statistics from many sparse discrete datasets for meta-analysis. SQuAT robustly identifies likely cell-containing droplets and highly variable genes across cell types in sparse scRNA-seq data and is integrated as a core statistical method in SiftCell. Through a comprehensive evaluation of three scRNA-seq or snRNA-seq datasets, we demonstrate that SiftCell enables a new way of visualizing and locating cell-free droplets in the manifold space and outperforms existing methods in filtering cell-containing droplets and in quantifying ambient RNA contribution.
In Chapter III, we introduce Seq-Scope, a new submicrometer-resolution ST technology that repurposes the Illumina sequencing platform to achieve high resolution and scalability. Unlike other ST technologies, Seq-Scope does not require cumbersome image processing steps and leverages the existing sequencing platform to obtain spatial barcodes that are 0.5–0.8 μm apart from each other, achieving a resolution comparable to that of an optical microscope. We performed the complete Seq-Scope experimental and analytical procedure on two representative gastrointestinal tissues (liver and colon). This chapter focuses on the computational aspects that enable the analysis of data produced by the new Seq-Scope technology.
In Chapter IV, we build STtools, a comprehensive software pipeline that provides a versatile framework for handling ST datasets at various resolutions. STtools is designed to efficiently align, cluster, and visualize ST data, and it scales to millions of spatially resolved barcodes. STtools improves the resolution of spatial inference compared to typical segmentation-based approaches by leveraging the Multi-scale Sliding Window (MSSW) algorithm. We applied STtools to several ST platforms, including Seq-Scope, Slide-seq, and VISIUM, and showed that STtools enables both analysis and visualization at various resolutions.
Subjects
single-cell RNA sequencing; Spatial Transcriptomics
Types
Thesis