Massively Parallel Screens to Identify Splice Disruptive Variants in Human Disease Genes
Smith, Cathy
2023
Abstract
Splicing is a critical step in mRNA maturation with roles in gene regulation and proteome diversification. Splice disruptive variants (SDVs) are implicated in diverse human diseases, and 10-33% of exonic variants may disrupt splicing. However, identifying SDVs remains challenging due to the degeneracy and redundancy of the underlying sequence code. Experimental splicing measurements from patient cells or minigene assays can detect SDVs but have traditionally been low-throughput. Massively parallel splicing assays (MPSAs) systematically measure splicing impacts at scale and could clarify variant pathogenicity and inform models of splicing regulation. In an MPSA, complex barcoded libraries of mutant exons are synthesized, cloned into minigene constructs, and transfected into human cells. Splicing outcomes of each mutation are quantified en masse using targeted RNA-seq of minigene-derived transcripts and analyzed with a custom Python package. In Chapter 2, I apply this assay to the pituitary transcription factor gene POU1F1 (in collaboration with Dr. Sally Camper’s lab). Mutations in POU1F1 cause combined pituitary hormone deficiency (CPHD), a clinically and genetically heterogeneous disorder with prevalence ~1:4000. We targeted exon 2, which is spliced at competing acceptor sites to produce two alternative isoforms (alpha and beta) that encode mutually antagonistic proteins. We measured the splicing effects of 1,070 SNVs across the exon and surrounding introns and identified 96 SDVs, 14 of which were synonymous substitutions. Our measurements were concordant with six nearby heterozygous missense and synonymous variants seen in unrelated hypopituitarism patients. This map identifies a putative splice silencer motif that represses the use of the normally lowly expressed beta isoform. In Chapter 3, I apply an MPSA to a critical developmental renal transcription factor gene, WT1 (in collaboration with clinical nephrologist Dr. Jen Lai Yee).
Mutations in WT1 are implicated in nephrotic syndrome and sexual differentiation phenotypes. I focus on exon 9, which is alternatively spliced at competing donor sites, resulting in two isoforms (KTS+ and KTS-). KTS+ and KTS- are normally expressed in a ~2:1 ratio, but perturbation of the ratio can lead to Frasier’s syndrome, a rare nephrotic syndrome. We tested 518 SNVs for splicing defects and identified 8 known Frasier’s syndrome variants as well as 16 additional variants that similarly lowered the KTS ratio. We also detected 19 variants increasing the KTS ratio, two of which have been observed in patients with sexual differentiation phenotypes. Although MPSAs can measure splicing effects of hundreds of variants simultaneously, the current scale of variant discovery via exome and genome sequencing demands efficient and accurate computational approaches to identify splice disruptive variants genome-wide. To evaluate the state of the art among contemporary splice prediction algorithms, in Chapter 4 I benchmarked them against the results of five high-throughput splicing assays and one literature-curated variant set. A unique advantage of MPSAs over typical training and validation datasets is that they avoid bias towards essential splice site variants. I found that the latest deep learning tools, SpliceAI and Pangolin, were most concordant with the measured splicing effects. However, all tools showed less agreement with exonic splicing outcomes than with intronic ones. Some tools’ predictions, like SpliceAI’s, were sensitive to the annotation files supplied. Therefore, there is still room for improvement in the next generation of splice prediction algorithms, which future MPSA studies may facilitate. Thus, MPSAs are critical to identify clinically relevant SDVs and improve computational splice prediction.
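The quantification step described in the abstract (barcoded minigene reads pooled by variant, then compared with wild type) can be illustrated with a minimal Python sketch. This is not the thesis's custom package: the function name `isoform_fractions`, the barcodes, the variant label, and the read counts are all hypothetical toy values chosen only to show the arithmetic of an isoform usage fraction.

```python
from collections import defaultdict


def isoform_fractions(barcode_counts, barcode_to_variant):
    """Pool per-barcode isoform read counts by variant and
    return each isoform's share of that variant's reads.

    Hypothetical helper for illustration; not the thesis's code.
    """
    pooled = defaultdict(lambda: defaultdict(int))
    for barcode, counts in barcode_counts.items():
        variant = barcode_to_variant.get(barcode)
        if variant is None:
            continue  # skip barcodes not assigned to any variant
        for isoform, n_reads in counts.items():
            pooled[variant][isoform] += n_reads

    fractions = {}
    for variant, counts in pooled.items():
        total = sum(counts.values())
        if total == 0:
            continue  # no reads, so no usable estimate
        fractions[variant] = {iso: n / total for iso, n in counts.items()}
    return fractions


# Toy inputs (made up): two wild-type barcodes, one variant barcode,
# and one barcode ("TTTT") that maps to no known variant.
barcode_to_variant = {"AAAC": "WT", "GGTA": "WT", "CTTG": "variant_1"}
barcode_counts = {
    "AAAC": {"alpha": 90, "beta": 10},
    "GGTA": {"alpha": 85, "beta": 15},
    "CTTG": {"alpha": 40, "beta": 60},
    "TTTT": {"alpha": 5, "beta": 5},
}

fractions = isoform_fractions(barcode_counts, barcode_to_variant)

# Change in beta isoform usage relative to wild type.
delta_beta = fractions["variant_1"]["beta"] - fractions["WT"]["beta"]
print(f"WT beta usage: {fractions['WT']['beta']:.3f}")
print(f"variant_1 beta usage: {fractions['variant_1']['beta']:.3f}")
print(f"delta: {delta_beta:.3f}")
```

Pooling reads across barcodes before dividing is one simple design choice; a real analysis would likely also filter low-coverage barcodes and assess agreement between replicate barcodes, which this sketch does not attempt.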
Subjects
Splicing; Massively parallel splicing assays; Computational prediction
Types
Thesis