Data-Centric Execution Inspection
Quinn, Andrew
2021
Abstract
Modern software projects are incredible feats of engineering that manage dozens of concurrent execution tasks, comprise millions of lines of code, and are written by hundreds of developers. Moreover, projects must meet an ever-growing set of complex requirements, including correctness, performance, and security. Due in part to this complexity, developers make mistakes that lead to software failures with devastating costs to society; in 2020 alone, operational failures caused by software bugs cost US companies 1.56 trillion dollars. To understand the causes and effects of a software bug, a developer usually dynamically inspects the state of their execution by using an inspection tool such as gdb, Intel Pin, or logging. Debugging tools today support an inline inspection interface, which essentially requires iteratively constructing new inspection programs to understand a bug. The inline inspection interface imposes two limitations: (1) powerful inspection programs impose infeasible performance overhead and (2) inspection programs are needlessly complex to specify. This dissertation proposes an alternative model for inspecting an execution called data-centric execution inspection. The framework takes a data-oriented view that considers an execution as a first-class data object and expresses execution inspection as queries over that data object. This dissertation shows, through three projects, that a data-centric framework enables the use of common data-centric approaches, namely cluster-scale parallelization and relational query models, to provide fundamentally more powerful inspection. First, this dissertation shows how data-centric execution inspection enables cluster-scale parallelization of execution inspection to alleviate the performance limitations of existing tools.
Data-centric execution inspection enables systems to inspect multiple regions of an execution simultaneously, so a system can parallelize inspection work across thousands of cores in a compute cluster. Alas, existing techniques and systems do not parallelize well, since they assume that inspection occurs inline with the original program. This thesis redesigns inspection techniques and tools by treating scalability as a first-class design constraint to facilitate cluster-scale parallelization of execution inspection. First, JetStream parallelizes the work of dynamic information flow tracking (DIFT) an order of magnitude better than prior approaches by partitioning DIFT across two phases and leveraging a different form of parallelism for each phase. Second, Sledgehammer proposes a general vision for cluster-fueled debugging, which uses a thousand-core compute cluster to enable debugging that is both powerful and interactive. Sledgehammer identifies cluster-fueled debugging as both a vehicle for accelerating existing debugging tools, such as retro-logging to enable after-the-fact execution inspection, and a means of enabling fundamentally more powerful tools, such as continuous-function evaluation to enable “always on” evaluation of complex global program invariants. Second, this dissertation shows how a relational query model, called the OmniTable query model, transforms execution inspection from a low-level programming task into a high-level data science task to enable inspection that is both low-latency and simpler to specify. The model exposes a table abstraction of each execution and supports SQL queries for inspection. Results indicate that debugging using the OmniTable model is more succinct than in current state-of-the-art tools.
Moreover, our prototype, Steamdrill, optimizes inspection queries using a planning approach that seamlessly uses Sledgehammer-style cluster parallelization, standard relational optimizations, and a novel multi-replay approach to provide results an order of magnitude faster than prior general-purpose inspection tools.
Subjects
Execution Inspection; Debugging; Retroactive Analysis
Types
Thesis