Assisting Cloud System Development with Automated Insight Generation

Qiu, Yiming

dc.contributor.author	Qiu, Yiming
dc.date.accessioned	2025-05-12T17:40:06Z
dc.date.available	2025-05-12T17:40:06Z
dc.date.issued	2024
dc.date.submitted	2024
dc.identifier.uri	https://hdl.handle.net/2027.42/197262
dc.description.abstract	As cloud computing revolutionizes the IT industry, the complexity of underlying cloud systems has been increasing rapidly. The advent of new hardware accelerators and software platforms has made it challenging for cloud users to master the growing development toolkits. Compounding the issue, the programming frameworks and internals of these new systems are highly heterogeneous, with different performance characteristics, resource constraints, management principles, and reliability considerations. Consequently, it is becoming crucial to minimize human effort when managing these new ecosystems. In this dissertation, we advocate for assisting cloud developers and operators by automatically generating system insights. These insights bridge the gap between user intentions and system requirements, providing clarity on the outcomes of user actions on a system without the need for tedious trial-and-error processes. This dissertation demonstrates how we generate various types of insights for different cloud systems. Firstly, the dissertation explores performance optimization insights, which are critically needed as users attempt to offload legacy code from on-premise servers to emerging accelerators like SmartNICs. These new hardware components feature entirely different programming abstractions, compilers, instruction sets, and architectures. Although a straightforward offloading strategy might functionally work, it could lead to significant performance degradation, undermining the benefits of using accelerators. To address this issue, we create a toolset called Clara, which can automatically predict offloading performance and suggest tuning strategies before extensive deployment efforts. This allows users to make informed decisions on whether and how to offload their legacy code. 
Secondly, the dissertation investigates safety compliance insights for the cloud networking stack, focusing on ensuring the correctness of system updates for the latest generation of runtime-programmable platforms. We observe that even if both the current and intended functionalities are correct and efficient, the intermediate transition state can still introduce consistency and capacity issues into the core network. To tackle this challenge, we employ formal reasoning techniques to achieve update clarity. We develop FlexPlan, an interactive platform that synthesizes runtime transition plans meeting dynamic user demands, greatly reducing the need for manual intervention. Lastly, the dissertation unearths infrastructure management insights for emerging cloud orchestration platforms. Clouds are constructed by providers like Microsoft but are intended for third-party use. This user/owner division limits cloud users’ visibility and control over cloud service behavior. The adoption of Infrastructure-as-Code (IaC) style cloud orchestration platforms further widens this semantic gap by adding another intermediate layer of abstraction. To address this complexity, we propose Zodiac, a pipeline that automatically uncovers cloud provider requirements and clarifies their interaction with orchestration platforms. The outcome is a set of orchestration rules that cloud users must follow to ensure proper cloud management practices. Throughout these projects, we leverage and extend techniques from a wide variety of disciplines, such as formal reasoning, software testing, machine learning, and their intersections. The results demonstrate the feasibility of generating useful insights across cloud data, control, and management planes, while revealing an even larger design space for insight generation and integration that remains to be explored.
dc.language.iso	en_US
dc.subject	cloud management
dc.subject	programmable network
dc.subject	program analysis
dc.subject	program synthesis
dc.subject	machine learning
dc.subject	configuration mining
dc.title	Assisting Cloud System Development with Automated Insight Generation
dc.type	Thesis
dc.description.thesisdegreename	PhD
dc.description.thesisdegreediscipline	Computer Science & Engineering
dc.description.thesisdegreegrantor	University of Michigan, Horace H. Rackham School of Graduate Studies
dc.contributor.committeemember	Chen, Ang
dc.contributor.committeemember	Chen, Jiasi
dc.contributor.committeemember	Beckett, Ryan
dc.contributor.committeemember	Wang, Xinyu
dc.subject.hlbsecondlevel	Computer Science
dc.subject.hlbtoplevel	Engineering
dc.contributor.affiliationumcampus	Ann Arbor
dc.description.bitstreamurl	http://deepblue.lib.umich.edu/bitstream/2027.42/197262/1/yimingq_1.pdf
dc.identifier.doi	https://dx.doi.org/10.7302/25688
dc.identifier.orcid	0009-0003-9328-3205
dc.identifier.name-orcid	Qiu, Yiming; 0009-0003-9328-3205	en_US
dc.working.doi	10.7302/25688	en
dc.owningcollname	Dissertations and Theses (Ph.D. and Master's)

Files in this item

Name: yimingq_1.pdf
Size: 2.904 MB
Format: PDF

