Learning Deep Controllable and Structured Representations for Image Synthesis, Structure Prediction and Beyond
dc.contributor.author | Yan, Xinchen | |
dc.date.accessioned | 2020-01-27T16:22:21Z | |
dc.date.available | NO_RESTRICTION | |
dc.date.available | 2020-01-27T16:22:21Z | |
dc.date.issued | 2019 | |
dc.date.submitted | 2019 | |
dc.identifier.uri | https://hdl.handle.net/2027.42/153334 | |
dc.description.abstract | Generative modeling is a frontier research topic in machine learning and AI. Despite recent success in image synthesis, developing a general form of deep generative models on complex data (e.g., multi-modal and structured data) that is also directly applicable to real-world tasks remains a challenging open problem. The major challenges include the high-dimensional representation space, the many entangled factors of variation, and the lack of existing deep modules for this data generation process. In this thesis, I introduce the concept of controllable representations, a set of factors serving as intermediate representations, for deep generative modeling. In the context of attribute-to-image synthesis, we consider controllable units to be (1) semantic factors described by the visual attributes and (2) other related factors not included in the input attributes (e.g., pose and background color). To facilitate efficient learning of such controllable units for attribute-to-image synthesis, I explore novel deep structured modules that can be trained in an auto-encoding style to synthesize images from controllable units. In addition, I demonstrate the representation power of this design in conditional generation (e.g., controlling a partial set of units while keeping the rest unchanged) as well as in other related applications, including image completion via analysis-by-synthesis optimization. In the rest of the thesis, I investigate and propose several variations for learning controllable and structured representations in related problems, including (1) image manipulation with semantic structures (e.g., object bounding boxes), (2) human motion prediction with transformation-based representations, and (3) single-view 3D shape prediction with geometry-aware modules.
The case studies in the thesis demonstrate not only the representation power but also a common property: the representations can be learned in an unsupervised or weakly-supervised manner. In the end, I discuss several future directions in learning deep controllable and structured modules for other multi-modal and structured data, as well as applications to adversarial learning. | |
dc.language.iso | en_US | |
dc.subject | Deep Representation Learning | |
dc.subject | Conditional Generative Models | |
dc.subject | Image Synthesis | |
dc.subject | Structure Prediction | |
dc.title | Learning Deep Controllable and Structured Representations for Image Synthesis, Structure Prediction and Beyond | |
dc.type | Thesis | |
dc.description.thesisdegreename | PhD | en_US |
dc.description.thesisdegreediscipline | Computer Science & Engineering | |
dc.description.thesisdegreegrantor | University of Michigan, Horace H. Rackham School of Graduate Studies | |
dc.contributor.committeemember | Lee, Honglak | |
dc.contributor.committeemember | Corso, Jason | |
dc.contributor.committeemember | Deng, Jia | |
dc.contributor.committeemember | Kuipers, Benjamin | |
dc.subject.hlbsecondlevel | Computer Science | |
dc.subject.hlbtoplevel | Engineering | |
dc.description.bitstreamurl | https://deepblue.lib.umich.edu/bitstream/2027.42/153334/1/xcyan_1.pdf | |
dc.identifier.orcid | 0000-0003-1019-5537 | |
dc.identifier.name-orcid | Yan, Xinchen; 0000-0003-1019-5537 | en_US |
dc.owningcollname | Dissertations and Theses (Ph.D. and Master's) |