
Learning Deep Controllable and Structured Representations for Image Synthesis, Structure Prediction and Beyond

dc.contributor.author: Yan, Xinchen
dc.date.accessioned: 2020-01-27T16:22:21Z
dc.date.available: NO_RESTRICTION
dc.date.available: 2020-01-27T16:22:21Z
dc.date.issued: 2019
dc.date.submitted: 2019
dc.identifier.uri: https://hdl.handle.net/2027.42/153334
dc.description.abstract: Generative modeling is a frontier research topic in machine learning and AI. Despite the recent success in image synthesis, developing a general form of deep generative models for complex data (e.g., multi-modal and structured data) that is directly applicable to real-world tasks remains a challenging open problem. The major challenges include the high-dimensional representation space, the many entangled factors of variation, and the lack of existing deep modules for this data generation process. In this thesis, I introduce the concept of controllable representations, a set of factors serving as intermediate representations for deep generative modeling. In the context of attribute-to-image synthesis, we consider controllable units to be (1) semantic factors described by the visual attributes and (2) other related factors not included in the input attributes (e.g., pose and background color). To facilitate efficient learning of such controllable units for attribute-to-image synthesis, I explore novel deep structured modules that can be trained in an auto-encoding style to synthesize images from controllable units. In addition, I demonstrate the representation power of such a design in conditional generation (e.g., controlling a partial set of units while keeping the rest unchanged) as well as in related applications, including image completion via analysis-by-synthesis optimization. In the rest of the thesis, I investigate and propose several variations for learning controllable and structured representations in related problems, including (1) image manipulation with semantic structures (e.g., object bounding boxes), (2) human motion prediction with transformation-based representations, and (3) 3D shape prediction from a single view with geometry-aware modules. The case studies in the thesis demonstrate not only the representation power but also a common property: the representations can be learned in an unsupervised or weakly-supervised manner. In the end, I discuss several future directions in learning deep controllable and structured modules for other multi-modal and structured data, as well as applications to adversarial learning.
dc.language.iso: en_US
dc.subject: Deep Representation Learning
dc.subject: Conditional Generative Models
dc.subject: Image Synthesis
dc.subject: Structure Prediction
dc.title: Learning Deep Controllable and Structured Representations for Image Synthesis, Structure Prediction and Beyond
dc.type: Thesis
dc.description.thesisdegreename: PhD (en_US)
dc.description.thesisdegreediscipline: Computer Science & Engineering
dc.description.thesisdegreegrantor: University of Michigan, Horace H. Rackham School of Graduate Studies
dc.contributor.committeemember: Lee, Honglak
dc.contributor.committeemember: Corso, Jason
dc.contributor.committeemember: Deng, Jia
dc.contributor.committeemember: Kuipers, Benjamin
dc.subject.hlbsecondlevel: Computer Science
dc.subject.hlbtoplevel: Engineering
dc.description.bitstreamurl: https://deepblue.lib.umich.edu/bitstream/2027.42/153334/1/xcyan_1.pdf
dc.identifier.orcid: 0000-0003-1019-5537
dc.identifier.name-orcid: Yan, Xinchen; 0000-0003-1019-5537 (en_US)
dc.owningcollname: Dissertations and Theses (Ph.D. and Master's)
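
The abstract describes training deep structured modules in an auto-encoding style so that images are synthesized from controllable units (visual attributes plus the remaining latent factors). The sketch below illustrates that general idea as an attribute-conditioned variational auto-encoder in PyTorch; the class name AttributeCVAE, the layer sizes, and the toy data are assumptions made for illustration and do not reproduce the exact models proposed in the dissertation.

    # Minimal, illustrative sketch of attribute-conditioned auto-encoding style training.
    # All names and sizes here are assumptions, not the thesis's actual architecture.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    IMG_DIM, ATTR_DIM, LATENT_DIM = 64 * 64 * 3, 40, 128  # assumed dimensions

    class AttributeCVAE(nn.Module):
        def __init__(self):
            super().__init__()
            # Encoder: infers the non-attribute latent factors from image + attributes.
            self.enc = nn.Sequential(nn.Linear(IMG_DIM + ATTR_DIM, 512), nn.ReLU())
            self.mu = nn.Linear(512, LATENT_DIM)
            self.logvar = nn.Linear(512, LATENT_DIM)
            # Decoder: synthesizes an image from controllable units = [attributes, latent].
            self.dec = nn.Sequential(
                nn.Linear(ATTR_DIM + LATENT_DIM, 512), nn.ReLU(),
                nn.Linear(512, IMG_DIM), nn.Sigmoid())

        def forward(self, x, a):
            h = self.enc(torch.cat([x, a], dim=1))
            mu, logvar = self.mu(h), self.logvar(h)
            z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization
            return self.dec(torch.cat([a, z], dim=1)), mu, logvar

    def loss_fn(x_hat, x, mu, logvar):
        # Standard conditional-VAE objective: reconstruction + KL divergence.
        rec = F.binary_cross_entropy(x_hat, x, reduction="sum")
        kld = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
        return rec + kld

    # Conditional generation: hold the attribute vector fixed and resample z (or vice
    # versa) to vary only a chosen subset of the controllable units.
    model = AttributeCVAE()
    x = torch.rand(8, IMG_DIM)                       # toy image batch in [0, 1]
    a = torch.randint(0, 2, (8, ATTR_DIM)).float()   # toy binary attribute vectors
    x_hat, mu, logvar = model(x, a)
    print(loss_fn(x_hat, x, mu, logvar).item())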

