Show simple item record

Learning Dense Visual Features for the Sun and Natural Scenes

dc.contributor.author: Higgins, Richard
dc.date.accessioned: 2025-05-12T17:37:50Z
dc.date.available: 2025-05-12T17:37:50Z
dc.date.issued: 2025
dc.date.submitted: 2025
dc.identifier.uri: https://hdl.handle.net/2027.42/197196
dc.description.abstract: We demonstrate simple methods for visual representation learning in two data-rich settings: the sun and natural scenes. We produce dense representations that support three tasks: (1) improved hand and object segmentation for video understanding; (2) improved solar magnetic field estimation; and (3) 3D-aware image/scene editing for compositional generation. For natural scenes, we introduce responsibility, a means of ascribing scene motion (estimated via optical flow) to hands. We pair responsibility with an off-the-shelf person segmentation system and use the resulting pseudolabels with a three-way contrastive loss to train a UNet that segments people, held-objects, and background pixels. We next extend this gestaltist idea of motion and shared fate to learn a self-supervised image segmentation system. We first estimate a fundamental matrix between two egocentric video frames, then create pseudolabels by decomposing pixels that disagree with this camera-motion model into hands and held-objects, and finally use these pseudolabels to train an HRNet that segments scenes class-agnostically. For solar scenes, we train UNets to quickly and accurately estimate the per-pixel magnetic field of the sun by inverting polarized light measurements from the Solar Dynamics Observatory/Helioseismic and Magnetic Imager (SDO/HMI). We then create SynthIA, a synthetic instrument trained on paired data from two satellites: SDO and Hinode. SDO/HMI images the entire solar disk, while the Hinode/SOT spectropolarimeter captures only small regions at higher spectral and spatial resolution. We pair co-observed input data from SDO/HMI with MERLIN inversions from the Hinode/SOT spectropolarimeter to create a cross-satellite dataset. After training, SynthIA can generate Hinode MERLIN-like inversions from SDO/HMI input alone, synthetically expanding Hinode's spatially limited but high-quality inversion results to the full disk.
Finally, we perform 3D-aware image/scene edits by conditioning latent diffusion on a sequence of neural nouns and verbs used as a visual prompt. We edit scenes by applying verbs in an object-centric manner and then recomposing the scene with a background. This factorization affords test-time compositionality, allowing us to compose edited objects from multiple datasets in the same scene while preserving camera control.
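The self-supervised segmentation step described above hinges on flagging pixels whose motion disagrees with the estimated camera geometry. A minimal illustrative sketch (not the dissertation's code; the function and variable names are assumptions) scores flow-induced correspondences against a fundamental matrix using the standard Sampson distance:

```python
import numpy as np

def sampson_distance(F, x1, x2):
    """Per-correspondence Sampson distance to the epipolar geometry F.

    F:  (3, 3) fundamental matrix mapping frame-1 points to
        epipolar lines in frame 2.
    x1: (N, 2) pixel coordinates in frame 1.
    x2: (N, 2) corresponding pixel coordinates in frame 2
        (e.g. x1 displaced by optical flow).
    Returns an (N,) array; near-zero means the correspondence is
    consistent with pure camera motion.
    """
    # Lift to homogeneous coordinates.
    x1h = np.hstack([x1, np.ones((len(x1), 1))])
    x2h = np.hstack([x2, np.ones((len(x2), 1))])
    Fx1 = x1h @ F.T     # (N, 3): epipolar lines in frame 2
    Ftx2 = x2h @ F      # (N, 3): epipolar lines in frame 1
    num = np.sum(x2h * Fx1, axis=1) ** 2          # squared epipolar residual
    den = Fx1[:, 0]**2 + Fx1[:, 1]**2 + Ftx2[:, 0]**2 + Ftx2[:, 1]**2
    return num / den
```

Correspondences with large Sampson distance are candidates for independently moving content such as hands and held objects; any threshold for "disagrees with the camera-motion model" would be tuned per dataset.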
dc.language.iso: en_US
dc.subject: Computer Vision
dc.subject: Artificial Intelligence
dc.subject: Representation Learning
dc.title: Learning Dense Visual Features for the Sun and Natural Scenes
dc.type: Thesis
dc.description.thesisdegreename: PhD
dc.description.thesisdegreediscipline: Computer Science & Engineering
dc.description.thesisdegreegrantor: University of Michigan, Horace H. Rackham School of Graduate Studies
dc.contributor.committeemember: Fouhey, David
dc.contributor.committeemember: Gombosi, Tamas I
dc.contributor.committeemember: Makar, Maggie
dc.contributor.committeemember: Yu, Stella
dc.subject.hlbsecondlevel: Computer Science
dc.subject.hlbtoplevel: Engineering
dc.contributor.affiliationumcampus: Ann Arbor
dc.description.bitstreamurl: http://deepblue.lib.umich.edu/bitstream/2027.42/197196/1/relh_1.pdf
dc.identifier.doi: https://dx.doi.org/10.7302/25622
dc.identifier.orcid: 0000-0002-6227-0773
dc.identifier.name-orcid: Higgins, Richard; 0000-0002-6227-0773
dc.working.doi: 10.7302/25622
dc.owningcollname: Dissertations and Theses (Ph.D. and Master's)

