TALK    [MERL Seminar Series 2021] Dr. Hsiao-Yu (Fish) Tung presents talk at MERL entitled Learning to See by Moving: Self-supervising 3D scene representations for perception, control, and visual reasoning

Date released: November 2, 2021


  • Date & Time:

    Tuesday, November 2, 2021; 1:00 PM EST

  • Abstract:

    Current state-of-the-art CNNs can localize and name objects in internet photos, yet they lack basic knowledge that a two-year-old toddler already possesses: objects persist over time despite changes in the observer’s viewpoint or occlusions by other objects; objects have 3D extent; solid objects do not pass through each other. In this talk, I will introduce neural architectures that learn to parse video streams of a static scene into world-centric 3D feature maps by disentangling camera motion from scene appearance. I will show that the proposed architectures learn object permanence, can imagine RGB views from novel viewpoints in truly novel scenes, can conduct basic spatial reasoning and planning, can infer affordances in sentences, and can learn geometry-aware 3D concepts that allow pose-aware object recognition with weak/sparse labels. Our experiments suggest that the proposed architectures are essential for the models to generalize across objects and locations, and that they overcome many limitations of 2D CNNs. I will show how we can use the proposed 3D representations to build machine perception and physical understanding that is closer to that of humans. (A minimal, illustrative sketch of the 2D-to-3D lifting step appears at the end of this page.)


  • Speaker:

    Dr. Hsiao-Yu (Fish) Tung
    MIT BCS

    Hsiao-Yu (Fish) Tung is a postdoc at MIT BCS working with Josh Tenenbaum and Dan Yamins. She is interested in self-supervised 3D perception and common sense learning for embodied agents. She studies how 3D perception can improve the way machines see, act, reason, and understand language, and how machines can acquire and improve their 3D perception and common sense knowledge through their interactions with the physical world. She received her PhD from the CMU Machine Learning Department under the supervision of Katerina Fragkiadaki. She was named a 2021 Siebel Scholar and a 2019 Rising Star in EECS, and her research has been supported by the Yahoo InMind fellowship and the Siemens FutureMaker fellowship.

    She received her M.S. from the CMU Machine Learning Department and her B.S. in Electrical Engineering from National Taiwan University. During her master's degree, she worked with Professor Alex Smola on spectral methods for Bayesian models.

  • Research Areas:

    Artificial Intelligence, Computer Vision, Machine Learning, Robotics
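
The abstract describes lifting per-frame 2D CNN features into a world-centric 3D feature map using the camera motion, so that the representation stays fixed to the scene as the viewpoint changes. The sketch below illustrates only that unprojection step, assuming a pinhole camera with known intrinsics and pose; the function name, tensor shapes, and grid parameters are illustrative placeholders, not the speaker's actual architecture.

    # Minimal sketch: lifting 2D image features into a world-centric voxel grid.
    # All names, shapes, and camera assumptions below are illustrative only and
    # do not reproduce the architecture presented in the talk.
    import torch
    import torch.nn.functional as F

    def unproject_to_world_grid(feat2d, K, cam_T_world, grid_size=32, extent=4.0):
        """Sample per-pixel CNN features into a world-aligned voxel grid.

        feat2d      : (C, H, W) feature map from a 2D CNN.
        K           : (3, 3) pinhole camera intrinsics.
        cam_T_world : (4, 4) transform taking world points into camera coordinates.
        grid_size   : voxels per side of the cubic world grid.
        extent      : half-width of the cube in world units.
        """
        C, H, W = feat2d.shape

        # World-centric voxel centers on a regular grid, fixed across viewpoints.
        lin = torch.linspace(-extent, extent, grid_size)
        zs, ys, xs = torch.meshgrid(lin, lin, lin, indexing="ij")
        pts_world = torch.stack([xs, ys, zs, torch.ones_like(xs)], dim=-1)

        # Move voxel centers into the camera frame, then project with the intrinsics.
        pts_cam = pts_world.reshape(-1, 4) @ cam_T_world.T
        depth = pts_cam[:, 2].clamp(min=1e-3)
        uv = (pts_cam[:, :3] / depth.unsqueeze(-1)) @ K.T

        # Normalize pixel coordinates to [-1, 1] for grid_sample.
        u = uv[:, 0] / (W - 1) * 2 - 1
        v = uv[:, 1] / (H - 1) * 2 - 1
        grid = torch.stack([u, v], dim=-1).view(1, grid_size ** 3, 1, 2)

        # Bilinearly sample image features at each projected voxel center;
        # voxels that project outside the image receive zero features.
        sampled = F.grid_sample(feat2d.unsqueeze(0), grid, align_corners=True)
        return sampled.view(C, grid_size, grid_size, grid_size)

Aggregating such grids over the frames of a video (for example, by averaging or with a recurrent update) would give a scene representation that does not change as the camera moves, matching the abstract's emphasis on disentangling camera motion from scene appearance.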