Perception, Vision & Embodied AI
Bridging the gap between raw sensor data and high-level reasoning using state-of-the-art machine learning.
Key Topics
- Computer vision and visual perception
- Sensor fusion
- Embodied AI and foundation/world models
- Multimodal deep learning
- Reinforcement learning and federated learning
Overview
For a robot to act intelligently, it must first understand its surroundings. Our perception work focuses on fusing data from multiple sensors (cameras, LiDAR, sonar) into a cohesive model of the world. We apply deep reinforcement learning and emerging foundation models to teach robots to interpret and react to visual cues autonomously.
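As a concrete illustration of the sensor-fusion idea above, the sketch below fuses two noisy range readings with inverse-variance weighting (the one-dimensional static Kalman update), so the more reliable sensor dominates the fused estimate. The sensor names and noise variances are illustrative assumptions, not calibration data from any real system.

```python
def fuse(measurements):
    """Fuse (value, variance) pairs into one estimate.

    Each reading is weighted by the inverse of its noise variance,
    so more precise sensors contribute more to the result.
    """
    weights = [1.0 / var for _, var in measurements]
    total = sum(weights)
    value = sum(w * v for w, (v, _) in zip(weights, measurements)) / total
    return value, 1.0 / total  # fused estimate and its variance

# Hypothetical example: a stereo camera and a LiDAR both estimate the
# range to an obstacle, in metres (value, variance).
camera = (4.8, 0.25)  # noisier depth-from-stereo reading
lidar = (5.1, 0.01)   # more precise LiDAR return

estimate, variance = fuse([camera, lidar])
print(round(estimate, 3), round(variance, 4))
```

Note that the fused variance is smaller than either sensor's alone, which is the core benefit of combining complementary sensors rather than trusting any single one.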