Perception, Vision & Embodied AI
Bridging the gap between raw sensor data and high-level reasoning using state-of-the-art machine learning.
Key Topics
- Computer vision and visual perception
- Sensor fusion
- Embodied AI and foundation/world models
- Multimodal deep learning
- Reinforcement learning and federated learning
Overview
For a robot to act intelligently, it must first understand its surroundings. Our perception work focuses on fusing data from multiple sensors (cameras, LiDAR, sonar) into a cohesive model of the world. We apply deep reinforcement learning and emerging foundation models to teach robots to interpret and react to visual cues autonomously.
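As a concrete illustration of the sensor-fusion idea above, the sketch below fuses two noisy range readings with inverse-variance weighting (the one-dimensional static Kalman update), so the more reliable sensor dominates the fused estimate. The sensor names and noise variances are illustrative assumptions, not calibration data from any real system.

```python
def fuse(measurements):
    """Fuse (value, variance) pairs into one estimate.

    Each reading is weighted by the inverse of its noise variance,
    so more precise sensors contribute more to the result.
    """
    weights = [1.0 / var for _, var in measurements]
    total = sum(weights)
    value = sum(w * v for w, (v, _) in zip(weights, measurements)) / total
    return value, 1.0 / total  # fused estimate and its variance

# Hypothetical example: a stereo camera and a LiDAR both estimate the
# range to an obstacle, in metres (value, variance).
camera = (4.8, 0.25)  # noisier depth-from-stereo reading
lidar = (5.1, 0.01)   # more precise LiDAR return

estimate, variance = fuse([camera, lidar])
print(round(estimate, 3), round(variance, 4))
```

Note that the fused variance is smaller than either sensor's alone, which is the core benefit of combining complementary sensors rather than trusting any single one.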