Towards building general-purpose embodied intelligence, my research lies at the intersection of computer vision (perception) and robot learning (action). I am also investigating broader applications of AI to Science and Social Good.
I am a Research Scientist at Skild AI. Previously, I spent two years as a researcher at Meta's Fundamental AI Research (FAIR) Labs and as a postdoc at the Robotics Institute at CMU, working with Abhinav Gupta, Deepak Pathak, and Xinlei Chen. I received my PhD from UIUC, advised by Alex Schwing and Svetlana Lazebnik, and graduated from IIT Kanpur before that.
I will be joining the University of California at Irvine next year as an Assistant Professor in Computer Science.
UC Irvine is a resourceful, friendly, and warm ecosystem, and the campus is ideal for learning-tinkering-building AI systems. The deadline for applications is December 15, 2024. For opportunities to work with me, please see my UCI profile.
IIT Kanpur 2011-2016 |
UIUC 2016-2022 |
CMU & Meta 2022-2024 |
Skild AI 2024- |
UC Irvine 2025- |
   
UMass Amherst Summer 2015 |
Uber ATG Summer 2017 |
Allen Institute for AI Summer 2018, 2020 |
FAIR (collab w/ UT Austin) Summer 2019, FA19, SP20 |
Google DeepMind Summer 2021, FA21 |
|
In 2025, I will be joining UC Irvine as an Assistant Professor in Computer Science! |
Oct 2024 |
Serving as Area Chair for CVPR 2025. |
Sep 2024 |
Serving as Area Chair for ICLR 2025. |
Jun 2024 |
Organizing the community-building workshop 'CV 20/20: A Retrospective Vision' at CVPR 2024. |
Jan 2024 |
Serving as Area Chair for NeurIPS 2024. |
Jun 2023 |
Organizing the 'Scholars & Big Models: How Can Academics Adapt?' workshop at CVPR 2023. |
|
An Image is Worth More Than 16x16 Patches: Exploring Transformers on Individual Pixels
Duy-Kien Nguyen, Mahmoud Assran, Unnat Jain, Martin R Oswald, Cees GM Snoek, Xinlei Chen
arXiv 2024
paper
|
|
Exploitation-Guided Exploration for Semantic Embodied Navigation
Justin Wasserman, Girish Chowdhary, Abhinav Gupta, Unnat Jain
ICRA 2024
Best Paper at NeurIPS 2023 Robot Learning Workshop
paper |
project |
code
|
|
Habitat 3.0: A Co-Habitat for Humans, Avatars and Robots
Xavi Puig*, Eric Undersander*, Andrew Szot*, Mikael Cote*, Ruslan Partsey*, Jimmy Yang*, Ruta Desai*, Alexander Clegg*, Michal Hlavac, Tiffany Min, Theo Gervet, Vladimír Vondruš, Vincent-Pierre Berges, John Turner, Oleksandr Maksymets, Zsolt Kira, Mrinal Kalakrishnan, Jitendra Malik, Devendra Chaplot, Unnat Jain, Dhruv Batra, Akshara Rai**, Roozbeh Mottaghi**
ICLR 2024
paper |
project |
code
|
|
An Unbiased Look at Datasets for Visuo-Motor Pre-Training
Sudeep Dasari, Mohan Kumar Srirama, Unnat Jain*, Abhinav Gupta*
CoRL 2023
project |
pdf
|
|
Pretrained Language Models as Visual Planners for Human Assistance
Dhruvesh Patel, Hamid Eghbalzadeh, Nitin Kamra, Michael Louis Iuzzolino, Unnat Jain*, Ruta Desai*
ICCV 2023
paper |
code
|
|
Adaptive Coordination in Social Embodied Rearrangement
Andrew Szot, Unnat Jain, Dhruv Batra, Zsolt Kira, Ruta Desai, Akshara Rai
ICML 2023
paper |
code
|
|
Affordances from Human Videos as a Versatile Representation for Robotics
Shikhar Bahl*, Russell Mendonca*, Lili Chen, Unnat Jain, Deepak Pathak
CVPR 2023
paper |
project
|
|
MOPA: Modular Object Navigation with PointGoal Agents
Sonia Raychaudhuri, Tommaso Campari, Unnat Jain, Manolis Savva, Angel X. Chang
WACV 2024
paper |
project |
code
|
|
Last-Mile Embodied Visual Navigation
Justin Wasserman*, Karmesh Yadav, Girish Chowdhary, Abhinav Gupta, Unnat Jain*
CoRL 2022
paper |
project |
code
|
|
Retrospectives on the Embodied AI Workshop
Matt Deitke, Dhruv Batra, Yonatan Bisk, ... Unnat Jain ... Luca Weihs, Jiajun Wu
arXiv 2022
paper
|
|
Learning State-Aware Visual Representations from Audible Interactions
Himangi Mittal, Pedro Morgado, Unnat Jain, Abhinav Gupta
NeurIPS 2022
paper |
code
|
|
Bridging the Imitation Gap by Adaptive Insubordination
Luca Weihs*, Unnat Jain*, Iou-Jen Liu, Jordi Salvador, Svetlana Lazebnik, Aniruddha Kembhavi, Alexander Schwing
NeurIPS 2021
paper |
project |
code
|
|
Language-Aligned Waypoint (LAW) Supervision for Vision-and-Language Navigation in Continuous Environments
Sonia Raychaudhuri, Saim Wani, Shivansh Patel, Unnat Jain, Angel X. Chang
EMNLP 2021 (short)
paper |
project |
code
|
|
GridToPix: Training Embodied Agents with Minimal Supervision
Unnat Jain, Iou-Jen Liu, Svetlana Lazebnik, Aniruddha Kembhavi, Luca Weihs*, Alexander Schwing*
ICCV 2021
paper |
project
|
|
Interpretation of Emergent Communication in Heterogeneous Collaborative Embodied Agents
Shivansh Patel*, Saim Wani*, Unnat Jain*, Alexander Schwing, Svetlana Lazebnik, Manolis Savva, Angel X. Chang
ICCV 2021
paper |
project |
code
|
|
Cooperative Exploration for Multi-Agent Deep Reinforcement Learning
Iou-Jen Liu, Unnat Jain, Raymond Yeh, Alexander Schwing
ICML 2021 (long oral)
paper |
project |
code
|
|
MultiON: Benchmarking Semantic Map Memory using Multi-Object Navigation
Saim Wani*, Shivansh Patel*, Unnat Jain*, Angel X. Chang, Manolis Savva
NeurIPS 2020
paper |
project |
code |
challenge
|
|
AllenAct: A Framework for Embodied AI Research
Luca Weihs*, Jordi Salvador*, Klemen Kotar*, Unnat Jain, Kuo-Hao Zeng, Roozbeh Mottaghi, Aniruddha Kembhavi
arXiv 2020
paper |
project |
code
|
|
A Cordial Sync: Going Beyond Marginal Policies For Multi-Agent Embodied Tasks
Unnat Jain*, Luca Weihs*, Eric Kolve, Ali Farhadi, Svetlana Lazebnik, Aniruddha Kembhavi, Alexander Schwing
ECCV 2020 (spotlight)
paper |
project |
code
|
|
SoundSpaces: Audio-Visual Navigation in 3D Environments
Changan Chen*, Unnat Jain*, Carl Schissler, Sebastia Vicenc Amengual Gari, Ziad Al-Halah, Vamsi Krishna Ithapu, Philip Robinson, Kristen Grauman
ECCV 2020 (spotlight)
paper |
project |
code |
challenge
|
|
TAB-VCR: Tags and Attributes based VCR Baselines
Jingxiang Lin, Unnat Jain, Alexander Schwing
NeurIPS 2019
paper |
project |
code
|
|
Two Body Problem: Collaborative Visual Task Completion
Unnat Jain*, Luca Weihs*, Eric Kolve, Mohammad Rastegari, Svetlana Lazebnik, Ali Farhadi, Alexander Schwing, Aniruddha Kembhavi
CVPR 2019 (oral)
paper |
project |
code
Talk @ Amazon:
video,
ppt,
pdf
Talk @ CVPR'19:
video,
ppt,
pdf,
poster
|
|
Two can play this Game: Visual Dialog with Discriminative Question Generation and Answering
Unnat Jain, Svetlana Lazebnik, Alexander Schwing
CVPR 2018
|
|
Creativity: Generating Diverse Questions using Variational Autoencoders
Unnat Jain*, Ziyu Zhang*, Alexander Schwing
CVPR 2017 (spotlight)
video |
paper
|
|
Compact Environment-Invariant Codes for Robust Visual Place Recognition
Unnat Jain, Vinay Namboodiri, Gaurav Pandey
Conference on Computer and Robot Vision (CRV) 2017
|
|