Nicklas Hansen is a Ph.D. student at UC San Diego, advised by Prof. Xiaolong Wang and Prof. Hao Su. He is also a student researcher at Meta AI. Nicklas' research focuses on developing machine learning systems, specifically neural agents, that can learn, generalize, and adapt over their lifetime. In this episode, we talk about long-horizon planning, adapting reinforcement learning policies during deployment, why algorithms don't drive research progress, and much more!
Below are some highlights from our conversation as well as links to the papers, people, and groups referenced in the episode.
Highlights
“I think that was the first realization: that clearly we cannot train on all environments that exist because of the difficulty of training these algorithms, but also just the practicality of defining all of the things that we want to be robust.”
“It’s a huge problem in RL research. I feel like one of the major bottlenecks is the lack of data sets, benchmarks, and environments where you can really explore all of these different directions of RL research, especially when it comes to generalization. We had to take an existing benchmark and artificially change the simulation to make it look different, but it’s still pretty limited how much diversity you can get from that.”
“You could provide a reward signal and intuitively that would be able to adapt as well using rewards. And we did actually do those experiments and tried to compare how many samples do you need with self-supervision versus how many do you need with reward? And it turns out—I don’t recall the exact numbers—but it was something like a hundred episodes or something if you do reward-based fine-tuning versus self-supervision which was like one episode.”
References
Sim-to-Real Transfer of Robotic Control with Dynamics Randomization
The Many Faces of Robustness: A Critical Analysis of Out-of-Distribution Generalization
Generalization in Reinforcement Learning by Soft Data Augmentation
Image Augmentation Is All You Need: Regularizing Deep Reinforcement Learning from Pixels
Do Vision Transformers See Like Convolutional Neural Networks?
Introducing Dreamer: Scalable Reinforcement Learning Using World Models
ManiSkill: Generalizable Manipulation Skill Benchmark with Large-Scale Demonstrations
Learning Vision-Guided Quadrupedal Locomotion End-to-End with Cross-Modal Transformers
Thanks to Tessa Hall for editing the podcast.












