Nicklas Hansen, UCSD: Why algorithms don't drive research progress

Generally Intelligent

Imbue

Dec 16, 2022

Nicklas Hansen is a Ph.D. student at UC San Diego advised by Prof. Xiaolong Wang and Prof. Hao Su. He is also a student researcher at Meta AI. Nicklas' research focuses on developing machine learning systems, specifically neural agents, that can learn, generalize, and adapt over their lifetime. In this episode, we talk about long-horizon planning, adapting reinforcement learning policies during deployment, why algorithms don't drive research progress, and much more!

Below are some highlights from our conversation as well as links to the papers, people, and groups referenced in the episode.

Highlights

“I think that was the first realization: that clearly we cannot train on all environments that exist because of the difficulty of training these algorithms, but also just the practicality of defining all of the things that we want to be robust.”

“It’s a huge problem in RL research. I feel like one of the major bottlenecks is the lack of data sets, benchmarks, and environments where you can really explore all of these different directions of RL research, especially when it comes to generalization. We had to take an existing benchmark and artificially change the simulation to make it look different, but it’s still pretty limited how much diversity you can get from that.”

“You could provide a reward signal and intuitively that would be able to adapt as well using rewards. And we did actually do those experiments and tried to compare how many samples do you need with self-supervision versus how many do you need with reward? And it turns out—I don’t recall the exact numbers—but it was something like a hundred episodes or something if you do reward-based fine-tuning versus self-supervision which was like one episode.”

References

Thanks to Tessa Hall for editing the podcast.
