Publications
Trajectory Data Suffices for Statistically Efficient Learning in Offline RL with $q^\pi$ Realizability and Concentrability
Regret Minimization via Saddle Point Optimization
On Efficient Planning in Large Action Spaces with Applications to Cooperative Multi-Agent Reinforcement Learning
Efficient Planning in Combinatorial Action Spaces with Applications to Cooperative Multi-Agent Reinforcement Learning
Investigating Action Encodings in Recurrent Neural Networks in Reinforcement Learning