Publications
Trajectory Data Suffices for Statistically Efficient Policy Evaluation in Finite-Horizon Offline RL with Linear $q^\pi$ Realizability and Concentrability
Paper | Talks: High-level
Trajectory Data Suffices for Statistically Efficient Learning in Offline RL with $q^\pi$ Realizability and Concentrability
Regret Minimization via Saddle Point Optimization
On Efficient Planning in Large Action Spaces with Applications to Cooperative Multi-Agent Reinforcement Learning
Efficient Planning in Combinatorial Action Spaces with Applications to Cooperative Multi-Agent Reinforcement Learning
Investigating action encodings in recurrent neural networks in reinforcement learning