Shangtong Zhang: Off Policy Evaluation
Data Fest Online 2020, Reinforcement Learning track

In this talk, I will present my recent work on off-policy evaluation, where we want to estimate the performance of a policy from a given dataset alone, without executing the policy. Off-policy evaluation has broad real-world applications, such as recommendation systems. I will start with a brief introduction to reinforcement learning and discuss the main challenges in off-policy evaluation. Then I will present our ICML 2020 work, GradientDICE, and discuss how and why it improves on previous methods such as DualDICE and GenDICE, both theoretically and empirically.