Player of Games: All the games, one algorithm (w/ author Martin Schmid)
playerofgames, deepmind, alphazero

Special Guest: First author Martin Schmid

Games have long been used in research as testbeds for AI algorithms, such as reinforcement learning agents. However, different types of games usually require different solution approaches, such as AlphaZero for Go or Chess, and Counterfactual Regret Minimization (CFR) for Poker. Player of Games bridges this gap between perfect-information and imperfect-information games and delivers a single algorithm that uses tree search over public information states and is trained via self-play. The resulting algorithm can play Go, Chess, Poker, Scotland Yard, and many more games, as well as non-game environments.

OUTLINE:
0:00 Introduction
2:50 What games can Player of Games be trained on
4:00 Tree search algorithms (AlphaZero)
8:00 What is different in imperfect information games
15:40 Counterfactual Value and Policy Networks
18:50 The Player of Games search procedure
28:30 How to train the network
34