Poker Bots

Posted by admin

Background


Poker Bots

Our software is based on neural networks and machine learning, and lets you play poker automatically and make money. The poker bot interacts with the poker room on its own, without a human operator. You only need to register accounts, set up a game schedule, and regularly top up and withdraw your deposit.

Reinforcement Learning


Reinforcement Learning has grown in popularity in recent years. Since Google DeepMind's AlphaGo emerged victorious against Lee Sedol and other Go grandmasters, Reinforcement Learning has proven to be an effective training method for neural networks, in both deterministic and non-deterministic games. Libratus, a poker-playing system developed at Carnegie Mellon University, applies reinforcement learning techniques along with standard backpropagation and temporal delay techniques to win against poker players across the world, including winners of past major poker tournaments. However, Libratus does not use the current deep learning and reinforcement learning techniques outlined in the AlphaGo or DeepMind papers. We wanted to explore the possible benefits of using Q-learning to create a poker bot that automatically learns the best possible policy through self-play over time.

Q-learning


Q-learning is the specific reinforcement learning technique we wanted to apply to our PokerBot. A complete explanation of Q-Learning can be found here. For our purposes, it will suffice to know that:

  • Q-learning penalizes actions that may lead to a loss in the future
  • Q-learning also rewards actions that may lead to winning the game in the future
  • We need action-state pairs: a list of all possible actions in all possible states
  • From these, a Q-function can be generated
  • Q(s, a) represents the best possible future reward if action 'a' is taken in state 's'
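The ideas above can be sketched in code. The following is a minimal, illustrative tabular Q-learning loop, not our actual poker bot: the state name, actions, rewards, and hyperparameters are all hypothetical simplifications of a single poker decision.

```python
import random
from collections import defaultdict

random.seed(0)  # reproducibility for this toy example

# Hypothetical action set for one simplified poker decision
ACTIONS = ["fold", "call", "raise"]

ALPHA = 0.1    # learning rate
GAMMA = 0.9    # discount factor for future reward
EPSILON = 0.2  # exploration rate

# Q-table: maps (state, action) pairs to an estimated best future reward
Q = defaultdict(float)

def choose_action(state):
    """Epsilon-greedy: explore occasionally, otherwise take the best-known action."""
    if random.random() < EPSILON:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(state, a)])

def update(state, action, reward, next_state):
    """Standard Q-learning update:
    Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a)).
    Actions that end badly receive negative or zero reward and are penalized;
    actions that lead toward winning are rewarded."""
    best_next = max(Q[(next_state, a)] for a in ACTIONS)
    Q[(state, action)] += ALPHA * (reward + GAMMA * best_next - Q[(state, action)])

# Toy self-play episode: in the hypothetical state "strong_hand",
# raising wins the pot (+1 reward); other actions earn nothing.
for _ in range(500):
    a = choose_action("strong_hand")
    r = 1.0 if a == "raise" else 0.0
    update("strong_hand", a, r, "terminal")
```

After enough episodes, the Q-value for "raise" in "strong_hand" dominates the alternatives, so the greedy policy learned purely from rewards is to raise with a strong hand.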