lpetrich
Contributor
Robots Are Beating Humans At Poker | FiveThirtyEight - "It’s hard to win against an opponent without a tell." (presumably a permanent poker face)
Or more precisely, poker-playing software. -- Superhuman AI for multiplayer poker | Science
Computers have been very successful in games with simple game worlds, determinism, and complete information. A simple game world is one that does not need many bits to describe it. Determinism means no random parts. Complete information means that the game state is completely accessible to all the players. Video games violate game-world simplicity with the details of their displays, games with thrown dice or shuffled cards violate determinism, and many card games violate complete information -- each player sees only some of the cards.
Some notable games with these features are tic-tac-toe, checkers, chess, and Go. Tic-tac-toe is completely solved, and a brute-force solution of it is a fairly simple exercise for a computer-science student. Checkers has also been solved, though only in the weak sense that perfect play from the starting position is known to end in a draw, and its solution requires a large database of positions. Chess and Go have not been solved, and doing so is likely impractical, but the best chess and Go software now plays at least as well as human champions in those games.
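To show how small that tic-tac-toe exercise is, here is a minimal brute-force sketch in Python. The board encoding (a 9-character string with "." for an empty square) and the function names are my own choices for illustration, not taken from either article.

```python
from functools import lru_cache

# The eight winning lines on a 3x3 board whose squares are indexed 0..8.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]


def winner(board):
    """Return "X" or "O" if that player has three in a row, else None."""
    for a, b, c in LINES:
        if board[a] != "." and board[a] == board[b] == board[c]:
            return board[a]
    return None


@lru_cache(maxsize=None)
def value(board, player):
    """Value with best play, from X's side: +1 X wins, 0 draw, -1 O wins.

    `player` is whoever moves next in `board`.
    """
    won = winner(board)
    if won is not None:
        return 1 if won == "X" else -1
    if "." not in board:
        return 0  # board is full and nobody won
    other = "O" if player == "X" else "X"
    results = [value(board[:i] + player + board[i + 1:], other)
               for i, cell in enumerate(board) if cell == "."]
    # X picks the best outcome for X, O picks the worst outcome for X.
    return max(results) if player == "X" else min(results)


def solve():
    """Value of the empty board with X to move."""
    return value("." * 9, "X")
```

Calling solve() returns 0: with best play on both sides, tic-tac-toe is a draw. The same exhaustive search is hopeless for chess or Go, because the number of positions grows far too quickly.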
Departing from determinism is backgammon, though it also has a simple game world and complete information. Here also, some backgammon software can play at championship level.
I now get to poker, which both departs from determinism and has incomplete information, though it has a simple game world. There are numerous versions of poker, but in all of them no player knows which cards will be drawn from the deck, beyond the fact that no card appears twice, and each player gets some cards that only they can see for part of the game: the player's hand.
Some poker players bluff, that is, bet as if their hand is stronger than it actually is. A bluffer hopes that the other players will decide that they are likely to lose and will then fold, withdrawing from the current round to reduce their losses. But other players may find a bluff unconvincing, and that is where the phrase "calling one's bluff" comes from.
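The arithmetic behind a bluff can be sketched in a few lines of Python. This is a simplified model of my own, not something from either article: it assumes a single opponent, that the bluffer always loses when called, and that there is no further betting. The function names are hypothetical.

```python
def bluff_ev(pot, bet, fold_prob):
    """Expected profit of a pure bluff under the simplifying assumptions above.

    The bluffer wins `pot` when the opponent folds and loses `bet` when called.
    """
    return fold_prob * pot - (1.0 - fold_prob) * bet


def break_even_fold_prob(pot, bet):
    """Fold probability at which the bluff neither gains nor loses on average."""
    return bet / (pot + bet)
```

For example, betting 50 into a pot of 100 breaks even if the opponent folds one time in three, since break_even_fold_prob(100, 50) is 50/150. If the opponent folds more often than that, the bluff makes money on average even though the hand itself is weak.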
From the 538.com article,
It was trained on a common poker version called Texas Hold 'em with no-limit betting.
By 2007 and 2008, computers, led by a program called Polaris, showed promise in early man vs. machine matches, fighting on equal footing with, and even defeating, human pros in heads-up limit Hold ‘em, in which two players are restricted to certain fixed bet sizes.
In 2015, heads-up limit Hold ’em was “essentially solved” thanks to an AI player named Cepheus. This meant that you couldn’t distinguish Cepheus’s play from perfection, even after observing it for a lifetime.
In 2017, in a casino in Pittsburgh, a quartet of human pros each faced off against a program called Libratus in the incredibly complex heads-up no-limit Hold ’em. The human pros were summarily destroyed. Around the same time, another program, DeepStack, also claimed superiority over human pros in heads-up no-limit.
...
Brown and Sandholm’s latest creation, named Pluribus, is superhuman at a flavor of no-limit poker with more than two players — specifically, six — which is identical to one of the most popular forms of the game played online and very closely resembles the game I was playing in that room in the desert.
Pluribus was trained much like AlphaZero, the successor to AlphaGo and champion-level software for playing Go and chess. It started from scratch and played against itself repeatedly, adjusting its decision parameters to improve its performance. (The original AlphaGo, by contrast, was first trained on human games.)
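To give a feel for "playing itself and adjusting its parameters", here is a toy self-play sketch in Python using regret matching on rock-paper-scissors. It is not Pluribus's training code. The Science paper describes a form of Monte Carlo counterfactual regret minimization, and regret matching is only a small building block of that family. The game, the function names, and the initial_bias parameter are all my own assumptions for illustration.

```python
ACTIONS = 3  # 0 = rock, 1 = paper, 2 = scissors

# PAYOFF[a][b] is the payoff to a player choosing a against an opponent choosing b.
PAYOFF = [[0, -1, 1],
          [1, 0, -1],
          [-1, 1, 0]]


def strategy_from_regrets(regrets):
    """Regret matching: play each action in proportion to its positive regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total > 0:
        return [p / total for p in positive]
    return [1.0 / ACTIONS] * ACTIONS


def train(iterations, initial_bias=1.0):
    """Self-play between two regret-matching players; returns average strategies.

    Player 0 starts with a bias toward rock so that the learning is visible.
    """
    regrets = [[initial_bias, 0.0, 0.0], [0.0, 0.0, 0.0]]
    strategy_sums = [[0.0] * ACTIONS for _ in range(2)]
    for _ in range(iterations):
        # Both players fix their current strategies, then both update.
        strategies = [strategy_from_regrets(r) for r in regrets]
        for p in range(2):
            opp = strategies[1 - p]
            # Expected payoff of each pure action against the opponent's mix.
            action_values = [sum(PAYOFF[a][b] * opp[b] for b in range(ACTIONS))
                             for a in range(ACTIONS)]
            expected = sum(strategies[p][a] * action_values[a]
                           for a in range(ACTIONS))
            for a in range(ACTIONS):
                # Regret: how much better action a would have done.
                regrets[p][a] += action_values[a] - expected
                strategy_sums[p][a] += strategies[p][a]
    return [[s / iterations for s in sums] for sums in strategy_sums]
```

After many iterations, for example train(100000), both players' average strategies end up close to one third for each of rock, paper, and scissors, which is the unexploitable way to play that game. Nobody tells the program this. It falls out of each player repeatedly correcting for what the other is doing.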
The finished program, which ran on just a couple of Intel CPUs, was pitted against top human players — each of whom had won at least $1 million playing as a professional — in two experiments over thousands of hands: one with one copy of Pluribus and five humans and another with one human and five copies of Pluribus. The humans were paid per hand and further incentivized to play their best with cash put up by Facebook. Pluribus was determined to be profitable in both experiments and at levels of statistical significance worthy of being published in Science.
Neither the 538.com article nor the Science-magazine article mentioned bluffing, however. It would be interesting to see if some poker AI could guess that a player might be bluffing -- or if it could do some bluffing of its own.
Another aspect of human poker playing is that some players may try to get around the game's incomplete information by looking at other players' facial expressions and the like to guess what they are thinking. Some players try to avoid giving away such information by having a "poker face", an expressionless appearance, and by wearing sunglasses. Members of other species can pick up such cues; a famous case was Clever Hans, a seemingly knowledgeable horse. As a psychologist demonstrated, his "knowledge" came from reading his questioners' unintentional body cues.
Online poker gets around that issue, though interpreting such body cues is another AI challenge.