For the last day of this week, I’ve chosen to play poker with artificial intelligence (AI). Metaphorical first sentence aside, I would like to share with The Information Age a recent post from the MIT Technology Review, another irresistible one, this time about a most recent AI conquest of a typically human capacity: the card game of poker.
This is a game that involves several human psychological traits that we might think an artificial device would never master. For example, the human ability to read other people’s minds would have seemed intractable from an AI perspective, as would the ability to induce contradictory beliefs in others, as when a poker player uses the strategy of bluffing. Yet strategic thinking under uncertainty and imperfect information appears to be in the process of being conquered by the AI community of computer scientists and software geeks. Or so the article below claims; the links to some of the pioneers are well worth following, as is a full reading of the paper DeepStack: Expert-Level Artificial Intelligence in No-Limit Poker, whose abstract is also provided below in this post.
What I find most interesting in developments such as these is the interdisciplinary nature of progress in artificial intelligence and computer science. Using a computer to master a card game like poker is also of use in apparently distinct, unrelated settings such as cybersecurity, robot planning, or the automated guidance of a taxi. This cross-disciplinary reach of computer science and AI is one major reason the fields are so exciting and promising, drawing growing attention and investment from entrepreneurs and venture capitalists. One main takeaway is the pervasiveness of information and information technologies within human societies and in how they are, were, and will be structured. The Information Age is, undoubtedly, an age of all ages and eras.
Libratus is playing thousands of games of heads-up, or two-player, no-limit Texas hold’em against several expert professional poker players. Now a little more than halfway through the 20-day contest, Libratus is up by almost $800,000 against its human opponents. So victory, while far from guaranteed, may well be in the cards.
A win for Libratus would be a huge achievement in artificial intelligence. Poker requires reasoning and intelligence that has proven difficult for machines to imitate. It is fundamentally different from checkers, chess, or Go, because an opponent’s hand remains hidden from view during play. In games of “imperfect information,” it is enormously complicated to figure out the ideal strategy given every possible approach your opponent may be taking. And no-limit Texas hold’em is especially challenging because an opponent could essentially bet any amount.
That last paragraph conveys the significance of poker for understanding many human psychological traits, especially reasoning with imperfect information: how we deal with it and try to somehow guess what the full information might be, to help us decide the best action to pursue. Not only that, we also need to take into account that other players pursue similar goals; in the process they too will try to guess what our hand is, widening further the imbalance in the information available for a best decision or action:
“Poker has been one of the hardest games for AI to crack,” says Andrew Ng, chief scientist at Baidu. “There is no single optimal move, but instead an AI player has to randomize its actions so as to make opponents uncertain when it is bluffing.”
Libratus was created by Tuomas Sandholm, a professor in the computer science department at CMU, and his graduate student Noam Brown. Sandholm, an expert on game theory and AI who emigrated from Finland for his PhD, says it is amazing that humans have been able to outplay computers for so long. “It just blows my mind how good these top pros are,” he says. “Of all of these games that AI has tackled, [poker] is the only one where AI hasn’t reached superhuman performance.”
“Whether a move is good or not depends on things you cannot observe,” says Vincent Conitzer, a professor at Duke University who teaches AI and game theory. “This also results in a need to be unpredictable. If you never bluff, you are not a good player. If you always bluff, you are not a good player. Game theory tells you how to randomize your play in a way that is, in a sense, optimal.”
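Conitzer’s point about randomized bluffing can be made concrete with a standard toy model from game theory (this is an illustration, not Libratus’s actual algorithm): a single river bet, where the bettor must mix value bets with bluffs so that the caller cannot profit either by always calling or by always folding. The pot and bet sizes below are made-up numbers.

```python
# Toy single-bet bluffing spot: the bettor bets `bet` into a pot of size `pot`.
# At equilibrium each side is made indifferent to the other's choice:
#  - the betting range contains just enough bluffs that calling breaks even,
#  - the caller calls just often enough that bluffing breaks even.

def optimal_bluff_fraction(pot, bet):
    """Fraction of the betting range that should be bluffs.
    The caller risks `bet` to win `pot + bet`, so indifference gives
    q*(pot + bet) - (1 - q)*bet = 0  =>  q = bet / (pot + 2*bet)."""
    return bet / (pot + 2 * bet)

def optimal_call_frequency(pot, bet):
    """How often the caller must call so that a bluff breaks even.
    The bluffer wins `pot` when the caller folds, loses `bet` when called:
    (1 - c)*pot - c*bet = 0  =>  c = pot / (pot + bet)."""
    return pot / (pot + bet)

pot, bet = 100, 100  # a pot-sized bet
q = optimal_bluff_fraction(pot, bet)
c = optimal_call_frequency(pot, bet)
print(f"bluff fraction: {q:.3f}")  # one third of bets are bluffs
print(f"call frequency: {c:.3f}")  # defend half the time

# Sanity check: at this bluff fraction, calling has exactly zero expected value.
ev_call = q * (pot + bet) - (1 - q) * bet
assert abs(ev_call) < 1e-9
```

This is exactly the sense in which "if you never bluff, you are not a good player; if you always bluff, you are not a good player": the optimal bluff fraction sits strictly between zero and one and depends on the bet size relative to the pot.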
Last year, Sandholm led the development of a previous poker-playing program, called Claudico, which was soundly beaten in a match against several professional poker players. He explains that Libratus uses several new advances to achieve such a high level of play. This includes a new equilibrium approximation technique, Sandholm says, as well as several new methods for analyzing possible outcomes as cards are revealed at later stages of a game. This end-game analysis is computationally very challenging, and is performed during each game at the Pittsburgh Supercomputing Center, a facility operated by CMU and the University of Pittsburgh.
Advances in machine learning and AI have seen a number of superhuman game playing programs emerge recently. Last year, researchers at DeepMind, a subsidiary of Alphabet, developed a program capable of beating one of the world’s best Go players. This achievement was so spectacular because Go is extremely complex, and because it is hard to measure progress within the game (see “Google’s AI Masters Go a Decade Earlier than Expected”).
The techniques used to build a smarter poker-bot could have many real-world applications. Game theory has already been applied to research on jamming attacks and cybersecurity, automated guidance for taxi service, and robot planning, says Sam Ganzfried, who was involved with the development of Claudico and is now an assistant professor at Florida International University in Miami.
However, even if Libratus triumphs this week, that doesn’t mean that humans no longer deserve a spot at the card table. The multiplayer version of no-limit Texas hold’em cannot be mastered using the techniques employed by Libratus.
A few different research groups are focused on tackling poker. Another academic team, from the University of Alberta in Canada, and Charles University and Czech Technical University in the Czech Republic, recently developed a program, called DeepStack, that has already beaten several professional players in heads-up no-limit Texas hold’em (see “Poker Is the Latest Game to Fold Against AI”). However, Sandholm says, the players involved in the match against Libratus are far stronger, and are playing many more hands against the machine, which should provide greater statistical significance to the result:
Artificial intelligence has seen a number of breakthroughs in recent years, with games often serving as significant milestones. A common feature of games with these successes is that they involve information symmetry among the players, where all players have identical information. This property of perfect information, though, is far more common in games than in real-world problems. Poker is the quintessential game of imperfect information, and it has been a longstanding challenge problem in artificial intelligence. In this paper we introduce DeepStack, a new algorithm for imperfect information settings such as poker. It combines recursive reasoning to handle information asymmetry, decomposition to focus computation on the relevant decision, and a form of intuition about arbitrary poker situations that is automatically learned from self-play games using deep learning. In a study involving dozens of participants and 44,000 hands of poker, DeepStack becomes the first computer program to beat professional poker players in heads-up no-limit Texas hold’em. Furthermore, we show this approach dramatically reduces worst-case exploitability compared to the abstraction paradigm that has been favored for over a decade.
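To give a flavor of the self-play learning the abstract mentions, here is a minimal sketch of regret matching, the per-decision update at the heart of the counterfactual regret minimization family of solvers that DeepStack builds on. This is a toy illustration on rock-paper-scissors, not DeepStack itself: two regret-matching players face each other, and their average strategies converge toward the game’s equilibrium (play each action one third of the time).

```python
# Regret matching in self-play on rock-paper-scissors (a toy zero-sum game).
# Each player tracks, per action, how much better that action would have done
# than its current mixed strategy; future play is proportional to positive regret.

N = 3  # actions: 0 = rock, 1 = paper, 2 = scissors
PAYOFF = [[0, -1, 1],
          [1, 0, -1],
          [-1, 1, 0]]  # payoff to the player choosing the row action

def strategy_from(regrets):
    pos = [max(r, 0.0) for r in regrets]
    total = sum(pos)
    return [p / total for p in pos] if total > 0 else [1.0 / N] * N

def expected_utils(opp):
    # utility of each pure action against the opponent's mixed strategy
    return [sum(PAYOFF[a][b] * opp[b] for b in range(N)) for a in range(N)]

regrets = [[0.1, 0.0, 0.0], [0.0, 0.0, 0.0]]  # tiny asymmetry to start the dynamics
strategy_sums = [[0.0] * N, [0.0] * N]
T = 100_000

for _ in range(T):
    strats = [strategy_from(r) for r in regrets]
    for p in range(2):
        utils = expected_utils(strats[1 - p])
        ev = sum(strats[p][a] * utils[a] for a in range(N))
        for a in range(N):
            regrets[p][a] += utils[a] - ev       # accumulate regret
            strategy_sums[p][a] += strats[p][a]  # accumulate for the average

avg = [s / T for s in strategy_sums[0]]
print(avg)  # each entry close to 1/3, the Nash equilibrium of rock-paper-scissors
```

The key property, shared by the poker solvers, is that the *average* strategy over many self-play iterations approaches an unexploitable equilibrium even though play at any single iteration may wander. DeepStack combines this style of solving with learned value estimates so it need not iterate over the full game.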
Featured Image: from Fig.3 in the paper DeepStack: Expert-Level Artificial Intelligence in No-Limit Poker with the following caption: Figure 3: Deep counterfactual value networks. The inputs to the network are the pot size, public cards, and the player ranges, which are first processed into bucket ranges. The output from the seven fully connected hidden layers is post-processed to guarantee the values satisfy the zero-sum constraint, and then mapped back into hand counterfactual values.
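The caption’s mention of post-processing the network outputs "to guarantee the values satisfy the zero-sum constraint" can be sketched in a few lines. This is one simple way to impose the constraint, for illustration only; the paper’s exact correction scheme may differ, and all ranges and values below are made up.

```python
# Sketch of a zero-sum correction for predicted counterfactual values.
# In a zero-sum game, the range-weighted values of the two players must sum
# to zero; a neural network's raw outputs generally will not, so we remove
# the residual by charging half of it to each player.

def zero_sum_correct(v1, v2, r1, r2):
    """v1, v2: predicted per-hand counterfactual values for players 1 and 2.
    r1, r2: the players' ranges (probability vectors summing to 1)."""
    # residual that would be exactly zero in a correct solution
    err = (sum(a * b for a, b in zip(r1, v1)) +
           sum(a * b for a, b in zip(r2, v2)))
    # subtracting err/2 from every value of each player cancels the residual,
    # because each range sums to 1
    v1c = [v - err / 2 for v in v1]
    v2c = [v - err / 2 for v in v2]
    return v1c, v2c

# Made-up two-hand example.
r1, r2 = [0.5, 0.5], [0.25, 0.75]
v1, v2 = [1.0, -0.2], [0.3, -0.1]
v1c, v2c = zero_sum_correct(v1, v2, r1, r2)
total = (sum(a * b for a, b in zip(r1, v1c)) +
         sum(a * b for a, b in zip(r2, v2c)))
assert abs(total) < 1e-9  # the corrected values now satisfy the constraint
```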