New York: An artificial intelligence (AI) programme has defeated leading professionals in six-player no-limit Texas hold’em poker, the world’s most popular form of poker, researchers said Friday.
Developed by Carnegie Mellon University in the US and Facebook AI, the programme called Pluribus defeated poker professional Darren Elias, who holds the record for most World Poker Tour titles, and Chris Ferguson, winner of six World Series of Poker events.
Each pro separately played 5,000 hands of poker — a family of card games that combines gambling, strategy, and skill — against five copies of Pluribus, according to a research paper published in the journal Science.
In another experiment involving 13 pros, all of whom have won more than USD 1 million playing poker, Pluribus played five pros at a time for a total of 10,000 hands and again emerged victorious.
“Pluribus achieved superhuman performance at multi-player poker, which is a recognised milestone in artificial intelligence and in game theory that has been open for decades,” said Tuomas Sandholm, a professor who developed Pluribus with Noam Brown, a PhD candidate at Carnegie Mellon.
“Thus far, superhuman AI milestones in strategic reasoning have been limited to two-party competition. The ability to beat five other players in such a complicated game opens up new opportunities to use AI to solve a wide variety of real-world problems,” Sandholm said.
“Playing a six-player game rather than head-to-head requires fundamental changes in how the AI develops its playing strategy,” said Brown, who is associated with Facebook AI.
“We’re elated with its performance and believe some of Pluribus’ playing strategies might even change the way pros play the game,” Brown said.
Pluribus’ algorithms built some surprising features into its strategy.
For instance, most human players avoid ‘donk betting’ — that is, ending one round with a call but then starting the next round with a bet.
It is seen as a weak move that usually doesn’t make strategic sense.
However, Pluribus placed donk bets far more often than the professionals it defeated.
“Its major strength is its ability to use mixed strategies,” Elias said.
“That’s the same thing that humans try to do. It’s a matter of execution for humans — to do this in a perfectly random way and to do so consistently. Most people just can’t,” he said.
Sandholm and Brown earlier developed Libratus, which two years ago decisively beat four poker pros playing a combined 120,000 hands of heads-up no-limit Texas hold’em, a two-player version of the game.
Games such as chess and Go have long served as milestones for AI research. In those games, all of the players know the status of the playing board and all of the pieces, researchers said.
However, poker is a bigger challenge because it is an incomplete information game; players can’t be certain which cards are in play and opponents can and will bluff, they said.
That makes it both a tougher AI challenge and more relevant to many real-world problems involving multiple parties and missing information.
All of the AIs that displayed superhuman skills at two-player games did so by approximating what’s called a Nash equilibrium.
Named for the late Nobel laureate John Forbes Nash Jr, a Nash equilibrium is a pair of strategies where neither player can benefit from changing strategy as long as the other player’s strategy remains the same.
Although the AI’s strategy guarantees only a result no worse than a tie, the AI emerges victorious if its opponent makes miscalculations and can’t maintain the equilibrium.
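The "no worse than a tie" guarantee can be checked by hand in a much smaller game. The sketch below uses rock-paper-scissors as a stand-in for a two-player zero-sum game; the payoff table, the function name `worst_case` and the two example strategies are illustrative assumptions and do not come from the Pluribus research.

```python
# Toy illustration only: rock-paper-scissors, not poker.
# PAYOFF[i][j] is the payoff to a player choosing move i against move j
# (0 = rock, 1 = paper, 2 = scissors); the opponent receives the negative.
PAYOFF = [
    [0, -1, 1],   # rock ties rock, loses to paper, beats scissors
    [1, 0, -1],   # paper
    [-1, 1, 0],   # scissors
]

def worst_case(strategy):
    """Lowest expected payoff of a mixed strategy against any single opposing move."""
    return min(
        sum(strategy[i] * PAYOFF[i][j] for i in range(3))
        for j in range(3)
    )

uniform = [1 / 3, 1 / 3, 1 / 3]   # the equilibrium mix in this toy game
rock_heavy = [0.5, 0.25, 0.25]    # a hypothetical player who favours rock

print(worst_case(uniform))     # cannot be pushed below a tie
print(worst_case(rock_heavy))  # loses on average to an opponent who plays paper
```

In this toy game the uniform mix can never be pushed below an expected payoff of zero, while the rock-heavy mix can be exploited. It shows the guarantee only; it says nothing about how such a strategy is found in a game as large as poker.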
In a game with more than two players, playing a Nash equilibrium can be a losing strategy.
Pluribus dispenses with theoretical guarantees of success and develops strategies that nevertheless enable it to consistently outplay opponents.
The AI programme first computes a ‘blueprint’ strategy by playing six copies of itself, which is sufficient for the first round of betting.
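The idea of learning a strategy through self-play can be sketched on a toy game. The article does not say which learning rule Pluribus uses, so the example below swaps in regret matching, a simple textbook rule, and applies it to two-player rock-paper-scissors rather than six-player poker. All names (`current_strategy`, `self_play`) and the starting regrets are assumptions made for illustration.

```python
# Toy self-play sketch: regret matching on rock-paper-scissors.
# This is NOT the Pluribus algorithm; it only illustrates the self-play idea.
ACTIONS = 3  # rock, paper, scissors
PAYOFF = [[0, -1, 1], [1, 0, -1], [-1, 1, 0]]  # payoff to the player choosing the row

def current_strategy(regrets):
    """Play each action in proportion to its positive accumulated regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total > 0:
        return [p / total for p in positive]
    return [1.0 / ACTIONS] * ACTIONS  # no positive regret yet: play uniformly

def self_play(iterations, initial_regrets):
    """Let two copies of the learner play each other; return their average strategies."""
    regrets = [list(initial_regrets[0]), list(initial_regrets[1])]
    strategy_sums = [[0.0] * ACTIONS, [0.0] * ACTIONS]
    for _ in range(iterations):
        strategies = [current_strategy(r) for r in regrets]
        for player in (0, 1):
            mine, theirs = strategies[player], strategies[1 - player]
            # Expected payoff of each pure action against the opponent's current mix.
            action_values = [
                sum(theirs[j] * PAYOFF[i][j] for j in range(ACTIONS))
                for i in range(ACTIONS)
            ]
            value = sum(mine[i] * action_values[i] for i in range(ACTIONS))
            for i in range(ACTIONS):
                regrets[player][i] += action_values[i] - value
                strategy_sums[player][i] += mine[i]
    return [[s / iterations for s in sums] for sums in strategy_sums]

# Start each copy with a different arbitrary bias so the run is not trivial.
average_strategies = self_play(20000, [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
print(average_strategies)
```

Although each copy starts with a bias, the average strategies should drift towards an even mix of the three moves, which is the equilibrium of this toy game.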
From that point on, Pluribus does a more detailed search of possible moves in a finer-grained abstraction of the game.
It looks ahead several moves as it does so, but it does not need to look all the way to the end of the game, which would be computationally prohibitive.
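Looking ahead only a few moves can be illustrated with a generic depth-limited search. The sketch below is a loose analogy under stated assumptions: it uses a simple perfect-information stone-taking game (players alternately remove one to three stones, and whoever takes the last stone wins) and a neutral estimate of zero when the depth limit is reached. The function `limited_search` is hypothetical and is not how Pluribus searches a hidden-information game like poker.

```python
# Toy depth-limited search on a stone-taking game, for illustration only.
def limited_search(stones, depth):
    """Return +1 if the player to move can force a win within `depth` moves,
    -1 if they are forced to lose within that horizon, and 0 if unknown."""
    if stones == 0:
        return -1  # the previous player took the last stone, so the mover has lost
    if depth == 0:
        return 0   # horizon reached: fall back on a neutral estimate
    return max(
        -limited_search(stones - take, depth - 1)
        for take in (1, 2, 3)
        if take <= stones
    )

print(limited_search(3, 1))   # win: take all three stones
print(limited_search(4, 2))   # loss: every move leaves a winning reply
print(limited_search(20, 2))  # too far from the end to tell with two moves of lookahead
```

The trade-off is the one the article describes: a shallow horizon keeps the computation small, at the cost of relying on an estimate wherever the search stops short of the end of the game.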
PTI