Wednesday, March 19, 2014

GTO Poker Outside of Heads Up -- What it solves and what it does not

It seems that you can’t discuss poker strategy these days without hearing the term GTO.  This should come as no surprise given that the promise of Game Theory Optimal poker is that it is completely unbeatable.  Obviously I'm 100% on the GTO bandwagon given the contents of this blog, but like anything, GTO poker has its limitations, and people seem to regularly ignore these limitations either out of ignorance or self-interest.  Understanding GTO concepts will help any poker player (yes, even microstakes players) improve their game, but it is not a holy grail that solves all of poker and guarantees free money.

The key thing to understand is that for most players, GTO can drastically improve your play in many specific situations. However, outside of heads up it cannot be used as a basis for an entire strategy, and you will be best off leveraging GTO concepts alongside standard play.

GTORangebuilder was built from the ground up with the strengths and limitations of GTO play in mind, and the only situations it analyzes are situations where its advice will guarantee you the stated EV. However, because of this (and because of computational limits), it's limited in what situations it can solve and it is designed to be a tool to aid you in improving your game, not a solution to all of poker.  Anyone claiming to know an unbeatable strategy for what to do in every possible situation in say, a 6-max 100BB cash game, is confused about what GTO means, pure and simple.

Nash equilibria defined:


For an in-depth look at the definition of a Nash equilibrium and how it is used in poker, see this post. Today I'm just going to look at the technical definition and highlight exactly what it says.  From Wikipedia:

"In game theory, the Nash equilibrium is a solution concept of a non-cooperative game involving two or more players, in which each player is assumed to know the equilibrium strategies of the other players, and no player has anything to gain by changing only their own strategy."

The key word that people often gloss over here is "own".  A player playing a GTO strategy (or Nash Equilibrium strategy; I use the terms interchangeably) is guaranteed that, if all the other players are also playing Nash Equilibrium strategies, no player can unilaterally change their strategy and increase their EV.

Let's first apply this definition to heads up, where GTO poker does guarantee unbeatable play.  Then we'll look at some three-player examples where it all falls apart.

GTO in Heads Up:


Suppose you are playing heads up vs a fish and you somehow are able to play perfect GTO poker, while the fish is not and makes many mistakes.  You can think of each mistake the fish makes as a change in his own strategy from GTO to a weaker strategy.  The definition above says he cannot possibly have gained EV by changing his own strategy.  Furthermore, poker is zero-sum and you are the only other player, so if he lost EV then you had to have gained that EV.

If you play GTO and your opponent plays anything else in a heads up game, they cannot beat you in expectation; it is mathematically impossible.  That is a powerful statement and it's what makes GTO so appealing. If perfect GTO play were known (which, barring a major breakthrough like advances in quantum computing, is unlikely to happen this century for the full NLHE game), it truly would solve heads up forever.
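The heads up guarantee can be checked on a game small enough to solve by hand. The sketch below uses a made-up two-player zero-sum matrix game, not a poker solution: the payoff matrix, the names PAYOFF, HERO_GTO, and GAME_VALUE, and the grid of opponent strategies are all assumptions chosen for illustration. It sweeps many opponent mixes and confirms that the equilibrium mix never earns less than the game value, and earns more when the opponent picks a dominated action.

```python
from fractions import Fraction as F

# Hypothetical two-player zero-sum game (not a poker solution).
# PAYOFF[i][j] is hero's payoff when hero plays row i and the opponent
# plays column j; the opponent receives the negative of that amount.
PAYOFF = [
    [F(2), F(-1), F(3)],
    [F(-1), F(1), F(2)],
]

# Hero's equilibrium mix and the value of this toy game, worked out by
# hand: column 2 is dominated for the opponent, and the 40/60 mix makes
# the opponent indifferent between columns 0 and 1.
HERO_GTO = [F(2, 5), F(3, 5)]
GAME_VALUE = F(1, 5)


def hero_ev(hero, villain):
    """Expected payoff to hero for a pair of mixed strategies."""
    return sum(
        hero[i] * villain[j] * PAYOFF[i][j]
        for i in range(len(hero))
        for j in range(len(villain))
    )


def villain_strategies(steps=10):
    """Yield every opponent mix whose probabilities are multiples of 1/steps."""
    for a in range(steps + 1):
        for b in range(steps + 1 - a):
            yield [F(a, steps), F(b, steps), F(steps - a - b, steps)]


# The worst the opponent can do to hero's equilibrium mix.
worst_case = min(hero_ev(HERO_GTO, v) for v in villain_strategies())

# An opponent who always plays the dominated column is making a mistake.
vs_mistake = hero_ev(HERO_GTO, [F(0), F(0), F(1)])

print("worst case for hero:", worst_case)
print("against the dominated column:", vs_mistake)
```

In this toy game the worst case equals the game value of 1/5, and the opponent's mistake lifts hero's EV to 12/5. The same logic is what makes a heads up GTO strategy unbeatable: the opponent's deviations can cost him EV, and in a two-player zero-sum game that EV has nowhere to go except to hero.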

Heads Up Subgames:


The power of GTO play isn't limited to situations where only two players are dealt into the hand.  You can apply the same logic as above to any "subgame" (poker situation), in which only two players are left in the hand.  If only two players see the flop, while the preflop action might determine the starting pot size on the flop as well as the flop hand ranges for both players, from that point on the players are playing a heads up two player subgame in which GTO play is unbeatable.  Thus while you might have put yourself in a bad situation by seeing the flop, a GTO strategy will be able to determine the exact EV of the situation you've put yourself in and if you follow the strategy, no matter what your opponent does you are guaranteed at least that EV.  If they do not play GTO themselves, your EV can only go up.

The vast majority of poker hands are heads up by the river, so by applying GTO strategies to the heads up situations that come up in your games you can greatly improve your win-rate in all sorts of post-flop situations, whether you play 6-max, full ring, or heads up.

GTO 3-handed


EDIT: In my CardRunners Video I redid these calculations more precisely using the Hold'em Resources calculator.  The basic result is the same, but the numerical values / strategies below are not as exact as they could be; see the video for more precise results.

In 3-handed situations the entire premise of GTO starts to break down, because a decrease in one of our opponents' EVs does not necessarily mean an increase in ours.  In fact, it is often the case that an opponent who makes mistakes can actually decrease our EV even if we continue to play GTO.  The easiest way to see this is to start with a simple example of a 3-handed push/fold equilibrium in a short stacked scenario.

Suppose we are 3-handed and all players have 15BB and are playing shove/fold poker in a rake-free cash game (15BB is a bit too deep for this to be a great idea, but that doesn't matter for the sake of the example). The equilibrium solution for this game is reasonably simple and can be found here.

Basically the button shoves about 29% of his hands, the small blind calls with 14.5%, and the big blind overcalls with a very tight range (9.4%) and, if the small blind folds, calls with a wider range (14.8%).  If the button folds, then the small blind shoves a very wide range (46%) and the big blind calls with 28%.

Let's assume the hero is in the small blind.  If you put that scenario into an analysis tool like CardRunnersEV (I ran CREV with a 1 million hand Monte Carlo sample, which is pretty good but not perfectly accurate, particularly because CREV rounds to the nearest 1BB/100), you can easily see the expected value for each player when playing the Nash Equilibrium strategies.  They are:

Button EV:  19 bb / 100
(HERO) Small Blind EV:  -11 bb / 100
Big Blind EV:  -8 bb / 100

If all 3 players play GTO, on average each player will win 19bb / 100 on the button, lose 11bb / 100 in the small blind, and lose 8bb / 100 in the big blind, netting to 0bb / 100, i.e. break even play.

Now we know from the Nash definition that if any player starts from the Nash state (where all 3 players are playing Nash) and changes only his own strategy, he cannot increase his EV.  Let's assume that the button is a weak tight player and does not shove nearly enough.  We know this cannot help the button's EV (and here it reduces it), but nothing about the definition of a Nash Equilibrium guarantees that the button's change in strategy won't also decrease the hero's EV.

If the button shoves only 55+, AJ+, KQ, KJs, QJs, JTs, the EVs become:

Button EV:  15 bb / 100
(HERO) Small Blind EV:  -17 bb / 100
Big Blind EV:  2 bb / 100

The hero's EV is down 6bb / 100, even though he is still playing the GTO strategy.  The hero's EV decreases by more than the button's EV (6bb / 100 versus 4bb / 100), even though the button is the player making a mistake!  If you imagine that every player plays GTO in all positions, except for the one fishy player who is too tight on the button, what happens to the hero's winrate?  He wins 19bb / 100 on the button, loses 17bb / 100 in the small blind, and loses 8bb / 100 in the big blind, for an average of -2bb / 100.  Playing GTO poker in 3+ way scenarios can lose money if there is a fishy player at the table who is not playing GTO.
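The win-rate arithmetic above is easy to reproduce. The short sketch below uses only the per-position EVs from the two tables; the dictionary and function names are hypothetical. The third player's and the fish's averages are not stated in the post; they are the same arithmetic applied to the other two seats, under the post's assumption that the fish deviates only on the button (so the third player is in the big blind whenever the fish is on the button).

```python
from fractions import Fraction

# Per-position EVs in bb/100, copied from the two tables above.
ALL_NASH = {"button": 19, "small_blind": -11, "big_blind": -8}
TIGHT_BUTTON = {"button": 15, "small_blind": -17, "big_blind": 2}


def average(evs):
    """Average win rate over one orbit of the three positions."""
    evs = list(evs)
    return Fraction(sum(evs), len(evs))


# Both tables should be zero-sum in a rake-free game.
assert sum(ALL_NASH.values()) == 0
assert sum(TIGHT_BUTTON.values()) == 0

# Everyone plays Nash: each player sees the Nash EV in every seat.
baseline = average(ALL_NASH.values())

# Hero is in the small blind exactly when the fish is on the button.
hero = average(
    [ALL_NASH["button"], TIGHT_BUTTON["small_blind"], ALL_NASH["big_blind"]]
)

# The third player is in the big blind when the fish is on the button.
third = average(
    [ALL_NASH["button"], ALL_NASH["small_blind"], TIGHT_BUTTON["big_blind"]]
)

# The fish only deviates from Nash when he is on the button.
fish = average(
    [TIGHT_BUTTON["button"], ALL_NASH["small_blind"], ALL_NASH["big_blind"]]
)

print("baseline:", baseline)
print("hero:", hero)
print("third player:", third)
print("fish:", fish)
```

With these inputs the hero averages -2bb / 100, matching the figure above, while the third player averages 10/3 bb / 100 and the fish -4/3 bb / 100. The three averages still sum to zero, so the hero's loss and the fish's loss both flow to the third player.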

If you imagine that the big blind player is a smart reactive player it can get even worse!  The condition that the big blind cannot gain EV by changing his strategy away from the Nash Equilibrium strategy no longer applies once there is a fish on the button.  The Nash condition is only relevant when ALL players are playing Nash.  Now that the button has changed his strategy, the big blind player can change his strategy as well to increase his profit and to reduce our hero's EV.  If the BB tightens up his overcalling range he can further reduce the hero's EV by almost another 1BB / 100 when the hero is in the small blind.

In 3-way pots with a fish, a GTO strategy can lose, and furthermore a smart reactive player can adjust his strategy to make the GTO strategy lose even more.
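The same effect shows up in a game small enough to enumerate by brute force. The sketch below is a toy three-player zero-sum game with two actions per player; every payoff in it is invented for illustration and none of it comes from the push/fold solution above. It finds the pure Nash equilibria by checking every unilateral deviation, then shows the hero's payoff dropping when the button deviates and dropping again when the big blind reacts.

```python
from itertools import product

# Hypothetical payoffs (button, hero, big_blind) for a toy three-player
# zero-sum game. Action 0 is each player's equilibrium action; action 1
# is a deviation. These numbers are invented for illustration only.
PAYOFFS = {
    (0, 0, 0): (2, -1, -1),
    (1, 0, 0): (1, -3, 2),
    (0, 1, 0): (2, -2, 0),
    (0, 0, 1): (3, -1, -2),
    (1, 0, 1): (1, -4, 3),
    (1, 1, 0): (0, -2, 2),
    (0, 1, 1): (3, -2, -1),
    (1, 1, 1): (0, -2, 2),
}
BUTTON, HERO, BIG_BLIND = 0, 1, 2

# Every outcome must be zero-sum.
assert all(sum(payoff) == 0 for payoff in PAYOFFS.values())


def is_pure_nash(profile):
    """True if no single player gains by changing only his own action."""
    for player in range(3):
        for action in (0, 1):
            alternative = list(profile)
            alternative[player] = action
            if PAYOFFS[tuple(alternative)][player] > PAYOFFS[profile][player]:
                return False
    return True


equilibria = [p for p in product((0, 1), repeat=3) if is_pure_nash(p)]

all_nash = PAYOFFS[(0, 0, 0)]
fish_on_button = PAYOFFS[(1, 0, 0)]   # only the button deviates
bb_reacts = PAYOFFS[(1, 0, 1)]        # the big blind then adjusts as well

print("pure Nash equilibria:", equilibria)
print("hero at equilibrium:", all_nash[HERO])
print("hero vs fish on the button:", fish_on_button[HERO])
print("hero after the big blind adjusts:", bb_reacts[HERO])
```

In this toy game the only pure equilibrium is the all-zeros profile. The button's deviation costs him 1 but costs the hero 2, and once the button has deviated the big blind's own deviation becomes profitable and costs the hero another 1, even though the hero never changed his action.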

It is important to note that the above effects are not due to ICM; they appear even in cash games.  In SnG situations where ICM is a factor there are even bigger and more obvious instances where the presence of a fish can make a Nash strategy -EV, but the fundamental issue in both cash games and ICM cases is the same.

Conclusions:


This is in no way meant as a condemnation of GTO play.  GTO strategies are unbeatable in any 2-handed situation, and even in 3+ handed situations, understanding GTO theory will give you tons of insights into how to balance ranges and increase your EV.  I believe that GTO will be the driving force in the continued evolution of poker strategy over the next 5 years.  However, this post should act as a warning if you are ever considering turning off your brain and blindly following GTO.  Anyone who misapplies GTO theory to scenarios where it doesn't hold, or who is using it as a crutch rather than as a tool, is going to quickly be left behind.




3 comments:

  1. Good post. I've never read about GTO in 3-way situations. Thank you

  2. Nice blog! So what do you think about PokerSnowie and their GTO claims?

  3. I honestly don't know much about snowie as I've never used it. I'd assume they are just using the term GTO loosely in a marketing sense, and don't mean that they have actually computed a full GTO solution.

    The idea of solving for a GTO strategy all the way back to preflop is a bit ridiculous, even if you drastically limit the bet sizes you consider. On top of that, even if GTO were known, as soon as one player is playing non-GTO (i.e. there is just a single fish at the table) the game gets infinitely more complex, and GTO play may well be losing play, as my post illustrates.

    I mainly just wanted to get across that the game isn't anywhere near solved and anyone who claims it is is likely FOS. People are going to be learning and improving at poker for a long time to come.

    As for snowie, it's certainly possible that a great poker AI already exists or will come out soon. Deep Blue is great at chess even though it is widely accepted that we are decades/centuries from figuring out a GTO solution to chess.
    I know the University of Alberta team claims to have an AI that is on par with a decent small stakes grinder at 6 max. It plays non-exploitative poker but they don't claim to be GTO in any sense as far as I know.

    The main point of my post is that, even if somehow someone did magically solve for GTO, there would be many situations in 3+ handed games where playing GTO would lose, often substantially, if there were a fish at your table. If there are no fish at your table and everyone is playing GTO, then you'd just be grinding away money to the rake, so it would be pointless to play anyway.

    Hopefully that makes sense :)

