## Thursday, March 6, 2014

### Strategies, Nash Equilibria and GTO Poker

To really wrap your head around GTO poker it is important to have a good grasp of what defines a strategy in NLHE and what makes a set of strategies a Nash Equilibrium or GTO. This post is designed to give you a formal theoretical definition of these terms, as well as some practical examples and intuition about how they relate to day-to-day poker.

What is a poker strategy?

This is actually a more complicated question than you might think.  If this seems a bit confusing at first just keep reading until you reach the practical examples as they should clear things up.  The sections highlighted in grey have more mathematical details and are fine to skip if you just want a high level overview.

Generally speaking a strategy for poker says what to do in every possible situation with every possible hand.

Mathematically, a strategy is a function f(E, H) that for every possible sequence of past events, E, and for every possible hand we might be dealt, H, returns a legal poker action A (like bet a certain amount/call/fold).

The simplest example of a poker strategy is a push fold chart like many players use at the end of SnGs, such as this one. For every possible stack size, every possible hand, and whether or not an opponent went all in before you, it states what to do: either go all in or fold (interpret the caller's action as going all in if his opponent does anything other than shoving or folding).

More complex strategies that involve complex postflop play, reacting to tendencies that our opponents exhibited in past hands, etc. are practically speaking impossible to write down in a simple chart, as they need to describe what to do with every possible hand in response to every possible set of opponent actions on every possible board. In fact, it is completely impossible even for a computer to store the information for a complete strategy. A complete strategy wouldn't fit on all the hard drives on earth combined if it were to contain hundreds of possible different bet sizes.

Despite that complexity, people obviously play solid poker, and with modern software like GTORangeBuilder we can compute highly accurate approximations of postflop GTO play, so how does this work? Without a clear definition of a strategy we cannot analytically and numerically compare the performance of strategies, nor can we solve for approximate or exact Nash Equilibria. So our first step must be coming up with a formal definition of a strategy and making it applicable to actually playing poker in a practical sense at the tables.

It's actually quite simple: we just have to limit ourselves to considering strategies that either ignore lots of information, make a very limited set of choices, or both. Again, let's go back to the push fold charts that I mentioned above. They ignore all history from past hands and any information we might infer from our opponents' prior actions. They also limit the number of actions they might take to shoving and folding. While it might seem like there is no way that ignoring that much information could possibly lead to a strategy that is useful in actual poker games, it turns out that when stacks are small and blinds are big, push fold equilibria are extremely profitable tools to use. SnG end game push fold strategies have been used by elite players worldwide to win millions of dollars as part of a very short stacked strategy.

Going back to our mathematical definition, ignoring information comes down to limiting what we consider in terms of past events E in our strategy f(E, H). The push fold charts limit E to a binary value: did someone else go all in before us or not? They also limit the set of actions A that the function f can return to two options: go all in or fold.
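As a rough sketch, this restricted f(E, H) can be written as an ordinary function. The hand sets below are made-up placeholders, not a real chart, and names such as `push_fold_strategy` are purely illustrative.

```python
from enum import Enum


class Action(Enum):
    SHOVE = "shove"
    FOLD = "fold"


# Hypothetical, deliberately tiny ranges for illustration only.
# A real chart covers all 169 starting hand classes and varies by stack size.
OPEN_SHOVE_RANGE = {"AA", "KK", "QQ", "AKs", "AKo", "A5s", "76s"}
CALL_SHOVE_RANGE = {"AA", "KK", "QQ", "AKs", "AKo"}


def push_fold_strategy(opponent_shoved: bool, hand: str) -> Action:
    """f(E, H) with E reduced to one boolean and A reduced to two actions."""
    hands = CALL_SHOVE_RANGE if opponent_shoved else OPEN_SHOVE_RANGE
    return Action.SHOVE if hand in hands else Action.FOLD


print(push_fold_strategy(False, "76s"))
print(push_fold_strategy(True, "76s"))
```

Expanding E would mean adding more arguments (position, stack size, prior bets), and expanding A would mean adding more members to `Action`.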

Given that even the simplest mathematical analysis of push fold equilibria has been useful at high levels of play, it seems natural to ask what expanding E, A, or both can do to bolster our understanding of poker. Luckily there are some logical ways to think about how to go about considering slightly more general strategies and what additional elements make the most sense to add to E and A. I'll get into that more after defining a Nash Equilibrium.

What is a Nash Equilibrium?

A Nash Equilibrium is a group of strategies for every player in a game where:

1. Each player knows every other player's strategy exactly
2. Each player's strategy is maximizing their expected value against their opponents' strategies.

The easiest way to understand this is through simple examples. A common one is Rock Paper Scissors. Let's consider strategies that ignore history and just play Rock, Paper or Scissors with specific probabilities. We can represent these as three numbers that sum to 1, where the first number is the chance of playing rock, the second the chance of playing paper, and the third the chance of playing scissors.

Consider two simple strategies, good old rock (1, 0, 0), which always plays rock, and the diplomat (0, 1, 0), which always plays paper.

This pair of strategies is not a Nash Equilibrium because if the player using good old rock knows in advance that he is playing against the diplomat (against which he always loses) he could increase his expected value by playing scissors more and rock less.

If we instead consider two strategies that play Rock, Paper and Scissors equally, (1/3, 1/3, 1/3) and (1/3, 1/3, 1/3) then we can see that those strategies do constitute a Nash Equilibrium.

In this situation each player wins a third of the time, loses a third of the time, and ties a third of the time, for an expected value of zero. Increasing the frequency with which they play any particular option won't change that at all: against an opponent who mixes equally, every option still wins, loses, and ties equally often. So even knowing their opponent's strategy exactly, they have no way to alter their own strategy to increase their expected value.
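The two Rock Paper Scissors examples can be checked numerically. This is a minimal sketch that assumes a payoff of +1 for a win, 0 for a tie and -1 for a loss; the function names are my own. It only tests deviations to pure strategies, which is enough because the EV of a mixed strategy is a weighted average of the EVs of the pure strategies it mixes.

```python
from fractions import Fraction

# PAYOFF[a][b] is the payoff to the player choosing a against b
# (0 = rock, 1 = paper, 2 = scissors; win = +1, tie = 0, loss = -1).
PAYOFF = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]
PURE_STRATEGIES = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]


def expected_value(mine, theirs):
    """EV of playing mixed strategy `mine` against mixed strategy `theirs`."""
    return sum(
        mine[a] * theirs[b] * PAYOFF[a][b] for a in range(3) for b in range(3)
    )


def is_nash(p1, p2):
    """True if neither player can gain by deviating to any pure strategy."""
    best1 = max(expected_value(pure, p2) for pure in PURE_STRATEGIES)
    best2 = max(expected_value(pure, p1) for pure in PURE_STRATEGIES)
    return best1 <= expected_value(p1, p2) and best2 <= expected_value(p2, p1)


third = Fraction(1, 3)
uniform = (third, third, third)
print(is_nash((1, 0, 0), (0, 1, 0)))  # good old rock vs the diplomat
print(is_nash(uniform, uniform))      # both players mix equally
```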

An important thing to note that is relevant to poker is that in a repeated game (e.g. where you play multiple rounds of Rock, Paper, Scissors, or multiple hands of poker), playing the single round Nash Equilibrium repeatedly is also a Nash Equilibrium of the repeated game. However, repeated games can also have additional, more complex equilibria that adjust based on the actions taken in previous rounds.

What makes Nash Equilibria Powerful

The big reason that Nash Equilibrium strategies are powerful is that they give you a guaranteed minimum EV.  The way they are defined, they assume your opponent knows your strategy and that his strategy is the absolute best possible counter to what you are doing.  Thus if you play against any other opponent who is not perfectly countering your strategy, your EV can only go up.

In two player zero-sum situations, such as heads up poker without rake, Nash Equilibria are strategies that provide the best possible guaranteed minimum EV.
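To make the "guaranteed minimum EV" idea concrete, here is a small sketch using Rock Paper Scissors again, assuming the usual +1 / 0 / -1 payoffs. The guaranteed EV of a strategy is what it earns against the reply that hurts it most; the candidate strategies and names below are only illustrative.

```python
from fractions import Fraction

# PAYOFF[a][b]: our payoff when we play a and the opponent plays b
# (0 = rock, 1 = paper, 2 = scissors).
PAYOFF = [[0, -1, 1], [1, 0, -1], [-1, 1, 0]]


def guaranteed_ev(mine):
    """Worst-case EV: the opponent picks the pure reply that hurts us most."""
    return min(
        sum(mine[a] * PAYOFF[a][b] for a in range(3)) for b in range(3)
    )


third = Fraction(1, 3)
candidates = {
    "uniform": (third, third, third),
    "rock-heavy": (Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)),
    "always rock": (Fraction(1), Fraction(0), Fraction(0)),
}
for name, strategy in candidates.items():
    print(name, guaranteed_ev(strategy))
```

Of these three candidates, only the equal mix cannot be pushed below an EV of zero; the more lopsided a strategy is, the lower its guaranteed EV.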

The other power of Nash Equilibria is that they are simple.  Nash Equilibria strategies by definition don't make specific plays against specific types of opponents based on past history.  They assume that your opponent will correctly adjust to whatever you do and thus they treat all opponents as the same.

This has the advantage that it drastically reduces the scope of strategies we need to consider and allows us to focus more on playing 100% solid poker ourselves rather than constantly trying to get inside our opponents' heads. However, it also has the weakness that against opponents who are making a lot of mistakes, a Nash strategy will not earn as high an EV as a strategy that is perfectly designed to counter the specific mistakes of our opponents.

While this weakness is certainly relevant, Nash Equilibrium strategies in poker do quite well, even against opponents who make a lot of mistakes. Furthermore, any adjustments you make to exploit an opponent will open up leaks in your own game that your opponent can use against you, such that you may end up performing worse than the Nash strategy. Even players that seem very fishy adjust to their opponents' play more than you might expect.

Also, by using a concept known as minimally exploitative play, which is a variant of a GTO strategy, we can still exploit our opponents' mistakes while minimizing our own exploitability. These types of strategies can also be calculated with GTORangeBuilder and I will get into exactly how that works in future posts.

Nash Equilibria in Poker

In poker, to understand Nash Equilibria you have to think at the strategy level. If we take the push / fold game example again, the idea is that both players must know each other's strategies and have no way to change their own strategy and increase their EV. This does not mean that they know their opponent's specific hand in a situation, just his strategy. So in a push fold equilibrium, when your opponent pushes all in at you, you would know the exact range of hands that he might do that with, but you would have no idea which specific hand he actually held.

To check if a set of push fold strategies is an equilibrium in a HU game we'd need to consider the following.

1. Take the Small Blind's strategy. Given the hand range that the Big Blind is calling our shoves with, can we increase our EV by folding any of the hands we are shoving? Or shoving any of the hands we are folding?
2. Now take the Big Blind's strategy. Given the hand range that the Small Blind is shoving with, can we increase our EV by calling with any hands we are folding? Or by folding any of the hands that we are calling with?

If after checking 1 and 2 above we conclude that neither strategy can be changed to increase its EV, then we have a Nash Equilibrium. A simple example of a push/fold Nash Equilibrium in a HU game with 10BBs is here.
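The Small Blind half of that check boils down to comparing the EV of shoving a hand with the EV of folding it. A minimal sketch, assuming blinds of 0.5 BB and 1 BB, no antes, ties folded into the equity number, and EVs measured in big blinds relative to our stack before posting. The fold frequency and equities in the example are invented inputs, not values from a real equilibrium.

```python
def ev_fold() -> float:
    """Folding the small blind forfeits the 0.5 BB already posted."""
    return -0.5


def ev_shove(stack_bb: float, bb_fold_freq: float, equity_when_called: float) -> float:
    """EV of open shoving from the small blind, in big blinds."""
    # When the big blind folds, we pick up their 1 BB.
    steal = bb_fold_freq * 1.0
    # When called, we win the effective stack with probability `equity`
    # and lose it otherwise: equity * S - (1 - equity) * S = S * (2 * equity - 1).
    showdown = (1.0 - bb_fold_freq) * stack_bb * (2.0 * equity_when_called - 1.0)
    return steal + showdown


def should_shove(stack_bb: float, bb_fold_freq: float, equity_when_called: float) -> bool:
    return ev_shove(stack_bb, bb_fold_freq, equity_when_called) > ev_fold()


# Hypothetical inputs: 10 BB stacks, big blind folds 60% of the time.
for equity in (0.35, 0.40):
    print(equity, round(ev_shove(10, 0.6, equity), 3), should_shove(10, 0.6, equity))
```

Running this comparison for every hand in the Small Blind's range, and the mirror image for the Big Blind's calling range, is what steps 1 and 2 amount to.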

Note that, if we are being precise, the push/fold equilibrium is not actually an equilibrium of a regular poker game because it doesn't consider what would happen if the small blind raised to, say, 2x rather than shoving. In push / fold equilibrium we limit the set of strategies we consider to only strategies that push or fold. In practice, if your opponent might actually, say, min-raise and then fold, and you as the big blind respond by shoving over your opponent's min-raise, then you are not actually playing a Nash Equilibrium strategy and are not guaranteed to achieve a specific EV.

At a high level Nash Equilibrium strategies can never rely on "tricking" our opponent.  For example, if there is some river bluff shove that we would actually only ever make with a busted draw and never with the nuts, then an opponent who knew our strategy would never fold to that bluff.  However, Nash Equilibrium strategies still make plays that are very difficult for our opponent to react to, not by being sneaky, but by instead balancing the hand ranges that we take specific actions with such that it is impossible for our opponent, even if they know our strategy, to read our specific hand and take advantage of us.  For example, an equilibrium strategy might make the same river bluff shove with a busted draw, so long as it also occasionally shoved the river for value in the same situation with the nuts.
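The river bluff example can be put into numbers. This sketch makes a simplifying assumption that the caller holds a pure bluff-catcher, a hand that beats every bluff and loses to every value bet. `pot` is the pot size before the bet, and the function names are my own.

```python
def ev_call(pot: float, bet: float, bluff_fraction: float) -> float:
    """Caller's EV when `bluff_fraction` of the bettor's shoves are bluffs."""
    win = bluff_fraction * (pot + bet)     # we win the pot plus the bet
    lose = (1.0 - bluff_fraction) * bet    # we lose the amount we called
    return win - lose


def indifference_bluff_fraction(pot: float, bet: float) -> float:
    """Bluff share at which calling and folding both have an EV of zero."""
    return bet / (pot + 2.0 * bet)


pot, bet = 100.0, 100.0
print(ev_call(pot, bet, 1.0))  # bettor never has the nuts here
print(indifference_bluff_fraction(pot, bet))
print(ev_call(pot, bet, indifference_bluff_fraction(pot, bet)))
```

When the shove is always a bluff, calling is hugely profitable, so the bluff cannot be part of an equilibrium. Once enough value hands are mixed in, the caller's EV of calling drops to zero and knowing our strategy no longer helps him.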

Nash Equilibrium strategies also make fundamentally solid decisions regarding relative hand strengths, equities, pot odds, and probability 100% of the time. For example, a Nash Equilibrium strategy will never fold the nuts, it will never call a bet with a hand that doesn't have enough equity against an optimal opponent's range to make the call plus EV, it will never call with a weak hand in the same situation that it would fold a strong hand, it will never use an inefficient bet size, etc. All those small optimizations mean that even when a Nash Equilibrium strategy is not directly exploiting our opponents it will still tend to crush weak players, just by being fundamentally solid and error free.

Obviously push / fold strategies barely scratch the surface of poker strategy so naturally we'll want to consider Nash Equilibria that model more complex poker decisions with multiple betting rounds, varying bet amounts, etc.

GTORangeBuilder was the first program to go beyond push/fold strategies and to make postflop Nash Equilibria accessible and easy to compute and analyze. Currently, GTORangeBuilder focuses on heads up river postflop scenarios and it simplifies the space of strategies by only considering specific bet sizes, and by condensing all the information from actions taken prior to the river into a few key components:

1. The size of the pot and the effective stack sizes when the flop was dealt
2. The starting hand ranges of both players when the flop was dealt
3. A list of bet sizes for each player to use on each street.

It then generates complete strategies for each player that are a Nash Equilibrium. Specifically, this means it lists, for each player, what they should do with every hand in their range in every possible scenario that might occur with the given list of bet sizes. The best way to understand this is to see it in action, so check out this post for a practical example of looking at Nash Equilibrium play with GTORangeBuilder and how to use that to improve your everyday decisions at the tables.