Thanks largely to VodkaHaze from reddit, I got an opportunity to make a game-theory-focused video series for CardRunners. The video is free to anyone for this weekend only, so definitely check it out here:

http://www.cardrunners.com/poker-videos/the-theory-of-winning-part-1-asuth/

In the video I go through GTO fundamentals, solve some example games, and expand on some of the topics I've covered regarding strategies and GTO in 3+ handed scenarios.

This week's solution is in video form:

The one aspect of the solution that is not covered in the video is the EV of the trial for an individual judge.

When all judges are playing the symmetric equilibrium strategy, the EV for a single judge is -102 when the accused is a human and 922 when the accused is a werewolf. If on average the accused is a werewolf 50% of the time then the EV of the trial is 410.

For comparison, if there were a single judge deciding honestly based solely on his own ritual result, he would average 880 when the accused is a werewolf and -20 when the accused is a human, for an average EV of 430.

Even when all the judges have identical incentives, unanimity rule reduces their expectation.
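A rough way to sanity-check these figures is the sketch below. It assumes (this is my reading of the setup, not something spelled out in the post) that in the symmetric equilibrium each judge always votes guilty after a guilty ritual result and votes guilty with some probability `sigma` after an innocent one, with `sigma` fixed by indifference in the only case where a vote matters: when the other two judges both vote guilty. The function and variable names are illustrative. Under those assumptions the single-judge numbers come out exactly, and the tribunal numbers land within about a coin of the ones quoted above.

```python
# Sketch only: the equilibrium form below is an assumption, not taken from the video.
ACCURACY = 0.9   # ritual is right 90% of the time, in both directions
WIN = 1000.0     # coins per judge for executing a werewolf
FEE = 200.0      # coins per judge for executing a human or freeing a werewolf
JUDGES = 3       # conviction requires all of them to vote guilty


def single_judge_ev():
    """EV (human, werewolf) for one judge who simply votes his ritual result."""
    ev_human = -(1 - ACCURACY) * FEE
    ev_wolf = ACCURACY * WIN - (1 - ACCURACY) * FEE
    return ev_human, ev_wolf


def unanimity_equilibrium():
    """Return (sigma, ev_human, ev_wolf) for the assumed symmetric equilibrium."""
    # A vote only matters when every other judge votes guilty. Indifference
    # after an innocent result pins down the ratio of guilty-vote rates a / b.
    ratio = (ACCURACY * FEE / ((1 - ACCURACY) * (WIN + FEE))) ** (1 / (JUDGES - 1))
    sigma = (ACCURACY - ratio * (1 - ACCURACY)) / (ratio * ACCURACY - (1 - ACCURACY))
    a = ACCURACY + (1 - ACCURACY) * sigma      # P(vote guilty | werewolf)
    b = (1 - ACCURACY) + ACCURACY * sigma      # P(vote guilty | human)
    ev_human = -(b ** JUDGES) * FEE
    ev_wolf = (a ** JUDGES) * WIN - (1 - a ** JUDGES) * FEE
    return sigma, ev_human, ev_wolf


solo_h, solo_w = single_judge_ev()
sigma, trib_h, trib_w = unanimity_equilibrium()
print(f"single judge: human {solo_h:.0f}, werewolf {solo_w:.0f}, "
      f"average {(solo_h + solo_w) / 2:.0f}")
print(f"tribunal: sigma {sigma:.3f}, human {trib_h:.1f}, werewolf {trib_w:.1f}, "
      f"average {(trib_h + trib_w) / 2:.1f}")
```

Changing `JUDGES`, `WIN`, or `FEE` shows how sensitive the gap between the tribunal and the lone honest judge is to the payoffs.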

When a citizen of the ancient city of Bayes is accused of being a werewolf, they are brought before the Tribunal to be considered for execution. The three Tribunal members can detect whether someone is a werewolf through a simple spiritual ritual involving steamed badger milk. Since the ritual is only 90% accurate (yes, badger milk is actually 90% effective for werewolf detection), each Tribunal member performs it separately, in secret, and then decides to vote guilty or innocent. The accused is only executed if the Tribunal unanimously votes guilty. The wise Tribunal members are unbiased and go into each trial believing that there is a 50/50 chance that the accused is a werewolf prior to conducting their ritual. However, the members base their vote 100% on strategic self-interest.

- If the citizen is executed and reverts to wolf-form upon death, they were a werewolf and the Tribunal is given the accused’s possessions for their wisdom and public service.
- If they do not turn into a wolf upon death, the Tribunal has executed an innocent citizen, and must each pay the citizen’s family a grievance fee. The family also gets a cake that says “Sorry guys, our bad, #sorrynotsorry”.
- When the Tribunal sets a non-werewolf free, usually not much happens. The people of Bayes eat some discarded apology cake, get drunk, and think of how it might be fun to accuse other people of being werewolves.
- If the Tribunal lets a werewolf go free, this is revealed when they turn into a wolf upon death (often at the hands of a cake-filled, drunken mob under a full moon). As a penance, the Tribunal pays the werewolf’s family the grievance fee and gets none of the werewolf’s possessions.

Imagine that you are one of the members of the Tribunal of Bayes presiding over the fate of one of the richest men in town. If you all vote to execute him, and are correct, you will each get 1000 gold coins (which buys a lot of cake in Bayes). If you unanimously execute him, and he was innocent, or if you let him go, and he is later found to be a werewolf, you must each pay a 200 gold coin fee. If you correctly set him free, you gain/lose nothing.

What is an equilibrium (GTO) strategy for voting based on the result of the ritual? What is the expected value in gold coins for each tribunal member when they all follow the equilibrium voting strategy, and what is the probability that they convict an innocent citizen or that they release a guilty citizen? How would these numbers change if the Tribunal used majority rule rather than requiring unanimity?

EDIT: To clarify the efficacy of the badger milk ritual: it is 90% accurate in both directions. That is, if the accused is a werewolf, there is a 10% chance the badger milk ritual will say that he is not, and if the accused is not a werewolf, there is a 10% chance that the badger milk ritual will say that he is.

You can check out the full solution here.

Today I'm going to walk through the solutions to the GTO True or False quiz. As I warned in the post, the quiz was quite hard: in aggregate, only about 52% of questions were answered correctly, just slightly better than random guessing. In case you missed the quiz, you can take it here.

**Question 1:** "Betting on the river with a hand in a situation where a GTO opponent never calls with a worse hand, and never folds with a better hand cannot be part of a GTO Strategy in Heads Up NLHE"

**Answer: False**

Overall this is one of the trickiest questions, although there is a very simple situation in which the statement is clearly false. If you imagine you are on the river and the board has a royal flush, then shoving is clearly GTO, and a GTO opponent will never fold worse or call with better. Shout out to reddit user yellowstuff for noting that.

There is also a more interesting set of examples where betting in this type of situation is profitable.

Besides making your opponent call with worse or fold better, the one thing a bet can accomplish is to limit your opponent's bet-sizing options. It turns out that, to some extent, something like a blocking bet can be GTO, which is quite surprising.

In spots where you have a bluff catcher but your range also includes the nuts reasonably often, it can be GTO to lead small some % of the time with nuts, air, and bluff catchers. By making it a lot more expensive for your opponent to bluff at you (they now have to raise), you get more value out of your nuts when they raise, win the pot with a tiny bet when they fold to your air, and make them unable to use the most profitable bet size against a check/call with a bluff catcher.

The Mathematics of Poker talks about this in AKQ game #5, where the authors solve a full no-limit version of the AKQ game and demonstrate that the GTO strategy for the out-of-position player is to occasionally bet his kings. I think it's one of the most interesting parts of the whole book; they call it a preemptive bet. They actually work out the math in detail and it's worth checking out, but the intuition is what I laid out above.

**Question 2:** "Two players, both playing GTO strategies, are playing two hands of heads up in a rake free game of NLHE. Player 2 has a leak, where every time he is supposed to take an action with probability 100% according to his strategy, he instead takes that action 99% of the time, and randomly chooses another action 1% of the time. Player 2 will have EV < 0 vs. his opponent."

**Answer: True**

This one is pretty simple if you consider the types of errors that Player 2 will make. For example, when he is first to act preflop, Player 2 will fold AA 1% of the time. This is a significantly -EV decision. If both players played GTO for 2 hands, then Player 2's EV would be 0. By the definition of a Nash Equilibrium, none of his random errors can increase his EV, and some of them (like folding the nuts) will be strictly -EV, so his overall EV will be strictly less than 0.

**Question 3:** "In a 3 handed game of NLHE with no rake, two players are playing GTO strategies, the third player is not. The third player must have EV <= 0 and the GTO players must both have EV >= 0."

**Answer: False**

I explained this in depth here. This was the question that people most frequently got wrong.

**Question 4:** "If two players reach the river with ranges that have 50% equity and they both play GTO strategies on the river, then the player who is in position cannot have a lower EV than his opponent."

**Answer: False**

The easiest way to solve this one is to imagine that Player 1, who is in position, has a range that is 100% medium-strength hands and Player 2 has a range that is 50% nuts, 50% air. As long as Player 2 never folds the nuts, he is guaranteed to win at least 50% of the pot and so have at least equal EV, and if he can ever make his opponent fold to a bluff or call a value bet, he will win more than his opponent. A very simple strategy, betting the pot whenever he has the nuts and 50% of the time when he has air, is GTO and guarantees him an EV triple that of his opponent.

This is discussed in more depth here.
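The "triple" figure is easy to check numerically. The sketch below is only an illustration (the function name and parameters are mine): it treats the pot as 1 unit, has Player 2 bet pot with all of his nuts and half of his air, and computes both players' share of the pot for any calling frequency Player 1 might choose.

```python
def river_evs(call_freq, pot=1.0, bet=1.0, nut_share=0.5, bluff_freq=0.5):
    """Return (ev_player2, ev_player1), measured as shares of the pot.

    Player 2 holds the nuts with probability nut_share and air otherwise.
    Player 1 always holds a bluff catcher and calls a bet with call_freq.
    """
    air_share = 1.0 - nut_share
    # Nuts always bet: win the pot, plus the bet whenever Player 1 calls.
    ev_nuts = pot + call_freq * bet
    # Air that bluffs: wins the pot on a fold, loses the bet on a call.
    ev_bluff = (1.0 - call_freq) * pot - call_freq * bet
    # Air that checks loses at showdown and contributes nothing.
    ev_player2 = nut_share * ev_nuts + air_share * bluff_freq * ev_bluff
    ev_player1 = pot - ev_player2
    return ev_player2, ev_player1


for call_freq in (0.0, 0.5, 1.0):
    ev2, ev1 = river_evs(call_freq)
    print(f"call {call_freq:.0%}: Player 2 gets {ev2:.2f}, Player 1 gets {ev1:.2f}")
```

Whatever Player 1 does, the split stays at 0.75 to 0.25 of the pot, which is what makes the betting strategy unexploitable in this toy game.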

**Question 5:** "Suppose we solve a specific river scenario (with pencil and paper, or with a program) for a GTO strategy. A friend shows us a strategy that claims to be GTO for the full game of HUNLHE. If it plays differently in our river scenario then it cannot be GTO."

**Answer: False**

This question gets into the idea of off-equilibrium-path behavior. A pair of GTO strategies defines something called an equilibrium path, which is the set of situations that will occur with non-zero probability when the two GTO strategies play against each other.

The definition of Nash Equilibrium requires that there is no profitable way for either player to deviate off of the equilibrium path and increase his overall EV. It **does not** require that the GTO strategy plays perfectly off of the equilibrium path.

As a simple example, suppose that it is not GTO to get to the river with 27o in some specific situation S in the game of HUNLHE as a whole. Then it is entirely possible that a strategy that is GTO in the entire game of NLHE will play quite suboptimally in the situation S against a player who does hold 27o.

**Question 6:** "Two bots playing a shove or fold preflop game with 1000 BB stacks. You observe that Player 2 is calling Player 1's shoves less than 0.1% of the time over an infinitely large sample. These bots are not playing GTO shove/fold strategies."

**Answer: False**

One might think that if our opponent is calling less than 1 in 1000, and we win 1.5BB when we shove and he folds, then shoving any two cards would auto profit.

However, the equilibrium of this game is to shove only with AA and to call only with AA. Due to card removal effects, the odds that Player 2 has AA, given that Player 1 has AA, are less than 1 in 1000 (there is only 1 AA combo left among the 1,225 possible hands). Were someone to try shoving a different hand (say KK), they would get called about 1 time in 204 (6 combos out of 1,225), which would make the shove unprofitable.

In general you cannot just use an observed calling frequency to determine the profitability of a bet. You have to consider the conditional probability that your opponent will call your bet, given the cards you hold.
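The card removal arithmetic can be worked through in a few lines. This is a rough sketch rather than a full shove/fold solver: the helper names are mine, the 18% equity for KK against AA is an assumed round number, and the blinds are ignored in the branch where the shove gets called. For reference, the exact conditional figure for KK is 6 combos out of 1,225, about 1 in 204, slightly more often than the unconditional 1 in 221.

```python
from math import comb


def prob_opponent_has_aces(aces_in_our_hand):
    """P(opponent holds AA | our two hole cards), by pure card counting."""
    aces_left = 4 - aces_in_our_hand
    return comb(aces_left, 2) / comb(50, 2)   # 50 unseen cards remain


def shove_ev(p_call, equity_when_called, stack=1000.0, fold_gain=1.5):
    """Rough EV in BB of a shove; ignores the blinds in the called branch."""
    ev_called = (2 * equity_when_called - 1) * stack
    return (1 - p_call) * fold_gain + p_call * ev_called


KK_VS_AA_EQUITY = 0.18   # assumed round number, not from the post

p_call_vs_aa = prob_opponent_has_aces(2)   # we hold AA: 1 combo of 1225
p_call_vs_kk = prob_opponent_has_aces(0)   # we hold KK: 6 combos of 1225

# Naive: plug the observed calling frequency into the EV of shoving KK.
naive_ev = shove_ev(p_call_vs_aa, KK_VS_AA_EQUITY)
# Conditional: use the calling frequency KK actually faces.
actual_ev = shove_ev(p_call_vs_kk, KK_VS_AA_EQUITY)

print(f"observed call rate {p_call_vs_aa:.4%}, call rate facing KK {p_call_vs_kk:.4%}")
print(f"KK shove EV: naive {naive_ev:+.2f} BB, conditional {actual_ev:+.2f} BB")
```

The naive estimate comes out positive and the conditional one negative, which is the whole point of the question: the frequency you observe is conditioned on the hands the shover actually shoves.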

This week's brainteaser is a true or false quiz on GTO concepts. Getting them all right is only part of the goal; ideally you should be confident that you could convince a skeptical friend of the correct answer to each question. It's hard, good luck :)

If you want to discuss the questions/answers you can join the reddit discussion here.
