Monday, October 20, 2014

GTO Brainteaser #8 Bonus Solution -- Optimal Betsize Calculations

I got a request for a numeric answer to the bonus of GTO Brainteaser #8, which involves solving for an optimal bet size in a two-street bluffing game.  The solution to the non-bonus question is here and is worth reading first.  I haven't actually done a post on deriving optimal bet sizing in multistreet play before, so I thought it would be useful to demonstrate the mathematics involved.  I'm not going to restate the game structure here, so please check out the original problem statement if you are not familiar with the game.

The basic technique for calculating optimal bet sizing is as follows.

1. Rather than using a fixed betsize in your calculations, make the bet size a variable and solve for GTO strategies as a function of that variable
2. Compute the EV of the game when both players play the GTO strategies as a function of the bet size variable
3. Maximize the EV of the person making the bet with respect to the bet size variable.  That is your optimal bet size.

While this technique is quite simple conceptually, the actual algebra involved can be hairy so I usually just make Wolfram Alpha do it.  So let's get started.

To calculate the optimal betsize we will make a few assumptions that are reasonably easy to verify and that I have shown in other posts / videos.

1. The hero should always bet the nuts on the turn.  This allows him to "compound the nuts" over multiple streets which as I showed here and in more depth in my CardRunners videos is always +EV compared to betting on a single street with a polarized range.
2. On the river it will always be most profitable to shove with the nuts with our polarized range.  This is quite simple to prove and I showed it in my first CardRunners video.
3. The hero's EV with his air on the river will be 0 unless z (the fraction of his betting range that is the nuts, defined below) is so large that it is optimal for the villain to always fold the river.  However, clearly if the villain were always folding in a spot where the hero might hold air on the river, he would never call a turn bet, so any time a turn bet is called, the hero's EV with air must be 0 on the river.

Combining observations 1 and 2 we can parameterize our bet sizing strategy with a single number x, the number of chips that we plan to bet on the turn.

If we bet x chips on the turn, then we know we will jam the river and bet the rest of our chips.  Given a starting turn pot of 100 chips if we bet x chips on the turn, the river pot will be 100 + 2x when we are called.

Our river jam will thus be a bet of (150 - x) into a pot of 100 + 2x.  As a fraction of the pot, our river bet is

b = (150 - x) / (100 + 2x)

Now let's call the frequency of the villain calling the turn c.  Observation 3 tells us that since the hero's river EV is 0 with air, his EV for bluffing the turn is very simple to calculate.

EV[turn bluff] = (1-c) * 100 - c * x

Clearly the villain always folding or always calling the turn is highly exploitable so we know his turn calling strategy is mixed and we can apply indifference conditions to see that

c = 100 / (100 + x)

What about the villain's calling frequency on non-3c/2d rivers?  This will of course just be determined by indifference conditions that depend on the pot / bluff size.  Call the villain's river calling frequency rc.

EV[river bluff] = (1 - rc) * (100 + 2x) - (150 - x) * rc

Indifference conditions imply that

rc = (2x + 100) / (x + 250)
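As a quick sanity check on the two indifference conditions, here is a minimal sketch in Python.  The function names are my own for illustration, and exact fractions are used so that the familiar 2/3 calling frequencies fall out when x = 50.

```python
from fractions import Fraction

POT = 100    # turn pot in chips
STACK = 150  # effective stack in chips


def turn_call_freq(x):
    """Villain's turn calling frequency c that makes a turn bluff of x break even."""
    # 0 = (1 - c) * POT - c * x   =>   c = POT / (POT + x)
    return Fraction(POT) / (POT + x)


def river_call_freq(x):
    """Villain's river calling frequency rc against a jam of STACK - x into POT + 2x."""
    river_pot = POT + 2 * x
    jam = STACK - x
    # 0 = (1 - rc) * river_pot - rc * jam   =>   rc = river_pot / (river_pot + jam)
    return Fraction(river_pot) / (river_pot + jam)


print(turn_call_freq(50), river_call_freq(50))  # 2/3 2/3
```

With the original 50 chip turn bet both frequencies reduce to 2/3, matching the fixed bet size solution.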

Now we can write the optimal turn bluffing frequency as a function of the turn bet size as well by looking at the villain's EV for calling.  I calculated the villain's EV of calling when the bet size was 50 chips in my previous post but I will duplicate the calculation here, assuming that the hero is bluffing with frequency a with his air and always betting the nuts.  This means that (1-z) = a / (1 + a) of his betting range is air and z = 1 / (1 + a) of his betting range is the nuts.

Since the hero's EV with air on all rivers is 0, when he bluffs and we call we win the 100 chip pot plus his turn bet size in EV.  When he holds the nuts we lose our turn call of x.  On 3c/2d runouts our EV on the river is 0 because we always fold, and on other runouts we additionally call his river jam of (150 - x) with frequency rc and lose it.

EV[call] = (100 + x) * (1-z) + z * (2/44 * -x + 42 / 44 * (-x - rc * (150 -x)))

If we apply indifference conditions to say that the EV of a call must be 0, this gives a relationship between z and x that we can solve for z.

Wolfram Alpha is much better at algebra than me so I just computed that relationship here.
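For readers who would rather not click through, the same relationship can be sketched in a few lines of Python.  This is my own rearrangement of EV[call] = 0, not the Wolfram Alpha output, and the function names are made up for illustration.

```python
from fractions import Fraction


def river_call_freq(x, pot=100, stack=150):
    """Villain's river calling frequency rc = (2x + 100) / (x + 250)."""
    return Fraction(pot + 2 * x) / (pot + x + stack)


def ev_call(x, z, pot=100, stack=150):
    """Villain's EV for calling a turn bet of x when a fraction z of bets are the nuts."""
    rc = river_call_freq(x, pot, stack)
    vs_nuts = Fraction(2, 44) * (-x) + Fraction(42, 44) * (-x - rc * (stack - x))
    return (pot + x) * (1 - z) + z * vs_nuts


def nut_fraction(x, pot=100, stack=150):
    """Solve EV[call] = 0 for z: z = (pot + x) / (pot + x + average loss vs the nuts)."""
    rc = river_call_freq(x, pot, stack)
    loss_vs_nuts = x + Fraction(42, 44) * rc * (stack - x)
    return Fraction(pot + x) / (pot + x + loss_vs_nuts)


z = nut_fraction(50)
print(z, ev_call(50, z))  # 33/58 0
```

At x = 50 this gives z = 33/58, so 25/58 (about 43.1%) of the betting range is air, which agrees with the fixed bet size solution.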

Now the EV of the game for the villain, measured in pots, is just how often the hero checks the turn, which is (1-a) / 2.  The hero holds air half of the time and checks it with frequency (1-a), and by indifference conditions, when the hero bets the villain is indifferent between calling and folding and thus his EV is 0.

Since z = 1 / (1 + a), a = (1/z) - 1, so (1 - a)/2 = (2 - 1/z) / 2

So the EV of the game is (2 - 1/z) / 2 for the villain and we know z as a function of x.  Thus the optimal bet size for the hero is the value of x that minimizes his opponent's EV, (2 - 1/z) / 2, where z is between 1/2 and 1 (because our betting range is at least 1/2 nuts and at most 100% nuts).  Since this is clearly increasing in z, we just need to minimize z.

Again I calculated this using Wolfram Alpha here.  The result is that the optimal bet size is 52.69 chips.  This intuitively makes sense, as we would expect to bet slightly larger on the turn with some of the river runouts killing our action than we would without that risk.

The EV of this game is ~87.89, so the EV gain from changing the bet size in this case is tiny, roughly 0.01 chips.
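The whole three-step recipe can also be checked numerically.  The sketch below is a brute-force grid search in Python rather than the symbolic Wolfram Alpha calculation; the function names and the 0.01 chip grid step are my own choices.

```python
POT, STACK = 100.0, 150.0


def bluff_freq(x):
    """Turn bluffing frequency a that makes the villain indifferent to calling a bet of x."""
    rc = (POT + 2 * x) / (POT + x + STACK)            # villain's river calling frequency
    loss_vs_nuts = x + (42 / 44) * rc * (STACK - x)   # villain's average loss when hero has the nuts
    return loss_vs_nuts / (POT + x)                   # from EV[call] = 0, using a = (1 - z) / z


def hero_ev(x):
    """Hero's EV in chips; the villain only wins the pot when the hero checks air."""
    villain_ev = POT * (1 - bluff_freq(x)) / 2
    return POT - villain_ev


# Evaluate every turn bet size from 0 to 150 chips in steps of 0.01.
grid = [i / 100 for i in range(15001)]
best_x = max(grid, key=hero_ev)

print(f"best turn bet: {best_x:.2f} chips, hero EV: {hero_ev(best_x):.2f}")
```

On this grid the maximum lands at a turn bet of about 52.69 chips with a hero EV of about 87.89, in line with the numbers above.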

Thursday, October 16, 2014

GTO Brainteaser #8 Solution -- Multistreet Theory vs Practice

In this post I'm going to discuss the solution to GTO Brainteaser 8.  Check it out here if you missed it.  I'm also going to provide some introduction to multistreet theory and some simple examples of understanding the impact of runouts and the flow of information across streets.  A browseable GTORB version of the turn solution is also presented near the bottom for those of you interested in a sneak peek at the GTORB turn solution interface (it still needs some polish).

Solution

The brainteaser involved studying the following game:

• You are on the turn and the board is  AsAhKsKh
• The hero has a hand range of AcAd and 3c2d
• The villain has a hand range of KcKd
• The pot is 100 chips and stacks are 150 chips
• The hero can either bet 50 chips on the turn and 100 on the river or he can shove for 150 on the turn.

• The goal was to determine how and why this game was different from the nuts vs air multi-street game that we looked at in GTO Brainteaser #6 and from the multi-street polarized vs merged range theory in the mathematics of poker.

The basic answer to this question is quite simple.  The real world game with an actual deck is worse for the hero than the model game from GTO Brainteaser 6, because when a 3c or 2d hits on the river it reveals to the villain that the hero must hold the nuts.  The villain is able to convert this information into money by folding 100% on either of these rivers.  Perfectly polarized ranges are always the strongest possible ranges, so for the hero, having his range depolarized on some river runouts decreases his EV.

Note that the villain also gains information when an Ac or Ad comes on the river, but this information is not valuable because our EV when we hold 3c2d is 0 anyways.  A GTO opponent calls enough to make us indifferent between bluffing and folding, so if we are forced to always fold that doesn't actually decrease our EV.  In this case the villain still gains information but has no way to convert it into money.

We can calculate the exact EV decrease quite simply.  1/2 of the time we hold AcAd and if we recall the solution to GTO Brainteaser #6, our EV with AcAd in this spot without river runouts giving away information is 16/9th of the pot.  Now when we hold AcAd, since we know our opponent holds KcKd, there are 44 river cards that might come and 2 of them reduce our EV to 1.5 pots in the case where our opponent calls our bet.

Clearly, the villain must still make us indifferent between betting and checking our air on the turn, which means that he must call 2/3rds of the time to make the EV of a bluff 0.  Thus the hero's EV with Aces in the new game can be calculated by adding up:

1. 1/3 * 100 -- we bet and they fold
2. 2/3 * (42/44 * R)  -- hero bets, villain calls and a non-3c/2d river comes where R is the hero EV on that river
3. 2/3 * (2/44 * 150) -- hero bets, villain calls, and a 3c or 2d river comes and villain just folds

On the unblocked rivers, as in Brainteaser #6, the villain must call our bet of 100 chips 2/3 of the time, so our EV with an A on the river is R = 150 * 1/3 + 250 * 2/3 = 650/3.

Thus our EV with an A is 175.76 chips or 1.7576 pots.  Our EV loss with Aces is 16/9 * 100 - 175.76 = 2.02 chips.

Since we hold the nuts half of the time, our overall EV loss is 1.01 chips.  So the EV of the actual game for the hero is about 87.87.  As you can see in the solution browser below, GTORB computes the EV as 87.83 which is within the given margin of error of 0.05 chips.
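The arithmetic above is easy to reproduce with exact fractions.  This is just a sketch of the sum of the three cases; the variable names are mine, and the 8/9 pot baseline is the model game EV from GTO Brainteaser #6.

```python
from fractions import Fraction as F

# Hero's EV with Aces on an unblocked river: villain folds 1/3, calls 2/3.
river_ev = F(1, 3) * 150 + F(2, 3) * 250                     # R = 650/3

# Turn: villain folds 1/3; otherwise 42/44 unblocked rivers and 2/44 rivers where he folds.
ev_aces = F(1, 3) * 100 + F(2, 3) * (F(42, 44) * river_ev + F(2, 44) * 150)

loss_with_aces = F(16, 9) * 100 - ev_aces                    # versus the model game
game_ev = F(8, 9) * 100 - loss_with_aces / 2                 # hero holds Aces half the time

print(ev_aces, loss_with_aces, game_ev)  # 5800/33 200/99 2900/33
print(float(ev_aces), float(loss_with_aces))
```

The exact values are 5800/33 chips with Aces and an overall game EV of 2900/33 chips, which are the decimal figures quoted above before rounding.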

How does this EV loss affect optimal play?  Intuitively this is actually pretty simple.  Of course we still always should bet the nuts, and we should bet enough of our air that our opponent is indifferent between calling and folding to our turn bet.  In this game, where our opponent's EV when we hold the nuts is higher, we need a higher nuts to air ratio to maintain his indifference, which means that we must bluff the turn less frequently.

Mathematically figuring out the optimal bluffing frequency is a bit complex as we need to make sure to properly weight the probability of various river cards coming, using all of the villain's information about his opponent's range and his own hand.

The villain's EV for calling the turn and then playing GTO on the river is 150 when his opponent holds air (on average he wins the entire river pot of 200, but 50 of those chips were his own).  When his opponent holds the nuts, on 3c or 2d runouts his EV is -50 because he called 50 on the turn and always folds the river.  On all other runouts he calls the 50 on the turn plus an additional 100 on the river 2/3rds of the time for a total EV of -350/3 (-116.67).

If the hero is betting a fraction x of his air and all of his nuts, then when he bets he holds air x/(1+x) of the time and the nuts 1/(1+x) of the time, and given that he holds the nuts, 3c or 2d comes 2/44ths of the time.

And the villain's EV for calling a turn bet of 50 is

150 * x / (1 + x) - 1/(1+x) * (50 * 2/44 + 350/3 * (1-2/44)).

Setting that equal to 0 and solving for x gives that the hero should bet 25/33 or 75.76% of his air on the turn and 43.1% of his turn betting range should be air, which matches the GTORB solution precisely.  The villain still calls 2/3rds of both the turn and the river bet as that is all that is required to make the hero indifferent between bluffing his air and checking it.  This is our exact mathematical equilibrium solution.  You can browse the approximate GTORB solution below.
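Here is a short sketch of that solve using exact fractions.  It plugs in the per-runout EVs described above (150 against air, -50 on 3c/2d rivers and -350/3 on all other rivers against the nuts); the variable names are my own.

```python
from fractions import Fraction as F

gain_vs_air = F(150)   # villain's EV for calling when the hero holds air
p_blocked = F(2, 44)   # chance the river is the 3c or 2d, given the hero holds AcAd

# Villain's average loss for calling when the hero holds the nuts.
loss_vs_nuts = p_blocked * 50 + (1 - p_blocked) * F(350, 3)

# Indifference: gain_vs_air * x / (1 + x) = loss_vs_nuts / (1 + x)
bluff_fraction = loss_vs_nuts / gain_vs_air
air_share = bluff_fraction / (1 + bluff_fraction)

print(bluff_fraction, air_share)  # 25/33 25/58
print(float(bluff_fraction), float(air_share))
```

The result is x = 25/33, and 25/58 of the turn betting range is air, which is the 43.1% figure above.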

I also solved the optimal bet sizing bonus question in a separate post here for those who are interested.

Takeaways

This example may seem trivial, but as it turns out, the existence of river runouts shifts the range distributions of players and transfers information.  Being in a position to put as much money into the pot as possible when you have an informational edge over your opponent or when your equity distribution is polarized, and as little as possible when it is merged, is very powerful.

I'll demo a much more powerful example of this in the next brainteaser where we will see an example of how equity transitions and river information can make protecting your hand via turn bets that never fold out better hands and are never called by worse hands, still be GTO, even if you were required to pay your opponent his turn hand equity when he folds worse.

Note on epsilon equilibrium:  One quick note on the GTORB solution, which is a 0.05 chip (5/10,000ths of the pot) epsilon equilibrium.  Due to the approximation techniques used, the GTORB strategy actually has the hero checking the nuts on the turn a tiny fraction of a % of the time.  This has almost no impact on the game EV or solution accuracy but it does mean that if you examine a river after both players check you will see the hero bet with a very low frequency.  This is because the approximate solution has him holding the nuts with a tiny probability.  These rounding errors are why the solution has a Nash distance of 0.05 chips, which in this case means an opponent who played perfectly could exploit the approximate GTO strategy for 0.05 chips out of the 100 chip pot.

Thursday, October 9, 2014

GTO Brainteaser #8 -- Solving the Turn, Theory vs Practice

The internal alpha version of GTORB is now capable of solving turn scenarios for GTO turn and river play so this brainteaser is going to focus on multi-street theory.  In the solution (probably a week from today) I'll post the first fully browse-able GTORB turn solution to the model game below for those of you who are excited to play around with a GTO turn strategy.  Note that the version of GTORB that can solve the turn won't be released commercially for a month or two as there are some performance / scalability issues that I need to solve before it is ready for mass use.  It will likely cost extra.

The problem

In GTO Brainteaser #6 I looked at a model scenario where the hero had a range of 50% nuts, 50% air while the villain had a range of 100% medium strength hands.  There was a 100 chip pot, 150 chip stacks and two streets of betting.  The hero could either bet 50 chips on the turn and then have the option to bet 100 chips on the river or he could shove the turn for 150 chips, and the question was which option is higher EV and what are GTO strategies for both players in this game.

The key simplification that made this scenario quite different from real world poker is that it was assumed that no river card was actually dealt, there were just two rounds of betting.

For those who are curious you can check out the full solution to brainteaser #6 here.  It turns out that it is optimal for the hero to bet 50 chips on the turn with all of his nut hands and 7/9ths of his air and then to bet 100 chips on the river when he is called with all of his nut hands and 3/7ths of his air hands.  The villain calls each of these bets 2/3rds of the time and folds 1/3rd.  The hero wins 8/9ths of the 100 chip pot in EV in this game.  Furthermore, it turns out that betting 50 chips on the turn and 100 chips on the river is the exact optimal bet sizing for the hero to maximize his EV, all other bet sizes are lower EV.
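For anyone who wants to check those numbers without rereading the Brainteaser #6 solution, here is a minimal sketch that rebuilds them from indifference conditions alone.  The variable names are mine, and it takes the 50 / 100 bet sizes as given rather than deriving them.

```python
from fractions import Fraction as F

pot, turn_bet, river_bet = 100, 50, 100
river_pot = pot + 2 * turn_bet                       # 200 after a called turn bet

# Villain calls often enough that a bluff breaks even on each street.
turn_call = F(pot, pot + turn_bet)                   # 2/3
river_call = F(river_pot, river_pot + river_bet)     # 2/3

# Turn bluffing frequency: villain's gain vs air must offset his loss vs the nuts.
loss_vs_nuts = turn_bet + river_call * river_bet     # 350/3
gain_vs_air = pot + turn_bet                         # 150
turn_bluff = loss_vs_nuts / gain_vs_air              # 7/9 of air bets the turn

# River: villain risks river_bet to win river_pot + river_bet, so bluffs per nut hand = 1/3.
river_bluffs_per_value = F(river_bet, river_pot + river_bet)
river_bluff = river_bluffs_per_value / turn_bluff    # 3/7 of the air that bet the turn

# Villain only profits when the hero checks air, which is (1 - turn_bluff) / 2 of the time.
hero_ev_pots = 1 - (1 - turn_bluff) / 2              # 8/9 of the pot

print(turn_call, river_call, turn_bluff, river_bluff, hero_ev_pots)  # 2/3 2/3 7/9 3/7 8/9
```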

Let's now look at a very similar game.  Imagine the following (completely made up) scenario.

1. You are on the turn and the board is  AsAhKsKh
2. The hero has a hand range of AcAd and 3c2d
3. The villain has a hand range of KcKd
4. The pot is 100 chips and stacks are 150 chips
5. The hero can either bet 50 chips on the turn and 100 on the river or he can shove for 150 on the turn.

Clearly no matter what river card comes, the relative strengths of the hands in both players' ranges will not change, so in that respect this game seems identical to the model game from brainteaser #6.  AcAd will beat KcKd on every possible river and 3c2d will lose to KcKd on every possible river.

However, it turns out that GTO play in this game is different from GTO play in GTO Brainteaser #6.  Why?

1. What is the EV of the game for the hero, is it higher or lower?
2. What are the optimal strategies in this game and what is the hero's EV when both players play optimally?

Bonus:  Is betting half pot on the turn and the river still optimal or is there a higher EV bet size?