Friday, January 13, 2017

Brains vs AI: My Prediction and some Tips for the brains

My Skype has been inundated with questions and prediction requests regarding the ongoing brains vs AI matchup, so I thought I'd take some time to write down my official prediction and also help point the brains in the right direction for beating the bot.

For those of you who haven't been following it, after the brains defeated Claudico in the last major human vs bot challenge, the latest AI from CMU is named Libratus, and it is back to play 120,000 hands vs Dong Kim, Jason Les, Jimmy Chou, and Daniel McAulay.  Furthermore, after two days of play (~8k of the 120k hands) the AI is up against 3 of the 4 players and up significantly (1500bbs) overall.

As a result the betting lines have moved such that the AI is now favored to win the whole thing after starting out as a 4-1 dog.  I'm going to boldly go on record as saying that the betting lines are wrong, the humans will stage a comeback, and the AI will not win this year.  All that is under the assumption that the humans actively look for leaks not just in its ranges but in its reactions to bet sizing.  If they just play their standard game they will likely lose.  I'll give some specific advice on how to attack the AI below.

For what it's worth I think the technology to make a human level HUNLHE bot is there, but that it involves combining a lot of state of the art technology in just the right way and I don't believe the researchers will get it right this try.  My medium to longer term outlook for the future of humanity in HUNLHE is very bleak.

How to beat a GTO bot

GTO bots are generally constructed around the principle of taking a set of pre-computed GTO solutions and then interpolating them (often with some learning component) to figure out how to react to bet sizing that is outside of the pre-computed game tree.  As far as I know the details of Libratus' specific algorithms have not been released so I'll have to make some assumptions about the general construction of GTO bots.  DeepStack, a cutting edge bot that recently made some questionable claims about "beating" human professionals, has detailed more of its architecture in a published paper so I am basing some of this analysis on that approach.
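To make the interpolation idea concrete, here is a minimal Python sketch of how a bot might blend two pre-computed responses when it faces a bet size that sits between them.  This is only an illustration under my own assumptions: the function name, the linear blending rule, and all of the frequencies are hypothetical, and nothing here is taken from Libratus or DeepStack.

```python
def interpolate_response(size, low_size, low_strategy, high_size, high_strategy):
    """Linearly blend two precomputed response strategies for an off-tree bet size.

    Each strategy maps an action name to a frequency for one particular hand.
    Linear blending is an assumption for illustration, not a known bot design.
    """
    if not low_size <= size <= high_size:
        raise ValueError("size must lie between the two precomputed sizes")
    if low_size == high_size:
        return dict(low_strategy)
    # t = 0 reproduces the low-size solution, t = 1 the high-size solution.
    t = (size - low_size) / (high_size - low_size)
    actions = set(low_strategy) | set(high_strategy)
    return {
        a: (1 - t) * low_strategy.get(a, 0.0) + t * high_strategy.get(a, 0.0)
        for a in actions
    }


# Hypothetical frequencies for one hand facing a 3-bet (not from any real solver).
vs_3bet_to_5 = {"fold": 0.10, "call": 0.70, "4bet": 0.20}
vs_3bet_to_9 = {"fold": 0.50, "call": 0.40, "4bet": 0.10}

# Facing a 3-bet to 7, halfway between the two precomputed sizes.
vs_3bet_to_7 = interpolate_response(7, 5, vs_3bet_to_5, 9, vs_3bet_to_9)
```

The blended frequencies still sum to 1, but nothing guarantees that they are close to the true GTO response at the in-between size, and that gap is exactly the kind of thing a human opponent can hunt for.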

Because of the way GTO bots are constructed, if you play within the precomputed GTO solution's bet sizing abstraction you are guaranteed to lose.  When HU limit hold'em was solved it directly implied that any version of NLHE which was restricted to a small number of "fixed" sizes, even if they are percentages of the pot rather than fixed amounts, was also solvable.  Anyone with a bit of programming experience and a budget could go to SPF, buy some preflop solution, and trivially make a GTO bot that would be unbeatable if you agreed in advance to only ever bet some specific pot %s, e.g. 50% or 100% pot postflop, always 3x, limp or fold pre, always 3-bet to 9, etc.

The only way to attack the bot is going to be to attack its bet sizing abstraction.  The difficult and to date unsolved part of building an unbeatable poker bot comes entirely from correctly determining how to react to bet sizings outside of its abstracted solutions.  Note that by the definition of GTO strategies, you cannot stay within its bet sizing abstraction, play non-standard ranges preflop and on the flop, and then hope to somehow exploit it on the turn and river, unless you can go outside its bet sizing abstraction in a systematic exploitative way on those later streets.  Understanding that the only way to beat the bot is to attack its abstractions is the first key step.

Libratus seems to be taking things a step further than the naive approach I suggested above, by re-solving the turn and river dynamically during a hand, presumably with a large number of bet sizes.  This adaptation allows the bot to play a preflop/flop strategy that may be based on a GTO computation that only had 2 turn and river sizes, but then re-solve the turn and river with a much larger set of bet sizings on the fly.  What this addition means is that if you reach the turn with a range that is close to what the preflop and flop components of the bot's solution dictate, then you are likely already screwed.  It will be able to solve a very large version of that turn/river branch of the game tree with a large number of bet sizes, and your ability to attack it on the turn and river will be very limited IF you play within its bet sizing abstraction preflop and on the flop.

Thus the key to beating the bot is to find holes in the preflop and flop bet sizing abstraction.  In particular, one should look for weak reactions to non-standard 3-bet sizes and 4-bet sizes as a primary means of attack.  Flop check-raises may be vulnerable as well.  The tricky part is doing so with a sensible range.

I'm going to illustrate how you would attack a bot by using non-standard 3-bet sizing as an example.  This all assumes that one has unlimited time and unlimited resources which of course the brains in this challenge do not.  That said, a reasonable approach would be to do the following.

  1. Get a HUNL GTO preflop solution with the sizes the bot seems to use itself
  2. Run a few HUNL GTO preflop simulations with unusual 3-bet sizes, pick one that performs well even against a perfect response
  3. See if the bot ever shows down a hand that should be in the range of the solution from step 1 but should not be in the range of the solution from step 2
  4. If so you've found a leak
  5. Take the reaction to a 3-bet from step 1 and lock that strategy in to the solution you chose from step 2
  6. Observe what a minimally exploitative strategy is
  7. Keep an eye on what you observe in terms of the bot's reaction ranges to your non-standard 3-bet size.  Its reaction strategy may be interpolated from two GTO solutions with bet sizes near your 3-bet size (e.g. if you 3-bet to 7 it might interpolate between a 3-bet to 5 and a 3-bet to 9) or it might be using some learning algorithm to try and reduce its mistakes over time, or they might be updating it at night.

I think that if the brains use the next 10-20k hands to test the bot's reactions to unusual preflop and flop sizes, in situations where the odd sizing is only slightly inefficient to start with, they will be able to find some holes that they can attack for the remainder of the match.
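
Step 3 in the list above boils down to a set comparison, which can be sketched in a few lines of Python.  This is a rough illustration and not a real tool: the function name is hypothetical, the ranges are simplified to plain sets of hand classes (a real solution would carry frequencies per hand), and the hands shown are made up.

```python
def find_leak_hands(shown_down, baseline_range, offsize_range):
    """Return shown-down hands that fit the baseline solution but not the off-size one.

    baseline_range: hands the step 1 solution continues with (standard 3-bet size).
    offsize_range:  hands the step 2 solution continues with (unusual 3-bet size).
    Both are simplified to sets of hand classes for illustration.
    """
    return sorted(
        hand
        for hand in set(shown_down)
        if hand in baseline_range and hand not in offsize_range
    )


# Hypothetical ranges and observations, not taken from any real solution.
baseline_continue = {"AA", "KK", "AKs", "76s", "K9o"}
offsize_continue = {"AA", "KK", "AKs"}
observed_showdowns = ["AKs", "76s", "K9o", "76s"]

leaks = find_leak_hands(observed_showdowns, baseline_continue, offsize_continue)
```

Any hand that comes back from this check suggests the bot is responding to the unusual size as if it were the standard one, which is the signal that step 4 calls a leak.  A single showdown proves little when strategies are mixed, so in practice you would want to see the pattern repeat.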

If they can consistently reach the turn in spots where the bot's estimate of the GTO range for them (and it) to hold at that point is significantly wrong, then its dynamic re-solving will only lead it astray as it will input incorrect starting ranges and thus output an incorrect strategy.  The key is just to get outside of its bet sizing abstraction early in the hand where it has to be more sparse.

Despite the bad early start for the brains, I still think that it is unlikely that Libratus is unexploitable and that, assuming it is attempting to play near GTO, the brains, given time, should be able to find those leaks and attack them without fear of counter exploitation.  As long as the brains realize that their "standard game" isn't sufficient and take a focused and structured approach to identifying the leaks that the bot has as a result of its bet sizing abstraction and attacking them, I think there is still hope for humanity.

Sunday, September 4, 2016

New MTT Focused Strategy Pack Released -- Plus Strategy Pack Sale through 9/21/2016

I'm very excited to announce the launch of my latest strategy pack, which explores the underlying theory behind MTTs, focusing on early and midstage chip valuations, preflop strategy and postflop play. The pack is one of the biggest undertakings I've done to date and includes new techniques for modeling skill edge and chip value in early/mid stage tournaments as well as a lot of specific strategy recommendations.  The pack is available for purchase in the GTO dojo here: A preview of the main video is below.

And through September 21st we're offering a special discount where anyone who buys the new MTT Theory and Practice strategy pack can get $50 off any one other strategy pack of their choice.  Just purchase both packs from our website and email and I'll refund $50 to you within 24 hours.

Note that the $50 off only applies to strategy packs that I made, so it applies to any pack in the GTO Dojo with the exception of the Spins/HUSNG packs.

Thursday, August 18, 2016

MTT Preflop SPF Solution Pack: Late Position Opening and Big Blind Defense

As many of you know I'm hard at work on an MTT focused GTORB strategy pack, which will include a ton of simulation results analyzing chip EV and dollar EV at various points in the tournament, preflop play recommendations, postflop play analysis, and much more.

The first step towards that was working with a friend who is an MTT specialist to make a SPF preflop solution pack that we believe shows substantially stronger open ranges, 3-bet sizes, and defense range composition than existing population play or computational results.

The SPF pack is available here and is 5% off for GTORB readers with this link:

SPF solution packs are standalone and can be used with the free SPF version which is available here:

I've also made a youtube video that briefly explains some of the methodology and shows some of the ranges from the solution pack which you can watch below.


Tuesday, July 12, 2016

July Strategy Pack Sale

Through the end of July I'm offering $50 off when you purchase any two strategy packs from the  Just purchase them both at the regular price and email mentioning the sale and I'll refund $50 of the purchase within 24 hours.

Wednesday, June 29, 2016

Calculate Aggregate GTO Action Frequencies with SPF

In my latest free youtube video I've made a quick tutorial on how to use Simple Postflop to compute average GTO c-bet frequencies so that you can directly compare those numbers to DB / HUD stats.  I used the data from the 6-max preflop solution pack to compute the GTO BTN vs BB C-betting frequency for a 50% pot c-bet.

Note there is one technical detail that I skimmed over when presenting the methodology: when aggregating the c-bet frequencies, you need to weight each board not only by its normal weight but also by the OOP check frequency on that board, because that prior action is required for us to reach a decision point where we could c-bet.  Since OOP checks near 100% on all boards in this case, the impact of this is minor, but it is essential when you are analyzing actions that occur later in the hand.  If you do not incorporate the board-specific rate of reaching the decision point you are interested in, you can significantly mis-weight various flops.
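
The weighting described above can be written as a short Python sketch.  This is just the arithmetic under my own naming: the function and the tuple layout are hypothetical, it does not use any Simple Postflop API, and the numbers are made up to show the effect.

```python
def aggregate_frequency(boards):
    """Aggregate an action frequency across boards.

    boards: iterable of (board_weight, reach_frequency, action_frequency), where
    reach_frequency is how often the decision point is reached on that board
    (here, the OOP check frequency) and action_frequency is how often the
    action (here, the c-bet) is taken once that point is reached.
    """
    boards = list(boards)
    total_reach = sum(weight * reach for weight, reach, _ in boards)
    if total_reach == 0:
        raise ValueError("the decision point is never reached on any board")
    weighted_action = sum(weight * reach * freq for weight, reach, freq in boards)
    return weighted_action / total_reach


# Hypothetical flops: (board weight, OOP check frequency, c-bet frequency).
flops = [
    (3.0, 1.0, 0.80),  # OOP always checks, so we always reach the c-bet decision
    (1.0, 0.5, 0.20),  # OOP checks half the time, so this board counts for less
]

aggregate_cbet = aggregate_frequency(flops)

# Ignoring the reach frequency (setting it to 1 everywhere) gives a different number.
naive_cbet = aggregate_frequency([(w, 1.0, f) for w, _, f in flops])
```

With these made-up inputs the reach-weighted figure is 2.5 / 3.5 and the naive figure is 2.6 / 4.0, so the two diverge as soon as the check frequency varies from board to board.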

GTORB readers can purchase SimplePostflop at a 10% discount here:

You can also get the Simple Postflop preflop pack that I used at a discount here (it's pack #4):

Monday, June 20, 2016

Check out the latest Thinking Poker Podcast for my interview

I had the privilege this week of being on the Thinking Poker podcast, which was a lot of fun.  You can check out the full interview here:

They analyzed the hand they presented in episode 176 with the GTORB software to get a GTO perspective on the optimal play, so make sure to check that out as well.

Tuesday, May 17, 2016

Check out my interview on Red Chip Poker

I was fortunate enough to get a chance to do an interview for Red Chip Poker.  I had a lot of fun and think the podcast went really well.  For those who are curious, you can check it out here: