Two Plus Two Newer Archives Fictitious play for multi-player games

#1
11-15-2007, 12:41 AM
 jukofyork Senior Member Join Date: Sep 2004 Location: Leeds, UK. Posts: 2,551
Fictitious play for multi-player games

If fictitious play is used to compute a NE in a multi-player game where it is possible for a player to "spite" another (such as in SNGs), is it correct to assume, for the update rule, that each player only attempts to rationally maximise their own EV?

This would mean that you would roughly use this update algorithm:

1. Init strategy to something arbitrary (A).
2. Find the maximally exploitative strategy against A (B).
3. Find the maximally exploitative strategy against a player who plays A or B, each with a 50% chance of being selected (C).
4. Find the maximally exploitative strategy against a player who plays A, B or C, each with a 33.3% chance of being selected (D).
5. Find the maximally exploitative strategy against a player who plays A, B, C or D, each with a 25% chance of being selected (E).
.
.
.
N. Stop when no further exploitative strategy can be found against the strategy collection (or when the exploitative EV drops below some reasonable threshold).
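
The update loop above can be sketched in a few lines of Python. This is only a minimal illustration, not the ICM Nash Calculator's actual code: it assumes a tiny symmetric zero-sum game (rock-paper-scissors, picked here only because its equilibrium is known) in place of an SNG, and the names `PAYOFF`, `evs_against`, `best_response` and `fictitious_play` are made up for the example.

```python
# Hypothetical example game: rock-paper-scissors. PAYOFF[i][j] is the EV of
# playing pure strategy i against pure strategy j (an assumption for this
# sketch; a real SNG solver would work with push/fold ranges instead).
PAYOFF = [
    [0, -1, 1],   # rock
    [1, 0, -1],   # paper
    [-1, 1, 0],   # scissors
]


def evs_against(mix):
    """EV of each pure strategy against an opponent playing the mixture."""
    return [sum(p * row[j] for j, p in enumerate(mix)) for row in PAYOFF]


def best_response(mix):
    """Index of the maximally exploitative pure strategy against the mixture."""
    evs = evs_against(mix)
    return max(range(len(evs)), key=lambda i: evs[i])


def fictitious_play(iterations=20000, start=0):
    """Return the average of all strategies found, each weighted equally."""
    counts = [0] * len(PAYOFF)
    counts[start] = 1  # step 1: arbitrary initial strategy A
    for _ in range(iterations):
        total = sum(counts)
        mix = [c / total for c in counts]  # A, B, C, ... equally likely
        counts[best_response(mix)] += 1    # add the new best response
    total = sum(counts)
    return [c / total for c in counts]


if __name__ == "__main__":
    average = fictitious_play()
    print("average strategy:", [round(p, 3) for p in average])
    print("best exploitative EV against it:",
          round(max(evs_against(average)), 4))
```

In this two-player zero-sum example the averaged strategy drifts toward playing each option a third of the time. That kind of convergence is not guaranteed in general for multi-player games, which is why the stopping test in step N matters.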

This is the basic idea that the ICM Nash Calculator is using.

But just assuming that each player will attempt to rationally maximise their own EV seems to ignore the fact that they could also gain EV by another method: you could give up some of your own EV to cost an opponent even more, which in turn would force the opponent to change their strategy. You might then gain EV simply by threatening to give some up (ie: by partially minimizing your opponent's EV).

This seems to fly in the face of what I know about NE states though, as it shouldn't be possible for a player to deviate profitably from a NE (assuming the above algorithm really does converge to a NE).

Is the reasoning flawed here somewhere? If so, can you think of a simple game as an example?

Is the above algorithm flawed? If so, then what alteration would be required for the update rule?

This post is related to a post in the STT forum which discusses taking a -EV play now to force your opponent to alter his strategy so as to possibly gain more EV in the future. The reason this might be correct is that the play only costs you ~$30, yet costs your opponent ~$300:

http://forumserver.twoplustwo.com/sh...age=0&amp;vc=1

This led me to wonder what the correct method is to compute a NE where it's possible to "spite" an opponent like this:

http://forumserver.twoplustwo.com/sh...age=0&amp;vc=1

Juk :)
#2
11-15-2007, 05:43 AM
 plexiq Senior Member Join Date: Apr 2007 Location: Vienna Posts: 138
Re: Fictitious play for multi-player games

[ QUOTE ]
This seems to fly in the face of what I know about NE states though, as it shouldn't be possible for a player to deviate profitably from a NE (assuming the above algorithm really does converge to a NE).

[/ QUOTE ]

Well, the definition of a NE is still satisfied. No player can unilaterally deviate from the NE to gain value. That's true in your example - if only 1 player deviates, he can't improve.

If you assume bot-like players who won't deviate from the NE no matter what happens, then your best choice is to play the NE as well.

That said, I think the problem you describe is inherent in the definition of the NE. It doesn't really matter what algorithm we would use to find (or approximate) the NE.

Some kind of "raw" first idea, didn't really think it through yet:
Instead of optimizing the "current" equity (ie, playing maximally exploitative), each player tries to "drag" the strategies in a direction that will give him better equity than the current state - but only as long as his deviation from maximally exploitative play costs the respective opponents more EV than him.

This should converge to a more "robust" set of strategies. But then, these strategies will be easily exploitable by opponents who simply skip their "spite calls".

This gets pretty interesting if you think about it. If we draw random players from a population of 50% NE players and 50% "spite callers" and put them into a game, the spite caller population would have a higher expectation in this game, I think.

Need to think it through before posting any more. I hope the above makes some sense, lol.
#3
11-15-2007, 12:25 PM
 The 13th 4postle Senior Member Join Date: Oct 2006 Posts: 378
Re: Fictitious play for multi-player games

Since poker is a mixed strategy game, there will be multiple NE. No single set of decisions is the right play; mixing up your strategy is more profitable because it is a repeated game.

Anything that makes your opponent play differently in a way that you can take advantage of in the future is optimal, and if you are able to do it you should. However, that's harder to do online than live.
#4
11-15-2007, 01:19 PM
 trojanrabbit Senior Member Join Date: Aug 2004 Location: dominated and covered Posts: 188
Re: Fictitious play for multi-player games

I think the difference lies in the definition of the "game." Fictitious play will work (I've used it) if you assume the current hand is a one-shot deal. There are no more interactions after the current hand. However if you extend the definition of the game to cover multiple hands then it gets a lot more complicated.

A perfect example is when there is a big stack bullying the table near the bubble. Nash says the big stack should raise almost every hand and the small stacks should almost always fold. However, just being in this situation is -EV for the small stacks. It would be in a small stack's long-term interest to call more liberally and punish the raiser. This is an attempt to get the big stack to stop his bullying. But you have to take a -EV move now in order to try and stop being in continually -EV situations in the future.

But that would be way too complicated to figure out with a computer...

Tysen
#5
11-15-2007, 01:39 PM
 jukofyork Senior Member Join Date: Sep 2004 Location: Leeds, UK. Posts: 2,551
Re: Fictitious play for multi-player games

[ QUOTE ]
Some kind of "raw" first idea, didn't really think it through yet:
Instead of optimizing the "current" equity (ie, playing maximally exploitative), each player tries to "drag" the strategies in a direction that will give him better equity than the current state - but only as long as his deviation from maximally exploitative play costs the respective opponents more EV than him.

[/ QUOTE ]
Yep, this is what I was thinking, but "dragging" the values could be very computationally expensive to try. The basic idea would be to somehow "drag" your own strategy into the space where it is -EV for you and see how that affects your opponent's maximally exploitative strategy. The current update rule never considers these -EV calls.

Perhaps rather than "dragging", this could be accomplished by some kind of recursive update rule which is roughly O(n) times more complex? One idea would be to find the gradient of your EV with respect to each variable of the strategy and then update each strategy variable by moving in the direction which increases your EV (as opposed to updating it based on whether it is +EV or -EV for you to play against the current opposing strategy).

I've still not thought about this much yet so the idea might be flawed or there might be a much simpler way to combine the maximally exploitative strategy with the maximally spiteful strategy and update the rules based on both.

[ QUOTE ]
This should converge to a more "robust" set of strategies. But then, these strategies will be easily exploitable by opponents who simply skip their "spite calls".

[/ QUOTE ]
I don't think it could really be exploited, as it's the threat of the spite calls more than the calls themselves that's important. The equilibrium should mean that if player A deviates by not spite calling player B anymore, then player B won't be making the pushes that are punished by the spite calls anyway, so nothing has changed. If player B decides to push these anyway, knowing that he'll be spite called, then he's just made his strategy -EV compared to if he had respected player A's spite calls.

[ QUOTE ]
This gets pretty interesting if you think about it. If we draw random players from a population of 50% NE players and 50% "spite callers" and put them into a game, the spite caller population would have a higher expectation in this game, I think.

[/ QUOTE ]
That's quite interesting and would make an interesting experiment. What would happen if you tried to train up a maximally exploitative strategy to play against this mixed NE/spite player? Perhaps this would be a more robust strategy than NE alone?

[ QUOTE ]
Need to think it through before posting any more. I hope the above makes some sense, lol.

[/ QUOTE ]
Yep, some of my ideas might be totally off here too - I've just woken up and not really thought too carefully about all this yet, but overall it makes for some interesting thinking!

Juk :)
#6
11-15-2007, 02:08 PM
 jukofyork Senior Member Join Date: Sep 2004 Location: Leeds, UK. Posts: 2,551
Re: Fictitious play for multi-player games

[ QUOTE ]
I think the difference lies in the definition of the "game." Fictitious play will work (I've used it) if you assume the current hand is a one-shot deal. There are no more interactions after the current hand. However if you extend the definition of the game to cover multiple hands then it gets a lot more complicated.

A perfect example is when there is a big stack bullying the table near the bubble. Nash says the big stack should raise almost every hand and the small stacks should almost always fold. However, just being in this situation is -EV for the small stacks. It would be in a small stack's long-term interest to call more liberally and punish the raiser. This is an attempt to get the big stack to stop his bullying. But you have to take a -EV move now in order to try and stop being in continually -EV situations in the future.

But that would be way too complicated to figure out with a computer...

[/ QUOTE ]
Yep, I guess this would require expanding the game tree out to be able to see the blinds moving and the big stack getting into more and more +EV bullying situations. Perhaps it could be expanded into the next hand (or even next few hands) and still be computationally tractable? Not sure how much better the solutions would be though.

Juk :)
#7
11-20-2007, 03:24 PM
 plexiq Senior Member Join Date: Apr 2007 Location: Vienna Posts: 138
Re: Fictitious play for multi-player games

[ QUOTE ]
A perfect example is when there is a big stack bullying the table near the bubble. Nash says the big stack should raise almost every hand and the small stacks should almost always fold. However, just being in this situation is -EV for the small stacks. It would be in a small stack's long-term interest to call more liberally and punish the raiser. This is an attempt to get the big stack to stop his bullying. But you have to take a -EV move now in order to try and stop being in continually -EV situations in the future.

But that would be way too complicated to figure out with a computer...

Tysen

[/ QUOTE ]

I think the example is actually mixing in a different problem.

One part of the problem you describe boils down to flaws of ICM. ICM overestimates midstack equities at the bubble, and underestimates bigstack equity. If we had access to a better equity estimate, midstacks would automatically call wider, because the relative equities of folding/busting/doubling up would change.

With ICM we have lots of scenarios where players are expected to win/lose equity during the next orbit. This should never be the case with an accurate equity model.

As I understand it, that's to be seen as separate from our original problem: that the NE is usually a bad state for the caller, because he is actually in a position to "force" the pusher into a more favorable state. I think this is a problem with the NE altogether. Maybe I can think of some toy game to better demonstrate my thought... (hopefully I won't forget about the thread again :D)
#8
11-21-2007, 10:01 AM
 plexiq Senior Member Join Date: Apr 2007 Location: Vienna Posts: 138
Re: Fictitious play for multi-player games

Ok, here is a toy game featuring a "spite-calling" situation:

Basic game is the same as in Math of Poker, pg 127.

*) Every player is dealt a hand in [0...1].
*) SB ("Pusher") can push or fold
*) If SB pushes, BB ("Caller") can call or fold.
*) If there is a showdown, the player with the higher hand has 2/3 pot equity.

We use stacks of 5BB (SB=0.5, BB=1).

So far, that's just "normal" heads-up play, and there's no possible spite-calling. After all, we are still in a zero-sum game at this point. The NE for this "base game" is: Pusher: 70%, Caller: 56%.
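
The base-game numbers can be checked with a short indifference calculation. The sketch below is only one way to do it and rests on assumptions not spelled out in the post: both players use threshold strategies, the caller's range is tighter than the pusher's, and stacks are counted including the posted blinds. The function and constant names are made up for the example.

```python
# Chip stacks (in big blinds) after each outcome: 5BB stacks, SB=0.5, BB=1.
STACK = 5.0
SB_AFTER_FOLD = STACK - 0.5    # SB folds and loses the small blind
SB_AFTER_STEAL = STACK + 1.0   # SB pushes, BB folds
BB_AFTER_FOLD = STACK - 1.0    # BB folds to the push
WIN_EQUITY = 2.0 / 3.0         # pot share of the higher hand at showdown
POT = 2 * STACK

FAVOURITE = WIN_EQUITY * POT        # expected stack holding the higher hand
UNDERDOG = (1 - WIN_EQUITY) * POT   # expected stack holding the lower hand


def base_game_equilibrium():
    """Return (push_fraction, call_fraction) for the chip-EV base game.

    Assumes the caller is tighter than the pusher, so the pusher's worst
    pushing hand is always the lower hand when it gets called.
    """
    # Pusher's marginal hand is indifferent between folding and pushing:
    #   (1 - call) * SB_AFTER_STEAL + call * UNDERDOG = SB_AFTER_FOLD
    call = (SB_AFTER_STEAL - SB_AFTER_FOLD) / (SB_AFTER_STEAL - UNDERDOG)
    # Caller's marginal hand is behind a fraction call/push of the pushing
    # range and is indifferent between calling and folding:
    #   behind * UNDERDOG + (1 - behind) * FAVOURITE = BB_AFTER_FOLD
    behind = (FAVOURITE - BB_AFTER_FOLD) / (FAVOURITE - UNDERDOG)
    push = call / behind
    return push, call


if __name__ == "__main__":
    push, call = base_game_equilibrium()
    print(f"Pusher: {push:.3f}, Caller: {call:.3f}")
```

Under these assumptions the thresholds come out at roughly 70% for the pusher and 56% for the caller, in line with the figures quoted for the base game.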

Now let's add the possibility to "spite call":
We now change the game such that the players convert their stacks to money after the game, and each player's goal is to maximize their $EV. However, the conversion is non-linear: a stack is converted to money by payout(chips)=sqrt(chips). (Any strictly increasing function will do, as long as it grows "slower than linear". Sqrt is an arbitrary choice.)

This models, to some degree, the situation of an SNG, because doubling up in chips will now be worth less than double the $.

In this modified game, the NE would be:
Pusher: Top 100%, Caller: 8.6%.

We are only 5BB deep, and the NE suggests that the BB only calls 8.6% against an ATC push. Alright so far :)

In the plot you can see that the Caller can deal huge "EV-damage" to the pusher, by sacrificing very little EV himself. I think that the NE is unsuitable in this situation, because the caller could clearly "force" the pusher into a more favorable state.
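
The size of that "EV-damage" can be roughly illustrated with a few lines of Python. This is a sketch under assumptions of my own, not a reconstruction of the original plot: the pusher pushes every hand, "2/3 pot equity" is read as the higher hand winning the whole pot two times out of three, and the 30% "spite" calling range is a hypothetical value. This reading may not reproduce the quoted 8.6% exactly, and the function name `payouts` is made up.

```python
from math import sqrt

STACK = 5.0
PAYOUT_FOLD = sqrt(STACK - 1.0)    # BB folds and keeps 4BB
PAYOUT_STEAL = sqrt(STACK + 1.0)   # SB wins the big blind and has 6BB
PAYOUT_WIN = sqrt(2 * STACK)       # winner of the all-in has 10BB
PAYOUT_BUST = sqrt(0.0)            # loser of the all-in has nothing
WIN_PROB = 2.0 / 3.0               # assumed chance that the higher hand wins


def payouts(call_fraction):
    """Return (pusher $EV, caller $EV) when SB pushes every hand and BB calls
    with the top call_fraction of hands."""
    c = call_fraction
    # Hands are uniform, so a hand at rank x from the top is the higher hand
    # with probability 1 - x; averaged over the calling range that is 1 - c/2.
    caller_ahead = 1.0 - c / 2.0
    caller_wins = caller_ahead * WIN_PROB + (1 - caller_ahead) * (1 - WIN_PROB)
    caller_called = caller_wins * PAYOUT_WIN + (1 - caller_wins) * PAYOUT_BUST
    pusher_called = (1 - caller_wins) * PAYOUT_WIN + caller_wins * PAYOUT_BUST
    pusher = (1 - c) * PAYOUT_STEAL + c * pusher_called
    caller = (1 - c) * PAYOUT_FOLD + c * caller_called
    return pusher, caller


if __name__ == "__main__":
    tight = payouts(0.086)  # calling range quoted in the post
    spite = payouts(0.30)   # hypothetical wider "spite" range
    print("caller gives up:", round(tight[1] - spite[1], 4))
    print("pusher loses:   ", round(tight[0] - spite[0], 4))
```

With these assumptions, widening the calling range costs the caller only a small amount while the pusher loses many times more, which is the asymmetry the post describes.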
#9
11-21-2007, 10:41 AM
 Paxinor Member Join Date: Sep 2006 Posts: 87
Re: Fictitious play for multi-player games

To simulate a sit n go properly, wouldn't it be suitable to create a game where the sum of $EV is always the same? I mean, this is the crucial point, because it needs to be a zero sum game! And ICM is a zero sum game too...
#10
11-21-2007, 10:54 AM
 plexiq Senior Member Join Date: Apr 2007 Location: Vienna Posts: 138
Re: Fictitious play for multi-player games

What we want to simulate here is the SB-vs-BB "subgame", after n other players folded. ICM isn't zero sum in this situation (if we only look at the involved players).
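
A small ICM calculation shows the effect. This is a hedged sketch: it uses the usual Malmuth-Harville form of ICM, and the third 5BB stack and the 50/30/20 prize structure are hypothetical values chosen only for illustration. The names `icm`, `PAYOUTS` and `joint_equity` are made up.

```python
def icm(stacks, payouts):
    """Expected payout per player under the Malmuth-Harville ICM model."""
    equity = [0.0] * len(stacks)

    def assign(remaining, place, prob):
        if place >= len(payouts) or not remaining:
            return
        weights = [stacks[i] for i in remaining]
        if sum(weights) == 0:
            # Only busted players are left: share the remaining places evenly.
            weights = [1.0] * len(remaining)
        total = sum(weights)
        for i, w in zip(remaining, weights):
            p = prob * w / total  # chance that player i takes this place
            if p == 0:
                continue
            equity[i] += p * payouts[place]
            assign([j for j in remaining if j != i], place + 1, p)

    assign(list(range(len(stacks))), 0, 1.0)
    return equity


PAYOUTS = [0.5, 0.3, 0.2]  # hypothetical prize structure


def joint_equity(sb, bb, other=5.0):
    """Combined ICM equity of SB and BB; `other` is an uninvolved stack."""
    eq = icm([sb, bb, other], PAYOUTS)
    return eq[0] + eq[1]


if __name__ == "__main__":
    print("SB+BB after BB folds:   ", round(joint_equity(6.0, 4.0), 4))
    print("SB+BB after SB wins all:", round(joint_equity(10.0, 0.0), 4))
```

In this example the combined equity of SB and BB is lower after an all-in than after a fold, because the uninvolved player gains whenever one of them busts. The subgame between the two involved players is therefore not constant-sum.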
