Results of a CP2-7 experiment

MarkGritter · #1 04-27-2007, 03:10 AM

I wrote some code that, given a list of hands and a list of opponent Chinese Poker/2-7 settings, optimizes the hands to maximize value against the specified opponent strategy. For each hand, it picks the setting with the highest average value against all the relevant settings in the list.
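The selection rule can be sketched roughly as follows. This is only an illustration, not the code used for the experiment: `best_setting` and the `score(mine, theirs)` callback (points won by one setting against another) are hypothetical names, and the toy demo uses plain integers in place of real 13-card settings.

```python
from statistics import mean
from typing import Callable, Iterable, Sequence, Tuple, TypeVar

S = TypeVar("S")  # a "setting": one way of arranging 13 cards into three hands


def best_setting(
    candidates: Iterable[S],
    opponent_settings: Sequence[S],
    score: Callable[[S, S], float],
) -> Tuple[S, float]:
    """Return the candidate with the highest average score against the
    given (non-empty) opponent settings, together with that average."""
    best, best_ev = None, float("-inf")
    for cand in candidates:
        # Average value of this candidate over every relevant opponent setting.
        ev = mean(score(cand, opp) for opp in opponent_settings)
        if ev > best_ev:
            best, best_ev = cand, ev
    return best, best_ev


# Toy demo (assumption): settings are integers and the higher number wins a point.
toy_score = lambda a, b: (a > b) - (a < b)
choice, ev = best_setting([1, 2, 3], [2, 2, 3], toy_score)
print(choice, round(ev, 3))
```

A real version would enumerate the legal settings of one 13-card hand as `candidates` and restrict `opponent_settings` to opposing hands that share no cards with it.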

I've started by running a small experiment with just 1000 possible hands for each player. (In reality there are about 635 billion if we distinguish suits.) Player 'A' gets 1000 hands and player 'B' gets 1000 hands, initially set up randomly and possibly illegally.

Player 'A' goes first, and based on B's random settings, constructs his hands to win the most. Call this strategy A1.

Then player B constructs a strategy, B1, that tries to beat A1. B doesn't know which cards A has in any given confrontation, but he knows how A1 will set the possible opposing hands. (Because A and B are dealt from the same deck, not all the settings are relevant to a particular hand. In fact, given a specific A hand, only about 1% of the B hands are possible, which gives us about 10 possible opponent hands on average.)
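As a rough check on those figures, the counts can be computed directly. This sketch assumes the B hands are sampled uniformly from all 13-card hands, so a B hand is compatible with a given A hand exactly when it uses none of A's 13 cards:

```python
from math import comb

total_hands = comb(52, 13)   # all 13-card hands, suits distinguished
compatible = comb(39, 13)    # hands built only from the 39 cards A does not hold
fraction = compatible / total_hands

print(f"{total_hands:,} possible hands")
print(f"{fraction:.2%} of B hands are compatible with a given A hand")
print(f"{1000 * fraction:.1f} expected compatible hands in a sample of 1000")
```

The fraction should come out a little above 1%, which is in line with the "about 10 opponents per hand" estimate.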

Then go on to make strategies A2, B2, etc.

Now, if there is a 'pure' solution for CP2-7, we should expect A and B to converge to an expected value close to 0. One of them might have more good hands, of course, but they should at least agree on how much this "subset" of CP is worth. It should be slightly positive for one of them and slightly negative for the other.

This does not appear to be the case. Here are the estimated values for each strategy when competing against the previous one:

A1 estimated EV: $3.834 (at $1/point) vs. random B
B1 estimated EV: $1.522 vs. A1 (1000 changed hands)
A2: $0.300 vs. B1 (884 changed hands)
B2: $0.262 vs. A2 (635 changed hands)
A3: $0.125 vs. B2 (564 changed hands)
B3: $0.277 vs. A3 (543 changed hands)
A4: $0.136 vs. B3 (562 changed hands)
B4: $0.258 vs. A4 (539 changed hands)
A5: $0.135 vs. B4 (581 changed hands)
B5: $0.295 vs. A5 (557 changed hands)
A6: $0.138 vs. B5 (592 changed hands)
B6: $0.273 vs. A6 (566 changed hands)
A7: $0.147 vs. B6 (594 changed hands)
B7: $0.298 vs. A7 (573 changed hands)
A8: $0.145 vs. B7 (592 changed hands)

Note that whichever player knows the other's pure strategy can pick a counter-strategy that provides him with a substantial positive expectation. In fact, this experiment suggests that hands which have a fixed best setting are in the minority.
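The same pattern shows up in a much smaller zero-sum game. The sketch below is a toy analogy using rock-paper-scissors, not CP2-7: each player in turn plays the best pure reply to the other's last pure strategy, every reply has positive value, and the sequence never settles. All names are illustrative.

```python
MOVES = ["rock", "paper", "scissors"]
BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}


def payoff(mine, theirs):
    """Points won by `mine` against `theirs` (zero-sum, +1 / 0 / -1)."""
    if mine == theirs:
        return 0
    return 1 if BEATS[mine] == theirs else -1


def best_response(theirs):
    """Best pure reply to a known pure strategy."""
    return max(MOVES, key=lambda m: payoff(m, theirs))


def alternate(start, rounds):
    """A and B take turns replying to the other's latest pure strategy."""
    history = []
    current = start
    for i in range(rounds):
        label = ("A" if i % 2 == 0 else "B") + str(i // 2 + 1)
        reply = best_response(current)
        history.append((label, reply, payoff(reply, current)))
        current = reply
    return history


for label, move, ev in alternate("rock", 6):
    print(f"{label}: {move} (EV {ev:+d} vs. previous strategy)")
```

Here the exploiter always wins a full point, whereas in the experiment the edge is a fraction of a point per hand, but the structure is the same: no pure strategy is safe once it is known.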

Here are some hands I traced that switch back and forth:
A1: 6dJcKcKhKs 2c4s5d6s8c QcQhQs
A2: 6d6sQcQhQs 2c4s5d8cJc KcKhKs
A3: same
A4: QcQhQsKcKh 2c4s5d6d8c 6sJcKs
A5: 6d6sQcQhQs 2c4s5d8cJc KcKhKs (= A2)

B1: 2h3d4s5s6d 3s6h7h9sTd QdKsAh
B2: 3d3s6d9sTd 2h4s5s6h7h QdKsAh
B3: 3s4s5s9sKs 2h3d6d7hTd 6hQdAh
B4: 3d3s6d9sTd 2h4s5s6h7h QdKsAh (= B2)
B5: 3s4s5s9sKs 2h3d6d7hTd 6hQdAh (= B3)

A2: 2s5s8sTsJs 2c5c6d8h9c ThKhAh
A3: 2c2s5c8h8s 5s6d9cJsKh ThTsAh
A4: 2s5s8sTsJs 2c5c6d8h9c ThKhAh (= A2)
A5: 2c2s5c8h8s 5s6d9cJsKh ThTsAh (= A3)
A6: 2s5s8sTsJs 2c5c6d8h9c ThKhAh (= A2)
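Spotting these repeats by eye gets tedious with more hands. A small helper along the following lines could flag them automatically; `find_repeats` is a hypothetical name, and the input is simply the B trace from the listing, one setting string per iteration:

```python
def find_repeats(trace):
    """Map each step that repeats an earlier setting to the first step
    at which that setting appeared (steps are numbered from 1)."""
    first_seen = {}
    repeats = {}
    for step, setting in enumerate(trace, start=1):
        if setting in first_seen:
            repeats[step] = first_seen[setting]
        else:
            first_seen[setting] = step
    return repeats


# The B1..B5 settings traced above.
b_trace = [
    "2h3d4s5s6d 3s6h7h9sTd QdKsAh",
    "3d3s6d9sTd 2h4s5s6h7h QdKsAh",
    "3s4s5s9sKs 2h3d6d7hTd 6hQdAh",
    "3d3s6d9sTd 2h4s5s6h7h QdKsAh",
    "3s4s5s9sKs 2h3d6d7hTd 6hQdAh",
]
print(find_repeats(b_trace))  # {4: 2, 5: 3}
```

The result says B4 repeats B2 and B5 repeats B3, which is a cycle of length two.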

Caveats:

1. My code could have a bug. I've eyeballed its decisions and they seem to make sense. I am willing to make it available for review.

2. Things might change between 2000 hands and 635 billion. I didn't construct the sample of hands using any special method, so it is unlikely that I happened to just "get lucky", but it is possible that my sample is not large enough to exhibit enough smoothness. (Or it may be small enough to contain many degenerate cases.) I will try to run larger samples now that I have some confidence that it works.

3. A good pure strategy might exist but not be reachable using just local optimization. You might be able to globally optimize using game theory, making sub-maximal choices for some hands to prevent being exploited by your opponent. (However, note that most game-theoretic answers involve non-pure strategies.) I am willing to provide my data set to anybody who wants to try constructing a superior pure strategy.

However, I think that this result strongly argues that a "consider all the possibilities for these 13 cards and select the best" strategy is exploitable.

4. I may not be patient enough. Perhaps the process will arrive at a pure strategy after many, many iterations. I think that is unlikely, given that most hands seem to be running in relatively short cycles and there does not seem to be any progress toward convergence, but it's a possibility.
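On caveat 3's remark that game-theoretic answers are usually non-pure: one standard way to approach a mixed solution of a zero-sum game is fictitious play, where each player replies to the opponent's whole history of moves rather than only the latest one. The sketch below applies it to rock-paper-scissors as a toy stand-in. It is not something I ran on the CP2-7 data, and all names are illustrative.

```python
from collections import Counter

MOVES = ["rock", "paper", "scissors"]
BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}


def payoff(mine, theirs):
    """Points won by `mine` against `theirs` (zero-sum, +1 / 0 / -1)."""
    if mine == theirs:
        return 0
    return 1 if BEATS[mine] == theirs else -1


def best_response_to_history(history):
    """Best pure reply to the opponent's empirical mix of past moves."""
    if not history:
        return MOVES[0]  # arbitrary opening move
    return max(
        MOVES,
        key=lambda m: sum(payoff(m, opp) * n for opp, n in history.items()),
    )


def fictitious_play(rounds):
    """Return player A's empirical move frequencies after `rounds` rounds."""
    seen_by_a, seen_by_b = Counter(), Counter()
    for _ in range(rounds):
        a = best_response_to_history(seen_by_a)
        b = best_response_to_history(seen_by_b)
        seen_by_a[b] += 1  # A observes B's move
        seen_by_b[a] += 1  # B observes A's move
    return {m: seen_by_b[m] / rounds for m in MOVES}


print(fictitious_play(6000))
```

Each individual move is still a pure best response and the play keeps cycling, but the long-run frequencies drift toward an even mix of the three moves. Whether anything similar is practical at the scale of CP2-7 is an open question.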
