Selecting a valid hand configuration for Monte-Carlo simulation

jukofyork · #1 04-28-2007, 04:36 AM

Q) Given a distribution of hands for each player (stored as 52x52 triangular matrix where the weights sum to unity), what is the best way to assign each player a hand that fits the distributions without introducing bias?

(1) Surely there must be a better way to do this than simply assign each player a hand independently from their distribution, then check if any of them have been assigned the same cards and keep restarting until a valid configuration is reached?

(2) How much bias am I likely to introduce if I select the first player's hand from their distribution, then re-weight the 2nd player's distribution and select his hand, and so on?

(3) Also, how much bias will occur if I assign each player a hand using method (1) and then if two players have the same cards in their hand, ONLY reassign those two players a new hand and keep doing that until I get a valid configuration?

I'll take a method which introduces a small amount of bias over a mega-slow method like (1) any-day, but I assume others must have come across this before and found some way round the slowness - any ideas?

Juk [img]/images/graemlins/smile.gif[/img]

jukofyork · #2 04-28-2007, 05:00 AM

So far my search of 2+2 seems pretty futile. Without going into the archives the only references to this problem are this from bachfan and this from Andrew Prock. [img]/images/graemlins/frown.gif[/img]

Also, searching for documents related to "Monte-Carlo" and poker tends to just throw up stuff about the "Monte-Carlo poker tournament" rather than stuff about Monte-Carlo simulation... Bleh.

Juk [img]/images/graemlins/smile.gif[/img]

jukofyork · #3 04-28-2007, 05:37 AM

Taken from http://www.cs.ualberta.ca/~darse/Papers/AAAISS99.html

[ QUOTE ]
5.2 Selective Sampling

When simulating a hand, we have specific information that can be used to bias the selection of cards. For example, a player who has been raising the stakes is more likely to have a strong hand than a player who has just called every bet. For each opponent, Loki maintains a probability distribution over the entire set of possible hands (the Weight Table), and the random generation of each opponent’s hole cards is based on those probabilities. Thus, we are biasing our selection of hole cards for the opponent to the ones that are most likely to occur.

[/ QUOTE ]

Taken from: http://www.cs.ualberta.ca/~darse/Papers/AAAI99.html

[ QUOTE ]
obvious_move = NO;

trials = 0;

while( ( trials <= MAX_TRIALS ) and ( obvious_move == NO ) )

{

trials = trials + 1;

Pos = current_state_of_the_game + ( selective_sampling to generate_missing_information );

for( each legal move m )

{

value[m] += Search( pos.m, info );

}

if( $ i such that value[ i ] >> value[ j ]( " j, j _ i ) )

{

obvious_move = YES;

}

}

select decision based on value[];

[/ QUOTE ]

I'll don't think I'll ever understand the mentality of the UofA poker research group: If it can't be repeated then why bother publishing it?

Hmmm... Could it have anything to do with the COMERCIALIZATION of "Poker Academy"?... Hmmm... I wonder... [img]/images/graemlins/smirk.gif[/img]

Juk [img]/images/graemlins/smile.gif[/img]

jukofyork · #4 04-28-2007, 07:29 AM

Just found some interesting ideas about how to go about this on the old Yahoo Poki group (see this thread and it's replies). Ignoring the idea of creating a huge sparse look-up table mentioned in that thread, I've so far come up with these methods, but none of them really seem ideal:

Input : WeightMatrix[10][52][52]
Output: PlayerHand[10]

Method 1: Full re-sampling on collision

Code:<hr /><pre>init CardNotUsedYet[52]

finished=false
while finished=false

for each player, I
PlayerHand[I]=ChooseHand(WeightMatrix[I]) { O(1) if we use gsl_ran_discrete() and O(n) to init at start }

finished=true
for each player, I
if either card in PlayerHand[I] set in CardNotUsedYet
finished=false
else
update CardNotUsedYet with PlayerHand[I]
</pre><hr />
NOTES: Very quick if collisions rare, else could be bad and might even not terminate if very constrained distributions are passed in. Not biased in any way.

Method 2: Partial re-sampling on collision

Code:<hr /><pre>init CardNotUsedYet[52]

for each player, I { using a random permutation of player ordering [ie: not 0..9]}
do
PlayerHand[I]=ChooseHand(WeightMatrix[I]) { O(1) if we use gsl_ran_discrete() and O(n) to init at start }
while either card in PlayerHand[I] set in CardNotUsedYet

update CardNotUsedYet with PlayerHand[I]
</pre><hr />
NOTES: Very quick if collisions rare, else could be bad and might even not terminate if very constrained distributions are passed in. Might still be badly biased?

Method 3: Re-adjust WeightMatrix after each assignment

Code:<hr /><pre>for each player, I { using a random permutation of player ordering [ie: not 0..9]}
PlayerHand[I]=ChooseHand(WeightMatrix[I]) { O(log2n) at best, using cumulative sum method }
for each remaining player, J
for each hand in weight matrix, K
if hand contains either card in PlayerHand[I], set WeightMatrix[J][][]=0
</pre><hr />
NOTES: Very slow [(10x9)/2x(1326/2) + (10x9)/2x1326 = 89505 opps per selection!], but will always terminate. Might still be biased though?

Method 4: Weighting of results using uniform distribution (thanks to Mogobu for this idea!):

Code:<hr /><pre>1. Use Knuth shuffle to assign each player a hand uniformly randomly.
2. Run the Monte-Carlo simulation using the uniform distribution but bias the result summation based on WeightMatrix[][][] entry for the hand the player was assigned.</pre><hr />
NOTES: Might require alot more simulation runs for the results to converge (but only on similar distributions to what would cause (1) and (2) to not terminate or run very slowly...), but very quick to initialize the player's holdings and shouldn't be biased because of the uniform distributions.

Does using a permutation of player orderings in (2) and (3) produce the same results as (1) or is it still biased? Can anybody think of any alternative/better algorithms to do this? Any ideas/help would be greatly appreciated! [img]/images/graemlins/smile.gif[/img]

Juk [img]/images/graemlins/smile.gif[/img]

PS: Alastair J. Walker's distribution selection method used in the GSL is outlined here.

SixthSense · #5 04-28-2007, 09:28 AM

[ QUOTE ]
Q) Given a distribution of hands for each player (stored as 52x52 triangular matrix where the weights sum to unity), what is the best way to assign each player a hand that fits the distributions without introducing bias?

(1) Surely there must be a better way to do this than simply assign each player a hand independently from their distribution, then check if any of them have been assigned the same cards and keep restarting until a valid configuration is reached?

[/ QUOTE ]

Juk,

I'm not sure if this would introduce bias, but why not just assign player #1's hand from the distribution, then assign player #2 and if there is a collision with player #1 then just reselect for #2, and repeat? E.g. for player #3 you'd have to check for a collision with player #1 and #2.

-Ben

jukofyork · #6 04-28-2007, 10:25 AM

[ QUOTE ]
[ QUOTE ]
Q) Given a distribution of hands for each player (stored as 52x52 triangular matrix where the weights sum to unity), what is the best way to assign each player a hand that fits the distributions without introducing bias?

(1) Surely there must be a better way to do this than simply assign each player a hand independently from their distribution, then check if any of them have been assigned the same cards and keep restarting until a valid configuration is reached?

[/ QUOTE ]

Juk,

I'm not sure if this would introduce bias, but why not just assign player #1's hand from the distribution, then assign player #2 and if there is a collision with player #1 then just reselect for #2, and repeat? E.g. for player #3 you'd have to check for a collision with player #1 and #2.

[/ QUOTE ]
Hi Ben, yes this is the idea I'm going with at the moment except I permute the order of players first:

Imagine if player #1 only plays AA and player #2 only plays AK and KQ. If you don't permute the order then the fact that player #1 has taken away two of the aces every time before player #2 gets his selection would mean that he would end up with KQ far more often than he should (as there will always only be 2 aces left to choose from). On the other hand, if you permute the player order then 50% of the time player #2 gets to choose when there are 4 aces still available and 50% of the time he has just 2 aces left to choose from.

I think using this method then the selection should be just the same as if we had assigned all players hands before we checked for collisions?

It seems also that by simply discarding the combinations that collide with the previous player's selections it has exactly the same effect as recomputing the weights after each selection (if we recompute them, then the invalidated weights would be set to 0).

I'm still not 100% sure though if their is some flaw in my thinking here and the algorithm is biased though? I've nearly finished the code and will post what I have when I've tested it a bit.

Juk [img]/images/graemlins/smile.gif[/img]

jukofyork · #7 04-28-2007, 06:06 PM

Here's my code so far:

Code:<hr /><pre>// ################################################## #########################
// # MONTE-CARLO HAND SELECTOR CLASS #
// ################################################## #########################

#pragma once

#ifndef MONTE_CARLO_HAND_SELECTOR_H
#define MONTE_CARLO_HAND_SELECTOR_H

// ************************************************** *************************
// * INCLUDES *
// ************************************************** *************************

#include <iostream>

#include "core/misc.h"

// We use the GSL functions for "General Discrete Distributions" to select from the distributions.
// NOTE: Couldn't get their RNG to link properly, so extracted and hacked the source using own RNG...
#include "discrete.h"

// ************************************************** *************************
// * CONSTANTS *
// ************************************************** *************************

// This is the number of times we will try selecting a hand for a player before
// we decide to restart with a new player order permutation.
#define MAX_SELECTION_ATTEMPTS 1000

// This is the number of times we will restarting with a new player order
// permutation before we decide to give in and fail.
#define MAX_PLAYER_ORDER_PERMUTATIONS 1000

// ************************************************** *************************
// * CLASS *
// ************************************************** *************************

class MonteCarloHandSelector {

public:

// This can be used to convert from the (triangular) 1D hands back to the 2D hands made from 2 cards.
int Map1DTo2D[(52*51)/2][2];

private:

// This is the lookup we create for each player.
gsl_ran_discrete_t* Lookups[10];

// This is the number of players we have created the lookups for.
// NOTE: set to -1 when the lookups are in the unallocated state.
int NumPlayersInLookups;

// ============================= Constructor =============================

public:

MonteCarloHandSelector()
{
// No lookups allocated yet.
NumPlayersInLookups=-1;

// Lets make arrays to convert 1d array back to the 2 cards that made it.
for (int X=0,T=0;X<51;X++) {
for (int Y=X+1;Y<52;Y++) {
Map1DTo2D[T][0]=X;
Map1DTo2D[T++][1]=Y;
}
}

} // End Constructor.

// ============================== Destructor =============================

// Make sure we don't leave the distribution lookups tables allocated.
~MonteCarloHandSelector() { UnallocateLookups(); }

// ============================ Public Members ===========================

// Use this function to (re-)initialize the distribution lookup tables for a certain set of distributions.
void InitializeLookups(const double Weights2D[10][51][52],const int NumPlayers);

// Use this functions to select hands after you've initialized the lookup tables.
// NOTE: Returns false if failed even after many re-tries, else returns true.
bool SelectHands(int* SelectedHands);

// =========================== Private Members ===========================

private:

// Use this to free the distribution lookup tables (still safe to call even if they are not allocated yet).
void UnallocateLookups(void);

}; // End MonteCarloHandSelector class.

// ################################################## #########################
// # MONTE-CARLO HAND SELECTOR MEMBER FUNCTIONS #
// ################################################## #########################

void MonteCarloHandSelector::InitializeLookups(const double Weights2D[10][51][52],const int NumPlayers)
{ // Use this function to (re-)initialize the lookup tables for a certain set of distributions..
// NOTE: Weights must have [][LOW][HIGH] (eg: Weights[][0][1] is OK, but Weights[][1][0] is not).

// This is the traiangular version which stores all possible hands in 1D vector.
double Weights1D[10][(52*51)/2];

// Lets make sure they weren't allready setup to an old distribution.
UnallocateLookups();

// Save the number of players we create the lookups for.
NumPlayersInLookups=NumPlayers;

// Convertg the 52x52 input to the triangular vector version.
for (int I=0;I<10;I++)
for (int X=0,T=0;X<51;X++)
for (int Y=X+1;Y<52;Y++)
Weights1D[I][T++]=Weights2D[I][X][Y];

// Lets create the lookup table for each player.
for (int I=0;I<NumPlayers;I++)
Lookups[I]=gsl_ran_discrete_preproc((52*51)/2,Weights1D[I]); // Use the gsl_function to create the lookup.

} // End InitializeLookups.

// ================================================== =========================

bool MonteCarloHandSelector::SelectHands(int* SelectedHands)
{ // Use this functions to select hands after you've initialized the lookup tables.
// NOTE: Returns false if failed even after many re-tries, else returns true.

int I;

// This is a list of the cards we have in use so far.
bool CardInUse[52];

// These are used for selecting the player order permutation using the Knuth-Shuffle.
int SwapPosition,TempPosition;

// This is used so we can create a random permuation of the player's order.
int PlayerOrder[10];

// This is the (permuted order) index of the player we have selected.
int CurrentPlayerIndex;

// This is the hand index we select from the distribution.
int HandIndex;

// These are used to so we can terminate the algorithm after too many failures.
int PermutationAttempts,SelectionAttempts;

// Check we have initialized the lookup tables.
if (NumPlayersInLookups==-1)
Error(FATAL,"Attempt to call SelectHands before Lookups created.");

// Lets try this a few player order permutations before we finally give in.
PermutationAttempts=0;
do {

// Lets set all the cards to unused.
for (int I=0;I<52;I++)
CardInUse[I]=false;

// Lets randomize the order of the players using a Knuth-Shuffle.
// NOTE: This is done so we don't select the cards for players with low indexes first.
for (int I=0;I<NumPlayersInLookups;I++)
PlayerOrder[I]=I;
for (int I=NumPlayersInLookups-1;I>0;I--) {

// Select a position between 0 and I (inclusive).
SwapPosition=RandInt(I);

// Swap them.
TempPosition=PlayerOrder[I];
PlayerOrder[I]=PlayerOrder[SwapPosition];
PlayerOrder[SwapPosition]=TempPosition;

}

// Using the randonly permuted player order, assign each player a hand in turn.
for (int I=0;I<NumPlayersInLookups;I++) {

// Set the index of the next player we are going to try to select a hand for.
CurrentPlayerIndex=PlayerOrder[I];

// Keep trying until we select a hand that hasn't been used yet.
// NOTE: This should have the same effect as re-weighting, as we simply ignore any impossible
// combinations that would have been set to 0 in the re-weighting process.
SelectionAttempts=0;
do {
HandIndex=gsl_ran_discrete(Lookups[CurrentPlayerIndex]); // Use the gsl_function to select in O(1).
} while ((CardInUse[Map1DTo2D[HandIndex][0]]==true || CardInUse[Map1DTo2D[HandIndex][1]]==true)
&& (++SelectionAttempts)<MAX_SELECTION_ATTEMPTS);

// Did we just fail to select a hand for this player?
if (SelectionAttempts>=MAX_SELECTION_ATTEMPTS) {
//Error(WARNING,"Max selection attempts exceeded, trying new player ordering.");
break;
}

// Update the cards we have now used.
CardInUse[Map1DTo2D[HandIndex][0]]=true;
CardInUse[Map1DTo2D[HandIndex][1]]=true;

// Lets set the player's hand.
SelectedHands[CurrentPlayerIndex]=HandIndex;

}

// DEBUGGING: Print information about the re-tries.
//cout << "SelectionAttempts : " << SelectionAttempts << endl;

// If we have exausted all our tries then we have failed.
if (SelectionAttempts>=MAX_SELECTION_ATTEMPTS
&& (++PermutationAttempts)>=MAX_PLAYER_ORDER_PERMU TATIONS) {
Error(WARNING,"Failed to assign all players hands in SelectHands().");
return false;
}

} while (SelectionAttempts>=MAX_SELECTION_ATTEMPTS);

// DEBUGGING: Print information about the re-tries.
//cout << "PermutationAttempts : " << PermutationAttempts+1 << endl << endl;

// If we got here then all is OK.
return true;

} // End SelectHands.

// ================================================== =========================

void MonteCarloHandSelector::UnallocateLookups(void)
{ // Use this to free the distribution lookup tables (still safe to call even if they are not allocated yet).
if (NumPlayersInLookups!=-1) {
for (int I=0;I<NumPlayersInLookups;I++)
gsl_ran_discrete_free (Lookups[I]);
NumPlayersInLookups=-1; // Not allocated now.
}
} // End UnallocateLookups.

#endif

// ################################################## #########################
// # MAIN: TEST #
// ################################################## #########################

void main(void)
{

double Weights[10][51][52]; // [Low][High].
int SelectedHands[10]; // Returned from SelectHands.

MonteCarloHandSelector M;

// Lets make a fake distirbution.
for (int I=0;I<10;I++) {
for (int X=0;X<51;X++) {
for (int Y=X+1;Y<52;Y++) {
if (X<10 && Y<10) // Lower these two threshold to tighten up distribution.
Weights[I][X][Y]=Rand01();
else
Weights[I][X][Y]=0.0;
}
}
}

// Ititilize it on the fake distributions we have made.
M.InitializeLookups(Weights,4);

// Run the selection algorithm many times to get an estimate of the speed.
// NOTE: The RNG speed is the main deciding factor here...
clock_t StartTime=clock();
for (int I=0;I<1000000;I++)
M.SelectHands(SelectedHands);
cout << "Test1: Time for 1000000 slections: "<< (double)(clock()-StartTime)/(double)CLOCKS_PER_SEC << " seconds" << endl << endl;

// Lets set a few weights for 4 players to test the output is OK.
for (int I=0;I<10;I++)
for (int X=0;X<51;X++)
for (int Y=X+1;Y<52;Y++)
Weights[I][X][Y]=0.0;
Weights[0][0][1]=1.0;
Weights[0][2][3]=1.0;
Weights[1][0][1]=1.0;
Weights[1][50][51]=1.0;
Weights[2][50][51]=1.0;
Weights[3][4][5]=1.0;
Weights[3][6][7]=1.0;

// Re-Ititilize it on the fake distributions we have made.
M.InitializeLookups(Weights,4);

// Select the hands again.
if (M.SelectHands(SelectedHands)==false) {
cout << "Test2: Too many re-tries... failed..." << endl;
}
else {
for (int I=0;I<4;I++)
cout << "Test2: Player" <<hr />
To read it easier, then select "quote" then copy the text message out from between the code tags and copy into a text file.

You will need to either get the GNU Scientific Library working and link with that, or else use the code snippets I've hacked out with a suitably chosen RNG to replace Rand01():

Code:<hr /><pre>/* randist/discrete.c
*
* Copyright (C) 1996, 1997, 1998, 1999, 2000 James Theiler, Brian Gough
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 2 of the License, or (at
* your option) any later version.
*
* This program is distributed in the hope that it will be useful, but
* WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
*/

/*
Random Discrete Events

Given K discrete events with different probabilities P[k]
produce a value k consistent with its probability.

This program is free software; you can redistribute it and/or
modify it under the terms of the GNU General Public License as
published by the Free Software Foundation; either version 2 of the
License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but
WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
General Public License for more details. You should have received
a copy of the GNU General Public License along with this program;
if not, write to the Free Foundation, Inc., 59 Temple Place, Suite
330, Boston, MA 02111-1307 USA
*/

/*
* Based on: Alastair J Walker, An efficient method for generating
* discrete random variables with general distributions, ACM Trans
* Math Soft 3, 253-256 (1977). See also: D. E. Knuth, The Art of
* Computer Programming, Volume 2 (Seminumerical algorithms), 3rd
* edition, Addison-Wesley (1997), p120.

* Walker's algorithm does some preprocessing, and provides two
* arrays: floating point F[k] and integer A[k]. A value k is chosen
* from 0..K-1 with equal likelihood, and then a uniform random number
* u is compared to F[k]. If it is less than F[k], then k is
* returned. Otherwise, A[k] is returned.

* Walker's original paper describes an O(K^2) algorithm for setting
* up the F and A arrays. I found this disturbing since I wanted to
* use very large values of K. I'm sure I'm not the first to realize
* this, but in fact the preprocessing can be done in O(K) steps.

* A figure of merit for the preprocessing is the average value for
* the F[k]'s (that is, SUM_k F[k]/K); this corresponds to the
* probability that k is returned, instead of A[k], thereby saving a
* redirection. Walker's O(K^2) preprocessing will generally improve
* that figure of merit, compared to my cheaper O(K) method; from some
* experiments with a perl script, I get values of around 0.6 for my
* method and just under 0.75 for Walker's. Knuth has pointed out
* that finding _the_ optimum lookup tables, which maximize the
* average F[k], is a combinatorially difficult problem. But any
* valid preprocessing will still provide O(1) time for the call to
* gsl_ran_discrete(). I find that if I artificially set F[k]=1 --
* ie, better than optimum! -- I get a speedup of maybe 20%, so that's
* the maximum I could expect from the most expensive preprocessing.
* Folding in the difference of 0.6 vs 0.75, I'd estimate that the
* speedup would be less than 10%.

* I've not implemented it here, but one compromise is to sort the
* probabilities once, and then work from the two ends inward. This
* requires O(K log K), still lots cheaper than O(K^2), and from my
* experiments with the perl script, the figure of merit is within
* about 0.01 for K up to 1000, and no sign of diverging (in fact,
* they seemed to be converging, but it's hard to say with just a
* handful of runs).

* The O(K) algorithm goes through all the p_k's and decides if they
* are "smalls" or "bigs" according to whether they are less than or
* greater than the mean value 1/K. The indices to the smalls and the
* bigs are put in separate stacks, and then we work through the
* stacks together. For each small, we pair it up with the next big
* in the stack (Walker always wanted to pair up the smallest small
* with the biggest big). The small "borrows" from the big just
* enough to bring the small up to mean. This reduces the size of the
* big, so the (smaller) big is compared again to the mean, and if it
* is smaller, it gets "popped" from the big stack and "pushed" to the
* small stack. Otherwise, it stays put. Since every time we pop a
* small, we are able to deal with it right then and there, and we
* never have to pop more than K smalls, then the algorithm is O(K).

* This implementation sets up two separate stacks, and allocates K
* elements between them. Since neither stack ever grows, we do an
* extra O(K) pass through the data to determine how many smalls and
* bigs there are to begin with and allocate appropriately. In all
* there are 2*K*sizeof(double) transient bytes of memory that are
* used than returned, and K*(sizeof(int)+sizeof(double)) bytes used
* in the lookup table.

* Walker spoke of using two random numbers (an integer 0..K-1, and a
* floating point u in [0,1]), but Knuth points out that one can just
* use the integer and fractional parts of K*u where u is in [0,1].
* In fact, Knuth further notes that taking F'[k]=(k+F[k])/K, one can
* directly compare u to F'[k] without having to explicitly set
* u=K*u-int(K*u).

* Usage:

* Starting with an array of probabilities P, initialize and do
* preprocessing with a call to:

* gsl_rng *r;
* gsl_ran_discrete_t *f;
* f = gsl_ran_discrete_preproc(K,P);

* Then, whenever a random index 0..K-1 is desired, use

* k = gsl_ran_discrete(r,f);

* Note that several different randevent struct's can be
* simultaneously active.

* Aside: A very clever alternative approach is described in
* Abramowitz and Stegun, p 950, citing: Marsaglia, Random variables
* and computers, Proc Third Prague Conference in Probability Theory,
* 1962. A more accesible reference is: G. Marsaglia, Generating
* discrete random numbers in a computer, Comm ACM 6, 37-38 (1963).
* If anybody is interested, I (jt) have also coded up this version as
* part of another software package. However, I've done some
* comparisons, and the Walker method is both faster and more stingy
* with memory. So, in the end I decided not to include it with the
* GSL package.

* Written 26 Jan 1999, James Theiler, jt@lanl.gov
* Adapted to GSL, 30 Jan 1999, jt

*/

#include <stdio.h> /* used for NULL, also fprintf(stderr,...) */
#include <stdlib.h> /* used for malloc's */
#include <math.h>

#include "core/misc.h"
#include "core/random.h"

#define DEBUG 0
#define KNUTH_CONVENTION 1 /* Saves a few steps of arithmetic
* in the call to gsl_ran_discrete()
*/

/*** Begin Stack (this code is used just in this file) ***/

/* Stack code converted to use unsigned indices (i.e. s->i == 0 now
means an empty stack, instead of -1), for consistency and to give a
bigger allowable range. BJG */

typedef struct {
size_t N; /* max number of elts on stack */
size_t *v; /* array of values on the stack */
size_t i; /* index of top of stack */
} gsl_stack_t;

static gsl_stack_t *
new_stack(size_t N) {
gsl_stack_t *s;
s = (gsl_stack_t *)malloc(sizeof(gsl_stack_t));
s->N = N;
s->i = 0; /* indicates stack is empty */
s->v = (size_t *)malloc(sizeof(size_t)*N);
return s;
}

typedef struct { /* struct for Walker algorithm */
size_t K;
size_t *A;
double *F;
} gsl_ran_discrete_t;

static void
push_stack(gsl_stack_t *s, size_t v)
{
if ((s->i) >= (s->N)) {
fprintf(stderr,"Cannot push stack!\n");
abort(); /* FIXME: fatal!! */
}
(s->v)[s->i] = v;
s->i += 1;
}

static size_t pop_stack(gsl_stack_t *s)
{
if ((s->i) == 0) {
fprintf(stderr,"Cannot pop stack!\n");
abort(); /* FIXME: fatal!! */
}
s->i -= 1;
return ((s->v)[s->i]);
}

static size_t size_stack(const gsl_stack_t *s)
{
return s->i;
}

static void free_stack(gsl_stack_t *s)
{
free((char *)(s->v));
free((char *)s);
}

/*** End Stack ***/

/*** Begin Walker's Algorithm ***/

gsl_ran_discrete_t *
gsl_ran_discrete_preproc(size_t Kevents, const double *ProbArray)
{
size_t k,b,s;
gsl_ran_discrete_t *g;
size_t nBigs, nSmalls;
gsl_stack_t *Bigs;
gsl_stack_t *Smalls;
double *E;
double pTotal = 0.0, mean, d;

if (Kevents < 1) {
/* Could probably treat Kevents=1 as a special case */

Error(FATAL,"number of events must be a positive integer");
}

/* Make sure elements of ProbArray[] are positive.
* Won't enforce that sum is unity; instead will just normalize
*/

for (k=0; k<Kevents; ++k) {
if (ProbArray[k] < 0) {
Error(FATAL,"probabilities must be non-negative");
}
pTotal += ProbArray[k];
}

/* Begin setting up the main "object" (just a struct, no steroids) */
g = (gsl_ran_discrete_t *)malloc(sizeof(gsl_ran_discrete_t));
g->K = Kevents;
g->F = (double *)malloc(sizeof(double)*Kevents);
g->A = (size_t *)malloc(sizeof(size_t)*Kevents);

E = (double *)malloc(sizeof(double)*Kevents);

if (E==NULL) {
Error(FATAL,"Cannot allocate memory for randevent");
}

for (k=0; k<Kevents; ++k) {
E[k] = ProbArray[k]/pTotal;
}

/* Now create the Bigs and the Smalls */
mean = 1.0/Kevents;
nSmalls=nBigs=0;
for (k=0; k<Kevents; ++k) {
if (E[k] < mean) ++nSmalls;
else ++nBigs;
}
Bigs = new_stack(nBigs);
Smalls = new_stack(nSmalls);
for (k=0; k<Kevents; ++k) {
if (E[k] < mean) {
push_stack(Smalls,k);
}
else {
push_stack(Bigs,k);
}
}
/* Now work through the smalls */
while (size_stack(Smalls) > 0) {
s = pop_stack(Smalls);
if (size_stack(Bigs) == 0) {
(g->A)[s]=s;
(g->F)[s]=1.0;
continue;
}
b = pop_stack(Bigs);
(g->A)[s]=b;
(g->F)[s]=Kevents*E[s];
#if DEBUG
fprintf(stderr,"s=%2d, A=%2d, F=%.4f\n",s,(g->A)[s],(g->F)[s]);
#endif
d = mean - E[s];
E[s] += d; /* now E[s] == mean */
E[b] -= d;
if (E[b] < mean) {
push_stack(Smalls,b); /* no longer big, join ranks of the small */
}
else if (E[b] > mean) {
push_stack(Bigs,b); /* still big, put it back where you found it */
}
else {
/* E[b]==mean implies it is finished too */
(g->A)[b]=b;
(g->F)[b]=1.0;
}
}
while (size_stack(Bigs) > 0) {
b = pop_stack(Bigs);
(g->A)[b]=b;
(g->F)[b]=1.0;
}
/* Stacks have been emptied, and A and F have been filled */

if ( size_stack(Smalls) != 0) {
Error(FATAL,"Smalls stack has not been emptied");
}

#if 0
/* if 1, then artificially set all F[k]'s to unity. This will
* give wrong answers, but you'll get them faster. But, not
* that much faster (I get maybe 20%); that's an upper bound
* on what the optimal preprocessing would give.
*/
for (k=0; k<Kevents; ++k) {
(g->F)[k] = 1.0;
}
#endif

#if KNUTH_CONVENTION
/* For convenience, set F'[k]=(k+F[k])/K */
/* This saves some arithmetic in gsl_ran_discrete(); I find that
* it doesn't actually make much difference.
*/
for (k=0; k<Kevents; ++k) {
(g->F)[k] += k;
(g->F)[k] /= Kevents;
}
#endif

free_stack(Bigs);
free_stack(Smalls);
free((char *)E);

return g;
}

size_t
gsl_ran_discrete(const gsl_ran_discrete_t *g)
{
size_t c=0;
double u,f;
u = Rand01(); // ### JUK: Replace with any suitable RNG (eg: ran1() from numerical recipies).
#if KNUTH_CONVENTION
c = (size_t)(u*(g->K));
#else
u *= g->K;
c = u;
u -= c;
#endif
f = (g->F)[c];
/* fprintf(stderr,"c,f,u: %d %.4f %f\n",c,f,u); */
if (f == 1.0) return c;

if (u < f) {
return c;
}
else {
return (g->A)[c];
}
}

void gsl_ran_discrete_free(gsl_ran_discrete_t *g)
{
free((char *)(g->A));
free((char *)(g->F));
free((char *)g);
}

double
gsl_ran_discrete_pdf(size_t k, const gsl_ran_discrete_t *g)
{
size_t i,K;
double f,p=0;
K= g->K;
if (k>K) return 0;
for (i=0; i<K; ++i) {
f = (g->F)[i];
#if KNUTH_CONVENTION
f = K*f-i;
#endif
if (i==k) {
p += f;
} else if (k == (g->A)[i]) {
p += 1.0 - f;
}
}
return p/K;
}</pre><hr />

So far I've not tested it on any real-world distribution and only some examples to see how it gets on with fake "tight distributions".

Can anybody see any way to improve this? It it biased?

Juk [img]/images/graemlins/smile.gif[/img]

stinkypete · #8 04-29-2007, 03:57 AM

[ QUOTE ]

(1) Surely there must be a better way to do this than simply assign each player a hand independently from their distribution, then check if any of them have been assigned the same cards and keep restarting until a valid configuration is reached?

[/ QUOTE ]

here's an idea that may be feasible depending on the application.

use method 1 as above, but if you're finding a large number of "collisions", forget monte carlo and enumerate all the possible combinations and apply weights accordingly.

this should work since in practice, if you're getting a lot of collisions, you're probably selecting from a small number of hand combinations anyway and enumerating them all shouldn't be too much work.

in some extreme cases, this wouldn't be particularly feasible - but it would probably work well in practice. (examples when it wouldn't work: if there's 10 players with all having wide ranges, or if there's 3 players who all have AK 99% of the time and any of the 168 other hands 1% of the time)

jukofyork · #9 05-02-2007, 06:16 PM

[ QUOTE ]
[ QUOTE ]

(1) Surely there must be a better way to do this than simply assign each player a hand independently from their distribution, then check if any of them have been assigned the same cards and keep restarting until a valid configuration is reached?

[/ QUOTE ]

here's an idea that may be feasible depending on the application.

use method 1 as above, but if you're finding a large number of "collisions", forget monte carlo and enumerate all the possible combinations and apply weights accordingly.

this should work since in practice, if you're getting a lot of collisions, you're probably selecting from a small number of hand combinations anyway and enumerating them all shouldn't be too much work.

in some extreme cases, this wouldn't be particularly feasible - but it would probably work well in practice. (examples when it wouldn't work: if there's 10 players with all having wide ranges, or if there's 3 players who all have AK 99% of the time and any of the 168 other hands 1% of the time)

[/ QUOTE ]
Hi, sorry for the slow reply - been away for a few days, then had problems logging onto 2+2 until I cleared my IE cache and cookies.

Yep, I agree that enumerating is prolly a good idea for the cases where the ranges are very constrained and possibly even adding something too do it after so many failures (I think detecting an impossible configuration before you run though is going to be O(n!), as you will has to test all possible player orderings to be sure).

I haven't had time to do much more with the code since my last post, but overall it seems to be working quite well and sometime over the next few days I'll write a simple CLI app to let you input a hand vs a distribution of opponent hands and maybe somebody will find it useful and/or extend it to do something more useful.

Juk [img]/images/graemlins/smile.gif[/img]

matt42s · #10 05-08-2007, 11:54 PM

[ QUOTE ]
Imagine if player #1 only plays AA and player #2 only plays AK and KQ. If you don't permute the order then the fact that player #1 has taken away two of the aces every time before player #2 gets his selection would mean that he would end up with KQ far more often than he should (as there will always only be 2 aces left to choose from). On the other hand, if you permute the player order then 50% of the time player #2 gets to choose when there are 4 aces still available and 50% of the time he has just 2 aces left to choose from.

[/ QUOTE ]

There is no bias introduced with a partial resample on collision.
When you say 'has 2 aces left to choose from' - what you really mean is '50% of the time the algorithm will encounter collisions if P2 chooses an Ace which was already used (50% x 50%)'

P2 by definition within your constraints must be dealt KQ far more often than AK.
Consider this - when P2 is first, 50% of the time P1 only has 3 aces to choose from. He still gets AA every time but the algorithm will encounter collisions with 1 of the aces when it has been used. The only variable is which player's cards collide and how many collisions occur.

Further to that, my code determines the 'tightness'/size of a players weight table and allocates the tightest players cards first, I use a selective resample on collision. Due to the ordering, collisions are minimised.

I have also had success using a pre-generated table of random numbers, at first it was only to remove the RNG code/overhead from the inner loop, but reusing a sufficiently long table has caused significant speed improvement and no notable bias (the requirements for random poker are FAR less than the double precision, infinite length stream provided by the algorithm you are using) Even if you're not convinced yet, write your code to use a random table lookup and populate it before entering the MonteCarlo routines - in the real world, populate it while the bot dealing is shuffling [img]/images/graemlins/smile.gif[/img]