  #1679  
05-11-2007, 02:52 PM
ShaneP
Re: NL Bots on Full Tilt

[ QUOTE ]
neverforget,

Can you address this?

[ QUOTE ]
These statistics alone are not enough to "prove" that they come from the same player. You can't just apply a "simple" hypothesis-testing procedure like a chi-squared test; on its own, it won't indicate anything. A statistician examining the likelihood that they come from the same player needs other people's stats to see how they behave. How much do these stats differ across tight, nitty players with 100k+ hands?

There is a second issue: some of the variables may be correlated. For example, "flop aggression", "bet flop", "raise flop" and "c/r flop" are connected to each other, and a player with a high "flop aggression" will reasonably have high stats in the other 3 categories. An analysis of the likelihood that they are the same player must isolate any correlation effect. How these correlated variables behave in reality can be examined by studying their correlation among other players with 100k+ hands.

Extreme care must be taken in conducting any probability tests; it is too easy to use a poorly designed statistical test that does not consider the matters I outlined above. This is a formidable full-time task and should be undertaken by somebody with a solid postgraduate education in statistics (or similar experience). I am not defending anybody here, but it is too easy to get carried away, and it is easy to "prove" they are the same player, but such a test would lack mathematical rigor.

I'm not a statistician/mathematician but these are just my views.

[/ QUOTE ]

[/ QUOTE ]

I know you asked Neverforget, but this is sort of what I had been saying earlier (though it brings up a new point, since I was only talking about VPiP). I've done a few things like this before.

The big thing is that the 'statistical tests' most people used in this thread assumed IID data, that is, Independent and Identically Distributed. Under that assumption, the correct formula for the SD of an observed frequency is sqrt(P * (1-P) / N), where P is the per-trial probability and N is the number of trials. So if we're looking at what fraction of 1000 rolls of a six-sided die come up one, we can use the formula above, since the die doesn't remember the previous roll and isn't affected by anything outside (the day, what someone else rolled, etc.).
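
As a quick illustration of that formula applied to the die example, here is a minimal Python sketch. The function name proportion_sd is my own label, not something used in the thread.

```python
import math

def proportion_sd(p, n):
    """SD of the observed success fraction over n IID trials, each with success chance p."""
    return math.sqrt(p * (1.0 - p) / n)

# Die example from the post: chance of a one is 1/6, over 1000 rolls.
p_one = 1.0 / 6.0
n_rolls = 1000
sd = proportion_sd(p_one, n_rolls)

print(f"expected fraction of ones: {p_one:.4f}")
print(f"SD of that fraction:       {sd:.4f}")  # about 0.0118
```

So in 1000 rolls the observed fraction of ones would typically land within a couple of percentage points of 16.7%, provided the rolls really are IID.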

Now, with these poker stats, both I's are violated in most of them. Take VPiP first: one enters with different hands from different positions. Thus the picks are not from identical distributions; early position might be a pick from a distribution with a 7% chance of success, and on the button it might be 20%. Thus the above formula for SD is not strictly correct (but probably somewhat close, which is why I used it and adjusted my interpretation of the results).
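
To get a feel for "probably somewhat close", here is a rough Python sketch comparing the pooled formula with a position-aware SD. The 7% and 20% rates are the ones mentioned above; the 12% middle-position rate and the hand counts per position are made-up numbers for illustration only. It also assumes a fixed number of hands per position and independent decisions within each position, which real play won't satisfy exactly.

```python
import math

# (hands played, chance of entering the pot) per position.
# 7% and 20% come from the post; everything else here is hypothetical.
positions = {
    "early":  (40000, 0.07),
    "middle": (40000, 0.12),   # hypothetical rate
    "button": (20000, 0.20),
}

total_hands = sum(n for n, _ in positions.values())

# Pooled approach: pretend every hand has the same overall entry chance.
pooled_p = sum(n * p for n, p in positions.values()) / total_hands
pooled_sd = math.sqrt(pooled_p * (1.0 - pooled_p) / total_hands)

# Position-aware approach: add up the variance contributed by each position.
variance_of_count = sum(n * p * (1.0 - p) for n, p in positions.values())
stratified_sd = math.sqrt(variance_of_count) / total_hands

print(f"overall VPiP:       {pooled_p:.4f}")
print(f"pooled SD:          {pooled_sd:.6f}")
print(f"position-aware SD:  {stratified_sd:.6f}")
print(f"ratio:              {stratified_sd / pooled_sd:.3f}")
```

Under these particular assumptions the two SDs differ by only about a percent, so the pooled formula is a reasonable first approximation for VPiP alone. Different position mixes or rates would give a different gap.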

Independence comes in with any post-flop stat, and with combinations of pre-flop stats. With this many numbers, what a lot of people would do (and what a good test would do) is look at groups of numbers together to see whether they could all be so close. However, post-flop stats are affected by the pre-flop decisions (and so aren't independent of them), and other stats are also correlated. For a simple example, PFR <= VPiP. So statistical tests that combine several stats like that (or that just look at later stats) will not have independence either.
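
The PFR <= VPiP point can be seen in a toy simulation. The sketch below is only an illustration: the 8% raise and 4% call rates, the sample sizes, and the function names are all assumptions of mine, and each hand is treated as an independent fold/call/raise draw, which is a simplification.

```python
import random

def simulate_stats(n_hands, p_raise, p_call, rng):
    """Return (VPiP, PFR) for one simulated sample of hands."""
    raises = calls = 0
    for _ in range(n_hands):
        u = rng.random()
        if u < p_raise:
            raises += 1
        elif u < p_raise + p_call:
            calls += 1
    vpip = (raises + calls) / n_hands  # every raise also counts toward VPiP
    pfr = raises / n_hands
    return vpip, pfr

def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

rng = random.Random(2007)  # fixed seed so the run is repeatable
samples = [simulate_stats(500, 0.08, 0.04, rng) for _ in range(1000)]
vpips = [v for v, _ in samples]
pfrs = [r for _, r in samples]

corr = pearson(vpips, pfrs)
print(f"correlation between VPiP and PFR across samples: {corr:.2f}")
```

Because every raise is counted in both stats, the two move together strongly. A test that treats a close VPiP match and a close PFR match as two independent pieces of evidence would overstate how unlikely the match is.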

So what do people do in cases like that? Well, if they have a model, they'll simulate it thousands or millions of times and look at the distribution that results. The simulations (if they consider everything) will take all the correlations into account. In this case, if we said to raise with TT from middle position, we'd automatically account for the correlation with 'raise after the turn if we've got an overpair or better'. From the generated distributions, 5% and 1% (or whatever % you want) bounds can then be constructed (5% of the simulated data lies further away from the mean than that point), and then you can test the observed data against them. What the person above is suggesting is similar: instead of simulating, just get a lot of people who play similarly (tight-aggressive set miners, it looks like) and see what their numbers look like.
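
The simulate-then-bound idea can be sketched in a few lines of Python. This is a deliberately tiny stand-in for a real model: it only simulates VPiP, it picks a position at random for each hand, and the 12% middle-position rate, the hand and run counts, and the "observed" value are all hypothetical. A serious version would simulate full hands so that the post-flop correlations come along automatically.

```python
import random

# Entry chance by position: 7% and 20% are from the post, 12% is hypothetical.
ENTRY_RATE = {"early": 0.07, "middle": 0.12, "button": 0.20}
POSITIONS = list(ENTRY_RATE)

def simulate_vpip(n_hands, rng):
    """Simulate one player's VPiP over n_hands under the toy model."""
    entered = 0
    for _ in range(n_hands):
        position = rng.choice(POSITIONS)
        if rng.random() < ENTRY_RATE[position]:
            entered += 1
    return entered / n_hands

def central_bounds(values, outside=0.05):
    """Bounds such that at most `outside` of the values fall beyond them (split between tails)."""
    ordered = sorted(values)
    k = int(len(ordered) * outside / 2)
    return ordered[k], ordered[-k - 1]

rng = random.Random(1)  # fixed seed so the run is repeatable
sims = [simulate_vpip(1000, rng) for _ in range(1000)]
low, high = central_bounds(sims, outside=0.05)

observed = 0.13  # hypothetical observed VPiP for the player being tested
consistent = low <= observed <= high
print(f"5% bounds from simulation: {low:.3f} to {high:.3f}")
print(f"observed {observed:.3f} consistent with the model: {consistent}")
```

The same central_bounds step works if the list of values comes from real players with similar styles instead of from a simulation, which is the alternative suggested in the quoted post.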

Hope this helps, and hope I haven't stepped on neverforget's toes (or reply).

Shane