Testing hypothesis that I'm a winning player


fiskebent
01-30-2006, 06:36 AM
I have a fairly small sample (4.4K) of NL50 cash game hands. I want to test the hypothesis that I'm a winning player at this level.

I have won $11 per 100 hands so far with a SD of 30.

Is it correct that I should do a z-test? It's been a while since I took a statistics class, so I'm very rusty on this.

Anyway, as far as I can see, I can calculate z as
z = (mean(x) - 0) / (SD / sqrt(n)) or in my case
z = (11 - 0) / (30 / sqrt(44))
z = 2.432192

Then I calculate the probability of getting 2.432192 or more in a standard normal distribution (mean 0, SD 1) and get 0.75%, which means that I can safely say that I'm a winning player.
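
The arithmetic above can be checked with a few lines of Python. This is only a sketch: the function name `one_sided_z_test` is made up for illustration, and it assumes the 44 blocks of 100 hands are independent and that their mean is roughly normally distributed.

```python
import math

def one_sided_z_test(mean, sd, n, mu0=0.0):
    """Return (z, p) for the null hypothesis 'true mean <= mu0'."""
    se = sd / math.sqrt(n)              # standard error of the mean
    z = (mean - mu0) / se
    # Upper-tail probability of a standard normal: P(Z >= z)
    p = 0.5 * math.erfc(z / math.sqrt(2))
    return z, p

# 4.4K hands = 44 blocks of 100 hands, $11 won per block, SD of 30 per block
z, p = one_sided_z_test(mean=11, sd=30, n=44)
print(f"z = {z:.6f}, one-sided p = {p:.4%}")
```

This should reproduce z of about 2.43 and a one-sided tail probability of about 0.75%.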

Is this correct?

mittman84
01-30-2006, 01:49 PM
This (http://www.svenskpoker.com/math.php?hands=&bb100=&std100=&ci=&btnCalc=calculate) might be something you want to look at.

fiskebent
01-30-2006, 07:04 PM
Thanks for your link.

It gives a double-sided z-test, but it gives about the same result, so I think my method is correct. It's nice to be able to say that I'm a winning player with 99.25% confidence :)

Here's another good site that explains z-tests fairly well along with some examples: http://en.wikibooks.org/wiki/Statistics:Testing_Data/z-tests

Gugel
01-30-2006, 07:58 PM
Is standard deviation displayed somewhere on pt?

fiskebent
01-30-2006, 09:34 PM
I don't know. I don't have pokertracker.

MexKrax
01-30-2006, 10:52 PM
[ QUOTE ]
Is standard deviation displayed somewhere on pt?

[/ QUOTE ]

Yes, on the sessions tab, click more detail.

VickreyAuction
01-31-2006, 02:28 AM
How did you find SD/100 without PT? I've been wondering about my SD/100 for a while.

fiskebent
01-31-2006, 04:43 AM
I play at Pokerroom. If you enable statistics their client creates an SQLite database on your hard disk. I've figured out how to access that database and then I've written a small program to calculate my winnings and SD over each set of 100 hands.
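
The program itself isn't shown in the thread, and the layout of the Pokerroom database isn't described, so the following is only a sketch of the per-100-hands calculation. It assumes the per-hand results have already been read into a plain list in playing order; `per_100_stats` and the random data are hypothetical.

```python
import random
import statistics

def per_100_stats(hand_results, block=100):
    """Sum results over consecutive blocks of hands; return (mean, sd, n_blocks)."""
    totals = [
        sum(hand_results[i:i + block])
        for i in range(0, len(hand_results) - block + 1, block)
    ]  # a trailing partial block is dropped
    return statistics.mean(totals), statistics.stdev(totals), len(totals)

# ASSUMPTION: made-up random results for 4,400 hands, not real hand histories
rng = random.Random(0)
fake_hands = [rng.gauss(0.1, 3.0) for _ in range(4400)]

mean, sd, n = per_100_stats(fake_hands)
print(f"{n} blocks: {mean:.2f} per 100 hands, SD {sd:.2f}")
```

The mean and SD of the block totals are the numbers that go into the z-test, with n equal to the number of blocks (44 for 4.4K hands).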

VickreyAuction
01-31-2006, 04:23 PM
Cool, thanks.

SumZero
02-01-2006, 06:30 AM
[ QUOTE ]
Thanks for your link.

It gives a double-sided z-test, but it gives about the same result, so I think my method is correct. It's nice to be able to say that I'm a winning player with 99.25% confidence :)

Here's another good site that explains z-tests fairly well along with some examples: http://en.wikibooks.org/wiki/Statistics:Testing_Data/z-tests

[/ QUOTE ]

Note though that this method overestimates the confidence that you are a winning player, because it doesn't use all the information available. Namely, we know that poker players as a population are on average losing players (the average player loses the rake). Therefore, it is more likely than the z-test suggests that you are a worse player than your results indicate who has been running well than that you are a better player than your results indicate who has been running badly. As a result, your confidence level will overstate the chances that you are a winning player, since you are drawn from the population of poker players. Only if we knew nothing about the population of poker players, and hence nothing about your skill level other than the sample that produced the x BB/100 and y SD/100, could we properly and accurately use the z-test.

I.e., what you really want is something like a maximum likelihood estimate weighted by the population: for each possible I and J, multiply the probability that I am an I BB/100, J SD/100 player by the probability that an I BB/100, J SD/100 player would produce my results, and see which values of I and J give the highest number.
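
A rough sketch of that search, simplified in two ways that are not in the post: J is held fixed at the observed SD of 30 so that only I is searched, and the population of players is described by a made-up normal distribution (mean -1, SD 5 per 100 hands). Both are assumptions chosen purely for illustration.

```python
from math import sqrt
from statistics import NormalDist

observed, sd, n = 11.0, 30.0, 44    # figures from the first post
se = sd / sqrt(n)                   # standard error of the observed mean

# ASSUMPTION: hypothetical spread of true winrates among all players
population = NormalDist(mu=-1.0, sigma=5.0)

def score(winrate):
    """P(being a player with this winrate) * P(observed result | this winrate)."""
    likelihood = NormalDist(mu=winrate, sigma=se).pdf(observed)
    return population.pdf(winrate) * likelihood

grid = [i / 10 for i in range(-200, 301)]   # candidate winrates -20.0 .. 30.0
best = max(grid, key=score)
print(f"most probable true winrate on the grid: {best:.1f} per 100 hands")
```

With a population centred below zero, the highest-scoring winrate lands below the observed 11 per 100 hands. How far below depends entirely on the assumed population.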

fiskebent
02-01-2006, 07:11 AM
I must admit that I have trouble reading your post. It's not very clear to me what you mean.

But what the rest of the population does is really of no consequence whatsoever. I'm only interested in knowing whether *I'm* winning, not whether I'm worse or better than the average Joe. It really doesn't matter that the numbers come from poker. They could just as easily be air temperature readings from Anchorage.

I agree with you that the average PTBB/100 for all players must be below zero, since the rake has to be paid. But I don't care about that. I'm putting up a hypothesis that I'm really 0 PTBB/100 or worse, not that I'm above average. The numbers show that, if that were the case, results as good as mine would turn up only 0.75% of the time. Since that's pretty unlikely, I can reject the hypothesis that my PTBB/100 is 0 or worse. Usually the likelihood has to be below 5% to statistically reject a hypothesis.
That means that my 'real' PTBB/100 is somewhere above zero and I can say that I'm a winning player.

I could also put up a hypothesis that my real PTBB/100 is 5 or less. Then the math gives 9.2%. That's still fairly unlikely, but since it's above 5% I can't statistically reject the hypothesis.
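
The same test can be run against any null value. The sketch below uses the figures from the first post and also works out the largest winrate that would still be rejected at the 5% level. The helper name `p_value` is made up, and the usual normal approximation is assumed.

```python
from math import sqrt
from statistics import NormalDist

mean, sd, n = 11.0, 30.0, 44        # figures from the first post
se = sd / sqrt(n)
std_normal = NormalDist()           # mean 0, SD 1

def p_value(mu0):
    """One-sided p-value for the null hypothesis 'true winrate <= mu0'."""
    return 1.0 - std_normal.cdf((mean - mu0) / se)

for mu0 in (0, 5):
    print(f"H0: winrate <= {mu0}: p = {p_value(mu0):.4f}")

# Largest mu0 that is still rejected at the 5% level (one-sided)
lower_bound = mean - std_normal.inv_cdf(0.95) * se
print(f"one-sided 95% lower bound: {lower_bound:.2f}")
```

The lower bound comes out at roughly 3.6 per 100 hands, which is why "0 or worse" is rejected while "5 or less" is not.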

jason1990
02-01-2006, 10:20 AM
[ QUOTE ]
I must admit that I have trouble reading your post. It's not very clear to me what you mean.

[/ QUOTE ]
SumZero is (perhaps only implicitly) suggesting you use a Bayesian approach, which is not what you are doing when you apply a straight z-test. In a Bayesian approach, you would assume the parameters of your play (that is, winrate and SD) are random variables. You would assign to them some a priori probability distribution based on whatever assumptions you deemed appropriate. This is called the prior distribution. After playing some hands, you then compute the conditional distribution of those parameters given your prior and your data. This is called the posterior distribution.

SumZero appears to be suggesting that an appropriate prior distribution for your winrate would be one that weighs more heavily on the negative side. His reason is that he knows nothing about you other than that you are a random player, and the average winrate of a random player is negative. If the prior is weighted on the negative side, then the posterior estimate of your winrate will be (maybe only slightly) less than the estimate you are getting.
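
As a minimal sketch of how such a prior changes the answer, assume a normal prior on the winrate and treat the SD of 30 as known, so the posterior has a closed form. The prior used here (mean -1, SD 5 per 100 hands) is invented for illustration and does not come from the thread.

```python
from math import sqrt
from statistics import NormalDist

observed, sd, n = 11.0, 30.0, 44     # figures from the first post
se2 = sd ** 2 / n                    # variance of the observed mean

# ASSUMPTION: hypothetical prior weighted toward the negative side
prior_mean, prior_sd = -1.0, 5.0

# Normal prior + normal likelihood with known variance -> normal posterior
precision = 1 / prior_sd ** 2 + 1 / se2
post_mean = (prior_mean / prior_sd ** 2 + observed / se2) / precision
post_sd = sqrt(1 / precision)

# Posterior probability that the true winrate is above zero
p_winner = 1.0 - NormalDist(post_mean, post_sd).cdf(0.0)
print(f"posterior winrate: {post_mean:.2f} +/- {post_sd:.2f}")
print(f"P(winrate > 0 | data) = {p_winner:.3f}")
```

With this particular prior the posterior winrate is pulled down from 11 and the probability of being a winner comes out around 95%, lower than the 99.25% from the plain z-test. A different prior would give a different number.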

Of course, he actually knows more about you than the fact that you are a random player. He knows you are a poster on 2+2 and you are a player interested in analyzing your game via statistics. So his assumption on the prior might not be valid. Also, it is you who are making the estimate, so it is you who must decide on the prior. Therefore, it is what you know and not what SumZero knows which is relevant to determining the prior.

In any event, the Bayesian approach is simply a different way of making estimates based on mathematical statistics.