Sample Size
#1
I recently began using a mod that tracks results while resource gathering. It's been somewhat enlightening to have my perceived notions of drop rates disproved, and to watch the distribution ebb and flow.

For those with an understanding of statistics greater than my own, here's the question:

At what sample size could one begin to draw reasonable conclusions about drop rates?
#2
The better question, I believe, would be: "At what point does data gathering become unbearable?"
A plague of exploding high-fives.
#3
You mean gathering herbs, etc.? I have no idea how that could possibly be reliable. Sometimes you hit an area that's already been farmed, or even partly farmed by someone taking only what they need for the moment; others, I know, head out for specific herbs, so I can't see how data gathering would give a reliable picture. They improved stranglekelp so much that you can't even swim without getting tangled in it :P. Would you consistently farm the lower-level herbs at higher character levels just to keep the data together? Then again, you may not be talking about gathering professions, and I could be way out there :)
Take care
#4
To be 95% confident that your calculated drop rate is within 1% of the true value you would need to sample 9,604 drops.
#5
*edited*

Missed the relevant thread on first go.
#6
It's completely automated so no "unbearable" tedium involved.

It's also about gaining a better understanding of the game mechanics. A facet of the game that some people frequenting this website are into.
#7
Flymo,Dec 12 2005, 10:09 AM Wrote:To be 95% confident that your calculated drop rate is within 1% of the true value you would need to sample 9,604 drops.

It's fishing, so we're talking drop table lookups just like when slaying beasties.

Flymo, please elaborate on how you came to these numbers. Is this some sort of general statistical principle?
#8
You might want to google for "confidence intervals" and related keywords you come upon. My knowledge of statistics is too rusty to understand/explain it though...
#9
Roo,Dec 12 2005, 11:16 AM Wrote:It's completely automated so no "unbearable" tedium involved.

It's also about gaining a better understanding of the game mechanics.  A facet of the game that some people frequenting this website are into.


Completely automated fishing? That does _NOT_ sound like something people here are into.
#10
oldmandennis,Dec 12 2005, 10:08 PM Wrote:Completely automated fishing?  That does _NOT_ sound like something people here are into.

Heh, I'm pretty sure he means that the information-gathering mod does the tracking for him while he fishes, Oldmandennis, e.g. a modded Gatherer. :)

~Frag
Hardcore Diablo 1/2/3/4 & Retail/Classic WoW adventurer.
#11
Roo,Dec 12 2005, 03:21 PM Wrote:It's fishing, so we're talking drop table lookups just like when slaying beasties.

Flymo, please elaborate on how you came to these numbers.  Is this some sort of general statistical principle?

I'm not an expert in power calculations (determining the sample size needed to detect differences of a certain size from a null hypothesis, in order to perform significance tests that will result in appropriate levels of confidence), but there is an accepted set of methods that are used, and the results Flymo reports sound right under certain circumstances (that's the statement about 95% of trials falling within 1% of the actual value, which translates, I think, to a 5% probability level on an effect size of 1% of some amount). The other thing about sample size calculations is that they are based on asymptotic relationships, such that an increase of 1% in confidence requires many more samples when you go from 94% to 95% than when you go from 50% to 51%. If you're frightened by 9604 drops, you might consider aiming at half that number, which would give you about 83% confidence for that same "within 1% of the actual value." Although I may be off on that; it's been a while.
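As a rough check on the half-sample figure, here is a small Python sketch using the normal approximation to a binomial proportion. The function name and the worst-case rate of p = 0.5 are assumptions for illustration, not anything taken from the tracker mod.

```python
from math import sqrt
from statistics import NormalDist


def confidence_for_sample(n, margin=0.01, p=0.5):
    """Approximate confidence that the observed rate lands within
    +/- margin of the true rate p after n independent drops."""
    standard_error = sqrt(p * (1 - p) / n)
    z = margin / standard_error
    # Two-sided coverage of a standard normal between -z and +z.
    return 2 * NormalDist().cdf(z) - 1


for n in (9604, 4802):
    print(f"{n} drops -> {confidence_for_sample(n):.1%} confidence")
```

Under these assumptions, 9604 drops gives about 95% and half that number comes out at roughly 83%.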

Regardless of the methods, the power calc results only apply to experiments that are repeated under the same controlled circumstances (same fishing hole, same equipment, same char stats) and in which the drops are independent cases (meaning that a drop is not affected by any of the previous drops or the next drops). Basically, theories about frequentist probability only apply to situations where the results of the current trial are not related to the results of any other trial. In gaming, it's possible to limit the number of drops of a certain type to a given character over a given period of time. In that case, the probability can still be computed, but it must be done using more complex methods. The good news is that the results you see in your tracker will show you what the effective probability is, and that's what you want anyway.
ah bah-bah-bah-bah-bah-bah-bob
dyah ah dah-dah-dah-dah-dah-dah-dah-dth
eeeeeeeeeeeeeeeeeeeeeeeeeeee
#12
Roo,Dec 12 2005, 07:21 PM Wrote:Flymo, please elaborate on how you came to these numbers.  Is this some sort of general statistical principle?

I was hoping to get away without explaining, as it is complicated... Essentially it is based on the assumptions that the data follow a normal (Gaussian) distribution, i.e. a bell curve, and that there are an infinite number of possible fishing opportunities, i.e. the population is infinite. Given that, there is a formula relating sample size, confidence interval (the amount of inaccuracy in your result, 1% in the example I gave) and confidence level (the probability that your result is accurate to within the confidence interval, 95% in my example).

You can use a calculator to do the sum for you.
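For anyone who would rather see the sum than trust a calculator, here is a rough Python sketch of the usual normal-approximation formula, n = z^2 * p * (1 - p) / E^2, where E is the confidence interval half-width and z is the normal quantile for the chosen confidence level. The function name and defaults are my own, and p = 0.5 is assumed as the worst case; the calculator mentioned above may compute it differently.

```python
from math import ceil
from statistics import NormalDist


def required_sample_size(margin, confidence=0.95, p=0.5):
    """Samples needed so the estimate is within +/- margin of the true
    rate at the given confidence level (infinite population assumed)."""
    # Two-sided z value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return ceil(z * z * p * (1 - p) / margin ** 2)


print(required_sample_size(0.01))  # 95% confidence, within 1%
```

With a 1% interval at 95% confidence and p = 0.5, this reproduces the 9,604 figure.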

[Edit: corrected typo in URL]
#13
oldmandennis,Dec 12 2005, 08:08 PM Wrote:Completely automated fishing?  That does _NOT_ sound like something people here are into.

Results are compiled while fishing.

User participation is still required to watch for the splash, and if the RNG is pleased, you may catch something.
#14
Flymo,Dec 13 2005, 11:33 AM Wrote:I was hoping to get away without explaining, as it is complicated...essentially it is based on the assumptions that the data follows a normal (Gaussian) distribution, i.e. a bell curve, and there are an infinite number of possible fishing opportunities i.e. the population is infinite.  Given that, there is a formula relating sample size, confidence interval (the amount of inaccuracy in your result, 1% in the example I gave) and confidence level (the probability that your result is accurate to within the confidence interval, 95% in my example).

Thank you, this is exactly the answer I was interested in and why I asked the question here.

I vaguely recall these terms of which you speak, from a previous lifetime :P

Woohoo! Only 8500 to go...
#15
Roo,Dec 13 2005, 10:57 PM Wrote:Thank you, this is exactly the answer I was interested in and why I asked the question here.

I vaguely recall these terms of which you speak, from a previous lifetime  :P

Woohoo!  Only 8500 to go...

There is another way of doing this if you calculate the mean and standard deviation (sd) of your sample of N results. At the 95% confidence level, the lower bound of your value is mean - (sd / sqrt(N)) * 1.96 and the upper bound is mean + (sd / sqrt(N)) * 1.96.
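A minimal Python sketch of that calculation, assuming each cast is logged as 1 (the item dropped) or 0 (it did not). The sample data here is made up for illustration, and the function name is hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def mean_confidence_interval(samples, z=1.96):
    """Return (lower, upper) bounds: mean -/+ z * sd / sqrt(N)."""
    n = len(samples)
    centre = mean(samples)
    half_width = z * stdev(samples) / sqrt(n)
    return centre - half_width, centre + half_width


# Hypothetical log: 100 drops of the item in 1000 casts.
catches = [1] * 100 + [0] * 900
low, high = mean_confidence_interval(catches)
print(f"drop rate between {low:.1%} and {high:.1%}")
```

The default z of 1.96 corresponds to 95% confidence; a different level needs a different multiplier.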
#16
Roo,Dec 13 2005, 02:57 PM Wrote:Thank you, this is exactly the answer I was interested in and why I asked the question here.

I vaguely recall these terms of which you speak, from a previous lifetime  :P

Woohoo!  Only 8500 to go...

Also, note that if you 'reverse' the situation in the calculator, you will see that the ~9600 value is for something with a near 50% rate (worst case). Something with a much lower or higher rate will require a lower sample size for a 95% CI = ±1%.

Because your sample size needed to know the rate within 1% is dependent on the actual drop percentage, you can't actually know the number you need for an accurate sampling until you get a few drops... fortunately you already have a decent idea with your current sampling.

Try putting reverse information in the other calculator on the page (entering your number of samplings + percentage + put in some very very large number for population) and you will see your CI. Then you can change the sample size and see how that adjusts the CI for your particular drop rate of interest. Likely your rate is fairly low (or it probably wouldn't be of interest) so you probably already have a decent approximation.

For example, if your drop rate is ~10% with 1000 drops, then your confidence interval is about ±1.9%; to get it down to ±1% you only need ~3500 samples, not the ~9600 you need at 50%.
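The same arithmetic as a rough Python sketch, again using the normal approximation. The function name is hypothetical and the drop rates are just the examples above.

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(p, n, confidence=0.95):
    """Half-width of the confidence interval for an observed rate p
    after n independent samples (normal approximation)."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return z * sqrt(p * (1 - p) / n)


for n in (1000, 3500, 9604):
    print(f"n={n}: 10% rate -> +/-{margin_of_error(0.10, n):.2%}, "
          f"50% rate -> +/-{margin_of_error(0.50, n):.2%}")
```

Since p * (1 - p) peaks at p = 0.5, any rate further from 50% tightens the interval for the same number of samples.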

If you want even more information on sample sizes, you can look in statistics books under sample sizes for "1 proportion" tests. Books are usually geared towards comparing one sample to a known population; you can look at this specific case by setting the beta error to 0.5 (setting it to 50% effectively removes it from the equation, since it doesn't exist in this specific case) and the alpha error to 0.05 (5% error for a 95% confidence interval).
Conc / Concillian -- Vintage player of many games. Deadly leader of the All Pally Team (or was it Death leader?)
Terenas WoW player... while we waited for Diablo III.
And it came... and it went... and I played Hearthstone longer than Diablo III.