#1
So here's the deal. I've been collecting loot data on monsters in-game and updating the wiki with their info (so far only for low-level monsters, since they are the fastest to kill). Here's an example of one of the monsters I have the most data for: http://wiki.project1999.org/A_Decaying_Dwarf_Skeleton

That page's loot data is from 408 kills (I was farming bone chips for faction). Usually I just collect 100 kills and move on to other monsters. Even for this page, though, we can see a big (at least relative to the percentages) difference in loot percentages among items that likely have the same actual probability of dropping: the 'common loot' likelihood of dropping varies from 0.2% to 2.2%, meaning that some items appear to drop ten times as often as others, though each item, in my opinion, probably just has a 1% chance of dropping, and I just happened to get more of some items than others.

I was thinking about this, and it occurred to me that the percentages I put up on the wiki might be misleading. Yeah, it's definitely better to have P1999 data than EQEmu data, which is almost always wrong, but people might look at a list of loot data and get the wrong idea.

For instance, say you want to farm bone chips. You look at the page for a decaying skeleton and see that they drop bone chips 72.9% of the time. Then you look at the page for a dwarf skeleton and see that they drop bone chips 67.9% of the time. Maybe you conclude that decaying skeletons drop them a little more often, so you should farm them (let's forget for now that decaying skeletons are actually easier to kill in large quantities than dwarf skeletons), even though the data is only based on about 100 kills for each monster, meaning that we can't actually say with confidence that the real chance to drop bone chips is very close to the given percentages.

I ran some numbers the other day. If an item has a 30% chance to drop and you kill the monster that drops it 100 times, then with a likelihood of about 5%, you will find the item dropping below 20% of the time, and with a likelihood of about 5%, you will find it dropping over 40% of the time. So one out of ten such items you see on the wiki will have its drop data off by over 10 percentage points.
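The tail figures quoted above can be rechecked by summing exact binomial probabilities instead of adding them up by hand. This is only a minimal sketch: it assumes every kill is an independent trial with the same drop chance, and the helper names `binomial_pmf` and `binomial_range_prob` are made up for this example.

```ruby
# Exact binomial probabilities, summed term by term (no normal approximation).
# Assumption: each kill is an independent trial with the same drop chance p.

# Probability of exactly k drops in n kills.
def binomial_pmf(n, k, p)
  # Build C(n, k) step by step; every intermediate value is an exact integer.
  coeff = (1..k).reduce(1) { |acc, i| acc * (n - k + i) / i }
  coeff * (p**k) * ((1 - p)**(n - k))
end

# Probability that the number of drops lands in the inclusive range lo..hi.
def binomial_range_prob(n, p, lo, hi)
  (lo..hi).sum { |k| binomial_pmf(n, k, p) }
end

n = 100
p_true = 0.3
below  = binomial_range_prob(n, p_true, 0, 19)   # observed rate under 20%
within = binomial_range_prob(n, p_true, 20, 40)  # observed rate from 20% to 40%
above  = binomial_range_prob(n, p_true, 41, n)   # observed rate over 40%

puts format("below 20%%: %.4f  20%%-40%%: %.4f  above 40%%: %.4f", below, within, above)
```

Changing `n` and `p_true` shows how quickly the spread narrows with more kills, which is the real question behind how many kills a wiki entry needs.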
The way I got these numbers was arduous; I looked at the associated binomial distribution, calculated the probability of getting each number of drops between 20 and 40 (from 100 trials), and added them up.

Here's what I'm looking to do: if I kill a monster 'n' times and it drops some item 'k' times, I want to generate a 95% confidence interval for that item's actual drop rate, i.e. I want to say that there is a 95% chance that it drops between x% of the time and y% of the time. I haven't taken stats for a while so I don't know the best way to do this, and I worry that using smooth approximations of discrete distributions will give me intervals that are inaccurate. However, I also want to be able to do a lot of these computations (hundreds of monsters, each with over 100 kills) quickly. My assumptions are that 'n' is at least 100, and items usually drop at least 1% of the time.

There's also a slight complication: most of the time items will drop either not at all or just once, but sometimes (e.g. with bone chips), they may drop twice or more (spider silks may drop five times off of spiders in East Karana). The mechanism by which this happens, I believe, is that there is a set probability 'p' used for each one dropping, and it's tested however many times. So maybe bone chips have a 50% chance to drop in each of the two times they might drop, so 25% of the time we get no bone chips, 50% of the time we get one, and 25% of the time we get two.

What my parser reports is not the actual probability 'p' of each one dropping, or the probability of at least one dropping, but rather the expected (average) number of items dropped, since this is easy to compute and in my opinion the most useful piece of information. So in the previous case, each one has a 50% chance to drop and 75% of the time we get at least one bone chip, but the expected number of bone chips is 0*25% + 1*50% + 2*25% = 1, so the wiki would show a "likelihood" of 100%.
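To make the bone chip arithmetic concrete, here is a small sketch of the guessed mechanism: a fixed number of independent rolls, each succeeding with probability 'p'. The function name `drop_count_distribution` and the idea that the game works exactly this way are assumptions for illustration, not confirmed server behaviour.

```ruby
# Distribution of how many copies of an item drop when the item gets `rolls`
# independent chances, each with probability p.
# Assumption: this mirrors the mechanism guessed at in the post.
def drop_count_distribution(rolls, p)
  (0..rolls).map do |k|
    # Number of ways to pick which k of the rolls succeed.
    ways = (1..k).reduce(1) { |acc, i| acc * (rolls - k + i) / i }
    [k, ways * (p**k) * ((1 - p)**(rolls - k))]
  end.to_h
end

dist = drop_count_distribution(2, 0.5)
expected = dist.sum { |count, prob| count * prob }  # average items per kill
at_least_one = 1.0 - dist[0]                        # chance of one or more

puts dist.inspect                                   # {0=>0.25, 1=>0.5, 2=>0.25} (formatting varies by Ruby version)
puts "expected drops per kill: #{expected}"         # 1.0
puts "chance of at least one:  #{at_least_one}"     # 0.75
```

The expected value is what the parser reports, which is why it can exceed 100% for items with several rolls.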
These cases would, I'm assuming, complicate the confidence intervals, though most pieces of loot that drop can only drop once, so it's more straightforward. Any stats nerds wanna help me out? Thanks!
__________________
Member of <Divinity>
Estuk Flamebringer - 60 Gnomish Wizard | Kaam Armnibbler - 55 Ogre Shaman | Aftadae Roaminfingers - 54 Halfling Rogue
Aftadai Beardhammer - 50 Dwarven Cleric | Aftae Greenbottom - 49 Halfling Druid
Need a port or a rez? Hit me up on IRC!
Last edited by Estu; 04-18-2013 at 09:49 AM..
#2
I make wiki changes where I come across errors, but it's more to do with things like necro research trivials and stuff like that. I really wish I was better with statistics, but my mind goes back to some really dry maths lessons at school years ago... could never wait to get out of there.
Admire what you're doing though. It's a shame that a lot of people slate the wiki for inaccuracies rather than making the necessary changes as they come across them :/
#3
... And the forum ate my reply, damnit.
Long story short, most programming languages have stats packages that will easily crunch this data for you and can automate pretty much the whole process. You just want to give them the drop records you experienced (if you kill three skeletons and they drop 0, 1, and 2 bone chips, you'd give them "0,1,2") and let them give you the mean (expected outcome from one kill) and standard deviation. 95% confidence of where the actual mean resides is within 2 standard deviations either way, so mean +/- 2*stddev. Alternatively, the standard deviation is simple to calculate: just look it up on wiki and use the sample formula, not the population one. You won't have any issues with items that drop over '100%', as what you're describing is really the expected outcome, not the chance of getting 1 of the item, so just treat it like anything else and people will understand that a 160% +/- 10% drop rate means they should expect 1.6 silks per kill and that you're fairly confident the real average is somewhere between 150% and 170%.
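For anyone doing it by hand, a rough sketch of the mean and the sample (not population) standard deviation in plain Ruby, using the "0,1,2" record from the example. The method names `mean` and `sample_sd` are just picked for this illustration.

```ruby
# One entry per kill: how many of the item dropped on that kill.
# (Illustrative data taken from the three-skeleton example.)
drops = [0, 1, 2]

# Average drops per kill.
def mean(values)
  values.sum.to_f / values.size
end

# Sample standard deviation: divides by (n - 1) rather than n.
def sample_sd(values)
  m = mean(values)
  Math.sqrt(values.sum { |v| (v - m)**2 } / (values.size - 1))
end

puts mean(drops)       # 1.0
puts sample_sd(drops)  # 1.0
```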
Last edited by seped; 04-18-2013 at 02:45 PM..
Reason: adding other option.
#4
Is it really as simple as using the standard deviation? Doesn't that assume a normal distribution?
#5
Not really, this seems like a perfect example of the central limit theorem: the average of many independent random variables drawn from an unknown distribution will be approximately normally distributed.
That aside, do we have any strong assumption the drop rates aren't independent and normal? Also, I goofed in my above post: you want +/- 2 * the standard error, not the standard deviation. It's just stddev divided by sqrt(sample size). For example, in Ruby:
Code:

    irb(main):001:0> require 'statsample'
    => true
    irb(main):002:0> test = 100.times.map { rand < 0.1 ? 1 : 0 }
    => [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0]
    irb(main):004:0> test = test.to_scale
    => Vector(type:scale, n:100)[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,1,0,0,0,0,1,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0]
    irb(main):005:0> test.mean
    => 0.11
    irb(main):006:0> test.sd
    => 0.3144660377352203
    irb(main):007:0> t_1 = Statsample::Test::T::OneSample.new(test, {:u => test.mean})
    irb(main):009:0> puts t_1.summary
    = One Sample T Test
      Sample mean: 0.1100 | Sample sd: 0.3145 | se : 0.0314
      Population mean: 0.1100
      t(99) = 0.0000, p=1.0000 (both tails)
      CI(95%): -0.0624 - 0.0624
    => nil
    irb(main):010:0>
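The same "mean +/- 2 * standard error" idea can be sketched without any gem. This is a rough normal-approximation version under the assumption that kills are independent; `mean_with_margin` is a name invented for this example, and the data is a made-up record with 11 drops in 100 kills to match the sample mean in the session above.

```ruby
# Normal-approximation interval for the average drops per kill:
# mean +/- 2 * standard error, where SE = sample sd / sqrt(n).
def mean_with_margin(values)
  n = values.size
  mean = values.sum.to_f / n
  variance = values.sum { |v| (v - mean)**2 } / (n - 1)  # sample variance
  std_error = Math.sqrt(variance / n)
  [mean, 2 * std_error]
end

# Hypothetical record: 11 kills with a drop, 89 without.
drops = Array.new(11, 1) + Array.new(89, 0)
mean, margin = mean_with_margin(drops)
puts format("%.1f%% +/- %.1f%%", 100 * mean, 100 * margin)
```

For this record the margin comes out at a bit over 6 percentage points, in line with the 0.0624 half-width that the t-test summary reports (the t-test uses a slightly smaller multiplier than 2).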
#6
Certainly we can assume the drop rates are independent. Of course they're not normal; they follow a binomial distribution. My worry with the central limit theorem is that the binomial distribution would not be close enough to normal, especially for small values of 'p' (e.g. 1%, which we do see a lot of), but maybe it would be OK, or at least good enough for the wiki. Thanks for the help!
#7
Sorry I misspoke, meant to just say independent.
Quote:
Useful link: http://en.wikipedia.org/wiki/Binomia...dence_interval
What we're doing here is explicitly finding a confidence interval for a binomial proportion.
Last edited by seped; 04-18-2013 at 04:36 PM..
Reason: adding link
#8
|
Quote:
Then it seems that if we set n=100 and p=0.01, it fails all the given tests for whether 'n' is high enough for the normal distribution to be a good approximation (on the other hand, if n=100 and p=0.3, then we seem to get a good approximation). That Wilson score interval looks pretty good, though. Maybe I'll use that. Thanks for the link!
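For reference, a minimal sketch of the Wilson score interval next to the plain normal-approximation interval, for 'k' drops in 'n' kills. The function names are made up here, and z = 1.96 is the usual normal quantile for a 95% interval; treat this as an illustration of the formula from the linked article rather than a vetted implementation.

```ruby
# Wilson score interval for a drop rate, given k drops in n kills.
def wilson_interval(k, n, z = 1.96)
  p_hat = k.to_f / n
  denom = 1 + z**2 / n
  center = (p_hat + z**2 / (2 * n)) / denom
  half_width = z * Math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
  [center - half_width, center + half_width]
end

# Plain normal-approximation interval, for comparison.
def wald_interval(k, n, z = 1.96)
  p_hat = k.to_f / n
  half_width = z * Math.sqrt(p_hat * (1 - p_hat) / n)
  [p_hat - half_width, p_hat + half_width]
end

# Compare a rare drop (1 in 100) with a common one (30 in 100).
[[1, 100], [30, 100]].each do |k, n|
  wilson = wilson_interval(k, n).map { |x| (100 * x).round(2) }
  wald   = wald_interval(k, n).map { |x| (100 * x).round(2) }
  puts "#{k}/#{n}: Wilson #{wilson.inspect}%, normal approx #{wald.inspect}%"
end
```

With 1 drop in 100 kills the normal approximation gives a negative lower bound, while the Wilson interval stays between 0% and 100%. At 30 drops in 100 kills the two intervals are close.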
Last edited by Estu; 04-18-2013 at 04:46 PM..
#9
|
Indeed, but the downside of getting a bad approximation is nothing other than getting something unhelpful like "Item drops 4% (+/-4%)", which is still useful, since it tells you that the wiki doesn't have enough data to say anything more than "this item probably has a drop rate below 8%".
#10
Quote: