I want a function that will give me a value according approximately to a normal distribution. In other words, if I were to call the function lots of times and produce a graph of the incidence of a value, it'd produce a normal curve.
Not really sure what to google for exactly, but it should be fairly straightforward, no? Any ideas?
Try a Gaussian random number generator
in excel
=NORM.INV(RAND(),2,3)
Plot a frequency histogram
You want to generate series of random numbers that are normally distributed according to a given mean and variance?
The [url= http://en.wikipedia.org/wiki/Box%E2%80%93Muller_transform ]Box-Muller transform[/url] is the usual way to do this
How do you want to do this? C/C++? Excel?
Some links to start with :
[url= http://en.wikipedia.org/wiki/Normal_distribution#Generating_values_from_normal_distribution ]http://en.wikipedia.org/wiki/Normal_distribution#Generating_values_from_normal_distribution[/url]
[url= http://stackoverflow.com/questions/2325472/generate-random-numbers-following-a-normal-distribution-in-c-c ]http://stackoverflow.com/questions/2325472/generate-random-numbers-following-a-normal-distribution-in-c-c[/url]
[url= http://smallbusiness.chron.com/generate-random-variable-normal-distribution-excel-74203.html ]http://smallbusiness.chron.com/generate-random-variable-normal-distribution-excel-74203.html[/url]
Rusty - yes. Using Java.
There's an Apache Commons math function for it, but I can't use libraries I don't already have, and I only seem to have Math 1.2.
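Since Java is the target and extra libraries are out, here is a minimal sketch of the basic (trigonometric) form of the Box-Muller transform linked above, using only java.lang.Math. The class and method names are made up for illustration, and it assumes Math.random() is a good enough uniform source for test data.

```java
// Sketch of the basic (trigonometric) Box-Muller transform.
// BoxMullerSketch and sampleStandardNormal are illustrative names only.
final class BoxMullerSketch {

    // Returns one sample that should follow a standard normal
    // distribution (mean 0, standard deviation 1).
    static double sampleStandardNormal() {
        // Math.random() is in [0, 1); using 1 - random() puts u1 in (0, 1],
        // so Math.log(u1) is never -Infinity.
        double u1 = 1.0 - Math.random();
        double u2 = Math.random();
        double radius = Math.sqrt(-2.0 * Math.log(u1));
        double angle = 2.0 * Math.PI * u2;
        return radius * Math.cos(angle);
    }

    public static void main(String[] args) {
        // Print a few samples to eyeball the spread around 0.
        for (int i = 0; i < 5; i++) {
            System.out.println(sampleStandardNormal());
        }
    }
}
```

Each call burns two uniform randoms and returns one normal value; the matching sine term would give a second independent value if you wanted to cache it.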
Summing two random numbers should generate a normal distribution, assuming your generator is correct.
Summing two random numbers should generate a normal distribution
Not starting with uniform randoms (ie your basic built-in PRNG). If the two inputs are already normal then yes, but why bother 🙂
The sum of N uniform randoms approaches normal as N goes to infinity, but for N=2 you only have a triangular distribution.
Imagine I'm generating a random set of people for test data. I want to give them a height, but to be realistic the height should be on a normal distribution rather than actually random.
The sum of N uniform randoms approaches normal as N goes to infinity, but for N=2 you only have a triangular distribution.
Good point, hadn't thought it through fully... Without getting to infinity N=100 might be good enough for molgrips, though. What do you need to do with your distribution?
Sounds like all that's needed is a vaguely normal-ish spread between some reasonable min and max heights, so small N will be fine (see [url= http://en.wikipedia.org/wiki/Irwin%E2%80%93Hall_distribution ]http://en.wikipedia.org/wiki/Irwin-Hall_distribution[/url]).
Say you have a uniform generator in [0,1]: use N=3 and sum, giving a distribution in [0,3].
Add that to a min height (say 4 feet) and you have heights in the range 4 to 7 feet with a peak in the middle at 5' 6".
Or bodge the numbers to suit 🙂
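For what it's worth, the N=3 recipe above could be sketched in Java like this. The class, method and constant names are invented for the example; the 4 foot minimum and N=3 are the figures from the post, and the result is only a rough bell shape (an Irwin-Hall distribution), not a true normal.

```java
// Sketch of the "sum of N uniforms" approach for fake heights.
// IrwinHallHeightSketch and its members are illustrative names only.
final class IrwinHallHeightSketch {

    static final double MIN_HEIGHT_FEET = 4.0; // minimum height from the post
    static final int N = 3;                    // number of uniforms to sum

    // Returns a height in [4, 7) feet, peaking around 5.5 feet.
    static double sampleHeightFeet() {
        double sum = 0.0;
        for (int i = 0; i < N; i++) {
            sum += Math.random(); // each term is uniform in [0, 1)
        }
        return MIN_HEIGHT_FEET + sum;
    }

    public static void main(String[] args) {
        for (int i = 0; i < 5; i++) {
            System.out.println(sampleHeightFeet());
        }
    }
}
```

One upside over a real normal distribution for test data is that the values are hard-bounded, so nobody ends up 2 feet or 9 feet tall.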
/hack solution
One Excel workbook with 1000 cols x 10,000 rows, using Excel's random generation to do the normal dist, and just read from a column as required.
Sounds like all that's needed is a vaguely normal-ish spread between some reasonable min and max heights,
Yes, absolutely.
What do you need to do with your distribution?
As above - faking real people for test data. Heights is just an example - it's not actually height in this case. I've no idea if what I am modelling is actually a normal distribution anyway.. it probably isn't.. but it'll do 🙂 Values I'm looking for are integers as it happens.
Hell, in that case see my hack: just generate enough data, load it somewhere (a DB or a big array) and pick values from it.
No, I want a function that does it.
fair enough, higher level question is why? If it's just for test data do it the simple way.
I use excel to knock up most of my test random data because it's quick and easy
And repeatable, which is always nice when you're testing.
Because I want to run, have it generate tons of data and drop it on a queue without me having to do anything. I don't want to be farting about with excel when a few lines of Java would do the same thing without intervention.
Also don't care about repeatability - this is just noise data, the real test cases will be inserted into it.
Yep, fix the numbers and loop through the sheet. That's even more repeatable: when we use RNS we loop back through the same stream starting at the same point, and unless you can make your function start at the same point you're at a loss. You can create 10,000 x 1,000 random numbers from a normal dist, paste them into one sheet and reuse it. If it's testing then you will be using real data, so there's no need for the function. Unless you actually want to solve the problem of finding the function, just solve the problem of creating the random data.
And repeatable, which is always nice when you're testing.
Surely it's easier to just seed the generator with a constant?
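A rough sketch of what seeding with a constant might look like in Java. Math.random() can't be seeded, so this assumes java.util.Random from the standard JDK, whose nextGaussian() returns values with mean 0 and standard deviation 1. The class and method names are invented for the example.

```java
import java.util.Random;

// Sketch of a repeatable stream of normal values via a fixed seed.
// SeededNoiseSketch and sample are illustrative names only.
final class SeededNoiseSketch {

    // Returns `count` standard normal values; the same seed always
    // yields the same sequence.
    static double[] sample(long seed, int count) {
        Random rng = new Random(seed);
        double[] out = new double[count];
        for (int i = 0; i < count; i++) {
            out[i] = rng.nextGaussian();
        }
        return out;
    }

    public static void main(String[] args) {
        double[] first = sample(42L, 3);
        double[] second = sample(42L, 3);
        // Both runs print identical values because the seed is the same.
        for (int i = 0; i < first.length; i++) {
            System.out.println(first[i] + " " + second[i]);
        }
    }
}
```

Swap the constant for something like System.nanoTime() when repeatability doesn't matter.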
Rusty90's links are IMO the way to go here. The polar form of Box-Muller is less than 10 lines of code, and here is a C implementation (from Stack Overflow) which shouldn't require much tweaking in Java, other than maybe the math function and constant names:
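// Polar (Marsaglia) form of the Box-Muller transform.
double sampleNormal() {
    // Pick a point uniformly in the square [-1, 1) x [-1, 1).
    double u = Math.random() * 2 - 1;
    double v = Math.random() * 2 - 1;
    double r = u * u + v * v;
    // Reject points at the origin or on/outside the unit circle and try again.
    if (r == 0 || r >= 1) return sampleNormal();
    double c = Math.sqrt(-2 * Math.log(r) / r);
    return u * c;
}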
I saw that one, chambord. Looks neat and tidy, but it's recursive. Will give it a try though.
That presumably gives numbers between 0 and 1 with average 0.5?
Yes, it's recursive in order to throw away values that fall outside the unit circle, so it will make a recursive call roughly 2 times in 10 (back-of-mental-envelope maths here 🙂; comparing the circle's area to the square's gives 1 - π/4, about 21%).
EDIT: I believe mean of 0 with stdev of 1.
So I can multiply the figure by whatever SD I want, then add an offset to create the mean I want. Great, thanks 🙂
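Something like the following sketch would do the scale-and-offset step and round to an integer, since integers are what's wanted. To keep it self-contained, java.util.Random.nextGaussian() from the standard JDK stands in for the sampleNormal() posted above; the class and method names and the 170 / 10 figures are made up for illustration.

```java
import java.util.Random;

// Sketch: turn a standard normal sample (mean 0, SD 1) into an integer
// with a chosen mean and SD. ScaledNormalSketch is an illustrative name.
final class ScaledNormalSketch {

    // Multiply by the desired SD, add the desired mean, round to nearest int.
    static int sampleInt(Random rng, double mean, double stdDev) {
        double standard = rng.nextGaussian(); // stand-in for sampleNormal()
        return (int) Math.round(mean + stdDev * standard);
    }

    public static void main(String[] args) {
        Random rng = new Random();
        // Hypothetical target: mean 170, SD 10.
        for (int i = 0; i < 5; i++) {
            System.out.println(sampleInt(rng, 170.0, 10.0));
        }
    }
}
```

A true normal has no hard limits, so if the values need to stay within a min and max you would have to clamp or resample the occasional outlier.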