So I've written some code to generate random variables, using the polar form of the Box-Muller method.
My issue is that although the values make sense for small enough sample counts, once tens or hundreds of millions of values are generated, the maximum never gets beyond about 5.1 standard deviations. It seems to be capped at 5.1 standard deviations.
And I'm beginning to fear this is an issue with the limited precision VBA uses for storing the results of calculations.
Here is my code:
Code:
Function gauss()
    Dim numSamples As Long, i As Long
    Dim x1 As Double, x2 As Double, w As Double, max As Double
    numSamples = 1000000
    max = -1
    For i = 0 To numSamples / 2 - 1
        Do
            x1 = 2 * Rnd - 1 'basically flatly distributed on [-1, 1)
            x2 = 2 * Rnd - 1 'same
            w = x1 * x1 + x2 * x2
        Loop While w >= 1 Or w = 0 'reject points outside the unit disc; also reject w = 0, which would make Log fail
        w = Sqr((-2 * Log(w)) / w)
        If x1 * w > max Then max = x1 * w
        If x2 * w > max Then max = x2 * w
    Next i
    MsgBox "Max: " & max & " s.d."
End Function
I believe this function generates normally distributed values centered around 0 with s.d. = 1. That's why max is the maximum number of standard deviations above the mean for any of the generated values.
Notice that for 50 samples, this generates stuff roughly in the 2 standard deviation range. (correct)
For 1000 samples, it generates stuff roughly in the 3 standard deviation range. (correct)
It continues to be correct up to and including about 5 standard deviations. Then it hits about 5.1 standard deviations and doesn't increase at all anymore, despite going up to 100,000,000 samples.
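For comparison, here is a quick sketch of the same polar Box-Muller rejection loop in Python (the function name `gauss_max` is my own; it uses Python's double-precision `random.random()` rather than VBA's `Rnd`). Running it lets you check what maximum the method itself produces when the uniform source has more precision:

```python
import math
import random

def gauss_max(num_samples: int) -> float:
    """Polar Box-Muller: return the largest generated value, in standard deviations.

    Mirrors the VBA loop above: each pass through the rejection loop yields
    two standard normal deviates, so we iterate num_samples // 2 times.
    """
    max_val = -1.0
    for _ in range(num_samples // 2):
        while True:
            x1 = 2 * random.random() - 1  # uniform on [-1, 1)
            x2 = 2 * random.random() - 1
            w = x1 * x1 + x2 * x2
            if 0 < w < 1:  # keep points strictly inside the unit disc
                break
        w = math.sqrt(-2 * math.log(w) / w)
        max_val = max(max_val, x1 * w, x2 * w)
    return max_val

print(gauss_max(1_000_000))
```

For a million samples this typically lands somewhere near 5 standard deviations, which matches the growth pattern described above.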
Can anyone point out why this may be occurring?