• Hi All

    Please note that at the Chandoo.org Forums there is Zero Tolerance to Spam

    Post Spam and you Will Be Deleted as a User

    Hui...

  • When starting a new post, to receive a quicker and more targeted answer, Please include a sample file in the initial post.

How to normalize into Probability Density (y axis) of bell curve?

AitchK

Member
YI have a standard deviation aka bell curve of some data. The bar lines that I defined for the x-axis are 0-100 and the formula I used to get the Y-axis is =($B$2/($C$2*SQRTPI(2)))*EXP(-(((E2-$A$2)^2)/(2*($C$2)^2))). This has worked great to get me a bell curve distribution of my data.
But, the y-axis is showing arbitrary numbers as a result of that formula and my boss would like those numbers to be a value between 1-100 as well.
I don't quite grasp this all yet, but if I could get help doing this from someone on this forum I'd appreciate it.
Boss needs P(x)=1 for X-axis. How to do?
 
Last edited:

iferror

Member
AitchK the formula as it is has a peculiarity: the area under the curve equals the total count ($B$2).

You can 'normalize' the bell so that it has unitary area
=(1/($C$2*SQRTPI(2)))*EXP(-(((E2-$A$2)^2)/(2*($C$2)^2)))
but you'll never have the value you want... unless you do something statistically meaningless and just arithmeticaly normalize the curve

=(1/($C$2*SQRTPI(2)))*EXP(-(((E2-$A$2)^2)/(2*($C$2)^2)))*(100/MAX((1/($C$2*SQRTPI(2)))*EXP(-(((XVALRANGE-$A$2)^2)/(2*($C$2)^2)))) (Control+Shift+Enter)

XVALRANGE is the WHOLE range of data you have as X VALUES.

You'll have values from 0 to 100...but they'll sort of loose any statistical meaning
 

iferror

Member
been thinking....no need to array enter the formula as we already know where the MAX is ;)

=100*EXP(-((P2-$L$1)^2)/(2*($M$1^2)))
 

AitchK

Member
Thanks Hui and ifError. I also thought it is meaningless to 'normalize the curve' so that values are 1-100 because the formula's function is only to give x/y coordinates such that the distribution can be shown in terms of a bell curve. Right?
I think the ask was not a good one but being that I didn't have experience, I thought there was something more to the request that others would understand.

I didn't understand iferror's 'been thinking comment'....
no need to array enter the formula as we already know where the MAX is

=100*EXP(-((P2-$L$1)^2)/(2*($M$1^2)))

if you have time to explain?
 

iferror

Member
Sure :)

When you 'normalize' you divide by the max value of the function at hand, so that the max is 1 (and then multiply by some factor if you want to normalize to that number. i.e 100 in your case). And this is what the first formula does... brute force :) Array-entered, because you're passing an array of values to MAX
The latter is a bit more elegant. As the MAX of a bell curce is ALWAYS (by definition) at x=mean, the MAX value of a bell curve is ALWAYS 1/(stdev*(sqrtpi(2))
meaning that

100*((1/(stdev*(sqrtpi(2)))*exp(...))/MAX(((1/(stdev*(sqrtpi(2)))*exp(...)))=
100*((1/(stdev*(sqrtpi(2)))*exp(...))/(1/(stdev*(sqrtpi(2))) =
100*exp(...)

and that is what the second formula does.
 
Top