Expectation values of squared deviations from the expectation
Remember that when we squared the binomial random variable X, we got
a new random variable Y with a new distribution. It was not difficult
to find what the probabilities were, since if X=2 say, Y=4 and whenever
Y=4, X had to be 2. We found the expectation of Y just like we would
for any other random variable.
We will need to denote the expectation value of X by EX and the
expectation value of Y by EY.
Now it is time to consider a third random variable. We take X,
subtract the expectation value of X from each of its values, and
square the result. What does this tell us? Remember that the
expectation value is the "center of gravity" of the distribution;
the average of a large sample taken from X will be
close to this expectation value. So when we subtract the expectation
of X from the values of X and square the result, we have the squared
deviation of X from its own expectation value.
So we define U=(X-EX)^2. We take
values of X, subtract off whatever the expectation of X is, and square
the answer. Every time we do this calculation for some random variable
X, we have a lot of different values for X, but the expectation of X
stays the same.
We will keep using the binomial example (it makes the calculations
simpler). We have been using the value N=3 and p=0.6; we found that
the expectation value is 1.8=Np.
Let's calculate the values of the random variable U we just defined.
Here the values that are relevant are squared deviations of X from the
expectation value of X, and these values are (0-1.8)^2,
(1-1.8)^2, (2-1.8)^2, and
(3-1.8)^2, or 3.24, 0.64, 0.04, and 1.44.
The squared deviation from the expectation takes the value 3.24 whenever
X takes the value 0, the value 0.64 whenever X takes the value 1, the
value 0.04 whenever X takes the value 2, and the value 1.44 whenever X
takes the value 3.
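The arithmetic above can be checked with a short R sketch. The variable names (N, p, EX, x, u) are chosen here for illustration and are not part of the original notes.

```r
# Binomial example from the text: N = 3 trials, success probability p = 0.6
N <- 3
p <- 0.6

EX <- N * p        # expectation value of X, which is 1.8
x <- 0:N           # the values X can take: 0, 1, 2, 3
u <- (x - EX)^2    # squared deviation of each value of X from EX

# One row per value of X, with the matching value of U beside it
print(data.frame(x = x, u = u))
```

The u column should reproduce 3.24, 0.64, 0.04, and 1.44, up to floating-point rounding.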
U takes the values in {3.24, 0.64, 0.04, 1.44} and no others. U is
a random variable in its own right, and these are its values. What is
the probability that U=3.24? It must be the same as the probability
that X=0 because X=0 when and only when U=3.24. The same holds true for
all the others. Remember, U is a random variable in its own right,
even though its values and probability distribution have been derived
from X, the original binomial random variable.
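As a rough sketch of this point, the distribution of U can be tabulated in R by pairing each value of U with the probability of the value of X that produced it. This relies on the four values of U being distinct, which holds in this example; the variable names are illustrative.

```r
# Worked example: binomial with N = 3 and p = 0.6
N <- 3
p <- 0.6
x <- 0:N
u <- (x - N * p)^2

# P(X = x) from the binomial distribution
px <- dbinom(x, size = N, prob = p)

# Each value of U arises from exactly one value of X here,
# so P(U = u) is simply P(X = x) for the matching x.
stopifnot(length(unique(round(u, 10))) == length(x))
pu <- px

print(data.frame(u = u, prob = pu))
```

The probabilities in the table add up to 1, as they must for any random variable.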
How can we calculate the expectation value of U, which we will
denote EU? We take all the values
of U, multiply each by its probability, and add it all together. We
will have EU=(3.24)P(U=3.24)+(0.64)P(U=0.64) + (0.04)P(U=0.04)
+ (1.44)P(U=1.44). But we know how to find these probabilities
for U, as just mentioned. So this becomes
EU=(3.24)P(X=0)+(0.64)P(X=1)+(0.04)P(X=2)+(1.44)P(X=3).
And that's all there is to it. Where do you get these probabilities
for X? They are found from the binomial distribution in this case.
Whatever the original distribution for X is, we can use the exact same
principle to find the expectation value for U. Exercise: go ahead and
plug in the values of P(X=x) for the four values of x and find out the
numerical answer for the computation (it should be 0.72).
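For readers who want to check the exercise by machine, the sum can be written in a few lines of R. This is only a sketch of the calculation described above, with illustrative variable names.

```r
# Worked example: binomial with N = 3 and p = 0.6
N <- 3
p <- 0.6
x <- 0:N

u <- (x - N * p)^2                    # values of U: 3.24, 0.64, 0.04, 1.44
px <- dbinom(x, size = N, prob = p)   # P(X = 0), ..., P(X = 3)

# EU = sum over x of (value of U) * P(X = x)
EU <- sum(u * px)
print(EU)
```

The four terms are 0.20736, 0.18432, 0.01728, and 0.31104, which add up to 0.72.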
Homework problem: Take a binomial with N=4 and p=0.2
and calculate the expectation value of the squared deviation from the
expected value of the random variable itself.
First you will need to calculate the expectation value
of the random variable itself, which is N*p. Then make a table whose
first column is all the values of X. Then let your second column be
the corresponding values of X-EX (take each X and subtract off its
expectation value). Let your third column be the square of the second
column, which is U=(X-EX)^2.
Let your fourth column be P(X=x), which is the chance that X
takes the value in that row (each row has a particular value that the
random variable could take). The fifth column should be P(U=u).
The sixth column should be the product of the third and fifth columns,
in other words each value of U times the probability that U would take
that particular value. When you add up all the values in your sixth
column, you should get the expectation value of U, and it should be
0.64. Since U is defined to be the squared deviation of X from the
expectation of X, the expectation of U is the expectation of the
squared deviation of X from the expectation of X.
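The six-column table can also be laid out in R. The sketch below fills it in for the worked example (N=3, p=0.6) rather than the homework values. The function name deviation_table and the column names are made up for illustration, and the fifth column assumes each value of U comes from exactly one value of X, which the function checks.

```r
# Hypothetical helper: builds the six-column table described in the text
deviation_table <- function(N, p) {
  x <- 0:N                              # column 1: values of X
  EX <- N * p                           # expectation value of X
  dev <- x - EX                         # column 2: X - EX
  u <- dev^2                            # column 3: U = (X - EX)^2
  px <- dbinom(x, size = N, prob = p)   # column 4: P(X = x)

  # Column 5 equals column 4 only when no two values of X give the same U
  stopifnot(length(unique(round(u, 10))) == length(x))
  pu <- px                              # column 5: P(U = u)

  data.frame(x = x, dev = dev, u = u, px = px, pu = pu,
             u_times_pu = u * pu)       # column 6: U times P(U = u)
}

tab <- deviation_table(3, 0.6)
print(tab)
print(sum(tab$u_times_pu))   # total of the sixth column, which is EU
```

Calling deviation_table(4, 0.2) produces the homework table in the same layout.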
Try this computer experiment. Return to the binomial example I
worked out, where the expected value of the
squared deviation from the mean is 0.72. If I generate a lot
of binomial random values, subtract the expected value of
the random variable from each, square the results, and average them,
I ought to get something close to 0.72, because the average
of a lot of things ought to be close to the expectation value for
those things. The average of many values of a random variable
should be near the expectation value of the random variable; the
average of many squared values of the random variable should
be near the expected value of the squared random variable; and
the average of many squared deviations from the expectation
value of a random variable
should be near the expected squared deviation from the
expectation value of the random variable.
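> u1 <- rbinom(10000,size=3,prob=0.6)   # 10000 simulated values of X
> u2 <- u1-3*0.6                        # subtract EX = N*p = 1.8
> u3 <- u2*u2                           # squared deviations, i.e. values of U
> mean(u3)                              # should be close to 0.72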
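A slightly extended version of the same experiment checks all three "average is near expectation" statements at once. The seed and the sample size are arbitrary choices made here so the run is repeatable; they are not from the original notes.

```r
set.seed(1)    # arbitrary seed, only so the run can be repeated
n <- 100000    # number of simulated values (an assumption)

x <- rbinom(n, size = 3, prob = 0.6)
EX <- 3 * 0.6

avg_x  <- mean(x)            # average of the values of X
avg_x2 <- mean(x^2)          # average of the squared values
avg_u  <- mean((x - EX)^2)   # average of the squared deviations

# Exact expectation of the squared variable, for comparison
exact_x2 <- sum((0:3)^2 * dbinom(0:3, size = 3, prob = 0.6))

cat("mean of X:        ", avg_x,  " (expectation 1.8)\n")
cat("mean of X^2:      ", avg_x2, " (expectation", exact_x2, ")\n")
cat("mean of (X-EX)^2: ", avg_u,  " (expectation 0.72)\n")
```

Each average will differ a little from its expectation from run to run, and the gap tends to shrink as n grows.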
The expected value of the squared deviation from the expectation value
of the random variable itself is called the variance. It can
be shown that the variance of a binomial random variable with N trials
and success probability p on each trial is N*p*(1-p). Verify that this
formula works for each of the two binomial examples (the one I worked
here and also the homework example). More information on the variance
appears on the next page.
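One way to carry out this check in R is to compute the variance directly from its definition and compare it with N*p*(1-p). The two function names below are invented for this sketch.

```r
# Variance from the definition: sum of (x - EX)^2 * P(X = x)
binom_var_direct <- function(N, p) {
  x <- 0:N
  sum((x - N * p)^2 * dbinom(x, size = N, prob = p))
}

# Variance from the closed-form formula quoted in the text
binom_var_formula <- function(N, p) {
  N * p * (1 - p)
}

# Worked example (N = 3, p = 0.6) and homework example (N = 4, p = 0.2)
for (case in list(c(3, 0.6), c(4, 0.2))) {
  N <- case[1]
  p <- case[2]
  cat("N =", N, " p =", p,
      " direct =", binom_var_direct(N, p),
      " formula =", binom_var_formula(N, p), "\n")
}
```

The two columns should agree, giving 0.72 for the worked example and 0.64 for the homework example, up to floating-point rounding.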