Today we will continue talking about mathematical expectation.
As always, assume we have a probability space $(\Omega, \mathbb{P})$ with probability distribution $\{ p_1, p_2, ... \}$. Let's even assume (for simplicity) that there are just finitely many simple events, so the probability distribution is $\{ p_1, p_2, ..., p_n \}$ (we call such a probability space finite). Then, for any random variable $X: \Omega \to \mathbb{R}$, we can define \[ \mathbb{E}[X] = p_1 X(\omega_1) + p_2 X(\omega_2) + ... + p_n X(\omega_n) \] and we call it the mathematical expectation of the random variable $X$. This is the definition we worked with in the previous classwork; we will now move on to proving some useful properties of it.
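To make the definition concrete, here is a minimal sketch in Python that computes $\mathbb{E}[X]$ exactly as the weighted sum above; the distribution and the values of $X$ are made up purely for illustration:

```python
# A toy finite probability space: outcomes with their probabilities,
# and a random variable X given by its value on each outcome.
outcomes = [1, 2, 3, 4]                        # omega_1, ..., omega_4 (made up)
probs    = [0.1, 0.2, 0.3, 0.4]                # p_1, ..., p_4, must sum to 1
X        = {1: 10.0, 2: -5.0, 3: 0.0, 4: 2.5}  # X(omega_k), arbitrary values

# E[X] = p_1 X(omega_1) + ... + p_n X(omega_n)
expectation = sum(p * X[w] for p, w in zip(probs, outcomes))
print(expectation)  # 0.1*10 + 0.2*(-5) + 0.3*0 + 0.4*2.5 = 1.0
```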
Exercise 1.
Let $(\Omega, \mathbb{P})$ be a finite probability space.
Explanation and comments
The first two parts are basically about rearranging terms in a sum; the last one is a bit more original:
The first two parts of the last exercise are what is usually referred to as the linearity of mathematical expectation. In short, it can be written as $\mathbb{E}[a \cdot X + b \cdot Y] = a \cdot \mathbb{E}[X] + b \cdot \mathbb{E}[Y]$, where $a, b$ are constants and $X, Y$ are any two random variables.
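The proof is just a rearrangement of sums, but one can also sanity-check the property numerically. Below is a small sketch (the distribution, the variables and the constants are all arbitrary) verifying that both sides of the equality agree:

```python
probs = [0.2, 0.5, 0.3]   # p_1, p_2, p_3 on three outcomes (made up)
X = [1.0, 4.0, -2.0]      # X(omega_k), arbitrary values
Y = [0.5, 0.0, 3.0]       # Y(omega_k), arbitrary values
a, b = 2.0, -1.0          # arbitrary constants

def E(values):
    """Expectation of a random variable given by its values on the outcomes."""
    return sum(p * v for p, v in zip(probs, values))

lhs = E([a * x + b * y for x, y in zip(X, Y)])  # E[aX + bY]
rhs = a * E(X) + b * E(Y)                       # aE[X] + bE[Y]
print(lhs, rhs)                                 # both print 2.2
```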
Exercise 2.
You roll two fair dice. Let $T$ be a random variable representing the maximum of the numbers that you roll and let $M$ be a random variable representing the minimum of the numbers that you roll. Knowing that $\mathbb{E}[T] = \frac{161}{36} \approx 4.47$ (as we learned during the last lesson) and that the expected value of a single die roll is $3.5$, quickly find the expected value of $M$.
Explanation and comments
The point of this exercise is not to write down the definition of maths expectation and do a bunch of boring calculations (like we did in the previous classwork when calculating $\mathbb{E}[T]$), but rather to get to the answer quickly by using the linearity of expectation.
Let $R_1$ be a random variable that represents the first roll, and $R_2$ a random variable that represents the second roll. The key idea is that $M + T = R_1 + R_2$, therefore $M = R_1 + R_2 - T$. Now we can use the linearity of maths expectation to conclude \[ \mathbb{E}[M] = \mathbb{E}[R_1] + \mathbb{E}[R_2] - \mathbb{E}[T] = 3.5 + 3.5 - \frac{161}{36} = \frac{91}{36} \approx 2.53. \]
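If you do not trust the shortcut, both $\mathbb{E}[T]$ and $\mathbb{E}[M]$ are easy to double-check by brute force, enumerating all $36$ equally likely outcomes; a short sketch:

```python
from fractions import Fraction

# All 36 equally likely outcomes of rolling two fair dice.
rolls = [(r1, r2) for r1 in range(1, 7) for r2 in range(1, 7)]

E_T = sum(Fraction(max(r)) for r in rolls) / 36  # 161/36
E_M = sum(Fraction(min(r)) for r in rolls) / 36  # 91/36
print(E_T, E_M, E_T + E_M)  # their sum is 7 = 3.5 + 3.5, as it should be
```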
Exercise 3.
Find the mathematical expectation of a binomial random variable with parameters $n$ and $p$.
Explanation and comments
As you learned if you tried solving all of the problems from the last problem set, calculating this using just the definition of mathematical expectation is tough. But thanks to the linearity of expectation, this problem becomes quite easy.
Recall that a binomial random variable models the number of heads when we toss $n$ identical coins, each of which comes up heads with probability $p$. Now, if you recall what a Bernoulli random variable is, you should see that \[ B = B_1 + B_2 + ... + B_n \] where $B$ is a binomial random variable and $B_1, ..., B_n$ are Bernoulli random variables (so $B_k$ records whether the $k$-th coin was heads or not). This is a linear relation on a bunch of random variables, so we can use the linearity of expectation again! That is, we can write \[ \mathbb{E}[B] = \mathbb{E}[B_1] + \mathbb{E}[B_2] + ... + \mathbb{E}[B_n] \] Now, note that all of the $B_k$ have the same maths expectation, because they all have the same probability distribution (that of a Bernoulli random variable with parameter $p$). So the right-hand side is really $n \cdot \mathbb{E}[B_1]$. And we know that $\mathbb{E}[B_1] = p \cdot 1 + (1-p) \cdot 0 = p$. Thus we get \[ \mathbb{E}[B] = np. \]
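One can also confirm the formula by computing $\mathbb{E}[B]$ straight from the definition, i.e. summing $k \cdot \mathbb{P}(B = k)$ over all $k$; the parameters below are arbitrary:

```python
from math import comb

def E_binomial(n, p):
    """E[B] computed directly from the definition, summing k * P(B = k)."""
    return sum(k * comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1))

n, p = 10, 0.3
print(E_binomial(n, p), n * p)  # both are 3.0 (up to floating-point error)
```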
Linearity of maths expectation is probably the most used property in probability theory. One of the cool things about it is that it is always true: we do not need $X$ and $Y$ to be "unrelated" in any way ("independent", if you know what that means).
Moreover, there is an idea that helps in calculating many non-trivial maths expectations, which goes by the name of "introduce indicator random variables":
Definition [Indicator Random Variable]
Given a probability space $(\Omega, \mathbb{P})$ and any event $A$, the random variable $I_A$ is defined by $I_A(\omega) = 1$ if $\omega \in A$, and $I_A(\omega) = 0$ otherwise. This $I_A$ is what is known as an indicator random variable; it is basically a random variable telling whether a certain event took place or not (hence the name).
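The single most useful fact about indicators, which we will rely on repeatedly below, is that the expectation of an indicator is simply the probability of its event: \[ \mathbb{E}[I_A] = 1 \cdot \mathbb{P}(A) + 0 \cdot \mathbb{P}(\Omega \setminus A) = \mathbb{P}(A). \]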
Exercise 4.
There are $n$ personal letters to be sent to $n$ people (one letter for each person). Imagine we hand them out at random, instead of thinking. What is the mathematical expectation of the number of people who will receive the letters meant for them?
Explanation and comments
This is an exercise from the extra sheet from the previous lesson and also basically the last problem from the first homework (if you look closely at it). It is time we finally do it!
Before we proceed to the solution (which is not that long), let's look at the case of $n=4$, i.e. assume we have $4$ people called $W, X, Y, Z$. There are $4!=4\cdot3\cdot2\cdot1=24$ different ways in which the letters could be distributed among the people. For each of these possibilities let's calculate the number of people who received their own letter (they are marked red on the picture):
If we look at the rightmost column of the picture, there seems to be no pattern. However, if we count the red entries column by column, there is a clear pattern (see the bottom row). This suggests that looking at people individually, and at their contribution to the answer, may be a good idea.

So, let $S$ be the random variable representing the number of people who received their letter. Let $I_k$ be a random variable that is equal to $1$ if person number $k$ received his/her letter and $0$ otherwise. Then we have (same idea as in the previous exercise, really): \[ S = I_1 + I_2 + ... + I_n \]

Taking expectations of both sides and noticing that all of the $I_k$ have the same distribution (by symmetry), we get that $\mathbb{E}[S] = n \cdot \mathbb{E}[I_1]$. So we are left with finding $\mathbb{E}[I_1]$, which by the key property of indicators is just the probability that person number $1$ receives his/her letter. This is easy: it is $\frac{(n-1)!}{n!} = \frac{1}{n}$. Thus we indeed obtain \[ \mathbb{E}[S] = n \cdot \mathbb{E}[I_1] = n \cdot \frac{1}{n} = 1. \]
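If you would like to see the answer $1$ emerge empirically, here is a minimal Monte Carlo sketch (a random shuffle stands for a random way of delivering the letters; the number of trials is arbitrary):

```python
import random

def avg_fixed_points(n, trials=20_000):
    """Average number of people who receive their own letter, over random deliveries."""
    total = 0
    for _ in range(trials):
        letters = list(range(n))
        random.shuffle(letters)  # letters[i] is the letter that person i receives
        total += sum(letters[i] == i for i in range(n))
    return total / trials

print(avg_fixed_points(4))   # close to 1
print(avg_fixed_points(50))  # still close to 1, independently of n
```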
Exercise 5.
There are $n$ couples who went to a dancing session. In each of them the husband is of exactly the same height as the wife. The moment the music starts playing, people split randomly into pairs "some husband and some wife". Find the mathematical expectation of the number of pairs in which the husband is taller than the wife he is dancing with.
Explanation and comments
Once we know who dances with whom, let's call a husband \textit{big} if he is taller than his dancing partner. Let $X$ be a random variable representing the number of big husbands; the problem asks us to find $\mathbb{E}[X]$. Let us also assume that different couples have different heights, so that it makes sense to order the husbands by height. Let $I_k$ be the indicator random variable of the event that husband number $k$ (meaning the $k$-th shortest) is big. Then we have \[ X = I_1 + ... + I_n \] Therefore \[ \mathbb{E}[X] = \mathbb{E}[I_1] + ... + \mathbb{E}[I_n] \] Unlike in the problems above, the random variables $I_1, ..., I_n$ do not all have the same distribution, because husbands of different heights are different, i.e. they are not "symmetrical" in any way. But again, if $q_k$ is the probability that $I_k$ is equal to $1$, then $\mathbb{E}[I_k] = q_k$. So all we have to calculate is $q_k = \mathbb{P}(I_k=1)$. This is easy: there are $k-1$ wives shorter than the $k$-th shortest husband, and he is equally likely to dance with each of the $n$ wives, therefore $\mathbb{P}(I_k=1) = \frac{k-1}{n}$. Thus we have \[ \mathbb{E}[X] = \mathbb{E}[I_1] + ... + \mathbb{E}[I_n] = \frac{0+1+...+(n-1)}{n} = \frac{n(n-1)/2}{n} = \frac{n-1}{2} \]
The answer makes perfect sense, by the way: another way to calculate it would be to note that each husband is either taller than, shorter than, or of the same height as his partner, and by symmetry the first two cases have the same expectation. So the answer should be \[ \frac{1}{2}\left(n-\text{expected number of pairs of the same height}\right) \] The "expected number of pairs of the same height" has in fact been calculated in the previous exercise (do you see it?), and it is $1$. Plug this "$1$" into the formula above and you will get exactly the same answer we derived earlier.
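For the sceptical, here is a quick simulation sketch that agrees with $\frac{n-1}{2}$; we take the heights of the couples to be $1, \ldots, n$, which is an arbitrary but harmless choice:

```python
import random

def avg_big_husbands(n, trials=100_000):
    """Average number of husbands taller than their (randomly assigned) partner.
    Couple k's husband and wife both have height k, so "big" means h > w."""
    total = 0
    for _ in range(trials):
        wives = list(range(1, n + 1))
        random.shuffle(wives)  # wives[k-1] is the height of husband k's partner
        total += sum(h > w for h, w in zip(range(1, n + 1), wives))
    return total / trials

print(avg_big_husbands(5))  # close to (5 - 1) / 2 = 2
```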
The property of linearity of maths expectation, combined with the idea of spotting useful indicator random variables, is where "probability for olympiad maths" usually ends. That is, it is rarely the case that something conceptually more difficult will appear at an olympiad or in your training camp.
However, this idea is not that trivial to spot, even if you know about its existence. Thus, as always, we welcome you to work on the problem set below to become more proficient at all this.