2013 08 08 - MAT707 Notes

E(X) = Σ_x x ƒ(x) (discrete case)

E(X) = ∫ x ƒ(x) dx (continuous case)

X̄_n = (1/n) Σ_{i=1}^{n} X_i → E(X) ∈ ℝ as n → ∞

(X_i iid~X)
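A quick check of the two statements above, as a sketch only. The fair six-sided die, the seed, and the helper names `expectation` and `sample_mean` are illustrative assumptions, not part of the lecture. It computes E(X) from the discrete formula and compares it with a sample mean of many iid rolls.

```python
import random
from fractions import Fraction

# pmf of a fair six-sided die, stored as {x: f(x)} (illustrative choice)
die_pmf = {x: Fraction(1, 6) for x in range(1, 7)}

def expectation(pmf):
    """E(X) = sum over x of x * f(x) (discrete case)."""
    return sum(x * fx for x, fx in pmf.items())

def sample_mean(n, rng):
    """Average of n iid die rolls."""
    return sum(rng.randint(1, 6) for _ in range(n)) / n

mu = expectation(die_pmf)
rng = random.Random(0)  # fixed seed so the run is repeatable
xbar = sample_mean(100_000, rng)
print("E(X) =", mu, " sample mean =", xbar)
```

The sample mean should land close to 7/2, and closer still as n grows.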

Tuesday, 2013 08 13 Includes some Q & A

E (expectation) is Linear

E(XY) = E(X)E(Y) (if X ∥ Y, i.e. X and Y are independent)
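A small exact check, assuming two independent fair dice as the example (my choice, not from the notes). It verifies linearity, verifies the product rule under independence, and shows the product rule failing in the dependent case Y = X.

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)
p_joint = Fraction(1, 36)  # joint pmf of two independent fair dice

E_X = sum(Fraction(x, 6) for x in faces)
E_sum = sum((x + y) * p_joint for x, y in product(faces, faces))
E_prod = sum(x * y * p_joint for x, y in product(faces, faces))

# Dependent case: Y = X, so XY = X^2
E_X_times_X = sum(Fraction(x * x, 6) for x in faces)

print(E_sum, E_prod, E_X_times_X)
```

Here E(X+Y) = 2 E(X) and E(XY) = E(X)², but E(X·X) = 91/6 differs from E(X)² = 49/4.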

Great Expectations

X~Geom(p)

E(X) = 1/p (proof last class)
X~NegBin(r,p)

(Bernoulli trials until r^th success.)

X = Σ_{k=1}^{r} Y_k (Y_k are iid~Geom(p))

(A representation is a way to write an rv as a sum of iid rvs)

E(X) = E(Σ_{k=1}^{r} Y_k) = Σ_{k=1}^{r} E(Y_k) = r • E(Y₁) = r • 1/p = r/p
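The representation can be tried out by simulation. This is a rough sketch: the values r = 3, p = 0.5, the seed, and the function names are assumptions for illustration. Each NegBin draw is built as a sum of r geometric draws, and the average is compared with r/p.

```python
import random

def geometric(p, rng):
    """Number of Bernoulli(p) trials up to and including the first success."""
    trials = 1
    while rng.random() >= p:  # rng.random() < p counts as a success
        trials += 1
    return trials

def negative_binomial(r, p, rng):
    """Trials until the r-th success, as a sum of r iid Geom(p) draws."""
    return sum(geometric(p, rng) for _ in range(r))

rng = random.Random(1)  # fixed seed so the run is repeatable
r, p, n = 3, 0.5, 50_000
avg = sum(negative_binomial(r, p, rng) for _ in range(n)) / n
print("simulated mean =", avg, " r/p =", r / p)
```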

X~Bin(n,p)

(Counts the # of successes in n B-trials)

Indicators...

I_k = { 1, if k^th trial is a win }

I_k = { 0, if k^th trial is a loss }

X = Σ_{k=1}^{n} I_k (I_k are iid, each taking the values 0 and 1)

The distribution ν_I₁ puts mass 1-p at 0 and mass p at 1

E(X) = E(Σ_{k=1}^{n} I_k) = Σ_{k=1}^{n} E(I_k) = n • E(I₁) = n • { 1 • p + 0 • ( 1 - p ) } = n • p
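As a cross-check on n • p, the mean can also be computed directly from the Binomial pmf. This is a sketch with assumed values n = 10 and p = 3/10, using exact fractions so there is no rounding.

```python
from fractions import Fraction
from math import comb

def binomial_pmf(n, p):
    """{k: P(X = k)} for X ~ Bin(n, p)."""
    q = 1 - p
    return {k: comb(n, k) * p**k * q ** (n - k) for k in range(n + 1)}

n, p = 10, Fraction(3, 10)
pmf = binomial_pmf(n, p)
total = sum(pmf.values())
mean = sum(k * fk for k, fk in pmf.items())
print(total, mean)
```

The direct sum agrees with the indicator argument, which is much shorter.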

X~Pois(λ)

E(X) = Σ_x x • ƒ(x) = Σ_{k=0}^{∞} k • e^-λ • (λ^k)/(k!) = e^-λ Σ_{k=1}^{∞} (λ^k)/(k-1)!

= λ e^-λ Σ_{k=1}^{∞} (λ^(k-1))/(k-1)!

Let j = k-1

= λ e^-λ Σ_{j=0}^{∞} (λ^j)/j!

= λ e^-λ • e^λ = λ
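A numerical sanity check of the series, as a sketch: λ = 2.5 and the cutoff of 60 terms are arbitrary assumptions. The terms beyond the cutoff are negligibly small for this λ.

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """P(X = k) = e^-lam * lam^k / k!"""
    return exp(-lam) * lam**k / factorial(k)

lam = 2.5
terms = 60  # truncation point for the infinite sum
total = sum(poisson_pmf(k, lam) for k in range(terms))
mean = sum(k * poisson_pmf(k, lam) for k in range(terms))
print(total, mean)
```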

If E(X) ∈ ℝ, it is sometimes called the "mean" of X

You could call it the population mean

This will be represented by μ

Old def: X has Poisson Distribution with parameter λ > 0

is now...

def: X has Poisson Distribution with mean λ > 0

GEOMETRIC SERIES and EXPONENTIAL SERIES!

X~U(a, b)

E(X) = (a+b)/2 ← already done

X~Exp(λ) ← This is the same λ as Poisson's Distribution

E(X) = 1/λ... Recall that half-life is ln(2)/λ
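A rough numerical check of both facts for Exp(λ), assuming λ = 0.5 and a simple midpoint rule on [0, 80] (both are my choices). The mean integral should come out near 1/λ, and the CDF evaluated at ln(2)/λ should be 1/2.

```python
from math import exp, log

lam = 0.5

def pdf(x):
    return lam * exp(-lam * x)

def cdf(x):
    return 1 - exp(-lam * x)

# Midpoint rule for the integral of x * f(x) over [0, upper];
# the tail beyond upper is negligible for this lam.
upper, steps = 80.0, 200_000
h = upper / steps
mean = sum((i + 0.5) * h * pdf((i + 0.5) * h) for i in range(steps)) * h

half_life = log(2) / lam
print(mean, cdf(half_life))
```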

Variance

Figure 1: (See Saws) Notice that See-saw B has more of a "Spread" than See-saw A

X - E(X) is the deviation of X from its mean

E[X-E(X)] = E[X] - E[E(X)] = E(X)-E(X) = 0

To prevent this cancellation of positive deviations by negative deviations, let's square the deviations.

Definition: The Variance of X (var(X)) is the expectation of the square of the difference between X and its mean, var(X) = E[{X-E(X)}²], written σ²

Definition: Standard deviation = √(σ²) = stdev(X) = √(var(X))

σ has the same units as X so σ is better for applications, whereas var(X) is better for theory (because var(X) has better properties)

The Properties of Variances

Computing Formula

var(X) = E[{X-E(X)}²] = E[{X-μ}²] = E[(X-μ)²] = E[X²-2μX + μ²] = E(X²) - 2μE(X) + μ² = E(X²) - 2μμ + μ² = E(X²) - 2 μ² + μ² = E(X²) - μ² = E(X²) - (E(X))²
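To see the computing formula agree with the definition, here is a sketch on a small made-up pmf. The pmf values and the helper `E` are assumptions for illustration only.

```python
from fractions import Fraction

# A made-up pmf {x: f(x)}; the probabilities add up to 1
pmf = {0: Fraction(1, 2), 1: Fraction(1, 3), 4: Fraction(1, 6)}

def E(g, pmf):
    """E(g(X)) = sum over x of g(x) * f(x)."""
    return sum(g(x) * fx for x, fx in pmf.items())

mu = E(lambda x: x, pmf)
var_definition = E(lambda x: (x - mu) ** 2, pmf)   # E[(X - mu)^2]
var_computing = E(lambda x: x * x, pmf) - mu ** 2  # E(X^2) - (E(X))^2
print(mu, var_definition, var_computing)
```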

let C ∈ ℝ

var(X+C) = var(X)

Proof:

var(X+C) = E[{(X+C) - μ_{X+C}}²] = E[{X+C-(μ_X + C)}²] = E[(X-μ_X)²] = var(X) See #1

σ_{X+C} = σ_X

var(cX)

var(cX) = E[{cX - μ_{cX}}²] = E[{cX - cμ_X}²] = E[c² {X-μ_X}²] = c² var(X)

⇒ stdev(cX) = √(var(cX)) = √(c² var(X)) = |c| • stdev(X)
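Both properties can be checked exactly on a fair die. This is a sketch: the shift C = 10, the scale c = -3, and the helper names are assumptions.

```python
from fractions import Fraction

def mean(pmf):
    return sum(x * fx for x, fx in pmf.items())

def variance(pmf):
    mu = mean(pmf)
    return sum((x - mu) ** 2 * fx for x, fx in pmf.items())

def transform(pmf, g):
    """pmf of g(X); probabilities of equal g(x) values are added up."""
    out = {}
    for x, fx in pmf.items():
        out[g(x)] = out.get(g(x), 0) + fx
    return out

die = {x: Fraction(1, 6) for x in range(1, 7)}
var_x = variance(die)
var_shift = variance(transform(die, lambda x: x + 10))  # var(X + C)
var_scale = variance(transform(die, lambda x: -3 * x))  # var(cX)
print(var_x, var_shift, var_scale)
```

The negative scale factor is deliberate: the variance picks up c² = 9, so stdev picks up |c| = 3.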

Examples

X=fair six-sided die roll

stdev(X) = ?

stdev(X) = √(var(X))

Use the computing Formula

var(X) = E(X²) - (E(X))²

stdev(X) = √(E(X²) - (E(X))²) = √((91/6) - (7/2)²) = √((182-147)/12) = √(35/12) ≈ 1.71
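The die arithmetic, reproduced as a short sketch with exact fractions. The variable names are mine.

```python
from fractions import Fraction
from math import sqrt

faces = range(1, 7)
E_X = Fraction(sum(faces), 6)                  # 21/6 = 7/2
E_X2 = Fraction(sum(x * x for x in faces), 6)  # 91/6
var_X = E_X2 - E_X ** 2                        # computing formula
sd_X = sqrt(var_X)
print(E_X, E_X2, var_X, round(sd_X, 2))
```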

HW: X~U(a,b)
HW: X~Exp(λ)

get stdev(X)

for the second one the answer will be σ_x = 1/λ

PROPERTY: Variance of Sums

(X,Y) rv μ_x = E(X), μ_y = E(Y), σ_x² = var(X), σ_y² = var(Y)

σ²_{X+Y} = Variance of (X+Y)

in theory... σ²_{X+Y} = E[{(X+Y) - μ_{X+Y}}²]

=E[{X-μ_X + Y - μ_Y }² ]

=E[((X-μ_X) + (Y-μ_Y))² ]

=E[(X-μ_X)² + (Y-μ_Y)² + 2(X-μ_X)(Y-μ_Y)]

=E((X-μ_X)² ) + E((Y-μ_Y)²) + 2 • E((X-μ_X)(Y-μ_Y))

=var(X) + var(Y) + 2•E((X-μ_X)(Y-μ_Y))

E((X-μ_X)(Y-μ_Y)) = cov(X,Y)

page 2 & 3

Definition: (X,Y) r.v. the covariance of (X,Y) is cov(X,Y) = E[(X-μ_X)(Y-μ_Y)]

cov(X,Y) is written as σ_X,Y

Note: cov(X,X) = var(X) = σ_X²

Note: var(X+Y) = var(X) + var(Y) + 2•cov(X,Y)

σ²_X+Y = σ²_X + σ²_Y + 2 • σ_X,Y

Note: there is a computational formula for cov(X,Y)

cov measures the way X & Y cooperate

cov(X,Y) = E[(X-μ_X)(Y - μ_Y)] = E( XY - μ_Y X - μ_X Y + μ_X μ_Y ) = E(XY) - μ_Y E(X) - μ_X E(Y) + μ_X μ_Y

Remember... E(X) = μ_X and E(Y) = μ_Y

= E(XY) - μ_Y μ_X - μ_X μ_Y + μ_X μ_Y = E(XY) - μ_X μ_Y = E(XY) - E(X)E(Y)

σ_X,Y = μ_XY - μ_X μ_Y

Note: If X∥Y then cov(X,Y) = E(XY) - E(X)E(Y) = E(X)E(Y) - E(X)E(Y) = 0 (using ∥ to write E(XY) = E(X)E(Y))

Hence X∥Y ⇒ var(X+Y) = var(X) + var(Y)
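A sketch on a small made-up joint pmf, which is hypothetical and not the HW matrix problem. It computes cov(X,Y) both from the definition and from E(XY) - E(X)E(Y), then confirms var(X+Y) = var(X) + var(Y) + 2•cov(X,Y) for a dependent pair.

```python
from fractions import Fraction

# Hypothetical joint pmf {(x, y): P(X = x, Y = y)}; X and Y are dependent
joint = {
    (0, 0): Fraction(1, 4),
    (0, 1): Fraction(1, 4),
    (1, 1): Fraction(1, 2),
}

def E(g):
    """E(g(X, Y)) under the joint pmf."""
    return sum(g(x, y) * p for (x, y), p in joint.items())

mu_x = E(lambda x, y: x)
mu_y = E(lambda x, y: y)
cov_definition = E(lambda x, y: (x - mu_x) * (y - mu_y))
cov_computing = E(lambda x, y: x * y) - mu_x * mu_y
var_x = E(lambda x, y: (x - mu_x) ** 2)
var_y = E(lambda x, y: (y - mu_y) ** 2)
var_sum = E(lambda x, y: (x + y - mu_x - mu_y) ** 2)
print(cov_definition, var_x, var_y, var_sum)
```

Here the covariance is positive, so var(X+Y) exceeds var(X) + var(Y).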

HW:

In the basic (matrix problem) jpmf/mpmf/∥ example (the one with the c's), find cov(X,Y)
Compute var(X+Y) in the linear system problem

Great Variances

X~Bin(n,p)

var(X) = var(Σ_{k=1}^{n} I_k) = (by ∥) Σ_{k=1}^{n} var(I_k) = n • var(I₁) = ...

... = n[ E(I₁²) - (E(I₁))² ] = ...

... = n[ E(I₁) - (E(I₁))² ] = ... (since I₁² = I₁)

... = n[p - p²] = np( 1 - p )
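The same result can be confirmed from the Binomial pmf with the computing formula. This is a sketch with assumed values n = 10 and p = 3/10.

```python
from fractions import Fraction
from math import comb

n, p = 10, Fraction(3, 10)
pmf = {k: comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)}

mean = sum(k * fk for k, fk in pmf.items())
second_moment = sum(k * k * fk for k, fk in pmf.items())
var = second_moment - mean ** 2  # E(X^2) - (E(X))^2
print(mean, var)
```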

X~Geom(p)

For var(X), use an ad hoc (Latin: "for this") formula for variance:

var(X) = E[X(X-1)] + E(X) - (E(X))²

E(X) = 1/p

With q = 1-p:

E[X(X-1)] = Σ_x x(x-1) ƒ(x) = Σ_{k=1}^{∞} k(k-1)p(1-p)^(k-1) = p Σ k(k-1)q^(k-1) = ...

... = pq Σ k(k-1)q^(k-2) = pq Σ d²/(dq²) (q^k) = ...

... = pq d²/(dq²) Σ_{k=0}^{∞} q^k = pq d²/(dq²) (1/(1-q)) = ...

... = pq d²/(dq²) ((1-q)^-1) = pq • 2(1-q)^-3

= (2(1-p))/(p²)

Thus var(X) = (2(1-p))/(p²) + 1/p - (1/p)² = ...

... = (2-2p+p-1)/(p²) = (1-p)/(p²) = failure/success²
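A numerical check of the ad hoc formula, as a sketch: p = 0.25 and the cutoff of 500 terms are assumptions. The truncated sums should come out near E(X) = 1/p, E[X(X-1)] = 2(1-p)/p², and var(X) = (1-p)/p².

```python
p = 0.25
q = 1 - p
terms = 500  # truncation point; q**500 is negligibly small

def geom_pmf(k):
    """P(X = k) = p * q^(k-1) for k = 1, 2, ..."""
    return p * q ** (k - 1)

E_X = sum(k * geom_pmf(k) for k in range(1, terms + 1))
E_XXm1 = sum(k * (k - 1) * geom_pmf(k) for k in range(1, terms + 1))
var_X = E_XXm1 + E_X - E_X ** 2  # the ad hoc formula
print(E_X, E_XXm1, var_X)
```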

Let X~NegBin(r,p)

var(X) = var(Σ_{k=1}^{r} Y_k) = Σ_{k=1}^{r} var(Y_k) = r • var(Y₁) = r • (1-p)/(p²)

σ_X = √(r) • σ_Y₁

HW: Fair N-Sided Die Roll, find σ_X

Standardizing

rv X has Standardized Form X^ST = (X-μ_X)/σ_x

This is the z-score

NOTES:

X^ST > 0 ⇔ X above average

X^ST = 0 ⇔ X is average

X^ST < 0 ⇔ X below average

X^ST has no units (thus, it's independent of units of X)

E(X^ST) = E([X-μ_x]/[σ_x]) = 1/σ_x • E(X-μ_x) = 1/σ_x( E(X) - μ_x ) = 0
var(X^ST) = var([X-μ_x]/[σ_x]) = 1/(σ_X²) • var(X - μ_x) = 1/(σ_X²) • var(X) = 1
stdev(X^ST) = √(1) = 1
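A sketch of standardizing, using a fair die as the example rv (my choice). The z-scores of the six faces should have mean 0 and variance 1 up to floating-point rounding.

```python
from math import sqrt

faces = [1, 2, 3, 4, 5, 6]
prob = 1 / 6  # each face is equally likely

mu = sum(x * prob for x in faces)
sigma = sqrt(sum((x - mu) ** 2 * prob for x in faces))

# Standardized form (z-score) of each face
z = [(x - mu) / sigma for x in faces]
z_mean = sum(zi * prob for zi in z)
z_var = sum((zi - z_mean) ** 2 * prob for zi in z)
print([round(zi, 3) for zi in z], z_mean, z_var)
```

Faces 1 to 3 get negative z-scores (below average) and faces 4 to 6 get positive ones (above average).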

Let X, X₁, X₂, X₃, ... , X_n be iid

Let μ = μ_x

Let σ = σ_x

Let X̄_n = (1/n) • Σ_{i=1}^{n} X_i = sample mean; X̄_n → E(X) = μ as n → ∞

X̄_n^ST = ?

E(X̄_n) = E((1/n) • Σ_{i=1}^{n} X_i) = (1/n) • Σ_{i=1}^{n} E(X_i)

By iid the E(X_i)'s are all the same, so...

= 1/n • n • μ = μ = μ_X

so:

E(X̄_n) = μ_X

var(X̄_n) = var((1/n) • Σ_{i=1}^{n} X_i) = 1/n² • Σ_{i=1}^{n} var(X_i) = 1/n² • n σ² = σ²/n = var(X)/n (because the X_i are ∥ and var(X_i) = σ²)
stdev(X̄_n) = √(var(X)/n) = stdev(X)/√(n) = σ/√(n) = σ_X/√(n)

Hence X̄_n^ST = (X̄_n - μ)/(σ/√(n))
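An exact check of the mean and variance of the sample mean, as a sketch: it enumerates all outcomes of n fair dice, and the die and n = 3 are illustrative assumptions. The results should be μ = 7/2 and σ²/n = (35/12)/3.

```python
from fractions import Fraction
from itertools import product

def sample_mean_moments(n):
    """Exact (mean, variance) of the average of n iid fair die rolls."""
    faces = range(1, 7)
    p = Fraction(1, 6 ** n)  # every outcome of n rolls is equally likely
    means = [Fraction(sum(rolls), n) for rolls in product(faces, repeat=n)]
    mu = sum(m * p for m in means)
    var = sum((m - mu) ** 2 * p for m in means)
    return mu, var

mu_bar, var_bar = sample_mean_moments(3)
print(mu_bar, var_bar)
```

The variance shrinks like 1/n, so the standard deviation shrinks like 1/√(n).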