
Gerzensee − Week 1 – 2018-19


Here are some exercises to help you review the material from Week 1. I have copied them from previous assignments and exams. (There are 110 exercises here, but several are closely related so there are many fewer unique questions.) Have fun. 1. A and B are two events. Let C = A ∪ B and D = A ∩ B.

(a) Show P(C) ≤ P(A) + P(B). (b) Show P(D) ≥ 1 − P(A^c) − P(B^c). (Hint: show A ∩ B = (A^c ∪ B^c)^c and apply the result in (a).) (c) Let A_1, A_2, …, A_n denote n events. Show P(A_1 ∪ A_2 ∪ … ∪ A_n) ≤ P(A_1) + P(A_2) + … + P(A_n). (Hint: Use induction. You showed this result in (a) for n = 2. Show that if it holds for n−1, then it holds for n.)

2. f_X(x) = 2/x^3 for x > 1 and zero elsewhere.

(a) Compute the CDF of X. (b) Find E(X). (c) Find E(1/X). (d) Let Y = 1/X. (i) Find f_Y(y). (ii) Find E(Y) (and verify that the answer is the same as your answer to (c)).

3. X ~ U[1,2], that is, X is distributed uniformly between 1 and 2. Conditional on X = x, Y ~ U[x, 2+x].

(a) Derive E(X), E(X^2), and σ_X^2.

(b) Derive E(Y), E(Y^2), and σ_Y^2.

4. The following table gives the joint distribution of employment status (Y) and college graduation (X) among those either employed or looking for work (unemployed) in the working-age (25 years and older) U.S. population for September 2017.

                            Unemployed (Y = 0)   Employed (Y = 1)
Non-college grads (X = 0)        0.026                0.576
College grads (X = 1)            0.009                0.389

(a) What fraction of the population is employed? unemployed? (b) What fraction of the population are college graduates? non-college graduates? (c) The unemployment rate is the fraction that is unemployed. Show that the unemployment

rate is 1 - E(Y). (d) Compute E(Y | X = 1) and E(Y | X = 0). (e) Calculate the unemployment rate for (i) college grads and (ii) non-college grads. (f) A randomly selected member of the population reports being unemployed. What is the

probability that this worker is a college grad? A non-college grad? (g) Are educational achievement and employment status independent? Explain.
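The arithmetic in problem 4 can be checked numerically. The following short Python sketch (my addition, not part of the original exercises; it assumes numpy is available) computes the marginals, conditional means, and reverse conditional probabilities directly from the joint table:

import numpy as np

# Joint distribution: rows X (non-college = 0, college = 1), cols Y (unemployed = 0, employed = 1)
P = np.array([[0.026, 0.576],
              [0.009, 0.389]])

pY = P.sum(axis=0)            # (a) P(Y=0), P(Y=1)
pX = P.sum(axis=1)            # (b) P(X=0), P(X=1)
EY = pY[1]                    # E(Y) = P(Y=1); unemployment rate = 1 - E(Y)
EY_given_X = P[:, 1] / pX     # (d) E(Y|X=0), E(Y|X=1)
u_given_X = P[:, 0] / pX      # (e) unemployment rate by education
pX_given_u = P[:, 0] / pY[0]  # (f) P(X=x | Y=0) by Bayes rule

print(pY, pX, 1 - EY, EY_given_X, u_given_X, pX_given_u)

Part (g) can then be checked by comparing P[X = 1, Y = 1] with P(X = 1)P(Y = 1).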

5. Suppose that Y and X are two random variables, and assume the existence of any moments necessary to find the values of a_0 and a_1 that minimize MSE(a_0, a_1) = E[(Y − a_0 − a_1X)^2].


6. Suppose X has probability density function (pdf) f(x) = 1/x^2 for x ≥ 1, and f(x) = 0 elsewhere.

(a) Compute P(X ≥ 5). (b) Derive the CDF of X. (c) Show that the mean of X does not exist. (d) Let Y = 1/X. Derive the probability density function of Y. (e) Compute E(Y).

7. (a) X has probability density f(x) = 1/2 for 4 ≤ x ≤ 6. Let Y = X^2.

(i) What is the CDF of Y? (ii) What is the probability density function for Y?

(b) X has probability density f(x) = 1/2 for −1 ≤ x ≤ 1. Let Y = X^2. (i) What is the CDF of Y? (ii) What is the probability density function for Y?

8. X is a continuous random variable with density f(x) and CDF F(x), where F is 1-to-1. Let Y = F(X). Show that Y ~ U[0,1]. 9. Suppose that X has density f(x) with median m. (The median solves F(m) = 0.5.) Show that min_a E(|X − a|) is solved by a = m. 10. Let M(t) denote the moment generating function of X. Let S(t) = ln(M(t)), S′(t) denote the first derivative of S, S′′(t) denote the second derivative, and so forth.

(a) Show that S′(0) = E(X). (b) Show that S′′(0) = var(X). (c) Show that S′′′(0) is the 3rd centered moment of X.

11. Suppose f_{X,Y}(x,y) = 1/4 for 1 < x < 3 and 1 < y < 3, and f_{X,Y}(x,y) = 0 elsewhere.

(a) Derive the marginal densities of X and Y. (b) Find E(Y | X = 2.3). (c) Find P[(X < 2.5) ∩ (Y < 2.0)]. (d) Let W = X + Y. Derive the moment generating function of W.

12. (a) Suppose X and Y are independent discrete random variables. X can take on the values 0, 1,

2, 3, 4, 5, each with probability 1/6. Y can take on the values 10, 11, 12, each with probability 1/3. Z = X+Y. What is the pdf of Z? (Jargon: the pdf of Z is called the convolution of the pdfs of X and Y.)

(b) Consider the same setup as (a), so X can take on the values 0, 1, 2, 3, 4, 5, and Y can take on the values 10,11,12. The pdf of Y is as in (a). The pdf of X is different and unknown. Suppose you know the pdf of Z. Show how to compute the pdf of X. (Jargon: This is called deconvolution.)

(c) Now suppose X and Y are independent and continuously distributed with densities fX and fY. Z = X + Y. Write an expression for the pdf of Z in terms of the pdfs of X and Y.
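As an illustration of the convolution jargon in part (a) (an addition of mine, not part of the original problems), np.convolve of the two pmf vectors returns the pmf of Z = X + Y on the support 10, …, 17:

import numpy as np

pX = np.full(6, 1/6)          # pmf of X on support 0..5
pY = np.full(3, 1/3)          # pmf of Y on support 10..12
pZ = np.convolve(pX, pY)      # pmf of Z = X + Y on support 10..17

for z, p in enumerate(pZ, start=10):
    print(z, round(p, 4))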

13. Let X and Y have joint density f_{X,Y}(x,y) = x + y for 0 < x < 1 and 0 < y < 1, and f_{X,Y}(x,y) = 0 elsewhere.


(a) What is the density of X? (b) Find E(Y | X = 3/4). (c) Find Var(Y | X = x) for 0 < x < 1. (d) Let Z = 3 + X^2. What is the probability density of Z?

14. Let f_X(x) = 2x for 0 < x < 1, and f_X(x) = 0 elsewhere.

(a) Find E(1/X). (b) Let Y = 1/X.

(i) Find f_Y(y). (ii) Find E(Y).

15. Suppose that X is distributed with CDF F(x) and density f(x). Show that E(X) = ∫_0^1 F^{-1}(t) dt.

16. X has probability density f(x) = 1/4 for −1 < x < 3 and f(x) = 0 elsewhere. Let Y = X^2.

(a) What is the CDF of Y? (b) What is the probability density function of Y?

17. Suppose X has probability density function (pdf) f(x) = 1/x^2 for x ≥ 1, and f(x) = 0 elsewhere.

(a) Compute P(X ≥ 5). (b) Derive the CDF of X. (c) Show that the mean of X does not exist. (d) Let Y = 1/X. Derive the probability density function of Y. (e) Compute E(Y).

18. Suppose that X is a discrete random variable with the following distribution

x    Prob(X = x)
1    0.25
2    0.50
3    0.25

The distribution of Y|X=x is Bernoulli with P(Y = 1 | X = x) = (1+x)^{-1}.

(a) Compute the mean of X. (b) Find the variance of X. (c) Suppose X = 2.

(i) What is your best guess of the value of Y? (ii) Explain what you mean by “best”.

(d) What is the distribution of X conditional on Y = 1?

19. (a) X is a Bernoulli random variable with P(X = 1) = p. What is the moment generating function for X?



(b) Y = Σ_{i=1}^n X_i where the X_i are i.i.d. Bernoulli with P(X_i = 1) = p. What is the moment generating function for Y? (c) Use your answer in (b) to compute E(Y), E(Y^2) and E(Y^3) for n > 3.

20. Let f(x) = (1/2)^x for x = 1, 2, 3, .... Find E(X). 21. Let f(x_1, x_2) = 2x_1 for 0 < x_1 < 1, 0 < x_2 < 1, and zero elsewhere, be the density of the vector (x_1, x_2).

(a) Find the marginal densities of X_1 and X_2. (b) Are the two random variables independent? Explain. (c) Find the density of X_1 given X_2. (d) Find E(X_1) and E(X_2). (e) Find P(X_1 < 0.5). (f) Find P(X_2 < 0.5). (g) Find P[(X_1 < 0.5) ∩ (X_2 < 0.5)]. (h) Find P[(X_1 < 0.5) ∪ (X_2 < 0.5)].

22. Let f(x) = 1 for 0 < x < 1, and zero elsewhere, be the density of X. Let Y = X^{1/2}. (a) Find the density of Y. (b) Find the mean of Y. (c) Find the variance of Y.

23. Suppose that X and Y are independent random variables, g and h are two functions, and E(g(X)) and E(h(Y)) exist. Show E(g(X)h(Y)) = E(g(X))E(h(Y)). 24. X and Y are scalar random variables with joint pdf f_{X,Y}.

(a) Let σ_XY = E[(X − µ_X)(Y − µ_Y)]. Show that σ_XY = σ_YX.

(b) Show that if X and Y are independent, then σ_XY = 0. 25. Suppose X_i, i = 1, …, n are i.i.d. random variables and let Y = Σ_{i=1}^n X_i.

(a) Show that M_Y(t) = Π_{i=1}^n M_{X_i}(t). (b) When the X's are Bernoulli, then Y is binomial. Use your results for the MGF to compute the mean and variance of Y.

26. Show that if X is N(0,1), then E(g′(X) − Xg(X)) = 0. (Hint: use integration by parts, and assume that any necessary expectations are finite.) Use this result to show E(X^{i+1}) = iE(X^{i−1}). 27. X ~ N(0,1) and Y|X=x is N(x,1).

(a) Find (i) E(Y)



(ii) Var(Y) (iii) Cov(X,Y)

(b) Prove that the joint distribution of X and Y is normal. 28. Suppose X is a random variable that can take on the values 0, 1, 2, … with f_X(x) = m^x e^{−m}/x! (This is the Poisson pdf.) You can verify that the mean and variance of X are both equal to m. Suppose E(X) = 80.

(a) Use Chebyshev's inequality to find a lower bound for P(70 ≤ X ≤ 90). (b) Is the answer the same for P(70 < X < 90)?
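A numerical companion to problem 28 (my addition; it assumes scipy is available): with m = 80 and k = 10, Chebyshev's inequality gives the lower bound 1 − 80/10^2 = 0.2, which can be compared with the exact Poisson probability:

from scipy.stats import poisson

m = 80
lower_bound = 1 - m / 10**2                          # Chebyshev with k = 10
exact = poisson.cdf(90, m) - poisson.cdf(69, m)      # exact P(70 <= X <= 90)
print(lower_bound, exact)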

29. Suppose that X_i ~ i.i.d. N(0, Σ) for i = 1, …, n, where X_i is a p×1 vector. Let α denote a non-zero p×1 vector.

(a) Show that α′X_n ~ N(0, σ^2) and derive an expression for σ^2. (b) Show that (α′X_n)^2/σ^2 ~ χ^2_1.

Let Y_n = (α′X_n X_n′α) / ((n−1)^{-1} Σ_{i=1}^{n−1} α′X_i X_i′α).

(c) Show that Y_n ~ F_{1,n−1}.

(d) Show that Y_n ⇒ Y ~ χ^2_1 (as n → ∞). 30. Suppose that X_i are U[0,1] for i = 1, …, n. Let s^2 = (n−1)^{-1} Σ_{i=1}^n (X_i − X̄_n)^2, where X̄_n = n^{-1} Σ_{i=1}^n X_i, and s = √(s^2).

(a) Show var(X_i) = 1/12. (b) Show E(s^2) = 1/12.

(c) Show s^2 →p 1/12.

(d) Show s →p √(1/12). (e) Is E(s) = √(1/12)? E(s) > √(1/12)? E(s) < √(1/12)? Explain.

31. Z ~ N(0,1).



(a) Derive the mean and variance of Z^2.

(b) Prove that n^{-1/2} Σ_{i=1}^n (Z_i^2 − 1) →d N(0, V) and determine the value of V.

(c) Use the result in (a) to find an approximation to P(Σ_{i=1}^{30} Z_i^2 < 40).

(d) Find the exact value for P(Σ_{i=1}^{30} Z_i^2 < 40).
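For 31(c)-(d), the CLT approximation uses the χ^2_30 mean 30 and variance 60. A scipy sketch (my addition) comparing the approximation with the exact value:

from scipy.stats import norm, chi2
import numpy as np

approx = norm.cdf((40 - 30) / np.sqrt(60))  # CLT approximation based on (a)-(b)
exact = chi2.cdf(40, df=30)                 # exact chi-square probability
print(approx, exact)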

32. (a) Z ~ N(0,1). Derive the first 4 moments of Z.

(b) W ~ χ^2_10. Use the result in (a) to derive the mean and variance of W.

(c) Let W_i ~ i.i.d. χ^2_10. Let W̄ = n^{-1} Σ_{i=1}^n W_i.

(i) Show that E(W̄) = 10. (ii) Show that W̄ →p 10.

(iii) Show that n^{-1} Σ_{i=1}^n (W_i − W̄)^2 →p σ_W^2.

(iv) Does E[n^{-1} Σ_{i=1}^n (W_i − W̄)^2] = σ_W^2?

33. Suppose that a random variable X is distributed N(6,2). Let e be another random variable that is distributed N(0,3). Suppose that X and e are statistically independent. Let the random variable Y be related to X and e by Y = 4 + 6X + e.

(a) Prove that the joint distribution of (X, Y) is normal. Include in your answer the value of the mean vector and covariance matrix for the distribution.

(b) Suppose that you knew that X = 10. Find the optimal (minimum mean square error) forecast of Y.

34. Y and X are two random variables with Y | X = x ~ N(0, x^2) and X ~ U(1,2).

(a) What is the joint density of X and Y? (b) Compute E(Y^4). (Hint: if Q ~ N(0, σ^2), then E(Q^4) = 3σ^4.) (c) What is the density of W = Y/X? (Hint: What is the density of W conditional on X = x?)

35. Let X ~ N(0,1) and suppose that Y = X if 0.5 < X < 1, and Y = 0 otherwise.

(a) Find the distribution of Y.



(b) Find E(Y). (c) Find Var(Y).

36. Suppose that W ~ χ^2_3 and that the distribution of X given W = w is U[0, w] (that is, uniform on 0 to w).

(a) Find E(X). (b) Find Var(X). (c) Suppose W = 4. What is the value of the minimum mean square error forecast of X?

37. Complete the following calculations

(a) Suppose X ~ N(5,100). Find P(−2 < X < 7). (b) Suppose X ~ N(50, 4). Find P(X ≥ 48). (c) Suppose X ~ χ^2_4. Find P(X ≥ 9.49).

(d) Suppose X ~ χ^2_6. Find P(1.64 ≤ X ≤ 12.59).

(e) Suppose X ~ χ^2_6. Find P(X ≤ 3.48). (f) Suppose X ~ t_9. Find P(1.38 ≤ X ≤ 2.822). (g) Suppose X ~ N(5,100). Find P(−1 < X < 6). (h) Suppose X ~ N(46,2). Find P(X ≥ 48). (i) Suppose X ~ χ^2_6. Find P(1.24 ≤ X ≤ 12.59). (j) Suppose X ~ F_{3,5}. Find P(X > 7.76). (k) Suppose X ~ t_10. Find P(X ≤ 2.228).
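These are table look-ups; the following scipy sketch (my addition) reproduces a few of them, using the convention that N(a, b) has variance b, so the scale argument is √b:

from scipy.stats import norm, chi2, t, f

print(norm.cdf(7, 5, 10) - norm.cdf(-2, 5, 10))   # (a): X ~ N(5, 100), sd = 10
print(chi2.sf(9.49, df=4))                        # (c)
print(t.cdf(2.822, df=9) - t.cdf(1.38, df=9))     # (f)
print(f.sf(7.76, dfn=3, dfd=5))                   # (j)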

38. If X ~ N(1,4), find P(1 < X < 9). 39. Find the 90th percentile of the N(10,30) distribution. 40. Suppose X ~ N(µ, σ^2) with P(X < 89) = .90 and P(X < 94) = .95. Find µ and σ^2. 41. Suppose X ~ N(µ, σ^2); show that E|X − µ| = σ√(2/π). 42. X ~ N(1, 4) and Y = e^X.

(a) Show that E(Y) ≥ e^{E(X)}. (b) Derive the density of Y. (c) Compute E(Y). (Hint: What is the MGF for X?)

43. X, Y, U are independent random variables: X ~ N(5,4), Y ~ χ^2_1, and U is distributed Bernoulli with P(U=1) = 0.3. Let W = UX + (1−U)Y. Find the mean and variance of W. 44. Suppose that the 3×1 vector X is distributed N(0, I_3), and let V be a non-random 2×3 matrix that satisfies VV′ = I_2. Let Y = VX.



(a) Show that Y′Y ~ χ^2_2. (b) Use Chebyshev's inequality to construct an upper bound for Pr(Y′Y > 6). (c) Find the exact value of Pr(Y′Y > 6).

45. Let l denote a p×1 vector of 1's, I_p denote the identity matrix of order p, suppose the p×1 vector Y is distributed as Y ~ N(0, σ^2 I_p), with σ^2 = 4, and let M = I_p − (1/p)ll′.

(a) Show that M is idempotent. (b) Compute E(Y′MY). (c) Compute var(Y′MY). (d) Suppose E(Y) = 0 and var(Y) = σ^2 I_p, but Y is not normally distributed. Would your answer to (b) change? Would your answer to (c) change? Explain.

46. Suppose that X and Y are two random variables with a joint normal distribution. Further suppose var(X) = var(Y). Let U = X + Y and V = X − Y.

(a) Prove that U and V are jointly normally distributed. (b) Prove that U and V are independent.

47. Let X ~ N_p(µ, Σ). Also let X = (X_1′, X_2′)′, µ = (µ_1′, µ_2′)′, and

Σ = [ Σ_11  Σ_12 ; Σ_21  Σ_22 ]

be the partitions of X, µ, and Σ such that X_1 and µ_1 are k-dimensional and Σ_11 is a k×k matrix. Show that Σ_12 = 0 implies that X_1 and X_2 are independent.

48. Suppose X_i are i.i.d. N(µ, σ^2), for i = 1, …, n. Let X̄ = n^{-1} Σ_{i=1}^n X_i and s^2 = (n−1)^{-1} Σ_{i=1}^n (X_i − X̄)^2.

Show each of the following. (Hint: Let l denote an n×1 vector of 1's. Note that X̄ = (l′l)^{-1}l′X, where X = (X_1 X_2 … X_n)′, and Σ_{i=1}^n (X_i − X̄)^2 = X′[I − l(l′l)^{-1}l′]X.)

(a) (X̄ − µ)/(σ/√n) ~ N(0,1).

(b) (n − 1)s^2/σ^2 ~ χ^2_{n−1}. (c) X̄ and (n − 1)s^2/σ^2 are independent.

(d) (X̄ − µ)/(s/√n) ~ t_{n−1}.

(e) Suppose µ = 10, σ = 2, and n = 8. Compute E(X̄ | s^2 = 5).
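The two matrix identities in the hint are easy to verify numerically. A minimal numpy sketch (my addition, with arbitrary simulated data):

import numpy as np

rng = np.random.default_rng(0)
n = 8
X = rng.normal(10, 2, size=n)
l = np.ones(n)

xbar = (1 / (l @ l)) * (l @ X)           # (l'l)^{-1} l'X
M = np.eye(n) - np.outer(l, l) / n       # I - l(l'l)^{-1}l'
ss = X @ M @ X                           # X'[I - l(l'l)^{-1}l']X

print(np.isclose(xbar, X.mean()))
print(np.isclose(ss, ((X - X.mean())**2).sum()))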

49. X and Y are random variables. The marginal distribution of X is χ^2_1. The conditional distribution of Y|X=x is N(x, x).



(a) Derive the mean and variance of Y. (b) Suppose that you are told that X = 1. Derive the minimum mean square error prediction of Y. (c) (More difficult) Suppose that you are told that X < 1. Derive the minimum mean square error prediction of Y.

50. (a) Y ~ N(10,4). Find the shortest interval of the real line, say C, such that P(Y ∈ C) = 0.90. (b) Suppose Y has a continuous pdf f. How would you find values a and b such that (i) Pr(a ≤ Y ≤ b) = 0.90 and (ii) b − a is minimized? 51. Suppose that X and Y are two random vectors that satisfy:

E[X; Y] = [µ_X; µ_Y]  and  var[X; Y] = [ Σ_XX  Σ_XY ; Σ_YX  Σ_YY ],

where X is k×1 and Y is scalar. Let Ŷ = a + bX.

(a) Derive the values of the scalar a and the 1×k matrix b that minimize E[(Y − Ŷ)^2]. (b) Find an expression for the minimized value of E[(Y − Ŷ)^2].

52. Suppose that X_i are i.i.d. N(0,1) for i = 1, …, n and let s^2 = n^{-1} Σ_{i=1}^n X_i^2.

(a) Show that s^2 →a.s. 1.

(b) Show that s^2 →p 1.

(c) Show that s^2 →m.s. 1. (d) Suppose n = 15; find P(s^2 ≤ .73) using an exact calculation.

(e) Now let s̃^2 = (n−1)^{-1} Σ_{i=1}^n (X_i − X̄_n)^2, where X̄_n = n^{-1} Σ_{i=1}^n X_i.

(i) Show E(s̃^2) = 1. (ii) Show s̃^2 →p 1. 53. Suppose X_i, i = 1, …, n are a sequence of i.i.d. random variables with E(X_i) = 0 and var(X_i) = σ^2 < ∞. Let Y_i = iX_i and Ȳ = n^{-2} Σ_{i=1}^n Y_i. Prove Ȳ →p 0. (Hint: can you show mean square convergence?)

54. X_i ~ U[0,1] for i = 1, …, n. Let Y = Σ_{i=1}^n X_i. Suppose that n = 100. Use an appropriate approximation to find the (approximate) value of P(Y ≤ 52). 55. Suppose X_i ~ i.i.d. N(0, σ^2) for i = 1, …, n, and let Y = X̄/(s/√n), where X̄ = n^{-1} Σ_{i=1}^n X_i and s^2 = (n−1)^{-1} Σ_{i=1}^n (X_i − X̄)^2.



(a) Prove that Y ~ t_{n−1}.

(b) Prove that Y →d Z where Z ~ N(0,1) (as n → ∞). 56. Let Y ~ χ^2_n, let X_n = n^{-1/2}Y − n^{1/2}, and W_n = n^{-1}Y.

(a) Prove that X_n →d X ~ N(0,2).

(b) Prove that W_n →p 1. (c) Suppose that n = 30. Calculate P(Y > 43.8). (d) Use the result in (a) to approximate the answer in (c).

57. Let X̄_1 and X̄_2 denote the sample means from two independent randomly drawn samples of Bernoulli random variables. Both samples are of size n, and the Bernoulli population has P(X = 1) = 0.5. Use an appropriate approximation to determine how large n must be so that P(|X̄_1 − X̄_2| < .01) = .95.
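A calculation sketch for problem 57 (my addition; scipy assumed): by the CLT, X̄_1 − X̄_2 is approximately N(0, 2p(1−p)/n) with p = 0.5, so n solves 0.01 = 1.96√(0.5/n):

from scipy.stats import norm
import numpy as np

z = norm.ppf(0.975)               # 1.96
var_diff = 2 * 0.5 * 0.5          # var(X1bar - X2bar) = 2 p(1-p) / n, without the 1/n
n = var_diff * (z / 0.01)**2      # solve 0.01 = z * sqrt(var_diff / n)
print(int(np.ceil(n)))            # roughly 19,208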

58. Suppose that X_i ~ i.i.d. N(0, Σ) for i = 1, …, n, where X_i is a p×1 vector. Let α denote a non-zero p×1 vector and let

Y_n = (α′X_n X_n′α) / ((n−1)^{-1} Σ_{i=1}^{n−1} α′X_i X_i′α).

(a) Prove that Y_n ~ F_{1,n−1}. (Hint: recognize that α′X_n is a scalar.)

(b) Sketch a proof that Y_n →d Y, where Y ~ χ^2_1 (as n → ∞).

59. Let Y ~ N_p(0, λI_p), where λ is a positive scalar. Let Q denote a non-stochastic p×k matrix with full column rank, P_Q = Q(Q′Q)^{-1}Q′, and M_Q = I − P_Q. Finally, let X_1 = Y′P_QY and X_2 = Y′M_QY.

(A few useful matrix facts: (i) If A and B are two matrices with dimensions so that AB and BA are defined, then tr(AB) = tr(BA); (ii) if Q is an idempotent matrix then rank(Q) = tr(Q).)

(a) Prove that E(X_2) = (p − k)λ. (b) Let W = (X_1/X_2)((p − k)/k). Show that W ~ F_{k,p−k}.

(c) Suppose that p = 30 and k = 15. Find P(W > 2.4).



(d) Suppose that k = [p/2], where [z] denotes the greatest integer less than or equal to z. Show that, as p → ∞, √p(W − 1) →d R ~ N(0, V), and derive an expression for V. (e) Use your answer in (d) to approximate the answer to (c).

60. X_i ~ i.i.d. with E(X_i) = 0 and var(X_i) = σ^2. A researcher estimates σ^2 using the method of moments estimator σ̂^2 = n^{-1} Σ_{i=1}^n X_i^2.

(a) Show that √n(σ̂^2 − σ^2) →d N(0, V) and derive an expression for V. (Assume the existence of any necessary moments.) (b) Let σ̂ = √(σ̂^2) denote an estimator of the standard deviation. Show that √n(σ̂ − σ) →d N(0, W) and derive an expression for W.

61. X_i are distributed i.i.d. Bernoulli with parameter p. (a) Show that X̄ →p p. (b) Show that √n(X̄ − p) →d N(0, V) and derive an expression for V. (c) A researcher wants to estimate p using X̄. She wants to choose a sample size, n, large enough so that P(|X̄ − p| > 0.01) ≤ 0.05. Using the approximation in (b): (i) How large should n be if p = 0.4? (ii) How large should n be if p = 0.6? (iii) How large should n be if p = 0.7? (iv) What advice would you give the researcher that will work regardless of the value of p? (d) Suppose you missed class on the day we covered the CLT and didn't know that X̄ was approximately normally distributed, but you did know that E(X̄) = p and var(X̄) = n^{-1}p(1−p). Use Chebyshev's inequality to answer (c.i)-(c.iv).
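For 61(c)-(d), a calculation sketch (my addition): the CLT bound is n ≥ p(1−p)(1.96/0.01)^2, while the Chebyshev bound is n ≥ p(1−p)/(0.05 × 0.01^2):

from scipy.stats import norm
import numpy as np

z = norm.ppf(0.975)
for p in (0.4, 0.6, 0.7):
    n_clt = p * (1 - p) * (z / 0.01)**2          # CLT-based sample size
    n_cheb = p * (1 - p) / (0.05 * 0.01**2)      # Chebyshev-based sample size
    print(p, int(np.ceil(n_clt)), int(np.ceil(n_cheb)))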

62. Three candidates are running for the same elected position. Call these candidates A, B, and C. A random sample of n voters is collected. Each voter is asked to state a preference for one (and only one) of three candidates running for president. Let p_A denote the fraction of voters in the population who prefer A and p_B the fraction that prefer B. (Note p_C = 1 − p_A − p_B.) Similarly let p̂_A and p̂_B denote the fractions in the sample.

Show that √n[(p̂_A − p_A), (p̂_B − p_B)]′ →d N(0, V) and derive an expression for V.



63. e_i, i = 0, …, n are i.i.d. with mean 0 and variance 1. For i = 1, …, n the random variable X_i is constructed as X_i = e_i + 0.5e_{i−1}.

(a) Show that n^{-1} Σ_{i=1}^n X_i →p 0.

(b) Show that n^{-1/2} Σ_{i=1}^n X_i →d N(0, V), and derive an expression for V.

64. Suppose that X_i is i.i.d. N(µ,1) for i = 1, …, n. The risk function for an estimator µ̂ is R(µ̂, µ) = E[(µ̂ − µ)^2].

(a) Let µ̂_1 = X̄. Derive R(µ̂_1, µ). (b) Let µ̂_2 = (1/2)X̄. Derive R(µ̂_2, µ). (c) Which estimator is better?

65. X_1 and X_2 are i.i.d. Bernoulli random variables with parameter p. The value of p is unknown, but you know that p takes one of two values, p = 0.3 or p = 0.4. Your prior is P(p = 0.3) = 0.7 and P(p = 0.4) = 0.3. You observe X_1 = 1 and X_2 = 0.

(a) What is the likelihood f(X_1 = 1, X_2 = 0 | p) as a function of p? (b) What is the marginal likelihood f(X_1 = 1, X_2 = 0)? (c) What is the posterior f(p = 0.3 | X_1 = 1, X_2 = 0)? (d) Loss is quadratic. What is your best guess of the value of p?
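A numerical sketch of the Bayes updating in problem 65 (my addition; plain Python). Under quadratic loss, the best guess in (d) is the posterior mean:

priors = {0.3: 0.7, 0.4: 0.3}

# (a) likelihood of X1 = 1, X2 = 0 given p is p(1-p)
lik = {p: p * (1 - p) for p in priors}

# (b) marginal likelihood; (c) posterior; (d) posterior mean
marg = sum(lik[p] * priors[p] for p in priors)
post = {p: lik[p] * priors[p] / marg for p in priors}
p_hat = sum(p * post[p] for p in priors)
print(marg, post, p_hat)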

66. Y_1 and Y_2 are i.i.d. N(µ, 1). Let µ̂_1 = (Y_1 + Y_2)/2, µ̂_2 = (Y_1 + Y_2)/4, and µ̂_3 = (Y_1 + Y_2). Loss is quadratic: L(µ̂, µ) = (µ̂ − µ)^2.

(a) Derive the frequentist risk functions R(µ̂_1, µ), R(µ̂_2, µ), and R(µ̂_3, µ). Plot these as a function of µ. (b) Is µ̂_3 admissible? Explain. (c) Suppose the prior for µ is P(µ = 0) = 0.3 and P(µ = 1) = 0.7. Derive the Bayes risk r(µ̂_1), r(µ̂_2), and r(µ̂_3). Rank the estimators. (d) Suppose the prior for µ is µ ~ N(0,1). Derive the Bayes risk r(µ̂_1), r(µ̂_2), and r(µ̂_3). Rank the estimators. (e) Suppose the prior for µ is µ ~ N(0, σ^2). Derive the Bayes risk r(µ̂_1), r(µ̂_2), and r(µ̂_3). Rank the estimators.

67. Assume that Y_i|θ ~ i.i.d. N(θ, 1) for i = 1, …, n.

(a) Suppose that θ ~ N(τ, ω^2). Show that the posterior distribution of θ depends on {Y_i}_{i=1}^n only through Ȳ. (b) Suppose that θ ~ w for arbitrary prior density w. Show that the posterior distribution of θ depends on {Y_i}_{i=1}^n only through Ȳ.



68. Assume that Y_i|θ ~ i.i.d. N(θ, 1) for i = 1, …, n. The prior for θ is θ ~ N(τ, ω^2). The loss for an estimator θ̂ is (θ̂ − θ)^2.

(a) Show that θ̂ = 0.25Y_1 + 0.75Y_2 is an inadmissible estimator of θ. (Hint: Find an estimator that dominates θ̂.)

(b) Show that θ̃ = (1/2)Ȳ is an admissible estimator of θ. (Hint: Recall that Bayes estimators are admissible. Find a prior that rationalizes θ̃ as a Bayes estimator.)

69. Y_i|µ is i.i.d. N(µ, σ^2) for i = 1, …, n. Let Ȳ denote the sample mean, and consider µ̂ = Ȳ + 2 as an estimator of µ. Suppose that the loss function is L(µ̂, µ) = (µ̂ − µ)^2. Show that µ̂ is an inadmissible estimator of µ. (Hint: Find an estimator that dominates µ̂.)

70. X_i ~ i.i.d. Bernoulli with parameter p. The prior for p is

p = 1/3 with probability 1/5, and p = 1/2 with probability 4/5.

A sample of size 2 yields X_1 = 1 and X_2 = 0.

(a) Derive the posterior distribution of p. (b) Suppose loss is quadratic. Derive the Bayes estimate of p.

71. Y_i ~ i.i.d. N(µ, 1) for i = 1, …, n. Let µ̂(Y_{1:n}) denote an estimator of µ. Loss is quadratic and the frequentist risk is mean squared error: R(µ̂(Y_{1:n}), µ) = E_{Y_{1:n}|µ}[(µ̂(Y_{1:n}) − µ)^2]. Suppose I want to minimize a weighted average of R(µ̂, µ) with a weight of 1/3 on µ = 1 and a weight of 2/3 on µ = 2. Derive the optimal estimator. 72. (X,Y) are two random variables with a joint distribution that depends on a scalar parameter θ; write the joint density as f(x,y|θ). Suppose that you have a prior, say w(θ), for θ.

(a) You observe X = x and Y = y. Write an expression for the posterior density for θ. (b) Alternatively, suppose that the data arrive sequentially. Specifically, you begin with the prior w(θ) and then observe X = x. Based on this observation you compute a posterior for θ; this posterior serves as an "updated prior". You then observe Y = y, and based on this observation and your "updated prior" you compute the posterior for θ. Show that this sequential procedure yields the same posterior as in (a).

73. X is a Bernoulli random variable with P(X = 1) = p. You have a sample of i.i.d. random variables from this distribution, X_1, X_2, …, X_n. Let p_o denote the true, but unknown, value of p.

(a) Write the log-likelihood function for p. (b) Compute the score function for p and show that the score has expected value 0 when evaluated at p_o. (c) Compute the information I(p_o).



(d) Show the maximum likelihood estimator is given by p̂_MLE = X̄. (e) Show that p̂_MLE is unbiased.

(f) Show that √n(p̂_MLE − p_o) →d N(0, V), and derive an expression for V. (g) Propose a consistent estimator for V.

74. X_i ~ i.i.d. Bernoulli with parameter p, for i = 1, …, n. Let X_{1:n} = (X_1 … X_n)′.

(a) Derive the likelihood function f(X_{1:n} | p). (b) Derive the score, S(X_{1:n}, p), and show that it has mean zero. (c) Let p̂ = X̄. Show that p̂ is the "best unbiased estimator" of p. ("Best" means minimum mean square estimator in this context.)

75. Let Y_i, i = 1, …, n denote an i.i.d. sample of Bernoulli random variables with Pr(Y_i = 1) = p. I am interested in a parameter θ that is given by θ = e^p.

(a) Construct the likelihood function for θ. (b) Show that θ̂_MLE = e^{Ȳ}.

(c) Show that θ̂_MLE is consistent.

(d) Show that √n(θ̂_MLE − θ) →d N(0, V), and derive an expression for V. 76. Suppose √n(θ̂ − θ_0) →d N_k(0, V). Let V̂ be a consistent estimator of V. Show that

(θ̂ − θ_0)′(V̂/n)^{-1}(θ̂ − θ_0) →d χ^2_k.

77. Suppose Y_i|µ ~ i.i.d. N(µ, 1), i = 1, …, n. Suppose that you know µ ≤ 5. Let µ̂_MLE denote the value of µ that maximizes the likelihood, subject to the constraint µ ≤ 5, and Ȳ denote the sample mean of the Y_i.

(a) Show that µ̂_MLE = a(Ȳ)·5 + (1 − a(Ȳ))·Ȳ, where a(Ȳ) = 1 for Ȳ ≥ 5, and a(Ȳ) = 0 otherwise.

(b) Suppose that µ = 3. (i) Show that a(Ȳ) →p 0. (ii) Show that µ̂_MLE →p µ. (iii) Show that √n(µ̂_MLE − µ) →d N(0,1).

(c) Suppose that µ = 7. (i) Show that a(Ȳ) →p 1. (ii) Show that µ̂_MLE →p 5.



(d) Suppose that µ = 5. Show that µ̂_MLE →p µ. 78. Y_i ~ i.i.d. N(µ, 1) for i = 1, 3, 5, …, n−1, Y_i ~ i.i.d. N(µ, 4) for i = 2, 4, …, n, where n is even, and Y_i and Y_j are independent for i odd and j even.

(a) Derive the log-likelihood function for µ as a function of Y_{1:n}, f(Y_{1:n}|µ). (b) Derive the MLE of µ. (c) Is the MLE preferred to Ȳ as an estimator of µ? Explain. (Discuss the properties of the estimators. Are the estimators unbiased? Consistent? Asymptotically normal? Which estimator has lower quadratic risk?)

79. Y_i ~ i.i.d. N(µ_Y, 1) and X_i ~ i.i.d. N(µ_X, 1) for i = 1, …, n. Suppose X_i and Y_j are independent for all i and j. A researcher is interested in θ = µ_Y/µ_X. Let θ̂ denote the MLE of θ.

(a) Show that θ̂ = Ȳ/X̄.

(b) Suppose that µ_Y = 0 and µ_X = 5. Show that √n θ̂ →d W ~ N(0, V) and derive an expression for V.

(c) Suppose that µ_Y = 1 and µ_X = 5. Show that √n(θ̂ − 1/5) →d W ~ N(0, V) and derive an expression for V.

expression for V. 80. Suppose that Yi is i.i.d. with mean 0 and variance s for i = 1, …, n with E( ) = k. Let

.

(a) Show that and derive an expression for V.

(b) Suppose a sample with that n = 100 yields = 2.1 and = 16. Construct a 95%

confidence interval for s, where s is the standard deviation of Y. 81. Suppose that Xi ~ i.i.d. N(0, s2),

(a) Construct the MLE of . (b) Prove that the estimator is unbiased. (c) Compute the variance of the estimator.

(d) Prove that . Derive and expression for (e) Suppose = 4 and . Test versus using a Wald test

with a size of (f) Suppose = 4and . Construct a confidence interval for (g) Suppose = 4 and . Test versus using a test of your

choosing. Set the size of the test at . Justify your choice of test. (h) Construct the MLE of



(i) Is the MLE of σ unbiased? Explain.

(j) Prove that √n(σ̂_MLE − σ) →d N(0, W). Derive an expression for W.

(k) Suppose σ̂_MLE = 3 and n = 200. Construct a 90% confidence interval for σ. (l) Suppose that the random variables were i.i.d. but non-normal. Would the confidence interval that you constructed in (f) have the correct coverage probability (i.e., would it contain the true value of σ^2 95% of the time)? How could you modify the procedure to make it robust to the assumption of normality (i.e., so that it provided the correct coverage probability even if the data were drawn from a non-normal distribution)?

82. Y_i, i = 1, …, n are i.i.d. Bernoulli random variables with P(Y_i = 1) = p.

(a) Show that Ȳ(1 − Ȳ) →p var(Y), where var(Y) denotes the variance of Y. (b) Show that √n(Ȳ(1 − Ȳ) − var(Y)) →d N(0, V) and derive an expression for V. (c) I am interested in H_o: p = 0.5 vs. H_a: p > 0.5. The sample size is n = 100, and I decide to reject the null hypothesis if Σ_{i=1}^n Y_i > 55. Use the central limit theorem to derive the approximate size of the test.

83. X_i, i = 1, …, n, are distributed i.i.d. Bernoulli random variables with P(X_i = 1) = p. Let X̄ denote the sample mean.

(a) Show that X̄ is an unbiased estimator of the mean of X. (b) Let σ̂^2 = X̄(1 − X̄) denote an estimator of σ^2, the variance of X.

(i) Show that σ̂^2 is a biased estimator of σ^2. (ii) Show that σ̂^2 is a consistent estimator of σ^2.

(iii) Show that √n(σ̂^2 − σ^2) →d N(0, V) and derive an expression for V.

(c) Suppose n = 100 and X̄ = 0.4. (i) Test H_0: p = 0.35 versus H_a: p > 0.35 using a test with size of 5%. (ii) Construct a 95% confidence interval for σ^2.

84. Suppose X_i, i = 1, …, n are a sequence of independent Bernoulli random variables with P(X = 1) = p. Based on a sample of size n, a researcher decides to test the null hypothesis H_o: p = 0.45 versus H_a: p > 0.45 using the following rule: Reject H_o if Σ_{i=1}^n X_i ≥ n/2. Otherwise, do not reject H_o.

(a) Compute the exact size of this test for n = 2. (b) Suppose n = 50. Use the Central Limit Theorem to get an approximate value of the size of the test. (c) Prove that lim_{n→∞} P(Σ_{i=1}^n X_i ≥ n/2) = 0 when p = 0.45.
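For 84(a)-(b), a numerical sketch (my addition; assumes scipy). With n = 2 the rejection event is Σ X_i ≥ 1; for n = 50 the CLT approximates P(Σ X_i ≥ 25), here without a continuity correction:

from scipy.stats import binom, norm
import numpy as np

p = 0.45
exact_n2 = 1 - binom.pmf(0, 2, p)               # (a): P(sum >= 1) = 1 - 0.55^2
mean, sd = 50 * p, np.sqrt(50 * p * (1 - p))
approx_n50 = norm.sf((25 - mean) / sd)          # (b): CLT approximation
print(exact_n2, approx_n50)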



85. X_i ~ i.i.d. N(0, σ^2) for i = 1, …, 10. A researcher is interested in the hypothesis H_o: σ^2 = 1 versus H_a: σ^2 > 1. He decides to reject the null if (1/10) Σ_{i=1}^{10} X_i^2 > 1.83.

(a) What is the size of his test? (b) Suppose that H_a is true, and specifically, that σ^2 = 1.14. What is the power of his test? (c) Is the researcher using the best test? Explain.

86. X_i are i.i.d. N(0, σ^2). Let s^2 = n^{-1} Σ_{i=1}^n X_i^2. I am interested in the competing hypotheses H_o: σ^2 = 1 versus H_a: σ^2 > 1. Suppose n = 15 and I decide to reject the null if s^2 > 5/3. (a) What is the size of the test? (b) Suppose that the true value of σ^2 is 1.5. What is the power of the test? (c) Is this the most powerful test? Explain.

87. X ~ N(0, σ^2). You are interested in the competing hypotheses H_o: σ^2 = 1 vs. H_a: σ^2 = 2. Based on a sample of size 1, you reject the null when X^2 > 3.84. (a) Compute the size of the test. (b) Compute the power of the test. (c) Prove that this is the most powerful test. (d) Let σ̂^2 = X^2.

(i) Show that σ̂^2 is unbiased. (ii) Show that σ̂^2 is the minimum mean squared error estimator of σ^2.

88. A coin is to be tossed 4 times and the outcomes are used to test the null hypothesis H_o: p = 1/2 vs. H_a: p > 1/2, where p denotes the probability that a "tails" appears. ("Tails" is one surface of the coin, and "heads" is the other surface.) I decide to reject the null hypothesis if "tails" appears in all four of the tosses.

(a) What is the size of this test? (b) Sketch the power function of this test for values of p > 1/2. (c) Is this the best test to use? Explain.

89. A researcher has 10 i.i.d. draws from a N(0, σ^2) distribution, (X_1, X_2, …, X_{10}). He is interested in the competing hypotheses H_o: σ^2 = 1 vs. H_a: σ^2 = 4. He plans to test these hypotheses using the test statistic σ̂^2 = (1/10) Σ_{i=1}^{10} X_i^2. The procedure is to reject H_o in favor of H_a if σ̂^2 > 1.83; otherwise H_o will not be rejected. (a) Determine the size of the test. (b) Determine the power of the test. (c) Is the researcher using the most powerful test given the size from (a)? If so, prove it. If not, provide a more powerful test.



90. Suppose that X_i is i.i.d. with mean µ, variance σ^2, and finite fourth moment. A random sample of n = 100 observations is collected, and you are given the following summary statistics: X̄ = 1.23 and s^2 = 4.33 (where s^2 = (n−1)^{-1} Σ_{i=1}^{100} (X_i − X̄)^2 is the sample variance). (a) Construct an approximate 95% confidence interval for µ. Why is this an "approximate" confidence interval? (b) Let a = µ^2. Construct an approximate 95% confidence interval for a.

91. A researcher is interested in computing the likelihood ratio statistic

LR = f(Y, θ_1)/f(Y, θ_2),

but finds the calculation difficult because the density function f(·) is very complicated. However, it is easy to compute the joint density of (Y, X), where X denotes another random variable. Let g(y, x, θ) denote the joint density (which depends on the value of θ). Unfortunately, the researcher has data on Y, but not on X. Show that the researcher can compute LR using the formula

LR = f(Y, θ_1)/f(Y, θ_2) = E_{θ_2}[ g(Y, X, θ_1)/g(Y, X, θ_2) | Y ],

where the notation E_{θ_2}[ • | y ] means that the expectation should be taken using the distribution conditional on Y and using the parameter value θ_2. 92. Suppose X_i ~ i.i.d. Bernoulli with parameter p, for i = 1, …, n. I am interested in testing H_0: p = 0.5 versus H_a: p > 0.5. Let X̄ denote the sample mean of X_i.

(a) A researcher decides to reject the null if X̄ > 0.5(1 + n^{-1/2}). Derive the size of the test as n → ∞.

(b) Suppose p = 0.51. What is the power of this test as n → ∞? (c) The researcher asks you if she should use another test instead of the test in (a). What advice do you have to offer her? 93. Consider two random variables X and Y. The distribution of X is given by the uniform distribution X ~ U[0.5, 1.0], and the distribution of Y conditional on X = x is given by the normal Y | X = x ~ N(0, λx).

(a) Derive an expression for the joint density of X and Y, say f_{X,Y}(x, y). (b) Prove that Z = Y/√X is distributed as Z ~ N(0, λ). (c) You have a sample {X_i, Y_i}_{i=1}^N. Derive the log likelihood function for λ as a function of {X_i, Y_i}_{i=1}^N. (d) Construct the MLE of λ.



(e) Suppose that N = 10 and Σ_{i=1}^{10} (Y_i^2/X_i) = 4.0. Construct a 95% confidence interval for λ. 94. Suppose Y_i ~ i.i.d. with density f(y, θ), for i = 1, …, n, where θ is the 2×1 vector θ = (θ_1 θ_2)′. Let θ_0 = (θ_{1,0} θ_{2,0})′ denote the true value of θ. Let s_i(θ) denote the score for the i'th observation, var[s_i(θ_0)] = I(θ_0) denote the 2×2 information matrix, and S_{1:n}(θ) = Σ_{i=1}^n s_i(θ) denote the score for the observations {Y_i}_{i=1}^n. Suppose that the problem is "regular" in the sense that n^{-1/2}S_{1:n}(θ_0) →d N(0, I(θ_0)), √n(θ̂ − θ_0) →d N(0, I(θ_0)^{-1}), and so forth, where θ̂ is the MLE of θ. Now, suppose the likelihood is maximized not over θ, but only over θ_2, with θ_1 taking on its true value θ_1 = θ_{1,0}. That is, let θ̃_2 solve

max_{θ_2} L_{1:n}(θ_{1,0}, θ_2),

where L_{1:n}(θ_1, θ_2) is the log-likelihood.

(a) Show that √n(θ̃_2 − θ_{2,0}) →d N(0, V) and derive an expression for V.

(b) Show that n^{-1/2} ∂L_{1:n}(θ_{1,0}, θ̃_2)/∂θ_1 →d N(0, Ω) and derive an expression for Ω. (Note: ∂L_{1:n}(θ_{1,0}, θ̃_2)/∂θ_1 is the first element of S_{1:n}(θ_{1,0}, θ̃_2).)

95. Suppose that two random variables θ and Y have joint density f_{θ,Y}(θ, y) and marginal densities w(θ) and f_Y(y), and suppose that the distribution of Y|θ is N(θ, σ^2). Denote this conditional density as f_{Y|θ}(y|θ).

(a) Show that E(θ | Y = y) = [∫ θ f_{Y|θ}(y|θ) w(θ) dθ] / f_Y(y).

(b) Show that ∂f_{Y|θ}(y|θ)/∂y = (1/σ^2)(θ − y) f_{Y|θ}(y|θ), and solve this for θ f_{Y|θ}(y|θ).

(c) Show that ∂f_Y(y)/∂y = ∫ [∂f_{Y|θ}(y|θ)/∂y] w(θ) dθ.

(d) Using your results from (a) and (c), show that E(θ | Y = y) = y + σ^2 S(y), where

S(y) = ∂ln(f_Y(y))/∂y = [∂f_Y(y)/∂y] / f_Y(y).



Often this formula provides a simple method for computing E(θ | Y = y) when f_{Y|θ}(y|θ) is N(θ, σ^2). (In the literature, this is called a "simple Bayes formula" or "Tweedie's formula" for the Bayes estimator of θ.) 95. Suppose that a test statistic, T_n, for testing H_o: θ = 0 vs. H_a: θ ≠ 0 constructed from a sample of size n is known to have a certain asymptotic null distribution T_n ⇒ X ~ F, where X is a random variable with distribution function F(x), and convergence obtains under the null hypothesis. Suppose that the distribution depends on an unknown "nuisance" parameter ρ. Represent this dependence by F(x, ρ). Let x(α, ρ) solve F(x(α, ρ), ρ) = 1 − α, and suppose that x(α, ρ) is continuous in ρ. Let ρ_o denote the true value of ρ and let ρ̂_n denote a consistent estimator of ρ. Prove that a critical region defined by T_n > x(α, ρ̂_n) produces a test with asymptotic size equal to α. (Such a test is said to be asymptotically similar.)

96. X_i ~ i.i.d. U(0, θ), for i = 1, …, n, and let X̄ = n^{-1} Σ_{i=1}^n X_i.

(a) Show that 2X̄ →p θ.

(b) Show that √n(2X̄ − θ) →d N(0, V) and derive an expression for V. (c) I am interested in the competing hypotheses H_o: θ = 1 versus H_a: θ = 0.8, and I decide to reject H_o if X̄ ≤ 0.45. (c1) Using an appropriate approximation, compute the size of the test. (c2) Using an appropriate approximation, compute the power of the test. (c3) Is this the best test to use?

97. Suppose that Y_t is generated by a "Moving Average-1" (MA(1)) process, Y_t = µ + ε_t + θε_{t−1}, where µ is a constant and ε_t ~ iid(0, σ^2) for all t. Let Ȳ_n = n^{-1} Σ_{t=1}^n Y_t. Prove that √n(Ȳ_n − µ) →d N(0, σ^2(1 + θ)^2).
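A Monte Carlo check of problem 97 (my addition): simulate the MA(1) and compare the variance of √n(Ȳ_n − µ) with σ^2(1 + θ)^2:

import numpy as np

rng = np.random.default_rng(0)
n, reps, mu, theta, sigma = 500, 2000, 1.0, 0.5, 1.0

stats = np.empty(reps)
for r in range(reps):
    e = rng.normal(0, sigma, n + 1)
    Y = mu + e[1:] + theta * e[:-1]          # MA(1): Y_t = mu + e_t + theta*e_{t-1}
    stats[r] = np.sqrt(n) * (Y.mean() - mu)

print(stats.var(), sigma**2 * (1 + theta)**2)   # the two numbers should be close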

98. A sample of size 30 of i.i.d. draws from a N(µ, σ^2) population yields Σ_{i=1}^{30} X_i = 50 and Σ_{i=1}^{30} X_i^2 = 120.



(a) Calculate the MLE’s of µ and s2. (b) Suppose that you knew that µ = 2, Calculate the MLE of s2. (c) Construct the Wald test of using a critical value. (d) Construct the Wald test of Ho: µ = 2, s2 = 1 vs. Ha: µ≠2, s2≠1, using a critical value. (e) Construct a joint approximate confidence set for and Plot the resulting

confidence ellipse. (f) Calculate the MLE of Calculate a confidence interval for .
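The MLE arithmetic in 98(a)-(b) is short; a sketch of the computation (my addition):

n, sum_x, sum_x2 = 30, 50.0, 120.0

mu_hat = sum_x / n                                     # (a) MLE of mu
s2_hat = sum_x2 / n - mu_hat**2                        # (a) MLE of sigma^2
s2_hat_mu2 = (sum_x2 - 2 * 2 * sum_x + n * 2**2) / n   # (b) MLE of sigma^2 when mu = 2
print(mu_hat, s2_hat, s2_hat_mu2)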

99. Suppose X_i, i = 1, …, n, are distributed i.i.d. U(0, θ) (that is, uniformly over 0 to θ).

(a) What is the MLE of θ? (b) Derive the CDF of θ̂_MLE. (c) Prove that θ̂_MLE →p θ_0, where θ_0 denotes the true value of θ. (d) I am interested in testing H_0: θ = 1 vs. H_a: θ > 1, and do this by using a critical region defined by max_{1≤i≤n} X_i > c. Suppose that the sample size is n = 100. (i) Find the value of c so that the test has a size of 5%. (ii) Using this value of c, find the power of the test when θ = 1.02.

100. Y_i ~ i.i.d. with mean µ and variance σ^2 for i = 1, …, n. A sample of size n = 100 yields (1/100) Σ_{i=1}^{100} Y_i = 5 and (1/100) Σ_{i=1}^{100} Y_i^2 = 50. Construct an approximate 95% confidence interval for µ. (Justify the steps in your approximation.)

(Justify the steps in your approximation.) 101. are i.i.d. Bernoulli random variables with P(Yi = 1) = p. I am interested in the hypotheses Ho: p= ½ and Ha: p > ½. The sample size is , and I decide the reject the null hypothesis if .

(a) Use the central limit theorem to derive the approximate size of the test. (b) Suppose . Use the central limit theorem to derive the power of the test. (c) Is there a more powerful test with the same size? Explain.

102. Y ~ N(µ, σ^2) and X is a binary random variable that is equal to 1 if Y > 2 and equal to 0 otherwise. Let p = P(X = 1). You have n i.i.d. observations on Y_i, and suppose the sample mean is Ȳ = 4 and the sample variance is n^{-1} Σ_{i=1}^n (Y_i − Ȳ)^2 = 9.

(a) Construct an estimate of p. (b) Explain how you would construct a 95% confidence interval for p. (c) Is the estimator that you used in (a) consistent? Explain why or why not.

103. Suppose Y_i|µ ~ i.i.d. N(µ, 1), i = 1, …, n, µ ~ N(2,1), and let loss be L(µ̂, µ) = (µ̂ − µ)^2. Suppose n = 1.

(a) What is the MLE of µ?



(b) Derive the Bayes risk of the MLE. (c) Derive the posterior for µ. (d) Derive the Bayes estimator of µ. (e) Derive the Bayes risk of the Bayes estimator. (f) Suppose Y_1 = 1.5.

(i) Construct a (frequentist) 95% confidence set for µ. (ii) Construct a (Bayes) 95% credible set for µ.

104. Suppose Y_i ~ i.i.d. N(µ, 1). Let Ȳ denote the sample mean. Suppose loss is quadratic, i.e., L(µ̂, µ) = (µ̂ − µ)^2. Show that (1/2)Ȳ is admissible. 105. Suppose Y_1 and Y_2 are i.i.d. with mean µ and variance σ^2. Let µ̂ = 0.25Y_1 + 0.75Y_2. Suppose that risk is mean squared error, so that R(µ̂, µ) = E[(µ̂ − µ)^2].

(a) Compute R(µ̂, µ). (b) Show that µ̂ is inadmissible.

106. Assume that Y_i|µ ~ i.i.d. N(µ, 1) for i = 1, …, n. Suppose that µ ~ g for some prior density g. Show that the posterior distribution of µ depends on {Y_i}_{i=1}^n only through Ȳ. 107. Assume that Y_i|µ ~ i.i.d. N(µ, 1) for i = 1, …, n. Suppose that µ ~ N(0, λ) with probability p, and µ = 0 with probability 1−p. Derive E(µ|Ȳ). 108. Y_i for i = 1, …, n are independent and distributed N(0, σ_i^2), where σ_i^2 = 1 + γ/i, and γ is a non-negative constant. I am interested in the competing hypotheses H_0: γ = 0 versus H_a: γ = 1.

(a) Derive the optimal test and provide explicit instructions for carrying out the optimal test with size = 10%.

(b) Is the test you derived in (a) the uniformly most powerful test for the composite alternative H_a: γ > 0? 109. Y_i ~ i.i.d. N(µ, 1) for i = 1, …, n. I am interested in the competing hypotheses H_0: µ = 0 versus H_a: µ = 1 or µ = 2. (The alternative is composite and includes µ = 1 and µ = 2.) Derive the test with best "Weighted Average Power" with a weight of 1/3 on µ = 1 and a weight of 2/3 on µ = 2. 110. Suppose

(X_i, Y_i)′ ~ i.i.d. N((µ_X, µ_Y)′, I_2), for i = 1, …, n.



I am interested in the scalar parameter θ = µ_X/µ_Y. Let θ̂ = X̄/Ȳ denote the MLE of θ.

(a) Suppose that µ_Y ≠ 0. (i) Show that √n(θ̂ − θ) ⇒ W ~ N(0, Ω). (ii) Derive an expression for the value of Ω. (iii) Propose a consistent estimator for Ω.

(b) Suppose that µ_Y = 0 and µ_X ≠ 0. Show that n^{-1/2}θ̂ ⇒ µ_X/Z, where Z ~ N(0,1).

(c) Suppose that µ_Y = µ_X = 0. Show that θ̂ ⇒ Z_1/Z_2, where Z_1 and Z_2 are mutually independent N(0,1) random variables.

(d) Suppose that µ_Y is "small," say µ_Y = m_Y/√n, where m_Y is a number like m_Y = 1 or 2. Use asymptotics to argue that n^{-1/2}θ̂ has approximately the same distribution as µ_X/(m_Y + Z), where Z is N(0,1).

(e) Suppose that both µ_Y and µ_X are "small," say µ_Y = m_Y/√n and µ_X = m_X/√n. Use asymptotics to argue that θ̂ has approximately the same distribution as (m_X + Z_1)/(m_Y + Z_2).

(f) Using your insights from (a)-(e), discuss the accuracy of the approximation θ̂ ~a N(θ, V), where V = Ω/n, and how the accuracy depends on the values of µ_X and µ_Y.

(g) Consider the 95% confidence set θ̂ ± 1.96√(Ω̂/n), where Ω̂ is the estimator you proposed in (a.iii). Discuss the validity of this confidence set and how the validity depends on the values of µ_X and µ_Y.

(h) Suppose you are interested in testing H_o: θ = θ_0 versus H_a: θ ≠ θ_0. One procedure is to use the Wald statistic based on the δ-method approximation you worked out in part (a). Another procedure uses "Fieller's method." Here's the idea: if θ = θ_0, then µ_X − θ_0µ_Y = 0, thus the null and alternative can be recast as H_0: µ_X − θ_0µ_Y = 0 versus H_a: µ_X − θ_0µ_Y ≠ 0. This can be tested using a Wald statistic constructed as

ξ_Fieller(θ_0) = (X̄ − θ_0Ȳ)^2 / var(X̄ − θ_0Ȳ).

(i) Compute var(X̄ − θ_0Ȳ) and show that ξ_Fieller(θ_0) ~ χ^2_1 under H_0. Does this result depend on the values of µ_Y and µ_X (as long as they satisfy µ_X − θ_0µ_Y = 0)?

(ii) Consider the set A_Fieller(θ) = {θ | ξ_Fieller(θ) ≤ 3.84}. Show that A_Fieller(θ) is a valid 95% confidence set.

(iii) Compare the confidence set A_Fieller(θ) to the δ-method confidence set θ̂ ± 1.96√(Ω̂/n). Are they (in some relevant sense) the same when |µ_X| and |µ_Y| are large? What if |µ_X| and |µ_Y| are small?



(i) In a sample of size n = 200, X̄ = 4.0 and Ȳ = 5.2. Construct the δ-method and Fieller confidence sets for θ. Are they similar?

(j) In a sample of size n = 200, X̄ = .040 and Ȳ = .052. Construct the δ-method and Fieller confidence sets for θ. Are they similar?
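A computational sketch for 110(i)-(j) (my addition). It assumes var(X̄ − θȲ) = (1 + θ^2)/n given the identity covariance matrix, and uses Ω̂ = (1 + θ̂^2)/Ȳ^2 as the δ-method variance estimator; the Fieller set is the solution of a quadratic inequality in θ:

import numpy as np

def fieller_delta(xbar, ybar, n, crit=1.96):
    theta = xbar / ybar
    omega = (1 + theta**2) / ybar**2        # assumed delta-method Omega-hat
    half = crit * np.sqrt(omega / n)
    delta_set = (theta - half, theta + half)

    # Fieller: (xbar - t*ybar)^2 <= (crit^2/n)(1 + t^2), a quadratic inequality in t
    c2 = crit**2 / n
    a, b, c = ybar**2 - c2, -2.0 * xbar * ybar, xbar**2 - c2
    disc = b * b - 4 * a * c
    if a > 0 and disc >= 0:                 # bounded interval between the two roots
        lo = (-b - np.sqrt(disc)) / (2 * a)
        hi = (-b + np.sqrt(disc)) / (2 * a)
        fieller_set = (lo, hi)
    else:                                   # set is unbounded: the small-|mu_Y| case
        fieller_set = "unbounded"
    return delta_set, fieller_set

for xbar, ybar in [(4.0, 5.2), (0.040, 0.052)]:
    print((xbar, ybar), *fieller_delta(xbar, ybar, 200))

With the large means in (i) the quadratic coefficient is positive and the two sets nearly coincide; with the small means in (j) the coefficient turns negative and the Fieller set becomes unbounded.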
