
STAT151 2010 FINAL EXAM - mysmu.edu


STAT151 2010 FINAL EXAM

• This exam is 3 hours long. It is closed book but you can use an A-4 size cheat sheet.

• There are 6 questions. The points assigned for each question are given next to the question.

• You may need a calculator for some of the questions but you MUST show your work.

• You must show your work or give reasoning to receive at least partial credit for a question. No credit will be given for answers with no indication of how they were derived.

• For questions that require a one or two line explanation, you will receive no credit if you write three or more lines. Being concise matters.

• Work on the easiest questions first before you attempt the harder questions. Harder questions are marked “H”.

• If you encounter a question that you do not know how to do, you may receive partial credit if you tell me in plain English (complemented with appropriate probability/statistical symbols if you like) how you would approach the problem.

• You may use pen or pencil; write legibly.

Name:

IC:

Question Points

1 16

2 12

3 22

4 19

5 17

6 14

Total 100

2010-11 Term 1

1. (16 points) Mohammad and Nurul are students at SMU. In one of their courses, they have to raise funds for a project. They figured that many students and professors love American junk food. So they made some donuts and set up a stall at the School of Economics. They found out each customer either buys 1, 2 or 3 donuts and the number of donuts a customer buys (X) seems to be random and can be assumed to follow a distribution with the following PDF:

| X    | 1  | 2  | 3  |
|------|----|----|----|
| P(X) | 5c | 3c | 2c |

(a) Find the value of c. (2 points)

(b) Find E(X) and var(X) and explain briefly, in simple English, the meanings of E(X) and var(X). (2 points)

(c) Let the number of customers per hour (Y) follow a Poisson distribution with a rate of λ = 10. Assuming Y and X are independent, find the expected value of the number of donuts they sell in an hour. You may use any result from the Poisson distribution without proof. (2 points)

(d) Two of their professors approached their stall. Let V and W be the number of donuts the professors would buy. Assuming V and W are independent, what is the probability that one professor would buy 1 donut and the other would buy 2 donuts? (2 points)


(e) One of the professors bought 2 donuts and the other bought 1 donut. Let S, T be the times (in minutes), respectively, for Professor 1 and 2 to finish the donut(s) and let S, T follow a Uniform(0, 3b) distribution, such that:

\[
f(t) = \begin{cases} \dfrac{1}{3b} & \text{if } 0 \le t \le 3b, \\ 0 & \text{otherwise,} \end{cases}
\qquad
F(t) = \begin{cases} \dfrac{t}{3b} & \text{if } 0 \le t \le 3b, \\ 1 & \text{if } t > 3b, \end{cases}
\]

where b represents the number of donuts. Suppose there is equal probability that Professor 1 or 2 is the one who bought 2 donuts. If S, T are independent of each other, find the probability that they will finish the donuts in 4 minutes (i.e., ≤ 4). (2 points)

(f) Suppose instead that Professor 1 eats faster, such that S ∼ Uniform(0, 2b) whereas T ∼ Uniform(0, 3b). Is the chance that they will finish in 4 minutes different from (e)? If so, calculate the new probability. (3 points)


(g) Under the assumptions in (f), if they finished the donuts in 4 minutes, what can we say about the probability that Professor 1 is the one who bought 2 donuts? (3 points)


2. (12 points) Paul is looking for investors to invest in his property company. The number of (unsuccessful) investors (Y) that he needs to approach before he can find one investor who is willing to invest in his company has the following PDF:

\[
P(Y = k) = \frac{s(4 - 4^k) + 4^k}{5^{k+1}},
\]

where 0 < s < 1 is unknown.

(a) State the parameter(s) of this distribution. (1 point)

(b) Write down the possible values of Y. (1 point)

(c) Write down P(Y = 3.5). (1 point)

(d) Assuming s = 0.5, find the probability that he would have to approach at least two investors before he could find an investor who is willing to invest, i.e., find P(Y ≥ 2). (2 points)


(e) Write down (but do not evaluate) the first three terms in the expressions for evaluating E(Y) and var(Y). (3 points)


(f) The investors that Paul approaches are of two types; the proportions of the two types are s and 1 − s, where 0 < s < 1. For the first type, there is a probability p = 1/5 that an investor would invest and for the second type, p = 4/5. So, the distribution of Y in this question is in fact a mixture of two Geometric distributions:

\[
Y \sim \begin{cases} \text{Geometric}\left(p = \tfrac{1}{5}\right) & \text{with probability } s, \\ \text{Geometric}\left(p = \tfrac{4}{5}\right) & \text{with probability } 1 - s. \end{cases}
\]

The mean and variance for a Geometric(p) distribution are known to be

\[
\frac{1-p}{p} \quad \text{and} \quad \frac{1-p}{p^2},
\]

respectively. Based on these facts, find the answer to E(Y) if s = 0.5. (4 points, H)


3. (22 points) The following figures give the income distributions (in $1000) of two populations:

[Figure: densities f(x) of Population 1 and Population 2, plotted against income x from 0 to 18.]

(a) Based on the figures, which population has (i) a higher variance and (ii) a higher chance of finding an extremely wealthy individual? State your reason in both. (2 points)

(b) Suppose n observations from each of the two populations are drawn to form estimators β1, β2, respectively, to estimate the means of Population 1 and Population 2. When n is large, the distributions of β1, β2 look similar to those in the following figure. Assuming the estimators are unbiased, identify the distribution that corresponds to β1 (and β2) and justify your answer briefly. (3 points, H)


(c) Name the distributions in (b) and write down the value(s) of the centers of the distributions. (2 points)

(d) Suppose income (X) from different populations can be assumed to follow a distribution with PDF:

\[
f(x) = \begin{cases} \dfrac{x}{\theta^2}\, e^{-x/\theta} & \text{if } 0 < x < \infty,\ \theta > 0, \\ 0 & \text{otherwise,} \end{cases}
\]

where under the distribution, the mean and the variance are:

\[
E(X) = 2\theta; \qquad \operatorname{var}(X) = 2\theta^2.
\]

State the parameter(s) of the distribution and name one use of the parameter(s). (2 points)

(e) Suppose n = 50 independent observations of X are drawn from a population, Population 3. Let the first three observations be X1, X2, X3 = 6.031, 1.144, 2.261. Furthermore, suppose the following summary statistics are recorded:

\[
\sum_{i=1}^{50} X_i = 208.27; \qquad \sum_{i=1}^{50} \log(X_i) = 59.8972.
\]

Write down the likelihood of observing X1 to X3 and hence generalize the method to write down the likelihood and the log-likelihood. You may use the facts that for b > 0, \(\log(b^a) = a \log(b)\) and \(\log(e^c) = c\). (3 points)


(f) Fill in the table below of the log-likelihood (2 points):

| θ    | 1.6     | 1.7     | 1.8     | 1.9     | 2.0     | 2.1 | 2.2 | 2.3     |
|------|---------|---------|---------|---------|---------|-----|-----|---------|
| ℓ(θ) | -117.27 | -115.67 | -114.58 | -113.90 | -113.55 |     |     | -113.94 |

(g) The log-likelihood function of θ is given below. Use it to approximate the maximum likelihood estimate (MLE, \(\hat\theta\)), to the nearest 0.1, for θ. State the reason for your choice of the MLE. (2 points)

[Figure: log-likelihood ℓ(θ) plotted against θ from 1.6 to 2.3; vertical axis from −118 to −113.]


(h) Derive the MLE analytically and compare it to your answer in (g). (2 points)

(i) It is known that the variance of the MLE, \(\hat\theta\), is \(\theta^2/(2n)\). Use the CLT to find a 95% confidence interval for the mean income E(X) = 2θ. Is there enough evidence to say the mean income of this population is above $4000? (4 points)


4. (19 points) Let the wealth of individuals (X, in $100000) in a population follow a distribution with PDF:

\[
f(x) = \begin{cases} (\lambda - 1)\, x^{-\lambda} & \text{if } 1 < x < \infty, \\ 0 & \text{otherwise,} \end{cases}
\]

where 2 < λ < ∞. The PDFs for three different values of λ are shown in the following plot:

[Figure: "Wealth distribution". PDF plotted against X (wealth) from 1 to 5, vertical axis from 0.0 to 1.0, with one curve each for λ = 3, λ = 4 and λ = 5.]

(a) Let M3, M4, M5 and V3, V4, V5 represent, respectively, the mean and variance in income using PDFs under λ = 3, 4, 5. By only observing the plot, rank the means and the variances of the three PDFs. Justify your choice briefly. Use > to symbolize “greater than”. (3 points)

Mean:

Variance:

Justification:


(b) Suppose the wealth (X) of one individual from the population is observed and it is 1.962 (in $100000). Based on the observation, test the hypotheses:

H0 : λ ≥ 4.5 vs. H1 : λ < 4.5

at the 5% significance level. (3 points, H)

(c) It is known that for λ = 4.5, E(X) = 1.4, SD(X) = 0.611. Based on this information, discuss whether your answer to (b) is reasonable. (2 points, H)


(d) In addition to the observation in (b), suppose the wealth (X) of another 99 individuals is observed, so there are in total n = 100 observations. Let the first three observations be X1, X2, X3 = 1.962, 1.219, 1.480 and suppose all observations are independent. Furthermore, the following summary statistics are recorded:

\[
\frac{1}{100}\sum_{i=1}^{100} X_i = 1.43884, \qquad \sum_{i=1}^{100} \log(X_i) = 30.41987.
\]

Based on the data, find the MLE of λ. (2 points)

(e) Find the MOM of λ. (2 points)


(f) Find the Cramer-Rao lower bound for λ. Compare the MOM to the MLE and decide whether the MOM is an asymptotically efficient estimator in this case. (2 points)

(g) Based on the same data in (d), repeat the test in (b). (2 points)


(h) Suppose the hypotheses are rewritten as

H0 : λ = 4.5 vs. H1 : λ = 4.5 + δ, δ ≠ 0.

Find the power of your test for δ = 0.5, 1, 1.5 and comment on your findings using a few sentences. You should estimate \(SD(\hat\lambda)\) by

\[
\sqrt{\frac{1}{n\, I(\lambda = 4.5 + \delta)}}.
\]

(3 points, H)


5. (17 points) For events that occur independently at a fixed rate of λ > 0 per unit time, the Poisson distribution can be used to model the number of events (X) in one unit of time:

\[
P(X = k) = \frac{\lambda^k e^{-\lambda}}{k!}, \qquad k = 0, 1, 2, \ldots
\]

The mean and the variance of the distribution are:

E(X) = var(X) = λ.

(a) Suppose we have two samples of independent observations: \(V_1, \ldots, V_m\) from Poisson(λ1) and \(W_1, \ldots, W_n\) from Poisson(λ2), where m = 50 and n = 80. Suppose further that

\[
\sum_{i=1}^{m} V_i = 47; \qquad \sum_{i=1}^{n} W_i = 156.
\]

From the given data, find the MLEs of λ1, λ2 based on the two samples. (3 points)


(b) Based on the data in (a), test the hypotheses:

H0 : λ1 = 1.2 vs. H1 : λ1 ≠ 1.2

at the 5% significance level. (3 points)

(c) Suppose we assume λ2 = θλ1 for a fixed and known constant θ > 0. Find the MLE of λ1 and hence the MLE of λ2. (2 points)


(d) Let the MLE of λ1 in (a) and (c) be \(\lambda_1^{*}\) and \(\lambda_1^{**}\), respectively. Compare the biases and the variances of \(\lambda_1^{*}\) and \(\lambda_1^{**}\). (4 points, H)


(e) Using your answer in (c) and assuming θ = 2, repeat the test in (b), i.e.,

H0 : λ1 = 1.2 vs. H1 : λ1 ≠ 1.2

at the 5% significance level. (3 points, H)

(f) Comparing the tests in (b) and (e), which one do you prefer? Discuss briefly the roles of n and θ in your comparison. (2 points, H)


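6. (14 points) A bank wants to predict the profit for Year 2005 using data from 17 years, 1988-2004, given in the following tables.

Raw data for years 1988-2004

| Year | GDP (million $) | Base rate (%) | Customer base (Number) | Unemployment (%) | Bank profit (million $) |
|------|-----------------|---------------|------------------------|------------------|-------------------------|
| 1988 | 911890 | 7.25  | 4887000 | 6.5 | 1300  |
| 1989 | 814309 | 7.75  | 4873000 | 7.8 | 2600  |
| 1990 | 865143 | 7.50  | 4812000 | 9.3 | 4000  |
| 1991 | 784699 | 11.00 | 4936000 | 8.9 | -5700 |
| 1992 | 826172 | 7.50  | 4894000 | 8.0 | -1700 |
| 1993 | 793805 | 7.75  | 4945000 | 7.7 | -4000 |
| 1994 | 778054 | 7.25  | 4893000 | 6.7 | 1300  |
| 1995 | 941104 | 6.50  | 4972000 | 4.6 | 6500  |
| 1996 | 874517 | 6.75  | 4900000 | 6.4 | 700   |
| 1997 | 882522 | 7.75  | 4881000 | 6.2 | 3600  |
| 1998 | 939819 | 6.25  | 4886000 | 6.1 | 2500  |
| 1999 | 935818 | 6.00  | 4964000 | 5.2 | 5000  |
| 2000 | 921937 | 6.00  | 4974000 | 5.6 | 4700  |
| 2001 | 934309 | 5.75  | 4963000 | 3.8 | 7600  |
| 2002 | 901189 | 5.75  | 4973000 | 5.5 | 5300  |
| 2003 | 903461 | 6.25  | 4893000 | 5.6 | 4300  |
| 2004 | 898132 | 6.75  | 4969000 | 7.0 | 4200  |
| 2005 | 900000 | 8.00  | 4900000 | 7.0 | –     |

Selected summary statistics for years 1988-2004

|   | Variable      | \(\sum_{i=1}^n x_i\)     | \(\sum_{i=1}^n x_i^2\)     | \(\sum_{i=1}^n x_i y_i\)   | \(\sum_{i=1}^n y_i\) | \(\sum_{i=1}^n y_i^2\) |
|---|---------------|--------------------------|----------------------------|----------------------------|----------------------|------------------------|
| Y | Bank profit   |                          |                            |                            | 42200                | 308540000              |
| X | GDP           | \(1.490688\times10^{7}\) | \(1.312316\times10^{13}\)  | \(3.952445\times10^{10}\)  |                      |                        |
|   | Base rate     | 119.75                   | 868.4375                   | 240650                     |                      |                        |
|   | Customer base | \(8.3615\times10^{7}\)   | \(4.112985\times10^{14}\)  | \(2.081146\times10^{11}\)  |                      |                        |
|   | Unemployment  | 110.90                   | 758.19                     | 215290                     |                      |                        |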


[Figure: four scatterplots of Bank profit (Y) against each variable (X). GDP: corr(X,Y) = 0.777. Base rate: corr(X,Y) = −0.795. Customer base: corr(X,Y) = 0.205. Unemployment: corr(X,Y) = −0.713.]

(a) Suppose the relationships between the different variables (X) and Bank profit (Y) are summarized in the given figures. Based on the figures, choose one variable that you can use to predict Bank profit in 2005. Justify your choice statistically. (2 points)


(b) Using your choice in (a), find a simple linear regression model for predicting Bank profit in 2005. You do not need to carry out a test here. Show all your work. (3 points)


(c) The test statistics for simple linear regression models between Bank profit (Y) and each of the four variables are given in the following table. Use the values in this table and the accompanying t-table to test your model in (b) and draw a conclusion. (2 points)

| Variable      | Test statistic |
|---------------|----------------|
| GDP           | 4.773          |
| Base rate     | -5.070         |
| Customer base | 0.812          |
| Unemployment  | -3.941         |

t-table

| df             | 14    | 15    | 16    | 17    | 18    | 19    | 120  | >120 |
|----------------|-------|-------|-------|-------|-------|-------|------|------|
| critical value | 2.145 | 2.131 | 2.120 | 2.110 | 2.101 | 2.093 | 1.98 | 1.96 |

(d) Based on the model you created, predict the Bank profit in 2005 and determine the “goodness-of-fit” of your model. (2 points)


(e) The following figures give the 95% confidence and prediction bands based on the simple linear regression models using maximum likelihood, between each of the four variables and Y. Choose the figure that corresponds to the model you used in (a)-(d) and (i) identify the prediction bands and (ii) mark the 95% prediction interval for the Bank profit in 2005 using your model. Explain briefly how you arrived at your answer. (3 points)

[Figure: four panels of Bank profit against X1, X2, X3 and X4 with 95% confidence and prediction bands. X1: corr(X,Y) = 0.777. X2: corr(X,Y) = −0.795. X3: corr(X,Y) = 0.204. X4: corr(X,Y) = −0.713.]

(f) Determine whether the Bank can be 95% certain that it will not lose more than $5000 million in 2005. Justify your answer using the result you obtained in (e). The dotted grey line is drawn for ease of referencing. (2 points)


— END —


ANSWERS:

(1a) 5c + 3c + 2c = 1 ⇒ 10c = 1 ⇒ c = 0.1.

(b)

\[
\begin{aligned}
E(X) &= 0.5(1) + 0.3(2) + 0.2(3) = 1.7; \\
E(X^2) &= 0.5(1) + 0.3(4) + 0.2(9) = 3.5; \\
\operatorname{var}(X) &= E(X^2) - E(X)^2 = 3.5 - 1.7^2 = 0.61.
\end{aligned}
\]

E(X) can be interpreted as the average number of donuts customers buy. However, since not all customers buy the same number, var(X) measures the differences in the number of donuts bought by different customers.
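The arithmetic in (1a) and (1b) can be checked with a short sketch. It assumes nothing beyond the probabilities 5c, 3c, 2c from the question; the variable names are illustrative only.

```python
from fractions import Fraction

# Relative weights from the question: P(X = 1, 2, 3) = 5c, 3c, 2c.
weights = {1: 5, 2: 3, 3: 2}

# The probabilities must sum to 1, which pins down c.
c = Fraction(1, sum(weights.values()))
pmf = {x: w * c for x, w in weights.items()}

# First two moments, then var(X) = E(X^2) - E(X)^2.
mean = sum(x * p for x, p in pmf.items())
second_moment = sum(x**2 * p for x, p in pmf.items())
variance = second_moment - mean**2

print(f"c = {c}, E(X) = {float(mean)}, var(X) = {float(variance)}")
```

Exact fractions are used so that the results can be compared with 0.1, 1.7 and 0.61 without rounding noise.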

(c) If Y ∼ Poisson(λ), then it is known that E(Y) = var(Y) = λ = 10. Since X, Y are independent,

E(XY) = E(X)E(Y) = 1.7 × 10 = 17.

(d)

\[
\begin{aligned}
P(\text{1 and 2}) &= P(W = 1, V = 2) + P(W = 2, V = 1) \\
&= 0.5(0.3) + 0.3(0.5) \\
&= 0.3.
\end{aligned}
\]

(e) If Professor 1 is the one who bought 2 donuts, then S ∼ Uniform(0, 6) and T ∼ Uniform(0, 3). On the other hand, if Professor 2 is the one who bought 2 donuts, then S ∼ Uniform(0, 3) and T ∼ Uniform(0, 6).

Let A be the event that Professor 1 bought 2 donuts, and let \(A^c\) be its complement.

\[
\begin{aligned}
P(\text{Finished in 4 minutes}) &= P(S < 4, T < 4 \mid A)P(A) + P(S < 4, T < 4 \mid A^c)P(A^c) \\
&= P(S < 4 \mid A)P(T < 4 \mid A)P(A) + P(S < 4 \mid A^c)P(T < 4 \mid A^c)P(A^c) \\
&= \left(\frac{4}{6}\right) \times 1 \times \frac{1}{2} + 1 \times \left(\frac{4}{6}\right) \times \frac{1}{2} \\
&= \frac{2}{3}.
\end{aligned}
\]

(f) If Professor 1 is the one who bought 2 donuts, then S ∼ Uniform(0, 4) and T ∼ Uniform(0, 3). On the other hand, if Professor 2 is the one who bought 2 donuts, then S ∼ Uniform(0, 2) and T ∼ Uniform(0, 6).

Let A be the event that Professor 1 bought 2 donuts, and let \(A^c\) be its complement.

\[
\begin{aligned}
P(\text{Finished in 4 minutes}) &= P(S < 4, T < 4 \mid A)P(A) + P(S < 4, T < 4 \mid A^c)P(A^c) \\
&= P(S < 4 \mid A)P(T < 4 \mid A)P(A) + P(S < 4 \mid A^c)P(T < 4 \mid A^c)P(A^c) \\
&= 1 \times 1 \times \frac{1}{2} + 1 \times \left(\frac{4}{6}\right) \times \frac{1}{2} \\
&= \frac{5}{6},
\end{aligned}
\]

which is different from the probability of 2/3 found in (e).

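(g)

\[
\begin{aligned}
P(A \mid \text{Finished in 4 minutes}) &= \frac{P(A, \text{Finished in 4 minutes})}{P(\text{Finished in 4 minutes})} \\
&= \frac{P(\text{Finished in 4 minutes} \mid A)P(A)}{P(\text{Finished in 4 minutes})} \\
&= \frac{1/2}{5/6} = \frac{3}{5},
\end{aligned}
\]

which is bigger than 1/2.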
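The three answers (1e), (1f) and (1g) follow the same pattern: condition on who bought 2 donuts, multiply two Uniform CDFs, then apply Bayes' rule. The sketch below reproduces them. The helper names `uniform_cdf` and `finish_prob`, and the "minutes per donut" arguments, are illustrative assumptions and not part of the exam.

```python
from fractions import Fraction

def uniform_cdf(t, upper):
    """CDF of Uniform(0, upper) evaluated at t."""
    if t <= 0:
        return Fraction(0)
    return min(Fraction(t, upper), Fraction(1))

def finish_prob(per_donut_s, per_donut_t, limit=4):
    """Return P(both finish within limit) and P(A | both finish).

    A is the event that Professor 1 bought 2 donuts, with P(A) = 1/2.
    S ~ Uniform(0, per_donut_s * b) and T ~ Uniform(0, per_donut_t * b),
    where b is the number of donuts each professor bought.
    """
    half = Fraction(1, 2)
    given_a = uniform_cdf(limit, 2 * per_donut_s) * uniform_cdf(limit, 1 * per_donut_t)
    given_not_a = uniform_cdf(limit, 1 * per_donut_s) * uniform_cdf(limit, 2 * per_donut_t)
    total = half * given_a + half * given_not_a
    return total, half * given_a / total

prob_e, post_e = finish_prob(3, 3)  # part (e): both Uniform(0, 3b)
prob_f, post_f = finish_prob(2, 3)  # parts (f), (g): S ~ Uniform(0, 2b)
print(prob_e, prob_f, post_f)
```

Under the symmetric assumptions of (e), the posterior probability of A stays at 1/2; it moves away from 1/2 only once the two professors eat at different speeds.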

(2a) s is the parameter.

(b) The possible values of Y are 0, 1, 2,....

(c) P(Y = 3.5) = 0.

(d)

\[
P(Y \ge 2) = 1 - P(Y = 0) - P(Y = 1) = 1 - \frac{0.5(4 - 4^0) + 4^0}{5^{0+1}} - \frac{0.5(4 - 4^1) + 4^1}{5^{1+1}} = 1 - \frac{33}{50} = \frac{17}{50}.
\]


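(e)

\[
\begin{aligned}
E(Y) &= \sum_{k=0}^{\infty} k\,P(Y = k) \\
&= 0 \times P(Y = 0) + 1 \times P(Y = 1) + 2 \times P(Y = 2) + \ldots \\
&= 0 \times \frac{0.5(4 - 4^0) + 4^0}{5^{0+1}} + 1 \times \frac{0.5(4 - 4^1) + 4^1}{5^{1+1}} + 2 \times \frac{0.5(4 - 4^2) + 4^2}{5^{2+1}} + \ldots
\end{aligned}
\]

\[
\begin{aligned}
\operatorname{var}(Y) &= \sum_{k=0}^{\infty} k^2 P(Y = k) - E(Y)^2 \\
&= 0^2 \times P(Y = 0) + 1^2 \times P(Y = 1) + 2^2 \times P(Y = 2) + \ldots - E(Y)^2 \\
&= 0 \times \frac{0.5(4 - 4^0) + 4^0}{5^{0+1}} + 1 \times \frac{0.5(4 - 4^1) + 4^1}{5^{1+1}} + 4 \times \frac{0.5(4 - 4^2) + 4^2}{5^{2+1}} + \ldots - E(Y)^2
\end{aligned}
\]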

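(f) Let A be the event that an investor comes from group 1, and let \(A^c\) be its complement.

\[
\begin{aligned}
P(Y) &= P(Y, A) + P(Y, A^c) \\
&= P(Y \mid A)P(A) + P(Y \mid A^c)P(A^c) \\
&= P(Y \mid A)\,s + P(Y \mid A^c)(1 - s)
\end{aligned}
\]

\[
\begin{aligned}
E(Y) &= \sum_{k=0}^{\infty} k\,P(Y = k) \\
&= \sum_{k=0}^{\infty} \left[ k\,P(Y = k \mid A)\,s + k\,P(Y = k \mid A^c)(1 - s) \right] \\
&= s \underbrace{\sum_{k=1}^{\infty} k\,P(Y = k \mid A)}_{E(Y) \text{ for } Y \sim \text{Geometric}(1/5)} + (1 - s) \underbrace{\sum_{k=1}^{\infty} k\,P(Y = k \mid A^c)}_{E(Y) \text{ for } Y \sim \text{Geometric}(4/5)} \\
&= 0.5 \times \frac{1 - 1/5}{1/5} + 0.5 \times \frac{1 - 4/5}{4/5} \\
&= 2\tfrac{1}{8}.
\end{aligned}
\]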
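As a cross-check on (2d) and (2f), the sketch below evaluates the PDF from the question directly and compares a truncated version of the series in (2e) with the mixture formula. The truncation at 200 terms and the function names are assumptions made for illustration; the neglected tail is negligibly small at s = 0.5.

```python
from fractions import Fraction

def pmf(k, s):
    """P(Y = k) exactly as given in Question 2."""
    return (s * (4 - 4**k) + 4**k) / Fraction(5) ** (k + 1)

def geometric_mean(p):
    """Mean (1 - p) / p of a Geometric(p) count of failures."""
    return (1 - p) / p

s = Fraction(1, 2)

# (2d): P(Y >= 2) = 1 - P(Y = 0) - P(Y = 1).
prob_at_least_two = 1 - pmf(0, s) - pmf(1, s)

# (2f): mean of the two-component mixture.
mean_from_mixture = s * geometric_mean(Fraction(1, 5)) + (1 - s) * geometric_mean(Fraction(4, 5))

# (2e): the defining series, truncated after 200 terms.
mean_from_series = sum(k * pmf(k, s) for k in range(200))

print(prob_at_least_two, mean_from_mixture, float(mean_from_series))
```

The agreement between the two means illustrates why the mixture argument in (2f) spares the student from summing the series by hand.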

(3a) (i) Based on the PDFs, it is clear that Population 2 has a bigger spread, and hence a bigger variance. (ii) It is more likely to find a very wealthy individual from Population 2 because the tail of the distribution is thicker and hence the probability of a large value is higher.

(b) (i) Since Population 2 has a higher variance, var(β2) is higher than var(β1) when the same sample size n is used to construct both estimators.

(ii) The mean of Population 2 also seems higher, and for unbiased estimators, the distribution of the estimator centers around the true parameter; hence the distribution of β2 should have a higher mean. Consequently, the distribution that has the higher mean and is wider corresponds to β2.

[Figure: the two sampling distributions, labelled β1 and β2.]

(c) The distributions are normal according to the CLT, and the centers of the distributions are the true but unknown means of the two income distributions.

(d) The parameter is θ. One use of the parameter is that it summarizes the information about the distribution in a single value. So given θ, we know E(X), var(X), etc.

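(e)

There is a likelihood of \(\frac{6.031}{\theta^2} e^{-6.031/\theta}\) that the 1st observation is 6.031

There is a likelihood of \(\frac{1.144}{\theta^2} e^{-1.144/\theta}\) that the 2nd observation is 1.144

There is a likelihood of \(\frac{2.261}{\theta^2} e^{-2.261/\theta}\) that the 3rd observation is 2.261

...

Therefore, the “likelihood” of observing the data is:

\[
L(\theta) = \frac{6.031}{\theta^2} e^{-6.031/\theta} \times \frac{1.144}{\theta^2} e^{-1.144/\theta} \times \frac{2.261}{\theta^2} e^{-2.261/\theta} \times \ldots
\]

which can be written as

\[
\prod_{i=1}^{n} \frac{X_i}{\theta^2} e^{-X_i/\theta}.
\]

The log-likelihood is

\[
\ell(\theta) = \log \prod_{i=1}^{n} \frac{X_i}{\theta^2} e^{-X_i/\theta} = \sum_{i=1}^{n} \log(X_i) - 2n \log\theta - \frac{1}{\theta} \sum_{i=1}^{n} X_i = 59.8972 - 2 \times 50 \log\theta - \frac{208.27}{\theta}.
\]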

(f)

| θ    | 1.6     | 1.7     | 1.8     | 1.9     | 2.0     | 2.1     | 2.2     | 2.3     |
|------|---------|---------|---------|---------|---------|---------|---------|---------|
| ℓ(θ) | -117.27 | -115.67 | -114.58 | -113.90 | -113.55 | -113.47 | -113.61 | -113.94 |

(g) The log-likelihood function corresponding to the table is given below. The approximate maximum likelihood estimate (\(\hat\theta\)), to the nearest 0.1, for θ is about 2.1, because this is where the log-likelihood reaches its highest value.

[Figure: log-likelihood ℓ(θ) plotted against θ from 1.6 to 2.3; vertical axis from −118 to −113.]

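(h) The log-likelihood is

\[
\ell(\theta) = \log \prod_{i=1}^{n} \frac{X_i}{\theta^2} e^{-X_i/\theta} = \sum_{i=1}^{n} \log(X_i) - 2n \log\theta - \frac{1}{\theta} \sum_{i=1}^{n} X_i.
\]

Differentiating with respect to θ gives

\[
\frac{d\ell(\theta)}{d\theta} = -\frac{2n}{\theta} + \frac{1}{\theta^2} \sum_{i=1}^{n} X_i.
\]

The MLE is the solution to \(\frac{d\ell(\theta)}{d\theta} = 0\); multiplying through by \(\theta^2\) gives

\[
-2n\theta + \sum_{i=1}^{n} X_i = 0
\quad\Rightarrow\quad
\hat\theta = \frac{\sum_{i=1}^{n} X_i}{2n} = \frac{208.27}{100} = 2.0827.
\]

Hence, using the formula gives a similar result to using the graphical method, \(\hat\theta = 2.1\).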
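Parts (3f) to (3h) can be reproduced numerically from the two summary statistics. The sketch below is only an illustration: it evaluates ℓ(θ) on the grid used in the table, picks the grid maximiser, and compares it with the closed form. The values it computes agree with the tabulated ones to within 0.01.

```python
import math

n = 50
sum_x = 208.27        # sum of the X_i, from the question
sum_log_x = 59.8972   # sum of log(X_i), from the question

def log_likelihood(theta):
    """l(theta) = sum(log X_i) - 2 n log(theta) - sum(X_i) / theta."""
    return sum_log_x - 2 * n * math.log(theta) - sum_x / theta

# Grid of theta values used in the table for part (f).
grid = [round(1.6 + 0.1 * i, 1) for i in range(8)]
table = {theta: log_likelihood(theta) for theta in grid}

# Graphical answer (g): the grid point with the largest log-likelihood.
theta_grid = max(table, key=table.get)

# Analytical answer (h): theta_hat = sum(X_i) / (2 n).
theta_hat = sum_x / (2 * n)

for theta, value in table.items():
    print(f"theta = {theta:.1f}  l(theta) = {value:.4f}")
print("grid maximiser:", theta_grid, " closed form:", theta_hat)
```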

(i) Based on the approximation in (b), the MLE of E(X) = 2θ is simply, by the invariance property, \(2\hat\theta\). Moreover, the variance of the MLE of E(X) is

\[
\operatorname{var}(2\hat\theta) = \underbrace{2^2 \operatorname{var}(\hat\theta)}_{\text{Property of variance}} = 4 \left( \frac{\theta^2}{2n} \right) = \frac{2\theta^2}{n}.
\]

Therefore, a 95% confidence interval for E(X) is

\[
2\hat\theta \pm 1.96 \sqrt{\frac{2\theta^2}{n}} \approx 2\hat\theta \pm 1.96 \sqrt{\frac{2\hat\theta^2}{n}} = 2 \times 2.0827 \pm 1.96 \sqrt{\frac{2(2.0827^2)}{50}} \approx 4.1654 \pm 0.8164.
\]

Since the lower limit is 4.1654 − 0.8164 = 3.349 < 4, we cannot be 95% certain that the average income is higher than $4000.
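The interval in (3i) is a plug-in calculation, so it is easy to verify. The sketch below assumes only the quantities already used above (n = 50, the sum of the observations, and the 1.96 normal quantile); variable names are illustrative.

```python
import math

n = 50
theta_hat = 208.27 / (2 * n)   # MLE of theta from part (h)

# MLE of the mean income E(X) = 2 * theta, by invariance.
mean_hat = 2 * theta_hat

# var(2 * theta_hat) = 4 * theta^2 / (2 n) = 2 * theta^2 / n, with theta_hat plugged in.
standard_error = math.sqrt(2 * theta_hat**2 / n)

z = 1.96  # two-sided 95% normal quantile
half_width = z * standard_error
lower, upper = mean_hat - half_width, mean_hat + half_width

print(f"{mean_hat:.4f} +/- {half_width:.4f} -> ({lower:.3f}, {upper:.3f})")
```

Because 4 (that is, $4000) lies inside the interval, the data do not rule out a mean income of $4000 or less.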

(4a)

Mean: M3 > M4 > M5

Variance: V3 > V4 > V5

Justification: It can be observed that the spread of the distribution is inversely proportional to the value of λ, so that for λ = 3, the spread is the highest. For λ = 3, there is a higher probability of a larger value of X (wealth), giving the distribution a higher mean and variance.

(b) To test the hypotheses, we need to determine the probability of seeing an observation at least as extreme as the observed value of 1.962. Therefore, we need to evaluate:

P(X > 1.962) = 1 − F(1.962),

assuming H0 : λ = 4.5 is true. To find the CDF, we need

\[
F(x) = \int_{1}^{x} (\lambda - 1)\, r^{-\lambda}\, dr = \left[ -r^{-\lambda+1} \right]_{1}^{x} = 1 - x^{-\lambda+1}.
\]

Hence

\[
1 - F(1.962) = 1 - (1 - 1.962^{-\lambda+1}) = 1.962^{-\lambda+1} = 1.962^{-3.5} = 0.09452648,
\]

which is the p-value of the test. Since the p-value is larger than 0.05, there is not sufficient evidence to conclude that λ < 4.5.
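The p-value in (4b) is a single evaluation of the survival function \(1 - F(x) = x^{-(\lambda - 1)}\). A minimal sketch, with an illustrative helper name `survival`:

```python
def survival(x, lam):
    """P(X > x) = x ** (-(lam - 1)) for x > 1, from the CDF derived above."""
    if x <= 1:
        return 1.0
    return x ** (-(lam - 1))

# One observation, evaluated at the boundary value lambda = 4.5 of H0.
p_value = survival(1.962, 4.5)
reject_h0 = p_value < 0.05

print(f"p-value = {p_value:.6f}, reject H0: {reject_h0}")
```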

(c) We note that E(X) = 1.4 when λ = 4.5 and 1.962 is quite a bit larger than the expected value. However, since SD(X) = 0.611 is quite large, we can expect that an observation can deviate from E(X) by a rather large amount; so 1.962 is not such an unusual observation and therefore does not constitute strong evidence against H0 : λ = 4.5.

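(d)

There is a likelihood of \((\lambda - 1)\,1.962^{-\lambda}\) that the 1st observation is 1.962

There is a likelihood of \((\lambda - 1)\,1.219^{-\lambda}\) that the 2nd observation is 1.219

There is a likelihood of \((\lambda - 1)\,1.480^{-\lambda}\) that the 3rd observation is 1.480

...

Therefore, the “likelihood” of observing the data is:

\[
L(\lambda) = (\lambda - 1)\,1.962^{-\lambda} \times (\lambda - 1)\,1.219^{-\lambda} \times (\lambda - 1)\,1.480^{-\lambda} \times \ldots
\]

which can be written as

\[
\prod_{i=1}^{n} (\lambda - 1)\, X_i^{-\lambda}.
\]

The log-likelihood is

\[
\ell(\lambda) = \log \prod_{i=1}^{n} (\lambda - 1)\, X_i^{-\lambda} = n \log(\lambda - 1) - \lambda \sum_{i=1}^{n} \log X_i = 100 \log(\lambda - 1) - \lambda \times 30.41987.
\]

Analytically, the maximum can be derived by differentiating \(\ell(\lambda)\) with respect to λ, which gives

\[
\frac{d\ell(\lambda)}{d\lambda} = \frac{n}{\lambda - 1} - \sum_{i=1}^{n} \log X_i.
\]

The MLE is the solution to \(\frac{d\ell(\lambda)}{d\lambda} = 0\), therefore,

\[
\frac{n}{\hat\lambda - 1} - \sum_{i=1}^{n} \log X_i = 0
\quad\Rightarrow\quad
\hat\lambda = 1 + \frac{n}{\sum_{i=1}^{n} \log X_i} = 1 + \frac{100}{30.41987} = 4.287325.
\]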


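(e) The population mean can be derived as

\[
\begin{aligned}
E(X) &= \int_{1}^{\infty} x (\lambda - 1)\, x^{-\lambda}\, dx \\
&= \int_{1}^{\infty} (\lambda - 1)\, x^{-\lambda+1}\, dx \\
&= \left[ \frac{\lambda - 1}{-\lambda + 2}\, x^{-\lambda+2} \right]_{1}^{\infty} \\
&= \frac{\lambda - 1}{\lambda - 2}.
\end{aligned}
\]

The MOM estimate is obtained by equating the population moment to the sample moment:

\[
\begin{aligned}
\bar{X} &= \frac{\lambda - 1}{\lambda - 2} \\
1 - 2\bar{X} &= \lambda - \bar{X}\lambda \\
\lambda &= \frac{1 - 2\bar{X}}{1 - \bar{X}} = \frac{1 - 2(1.43884)}{1 - 1.43884} = 4.278735
\end{aligned}
\]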
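Both estimators in (4d) and (4e) are closed-form functions of the summary statistics, so they can be compared side by side. The sketch below uses only the numbers given in the question; the variable names are illustrative.

```python
n = 100
sample_mean = 1.43884   # (1/100) * sum of X_i, from the question
sum_log_x = 30.41987    # sum of log(X_i), from the question

# MLE from part (d): lambda_hat = 1 + n / sum(log X_i).
lambda_mle = 1 + n / sum_log_x

# MOM from part (e): solve mean = (lambda - 1) / (lambda - 2) for lambda.
lambda_mom = (2 * sample_mean - 1) / (sample_mean - 1)

print(f"MLE = {lambda_mle:.6f}, MOM = {lambda_mom:.6f}")
```

The MOM expression is written as (2x̄ − 1)/(x̄ − 1), which is the same ratio as (1 − 2x̄)/(1 − x̄) with both signs flipped.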

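(f) \(\log f(X \mid \lambda) = \log(\lambda - 1) - \lambda \log X\).

The score function is

\[
S(\lambda) = \frac{d}{d\lambda} \log f(X \mid \lambda) = \frac{1}{\lambda - 1} - \log X
\]

and the Fisher information of λ is

\[
I(\lambda) = -E\left[ \frac{d}{d\lambda} S(\lambda) \right] = -E\left( -\frac{1}{(\lambda - 1)^2} \right) = \frac{1}{(\lambda - 1)^2}.
\]

The Cramer-Rao lower bound is

\[
\frac{1}{n I(\lambda)} = \frac{1}{n \frac{1}{(\lambda - 1)^2}} = \frac{(\lambda - 1)^2}{n}.
\]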

(g) From (d), we obtain the MLE of λ as \(\hat\lambda = 4.287325\). If H0 : λ = 4.5 is true, then the observed value of \(\hat\lambda\) can be expressed as a Z-score, which is

\[
Z^{*} = \frac{\hat\lambda - 4.5}{SD(\hat\lambda)} \approx \frac{\hat\lambda - 4.5}{\sqrt{\dfrac{(\lambda - 1)^2}{n}}} = \frac{4.287325 - 4.5}{\sqrt{\dfrac{(4.5 - 1)^2}{100}}} = -0.6076.
\]

Since this is a 1-sided test, the critical value is 1.64. Furthermore, \(|Z^{*}| = |-0.6076| < 1.64\), hence there is not enough evidence to say λ < 4.5; we do not reject H0.

(h) The power for a 1-sided 5% significance test is

\[
b(\delta) = P\left( Z > 1.64 - \frac{\delta}{SD(\hat\lambda)} \right).
\]

In order to find b(δ), we need to evaluate \(SD(\hat\lambda)\), which can be estimated by

\[
\sqrt{\frac{1}{n I(\lambda = 4.5 + \delta)}},
\]

since power is evaluated under H1 : λ = 4.5 + δ. For values of δ = 0.5, 1, 1.5, the power is:

| δ   | \(1.64 - \delta / SD(\hat\lambda)\) | b(δ)  |
|-----|-------------------------------------|-------|
| 0.5 | 0.39                                | 0.348 |
| 1   | -0.582                              | 0.719 |
| 1.5 | -1.36                               | 0.913 |

Looking at the last column, it can be observed that the power increases with the value of δ. If δ = 0.5, then the power is weak. On the other hand, for δ = 1, the power is about 0.72. When δ = 1.5, there is more than a 90% chance that the test can correctly identify H1. Therefore, the test can be safely applied to situations where we want to detect alternatives such that δ ≥ 1.5.
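The power table in (4h) can be regenerated with the standard normal CDF. The sketch below follows the formula stated in the answer, with \(SD(\hat\lambda)\) estimated as \((\lambda - 1)/\sqrt{n}\) at λ = 4.5 + δ. The function names are illustrative, and `statistics.NormalDist` from the Python standard library supplies the normal CDF.

```python
import math
from statistics import NormalDist

n = 100
critical = 1.64  # 1-sided 5% critical value used in the answer

def sd_lambda_hat(lam):
    """sqrt(1 / (n * I(lam))) with I(lam) = 1 / (lam - 1)^2."""
    return (lam - 1) / math.sqrt(n)

def power(delta, lam0=4.5):
    """b(delta) = P(Z > critical - delta / SD), SD evaluated at lam0 + delta."""
    sd = sd_lambda_hat(lam0 + delta)
    return 1 - NormalDist().cdf(critical - delta / sd)

powers = {delta: power(delta) for delta in (0.5, 1.0, 1.5)}
for delta, value in powers.items():
    print(f"delta = {delta}: power = {value:.3f}")
```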

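(5a) Since the observations are independent, for the first sample

There is a likelihood of \(\frac{\lambda_1^{V_1}}{V_1!} e^{-\lambda_1}\) of observing \(V_1\)

There is a likelihood of \(\frac{\lambda_1^{V_2}}{V_2!} e^{-\lambda_1}\) of observing \(V_2\)

There is a likelihood of \(\frac{\lambda_1^{V_3}}{V_3!} e^{-\lambda_1}\) of observing \(V_3\)

...

Similarly, for the second sample,

There is a likelihood of \(\frac{\lambda_2^{W_1}}{W_1!} e^{-\lambda_2}\) of observing \(W_1\)

There is a likelihood of \(\frac{\lambda_2^{W_2}}{W_2!} e^{-\lambda_2}\) of observing \(W_2\)

There is a likelihood of \(\frac{\lambda_2^{W_3}}{W_3!} e^{-\lambda_2}\) of observing \(W_3\)

...

Therefore, the “likelihood” of observing the entire dataset of two samples is:

\[
\begin{aligned}
L(\lambda_1, \lambda_2) &= \frac{\lambda_1^{V_1}}{V_1!} e^{-\lambda_1} \times \frac{\lambda_1^{V_2}}{V_2!} e^{-\lambda_1} \times \ldots \times \frac{\lambda_1^{V_m}}{V_m!} e^{-\lambda_1} \times \frac{\lambda_2^{W_1}}{W_1!} e^{-\lambda_2} \times \frac{\lambda_2^{W_2}}{W_2!} e^{-\lambda_2} \times \ldots \times \frac{\lambda_2^{W_n}}{W_n!} e^{-\lambda_2} \\
&= \prod_{i=1}^{m} \frac{\lambda_1^{V_i}}{V_i!} e^{-\lambda_1} \prod_{j=1}^{n} \frac{\lambda_2^{W_j}}{W_j!} e^{-\lambda_2}
\end{aligned}
\]

\[
\ell(\lambda_1, \lambda_2) = \log(\lambda_1) \sum_{i=1}^{m} V_i - m\lambda_1 - \sum_{i=1}^{m} \log(V_i!) + \log(\lambda_2) \sum_{j=1}^{n} W_j - n\lambda_2 - \sum_{j=1}^{n} \log(W_j!)
\]

Let \(\lambda_1^{*}, \lambda_2^{*}\) be the MLEs of λ1, λ2; then \(\lambda_1^{*}, \lambda_2^{*}\) are determined as follows:

\[
\frac{\partial}{\partial\lambda_1} \ell(\lambda_1^{*}, \lambda_2^{*}) = \frac{1}{\lambda_1^{*}} \sum_{i=1}^{m} V_i - m = 0 \;\Rightarrow\; \lambda_1^{*} = \bar{V} = \frac{47}{50} = 0.94
\]

\[
\frac{\partial}{\partial\lambda_2} \ell(\lambda_1^{*}, \lambda_2^{*}) = \frac{1}{\lambda_2^{*}} \sum_{j=1}^{n} W_j - n = 0 \;\Rightarrow\; \lambda_2^{*} = \bar{W} = \frac{156}{80} = 1.95
\]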

(b) To test the hypotheses, we notice that if H0 : λ1 = 1.2 is true, then the observed value of \(\lambda_1^{*}\) can be expressed as a Z-score, which is

\[
Z^{*} = \frac{\lambda_1^{*} - 1.2}{SD(\lambda_1^{*})} = \frac{0.94 - 1.2}{\sqrt{\dfrac{1.2}{50}}} = -1.678293
\]

Since this is a 2-sided test, the critical value is 1.96. Furthermore, \(|Z^{*}| = |-1.678293| < 1.96\), hence there is not enough evidence to say λ1 ≠ 1.2; we do not reject H0.

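(c) Under the assumption that λ2 = θλ1, for a fixed and known value θ > 0, the likelihood of observing the two samples becomes:

\[
\begin{aligned}
&\frac{\lambda_1^{V_1}}{V_1!} e^{-\lambda_1} \times \frac{\lambda_1^{V_2}}{V_2!} e^{-\lambda_1} \times \ldots \times \frac{\lambda_1^{V_m}}{V_m!} e^{-\lambda_1} \times \frac{(\theta\lambda_1)^{W_1}}{W_1!} e^{-\theta\lambda_1} \times \frac{(\theta\lambda_1)^{W_2}}{W_2!} e^{-\theta\lambda_1} \times \ldots \times \frac{(\theta\lambda_1)^{W_n}}{W_n!} e^{-\theta\lambda_1} \\
&\quad = \prod_{i=1}^{m} \frac{\lambda_1^{V_i}}{V_i!} e^{-\lambda_1} \prod_{j=1}^{n} \frac{(\theta\lambda_1)^{W_j}}{W_j!} e^{-\theta\lambda_1}
\end{aligned}
\]

\[
\ell(\lambda_1, \lambda_2) = \log(\lambda_1) \sum_{i=1}^{m} V_i - m\lambda_1 - \sum_{i=1}^{m} \log(V_i!) + \log(\theta\lambda_1) \sum_{j=1}^{n} W_j - n\theta\lambda_1 - \sum_{j=1}^{n} \log(W_j!)
\]

which can be written as \(\ell(\lambda_1)\), since it is a function of λ1 for a fixed θ. Let \(\lambda_1^{**}\) be the MLE of λ1; then \(\lambda_1^{**}\) is determined as follows:

\[
\frac{d}{d\lambda_1} \ell(\lambda_1^{**}) = \frac{1}{\lambda_1^{**}} \sum_{i=1}^{m} V_i - m + \frac{1}{\lambda_1^{**}} \sum_{j=1}^{n} W_j - n\theta = 0 \;\Rightarrow\; \lambda_1^{**} = \frac{\sum_{i=1}^{m} V_i + \sum_{j=1}^{n} W_j}{m + n\theta}
\]

Once we have determined \(\lambda_1^{**}\), we can find \(\lambda_2^{**} = \theta\lambda_1^{**}\).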

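(d) For \(\lambda_1^{*} = \bar{V}\),

\[
\begin{aligned}
E(\lambda_1^{*}) = E(\bar{V}) &= E\left( \frac{V_1 + \ldots + V_m}{m} \right) = \frac{1}{m} E(V_1 + \ldots + V_m) \\
&= \frac{1}{m} \left[ E(V_1) + \ldots + E(V_m) \right] \\
&= \frac{1}{m}\, m \times E(V_1) \\
&= \frac{1}{m}\, m \times \lambda_1 = \lambda_1
\end{aligned}
\]

\[
\begin{aligned}
\operatorname{var}(\lambda_1^{*}) = \operatorname{var}(\bar{V}) &= \operatorname{var}\left( \frac{V_1 + \ldots + V_m}{m} \right) = \frac{1}{m^2} \operatorname{var}(V_1 + \ldots + V_m) \\
&= \frac{1}{m^2} \underbrace{\left[ \operatorname{var}(V_1) + \ldots + \operatorname{var}(V_m) \right]}_{V_1, \ldots, V_m \text{ are independent}} \\
&= \frac{1}{m^2} \underbrace{m \times \operatorname{var}(V_1)}_{V_1, \ldots, V_m \sim \text{Poisson}(\lambda_1)} \\
&= \frac{m}{m^2} \lambda_1 \\
&= \frac{\lambda_1}{m}.
\end{aligned}
\]

For \(\lambda_1^{**} = \dfrac{\sum_{i=1}^{m} V_i + \sum_{j=1}^{n} W_j}{m + n\theta}\),

\[
\begin{aligned}
E(\lambda_1^{**}) &= E\left( \frac{\sum_{i=1}^{m} V_i + \sum_{j=1}^{n} W_j}{m + n\theta} \right) \\
&= \frac{E(V_1 + \ldots + V_m + W_1 + \ldots + W_n)}{m + n\theta} \\
&= \frac{\lambda_1 + \ldots + \lambda_1 + \theta\lambda_1 + \ldots + \theta\lambda_1}{m + n\theta} \\
&= \frac{(m + n\theta)\lambda_1}{m + n\theta} \\
&= \lambda_1.
\end{aligned}
\]

\[
\begin{aligned}
\operatorname{var}(\lambda_1^{**}) &= \operatorname{var}\left( \frac{\sum_{i=1}^{m} V_i + \sum_{j=1}^{n} W_j}{m + n\theta} \right) = \frac{1}{(m + n\theta)^2} \operatorname{var}(V_1 + \ldots + V_m + W_1 + \ldots + W_n) \\
&= \frac{1}{(m + n\theta)^2} \underbrace{\left[ \operatorname{var}(V_1) + \ldots + \operatorname{var}(V_m) + \operatorname{var}(W_1) + \ldots + \operatorname{var}(W_n) \right]}_{V_1, \ldots, V_m, W_1, \ldots, W_n \text{ are independent}} \\
&= \frac{1}{(m + n\theta)^2} (m \times \lambda_1 + n \times \theta\lambda_1) \\
&= \frac{1}{(m + n\theta)^2} (m + n\theta)\lambda_1 \\
&= \frac{\lambda_1}{m + n\theta}.
\end{aligned}
\]

Therefore, both \(\lambda_1^{*}\) and \(\lambda_1^{**}\) are unbiased, but \(\operatorname{var}(\lambda_1^{**}) < \operatorname{var}(\lambda_1^{*})\) for any fixed θ > 0.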

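(e) For θ = 2,

\[
\lambda_1^{**} = \frac{\sum_{i=1}^{m} V_i + \sum_{j=1}^{n} W_j}{m + n\theta} = \frac{47 + 156}{50 + 2 \times 80} = 0.9667
\]

Based on (d), we can test the hypotheses using \(\lambda_1^{**}\), which gives a test statistic

\[
Z^{*} = \frac{\lambda_1^{**} - 1.2}{SD(\lambda_1^{**})} = \frac{0.967 - 1.2}{\sqrt{\dfrac{1.2}{50 + 2 \times 80}}} = -3.0823.
\]

Clearly, \(|Z^{*}| = |-3.0823| > 1.96\), giving strong evidence for λ1 ≠ 1.2; we reject H0.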
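The two Z-tests in (5b) and (5e) differ only in the estimator and its variance, which the following sketch makes explicit. It carries the estimate at full precision, so the pooled statistic comes out near −3.09 instead of the −3.0823 obtained above with the rounded value 0.967; the conclusion is unchanged. Variable names are illustrative.

```python
import math

m, n, theta = 50, 80, 2
sum_v, sum_w = 47, 156
lambda_null = 1.2

# (5b): first sample only, var(lambda_1*) = lambda_1 / m under H0.
est_single = sum_v / m
z_single = (est_single - lambda_null) / math.sqrt(lambda_null / m)

# (5e): both samples, var(lambda_1**) = lambda_1 / (m + n * theta) under H0.
est_pooled = (sum_v + sum_w) / (m + n * theta)
z_pooled = (est_pooled - lambda_null) / math.sqrt(lambda_null / (m + n * theta))

for label, z in (("first sample only", z_single), ("both samples", z_pooled)):
    decision = "reject H0" if abs(z) > 1.96 else "do not reject H0"
    print(f"{label}: Z = {z:.4f} -> {decision}")
```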

(f) Comparing the two tests, clearly the one in (e) is superior because the test statistic makes use of the extra data from the second sample. Moreover, the advantage of the second test is in the smaller variance of the MLE, which is given by \(\frac{\lambda_1}{m + n\theta}\), so the larger the product nθ, the higher the advantage over the test in (b). n gives the sample size of the second sample. Larger values of θ mean a relatively larger number of events in the second sample, compared to the first sample.


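(6a) GDP is chosen because (1) the relationship between GDP and Bank profit seems to be linear, and (2) the correlation between GDP and Bank profit is high.

(b)

\[
\begin{aligned}
b &= \frac{\sum_{i=1}^{n} X_i Y_i - n\bar{X}\bar{Y}}{\sum_{i=1}^{n} X_i^2 - n(\bar{X})^2} \\
&= \frac{3.952445 \times 10^{10} - 17 \times \dfrac{1.490688 \times 10^{7}}{17} \times \dfrac{42200}{17}}{1.312316 \times 10^{13} - 17 \times \left( \dfrac{1.490688 \times 10^{7}}{17} \right)^2} \\
&= \frac{3.952445 \times 10^{10} - 1.490688 \times 10^{7} \times 2482.353}{1.312316 \times 10^{13} - (1.490688 \times 10^{7})^2 / 17} = 0.04876272
\end{aligned}
\]

\[
a = \bar{Y} - b\bar{X} = 2482.353 - 0.04876272 \times \frac{1.490688 \times 10^{7}}{17} = -40276.47
\]

Therefore, \(\hat{Y} = -40276 + 0.04876X\).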

(c) Since n = 17, df = n − 2 = 15, so we choose a critical value of 2.131. Based on the table, the test statistic is 4.773 > 2.131. Hence, the observed relationship is not a random phenomenon; we conclude there is an association between GDP and Bank profit.

(d) For 2005, GDP = 900000, which falls within the range of X that is estimable from the regression. Therefore,

Bank profit = −40276 + 0.04876 × 900000 ≈ 3608.

Since \(R^2 = r^2 = 0.777^2 = 0.603729\), which is rather big, the fit is good.
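The regression in (6b) and the prediction in (6d) can be recomputed from the summary statistics alone. The sketch below uses the GDP row of the summary table. Because it keeps the coefficients at full precision, its prediction differs by about 2 (million $) from the 3608 obtained above with the rounded coefficients. Variable names are illustrative.

```python
n = 17
sum_x = 1.490688e7     # sum of GDP
sum_x2 = 1.312316e13   # sum of GDP squared
sum_xy = 3.952445e10   # sum of GDP * Bank profit
sum_y = 42200          # sum of Bank profit

x_bar = sum_x / n
y_bar = sum_y / n

# Least-squares slope and intercept from the summary statistics.
slope = (sum_xy - n * x_bar * y_bar) / (sum_x2 - n * x_bar**2)
intercept = y_bar - slope * x_bar

# Prediction for 2005 (GDP = 900000) and goodness of fit from corr = 0.777.
prediction = intercept + slope * 900000
r_squared = 0.777**2

print(f"b = {slope:.8f}, a = {intercept:.2f}")
print(f"predicted 2005 profit = {prediction:.0f}, R^2 = {r_squared:.6f}")
```

The denominator is a small difference of two large numbers, so the slope is sensitive to the number of digits kept in the summary statistics.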

(e) The 95% prediction interval is marked by the open squares on the red curves. The red curves are chosen because it is known that prediction intervals are always wider than confidence intervals. In this case, we use prediction intervals because we are interested in the prediction for 2005, not the average profit over different years with GDP = 900000.

[Figure: 95% confidence and prediction bands for Bank profit against GDP, Base rate, Customer base and Unemployment; vertical axes from −10000 to 10000.]

(f) Based on the prediction interval, since the lower limit of the prediction interval is above −5000, the Bank can be 95% certain that the loss would not be more than $5000 million.