Psychology 290
Special Topics Study Course: Advanced Meta-analysis
April 7, 2014
The plan for today
• Review Bayes’ theorem
• Review the likelihood function
• Review maximum likelihood estimation
• Prior and posterior distributions
• The Bayesian approach to statistical inference
Bayes’ theorem
$$P(B \mid A) = \frac{P(A \mid B)\,P(B)}{P(A)}.$$
Example of Bayes’ Theorem
• Suppose a woman has had a single unplanned, unprotected sexual encounter.
• She takes a pregnancy test, and it is positive.
• What does she want to know?
• What is the probability that I am pregnant?
Example of Bayes’ Theorem (cont.)
• Let ‘B’ denote ‘pregnant’ and ‘A’ denote ‘positive pregnancy test.’
• Suppose P(A|B) = .90 (the probability of a positive test given pregnancy), P(A|~B) = .50 (the probability of a positive test given no pregnancy), and P(B) = .15.
• The marginal P(A) can be expressed as P(A|B)P(B) + P(A|~B)P(~B) = (.90)(.15) + (.50)(.85) = .56.
• P(B|A) = (.90)(.15) / .56 = .24107 (reproduced in the sketch below).
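To make the arithmetic concrete, here is a minimal Python sketch of this single update; the function and variable names are illustrative, not from the slides:

```python
def posterior_prob(prior, p_pos_given_preg, p_pos_given_not):
    """P(pregnant | positive test) via Bayes' theorem."""
    # Marginal probability of a positive test:
    # P(A) = P(A|B)P(B) + P(A|~B)P(~B)
    p_pos = p_pos_given_preg * prior + p_pos_given_not * (1 - prior)
    # Bayes' theorem: P(B|A) = P(A|B)P(B) / P(A)
    return p_pos_given_preg * prior / p_pos

print(posterior_prob(0.15, 0.90, 0.50))  # 0.24107...
```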
Example of Bayes’ Theorem (cont.)
• So, on the basis of this one positive result, there is about a 1 in 4 chance that she is pregnant.
• Note that the probabilities used in this example are not accurate, and the problem is oversimplified.
Example of Bayes’ Theorem (cont.)
• Not a very satisfying answer.
• Solution: retest.
• Now, our prior probability of pregnancy is P(B) = .24107.
• We repeat and get another positive.
• P(A) = (.90)(.24107) + (.50)(.75893) = .59643.
• P(B|A) = (.90)(.24107) / .59643 = .36377.
Example of Bayes’ Theorem (cont.)
• If she repeats this and continues to get positive results, her probabilities of pregnancy are: test 3 = .507, test 4 = .649, test 5 = .769, test 6 = .857, test 7 = .915, test 8 = .951, test 9 = .972, and test 10 = .984.
• Each time she adds a new test (= new data), her posterior probability of being pregnant changes; the sketch below reproduces this sequence.
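A minimal Python sketch of the sequential updating, assuming the test behaves identically on every repetition (names are illustrative):

```python
def update(prior, p_pos_given_preg=0.90, p_pos_given_not=0.50):
    """One Bayesian update after observing a positive test result."""
    p_pos = p_pos_given_preg * prior + p_pos_given_not * (1 - prior)
    return p_pos_given_preg * prior / p_pos

p = 0.15  # initial prior probability of pregnancy
for test in range(1, 11):
    p = update(p)  # yesterday's posterior becomes today's prior
    print(f"test {test}: P(pregnant) = {p:.3f}")
# test 1: 0.241, test 2: 0.364, test 3: 0.507, ..., test 10: 0.984
```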
Bayesian inference
• The basic idea of Bayesian inference is to apply Bayes’ theorem to the relationship between data and our prior beliefs about parameters.
• In the example, the parameter of interest was P(pregnant).
• We updated our prior belief on the basis of each subsequent test result (data).
Bayesian inference (cont.)
• P(A|B) is the density of the data (proportional to the likelihood).
• P(B) is our prior belief about the parameters.
• P(B|A) is our updated belief about the parameters, given the observed data.
• The updated belief is called the posterior distribution.
The likelihood function
• The starting point is the joint density of the data, given the parameters.
• Viewed instead as a function of the parameters given the data, the joint density becomes the likelihood function.
• Traditional use of the likelihood function: maximum likelihood estimation.
Properties of maximum likelihood estimates (review)
• Maximum likelihood estimators are often biased.
• They are (asymptotically) minimum variance estimators.
• Likelihood ratio testing (a numerical sketch of ML estimation follows below).
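As a refresher, here is a minimal numerical illustration of maximum likelihood estimation for a normal model, using simulated data; nothing here is from the original slides:

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(0)
data = rng.normal(loc=85.0, scale=11.0, size=40)  # simulated sample

def neg_log_lik(params):
    """Negative log-likelihood of N(mu, sigma^2); minimizing it maximizes the likelihood."""
    mu, sigma = params
    return -norm.logpdf(data, loc=mu, scale=sigma).sum()

fit = minimize(neg_log_lik, x0=[80.0, 10.0], method="Nelder-Mead")
mu_hat, sigma_hat = fit.x
print(mu_hat, sigma_hat)  # mu_hat is the sample mean; sigma_hat**2 is the
                          # divide-by-n variance, a biased estimator
```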
Prior and posterior distributions
• We have already defined the prior as a belief about the distribution of the parameter(s).
• Non-informative (vague) priors are used when we don’t have strong beliefs about the parameters.
• The posterior distribution is a statement of our belief about the parameters, updated to account for the evidence of the data.
Prior and posterior distributions (cont.)
• A conjugate prior is one chosen so that, combined with the likelihood, it produces a posterior in the same family as the prior.
• Examples: a normal prior with a normal likelihood, a beta prior with a binomial likelihood (sketched below).
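A minimal sketch of the beta-binomial case, using the standard closed-form conjugate update; the prior parameters and data are made up for illustration:

```python
from scipy.stats import beta

a, b = 2.0, 2.0             # Beta(a, b) prior on a success probability
successes, failures = 7, 3  # illustrative binomial data

# Conjugacy: Beta prior + binomial likelihood -> Beta(a + successes, b + failures)
a_post, b_post = a + successes, b + failures

print(beta.mean(a_post, b_post))            # posterior mean = 9/14
print(beta.interval(0.95, a_post, b_post))  # central 95% credibility interval
```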
Bayesian estimation
• Bayes estimates are based on the posterior distribution.
• Often, the mean of a parameter’s posterior distribution is used as an estimate of the parameter.
• The math for that can become very difficult. Sometimes, the mode of the posterior is used instead (Bayes modal estimation).
Bayesian estimation (cont.)
• The maximum likelihood estimator may be thought of as a Bayes modal estimator with an uninformative prior.
• Modern computing power can remove the need for the nasty math traditionally needed for Bayesian estimation and inference.
• This makes the Bayesian approach more accessible than it once was.
Bayesian inference
• Bayesian inference involves probabilistic statements about parameters, based on the posterior distribution.
• Probabilistic statements are allowed because Bayesians view the parameters as random variables.
• For example, a Bayesian credibility interval allows us to make the kind of statement we wish we could make when we use confidence intervals.
Bayesian inference (cont.)
• In the Bayesian approach, one can discuss the probability that a parameter is in a particular range by calculating the area under the posterior curve for that range.
• For example, I might be able to make the statement that the probability that μ exceeds 110 is .75 (computed in the sketch below).
• That sort of statement is never possible in frequentist statistics.
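For a normal posterior this area is one line of code; the posterior mean and standard deviation below are hypothetical values chosen to give roughly .75:

```python
from scipy.stats import norm

post_mean, post_sd = 112.0, 3.0  # hypothetical normal posterior for mu

# P(mu > 110) = area under the posterior density to the right of 110
print(norm.sf(110, loc=post_mean, scale=post_sd))  # about .75 for these values
```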
Bayesian inference (cont.)
• Bayesian inference does not involve null hypotheses. (Formally, the null hypothesis is known to be false if we take the Bayesian perspective. Why?)
• Rather, we make probabilistic statements about parameters.
• We can also compare models probabilistically.
An example using the Peabody data
• Suppose we are interested in estimating the mean Peabody score for a population of 10-year-old children.
• We have strong prior reasons to believe that the mean is 85.
• We operationalize that prior belief by stating that μ ~ N(85, 4), i.e., prior mean 85 and prior variance 4.
Peabody example (cont.)
• Next, we assume that Peabody itself is normally distributed:
$$P(X \mid \mu; \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{(X-\mu)^2}{2\sigma^2}}.$$
Peabody example (cont.)
• Recall that we want a posterior distribution for μ.
• Bayes’ theorem says

$$\text{Posterior} = \frac{\text{Joint density given } \mu \;\times\; \text{Prior}}{\text{constant}}.$$

• Note that we can ignore the denominator here, as it is just a scaling constant.
Peabody example (cont.)
• Our posterior, then, is proportional to

$$P(\mu \mid \bar{X}, \sigma^2) \propto \exp\!\left(-\frac{n(\bar{X}-\mu)^2}{2\sigma^2}\right)\exp\!\left(-\frac{(\mu-85)^2}{2\cdot 4}\right).$$

• Some unpleasant algebra that involves completing the square (sketched below) shows that this is the same as a normal with mean (85σ² + 4nM) / (4n + σ²).
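For completeness, here is a sketch of that completing-the-square step; this is the standard normal-normal conjugacy algebra, and the intermediate lines are not on the original slide:

```latex
% Combine the two exponents and collect terms in \mu:
-\frac{n(\bar{X}-\mu)^2}{2\sigma^2} - \frac{(\mu-85)^2}{2\cdot 4}
  = -\frac{1}{2}\left[\mu^2\!\left(\frac{n}{\sigma^2}+\frac{1}{4}\right)
    - 2\mu\!\left(\frac{n\bar{X}}{\sigma^2}+\frac{85}{4}\right)\right] + \text{const.}
% Completing the square in \mu gives a normal kernel with
\mu_{\text{post}} = \frac{n\bar{X}/\sigma^2 + 85/4}{n/\sigma^2 + 1/4}
                  = \frac{4n\bar{X} + 85\sigma^2}{4n + \sigma^2},
\qquad
\sigma^2_{\text{post}} = \frac{1}{n/\sigma^2 + 1/4} = \frac{4\sigma^2}{4n + \sigma^2},
% which matches the mean and variance quoted on the slides (with M = \bar{X}).
```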
Peabody example (cont.)
• The variance of the posterior is 4σ² / (4n + σ²).
• In our example, M = 81.675, an estimate of the variance is σ̂² = 119.2506, and n = 40.
• The posterior mean, then, is (85 × 119.2506 + 4 × 40 × 81.675) / (4 × 40 + 119.2506) = 83.095.
Peabody example (cont.)
• The variance is 4 × 119.2506 / (4 × 40 + 119.2506) = 1.708.
• A 95% credibility interval is given by 83.095 ± 1.96 × √1.708 = (80.53, 85.66).
• As Bayesians, we may say that the probability that μ lies between those values is .95 (the sketch below reproduces these numbers).
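A minimal Python sketch of this normal-normal update, reproducing the numbers above (variable names are illustrative):

```python
import math

prior_mean, prior_var = 85.0, 4.0  # prior: mu ~ N(85, 4)
m, s2, n = 81.675, 119.2506, 40    # sample mean, variance estimate, sample size

# Conjugate normal-normal update (a precision-weighted average)
post_mean = (prior_mean * s2 + prior_var * n * m) / (prior_var * n + s2)
post_var = prior_var * s2 / (prior_var * n + s2)

half = 1.96 * math.sqrt(post_var)
print(post_mean, post_var)                 # 83.095..., 1.708...
print(post_mean - half, post_mean + half)  # about (80.53, 85.66)
```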
Peabody example (cont.)
• Now let’s suppose that we want to repeat the analysis, but with an uninformative prior for the mean.
• Instead of μ ~ N(85, 4), we’ll use N(85, 10000000).
• The posterior distribution of the mean, then, is centered at (85 × 119.2506 + 10000000 × 40 × 81.675) / (10000000 × 40 + 119.2506) = 81.675.
Peabody example (cont.)
• The variance is 10000000 × 119.2506 / (10000000 × 40 + 119.2506) = 2.98126.
• A Bayesian credibility interval for the mean, then, would be 81.675 ± 1.96 × √2.98126 = (78.29, 85.06).
• Although this is numerically identical to the confidence interval we would calculate using frequentist maximum likelihood, we are justified in giving it a Bayesian interpretation (see the sketch below).
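Rerunning the earlier sketch with the vague prior shows the posterior collapsing to the maximum likelihood answer:

```python
import math

prior_mean, prior_var = 85.0, 1e7  # effectively uninformative prior
m, s2, n = 81.675, 119.2506, 40

post_mean = (prior_mean * s2 + prior_var * n * m) / (prior_var * n + s2)
post_var = prior_var * s2 / (prior_var * n + s2)
half = 1.96 * math.sqrt(post_var)

print(post_mean)                           # 81.675, the sample mean
print(post_var)                            # 2.98126, essentially s2 / n
print(post_mean - half, post_mean + half)  # about (78.29, 85.06)
```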