04/19/23 1
PUAF 610 TA
Session 4
Some words
• My email: [email protected]
– Things to be discussed in TA
– Questions on the course and problem sets
Today
• Problem Sets 1
• * Probability
• Sampling
• * Standard Error
• STATA
[Figure: classification diagram of data types, with node labels: data, numerical, categorical, continuous, discrete, interval, proportion, nominal, ordinal, dichotomous, non-dichotomous]
Measurement scales
All measurements in science are conducted using four different types of scales: "nominal", "ordinal", "interval" and "ratio".
Qualitative data (unordered or ordered discrete categories):
1. Nominal – numbers are used as labels for the elements (e.g. gender, party affiliation, states of a country, etc.)
2. Ordinal – elements in the dataset can be ordered on the amount of the property being measured, and values are assigned in this same order (e.g. ratings)
Measurement scales
Quantitative data (variables have underlying continuity):
3. Interval – a measurement scale in which a certain distance along the scale means the same thing no matter where on the scale you are, but where "0" (zero) on the scale does not represent the absence of the thing being measured (e.g. temperature in °C or °F).
4. Ratio – a measurement scale in which a certain distance along the scale means the same thing no matter where on the scale you are, and where "0" (zero) on the scale represents the absence of the thing being measured (e.g. money).
Events
• Event vs. observation (any collection of outcomes vs. a single observed outcome)
– Simple event: any event that cannot be subdivided into other events.
– Compound event: any event that is composed of two or more simple events.
– Sample space: an event that contains all possible outcomes.
Events
• The union of events contains the simple events that are members of at least one of the original events.
• Intersection of events contains simple events that are members of both of the original events.
Events
• Mutually exclusive events: have neither observations nor simple events in common.
• Independent events: the probability of one is not affected if the other has happened.
Probability
• Probability deals with the long-term likelihood of the occurrence of particular outcomes on variables of interest.
• The probability of an event is the ratio of the number of outcomes that include the event to the total number of possible outcomes (simple events), assuming all outcomes are equally likely.
• P(A) = Number of outcomes that include A / Total number of possible outcomes
Probability
• Probability of an event = p, where 0 <= p <= 1
0 = certain non-occurrence
1 = certain occurrence
Example
• Choose a number at random from 1 to 5.
– What is the probability of each outcome?
– What is the probability that the number chosen is even?
– What is the probability that the number chosen is odd?
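A possible worked solution, assuming each of the five numbers is equally likely to be chosen:

```latex
\[ P(1) = P(2) = P(3) = P(4) = P(5) = \frac{1}{5} = 0.2 \]
\[ P(\mathrm{even}) = P(\{2, 4\}) = \frac{2}{5} = 0.4 \]
\[ P(\mathrm{odd}) = P(\{1, 3, 5\}) = \frac{3}{5} = 0.6 \]
```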
Example
• A glass jar contains 6 red, 5 green, 8 blue and 3 yellow marbles. If a single marble is chosen at random from the jar, what is the probability of choosing a red marble? a green marble? a blue marble? a yellow marble?
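A possible worked solution, assuming every marble is equally likely to be drawn. The jar holds 6 + 5 + 8 + 3 = 22 marbles in total:

```latex
\[ P(\mathrm{red}) = \frac{6}{22} = \frac{3}{11} \approx 0.273 \]
\[ P(\mathrm{green}) = \frac{5}{22} \approx 0.227 \]
\[ P(\mathrm{blue}) = \frac{8}{22} = \frac{4}{11} \approx 0.364 \]
\[ P(\mathrm{yellow}) = \frac{3}{22} \approx 0.136 \]
```

The four probabilities sum to 1, as they should for outcomes that cover the whole sample space.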
Rules of probability
• Probability of the union of two events
• The probability of event A OR event B is equal to the sum of their respective probabilities minus the probability of the intersection of the events.
• if A and B are mutually exclusive events, then P(A or B) = P(A) + P(B)
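In symbols, the union rule can be sketched as follows (the set notation is an addition here, not taken from the slide):

```latex
\[ P(A \cup B) = P(A) + P(B) - P(A \cap B) \]
% Mutually exclusive events have an empty intersection, so P(A \cap B) = 0:
\[ P(A \cup B) = P(A) + P(B) \]
```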
Rules of probability
• Probability of the intersection of two events
• The probability that both A and B occur is equal to the probability that A occurs times the probability that B occurs, given that A has occurred.
• If events A and B are independent, then P(A and B) = P(A)*P(B)
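In symbols, the intersection rule can be sketched as follows (the notation P(B | A) for "B given A" is an addition here, not taken from the slide):

```latex
\[ P(A \cap B) = P(A) \, P(B \mid A) \]
% Under independence, knowing A does not change the probability of B,
% so P(B \mid A) = P(B):
\[ P(A \cap B) = P(A) \, P(B) \]
```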
Rules of probability
The probability that event A will occur is equal to 1 minus the probability that event A will not occur.
P(A) = 1 - P(A')
Rules of probability
• Conditional probability
• The probability of event A given that event B has occurred is equal to the probability of the intersection of the events divided by the probability of event B.
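Written as a formula, the rule might look like this (it only makes sense when P(B) is greater than zero):

```latex
\[ P(A \mid B) = \frac{P(A \cap B)}{P(B)}, \qquad P(B) > 0 \]
```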
Example
• Suppose a high school consists of 25% juniors and 15% seniors, and the remaining 60% are students in other grades.
• What is the relative frequency of students who are either juniors or seniors?
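A possible worked solution, assuming no student is both a junior and a senior, so the two events are mutually exclusive:

```latex
\[ P(\mathrm{junior\ or\ senior}) = P(\mathrm{junior}) + P(\mathrm{senior}) = 0.25 + 0.15 = 0.40 \]
```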
Example
• Suppose we have two dice. A is the event that 6 shows on the first die, and B is the event that 6 shows on the second die.
• If both dice are rolled at once, what is the probability that two 6s occur?
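A possible worked solution, assuming both dice are fair and the two rolls are independent:

```latex
\[ P(A \cap B) = P(A) \, P(B) = \frac{1}{6} \times \frac{1}{6} = \frac{1}{36} \approx 0.028 \]
```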
Example
• A box contains 6 red marbles and 4 black marbles. Two marbles are drawn without replacement from the box.
• What is the probability that both of the marbles are black?
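A possible worked solution. Because the first marble is not put back, the second draw is from 9 marbles, of which 3 are black if the first was black:

```latex
\[ P(\mathrm{both\ black}) = P(\mathrm{1st\ black}) \, P(\mathrm{2nd\ black} \mid \mathrm{1st\ black})
   = \frac{4}{10} \times \frac{3}{9} = \frac{12}{90} = \frac{2}{15} \approx 0.133 \]
```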
Example
• Suppose we repeat the previous experiment, but this time we select marbles with replacement. That is, we select one marble, note its color, and then replace it in the box before making the second selection.
• When we select with replacement, what is the probability that both of the marbles are black?
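A possible worked solution, assuming the box is the one from the preceding example (6 red and 4 black marbles). With replacement, the two draws are independent and the box is the same on each draw:

```latex
\[ P(\mathrm{both\ black}) = \frac{4}{10} \times \frac{4}{10} = \frac{16}{100} = 0.16 \]
```

This is larger than the probability without replacement, because the first black marble is returned to the box before the second draw.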
Example
• A student goes to the library. The probability that she checks out (a) a work of fiction is 0.40, (b) a work of non-fiction is 0.30, and (c) both fiction and non-fiction is 0.20.
• What is the probability that the student checks out a work of fiction, non-fiction, or both?
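A possible worked solution using the rule for the union of two events, with F for fiction and N for non-fiction (the letters are labels introduced here):

```latex
\[ P(F \cup N) = P(F) + P(N) - P(F \cap N) = 0.40 + 0.30 - 0.20 = 0.50 \]
```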
Example
• At Kennedy Middle School, the probability that a student takes Technology and Spanish is 0.087. The probability that a student takes Technology is 0.68.
• What is the probability that a student takes Spanish given that the student is taking Technology?
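A possible worked solution using the conditional probability rule, with S for Spanish and T for Technology (the letters are labels introduced here):

```latex
\[ P(S \mid T) = \frac{P(S \cap T)}{P(T)} = \frac{0.087}{0.68} \approx 0.128 \]
```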
Sampling
• Simple random sampling
• Systematic sampling
• Cluster sampling
• Stratified random sampling
• Multistage sampling
Sampling distribution of the mean
• When using samples we inevitably face the problem of sampling error, which is defined as the difference between the population mean (μ) and the sample mean ($\bar{x}$).
• We can provide a probabilistic estimate of the accuracy of the sample mean through a theoretical sampling distribution.
Sampling distribution of the mean
• Central tendency
The expected value of the mean of the distribution of sample means is equal to the population mean.
• Variance
The expected value of the variance for the sampling distribution of the mean is $\sigma_{\bar{x}}^2 = \sigma^2 / n$,
where $\sigma^2$ is the variance in the population and $n$ is the sample size.
Sampling distribution of the mean
Standard deviation of the sampling distribution of the mean: $\sigma_{\bar{x}} = \sigma / \sqrt{n}$,
where $\sigma$ is the standard deviation in the population (can be approximated by the sample standard deviation) and $n$ is the sample size.
Standard deviation of the sampling distribution of the mean is called the standard error of the mean.
Sampling distribution of the mean
• As n increases, the standard error decreases.
• As n increases, the shape of the sampling distribution of the mean (SDM) becomes more like the normal distribution, even if the variable is not normally distributed in the population.
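A small numeric illustration of the first point, using the standard error formula σ/√n with a hypothetical population standard deviation of 10 (the numbers are made up for illustration only):

```latex
\[ n = 25:  \quad \sigma_{\bar{x}} = \frac{10}{\sqrt{25}} = 2 \]
\[ n = 100: \quad \sigma_{\bar{x}} = \frac{10}{\sqrt{100}} = 1 \]
```

Quadrupling the sample size halves the standard error, because n enters through its square root.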
Standard error of a proportion
Standard error of a proportion is the standard deviation of its sampling distribution. Since each observation has two possible outcomes, the sampling distribution is binomial; however, with relatively large sample sizes it approximates the normal distribution.
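The usual textbook formula, which does not appear in the slide text itself, can be sketched as follows, where p is the population proportion (in practice approximated by the sample proportion) and n is the sample size. The numeric values are hypothetical:

```latex
\[ \sigma_{\hat{p}} = \sqrt{\frac{p(1 - p)}{n}} \]
% Hypothetical example with p = 0.5 and n = 100:
\[ \sigma_{\hat{p}} = \sqrt{\frac{0.5 \times 0.5}{100}} = \sqrt{0.0025} = 0.05 \]
```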
Standard errors and statistical precision
• Statistical precision is reflected in standard errors as measures of variability of the sampling distribution of a statistic.
• Smaller standard errors imply greater precision of the estimate.
• When the sample is representative, the standard error will be small.
STATA
• Beginner’s Guide to SAS & STATA Software (Dept. of Agricultural & Applied Economics, UGA)
• http://www.aaegrad.uga.edu/stata_sas_guide.pdf
• Learning by practice!
Stata Commands
summarize
univar
Stata Commands
univar, boxplot
graph box
[Figure: box plot of EMPFT, axis from 0 to 60]
Stata Commands
univar, boxplot
graph box
[Figure: box plot of EMPFT, axis from 0 to 60]
STATA commands
hist varname, norm
[Figure: histogram of usunemp, with Density on the vertical axis (0 to .4) and usunemp on the horizontal axis (2 to 10)]
Stata Commands
set obs #
generate varname=rbinomial(1,p)
table varname
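A minimal sketch of how the three commands above might be used together to simulate a dichotomous variable and check its standard error. The variable name x, the seed, the 1,000 observations and p = 0.3 are all hypothetical choices for illustration:

```stata
* Sketch only: x, the seed, 1000 and 0.3 are illustrative assumptions.
clear
set seed 12345                      // make the random draws reproducible
set obs 1000                        // create 1,000 empty observations
generate x = rbinomial(1, 0.3)      // one Bernoulli trial per observation, p = 0.3
table x                             // frequency of 0s and 1s
summarize x                         // the mean of x is the sample proportion
* Standard error of the proportion, sqrt(p*(1-p)/n), from summarize's stored results
display sqrt(r(mean) * (1 - r(mean)) / r(N))
```

With p = 0.3 and n = 1,000 the displayed value should be close to sqrt(0.3 × 0.7 / 1000), about 0.0145, though the exact figure depends on the random draws.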