Upload
margery-arnold
View
217
Download
0
Tags:
Embed Size (px)
Citation preview
Describe 3 entirely different (but practical) ways for determining the area (in cm2) of the darkened region below (design is on a piece of
paper) to within 0.1%.
Describe 3 entirely different (but practical) ways for determining the area (in cm2) of the darkened region below (design is on a piece of
paper) to within 0.1%.
1. Superimpose a finely-spaced grid over the figure and count squares.
2. Cut out figure and weigh it. Compare that weight to that of piece of paper. If too light, transfer image to another uniformly-dense material.
3. Divide figure into local regions that can be integrated numerically.
4. Computer scan image and count pixels.
5. Build a container whose cross-section is that of the darkened figure. Fill with 1000cc water and measure level.
Describe 3 entirely different (but practical) ways for determining the area (in cm2) of the darkened region below (design is on a piece of
paper) to within 0.1%.
6. Use a “polar planimeter” – gadget that mechanically integrates the area defined by a close curve.
7. “Throw darts.” Draw rectangle (of calculable area) that encloses image. Pick random points within the rectangle and count which ones fall within the darkened figure. The ratio can be used to estimate area.
Why Study Statistics?Why Study Statistics?
Statistics can tell us about…Sports
Economy Population
Statistics: A mathematical science concerned with data collection, presentation, analysis, and interpretation.
New York City, NY
7,900,000
7,950,000
8,000,000
8,050,000
8,100,000
8,150,000
8,200,000
8,250,000
2000 2001 2002 2003 2004 2005 2006
Year
Es
tim
ate
d P
op
ula
tio
n
Employment Rates: 2006
63.1%
32.3%4.6%
UnemployedEmployed Other
Why Study Statistics?Why Study Statistics?
Statistical analysis is also an integral part of scientific research!Are your experimental results believable?
Example: Tensile Strength of Spaghetti
Data suggests a relationship between Type (size) and breaking strengthNot perfect – have random error.
Why Study Statistics?Why Study Statistics?
Responses and measurements are variable!
Due to…
Random Error – may vary from observation to observation
Perhaps due to inability to perform measurements in exactly the same way every time.
Goal of statistics is to find the model that best describes a target population by taking sample data.
Represent randomness using probability.
Systematic Error – same error value by using an instrument the same way
ProbabilityProbability
Experiment of chance: a phenomena whose outcome is uncertain.
Probabilities Chances
Probability Model
Sample SpaceEventsProbability of Events
Sample Space: Set of all possible outcomes
Event: A set of outcomes (a subset of the sample space). An event E occurs if any of its outcomes occurs. Rolling dice, measuring, performing an experiment, etc.
Probability: The likelihood that an event will produce a certain outcome.Independence: Events are independent if the occurrence of one does not affect the probability of the occurrence of another. Why important?
ProbabilityProbability
Consider a deck of playing cards…
Sample Space?
Event?
Probability?
Set of 52 cards
R: The card is red. F: The card is a face card.
H: The card is a heart. 3: The card is a 3.
P(R) = 26/52 P(F) = 12/52
P(H) = 13/52 P(3) = 4/52
Events and variablesEvents and variables
Can be described as random or deterministic:
The outcome of a random event cannot be predicted:
The sum of two numbers on two rolled dice.
The time of emission of the ith particle from radioactive material.
The outcome of a deterministic event can be predicted:The measured length of a table to the nearest cm.
Motion of macroscopic objects (projectiles, planets, space craft) as predicted by classical mechanics.
Extent of randomnessExtent of randomness
A variable can be more random or more deterministic depending on the degree to which you account for relevant parameters:
Mostly deterministic:Only a small fraction of the outcome cannot be accounted for.
Length of a table:• Temperature/humidity variation• Measurement resolution• Instrument/observer error• Quantum-level intrinsic uncertainty
Mostly Random:Most of the outcome cannot be accounted for.
• Trajectory of a given molecule in a solution
Random variablesRandom variables
Can be described as discrete or continuous:• A discrete variable has a countable number of values.
Number of customers who enter a store before one purchases a product.
• The values of a continuous variable can not be listed:Distance between two oxygen molecules in a room.
Random Variable Possible Values
Gender Male, Female
Class Fresh, Soph, Jr, Sr
Height (inches) Integer in interval {30,90}
College Arts, Education, Engineering, etc.
Shoe Size 3, 3.5 … 18
Consider data collected for undergraduate students:
Is the height a discrete or continuous variable?How could you measure height and shoe size to make them continuous variables?
Probability DistributionsProbability Distributions
If a random event is repeated many times, it will produce a distribution of outcomes (statistical regularity).
(Think about scores on an exam)
The distribution can be represented in two ways:
• Frequency distribution function: represents the distribution as the number of occurrences of each outcome
• Probability distribution function: represents the distribution as the percentage of occurrences of each outcome
Discrete Probability DistributionsDiscrete Probability Distributions
Consider a discrete random variable, X:
f(xi) is the probability distribution function
What is the range of values of f(xi)?
Therefore, Pr(X=xi) = f(xi)
Discrete Probability DistributionsDiscrete Probability Distributions
Properties of discrete probabilities:
0)()Pr( ii xfxX for all i
k
ii
k
ii xfxX
11
1)()Pr( for k possible discrete outcomes
bxa
i
i
xfaFbFbXa )()()()Pr(
Where: )Pr()( xXxF
Discrete Probability DistributionsDiscrete Probability Distributions
Example: Waiting for a success
Consider an experiment in which we toss a coin until heads turns up.
Outcomes, w = {H, TH, TTH, TTTH, TTTTH…}
Let X(w) be the number of tails before a heads turns up.
12
1)(
xxf For x = 0, 1, 2….
00.05
0.10.15
0.20.25
0.30.35
0.40.45
0.5
f(x)
0 1 2 3 4 5 6
Waiting time
k
ii
k
ii xfxX
11
1)()Pr(
bxa
i
i
xfaFbFbXa )()()()Pr(
Cumulative Discrete Probability DistributionsCumulative Discrete Probability Distributions
j
iixfxFxX
1
)()'()'Pr( Where xj is the largest discrete value of X less than or equal to x’
1)Pr( kxX
Discrete Probability DistributionsDiscrete Probability Distributions
Example: Distribution Function for Die/Dice
Distribution function for throwing a die:
Outcomes, w = {1, 2, 3, 4, 5, 6} f(xi) = 1/6 for I = 1,6
0.000
0.020
0.040
0.060
0.080
0.100
0.120
0.140
0.160
0.180
1 2 3 4 5 6
Discrete Probability DistributionsDiscrete Probability Distributions
Example: Distribution Function for Die/Dice
Distribution function for the sum of two thrown dice:
f(xi) = 1/36 for x1 = 2
2/36 for x2 = 3
…
0.000
0.020
0.040
0.060
0.080
0.100
0.120
0.140
0.160
0.180
2 3 4 5 6 7 8 9 10 11 12
Continuous Probability Density FunctionContinuous Probability Density Function
Cumulative Distribution Function (cdf): Gives the fraction of the total probability that lies at or to the left of each x
Probability Density (Distribution) Function (pdf): Gives the density of concentration of probability at each point x
)Pr()( xXxF
Continuous Probability DistributionsContinuous Probability Distributions
Properties of the cumulative distribution function:
0)( F
1)(0 xF
1)( F
Properties of the probability density function:
b
a
dxxfaFbFbXa )()()()Pr(
)Pr()( xXxF
Continuous Probability DistributionsContinuous Probability Distributions
1
10t
F(t)
For continuous variables, the events of interest are intervals rather than isolated values.
Not interested in probability that the bus will arrive in 3.451233 minutes, but rather the probability that the bus will arrive in the subinterval (a,b) minutes:
Consider waiting time for a bus which is equally likely to be anywhere in the next ten minutes:
10)()()(
abaFbFbTaP
Continuous Probability DistributionsContinuous Probability Distributions
Example: Gaussian (normal) distribution:
Each member of the normal distribution family is described by the mean (μ) and variance (σ2).
Standard normal curve: μ = 0, σ = 1.
Normal / Gaussian DistributionNormal / Gaussian Distribution
Can be used to approximately describe any variable that tends to cluster around the mean.
Normal (Gaussian) Distribution:
The sum of a (sufficiently) large number of independent random variables will be approximately normally distributed.
Central Limit Theorem:
Used as a simple model for complex phenomena – statistics, natural science, social science
e.g., Observational error assumed to follow normal distribution
Importance:
Examples of experiments/measurements that will produce Gaussian distribution?
Standard ErrorStandard Error
N
Standard Deviation:
Standard Error:
Variance is the average squared distance of the data from the mean. Therefore, the standard deviation measures the spread of data about the mean.
Variance:
Deviation:
Standard ErrorStandard Error
How do we reduce the size of our standard error?
1) Repeated Measurements
2) Different Measurement Strategy
Jacob Bernoulli (1731): “For even the most stupid of men, by some instinct of nature, by himself and without any instruction (which is a remarkable thing), is convinced the more observations have been made, the less danger there is of wandering from one’s goal" (Stigler, 1986).
MomentsMoments
Other values in terms of the moments:
Skewness: 2/32
3
‘lopsidedness’ of the distribution
a symmetric distribution will have a skewness = 0
negative skewness, distribution shifted to the left
positive skewness, distribution shifted to the right
Kurtosis:
Describes the shape of the distribution with respect to the height and width of the curve (‘peakedness’)
Central Limit TheoremCentral Limit Theorem
As the sample size goes to infinity, the distribution function of the standardized variable leads to the normal distribution function!
http://www.jhu.edu/virtlab/prob-distributions/
MomentsMoments
In physics, the moment refers to the force applied to a system at a distance from the axis of rotation (as in a lever).
In mathematics, the moment is a measure of how far a function is from the origin.
The 1st moment about the origin: (mean)
The 2nd moment about the mean: (variance)
Average value of x
A measure of the ‘spread’ of the data
Two teams measure the height of a pole.
• Which team did the better job?• Why do you think so?
Team A Team B183 183.0182 183.5185 182.7181 182.5183 183.1184 183.3
avg = avg =183 183.0
std dev = std dev =1.41 0.37
Height in cm
If you measure something many times, you get random error.
• The positive and negative errors should balance out.
• The average should be closer to the true value than any one measurement might be.
• The deviations from the average for individual measurements give an indication of the reliability of that average value.
• Standard deviation measures the reliability of the average.
Team A Team B183 183.0182 183.5185 182.7181 182.5183 183.1184 183.3
avg = avg =183 183.0
std dev = std dev =1.41 0.37
Height in cm