ria-davidson
Measures of Central Tendency and Dispersion
Content Analysis: Challenges
• Lose some nuance when coding
• How to select material from the universe of possible material?
• Is the material accurate?
  - Unintentional problems
  - Purposeful distortion
  - Ultimately a question of validity
• Are coders accurate?
  - Can establish reliability
  - Harder to establish validity
Statistics
• Provides a description of a sample or population
• Simplification
• Univariate: only interested in one attribute at a time
• Bivariate: considers relationships between 2 attributes
• Multivariate: the sky is the limit
Percentages
• Useful for comparing groups with unequal numbers
CABLE * APPROVE1 Crosstabulation (Count)
                 APPROVE1
CABLE     .00    .33    .67   1.00   Total
.00        92     39     90    149     370
1.00      308    123    335    596    1362
Total     400    162    425    745    1732
CABLE * APPROVE1 Crosstabulation
                              APPROVE1
CABLE   Statistic           .00     .33     .67    1.00    Total
.00     Count                92      39      90     149      370
        % within CABLE    24.9%   10.5%   24.3%   40.3%   100.0%
        % of Total         5.3%    2.3%    5.2%    8.6%    21.4%
1.00    Count               308     123     335     596     1362
        % within CABLE    22.6%    9.0%   24.6%   43.8%   100.0%
        % of Total        17.8%    7.1%   19.3%   34.4%    78.6%
Total   Count               400     162     425     745     1732
        % within CABLE    23.1%    9.4%   24.5%   43.0%   100.0%
        % of Total        23.1%    9.4%   24.5%   43.0%   100.0%
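The two percentage rows in the crosstabulation use different denominators: "% within CABLE" divides by the row total, "% of Total" by the grand total. A minimal sketch of that arithmetic, using the counts from the table; the function names are illustrative and this is not the SPSS procedure that produced the original output:

```python
# Counts copied from the CABLE * APPROVE1 crosstabulation.
# Keys are CABLE values; each list holds the APPROVE1 columns .00, .33, .67, 1.00.
counts = {
    0.0: [92, 39, 90, 149],
    1.0: [308, 123, 335, 596],
}

grand_total = sum(sum(row) for row in counts.values())


def row_percentages(row):
    """Percent of each cell within its own CABLE row (% within CABLE)."""
    row_total = sum(row)
    return [round(100 * cell / row_total, 1) for cell in row]


def total_percentages(row):
    """Percent of each cell relative to the whole table (% of Total)."""
    return [round(100 * cell / grand_total, 1) for cell in row]


for cable, row in counts.items():
    print(cable, row_percentages(row), total_percentages(row))
```

Row percentages make the two CABLE groups comparable even though one group has 370 cases and the other 1362.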
Percentages
• To compute: (# with trait of interest / total #) × 100
• Example 1: sample of 4 cats, one is black
  - (1/4) × 100 = 25%
• Example 2: sample of 750, 612 approve of the president
  - (612/750) × 100 = 81.6%
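The percentage formula can be sketched in a few lines. The function name `percentage` is an illustrative choice, not something from the slides:

```python
def percentage(with_trait, total):
    """Return (count with the trait of interest / total count) * 100."""
    if total <= 0:
        raise ValueError("total must be positive")
    return with_trait / total * 100


# Example 1: 1 black cat in a sample of 4.
print(percentage(1, 4))                 # 25.0

# Example 2: 612 of 750 respondents approve of the president.
print(round(percentage(612, 750), 1))   # 81.6
```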
What Constitutes the Denominator?
• Percentage of Total
• Percentage of Valid Cases
  - Excludes missing cases
  - Typically more appropriate
• Cumulative Percent: what percentage so far have reached this level
An Example
CLINTMOR
                    Frequency   Percent   Valid Percent   Cumulative Percent
Valid     .00             807      44.7            49.1                 49.1
          .33             568      31.4            34.5                 83.6
          .67             215      11.9            13.1                 96.7
          1.00             54       3.0             3.3                100.0
          Total          1644      91.0           100.0
Missing   System          163       9.0
Total                    1807     100.0
CLINTKNO
                    Frequency   Percent   Valid Percent   Cumulative Percent
Valid     .00              31       1.7             1.9                  1.9
          .33             143       7.9             8.7                 10.5
          .67             856      47.4            51.9                 62.4
          1.00            620      34.3            37.6                100.0
          Total          1650      91.3           100.0
Missing   System          157       8.7
Total                    1807     100.0
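The three percentage columns differ only in their denominator. As a rough sketch (the function name and the tuple layout are assumptions made for illustration), the CLINTMOR columns can be rebuilt from the raw frequencies and the count of missing cases:

```python
from itertools import accumulate


def frequency_table(freqs, missing):
    """Build rows of (value, frequency, percent, valid percent, cumulative percent).

    percent uses all cases (valid + missing) as the denominator;
    valid percent and cumulative percent use valid cases only.
    """
    valid_total = sum(freqs.values())
    grand_total = valid_total + missing
    running_totals = accumulate(freqs.values())
    rows = []
    for (value, count), running in zip(freqs.items(), running_totals):
        rows.append((
            value,
            count,
            round(100 * count / grand_total, 1),
            round(100 * count / valid_total, 1),
            round(100 * running / valid_total, 1),
        ))
    return rows


# Frequencies from the CLINTMOR table; 163 cases are system-missing.
clintmor = {0.00: 807, 0.33: 568, 0.67: 215, 1.00: 54}
for row in frequency_table(clintmor, missing=163):
    print(row)
```

Because 163 of the 1807 cases are missing, each valid percent is larger than the matching percent of total.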
Measures of Central Tendency
• Mode
• Mean (Average)
• Median
Computing the Mean
• Requires at least ordinal data
• $\bar{Y} = (Y_1 + Y_2 + Y_3 + \dots + Y_n)/n$
• Example: people with incomes of 10,000, 15,000, 25,000, 55,000, 32,000, 29,500
• Mean = (10,000 + 15,000 + 25,000 + 55,000 + 32,000 + 29,500)/6 = 27,750
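A minimal sketch of the mean calculation, applied to the six incomes from the example. The helper name `mean` is illustrative; Python's standard library offers `statistics.mean` for the same purpose:

```python
def mean(values):
    """Sum the observations and divide by how many there are."""
    if not values:
        raise ValueError("mean of an empty sample is undefined")
    return sum(values) / len(values)


incomes = [10_000, 15_000, 25_000, 55_000, 32_000, 29_500]
print(mean(incomes))  # 27750.0
```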
Mode
• Most common with nominal data
• Count frequencies, find the most common
• Ask 30 1st graders their favorite color
  - 7 blue
  - 3 chartreuse
  - 4 purple
  - 2 yellow
  - 10 red
  - 3 green
  - 1 black
• Mode: red
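Finding the mode amounts to counting frequencies and picking the largest. A small sketch using the favorite-color tallies from the slide; the variable names are illustrative:

```python
from collections import Counter

# Tallies from the favorite-color example (30 first graders in total).
favorite_colors = Counter({
    "blue": 7,
    "chartreuse": 3,
    "purple": 4,
    "yellow": 2,
    "red": 10,
    "green": 3,
    "black": 1,
})

# most_common(1) returns a one-item list holding the (value, count) pair
# with the highest count.
mode, count = favorite_colors.most_common(1)[0]
print(mode, count)  # red 10
```

Note that `most_common(1)` reports only one category even when several tie for the highest count, so a tie would need to be checked separately.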
Frequencies
PID
                    Frequency   Percent   Valid Percent   Cumulative Percent
Valid     .00             346      19.1            19.5                 19.5
          1.00            274      15.2            15.4                 34.9
          2.00            269      14.9            15.1                 50.1
          3.00            206      11.4            11.6                 61.7
          4.00            230      12.7            13.0                 74.6
          5.00            215      11.9            12.1                 86.7
          6.00            236      13.1            13.3                100.0
          Total          1776      98.3           100.0
Missing   System           31       1.7
Total                    1807     100.0
Computing the Median
• Requires at least ordinal data
• Put the values in order
• If there is an odd number of cases: the middle value (half are above, half below)
• If there is an even number of cases: the average of the two middle cases
• Income example:
  - Unsorted: 10,000, 15,000, 25,000, 55,000, 32,000, 29,500
  - Sorted: 10,000, 15,000, 25,000, 29,500, 32,000, 55,000
  - Median = (25,000 + 29,500)/2 = 27,250
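The odd and even cases can be sketched as follows. The helper name `median` is illustrative; `statistics.median` in the standard library follows the same rule:

```python
def median(values):
    """Middle value of the sorted data; average of the two middle values if n is even."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty sample is undefined")
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


incomes = [10_000, 15_000, 25_000, 55_000, 32_000, 29_500]
print(sorted(incomes))  # [10000, 15000, 25000, 29500, 32000, 55000]
print(median(incomes))  # 27250.0
```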
When To Use Which?
• Mode: nominal data
  - Better to actually give totals for all if there are few choices, e.g. 33% red, 10% green
• Mean: when the data are appropriate
• Median: with ordinal data, or in cases where a few values might cause a skew
• Outlier: data point with an extreme value
Median vs. Mean
• Created a fake town with 100 residents
• Incomes 19,00-138,000
• Mean=57,600, Median=49,500
• Suppose one person with 30,000 moves away, replaced by a millionaire
  - Mean=67,300, Median=55,000
• Replaced by 50,000,000
  - Mean=557,300, Median=55,000
• Replaced by Bill Gates (50 billion)
  - Mean=500 million, Median=55,000
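The slide's town data are not available, so the sketch below uses a made-up town of 100 evenly spaced incomes (an assumption, labeled as such in the code) to show the same effect: one extreme newcomer drags the mean upward while the median barely moves.

```python
import statistics

# Hypothetical town of 100 residents (NOT the slide's data set):
# incomes run from 20,000 to 119,000 in steps of 1,000.
town = [20_000 + 1_000 * i for i in range(100)]


def replace_resident(incomes, leaving, arriving):
    """Return a copy of incomes with one `leaving` value swapped for `arriving`."""
    updated = list(incomes)
    updated.remove(leaving)    # drops the first resident with that income
    updated.append(arriving)
    return updated


print("before", statistics.mean(town), statistics.median(town))
for newcomer in (1_000_000, 50_000_000, 50_000_000_000):
    after = replace_resident(town, 30_000, newcomer)
    print(newcomer, statistics.mean(after), statistics.median(after))
```

In this made-up town the first swap raises the mean by 9,700, the same shift the slide reports (57,600 to 67,300), because (1,000,000 − 30,000)/100 = 9,700. The median moves only from 69,500 to 70,500 and stays there no matter how rich the newcomer is.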
Measures of Dispersion
• A measure of central tendency loses something
• Income example?
• Dispersion
  - Measure of how much divergence there is from the mean
• Histogram
  - Horizontal axis breaks the variable down into ranges
  - Vertical axis: count within each range
[Figure: four histograms, one each for income1, income2, income3, and income4; horizontal axis: income ranges, vertical axis: count (0 to 170)]
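Quantifying Dispersion: Standard Deviation
• Find the difference from the mean for each observation
• Square each difference
• Add them up
• Divide by the number of cases minus 1
• Take the square root
$$\hat{\sigma} = \sqrt{\frac{\sum_{i=1}^{n} (Y_i - \bar{Y})^2}{n - 1}}$$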
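The standard deviation steps can be sketched directly in code. This is an illustrative implementation of the sample (n − 1) formula, applied to the six incomes from the earlier mean example; the function name is an assumption, and `statistics.stdev` from the standard library is used only as a cross-check:

```python
import math
import statistics


def sample_std_dev(values):
    """Sample standard deviation with the n - 1 denominator."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = sum(values) / n
    # Squared difference from the mean for each observation.
    squared_diffs = [(y - mean) ** 2 for y in values]
    # Add them up, divide by n - 1, then take the square root.
    return math.sqrt(sum(squared_diffs) / (n - 1))


incomes = [10_000, 15_000, 25_000, 55_000, 32_000, 29_500]
sd = sample_std_dev(incomes)
print(round(sd))                                    # 15804
print(math.isclose(sd, statistics.stdev(incomes)))  # True
```

Squaring keeps positive and negative differences from cancelling out, and the final square root returns the result to the original units (here, dollars of income).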
Standard Deviation from Previous Cases
income1
• Mean=50,024, S.D.=992.5
• Min=46,834, Max=52,935
[Figure: histogram of income1; vertical axis: count]
income2
• Mean=50,255, S.D.=4,792
• Min=35,671, Max=65,095
[Figure: histogram of income2; vertical axis: count]
income3
• Mean=50,311, S.D.=10,124
• Min=22,522, Max=78,642
[Figure: histogram of income3; vertical axis: count]
income4
• Mean=50,982, S.D.=18,898
• Min=1,591, Max=105,957
[Figure: histogram of income4; vertical axis: count]
Gore Thermometer
• Mean=57.4, S.D.=25.7
• 0=4.6%, 100=5.6%
[Figure: histogram of gorethrm (0 to 100); vertical axis: count]
George W. Bush Thermometer
• Mean=56.1, S.D.=24.9
• 0=4.4%, 100=4.7%
[Figure: histogram of wtherm (0 to 100); vertical axis: count]
Clinton Thermometer
• Mean=55.2, S.D.=29.7
• 0=9.5%, 100=7.1%
[Figure: histogram of clinthrm (0 to 100); vertical axis: count]
For Next Time
• The Normal Distribution
• Bivariate Relationships
• Get stats assignments