Basics of Statistics & Normal Distribution


What Is Statistics?

• Collection of Data
– Survey
– Interviews

• Summarization and Presentation of Data
– Frequency Distribution
– Measures of Central Tendency and Dispersion
– Charts, Tables, Graphs

Statistical Methods

• Descriptive Statistics

• Inferential Statistics

Key Terms

• 1. Population (Universe)
– All Items of Interest

• 2. Sample
– Portion of Population

• 3. Parameter
– Summary Measure about Population

• 4. Statistic
– Summary Measure about Sample

• P in Population & Parameter

• S in Sample & Statistic
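The distinction between a parameter and a statistic can be sketched in a few lines of Python. The population below is simulated and purely hypothetical (its size, mean, and spread are assumptions made for illustration); the point is only that the same summary measure is a parameter when computed on every item and a statistic when computed on a portion.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: all N = 1000 items of interest.
population = [random.gauss(50, 10) for _ in range(1000)]

# Parameter: summary measure about the population.
mu = statistics.fmean(population)

# Sample: a portion of the population (n = 30, drawn without replacement).
sample = random.sample(population, 30)

# Statistic: the same summary measure, computed about the sample.
x_bar = statistics.fmean(sample)

print(f"parameter mu = {mu:.2f}")
print(f"statistic x_bar = {x_bar:.2f}")
```

The two values will usually be close but not equal, which is why inferential statistics is needed to reason from a statistic back to a parameter.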

Statistical Computer Packages

• 1. Typical Software
– SAS
– SPSS
– MINITAB
– Excel

• 2. Need Statistical Understanding
– Assumptions
– Limitations

Standard Notation

Measure        Sample       Population
Mean           $\bar{X}$    $\mu$
Stand. Dev.    $S$          $\sigma$
Variance       $S^2$        $\sigma^2$
Size           $n$          $N$

Measures of Central Tendency for Ungrouped Data

Raw Data

Mean

• Measure of Central Tendency

• Most Common Measure

• Acts as ‘Balance Point’

• Affected by Extreme Values (‘Outliers’)

• Formula (Sample Mean):

$$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n} = \frac{X_1 + X_2 + \cdots + X_n}{n}$$
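As a minimal sketch of the sample mean formula, the helper below (the name `sample_mean` and the data values are hypothetical, chosen for illustration) sums the observations and divides by n. Adding one extreme value shows how strongly the mean reacts to an outlier.

```python
def sample_mean(values):
    """Sample mean: sum of the observations divided by n."""
    n = len(values)
    if n == 0:
        raise ValueError("the mean of an empty data set is undefined")
    return sum(values) / n


data = [2, 4, 6, 8, 10]            # hypothetical raw data
print(sample_mean(data))           # 6.0

with_outlier = data + [100]        # same data plus one extreme value
print(sample_mean(with_outlier))   # about 21.67, pulled toward the outlier
```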

Advantages of the Mean

• Most widely used

• Every item taken into account

• Determined algebraically and amenable to algebraic operations

• Can be calculated on any set of numerical data (interval and ratio scale); always exists

• Unique

• Relatively reliable

Disadvantages of the Mean

• Affected by outliers

• Cannot be used with open-ended classes of a frequency distribution

Median

• Measure of Central Tendency

• Middle Value In Ordered Sequence
– If Odd n, Middle Value of Sequence
– If Even n, Average of 2 Middle Values

• Not Affected by Extreme Values

• Position of Median in Sequence:

$$\text{Positioning Point} = \frac{n + 1}{2}$$
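The positioning rule can be sketched as follows. The function name and data are hypothetical; note that the positioning point (n + 1) / 2 counts from 1, while Python lists are indexed from 0, so the code uses `n // 2` to reach the same element.

```python
def median(values):
    """Median of ungrouped data via the positioning point (n + 1) / 2."""
    ordered = sorted(values)        # the median requires an ordered array
    n = len(ordered)
    if n == 0:
        raise ValueError("the median of an empty data set is undefined")
    mid = n // 2                    # 0-based index of position (n + 1) / 2 for odd n
    if n % 2 == 1:
        return ordered[mid]         # odd n: the middle value
    return (ordered[mid - 1] + ordered[mid]) / 2   # even n: average of 2 middle values


print(median([7, 1, 5, 3, 9]))        # 5
print(median([7, 1, 5, 3, 9, 100]))   # 6.0, barely moved by the extreme value 100
```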

Advantages of the Median

• Unique

• Unaffected by outliers and skewness

• Easily understood

• Can be computed for open-ended classes of a frequency distribution

• Always exists on ungrouped data

• Can be computed on ratio, interval and ordinal scales

Disadvantages of Median

• Requires an ordered array

• No arithmetic properties

Mode

• Measure of Central Tendency

• Value That Occurs Most Often

• Not Affected by Extreme Values

• May Be No Mode or Several Modes

• May Be Used for Numerical & Categorical Data
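A short sketch of these properties, using only the standard library. The helper name `modes` is hypothetical, and it adopts one common convention as an assumption: when several distinct values all occur equally often, the data set is treated as having no mode.

```python
from collections import Counter


def modes(values):
    """Return every value that occurs most often (possibly none or several)."""
    counts = Counter(values)
    if not counts:
        return []
    top = max(counts.values())
    # Assumed convention: if all distinct values tie, report no mode.
    if len(counts) > 1 and all(c == top for c in counts.values()):
        return []
    return [v for v, c in counts.items() if c == top]


print(modes([1, 2, 2, 3, 3, 4]))                 # [2, 3]  several modes
print(modes([1, 2, 3]))                          # []      no mode
print(modes(["red", "blue", "red", "green"]))    # ['red'] categorical data
```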

Advantages of Mode

• Easily understood

• Not affected by outliers

• Useful with qualitative problems

• May indicate a bimodal distribution

Disadvantages of Mode

• May not exist

• Not unique

• No arithmetic properties

• Least accurate

Relationship among Mean, Median, & Mode

• If a distribution is symmetrical, the mean, median and mode coincide.

• If a distribution is not symmetrical and is skewed to the left or to the right, the three measures differ.

[Figure: a positively skewed distribution (“skewed to the right”), with the mean, median, and mode marked]

[Figure: a negatively skewed distribution (“skewed to the left”), with the mean, median, and mode marked]
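The way the three measures separate under skew can be illustrated with two small, hypothetical data sets. This is a sketch of the typical pattern (the long tail pulls the mean furthest, the median less, the mode least); it is a common rule of thumb, not a guarantee for every skewed data set.

```python
import statistics

# Hypothetical data with a long right tail, and its mirror image.
right_skewed = [1, 2, 2, 2, 3, 3, 4, 5, 9, 15]
left_skewed = [16 - x for x in right_skewed]

for name, data in (("right-skewed", right_skewed), ("left-skewed", left_skewed)):
    print(
        name,
        "mode =", statistics.mode(data),
        "median =", statistics.median(data),
        "mean =", statistics.mean(data),
    )
```

For the right-skewed list the output runs mode 2, median 3.0, mean 4.6; for the mirrored list the order reverses to mean 11.4, median 13.0, mode 14.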

Measures of Dispersion for Ungrouped Data

Range

• Measure of Dispersion

• Difference Between Largest & Smallest Observations

$$\text{Range} = X_{\text{largest}} - X_{\text{smallest}}$$
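A minimal sketch of the range with hypothetical data. Because only the two extreme observations enter the calculation, a single unusual value changes the result completely.

```python
def data_range(values):
    """Range: largest observation minus smallest observation."""
    if not values:
        raise ValueError("the range of an empty data set is undefined")
    return max(values) - min(values)


data = [12, 15, 11, 18, 14]        # hypothetical raw data
print(data_range(data))            # 7
print(data_range(data + [60]))     # 49, driven entirely by one extreme value
```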

The Root Of All Process EVIL

“VARIATION”

What is the standard deviation?

• The SD says how far away numbers on a list are from their average.

• Most entries on the list will be somewhere around one SD away from the average. Very few will be more than two or three SDs away.
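The "within one, two, or three SDs" idea can be checked directly on a list. The sketch below uses hypothetical data and, as an assumption, the population form of the SD (dividing by N), since it describes the list itself rather than estimating from a sample.

```python
import statistics


def share_within(values, k):
    """Fraction of entries no more than k SDs away from the average."""
    average = statistics.fmean(values)
    sd = statistics.pstdev(values)   # assumption: SD of the list itself (divide by N)
    inside = [v for v in values if abs(v - average) <= k * sd]
    return len(inside) / len(values)


data = [2, 4, 4, 4, 5, 5, 7, 9]      # hypothetical list: average 5, SD 2
print(share_within(data, 1))         # 0.75
print(share_within(data, 2))         # 1.0
```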

Variance & Standard Deviation

• Measures of Dispersion

• Most Common Measures

• Consider How Data Are Distributed

• Show Variation About Mean ($\bar{X}$ or $\mu$)

What is the standard deviation?

• Same means, different standard deviations

[Figure: two distributions with the same mean and different SDs]

Sample Standard Deviation Formula (Computational Version)

$$s = \sqrt{\frac{\sum (X^2) - n(\bar{X})^2}{n - 1}}$$
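As a sketch, the computational version can be coded directly and compared against the standard library's `statistics.stdev`, which also divides by n - 1. The function name and data are hypothetical. One caveat worth keeping in mind: this form subtracts two potentially large numbers, so in floating-point arithmetic it can lose precision when the mean is large relative to the spread.

```python
import math
import statistics


def sample_sd_computational(values):
    """Sample SD via sqrt((sum of X^2 - n * mean^2) / (n - 1))."""
    n = len(values)
    if n < 2:
        raise ValueError("the sample SD needs at least two observations")
    x_bar = sum(values) / n
    sum_of_squares = sum(x * x for x in values)
    return math.sqrt((sum_of_squares - n * x_bar ** 2) / (n - 1))


data = [2, 4, 4, 4, 5, 5, 7, 9]            # hypothetical sample
print(sample_sd_computational(data))       # sqrt(32 / 7), about 2.138
print(statistics.stdev(data))              # library result for comparison
```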

Population Mean

$$\mu = \frac{\sum x}{N}$$

Population Standard Deviation

$$\sigma = \sqrt{\frac{\sum (x - \mu)^2}{N}}$$
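A minimal sketch of the two population formulas, treating a small hypothetical list as the entire population. The divisor is N here, not n - 1, which is the difference from the sample version; `statistics.pstdev` is the standard library's equivalent.

```python
import math
import statistics


def population_mean(values):
    """mu: sum of all x divided by N."""
    return sum(values) / len(values)


def population_sd(values):
    """sigma: square root of the mean squared deviation about mu."""
    mu = population_mean(values)
    return math.sqrt(sum((x - mu) ** 2 for x in values) / len(values))


population = [2, 4, 4, 4, 5, 5, 7, 9]     # hypothetical: every item of interest
print(population_mean(population))         # 5.0
print(population_sd(population))           # 2.0
print(statistics.pstdev(population))       # library result for comparison
```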

Coefficient of Variation

• 1. Measure of Relative Dispersion

• 2. Always a %

• 3. Shows Variation Relative to Mean

• 4. Used to Compare 2 or More Groups

• 5. Formula (Sample):

$$CV = \frac{S}{\bar{X}} \cdot 100\%$$
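Comparing groups measured in different units is where the coefficient of variation is useful. The sketch below applies the sample formula to two hypothetical data sets (the names and values are assumptions for illustration); the raw standard deviations are not comparable across centimetres and kilograms, but the percentages are.

```python
import statistics


def coefficient_of_variation(values):
    """Sample CV: (S / X-bar) * 100, expressed as a percentage."""
    x_bar = statistics.fmean(values)
    if x_bar == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return statistics.stdev(values) / x_bar * 100


heights_cm = [160, 165, 170, 175, 180]   # hypothetical group 1
weights_kg = [55, 60, 70, 80, 85]        # hypothetical group 2

print(f"heights CV = {coefficient_of_variation(heights_cm):.2f}%")
print(f"weights CV = {coefficient_of_variation(weights_kg):.2f}%")
```

With these values the weights show roughly four times the relative variation of the heights.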

Coefficient of Variation

• 1. Measure of relative dispersion

• 2. Always a %

• 3. Shows variation relative to mean

• 4. Used to compare 2 or more groups

• 5. Formula:

Population: $CV = \frac{\sigma}{\mu} (100)$

Sample: $CV = \frac{s}{\bar{x}} (100)$

Summary of Variation Measures

Measure                            Equation                                        Description
Range                              $x_{largest} - x_{smallest}$                    Total Spread
Interquartile Range                $Q_3 - Q_1$                                     Spread of Middle 50%
Standard Deviation (Sample)        $\sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$     Dispersion about Sample Mean
Standard Deviation (Population)    $\sqrt{\frac{\sum (x - \mu)^2}{N}}$             Dispersion about Population Mean
Variance (Sample)                  $\frac{\sum (x - \bar{x})^2}{n - 1}$            Squared Dispersion about Sample Mean
Coeff. of Variation                $\frac{s}{\bar{x}} (100)$                       Relative Variation
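The measures in the summary table can be gathered into one helper, sketched here with Python's standard `statistics` module. The function name, dictionary keys, and data are hypothetical. Quartiles have several competing definitions; this sketch simply uses the default method of `statistics.quantiles`, so the interquartile range may differ slightly from a textbook or spreadsheet value.

```python
import statistics


def variation_summary(values):
    """Collect the variation measures from the summary table for one data set."""
    q1, _, q3 = statistics.quantiles(values, n=4)   # default quartile convention
    x_bar = statistics.fmean(values)
    s = statistics.stdev(values)
    return {
        "range": max(values) - min(values),          # total spread
        "iqr": q3 - q1,                              # spread of middle 50%
        "sd_sample": s,                              # divides by n - 1
        "sd_population": statistics.pstdev(values),  # divides by N
        "variance_sample": statistics.variance(values),
        "cv_percent": s / x_bar * 100,               # relative variation
    }


data = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical data
for measure, value in variation_summary(data).items():
    print(f"{measure}: {value:.3f}")
```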

Also known as the Empirical Rule
