Fun with Statistics Workshop


Temasek Polytechnic • School of Informatics & IT

How the workshop works

• 3 hours @ Temasek Polytechnic

– First 1.5 hours: workshop on statistics

– Next 1.5 hours: create an interesting infographic that tells a story using the data


Why attend this workshop?

• Learn key statistics concepts that will help you make better decisions

• Pick up useful Microsoft Excel skills

• Win attractive prizes!


Stats in Action!


Examples of Infographics



What is Statistics?

Statistics is the study of the collection, organization, analysis, and interpretation of data.


Why Study Statistics?

1. Numbers are everywhere!

2. Statistical techniques are used to make decisions that affect our daily lives

3. How can stats affect you?


Types of Statistics

Statistics

• Descriptive Statistics

• Inferential Statistics


Definition

Descriptive Statistics – Methods of organizing, summarizing, and presenting data in an informative way. Examples: measures such as the mean, median, and mode; frequency distribution tables; charts (bar chart, line chart); and graphs (histogram, box-and-whisker plot).

Inferential Statistics – Methods used to estimate a property of a population on the basis of a sample. Example: sampling.


Which type of statistics is involved when ...

• a research firm observes that women are twice as likely as men to shop impulsively?

ANSWER: Inferential Statistics

• an accountant observes that the current year’s total sales of $60 million represent a 20% increase compared to last year’s total sales?

ANSWER: Descriptive Statistics


Population and Sample

Population refers to a set or collection of all possible observations of some specific characteristic.

Sample refers to a portion of the population.
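To make the distinction concrete, here is a minimal Python sketch. The allowance figures are hypothetical (they are not taken from the workshop material), and the sample size of 4 is an arbitrary choice. It draws a random sample from a small population and compares the two means, which is the basic idea behind estimating a population property from a sample.

```python
import random

# Hypothetical population: weekly allowances ($) of ALL 10 students in a class
population = [12, 15, 20, 8, 10, 25, 18, 30, 5, 16]

# A sample is a portion of the population; here 4 students picked at random.
rng = random.Random(42)          # fixed seed so the run is repeatable
sample = rng.sample(population, k=4)

population_mean = sum(population) / len(population)
sample_mean = sum(sample) / len(sample)

print("Population mean:", population_mean)
print("Sample:", sample)
print("Sample mean (an estimate of the population mean):", sample_mean)
```

A different seed gives a different sample, so the sample mean will usually be close to, but not exactly equal to, the population mean.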


Population or Sample?

(a) Weekly allowances of all students of Singapore Primary School.

(b) A group of 5 employees selected from Ace Company Ltd to represent it at a conference.

(c) Volume of sales generated by all 30 salesmen of Trustworthy Trading.

(d) Monthly expenditure on clothes for 50 persons.

(e) Life span of 100 batteries manufactured by Everlasting Company.

(f) Salaries of all employees of Raffles Bank.


Types of Variables

• Qualitative (categorical responses)

• Quantitative (numerical responses)


Variable Definition

1. Qualitative variable: When the characteristic being studied is non-numeric, it is called a qualitative variable. Examples: gender, state, country. A qualitative variable is discrete.

2. Quantitative variable: When the variable studied can be reported numerically, it is called a quantitative variable. Examples: age, amount, number of children. A quantitative variable can be either discrete or continuous.

(a) Discrete variable: takes individually separate and distinct values. It can only assume certain values, and there are usually “gaps” between values. Examples: children in a family, number of students, number of employees.

(b) Continuous variable: can assume any value within a specified range. Examples: amount, height, temperature.


Classify each variable as Qualitative (Discrete), Quantitative Discrete, or Quantitative Continuous:

(a) Records of the number of days of sick leave taken by employees.

(b) Number of vehicles sold this month in Singapore.

(c) Time taken to travel from home to TP.

(d) Amount of rainfall in the month of January.

(e) Total diameter of rings produced.

(f) Types of credit card used (Master, Visa, Diners).

(g) Monthly income of employees.


Levels of Measurement

There are four levels of data:

• Nominal

• Ordinal

• Interval

• Ratio


Definition

1. Nominal: variables that are classified into categories for which any ordering is meaningless. Examples: race, gender, religious affiliation.

Nominal level variables must be:

(a) Mutually exclusive: an individual object can belong to only one category at a time. Can you be both F and M?

(b) Exhaustive: each individual object must belong to one of the categories, e.g. either F or M.


Definition

2. Ordinal: Ordinal level variables are arranged in some order, and the categories have some relationship among them. Examples: student’s grade, customer’s rating, military rank.


Definition

3. Interval: Similar to the ordinal level, but there is a meaningful difference between values. 0 ≤ x ≤ 1 is an interval which contains 0 and 1, as well as all numbers between them.

Examples: temperature, dress size, time.


Definition

4. Ratio: Similar to the interval level, but has an absolute zero (0). Practically all quantitative data is recorded at the ratio level of measurement. Examples: number of employees, distance.


M.E.A.N M.E.D.I.A.N

M.O.D.E


Mean

• The average value of the data set.

• The most important and most frequently used measure of central tendency.

• Computed as the sum of all observed values divided by the total number of observations.


Example

The following shows the net profits of 12 branches of Evergreen Florist Shop on Mother’s Day.

Net Profits ($)

903 1745 1883 863 1204 1624
1698 957 1041 1138 1354 1802

Compute the mean net profit, assuming that the data are from a population.


Solution

Population Data

The population mean is the sum of all observed values divided by the population size:

\[
\mu = \frac{\sum X}{N}
    = \frac{903 + 1745 + 1883 + 863 + 1204 + 1624 + 1698 + 957 + 1041 + 1138 + 1354 + 1802}{12}
    = \frac{16212}{12}
    = \$1{,}351
\]
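The same arithmetic can be checked with a few lines of Python. This is only a sketch of the calculation above; the variable names are illustrative and not part of the workshop material.

```python
# Net profits ($) of the 12 Evergreen Florist branches, treated as a population
net_profits = [903, 1745, 1883, 863, 1204, 1624,
               1698, 957, 1041, 1138, 1354, 1802]

total = sum(net_profits)             # sum of all observed values
population_size = len(net_profits)   # N
population_mean = total / population_size

print(total, population_size, population_mean)  # 16212 12 1351.0
```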


Median

The middle value in the data set.

Compute the median for the following odd number of observations.

Net Profits ($) of Evergreen Florist

903 1745 1883 863 1204 1624
1698 957 1041 1138 1354

First arrange the data in an array (in ascending order):

863 903 957 1041 1138 1204 1354 1624 1698 1745 1883

\[
\text{Median} = \left(\frac{n + 1}{2}\right)^{\text{th}} \text{ observation in data array}
             = \left(\frac{11 + 1}{2}\right)^{\text{th}} \text{ observation}
             = 6^{\text{th}} \text{ observation}
             = \$1{,}204
\]


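Example

Compute the median for the following even number of observations.

Net Profits ($) of Evergreen Florist

903 1745 1883 863 1204 1624
1698 957 1041 1138 1354 1895

First arrange the data in an array (in ascending order):

863 903 957 1041 1138 1204 1354 1624 1698 1745 1883 1895

\[
\text{Median} = \left(\frac{n + 1}{2}\right)^{\text{th}} \text{ observation in data array}
             = \left(\frac{12 + 1}{2}\right)^{\text{th}} \text{ observation}
             = 6.5^{\text{th}} \text{ observation}
\]

The 6.5th position lies halfway between the 6th and 7th observations, so the median is their average:

\[
\text{Median} = \frac{1204 + 1354}{2} = \$1{,}279
\]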
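Both median examples (odd and even number of observations) can be reproduced with a short Python sketch. The function name `median` here is our own illustrative helper, written out to mirror the steps on the slides: sort the data, then take the middle observation or the average of the two middle ones.

```python
def median(values):
    """Middle value of the sorted data; average of the two middle values if n is even."""
    ordered = sorted(values)          # arrange the data in ascending order
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]           # the ((n + 1) / 2)th observation
    return (ordered[mid - 1] + ordered[mid]) / 2


odd_data = [903, 1745, 1883, 863, 1204, 1624, 1698, 957, 1041, 1138, 1354]
even_data = odd_data + [1895]

print(median(odd_data))    # 1204
print(median(even_data))   # 1279.0
```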


Mode

The value that occurs most frequently.

Determine the mode for the following data:

$100,000 $5,000 $10,000 $20,000 $30,000 $50,000 $100,000

Since the value $100,000 occurs most frequently,

Mode = $100,000


Example

No Mode
Raw data: 8 6 7 9 2 5
Answer: No Mode

One Mode
Raw data: 8 8 7 9 2 8
Answer: 8

More Than One Mode
Raw data: 8 8 7 9 2 9
Answer: 8 and 9
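The three cases above can be sketched in Python by counting how often each value occurs. The helper `modes` is a hypothetical name, and treating "every value occurs exactly once" as "no mode" is an assumption chosen to match the slide's first answer.

```python
from collections import Counter


def modes(values):
    """Return the most frequent value(s), or an empty list if no value repeats."""
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        return []                     # every value occurs once: no mode
    return sorted(v for v, c in counts.items() if c == highest)


print(modes([8, 6, 7, 9, 2, 5]))   # []      -> no mode
print(modes([8, 8, 7, 9, 2, 8]))   # [8]     -> one mode
print(modes([8, 8, 7, 9, 2, 9]))   # [8, 9]  -> more than one mode
```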


Comparison of Mean, Median & Mode

[Figure: three distribution curves with the positions of the mode, median, and mean marked]

• Symmetrical Distribution or Normal Distribution: Mean = Median = Mode

• Distribution Skewed to Right or Positively Skewed: Mean > Median

• Distribution Skewed to Left or Negatively Skewed: Mean < Median

For skewed distributions, the MEDIAN is the best measure as it lies between the mean and mode.
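A small Python sketch can illustrate the positively skewed case. The data below are hypothetical (not from the workshop): one very large value pulls the mean above the median, while the mode stays at the most common small value.

```python
from statistics import mean, median, mode

# Hypothetical right-skewed data: most values are small, one is very large
values = [2, 3, 3, 4, 5, 6, 20]

print(mode(values), median(values), round(mean(values), 2))  # 3 4 6.14
```

Here the median (4) sits between the mode (3) and the mean (about 6.14), which is why it is often preferred for skewed data.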


Range

The following shows the net profits of 12 branches of Evergreen Florist Shop on Mother’s Day.

903 1745 1883 863 1204 1624
1698 957 1041 1138 1354 1802

Find the range for the net profit.

Range = Largest Value - Smallest Value

Range = 1883 - 863 = $1,020
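As a quick check, the range can be computed in Python with the built-in `max` and `min` functions. The variable names are illustrative only.

```python
# Net profits ($) of the 12 Evergreen Florist branches
net_profits = [903, 1745, 1883, 863, 1204, 1624,
               1698, 957, 1041, 1138, 1354, 1802]

largest = max(net_profits)
smallest = min(net_profits)
value_range = largest - smallest      # Range = Largest Value - Smallest Value

print(largest, smallest, value_range)  # 1883 863 1020
```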


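1. Variance

The average of the squared distances of the observations from the mean.

Population Variance

\[
\sigma^2 = \frac{\sum (X - \mu)^2}{N} = \frac{\sum X^2}{N} - \mu^2
\]

Sample Variance

\[
s^2 = \frac{\sum (x - \bar{x})^2}{n - 1} = \frac{\sum x^2 - n\bar{x}^2}{n - 1}
\]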


What’s the difference?

What is the difference between the 3 curves (Curve A, Curve B, and Curve C)?

They have the same mean but different amounts of spread (variability).

So how far is each data value from the mean?


Standard Deviation

• The most important and most commonly used measure of dispersion.

• Defined as the square root of the variance, i.e. the square root of the average of the squared distances (deviations) of the observations from the mean.
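The definitions of variance and standard deviation can be tried out on the Evergreen Florist net profits used earlier in the workshop. The sketch below is an illustration, not part of the original slides: it computes the population variance directly from the definition, takes its square root for the standard deviation, and also shows the sample version, which divides by n - 1 instead of N.

```python
from math import sqrt

# Net profits ($) of the 12 Evergreen Florist branches
net_profits = [903, 1745, 1883, 863, 1204, 1624,
               1698, 957, 1041, 1138, 1354, 1802]

n = len(net_profits)
mean_profit = sum(net_profits) / n

# Sum of squared distances of each observation from the mean
squared_deviations = sum((x - mean_profit) ** 2 for x in net_profits)

population_variance = squared_deviations / n        # divide by N
population_sd = sqrt(population_variance)           # square root of the variance

sample_variance = squared_deviations / (n - 1)      # divide by n - 1
sample_sd = sqrt(sample_variance)

print("Population variance:", round(population_variance, 2))
print("Population standard deviation:", round(population_sd, 2))
print("Sample variance:", round(sample_variance, 2))
print("Sample standard deviation:", round(sample_sd, 2))
```

Python's standard `statistics` module offers `pvariance`, `pstdev`, `variance`, and `stdev` for the same quantities; writing the sums out by hand simply keeps the link to the definition visible.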


Graphical Presentations of Data


Histogram

• A graphical presentation of a frequency distribution.

• Constructed by (i) marking class intervals on the x-axis, and (ii) drawing rectangles whose heights correspond to the class frequencies.


Histogram with 1 category

Sales      10 to 19  20 to 29  30 to 39  40 to 49  50 to 59  60 to 69
Frequency  1         4         3         5         9         8
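To see how a frequency table turns into a histogram, here is a minimal Python sketch that draws a text version of the chart, with one bar of `#` marks per class interval. The helper name `text_histogram` and the bar style are illustrative choices; in the workshop itself the chart would be drawn in Excel.

```python
classes = ["10 to 19", "20 to 29", "30 to 39", "40 to 49", "50 to 59", "60 to 69"]
frequencies = [1, 4, 3, 5, 9, 8]


def text_histogram(labels, counts):
    """Return one line per class: the label, a bar of '#' marks, and the count."""
    return [f"{label:<9}| {'#' * count} ({count})"
            for label, count in zip(labels, counts)]


for line in text_histogram(classes, frequencies):
    print(line)
```

Each bar's length corresponds to the class frequency, just as the rectangle heights do in a drawn histogram.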


Histogram with 2 categories (Grouped)

Sunshine hours  Jan  Feb  Mar  Apr  May  Jun  Jul  Aug  Sep  Oct  Nov  Dec
London           52   71  113  153  204  204  205  195  148  111   69   48
Barcelona       146  156  187  204  248  267  308  270  207  181  145  143


Histogram with 4 categories (Stacked)

Sunshine hours  Jan  Feb  Mar  Apr  May  Jun  Jul  Aug  Sep  Oct  Nov  Dec
London           52   71  113  153  204  204  205  195  148  111   69   48
Barcelona       146  156  187  204  248  267  308  270  207  181  145  143


Frequency Polygon (Line Chart)

Formed by letting the midpoint of each class represent the data in that class and then connecting the sequence of midpoints at their respective frequencies.

Sales (in $’000)  Frequency (No. of Days)
10 – 19           1
20 – 29           4
30 – 39           3
40 – 49           5
50 – 59           9
60 – 69           8

[Figure: Histogram showing daily sales turnover. x-axis: Sales Turnover (in $'000), classes 10 to 19 through 60 to 69; y-axis: Number of Days, 0 to 10]

[Figure: Frequency polygon showing daily sales turnover, over the same classes]
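The points of a frequency polygon can be worked out from the class limits. The sketch below (illustrative names, not from the slides) computes each class midpoint as the average of its lower and upper limits and pairs it with the class frequency; joining these points in order gives the polygon.

```python
# Class limits for daily sales (in $'000) and the number of days in each class
class_limits = [(10, 19), (20, 29), (30, 39), (40, 49), (50, 59), (60, 69)]
frequencies = [1, 4, 3, 5, 9, 8]

# Midpoint of a class = (lower limit + upper limit) / 2
midpoints = [(lower + upper) / 2 for lower, upper in class_limits]

# Each (midpoint, frequency) pair is one point on the frequency polygon
polygon_points = list(zip(midpoints, frequencies))

print(midpoints)        # [14.5, 24.5, 34.5, 44.5, 54.5, 64.5]
print(polygon_points)
```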


Line chart with 1 category

Year            1870  1913  1950    1973    2003
Western Europe  $367  $902  $1,396  $4,096  $7,857


Line chart with 6 categories

Year            1870  1913  1950    1973    2003
Western Europe  $367  $902  $1,396  $4,096  $7,857
USA             $98   $517  $1,455  $3,536  $8,430
Japan           $25   $71   $160    $1,242  $2,699
China           $189  $241  $244    $739    $6,187
India           $134  $204  $222    $494    $2,267
Africa          $45   $79   $203    $549    $1,322


Area chart with 6 categories

Year            1870  1913  1950    1973    2003
Western Europe  $367  $902  $1,396  $4,096  $7,857
USA             $98   $517  $1,455  $3,536  $8,430
Japan           $25   $71   $160    $1,242  $2,699
China           $189  $241  $244    $739    $6,187
India           $134  $204  $222    $494    $2,267
Africa          $45   $79   $203    $549    $1,322


Pie Chart

• A circular display divided into sections based on the number of observations.

• Useful in showing proportional relationships, such as market share and budgets.

\[
\text{Number of degrees for each category} = \text{Relative value of the category} \times 360^\circ
\]

Ethnic Group  Frequency  Degrees
Chinese       2240       269°
Malay         400        48°
Indian        230        28°
Others        130        15°
Total         3000       360°
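The degrees for each sector follow directly from the formula: divide the category's frequency by the total to get its relative value, then multiply by 360. Here is a minimal Python sketch with illustrative variable names; it prints the unrounded angles to one decimal place, whereas the table lists whole degrees.

```python
# Frequencies by ethnic group for residents of ABC New Town
frequencies = {"Chinese": 2240, "Malay": 400, "Indian": 230, "Others": 130}

total = sum(frequencies.values())     # 3000 residents in all

# Degrees for a category = (frequency / total) * 360
degrees = {group: count / total * 360 for group, count in frequencies.items()}

for group, angle in degrees.items():
    print(f"{group:<8} {frequencies[group]:>5} {angle:6.1f} degrees")
```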


Example

Pie Chart Showing the Ethnic Composition of Residents in ABC New Town

[Figure: pie chart with sectors Chinese (2240), Malay (400), Indian (230), and Others (130)]


Pictogram

A display that uses pictures or symbols to represent frequencies.

Pictogram Showing the Ethnic Composition of Residents in ABC New Town

[Figure: pictogram in which one symbol = 100 residents; Chinese 2240, Malay 400, Indian 230, Others 130]


Scatter & Bubble Plot

Showing relation based on 2 dimensions


Infographic (http://infogr.am/beta/)


Thanks!