Statistical Methods for Engineering

SCHOOL OF ENGINEERING

DIPLOMA IN INDUSTRIAL & OPERATIONS MANAGEMENT

DIPLOMA IN SUPPLY CHAIN MANAGEMENT

DIPLOMA IN CIVIL AVIATION

P01 – SUMMARIZE AND PRESENT

E214 : STATISTICAL METHODS FOR ENGINEERING

Copyright © 2009 School of Engineering, Republic Polytechnic, Singapore

All rights reserved. No part of this document may be reproduced, stored in a retrieval

system, or transmitted, in any form or by any means, electronic, mechanical,

photocopying, recording or otherwise, without the prior permission of the School of

Engineering, Republic Polytechnic, Singapore.

Summarize and Present

It is reported that male polytechnic students have a higher failure rate in the National Physical Fitness Award (Napfa) test than JC students. This is despite the fact that many polytechnic students play sports or exercise at least once a week. To help male students pass the Napfa test, your school has introduced an exercise program to improve their fitness level. Attached are some data from the exercise program:

Data.xls

You are asked to summarize the data and present them in a form that is useful and easy to understand at a glance. How would you describe and measure the different sets of data? What display graphs or tools can you use?

What is Statistics?

• Statistics provides a basis for assessing and drawing a conclusion.

• Statistics plays a critical role in the improvement of the quality of any product or service. It enables engineers to understand phenomena subject to variation and to effectively predict or control them.

• Basic idea behind all statistical methods of data analysis is to make inferences about a population by studying a relatively small sample chosen from it.

• Everything dealing with the collection, processing, analysis, and interpretation of numerical data belongs to the domain of statistics.

Descriptive Vs Inferential Statistics

Descriptive

• Enable understanding of important features or provide insight of data through the use of values and graphical presentations

• Purpose is to organize and summarize the data collected in some meaningful forms or measures that are easily understood

• Examples:

Charts, graphs, plots, measures of mean, median, frequency, standard deviation.

Inferential

Consists of:

• Making claims about population from data collected in sample

• Performing estimations about population characteristics and making predictions

• Determining relationships among variables

• Examples:

Hypothesis Testing, ANOVA, correlation analysis

Stem-and-Leaf Plot

• Simple way to summarize a data set

• Compact way to represent data, and provides some indication of its shape

• Stem-and-leaf plot displays all the sample values, but the order in which the items were sampled cannot be determined

• Example of a Stem-and-Leaf Plot

24 24 26 26 26 27 27 27 27 28 29 30 30

31 33 35 36 36 37 37 43 45 45 46 48 49

50 50 51 53 53 55 56 57 58 59 59 60 60

Stem | Leaf
2 | 44666777789
3 | 001356677
4 | 355689
5 | 00133567899
6 | 00
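The plot above can be rebuilt from the raw values with a few lines of Python. This is a minimal sketch, assuming two-digit values whose tens digit is the stem and whose units digit is the leaf; the function name `stem_and_leaf` is illustrative, not part of the course material.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Group values by tens digit (stem); collect units digits (leaves) in order."""
    plot = defaultdict(str)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        plot[stem] += str(leaf)
    return dict(plot)

# The 39 sample values listed on the slide.
data = [24, 24, 26, 26, 26, 27, 27, 27, 27, 28, 29, 30, 30,
        31, 33, 35, 36, 36, 37, 37, 43, 45, 45, 46, 48, 49,
        50, 50, 51, 53, 53, 55, 56, 57, 58, 59, 59, 60, 60]

plot = stem_and_leaf(data)
for stem in sorted(plot):
    print(stem, "|", plot[stem])
```

Every sample value stays visible in the output, which is what distinguishes this display from a histogram.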

Box Plot

• Box Plot presents the median, first and third quartiles, and outliers. It is used to compare samples.

• The box plot has two whiskers and two parts of the box, each representing one quarter of the data.

• ‘Whiskers’ extend from the top and bottom of the box and end at the most extreme data point that is not an outlier

• Interquartile range (IQR) is the difference between the third quartile and first quartile

• Outliers lie more than 1.5 IQR below the first quartile or more than 1.5 IQR above the third quartile

Anatomy of a Box Plot

[Figure: annotated box plot. Labelled parts: outliers (marked X); third quartile; median; first quartile; largest data point within 1.5 IQR of the third quartile; smallest data point within 1.5 IQR of the first quartile. Taken from Navidi W., Statistics for Engineers and Scientists.]

Distribution Shape and Box Plot

[Figure: three distributions (left-skewed, symmetric and right-skewed), each with its box plot and the positions of Q1, Q2 and Q3 marked.]

Histogram

• Most common form of graphical representation of frequency distribution

• Useful in displaying shape, location and variability of the data

• Emphasizes irregularities and unusual features

• Sometimes it can be enough to draw a histogram in order to solve an engineering problem

[Figure: histogram of frequency of pupils (vertical axis, 0 to 45) against marks (horizontal axis, classes 11-20, 21-30, 31-40, 41-50, 51-60, 61-70, 71-80, 81-90).]

Cumulative Frequency

A point on the horizontal axis of the cumulative frequency graph represents a possible data value.

The corresponding vertical value gives the number of data points whose values are less than or equal to it.

A cumulative frequency plot is called an ogive.

[Figure: cumulative frequency (vertical axis, 0 to 180) against marks (horizontal axis, 10 to 80).]
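A cumulative frequency column is a running total of the class frequencies. The sketch below uses hypothetical class boundaries and frequencies (the slide's underlying data are not available) to show how the points of an ogive could be computed.

```python
from itertools import accumulate

# Hypothetical data: upper class boundaries (marks) and class frequencies.
upper_bounds = [10, 20, 30, 40, 50, 60, 70, 80]
frequencies = [2, 5, 9, 14, 10, 6, 3, 1]

# Running total: number of observations less than or equal to each boundary.
cumulative = list(accumulate(frequencies))

for bound, total in zip(upper_bounds, cumulative):
    print(f"marks <= {bound}: {total}")
```

The last cumulative value always equals the sample size, which is a quick check on the tally.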

Pareto Diagram

• Orders each type of failure or defect according to its frequency

• Very useful in the analysis of defect data in manufacturing systems. Helps engineers identify important defects and their causes

• When a process is identified as a candidate for improvement, the first step is to collect data on the frequency of each type of failure and then present the data on a Pareto Diagram

• Always arrange categories in descending order of frequency of occurrence, that is, the most frequently occurring is on the left, followed by the next most frequently occurring

• The horizontal scale of a Pareto Diagram is usually categorical classifications

Pareto Diagram

The Pareto Diagram highlights the relatively few types of defects that are responsible for most of the observed defects.

Pareto diagram is an important part of a quality improvement program as it forces attention to the most critical defects.

Pareto diagram graphically depicts Pareto’s empirical law that any assortment of events consists of a few major and many minor elements. Typically, two or three elements will account for more than half the total frequency.

It is much easier to reduce or eliminate frequently occurring defects than rare ones.
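The ordering step behind a Pareto Diagram can be sketched as follows. The defect names and counts are hypothetical, chosen only to illustrate sorting categories by descending frequency and accumulating their share of the total.

```python
# Hypothetical defect tallies (category -> number of occurrences).
defects = {"scratch": 12, "misalignment": 45, "crack": 5,
           "discoloration": 30, "burr": 8}

# Most frequent category first, as on the horizontal axis of a Pareto Diagram.
ordered = sorted(defects.items(), key=lambda item: item[1], reverse=True)

total = sum(defects.values())
running = 0
rows = []  # (category, count, cumulative percentage)
for name, count in ordered:
    running += count
    rows.append((name, count, 100 * running / total))

for name, count, cum_pct in rows:
    print(f"{name:15s} {count:3d} {cum_pct:6.1f}%")
```

With these made-up numbers the first two categories account for 75% of all defects, the kind of concentration the diagram is meant to expose.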

Mean

• Defined as the sum of the observations divided by the sample size

• To emphasize that it is based on a set of observations, it is often referred to as the sample mean

• It indicates the center of the data

• Affected by extreme values (outliers)

[Figure: two dot plots. Left, on an axis from 0 to 10: Mean = 5. Right, on an axis from 0 to 14: Mean = 6.]

Median

• The median of a sample is the middle value after the data are arranged from smallest to largest

• It is not affected by extreme (very large or very small) values (outliers)

• If n numbers are ordered from smallest to largest:

– If n is odd, the median is the number in position (n+1)/2

– If n is even, the median is the average of the numbers in positions n/2 and n/2 + 1

[Figure: two dot plots. Left, on an axis from 0 to 10: Median = 5. Right, on an axis from 0 to 14: Median = 5.]

Mode

• Most frequently occurring value in a sample

• There may be no mode, or there may be several modes

• It is not affected by extreme values

[Figure: two dot plots. First, on an axis from 0 to 14: Mode = 9. Second, on an axis from 0 to 6: No Mode.]
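The contrast between the three measures of central tendency can be checked with Python's standard `statistics` module. The two small samples below are hypothetical; they were picked so that replacing one value with an extreme one moves the mean from 5 to 6 while the median stays at 5, mirroring the slides.

```python
import statistics

base = [3, 4, 5, 6, 7]           # hypothetical sample
with_outlier = [3, 4, 5, 6, 12]  # same sample with one extreme value

mean_base = statistics.mean(base)
mean_outlier = statistics.mean(with_outlier)
median_base = statistics.median(base)
median_outlier = statistics.median(with_outlier)

# multimode returns every value that ties for the highest count.
modes = statistics.multimode([7, 8, 9, 9, 10])
all_tied = statistics.multimode([0, 1, 2])  # each value occurs once

print(mean_base, mean_outlier, median_base, median_outlier, modes, all_tied)
```

Note that `multimode` reports all values when every value occurs equally often; the slides describe that situation as having no mode.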

Time Series Plot

• Also known as Line Graph or Run Chart

• Displays data in a time sequence for a given period of time

• Used to monitor whether there is a systematic change of the data over time (trend)

Scatter Diagram

• Different sets of data are plotted on different axes

• Shows whether a relationship exists between 2 sets of data, i.e. how much one is affected by the other

Shape of a Distribution

• Describes how data is distributed

• Measures of shape

– Symmetric or Skewed

Left-Skewed: Mean < Median < Mode

Symmetric: Mean = Median = Mode

Right-Skewed: Mode < Median < Mean

Quartiles and Percentiles

• The quartiles are the 25th, 50th and 75th percentiles

– First quartile Q1 = 25th percentile

– Second quartile Q2 = 50th percentile

– Third quartile Q3 = 75th percentile

• Second quartile is equal to the median

• Interquartile range = third quartile - first quartile

• Example: Percentiles are often used to interpret scores on standardized tests.

If a student is informed that her score is at the 70th percentile, this means that 70 percent of the students who took the test received lower scores.

Determining Quartiles and Percentiles

To calculate the sample 100p-th percentile:

1. Order the n observations from smallest to largest

2. Determine the product np

If np is not an integer, round it up to the next integer and find the corresponding ordered value.

If np is an integer, say k, calculate the mean of the k-th and the (k+1)-st ordered observations.

Example:

If n is 80, in order to find Q1, first obtain np

np = (80)(0.25)=20

Since np is an integer, Q1 is obtained by taking the

average of the 20th and 21st ordered observations.
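The np rule above translates directly into code. This is a sketch under stated assumptions: the percentile is passed as a whole number between 1 and 99 (so that np can be tested for integrality without floating-point error), and the sample of the values 1 to 80 is hypothetical, chosen so that the k-th ordered observation equals k.

```python
def percentile(data, percent):
    """Sample percentile by the np rule; percent is an integer from 1 to 99."""
    if not data or not 0 < percent < 100:
        raise ValueError("need non-empty data and 0 < percent < 100")
    ordered = sorted(data)
    # n * p written as k + remainder/100, with p = percent / 100.
    k, remainder = divmod(len(ordered) * percent, 100)
    if remainder:
        # np is not an integer: round up to k + 1 (0-based index k).
        return ordered[k]
    # np is the integer k: mean of the k-th and (k+1)-th ordered values.
    return (ordered[k - 1] + ordered[k]) / 2

sample = list(range(1, 81))   # hypothetical sample with n = 80
q1 = percentile(sample, 25)   # np = 20, so average of 20th and 21st values
q3 = percentile(sample, 75)
iqr = q3 - q1
print(q1, q3, iqr)
```

Other textbooks and software packages use slightly different interpolation rules, so results from library functions may differ from this rule for small samples.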

Position of Quartiles and Percentiles

[Figure: cumulative frequency graph with the 25th, 50th and 75th percentiles and the corresponding quartiles Q1, Q2 and Q3 marked.]

Dispersion

• Measures the spread of the values around the central tendency

• 2 common measures: range and standard deviation

• Standard deviation is an important measure of the variation in the data. You will learn more about it!

Today’s Problem

Conclusion

• Different graphical representations have different advantages

• Stem & Leaf Plot is a compact way to represent data, and provides some indication of its shape

• Box Plot presents the median, first and third quartiles, and outliers. It is used to compare samples

• Histogram is a common form of graphical representation of frequency distribution used for displaying shape, location and variability of the data

• A suitable graphical representation should be chosen depending on what you are interested to display

Learning Outcomes

• Differentiate between Descriptive and Inferential Statistics

• Select the Appropriate Data Display Tools

– Frequency of Occurrence (Pie Chart, Pareto Chart)

– Distribution of Data (Stem and Leaf Plot, Box Plot, Histogram Plot)

– Trends over Time (Time Series Plot)

– Association (Scatter Diagram)

• Summary Measurements

– Distribution

– Measure of Central Tendency (Mean, Median, Mode)

– Dispersion (Range, Standard Deviation)

– Quartiles and Percentiles

P02 – Describe it with Venn

Describe it with Venn

The Land Transport Authority of Singapore (LTA) wants to find out whether Republic Polytechnic students are adequately served by the public transport system. It plans to conduct a survey to find out the proportion of students who take public transport to school, as well as the number who get to school on time using public transport. The mode of transport (bus, train or both) should be indicated in the survey. How would you advise LTA to conduct this survey? When the survey is completed, how can the responses be analyzed using a Venn diagram to determine whether the public transport system serving Republic Polytechnic is satisfactory?

Sample Space and Events

• In statistics, the set of all possible outcomes of an experiment is called a sample space.

• Sample spaces are usually denoted by the letter S.

• In statistics, any subset of a sample space is called an event.

• A subset is any part of a set, including the whole set and the empty set (denoted by Ø).

• The empty set has no elements at all.

Sample Space and Events

An Example:

• Roll a die and observe the number obtained.

• In this example, rolling the die is the experiment.

• The only possible outcomes are 1, 2, 3, 4, 5 or 6.

• The event that the die comes up an even number is described as follows:

– The sample space for the experiment is S = {1, 2, 3, 4, 5, 6}

– Coming up an even number corresponds to the event Even = {2, 4, 6}

Mutually Exclusive Events

• Mutually exclusive events have no elements in common.

• For example, it is impossible for a coin to come up both heads and tails on the same toss.

• Such events are said to be mutually exclusive.

• The events A and B are said to be mutually exclusive if they have no outcomes in common.

• E.g. rain ('A') or no rain ('B') at 12 pm are mutually exclusive events

[Figure: Venn diagram of sample space S with events A and B.]

Union

• If A and B are any two sets in a sample space S, their union, denoted by A ∪ B, is the subset of S that contains all elements that are either in A, in B, or in both.

• In words, A ∪ B means “A and/or B”.

• E.g. the number of students clearing either PP ('A') or CE ('B') or both.

[Figure: Venn diagram of sample space S with events A and B.]

Intersection

• If A and B are any two sets in a sample space S, their intersection, denoted by A ∩ B, is the subset of S that contains all elements that are in both A and B.

• In words, A ∩ B means “A and B”.

• In the previous example, A ∩ B indicates the number of students clearing both CE and PP.

[Figure: Venn diagram of sample space S with events A and B.]

Complement

• The complement of an event A, denoted by Aᶜ, is the subset of S that contains all the elements of S that are not in A.

• In words, Aᶜ means “not A”.

• E.g. if event A is taking the bus, Aᶜ means all responses other than 'Bus', i.e. 'Train' or 'Both' or 'Others'

[Figure: Venn diagram of sample space S showing event A and its complement.]
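Sample spaces and events map naturally onto Python's built-in `set` type. The sketch below reuses the die example; the event `low` (a roll of 3 or less) is a hypothetical addition used only to make the union and intersection non-trivial.

```python
S = {1, 2, 3, 4, 5, 6}   # sample space for one roll of a die
even = {2, 4, 6}         # event: an even number comes up
odd = {1, 3, 5}          # event: an odd number comes up
low = {1, 2, 3}          # hypothetical event: the roll is 3 or less

union = even | low            # "even and/or low"
intersection = even & low     # "even and low"
complement = S - even         # "not even"
exclusive = even.isdisjoint(odd)  # True when no outcomes are shared

print(union, intersection, complement, exclusive)
```

Every event is a subset of S, including the empty set `set()` and S itself.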

Probability Theorems

Given a finite sample space S and an event A in S, we define P(A), the probability of A, to be a value of an additive set function that satisfies the following three conditions:

• Axiom 1: 0 ≤ P(A) ≤ 1 for each event A in S.

• Axiom 2: P(S) = 1.

• Axiom 3: If A and B are mutually exclusive events in S, then P(A ∪ B) = P(A) + P(B).

(An axiom is any starting assumption from which other statements are logically derived. It requires no proof.)

• Probability functions must be additive.

– If A1, A2, A3, …, An are mutually exclusive events in a sample space S, then

P(A1 ∪ A2 ∪ A3 ∪ … ∪ An) = P(A1) + P(A2) + P(A3) + … + P(An)

• For any event A, P(Aᶜ) = 1 – P(A)

Addition Rule

When A and B are non-mutually exclusive events in S,

P(A ∪ B) = P(A) + P(B) – P(A ∩ B)

When A and B are mutually exclusive events in S,

P(A ∪ B) = P(A) + P(B), since P(A ∩ B) = 0

[Figure: Venn diagram of sample space S with events A and B, showing A ∩ B and A ∪ B.]
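The addition rule can be verified on a small equally likely sample space. This sketch assumes a fair die and two illustrative events that are not taken from the slides; `prob` is a hypothetical helper that counts outcomes.

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}   # fair die: all outcomes equally likely (assumption)
A = {2, 4, 6}            # illustrative event: even number
B = {1, 2, 3}            # illustrative event: 3 or less

def prob(event):
    """Probability of an event as (favourable outcomes) / (all outcomes)."""
    return Fraction(len(event & S), len(S))

lhs = prob(A | B)                       # P(A or B), counted directly
rhs = prob(A) + prob(B) - prob(A & B)   # addition rule
p_not_a = prob(S - A)                   # complement rule check

print(lhs, rhs, p_not_a)
```

Subtracting P(A ∩ B) removes the outcome 2, which would otherwise be counted twice.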

Mutually Exclusive vs Independent

[Figure: Venn diagram with events A and B.]

If A and B are mutually exclusive events, then A ∩ B = Ø, so the probability of A occurring given that B has occurred is P(A|B) = 0, and likewise P(B|A) = 0.

Two events are considered to be independent if the occurrence of one is not affected by the occurrence or nonoccurrence of the other.

The Multiplication Rule below applies if and only if A and B are independent:

P(A ∩ B) = P(A).P(B)

Hence, P(A|B) = P(A) and P(B|A) = P(B)

Independent Events

• If A and B are independent, then the following pairs are also independent:

– A and Bᶜ, Aᶜ and B, and Aᶜ and Bᶜ

• An Example

A die is thrown twice. Find the probability of obtaining a 4 on the first throw and an odd number on the second throw.

Let A be the event 'a 4 is obtained on the 1st throw' => P(A) = 1/6

Let B be the event 'an odd number is obtained on the 2nd throw' => P(B) = 3/6 = 1/2, since B = {1, 3, 5}

Since the result of the 2nd throw is clearly not affected by the result of the 1st throw, A and B are independent events.

Hence, P(A ∩ B) = P(A).P(B) = 1/6 × 1/2 = 1/12
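The two-throw example can be confirmed by listing all 36 equally likely outcomes. This is an illustrative enumeration, assuming a fair die; the helper name `prob` is hypothetical.

```python
from fractions import Fraction
from itertools import product

# All ordered results of two throws: (first, second).
outcomes = list(product(range(1, 7), repeat=2))

A = {o for o in outcomes if o[0] == 4}        # 4 on the 1st throw
B = {o for o in outcomes if o[1] % 2 == 1}    # odd number on the 2nd throw

def prob(event):
    return Fraction(len(event), len(outcomes))

p_both = prob(A & B)
print(prob(A), prob(B), p_both)
```

The counted value of P(A ∩ B) equals the product P(A).P(B), which is the defining property of independent events.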

Today’s Problem

Survey Questions:

Question 1: What is your primary mode of transport to RP?

Response: Bus, Train, Both, Others

Question 2: Under normal circumstances, do you usually arrive in RP on time?

Response: Yes, No

Proposed Solution

[Figure: Venn diagram of sample space S with events A, B, C and D. The region counts shown are 9, 2, 1, 2, 0, 2, 5 and 4.]

Events

A : By Bus

B : By Train

C : By Bus and Train

Aᶜ ∩ Bᶜ ∩ Cᶜ : By Others

D : On Time

Dᶜ : Late

A ∩ Dᶜ : Late by Bus

B ∩ D : Punctual by Train

(A ∩ D) ∪ (B ∩ D) ∪ (C ∩ D) : Punctual by Public Transport

Analysis

• 16/25 or 64% of respondents take public transport to RP and arrive on time.

• It may be more meaningful to estimate the probability of students being on time given that they take public transport (16/23 ≈ 70%). This is known as conditional probability.

• A follow-up survey can be conducted to determine the reasons for arriving late by bus.

• A bigger sample size is needed to better represent the population that takes the bus to RP, and respondents should be selected randomly to avoid bias, e.g. respondents should have addresses in many parts of the island.

Learning Outcomes

• Concept of:

– Sample Space

– Events

– Mutually Exclusive Events

– Independent Events

– Unions, Intersections, and Complements

• Venn diagrams

• Additive Rules

• Multiplicative Rules

P03 – Dependent or Independent

Dependent or Independent

Having developed a microcontroller-based actuator that automatically lowers window shades in strong sunlight, Nathan knows he has to put the system through rigorous testing before he can unveil it. His ego was bruised when home tests conducted by his schoolmates revealed that the shades did not lower with every incidence of strong sunlight. Knowing that a reliability of 95 percent is the industry tolerance for microcontroller-based actuators, Nathan is determined to ascertain the reliability of his invention. He commissioned a laboratory to test the system over the course of 100 incidences of strong sunlight. Attached are the schematic of the system and the results of the tests conducted by the laboratory:

Schematic of the System and Test Data

Based on the data collected, help Nathan analyse the reliability of the system.

Dependent Events

• Two events are said to be dependent if the occurrence or outcome of the first event affects the probability of occurrence of the second event.

Probability of both events occurring: P(A ∩ B) = P(A).P(B|A) = P(B).P(A|B)

• Example

There are 2 red balls and 3 blue balls in a bag. If two balls are drawn at random without replacement, find the probability that both balls are red.

P(both balls red) = P(1st ball is red).P(2nd ball is red | 1st ball is red) = 2/5 × 1/4 = 1/10

The probability that the 2nd ball is red is clearly dependent on the result of the 1st draw.
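Drawing without replacement can be checked by enumerating every ordered pair of distinct balls. The labels `R1`, `B1` and so on are hypothetical tags used to tell the balls apart.

```python
from fractions import Fraction
from itertools import permutations

balls = ["R1", "R2", "B1", "B2", "B3"]   # 2 red and 3 blue balls

# Ordered draws of two different balls (no replacement): 5 x 4 = 20 cases.
draws = list(permutations(balls, 2))
both_red = [d for d in draws if d[0].startswith("R") and d[1].startswith("R")]

p_both_red = Fraction(len(both_red), len(draws))
print(p_both_red)
```

Only 2 of the 20 equally likely ordered draws are red followed by red, which agrees with 2/5 × 1/4.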

Conditional Probability

• The probability of event A occurring given that event B has already occurred is written as P(A|B)

[Figure: Venn diagram of sample space S with events A and B.]

P(A|B) = P(A∩B)/P(B) …….. (1)

P(B|A) = P(B∩A)/P(A) …….. (2)

Since P(A∩B) = P(B∩A), rearranging (1) and (2) for this common term gives

P(A|B).P(B) = P(B|A).P(A)

P(A|B) = [P(B|A).P(A)]/P(B) ------- BAYES’ RULE

Note that, in general, P(A|B) ≠ P(B|A).

Conditional Probability

Example

• Given that a heart is picked at random from a pack of 52 playing cards, find the probability that it is a picture card

• P(picture card | heart card)

= P(picture card ∩ heart card) / P(heart card)

= (3/52) / (13/52)

= 3/13
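The card example can be reproduced by building the deck and applying the definition of conditional probability. This sketch assumes that "picture card" means jack, queen or king, which matches the count of 3 picture hearts used above.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = set(product(ranks, suits))   # 52 (rank, suit) pairs

hearts = {card for card in deck if card[1] == "hearts"}
pictures = {card for card in deck if card[0] in {"J", "Q", "K"}}

def prob(event):
    return Fraction(len(event), len(deck))

# P(picture | heart) = P(picture and heart) / P(heart)
p_picture_given_heart = prob(pictures & hearts) / prob(hearts)
print(p_picture_given_heart)
```

Conditioning on "heart" shrinks the sample space from 52 cards to the 13 hearts, of which 3 are picture cards.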

Bayes’ Theorem: An Example

Andy, Ben and Carrie pack biscuits in a factory. From the batch allotted to them, Andy packs 55%, Ben 30% and Carrie 15%. The probability that Andy breaks some biscuits in a packet is 0.7, for Ben it is 0.2, and for Carrie it is 0.1. What is the probability that a randomly selected packet with broken biscuits was packed by Andy?

Solution:

Let A be the event ‘the packet was packed by Andy’, B be the event ‘the packet was packed by Ben’, C be the event ‘the packet was packed by Carrie’, and D be the event ‘the packet contains broken biscuits’.

Given P(A) = 0.55, P(B) = 0.3, P(C) = 0.15, P(D|A) = 0.7, P(D|B) = 0.2, P(D|C) = 0.1

We require P(A|D). Using Bayes’ Rule,

P(A|D) = P(D|A).P(A) / P(D)

P(D) = P(D|A).P(A) + P(D|B).P(B) + P(D|C).P(C)

= (0.7)(0.55) + (0.2)(0.3) + (0.1)(0.15) = 0.46

P(A|D) = (0.7)(0.55) / 0.46 = 0.837
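The same calculation can be written so that it returns the posterior probability for every packer at once. This is a minimal sketch of Bayes' Rule with the numbers from the example; the dictionary names are illustrative.

```python
# Prior share of packets packed by each person: P(A), P(B), P(C).
priors = {"Andy": 0.55, "Ben": 0.30, "Carrie": 0.15}
# Probability of broken biscuits given the packer: P(D|A), P(D|B), P(D|C).
p_broken_given = {"Andy": 0.7, "Ben": 0.2, "Carrie": 0.1}

# Total probability of a packet containing broken biscuits, P(D).
p_broken = sum(priors[w] * p_broken_given[w] for w in priors)

# Bayes' Rule: P(packer | D) = P(D | packer) * P(packer) / P(D).
posterior = {w: priors[w] * p_broken_given[w] / p_broken for w in priors}

print(round(p_broken, 2), {w: round(p, 3) for w, p in posterior.items()})
```

The three posterior probabilities sum to 1, because a broken packet must have been packed by exactly one of the three.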

Probability Tree

• The probability of the final outcome is given by the sum of the products of the probabilities along each branch of the tree that leads to it.

• A probability tree can be used for both dependent and independent events

• Using the probability tree to solve the preceding example:

A (0.55)
– Broken (0.7): (0.55)(0.7)
– Not Broken (0.3)

B (0.3)
– Broken (0.2): (0.30)(0.2)
– Not Broken (0.8)

C (0.15)
– Broken (0.1): (0.15)(0.1)
– Not Broken (0.9)

P(A|D) = (0.55)(0.7) / [(0.55)(0.7) + (0.3)(0.2) + (0.15)(0.1)] = 0.837

Sequence of Dependent Events

Example

A bag contains eight green counters and three black counters. Two counters are drawn, one after the other, without replacement. Find the probability of drawing one green and one black counter, in any order.

1st Draw: P(G1) = 8/11
– 2nd Draw: P(G2|G1) = 7/10
– 2nd Draw: P(B2|G1) = 3/10

1st Draw: P(B1) = 3/11
– 2nd Draw: P(G2|B1) = 8/10
– 2nd Draw: P(B2|B1) = 2/10

P(Drawing 1G & 1B)

= (8/11)(3/10) + (3/11)(8/10)

= 24/110 + 24/110

= 24/55
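The counter example can also be checked by brute force, which sums both branches of the tree (green then black, and black then green) in one pass. The list representation of the bag is an illustrative choice.

```python
from fractions import Fraction
from itertools import permutations

counters = ["G"] * 8 + ["B"] * 3   # eight green and three black counters

# Ordered draws of two different counters, identified by position in the bag.
draws = list(permutations(range(len(counters)), 2))   # 11 x 10 = 110 cases
mixed = [d for d in draws if counters[d[0]] != counters[d[1]]]

p_one_of_each = Fraction(len(mixed), len(draws))
print(p_one_of_each)
```

There are 24 green-then-black draws and 24 black-then-green draws, giving 48/110 = 24/55.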

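Proposed Probability Tree Solution

[Figure: probability tree with three levels, in order: strong sunlight and at least one sensor works; microcontroller works; actuator works. The branch probabilities and the resulting system-state probabilities are listed below.]

Strong sunlight, and at least one sensor works: Yes: 0.9726; No: 0.30(0.24)(0.38) = 0.0274

Microcontroller works: Yes: 0.94; No: 0.06

Actuator works: Yes: 0.88; No: 0.12

System state (sensor, microcontroller, actuator):

Yes, Yes, Yes: 0.804535 (Prob. of shade working)
Yes, Yes, No: 0.109709
Yes, No, Yes: 0.051353
Yes, No, No: 0.007003
No, Yes, Yes: 0.022665
No, Yes, No: 0.003091
No, No, Yes: 0.001447
No, No, No: 0.000197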

Solution Analysis

Assumptions

1. Non-mutually exclusive events: Failure of one component does not preclude the failure of another.

Example: Any of the sensors and the actuator can fail together.

2. Independent failures: The likelihood of a component failing is not affected by the occurrence of other failures.

Example: The probability of the microcontroller working is the same regardless of whether the sensor is functioning.

Solution Analysis

From the Probability Tree:

• Probability of system working successfully on a sunny day = 0.8045

• Probability of shade not lowering on a sunny day = 1 – 0.8045 = 0.1955

• P(system fails and sensor is faulty) = 0.0274

• P(system fails and only microcontroller is faulty) = 0.0513

• P(system fails and only actuator is faulty) = 0.1097

Solution Analysis

Conditional Probability:

• P(system fails | microcontroller is faulty) = 1

• P(system fails | sensor is working) = P(system fails and sensor works) / P(sensor works) = 0.1681 / 0.9726 = 0.1728

• P(only actuator is faulty | system fails) = P(only actuator fails and system fails) / P(system fails) = 0.1097 / (1 – 0.8045) = 0.5613
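The tree and the conditional probabilities above can be recomputed by enumerating the eight combinations of component states. This sketch assumes independent component failures, as stated in the solution analysis, and uses the rounded branch probabilities from the slides (0.9726, 0.94, 0.88); the variable names are illustrative.

```python
from itertools import product

# P(component works), in tree order, using the rounded values from the slides.
p_works = {"sensor": 0.9726, "microcontroller": 0.94, "actuator": 0.88}

# Probability of each path: product of branch probabilities (independence assumed).
paths = {}
for states in product([True, False], repeat=len(p_works)):
    p = 1.0
    for p_ok, ok in zip(p_works.values(), states):
        p *= p_ok if ok else 1 - p_ok
    paths[states] = p

p_system_works = paths[(True, True, True)]   # every component must work
p_system_fails = 1 - p_system_works

# P(system fails | sensor works): failing paths whose sensor branch is "Yes".
p_fail_and_sensor_ok = sum(p for s, p in paths.items() if s[0] and not all(s))
p_fail_given_sensor_ok = p_fail_and_sensor_ok / p_works["sensor"]

# P(only actuator is faulty | system fails).
p_only_actuator_given_fail = paths[(True, True, False)] / p_system_fails

print(round(p_system_works, 4), round(p_fail_given_sensor_ok, 4),
      round(p_only_actuator_given_fail, 4))
```

With a working probability of about 0.80, the system falls short of the 95 percent tolerance mentioned in the problem statement, and the actuator accounts for more than half of the failures.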

Learning Outcomes

• Conditional Probability

• Bayes’ Theorem

• Probability Tree

P04 – MANY COMBINATIONS

Many Combinations

You are a perfume connoisseur who is in charge of creating a new family of perfumes. The scent of a perfume is formed by different components known as notes. There are 3 types of notes, namely top note, middle note and base note. The different notes unfold over time when a perfume is applied and combine to describe the scent of the perfume. You have shortlisted a list of 16 aromatic compounds of which 3 will be used for top notes, 8 for middle notes and 5 for base notes. The new perfume is designed to have 6 different aromatic compounds and must contain at least one of each type of notes. If it takes the company 3 days to produce and test 100 different perfumes, how much time is required for testing all possible combinations of aromas? You may assume that each different perfume would contain a unique combination of compounds.

Illustrative Figures on Perfume Notes

Permutations

• A permutation is an ordered arrangement of distinct objects.

• One permutation differs from another if the order of arrangement differs or if the content differs.

• How many ways are there to arrange three boys –A, B, and C?

• The possible arrangements are ABC, ACB, BAC, BCA, CAB, CBA. There are six ways.

• Each arrangement is called a permutation.

Permutation of n different objects

• For the first boy, we can choose from A, B or C (3 ways).

• Once the first boy is chosen, the second boy can be chosen from the 2 remaining boys (2 ways).

• The third boy has to be the remaining boy (1 way).

• Number of ways = 3 x 2 x 1 = 3! = 6

• Number of ways of arranging n different objects is n!

• n! = n(n-1)(n-2)….(3)(2)(1)
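The count of 3! = 6 arrangements can be confirmed by listing them with the standard library. This is an illustrative check only.

```python
import math
from itertools import permutations

# Every ordered arrangement of the three boys A, B and C.
arrangements = ["".join(p) for p in permutations("ABC")]
count = math.factorial(3)

print(arrangements, count)
```

For inputs given in sorted order, `itertools.permutations` yields the arrangements in the same lexicographic order as the list on the slide.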

Permutation of n distinct objects

Example:

• How many ways can the letters A, B, C and D be arranged?

Approach:

• The 1st letter can be chosen in 4 ways (either A or B or C or D)

• The 2nd letter can be chosen in 3 ways.

• The 3rd letter can be chosen in 2 ways.

• The 4th letter can be chosen in only 1 way.

• Therefore, number of ways of arranging 4 letters

= 4! = 24

Permutation of n objects (not all distinct)

Example:

• Suppose that, instead of the letters A, B, C, D, we have the letters A, A, A, D

Approach:

• The 24 arrangements reduce to 4: AAAD, AADA, ADAA, DAAA

• The number of ways of arranging 4 objects, of which 3 are alike = 4! / 3! = 4

• The number of ways of arranging n objects of which p are of one type, q of another type, r of a third type and so on is n! / (p! q! r! …)
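The formula n! / (p! q! r! …) can be compared against a brute-force count of distinct arrangements. The function name `distinct_arrangements` is a hypothetical helper written for this illustration.

```python
import math
from collections import Counter
from itertools import permutations

def distinct_arrangements(items):
    """n! divided by the factorial of the count of each repeated item."""
    total = math.factorial(len(items))
    for count in Counter(items).values():
        total //= math.factorial(count)
    return total

by_formula = distinct_arrangements("AAAD")

# Brute force: generate all 4! orderings and keep the distinct strings.
by_listing = sorted({"".join(p) for p in permutations("AAAD")})

print(by_formula, by_listing)
```

The brute-force list grows as n!, so the formula is the practical choice for anything beyond small examples.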

Permutation of r objects from n objects

• Consider the number of ways of placing 3 of the letters A, B, C, D, E, F, G in 3 empty spaces.

• The 1st space can be filled in 7 ways. The 2nd space can be filled in 6 ways. The 3rd space can be filled in 5 ways.

• Therefore, there are (7)(6)(5) = 210 ways of arranging 3 letters taken from 7 letters.

• Number of permutations of 3 objects taken from 7

= 7P3 = (7)(6)(5) = (7)(6)(5)(4)(3)(2)(1) / (4)(3)(2)(1) = 7!/4! = 7!/(7-3)!

• Number of permutations of r objects taken from n different objects is nPr = n!/(n-r)!
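Python's `math.perm` (available from Python 3.8) evaluates nPr directly. The short check below compares it with the factorial formula and with an explicit listing of the arrangements.

```python
import math
from itertools import permutations

n, r = 7, 3
by_builtin = math.perm(n, r)                               # nPr
by_formula = math.factorial(n) // math.factorial(n - r)    # n! / (n-r)!
by_listing = len(list(permutations("ABCDEFG", r)))         # count arrangements

print(by_builtin, by_formula, by_listing)
```

All three approaches give 210, the number of ways of filling 3 spaces from 7 letters.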

Page 70: Statistical Methods for Engineering

School of Engineering

E214 Statistical Methods for Engineering

(7)

Combinations

• A combination is a selection of distinct objects where one combination differs from another only if the content of the selection differs. Order does not matter.

• The number of combinations of n different objects taken r at a time, denoted by nCr, is

$${}^{n}C_{r} = \frac{n!}{r!\,(n-r)!}$$

Proof:

We are interested in determining the number of combinations when there are n distinct objects to be selected r at a time. Every permutation of r objects can be formed by first selecting r objects from the n and then permuting those r objects, so nPr = r! nCr.

Hence

$${}^{n}C_{r} = \frac{{}^{n}P_{r}}{r!} = \frac{n!}{r!\,(n-r)!}$$
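The proof can be illustrated by grouping permutations by their content. In this sketch (letters and group size chosen to match the earlier 7-letter example), every group of permutations that share the same 3 letters should contain exactly r! members.

```python
from collections import Counter
from itertools import permutations
from math import comb, factorial

letters, r = "ABCDEFG", 3

# Key each permutation by its content only, ignoring order.
groups = Counter(frozenset(p) for p in permutations(letters, r))

n_combinations = len(groups)          # number of distinct selections
group_sizes = set(groups.values())    # how many orderings each selection has

print(n_combinations, group_sizes, comb(len(letters), r))
```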

Page 71: Statistical Methods for Engineering

School of Engineering

E214 Statistical Methods for Engineering

(8)

Combinations: Example 1

• Four letters are chosen at random from the word RANDOMLY. Find the probability that all four letters chosen are consonants.

• Let S be the possibility space, then n(S) = 8C4 = 70

• Let E be the event ‘4 consonants are chosen’. As there are 6 consonants, n(E) = 6C4 = 15

• P(E) = n(E)/n(S) = 15/70 = 3/14
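The same answer can be obtained by listing every selection. This is a small sketch that, like the slide, counts Y as a consonant.

```python
from fractions import Fraction
from itertools import combinations

word = "RANDOMLY"
vowels = set("AEIOU")   # Y is treated as a consonant, as in the example

selections = list(combinations(word, 4))                         # possibility space S
all_consonants = [s for s in selections if not vowels & set(s)]  # event E

probability = Fraction(len(all_consonants), len(selections))
print(len(selections), len(all_consonants), probability)
```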

Page 72: Statistical Methods for Engineering

School of Engineering

E214 Statistical Methods for Engineering

(9)

Combinations: Example 2

• Suppose a box contains 8 chip processors, 3 of which are

defective. If 3 are sold at random, find the probabilities that:

– Exactly 2 are defective

– All 3 are defective

– At least 1 is defective

• Treating each chip as an individual entity, the total number of combinations when 3 chips are sold is 8C3 = 56

Exactly 2 are defective

– Combination of 2 defective chips and 1 good chip 3C2 x 5C1

Hence, probability is 3C2 x 5C1/ 8C3 = (3x5) / 56 = 0.27

Page 73: Statistical Methods for Engineering

School of Engineering

E214 Statistical Methods for Engineering

(10)

Combinations: Example 2

All 3 are defective

– Combination of 3 defective chips 3C3

Hence, probability is 3C3/ 8C3= 1 / 56 = 0.018

At least 1 is defective

– This is the complement of no defective chips 1 – P(no defective chip)

– Combination of 3 good chips 5C3

Hence, probability is 1- 5C3/ 8C3 = 1- 10/56 = 0.82
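All three parts of Example 2 follow the same pattern, so they can be sketched with one helper. The function name `p_defective` and its default arguments are illustrative choices.

```python
from fractions import Fraction
from math import comb


def p_defective(k, defective=3, good=5, sold=3):
    """P(exactly k defective chips among the chips sold)."""
    favourable = comb(defective, k) * comb(good, sold - k)
    return Fraction(favourable, comb(defective + good, sold))


p_exactly_two = p_defective(2)
p_all_three = p_defective(3)
p_at_least_one = 1 - p_defective(0)   # complement of "no defective chip"

for label, p in [("exactly 2", p_exactly_two),
                 ("all 3", p_all_three),
                 ("at least 1", p_at_least_one)]:
    print(label, p, round(float(p), 3))
```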

Page 74: Statistical Methods for Engineering

School of Engineering

E214 Statistical Methods for Engineering

(11)

Proposed Solution

• The new perfume should have 6 different aromas

with at least 1 aroma in each note (top, middle and

base)

• Possible combinations are:

– 3 top notes + 2 middle + 1 base

– 3 top notes + 1 middle + 2 base

– 2 top notes + 3 middle + 1 base

– 2 top notes + 2 middle + 2 base

– 2 top notes + 1 middle + 3 base

…and so on. In total there are 9 different combinations of top, middle and base notes.

Page 75: Statistical Methods for Engineering

School of Engineering

E214 Statistical Methods for Engineering

(12)

Proposed Solution

Total number of ways to create the perfume:

• 3 top notes + 1 middle + 2 base = 3C3 x 8C1 x 5C2 = 80

• 3 top notes + 2 middle + 1 base = 3C3 x 8C2 x 5C1 = 140

• 2 top notes + 3 middle + 1 base = 3C2 x 8C3 x 5C1 = 840

• 2 top notes + 2 middle + 2 base = 3C2 x 8C2 x 5C2 = 840

• 2 top notes + 1 middle + 3 base = 3C2 x 8C1 x 5C3 = 240

• 1 top note + 4 middle + 1 base = 3C1 x 8C4 x 5C1 = 1050

• 1 top note + 3 middle + 2 base = 3C1 x 8C3 x 5C2 = 1680

• 1 top note + 2 middle + 3 base = 3C1 x 8C2 x 5C3 = 840

• 1 top note + 1 middle + 4 base = 3C1 x 8C1 x 5C4 = 120

• Total = 5830

• Time required = 5830/100*3 = 175 days
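The enumeration can be reproduced with a short loop. This is a sketch that assumes 3 top, 8 middle and 5 base aromas are available, as read off the nCr terms in the slide.

```python
from math import comb

TOP, MIDDLE, BASE = 3, 8, 5   # aromas available per note (assumed from the nCr terms)
AROMAS = 6                    # aromas in the finished perfume

splits = []
for t in range(1, TOP + 1):
    for m in range(1, MIDDLE + 1):
        b = AROMAS - t - m
        if 1 <= b <= BASE:    # at least one aroma in every note
            ways = comb(TOP, t) * comb(MIDDLE, m) * comb(BASE, b)
            splits.append((t, m, b, ways))

total = sum(ways for *_, ways in splits)

for t, m, b, ways in splits:
    print(f"{t} top + {m} middle + {b} base: {ways}")
print("splits:", len(splits), "total:", total)
```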

Page 76: Statistical Methods for Engineering

School of Engineering

E214 Statistical Methods for Engineering

(13)

Learning Outcomes

• Permutations

• Combinations

Page 77: Statistical Methods for Engineering

SCHOOL OF ENGINEERING

DIPLOMA IN INDUSTRIAL & OPERATIONS MANAGEMENT

DIPLOMA IN SUPPLY CHAIN MANAGEMENT

DIPLOMA IN CIVIL AVIATION

P05 – Chance Winnings

E214 : STATISTICAL METHODS FOR ENGINEERING


Page 78: Statistical Methods for Engineering

SCHOOL OF

ENGINEERING

Page 2 of 2

Chance Winnings

Entrusted with raising funds, and inspired by his recent holiday to Las Vegas, James determines that the fruit machine is a sure bet. It has, after all, earned the informal name of the one-armed bandit owing to its appearance and its ability to leave the gamer penniless. Wanting to raise as much money as possible for the charity, James knows he must take into careful consideration the payout for each winning combination. If he pays out too much, he may end up making a loss over the three-day fund-raiser. If the payout is too little, it might not generate any interest in playing the fruit machine. The fruit machine has four windows, each showing at any one time either a lemon, an orange, an apple or cherries, and it has been configured to pay out when at least three windows show the same fruit. Given that the probability of a window showing a particular fruit is 0.4 for lemons, 0.3 for oranges, 0.2 for apples and 0.1 for cherries, how should James set the cost of each play and the payout to support the fund-raising?

Page 79: Statistical Methods for Engineering

School of Engineering

E214 Statistical Methods for Engineering

P05 – Chance Winnings

Page 80: Statistical Methods for Engineering

School of Engineering

E214 Statistical Methods for Engineering

(2)

Discrete Random Variable

• A random variable (r.v.) is a variable whose values are determined by chance.

• Random variables are denoted by capital letters (X, Y, etc.) to distinguish them from their possible values given in lower case x, y.

• Discrete random variables can take on only a finite number of values or an infinite number of values that can be counted.

• Example: A die is thrown 6 times. Let X = number of 5’s obtained.

– Then X is a discrete r.v. and x = 0, 1, 2, 3, 4, 5, 6

Page 81: Statistical Methods for Engineering

School of Engineering

E214 Statistical Methods for Engineering

(3)

Basic Properties of a pdf

• The probability distribution of a discrete r.v. X is a list of the possible values of X together with their probabilities

f(x) = P[X=x]

• The probability of each event in the sample space must be between 0 and 1 inclusive:

0 ≤ P[X=x] ≤ 1 for all x

• The sum of the probabilities of all events in the sample space must equal 1:

$$\sum_{\text{all } x} P[X=x] = 1$$

Page 82: Statistical Methods for Engineering

School of Engineering

E214 Statistical Methods for Engineering

(4)

Probability Density Function (pdf)

• A discrete PDF consists of all possible values that a discrete r.v. can take on, together with the associated probabilities.

• Example: Let X represent the outcomes when a fair die is tossed once. The pdf of X is:

x 1 2 3 4 5 6

P[X=x] 1/6 1/6 1/6 1/6 1/6 1/6

where P[X=x] means probability that the r.v. X takes

the value x.

Formula form: P[X=x] =1/6, for x=1,2,3,4,5,6

Page 83: Statistical Methods for Engineering

School of Engineering

E214 Statistical Methods for Engineering

(5)

Expected Value E(X)

Let X be a discrete random variable. Then the expected value of X, also known as the mean of the r.v., is denoted by E(X):

$$E(X) = \sum_{\text{all } x} x\,P(X=x)$$

Page 84: Statistical Methods for Engineering

School of Engineering

E214 Statistical Methods for Engineering

(6)

Example

• Find E(X) for the pdf of a single throw of a fair die.

Solution:

• $E(X) = \sum_{\text{all } x} x\,P(X=x)$

= (1/6)(1) + (1/6)(2) + (1/6)(3) + (1/6)(4) + (1/6)(5) + (1/6)(6)

= 21/6 = 3.5

x 1 2 3 4 5 6

P[X=x] 1/6 1/6 1/6 1/6 1/6 1/6
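The calculation above can be written as a small function. This is a sketch in which the pdf is stored as a dictionary mapping each value x to P(X = x); the function name is an illustrative choice.

```python
from fractions import Fraction


def expected_value(pdf):
    """E(X): sum of x * P(X = x) over all x; pdf maps x -> P(X = x)."""
    if sum(pdf.values()) != 1:
        raise ValueError("probabilities must sum to 1")
    return sum(x * p for x, p in pdf.items())


die = {x: Fraction(1, 6) for x in range(1, 7)}   # fair die
e_die = expected_value(die)
print(e_die, float(e_die))
```

Using `Fraction` keeps the arithmetic exact, so the check that the probabilities sum to 1 is not affected by rounding.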

Page 85: Statistical Methods for Engineering

School of Engineering

E214 Statistical Methods for Engineering

(7)

Properties of E(X)

• E(a) = a

• E(aX) = aE(X)

• E(aX + b) = aE(X) + b

• E[f(X) ± g(X)] = E[f(X)] ± E[g(X)]

where a and b are constant values

Page 86: Statistical Methods for Engineering

School of Engineering

E214 Statistical Methods for Engineering

(8)

Example

The r.v. X has pdf P(X=x) for x = 1, 2, 3.

Calculate E(3), E(X), E(5X), E(5X + 3), $E(X^2)$, $E(4X^2 - 3)$

E(3) = ∑all x 3P(X=x) = 3(0.1) + 3(0.6) + 3(0.3) = 3

E(X) = ∑all x xP(X=x) = 1(0.1) + 2(0.6) + 3(0.3) = 2.2

x 1 2 3

P(X = x) 0.1 0.6 0.3

Page 87: Statistical Methods for Engineering

School of Engineering

E214 Statistical Methods for Engineering

(9)

Example

E(5X) = ∑all x 5xP(X=x) = 5(0.1) + 10(0.6) + 15(0.3)

= 11 = 5E(X)

E(5X + 3) = ∑all x (5x + 3)P(X=x)

= 8(0.1) + 13(0.6) + 18(0.3) = 14 = 5E(X) + 3

$E(X^2) = \sum_{\text{all } x} x^2 P(X=x) = 1(0.1) + 4(0.6) + 9(0.3) = 5.2$

$E(4X^2 - 3) = \sum_{\text{all } x} (4x^2 - 3) P(X=x) = 1(0.1) + 13(0.6) + 33(0.3) = 17.8 = 4E(X^2) - 3$
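The properties of E(X) can be verified numerically for this pdf. The sketch below uses a hypothetical helper `expect(g, pdf)` that computes E[g(X)] directly from the definition.

```python
from fractions import Fraction

pdf = {1: Fraction("0.1"), 2: Fraction("0.6"), 3: Fraction("0.3")}


def expect(g, pdf):
    """E[g(X)]: sum of g(x) * P(X = x) over all x."""
    return sum(g(x) * p for x, p in pdf.items())


e_const = expect(lambda x: 3, pdf)               # E(3)
e_x = expect(lambda x: x, pdf)                   # E(X)
e_5x3 = expect(lambda x: 5 * x + 3, pdf)         # E(5X + 3)
e_x2 = expect(lambda x: x**2, pdf)               # E(X^2)
e_4x2_3 = expect(lambda x: 4 * x**2 - 3, pdf)    # E(4X^2 - 3)

print(float(e_const), float(e_x), float(e_5x3), float(e_x2), float(e_4x2_3))
print(e_5x3 == 5 * e_x + 3, e_4x2_3 == 4 * e_x2 - 3)
```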

Page 88: Statistical Methods for Engineering

School of Engineering

E214 Statistical Methods for Engineering

(10)

Variance, Var(X)

The variance of a discrete r.v. X measures the spread or deviation of the r.v. about its mean value. It is denoted by Var(X) or $\sigma^2$:

$$\begin{aligned}
\mathrm{Var}(X) &= E[(X-\mu)^2] \\
&= E(X^2 - 2\mu X + \mu^2) = E(X^2) - 2\mu E(X) + E(\mu^2) \\
&= E(X^2) - 2\mu^2 + \mu^2 \\
&= E(X^2) - \mu^2 \\
&= E(X^2) - [E(X)]^2
\end{aligned}$$

Page 89: Statistical Methods for Engineering

School of Engineering

E214 Statistical Methods for Engineering

(11)

Example

The r.v. X has pdf shown below:

Find Var(X).

E(X) = 1(0.1) + 2(0.3) + 3(0.2) + 4(0.3) + 5(0.1) = 3

$E(X^2)$ = 1(0.1) + 4(0.3) + 9(0.2) + 16(0.3) + 25(0.1) = 10.4

$\mathrm{Var}(X) = E(X^2) - [E(X)]^2 = 10.4 - 3^2 = 1.4$

x 1 2 3 4 5

P(X = x) 0.1 0.3 0.2 0.3 0.1

Page 90: Statistical Methods for Engineering

School of Engineering

E214 Statistical Methods for Engineering

(12)

Properties of Var(X)

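Var(c) = 0, where c is any constant

$\mathrm{Var}(cX) = c^2\,\mathrm{Var}(X)$

$\mathrm{Var}(cX + d) = c^2\,\mathrm{Var}(X)$, where d is a constant

Proof:

$$\begin{aligned}
\mathrm{Var}(cX) &= E(c^2X^2) - [E(cX)]^2 = c^2E(X^2) - [cE(X)]^2 \\
&= c^2E(X^2) - c^2[E(X)]^2 = c^2\left(E(X^2) - [E(X)]^2\right) \\
&= c^2\,\mathrm{Var}(X)
\end{aligned}$$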
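The variance example and the scaling properties can be checked together. In this sketch the constants c = 3 and d = 7 are arbitrary illustrative values, and `transform` is a hypothetical helper that builds the pdf of cX + d.

```python
from fractions import Fraction

pdf = {1: Fraction("0.1"), 2: Fraction("0.3"), 3: Fraction("0.2"),
       4: Fraction("0.3"), 5: Fraction("0.1")}


def mean(pdf):
    return sum(x * p for x, p in pdf.items())


def variance(pdf):
    """Var(X) = E(X^2) - [E(X)]^2."""
    return sum(x**2 * p for x, p in pdf.items()) - mean(pdf) ** 2


def transform(pdf, c, d):
    """pdf of cX + d (c must be non-zero so distinct values stay distinct)."""
    return {c * x + d: p for x, p in pdf.items()}


var_x = variance(pdf)
var_scaled = variance(transform(pdf, 3, 0))    # Var(3X)
var_shifted = variance(transform(pdf, 3, 7))   # Var(3X + 7)

print(float(var_x), float(var_scaled), float(var_shifted))
```

Adding the constant d moves every value by the same amount, so the spread, and hence the variance, is unchanged.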

Page 91: Statistical Methods for Engineering

School of Engineering

E214 Statistical Methods for Engineering

(13)

Discussion for Today’s Problem

Winning Combination 3 Lemons 3 Oranges 3 Apples 3 Cherries

Payout $1 $2 $3 $5

Winning Combination 4 Lemons 4 Oranges 4 Apples 4 Cherries

Payout $3 $4 $6 $12

Page 92: Statistical Methods for Engineering

School of Engineering

E214 Statistical Methods for Engineering

(14)

Discussion for Today’s Problem

P(lemons) = 0.4, P(oranges) = 0.3,

P(apples) = 0.2, P(cherries) = 0.1

Calculations for:

3 fruits of the same kind:

e.g. P(3 lemons) = $^4C_3\,(0.4)^3\,(1-0.4)^1$ = 0.1536

4 fruits of the same kind:

e.g. P(4 oranges) = $(0.3)^4$ = 0.0081

P(James wins) = P(no winning combination appears) = 1 – P(a winning combination appears)

Page 93: Statistical Methods for Engineering

School of Engineering

E214 Statistical Methods for Engineering

(15)

Discussion for Today’s Problem

Let the cost of one play be $1 and let X be James’ profit per play (in $).

Outcome No win 3 lemons 3 oranges 3 apples 3 cherries

x $1 $0 -$1 -$2 -$4

P(X=x) 0.7062 0.1536 0.0756 0.0256 0.0036

Outcome 4 lemons 4 oranges 4 apples 4 cherries

x -$2 -$3 -$5 -$11

P(X=x) 0.0256 0.0081 0.0016 0.0001

Expected profit per play, E(X) = $0.480

Variance of profit, Var(X) = 0.938
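The profit distribution can be rebuilt from the fruit probabilities and the payout table. This is a sketch under the slide's assumptions (four independent windows, $1 per play); the names `P_FRUIT`, `PAYOUT` and `profit_distribution` are illustrative.

```python
from math import comb

P_FRUIT = {"lemons": 0.4, "oranges": 0.3, "apples": 0.2, "cherries": 0.1}
PAYOUT = {
    3: {"lemons": 1, "oranges": 2, "apples": 3, "cherries": 5},
    4: {"lemons": 3, "oranges": 4, "apples": 6, "cherries": 12},
}


def profit_distribution(cost=1, windows=4):
    """Return (profit to James, probability) pairs, one per outcome."""
    outcomes = []
    for fruit, p in P_FRUIT.items():
        for k in (3, 4):
            prob = comb(windows, k) * p**k * (1 - p) ** (windows - k)
            outcomes.append((cost - PAYOUT[k][fruit], prob))
    p_no_win = 1 - sum(prob for _, prob in outcomes)
    outcomes.append((cost, p_no_win))      # James keeps the full cost of play
    return outcomes


dist = profit_distribution()
p_no_win = dist[-1][1]
mean = sum(x * p for x, p in dist)
variance = sum(x**2 * p for x, p in dist) - mean**2

print(round(p_no_win, 4), round(mean, 3), round(variance, 3))
```

Changing `cost` or the entries of `PAYOUT` and re-running shows how the expected profit and its variance respond, which is the trade-off James has to manage.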

Page 94: Statistical Methods for Engineering

School of Engineering

E214 Statistical Methods for Engineering

(16)

Discussion for Today’s Problem

By doubling the payout and doubling the price to play, the expected value of James’ profit will double and the variance will increase fourfold.

• E(2X) = 2E(X) = $0.961

• Var(2X) = 4Var(X) = 3.754

• James will want to increase his expected profit and reduce

the variance so that his earnings will be more certain.

• This can be achieved by reducing the number of winning

combinations (e.g. no win for 3 lemons), increasing the cost

of play and/or reducing the payout for the 3 lemons winning

combination

Page 95: Statistical Methods for Engineering

School of Engineering

E214 Statistical Methods for Engineering

(17)

Learning Outcomes

• Discrete Random Variable

• Probability Density Function (pdf)

• Expectation

• Variance

Page 96: Statistical Methods for Engineering

SCHOOL OF ENGINEERING

DIPLOMA IN INDUSTRIAL & OPERATIONS MANAGEMENT

DIPLOMA IN SUPPLY CHAIN MANAGEMENT

DIPLOMA IN CIVIL AVIATION

P06 – UNDERWEIGHT OR NOT

E214 : STATISTICAL METHODS FOR ENGINEERING


Page 97: Statistical Methods for Engineering

SCHOOL OF

ENGINEERING

Page 2 of 2

Underweight or Not

You are a purchaser in a food company. Recently, your supplier for frozen chicken fillet has been bought over by a competitor and the company initiated a major change in the packaging and production method. Even though the agreement for the supply of the fillet remains unchanged, you are concerned that the amount of fillet in the new packaging may be different. One day, you carried out a sampling check on 40 packets of chicken fillet and collected the following data:

P6 Data.xls

Past records show that on the average, out of 100 packets of chicken fillet, 16 packets were underweight. Your company wants to know if the claim by the supplier that the weight of the packet remains unchanged is substantiated by the data. How do you decide from the data collected? If similar checks were to be carried out in the future, what acceptance criteria should be used?

Page 98: Statistical Methods for Engineering

School of Engineering

E214 Statistical Methods for Engineering

P06 – Underweight or Not

Page 99: Statistical Methods for Engineering

School of Engineering

E214 Statistical Methods for Engineering

(2)

Probability Problems with 2 Outcomes

• Many types of probability problems have only two outcomes or can be reduced to two outcomes.

• For example:

– When a coin is tossed, it can land heads or tails.

– When a baby is born, it will be male or female.

– In an examination, you either pass or fail.

• Situations that can be reduced to 2 outcomes:

– A medical treatment can be classified as effective or ineffective.

– A person can be classified as having normal or abnormal blood pressure, depending on the measure of the blood pressure.

– A multiple-choice question response, although there are 4 or 5 answer choices, can be classified as correct or incorrect.

• Situations like these are called binomial experiments.

Page 100: Statistical Methods for Engineering

School of Engineering

E214 Statistical Methods for Engineering

(3)

Binomial Experiment

• A binomial experiment is an experiment that satisfies the following properties:

– Experiment consists of n repeated trials.

– Each trial has two possible outcomes: success or failure.

– Probability of success, denoted by p, is the same in each trial.

– Repeated trials are independent.

• Outcomes of a binomial experiment and the corresponding probabilities of these outcomes are called a binomial distribution.

Page 101: Statistical Methods for Engineering

School of Engineering

E214 Statistical Methods for Engineering

(4)

Binomial Distribution

• Let X be the number of successes in n trials of a binomial experiment

• X is called a binomial random variable with pdf given by:

$P(X = r) = {}^{n}C_{r}\,p^r(1-p)^{n-r}$, where r = 0, 1, 2, …, n

• p is the probability of success.

• It can also be expressed as X ~ B(n,p).

Page 102: Statistical Methods for Engineering

School of Engineering

E214 Statistical Methods for Engineering

Example: Positively and Negatively Skewed Binomial Distributions

(5)

[Figure: two binomial distributions, one right-skewed (positively skewed) and one left-skewed (negatively skewed)]

Page 103: Statistical Methods for Engineering

School of Engineering

E214 Statistical Methods for Engineering

(6)

An Example

A coin is tossed three times. Find the probability of getting exactly two heads.

– This problem can be solved by looking at the sample space:

HHH, HHT, HTH, THH, TTH, THT, HTT, TTT

– There are 3 ways to get 2 heads, therefore,

• P(exactly 2 heads) = 3/8 = 0.375

Page 104: Statistical Methods for Engineering

School of Engineering

E214 Statistical Methods for Engineering

(7)

An Example – Coin Toss, P(2 heads)

• Consists of three trials (tosses)

• Each trial has only two possible outcomes: heads or tails

• Probability of success (head) = 0.5 for each trial

• Outcomes are independent of each other (the outcome of one toss does not affect the outcome of the other tosses)

Solution:

• Applying Binomial Distribution, let X be the random variable representing the number of heads

• X ~ B(3, 0.5)

• $P(X = 2) = {}^{3}C_{2}\,(0.5)^2(0.5)^1 = 0.375$

Page 105: Statistical Methods for Engineering

School of Engineering

E214 Statistical Methods for Engineering

(8)

An Example

There are five multiple choice questions in a test. Each question has five possible choices. If a student randomly guesses the answers to all five questions, find the probability that he gets exactly three correct.

Solution:

Let X be the r.v. representing the number of correct answers.

X ~ B(5, 0.2)

$P(X = 3) = {}^{5}C_{3}\,(0.2)^3(0.8)^2 = 0.0512$
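Both binomial examples (the coin toss and the guessed multiple-choice answers) use the same pdf, which can be sketched as one function. The name `binomial_pmf` is an illustrative choice.

```python
from math import comb


def binomial_pmf(r, n, p):
    """P(X = r) for X ~ B(n, p): nCr * p^r * (1 - p)^(n - r)."""
    return comb(n, r) * p**r * (1 - p) ** (n - r)


p_two_heads = binomial_pmf(2, 3, 0.5)      # coin tossed 3 times, exactly 2 heads
p_three_correct = binomial_pmf(3, 5, 0.2)  # 5 questions guessed, exactly 3 correct

print(round(p_two_heads, 4), round(p_three_correct, 4))
```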

Page 106: Statistical Methods for Engineering

School of Engineering

E214 Statistical Methods for Engineering

(9)

Mean and Variance for Binomial Distribution

Let X ~ B(n,p), then

E(X) = µ = np

Var(X) = $\sigma^2$ = np(1-p)

Page 107: Statistical Methods for Engineering

School of Engineering

E214 Statistical Methods for Engineering

(10)

An Example

A die is rolled 480 times. Find the mean, variance and standard deviation of the number of 2’s obtained.

Solution:

Let X be the r.v. representing the number of 2’s obtained.

µ = np = 480 x 1/6 = 80

$\sigma^2$ = np(1-p) = 480 x 1/6 x 5/6 = 66.67

σ = sqrt[np(1-p)] = sqrt(66.67) = 8.16
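The shortcut formulas np and np(1-p) can be compared with the general definitions of E(X) and Var(X). This sketch sums over the full binomial pdf for the die example, using floating-point arithmetic, so the two routes agree only up to rounding error.

```python
from math import comb, sqrt

n, p = 480, 1 / 6

# Shortcut formulas for a binomial distribution.
mean = n * p
variance = n * p * (1 - p)
std_dev = sqrt(variance)

# General definitions, summing over every possible value r = 0..n.
pmf = [comb(n, r) * p**r * (1 - p) ** (n - r) for r in range(n + 1)]
mean_from_pmf = sum(r * q for r, q in enumerate(pmf))
var_from_pmf = sum(r * r * q for r, q in enumerate(pmf)) - mean_from_pmf**2

print(round(mean, 2), round(variance, 2), round(std_dev, 2))
print(round(mean_from_pmf, 2), round(var_from_pmf, 2))
```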

Page 108: Statistical Methods for Engineering

School of Engineering

E214 Statistical Methods for Engineering

(11)

Today’s Problem

Let X be the random variable representing the number of underweight

fillets. Assuming probability of the fillet being underweight is the same

for all packets and the result of each weighing is independent,

X ~ binomial (40, 16/100)

Average number of underweight fillets in a sample of 40 is

E(X) = 0.16*40 = 6.4

The variance is Var(X) = 5.38 and Standard Deviation = 2.32. This

means that most of the checks should yield between 4 and 9

underweight packets.

Using Excel, work out the probability of each x (x is from 0 to 40) using:

BINOMDIST(x, 40, 16/100, 0)

Page 109: Statistical Methods for Engineering

School of Engineering

E214 Statistical Methods for Engineering

(12)

Today’s Problem

From the graph, it is seen that the probability of getting exactly 8 packets of underweight fillets from the check is 0.125. However, we cannot base our decision on this probability alone.

Page 110: Statistical Methods for Engineering

School of Engineering

E214 Statistical Methods for Engineering

(13)

Today’s Problem

It is more useful to determine the cumulative probability in setting the

acceptance criterion. For example, if the acceptance criterion is 9 or

less, then it means that the probability of getting more than 9 packets of

underweight fillets is 1-0.9= 0.1, which is unlikely.

x P(X<=x)

0 0.00

1 0.01

2 0.03

3 0.10

4 0.21

5 0.37

6 0.54

7 0.70

8 0.82

9 0.90

10 0.95

11 0.98

12 0.99

13 1.00
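The cumulative table can be regenerated outside Excel. In the sketch below, `acceptance_limit` is a hypothetical helper that returns the smallest count c for which P(X > c) does not exceed a chosen risk; the 10% risk level mirrors the example above and is not a recommendation.

```python
from math import comb


def binomial_cdf(x, n, p):
    """P(X <= x) for X ~ B(n, p)."""
    return sum(comb(n, r) * p**r * (1 - p) ** (n - r) for r in range(x + 1))


def acceptance_limit(n, p, risk):
    """Smallest c such that P(X > c) <= risk."""
    for c in range(n + 1):
        if 1 - binomial_cdf(c, n, p) <= risk:
            return c
    return n


n, p = 40, 16 / 100
table = {x: binomial_cdf(x, n, p) for x in range(14)}

for x, cum in table.items():
    print(x, f"{cum:.2f}")
print("acceptance limit at 10% risk:", acceptance_limit(n, p, 0.10))
```

Tightening the risk argument raises the limit, so fewer good batches are rejected by chance at the cost of accepting more underweight packets.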

Page 111: Statistical Methods for Engineering

School of Engineering

E214 Statistical Methods for Engineering

(14)

Learning Outcomes

Binomial Distribution

– Properties

– Probabilities

– Mean

– Variance

Page 112: Statistical Methods for Engineering

SCHOOL OF ENGINEERING

DIPLOMA IN INDUSTRIAL & OPERATIONS MANAGEMENT

DIPLOMA IN SUPPLY CHAIN MANAGEMENT

DIPLOMA IN CIVIL AVIATION

P07 – ENOUGH AMBULANCES?

E214 : STATISTICAL METHODS FOR ENGINEERING


Page 113: Statistical Methods for Engineering

SCHOOL OF

ENGINEERING

Page 2 of 2

Enough Ambulances?

A Straits Times article on Jan 28, 2009 reported that the Singapore Civil Defence Force (SCDF) is planning to open up its emergency ambulance service to the private sector. This is in response to the growing number of emergency calls, which is expected to increase further with a growing and aging population. SCDF wants to add ten more private ambulances to its current fleet of forty emergency ambulances. According to the report, a total of 111,127 emergency calls were made to SCDF last year, 9 per cent more than the calls received in 2007. The number of prank calls to the emergency lines also went up to at least 11 calls per day. Being a statistics student, you are naturally curious about the numbers. Doing a quick mental calculation, you worked out that the average number of calls made daily is 316 including prank calls. However, knowing that emergencies occur randomly, you wonder if adding ten more ambulances will be sufficient for SCDF’s needs. How do you think the statistical nature of the problem is considered in the planning? Assuming that your team is a consultant to SCDF, conduct a study based on the numbers given, make reasonable assumptions and present your findings.

Page 114: Statistical Methods for Engineering

School of Engineering

E214 Statistical Methods for Engineering

P07 – Enough Ambulances?

Page 115: Statistical Methods for Engineering

School of Engineering

E214 Statistical Methods for Engineering

(2)

Recall: Probability Distribution

• A probability distribution lists all the outcomes of an experiment and the probabilities associated with each outcome.

• It describes the likelihood of some future event.

• Two important characteristics of a probability distribution are:

– The probability of a particular outcome is between 0 and 1, inclusive.

– The sum of the probabilities of all mutually exclusive events is 1.0.

Page 116: Statistical Methods for Engineering

School of Engineering

E214 Statistical Methods for Engineering

(3)

Recall: Random Variable

• In any experiment of chance, the outcomes

occur randomly. These quantities are called

Random Variables.

• Random Variables can be Discrete or

Continuous.

– Discrete random variables can assume only certain

clearly separated values (countable).

– Continuous random variables can assume one of an

infinitely large number of values (measurable)

Page 117: Statistical Methods for Engineering

School of Engineering

E214 Statistical Methods for Engineering

(4)

Poisson Distribution

• Poisson probability distribution describes the number of times some event occurs during a specified interval.

• Interval may be time, distance, area, or volume.

• Poisson distribution is based on two assumptions:

– Probability of a “success” is proportional to the length of the interval

– Intervals are independent

• The longer the interval, the larger the probability and the number of occurrences in one interval does not affect the other intervals

• It is a discrete probability distribution because it is formed by counting.

Page 118: Statistical Methods for Engineering

School of Engineering

E214 Statistical Methods for Engineering

(5)

Poisson Random Variable

Examples of Poisson Random Variables:

• Number of people who arrived at a hospital emergency

room in 1-hour interval

• No of customers queuing up at a POSB bank counter

• Number of flaws (cracks and deep scratches) in an area

of ceramic flooring in a newly built HDB flat

Interval

In a Poisson process, events

occur at random in an interval

Page 119: Statistical Methods for Engineering

School of Engineering

E214 Statistical Methods for Engineering

(6)

Poisson Distribution

The Poisson Distribution is given by the formula:

$$P(x) = \frac{\lambda^x e^{-\lambda}}{x!} \quad \text{for } x = 0, 1, 2, 3, \ldots$$

Where:

λ is the mean number of occurrences (successes) in a particular interval

x is the number of occurrences (successes)

e is the constant 2.71828 (base of the natural logarithm)

P(x) is the probability for a specified value of x

When X is a Poisson variable, we write

X ~ Poisson(λ), or,

X ~ Po(λ)

Page 120: Statistical Methods for Engineering

School of Engineering

E214 Statistical Methods for Engineering

(7)

Characteristics of Poisson Distribution

Expected Value:

E(X) = λ

Variance: for a binomial distribution with λ = np,

Var(X) = np(1 – p) = λ(1 – p)

As p tends to zero, Var(X) tends to λ

Poisson Distribution has the same Expected Value and Variance.

Page 121: Statistical Methods for Engineering

School of Engineering

E214 Statistical Methods for Engineering

(8)

Approximation of Poisson Distribution

• Poisson Distribution can be used to approximate binomial distribution B(n,p) when n is large and p is small

• 2 general rules-of-thumb:

– n≥20 and p≤0.1 or

– n≥100 and np≤10
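To see how close the approximation is, the sketch below compares the two pdfs for one case that satisfies the second rule of thumb. The values n = 100 and p = 0.05 are illustrative assumptions, not taken from the slides.

```python
from math import comb, exp, factorial


def binomial_pmf(r, n, p):
    return comb(n, r) * p**r * (1 - p) ** (n - r)


def poisson_pmf(x, lam):
    return lam**x * exp(-lam) / factorial(x)


n, p = 100, 0.05      # hypothetical case: n >= 100 and np <= 10
lam = n * p           # matching Poisson mean

gaps = [abs(binomial_pmf(x, n, p) - poisson_pmf(x, lam)) for x in range(n + 1)]
max_gap = max(gaps)

for x in range(9):
    print(x, round(binomial_pmf(x, n, p), 4), round(poisson_pmf(x, lam), 4))
print("largest difference:", round(max_gap, 4))
```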

Page 122: Statistical Methods for Engineering

School of Engineering

E214 Statistical Methods for Engineering

(9)

Example

Given λ = 5 cars arriving in a 5-minute period,

• Probability of 8 cars arriving in a 5-minute period,

$P(X=8) = \dfrac{5^8 e^{-5}}{8!} = 0.065$

• Probability of more than 6 cars arriving in 5 minute period,

P(X>6) = 1-P(X<=6) = 0.24

[Using Excel, 1- Poisson(6,5,1)]

• Mean number of cars arriving in 1 hour = 12 x 5 = 60
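The same figures can be computed without Excel. This is a minimal sketch with a mean of 5 cars per 5-minute period; the function names are illustrative.

```python
from math import exp, factorial


def poisson_pmf(x, lam):
    """P(X = x) for X ~ Po(lam)."""
    return lam**x * exp(-lam) / factorial(x)


def poisson_cdf(x, lam):
    """P(X <= x), the cumulative probability."""
    return sum(poisson_pmf(k, lam) for k in range(x + 1))


lam = 5
p_eight = poisson_pmf(8, lam)               # exactly 8 cars
p_more_than_six = 1 - poisson_cdf(6, lam)   # more than 6 cars
mean_per_hour = 12 * lam                    # twelve 5-minute periods in an hour

print(round(p_eight, 3), round(p_more_than_six, 2), mean_per_hour)
```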

Page 123: Statistical Methods for Engineering

School of Engineering

E214 Statistical Methods for Engineering

(10)

Example

[Figure: Poisson probability distribution (λ = 5); horizontal axis from 1 to 15, vertical axis from 0 to 0.2]

Page 124: Statistical Methods for Engineering

School of Engineering

E214 Statistical Methods for Engineering

(11)

Today’s Problem

• 111,127 calls were received in 2008. Projecting a 10% increase, the number of calls in 2009 would be 122,240.

• Average no. of calls in a day is 335+12(prank calls)= 347

• Average calls per hour is 14.5

• A few assumptions are required:

– The average duration an ambulance is engaged during a call is 1 hour

– The distribution of calls throughout the day is not uniform. Assume that there is a peak hour each day and that the number of calls during the peak hour is 2 times the average no. of calls, i.e. 29

– All 40 ambulances are available at all times (together with the associated manpower and equipment)

Page 125: Statistical Methods for Engineering

School of Engineering

E214 Statistical Methods for Engineering

(12)

Today’s Problem

• Is the number of ambulances sufficient? To answer this question, we have to calculate the probability that there will be more than 40 calls made in 1 hour:

$$P(X > 40) = 1 - P(X \le 40)$$

• From Excel, Poisson(40,29,1) = 0.979

• Hence there is approximately a 2% chance of running out of ambulances

• This may seem low, but if we assume that the peak hour occurs every day, then in one year there are more than 7 incidents where there is a shortage of ambulances. It could mean 7 lives lost!

Page 126: Statistical Methods for Engineering

School of Engineering

E214 Statistical Methods for Engineering

(13)

Today’s Problem

• If 10 ambulances are added, from Excel, Poisson(50,29,1) = 0.9999. There is almost zero chance of shortage.

• In this case, will there be too many ambulances? What percent risk is acceptable?

• Do bear in mind this is a statistical exercise (see footnote). Other factors that should be considered include the availability of manpower and equipment, the reliability of vehicle, the response time requirement, the likelihood of a disease outbreak, cost involved, etc. Can you think of any others?
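The shortage calculation for 40 and 50 ambulances can be sketched as follows, under the same assumptions as the slides (a peak-hour mean of 29 calls, each call occupying one ambulance for one hour). The names `PEAK_CALLS_PER_HOUR` and `shortage_probability` are illustrative.

```python
from math import exp, factorial

PEAK_CALLS_PER_HOUR = 29   # assumed peak-hour mean: 2 x 14.5 calls per hour


def poisson_cdf(x, lam):
    """P(X <= x) for X ~ Po(lam)."""
    return sum(lam**k * exp(-lam) / factorial(k) for k in range(x + 1))


def shortage_probability(ambulances, lam=PEAK_CALLS_PER_HOUR):
    """P(more calls in the peak hour than there are ambulances)."""
    return 1 - poisson_cdf(ambulances, lam)


p_40 = shortage_probability(40)
p_50 = shortage_probability(50)
shortage_days_per_year = 365 * p_40   # one peak hour assumed every day

print(round(p_40, 3), round(p_50, 4), round(shortage_days_per_year, 1))
```

Looping `shortage_probability` over fleet sizes between 40 and 50 gives one way to explore what level of risk each additional ambulance buys.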

Page 127: Statistical Methods for Engineering

School of Engineering

E214 Statistical Methods for Engineering

(14)

Learning Outcomes

• Poisson Probability Distribution

• Poisson Random Variables

• Characteristics of a Poisson Distribution

Page 128: Statistical Methods for Engineering

SCHOOL OF ENGINEERING

DIPLOMA IN INDUSTRIAL & OPERATIONS MANAGEMENT

DIPLOMA IN SUPPLY CHAIN MANAGEMENT

DIPLOMA IN CIVIL AVIATION

P08 – OF PISTONS AND CYLINDERS

E214 : STATISTICAL METHODS FOR ENGINEERING


Page 129: Statistical Methods for Engineering

SCHOOL OF

ENGINEERING

Page 2 of 2

Of Pistons and Cylinders

You are an engineer working for an engine manufacturer. Your company has received a few complaints from customers about a recently launched engine. The engine sometimes does not perform to specifications and fails to deliver the stated torque. An investigation reveals that the cause is sub-performance of the main piston. In order to perform optimally, strict specifications require that the gap between the piston and cylinder be between 0.12 and 0.40mm. After eliminating the likelihood of problems in the assembly process, you turn your attention to the dimensions of the piston and cylinder. A request is made to the supplier of the components for the exact diameter measurements of all the pistons and cylinders delivered so far.

Data.xls

What can you conclude from the data? Suppose the supplier offers a new type of piston that is touted to deliver better performance. Your colleague John asked you to conduct a sampling check on the diameter of the new pistons. You need to measure the diameters for 10 pistons and determine the mean diameter. John said that if the mean falls within one standard deviation (of the mean of the old pistons), the new pistons should be accepted. Is John’s approach correct?

Page 130: Statistical Methods for Engineering

School of Engineering

E214 Statistical Methods for Engineering

P08 – Of Pistons and Cylinders

Page 131: Statistical Methods for Engineering

School of Engineering

E214 Statistical Methods for Engineering

(2)

Why Study Normal Distribution?

• Certain probability densities have so much importance

in statistics that areas under the curve have been

tabulated for future reference.

• One such distribution is the Normal, or Bell-shaped,

Distribution.

• This distribution is useful for describing variability in

industrial measurements such as lengths or weights.

• Natural variation in living organisms and their

characteristics also tend to follow a Normal

Distribution.

Page 132: Statistical Methods for Engineering

School of Engineering

E214 Statistical Methods for Engineering

(3)

Properties of Normal Distribution Curve

• The normal distribution curve is bell-shaped.

• The mean, median and mode are equal and located at the centre of the

distribution.

• The curve is symmetrical about the mean.

• The standard deviation (σ) specifies the amount of dispersion around the mean.

• Two parameters, μ and σ, completely define a normal distribution curve.

• The further away from the mean the curve moves, the closer it gets to the x-axis, but it never touches it.

• The curve is represented by the formula:

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$

Where μ = mean, σ = std deviation, e = 2.718282, -∞ < x < ∞

Page 133: Statistical Methods for Engineering

School of Engineering

E214 Statistical Methods for Engineering

(4)

Histogram for blood pressure measurements for sample of 118 men

[Figure: histogram; horizontal axis: Systolic BP (mmHg), 80 to 160; vertical axis: Percentage of Men]

Page 134: Statistical Methods for Engineering

School of Engineering

E214 Statistical Methods for Engineering

(5)

Histogram for blood pressure measurements for sample of 5000 men

[Figure: histogram; horizontal axis: Blood Pressure (mmHg), 80 to 140; vertical axis: Percentage of Men]


Examples of Normal Distribution Curves

[Figure: three panels of normal distribution curves]

– Typical normal distribution with mean = 5 and variance = 1

– Two normal distributions with different mean values and the same variance

– Two normal distributions with different variances and the same mean


The Normal Distribution

• Each normal density curve is completely defined by two parameters:

– mean (average), represented by μ, and

– standard deviation, represented by σ.


Interpreting the Normal Curve

[Figure: normal curve with the region between a and b shaded]

• Probability = area under the curve = shaded region

• P(a < X < b) = area under the curve between a and b

• The area under the curve is obtained using NORMDIST in Excel or, in the case of a Standard Normal Distribution, NORMSDIST.
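For readers working outside Excel, the same area can be sketched with Python's standard library. This is only an illustration: the helper name `prob_between` and the numbers passed to it are assumptions made for the example, not values from the slides. `NormalDist.cdf` plays the role of NORMDIST with the cumulative flag set to TRUE.

```python
from statistics import NormalDist

def prob_between(a, b, mu=0.0, sigma=1.0):
    """P(a < X < b) for X ~ N(mu, sigma^2), as a difference of two cumulative areas."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(b) - dist.cdf(a)

# Hypothetical numbers, chosen only for illustration
p_general = prob_between(4.0, 6.0, mu=5.0, sigma=1.0)  # within one s.d. of the mean
p_standard = prob_between(-1.0, 1.0)                   # the same area on the Z scale

print(round(p_general, 4), round(p_standard, 4))
```

Both calls describe the same shaded region, once on the original scale and once after standardising, so they return the same probability.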


Standard Normal Distribution

• The Standard Normal Distribution is a normal distribution with mean 0 and variance 1

• It is represented by the standard normal variable Z where

Z = (X - µ) / σ


Linear Combinations of Normal R.V.

• Linear combinations of normal random variables are also normally distributed

Linear Functions

• If $X \sim N(\mu, \sigma^2)$ and a and b are constants, then

$Y = aX + b \sim N(a\mu + b,\ a^2\sigma^2)$

Sum of Two Independent Normal R.V.

• If $X_1 \sim N(\mu_1, \sigma_1^2)$ and $X_2 \sim N(\mu_2, \sigma_2^2)$ are independent r.v., then

$Y = X_1 + X_2 \sim N(\mu_1 + \mu_2,\ \sigma_1^2 + \sigma_2^2)$


Sampling

Sampling is the process of selecting a subset of data from the population.

Reasons for Sampling:

• Time constraints

• Cost constraints

• Impossibility of a census

– Population is infinite

– Measuring process is destructive


Sampling Distribution of Sample Mean

Take k samples, each of size n, and calculate the sample mean for each sample. Using these sample means, a distribution known as the sampling distribution of the mean can be obtained.

| Sample | Observations | Sample Mean |
|---|---|---|
| 1 | X1,1 X1,2 …… X1,n | x̄1 |
| 2 | X2,1 X2,2 …… X2,n | x̄2 |
| … | … | … |
| k | Xk,1 Xk,2 …… Xk,n | x̄k |

Essentially, a sampling distribution is the distribution of values for a sample statistic obtained from repeated samples, all of the same size and all drawn from the same population.


Central Limit Theorem

Let X1, X2, …, Xn denote a random sample selected from a population having mean µ and variance σ².

The Central Limit Theorem states that as the sample size n increases (rule of thumb: n ≥ 30), the sampling distribution of the sample mean x̄ will:

1. Have a mean $\mu_{\bar{x}} = \mu$

2. Have a standard deviation $\sigma_{\bar{x}} = \sigma / \sqrt{n}$

3. Be approximately normally distributed

• The sampling distribution of x̄ is normal if the population is normally distributed. For other types of population, it approximates a normal distribution when n is large (rule of thumb, n ≥ 30).

• The standard deviation of the sample mean is known as the standard error of the sample mean. It indicates how accurately the sample mean estimates the 'true' mean.
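The behaviour described by the Central Limit Theorem can be checked with a small simulation. The sketch below is illustrative only: the exponential population, the seed and the number of repeated samples are assumptions chosen for the example, not part of the problem.

```python
import random
from math import sqrt
from statistics import mean, stdev

random.seed(214)  # fixed seed so the run is repeatable

n, k = 30, 5000            # sample size and number of repeated samples (assumed values)
pop_mean = pop_sd = 1.0    # an exponential population with rate 1 has mean 1 and s.d. 1

# Draw k samples of size n from a clearly non-normal population and keep each sample mean
sample_means = [sum(random.expovariate(1.0) for _ in range(n)) / n for _ in range(k)]

observed_mean = mean(sample_means)   # should sit close to the population mean
observed_se = stdev(sample_means)    # should sit close to sigma / sqrt(n)
expected_se = pop_sd / sqrt(n)

print(round(observed_mean, 3), round(observed_se, 3), round(expected_se, 3))
```

Even though the population is strongly skewed, the spread of the sample means lands close to σ/√n, which is the standard error named in the second point of the theorem.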


Central Limit Theorem (CLT)

[Figure: grid of distribution plots. Columns: Normal, Uniform and Exponential populations. Rows: Population Distribution, Sample of n=2, Sample of n=5, Sample of n=30]

Significance of CLT

It permits us to use sample statistics to make inferences about the population parameters without knowing anything about the specific shape of the population distribution.


Today’s Problem

• Let the gap between piston and cylinder be

$Y = X_2 - X_1$

• It follows that Y is normally distributed, as it is a linear combination of the normal random variables X1 and X2.

• Mean: $\mu_Y = \mu_{X_2} - \mu_{X_1} = 25.26 - 25.00 = 0.26$

• Variance: $\sigma_Y^2 = \sigma_{X_2}^2 + \sigma_{X_1}^2 = 0.06^2 + 0.08^2 = 0.0094$

• Hence $Y \sim N(0.26,\ 0.0094)$


Today’s Problem

• A piston will not fit in a cylinder when the gap is less than zero. The probability of this is

$P(Y<0) = P\left(Z < \frac{0-\mu_Y}{\sigma_Y}\right) = P\left(Z < \frac{0-0.26}{\sqrt{0.0094}}\right) = P(Z < -2.68) = 0.0037$

• A piston performs optimally when the gap is between 0.12 mm and 0.40 mm. The probability of this is

$P(0.12<Y<0.40) = P\left(\frac{0.12-\mu_Y}{\sigma_Y} < Z < \frac{0.40-\mu_Y}{\sigma_Y}\right)$

$= P\left(\frac{0.12-0.26}{\sqrt{0.0094}} < Z < \frac{0.40-0.26}{\sqrt{0.0094}}\right)$

$= P(Z<1.44) - P(Z<-1.44) = 0.851$
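The two probabilities can be cross-checked numerically. The sketch below takes the gap distribution Y ~ N(0.26, 0.0094) exactly as stated on the slide and uses Python's standard library in place of the Z table; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Gap distribution as stated on the slide: mean 0.26, variance 0.0094
gap = NormalDist(mu=0.26, sigma=sqrt(0.0094))

p_no_fit = gap.cdf(0.0)                    # P(Y < 0): piston does not fit
p_optimal = gap.cdf(0.40) - gap.cdf(0.12)  # P(0.12 < Y < 0.40): optimal gap

print(round(p_no_fit, 4), round(p_optimal, 3))
```

Working directly with the distribution of Y avoids the intermediate rounding of z to two decimal places, so the results may differ from table lookups in the last digit.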


Today’s Problem

• Assuming that the diameters of the new pistons have the same distribution as the old ones, the distribution of the sample mean can be written as:

$\bar{X} \sim N(\mu,\ \sigma^2/10)$

• If the underlying distribution of the new piston diameters is unknown or cannot be assumed to be normal, then we can increase the sample size to 30 or more pistons so that, by the Central Limit Theorem, the sample mean is approximately normally distributed.


Today’s Problem

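• The interval [24.94, 25.06] for the piston diameter represents one standard deviation on either side of the mean.

• The probability that the sample mean lies within one s.d. of the mean is

$P(\mu-\sigma < \bar{X} < \mu+\sigma) = P\left(\frac{(\mu-\sigma)-\mu}{\sqrt{\sigma^2/10}} < Z < \frac{(\mu+\sigma)-\mu}{\sqrt{\sigma^2/10}}\right)$

$= P\left(Z < \frac{0.06}{\sqrt{0.06^2/10}}\right) - P\left(Z < \frac{-0.06}{\sqrt{0.06^2/10}}\right)$

$= P(Z<3.16) - P(Z<-3.16)$

$= 0.9984$

[Figure: the population distribution and the narrower distribution of the sample mean x̄, with the 1 S.D. interval marked]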
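A short numerical sketch makes the contrast between a single piston and the mean of ten pistons concrete. It uses the piston mean of 25.00 and standard deviation of 0.06 that appear on the slides; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 25.00, 0.06, 10   # piston mean, s.d. and sample size from the slides

single = NormalDist(mu, sigma)               # diameter of one piston
mean_of_n = NormalDist(mu, sigma / sqrt(n))  # mean diameter of n pistons

lo, hi = mu - sigma, mu + sigma              # the interval [24.94, 25.06]
p_single = single.cdf(hi) - single.cdf(lo)
p_mean = mean_of_n.cdf(hi) - mean_of_n.cdf(lo)

print(round(p_single, 4), round(p_mean, 4))
```

The interval is the same in both cases; only the standard deviation changes, from σ for one piston to σ/√n for the sample mean.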


Today’s Problem

• Although there is a 68% chance that the diameter of an individual old piston lies within one standard deviation of the mean, the average diameter of the ten new pistons has a 99.8% chance of falling within the same interval if both have the same distribution.

• The acceptance criteria should be based on comparison with the population mean and not the distribution of the population.

• Hypothesis testing of the mean should be conducted to check whether the mean diameter of the new pistons is the same as that of the old ones.


Learning Outcomes

• Properties of a Normal Distribution Curve

• Standard Normal Distribution

– Standard Normal Variable z

– Applications of the Standard Normal Distribution Table

• Central Limit Theorem

• Sampling Distribution of Sample Mean


P09 – CASE OR NO CASE


Case or No Case

A consumer product company, A & B, has been producing its popular foam facial wash for the past 10 years. A & B has always been proud of its ability to provide consumers with an exciting foam height per pump of its liquid-to-foam facial foam. In a recent advertising campaign, A & B's endorsing artiste mentioned on national TV that the average foam height is 75 millimeters. Foam height is approximately normally distributed and has a standard deviation of 5 millimeters. Vivian, an avid blogger, was intrigued by A & B's claim. She decided to conduct her own experiment to test whether the average foam height is in fact 75 millimeters. Vivian obtained 50 foam height measurements, attached below:

P09_Foam Height Data_AllTeams.xlsx

Vivian analyzed the data and concluded that she can reject A & B's claim, so she wrote about her findings in her blog. Vivian wondered whether her analysis would be affected if she did not know the standard deviation, and whether a sample size can be estimated for a predefined error. Is Vivian's conclusion valid? How will you help to address Vivian's doubts?

Illustrative Figure on foam height


P09

Case or No Case


Statistical Hypotheses

• Many problems in daily life require that we decide whether to accept or reject a statement about some parameter

• The statement is called a hypothesis, and the decision-making procedure about the hypothesis is called hypothesis testing.

• A hypothesis is thus a claim or statement about a property of a population.


Terms used

• Significance Level is the probability of making a decision to reject the null hypothesis when the null hypothesis is actually true.

• Critical Region is the set of values for which we reject the null hypothesis.

• Critical Values mark the boundary between rejecting and not rejecting the null hypothesis.


Forming the Hypothesis

• Manufacturer's claim

– The average foam height per pump is 75 mm.

– This claim is commonly referred to as the null hypothesis, H0.

– The null hypothesis is presumed true unless we have enough evidence to reject it.

• Blogger’s suspicion

– The average foam height per pump is NOT 75 mm.

– This is commonly referred to as the alternative hypothesis, H1.


Null and Alternative Hypothesis

• Null Hypothesis

– The null hypothesis is a statement of the value of a population parameter.

– Here it states that the population mean equals the claimed value:

H0 : μ = 75

• Alternative Hypothesis

– The alternative hypothesis (denoted by H1) is the statement that must be true if the null hypothesis is false.

H1 : μ ≠ 75

– This is a two-tailed test.


One – Tailed and Two – Tailed Test

• Example:

| Hypothesis Testing Problem | Null and Alternative Hypothesis |
|---|---|
| Mean burn rate is not 50 cm/s | H0 : μ = 50; H1 : μ ≠ 50 |
| Mean burn rate is less than 50 cm/s | H0 : μ = 50; H1 : μ < 50 |
| Mean burn rate is more than 50 cm/s | H0 : μ = 50; H1 : μ > 50 |


One – Tailed Test

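• One-Tailed Test

– In such a test, the critical region lies in the direction of the inequality in the alternative hypothesis (i.e. < places it in the left tail, > places it in the right tail).

| Null and Alternative Hypothesis | Acceptance Region |
|---|---|
| H0 : μ = 50; H1 : μ < 50 | [Figure: critical region in the left tail] |
| H0 : μ = 175; H1 : μ > 175 | [Figure: critical region in the right tail] |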


Two – Tailed Test

• Two-Tailed Test

– In such a test, the critical region is split into two parts, with (usually) equal probability placed in each tail of the distribution of the test statistic.

| Null and Alternative Hypothesis | Acceptance Region |
|---|---|
| H0 : μ = 50; H1 : μ ≠ 50 | [Figure: critical region split between the two tails] |


Test Statistic

• Hypothesis Tests on Mean

– When conducting hypothesis testing on the mean of a normally distributed population, the variance can be either known or unknown, resulting in a different estimated sampling distribution.

| Variance Known? | Sampling Distribution | Test | Test Statistic |
|---|---|---|---|
| Known (or large sample size) | Normal distribution | Z-Test | $z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$ |
| Unknown | 1 Sample t-distribution (n-1 degrees of freedom) | 1 Sample t-Test | $t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$ |


Critical Region and Value

• The critical region is the set of all values of the test statistic that would cause rejection of the null hypothesis.

• The critical value is the value separating the critical region from the values of the test statistic that would not lead to rejection of the null hypothesis.

| Significance Level | One-Tailed Test | Two-Tailed Test |
|---|---|---|
| 5% (95% Confidence Level) | -1.645 | -1.96, 1.96 |


Critical Region and Value

| Significance Level | One-Tailed Test |
|---|---|
| 5% (95% Confidence Level) | -1.645 |

= NORMSINV(0.05) = -1.645


Critical Region and Value

| Significance Level | Two-Tailed Test |
|---|---|
| 5% (95% Confidence Level) | -1.96, 1.96 (α/2 = 0.025 in each tail) |

= NORMSINV(0.025) = -1.96

= NORMSINV(0.975) = 1.96
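The same critical values can be obtained outside Excel. As a minimal sketch, `NormalDist().inv_cdf` from Python's standard library serves the same purpose as NORMSINV; the variable names are illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()   # mean 0, standard deviation 1
alpha = 0.05                # 5% significance level

z_left = std_normal.inv_cdf(alpha)           # one-tailed (left), like NORMSINV(0.05)
z_lower = std_normal.inv_cdf(alpha / 2)      # two-tailed lower, like NORMSINV(0.025)
z_upper = std_normal.inv_cdf(1 - alpha / 2)  # two-tailed upper, like NORMSINV(0.975)

print(round(z_left, 3), round(z_lower, 2), round(z_upper, 2))
```

For a right-tailed test the critical value is the mirror image of `z_left`, since the standard normal curve is symmetrical about zero.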


Test Statistic: Example 1

• Suppose we are interested in the burn rate of a solid propellant used to power aircrew escape systems. It has been claimed that the mean burn rate is 50 cm/s, and we wish to test whether the mean burn rate is not 50 cm/s.

– Given: σ = 2.5 cm/s; n = 50 (large sample size, thus Normal sampling distribution); x̄ = 50.25 cm/s; 95% Confidence Level

Sol:

H0: µ = 50 cm/s

H1: µ ≠ 50 cm/s

Critical values at the 95% confidence level: -1.96 and 1.96

Normal Sampling Distribution:

$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \frac{50.25 - 50}{2.5 / \sqrt{50}} = 0.707$

Since z = 0.707 is in the acceptance region, we will not reject H0.


Test Statistic: Example 2

• Suppose we are interested in the burn rate of a solid propellant used to power aircrew escape systems. It has been claimed that the mean burn rate is 50 cm/s, and we wish to test whether the mean burn rate is not 50 cm/s.

– Given: σ = 2.5 cm/s; n = 50 (large sample size, thus Normal sampling distribution); x̄ = 46.55 cm/s; 95% Confidence Level

Sol:

H0: µ = 50 cm/s

H1: µ ≠ 50 cm/s

Critical values at the 95% confidence level: -1.96 and 1.96

Normal Sampling Distribution:

$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \frac{46.55 - 50}{2.5 / \sqrt{50}} = -9.76$

Since z = -9.76 is NOT in the acceptance region, we will reject H0.
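Both burn-rate examples follow the same recipe, so they can be wrapped in one small function. This is a sketch only: the function name `z_test_two_tailed` is made up for the illustration, while the inputs are the figures quoted in Examples 1 and 2.

```python
from math import sqrt
from statistics import NormalDist

def z_test_two_tailed(x_bar, mu0, sigma, n, alpha=0.05):
    """Return (z, critical value, reject H0?) for H0: mu = mu0 against H1: mu != mu0."""
    z = (x_bar - mu0) / (sigma / sqrt(n))
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return z, z_crit, abs(z) > z_crit

z1, crit, reject1 = z_test_two_tailed(50.25, 50, 2.5, 50)  # Example 1
z2, _, reject2 = z_test_two_tailed(46.55, 50, 2.5, 50)     # Example 2

print(round(z1, 3), reject1)
print(round(z2, 2), reject2)
```

The only difference between the two calls is the sample mean, which is enough to move the test statistic from inside the acceptance region to far outside it.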


Proposed Solution (Known Variance)

• Hypothesis:

– H0: µ = 75 mm

– H1: µ ≠ 75 mm

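Critical values at the 95% confidence level: -1.96 and 1.96

Normal Sampling Distribution:

$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \frac{80.22 - 75}{5 / \sqrt{50}} = 7.382$

Since z = 7.382 is NOT in the acceptance region, we will reject H0.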


Proposed Solution (Unknown Variance)

• Hypothesis:

– H0: µ = 75 mm

– H1: µ ≠ 75 mm

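At the 95% confidence level with n-1 = 49 degrees of freedom, the critical values are -2.01 and 2.01. These can also be obtained with the Excel function TINV(0.05,49), which takes the two-tailed probability.

1 Sample t-Test Sampling Distribution:

$t = \frac{\bar{x} - \mu}{s / \sqrt{n}} = \frac{80.22 - 75}{5.877 / \sqrt{50}} = 6.28$

Since t = 6.28 is NOT in the acceptance region, we will reject H0.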
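The known-variance and unknown-variance solutions differ only in which standard deviation goes into the denominator. The sketch below recomputes both statistics from the summary figures on the slides (sample mean 80.22, σ = 5, s = 5.877, n = 50) and compares them with the critical values quoted there. The t critical value is hard-coded from the slide because the Python standard library has no t-distribution.

```python
from math import sqrt

n, x_bar, mu0 = 50, 80.22, 75.0   # sample size, sample mean, claimed mean
sigma, s = 5.0, 5.877             # known population s.d. and sample s.d.

z = (x_bar - mu0) / (sigma / sqrt(n))   # Z-test, variance known
t = (x_bar - mu0) / (s / sqrt(n))       # 1 sample t-test, 49 degrees of freedom

reject_z = abs(z) > 1.96   # two-tailed critical value quoted on the slide
reject_t = abs(t) > 2.01   # two-tailed critical value quoted on the slide

print(round(z, 3), reject_z)
print(round(t, 2), reject_t)
```

Both tests reject H0 here, so not knowing the standard deviation would not have changed Vivian's conclusion for this data set.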


Proposed Solution (Estimate Sample Size)

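• Given: α = 0.1, E = 1.5 mm, σ = 5 mm

$n = \left(\frac{z_{\alpha/2}\,\sigma}{E}\right)^2 = \left(\frac{(1.645)(5)}{1.5}\right)^2 \approx 30.07$

Rounding up to the next whole number gives n = 31.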
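The sample size estimate is easy to reproduce. In this sketch the function name `sample_size_for_mean` is made up for the illustration; the inputs are the α, E and σ given above.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_mean(sigma, error, alpha):
    """Smallest n for which the margin of error z_(alpha/2) * sigma / sqrt(n) is at most `error`."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return ceil((z * sigma / error) ** 2)

n_required = sample_size_for_mean(sigma=5.0, error=1.5, alpha=0.1)
print(n_required)
```

The result is always rounded up, because rounding down would leave the margin of error slightly larger than the predefined E.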


Proposed Solution (HT Methodology)

1. From the problem, identify the parameter of interest.

2. State the null hypothesis, H0.

3. Specify an appropriate alternative hypothesis, H1.

4. Choose a significance level, α.

5. Determine an appropriate test statistic.

6. State the rejection region for the statistic.

7. Compute any necessary sample quantities, substitute these into the equation for the test statistic, and compute that value.

8. Decide whether or not H0 should be rejected and report in the problem context.


Learning Outcomes

• Hypothesis Testing

– Null and Alternative Hypothesis (One-tailed and Two-tailed)

– Significance Level

– Test Statistic

– Methodology

• Z – Test

• 1 Sample t – Test

• Estimate Sample Size


P10 – CHARGED OR RECHARGE


Charged or Recharge

The advertised claim for ABC batteries for mobile phones is set at 48 operating hours, with proper charging procedures. A study of 5000 batteries is carried out and 7 stop operating prior to 48 hours. Do these experimental results support the claim that less than 0.2 percent of the company’s batteries will fail during the advertised time period, with proper charging procedures?

Perform a hypothesis-testing procedure with α = 0.05, discussing the errors that could arise when a wrong decision is made from the result. Can you also estimate the confidence interval for the proportion of batteries that will fail, based on the experimental results? What is the relationship between the confidence interval and the hypothesis testing outcome?


P10

Charged or Recharge


Proportions

• Proportions provide useful information in summary

• Hypothesis testing can be applied not only to absolute data (such as sample mean), but also for population proportions.


Significance of Proportions


Assumptions

Certain assumptions must be made when testing a claim about a population proportion, probability or percentage:

1. The conditions for a binomial experiment are satisfied.

That is, there are a fixed number of independent trials having constant probabilities, and each trial has only two possible outcomes.

2. The conditions np ≥ 5 and nq ≥ 5 are both satisfied, so that the binomial distribution of sample proportions can be approximated by a normal distribution with $\mu = np$ and $\sigma = \sqrt{npq}$


Terms used

Notations used for hypothesis testing of one proportion:

n = number of trials

$\hat{p} = x/n$ (sample proportion), where x is the number of 'successes' considered

p = population proportion

q = 1 – p

Test Statistic

$Z = \frac{\hat{p} - p}{\sqrt{pq/n}}$


Recall: Steps in Hypothesis Testing

1. From the problem, identify the parameter of interest.

2. State the null hypothesis, H0.

3. Specify an appropriate alternative hypothesis, H1.

4. Choose a significance level, α.

5. Determine an appropriate test statistic.

6. State the rejection region for the statistic.

7. Compute any necessary sample quantities, substitute these into the equation for the test statistic, and compute that value.

8. Decide whether or not H0 should be rejected and report in the problem context.


Proposed Solution

Step 1: The parameter of interest is the proportion of batteries that fail during the advertised period, p.

Step 2: Null hypothesis, H0: p = 0.002

Step 3: Alternative hypothesis, H1: p < 0.002

(This formulation allows the manufacturer to make a strong claim about the proportion of batteries that fail if the null hypothesis, H0: p = 0.002, is rejected.)

Step 4: The chosen significance level is α = 0.05


Proposed Solution

Step 5: The test statistic to be used is:

$Z_0 = \frac{\hat{p} - p}{\sqrt{pq/n}}$

Step 6: Reject H0: p = 0.002 if $Z_0 < -z_{0.05} = -1.645$

Step 7: Compute the test statistic:

$Z_0 = \frac{\hat{p} - p}{\sqrt{pq/n}} = \frac{(7/5000) - 0.002}{\sqrt{0.002 \times 0.998 / 5000}} = -0.950$


Proposed Solution

Step 8: Conclusions:

Since Z0 = -0.95 is not less than -z0.05 = -1.645, we cannot reject H0. We conclude that, at α = 0.05, the manufacturer cannot claim that less than 0.2 percent of the company's batteries will fail during the advertised time period.
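Steps 5 to 8 can be retraced in a few lines of code. The sketch below uses the figures from the problem statement (7 failures in 5000 batteries, p = 0.002 under H0, α = 0.05); the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n, failures, p0, alpha = 5000, 7, 0.002, 0.05

# The normal approximation needs np >= 5 and nq >= 5
assumptions_ok = n * p0 >= 5 and n * (1 - p0) >= 5

p_hat = failures / n                          # sample proportion
z0 = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)   # test statistic
z_crit = NormalDist().inv_cdf(alpha)          # left-tailed critical value
reject_h0 = z0 < z_crit

print(assumptions_ok, round(z0, 3), round(z_crit, 3), reject_h0)
```

The check on np and nq mirrors the assumptions listed earlier for testing a claim about a proportion.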


Type I (α) and Type II (β) errors

| Decision | H0 is Actually True | H0 is Actually False |
|---|---|---|
| Fail to reject H0 | No error (1-α) | Type II error, β (failing to reject a false null hypothesis) |
| Reject H0 | Type I error, α (rejecting a true null hypothesis) | No error (1-β) |


Type I (α) and Type II (β) errors

• β usually cannot be calculated, as it depends on the actual difference between the hypothesized value of the parameter and the true value (we don't know the true value!).

• 1-β is also known as the Power of a Test. It measures the sensitivity of the test in detecting a real difference in parameters if one actually exists.

• A larger α results in a smaller β, and a smaller α results in a larger β.

• To increase the Power, either increase the value of α and/or increase the sample size (which would reduce β as well). This would narrow the confidence interval of the sample parameter and increase the 'precision' of the experiment.


Type I and Type II errors: Example

Suppose that you are a lawyer trying to establish that a company has been unfair to workers above 50 years old with regard to salary increases. Suppose the mean salary increase per year is 8%.

H0: µ = 0.08 ; H1: µ < 0.08

| Decision | H0 is Actually True | H0 is Actually False |
|---|---|---|
| Fail to reject H0 (decide that the company is NOT unfair) | No error (1-α) | Type II error, β (not suing the company when they are ACTUALLY unfair) |
| Reject H0 (decide that the company is unfair) | Type I error, α (suing the company when they are NOT unfair) | No error (1-β) |


Type I and Type II errors: Problem

The advertised claim for ABC batteries for mobile phones is set at 48 operating hours. The manufacturer claims that less than 0.2 percent of the company's batteries will fail.

H0: p = 0.002 ; H1: p < 0.002

| Decision | H0 is Actually True | H0 is Actually False |
|---|---|---|
| Fail to reject H0 (decide that NOT less than 0.2% of batteries fail) | No error (1-α) | Type II error, β (not accepting that less than 0.2% of batteries fail when it is true) |
| Reject H0 (decide that less than 0.2% of batteries fail) | Type I error, α (accepting that less than 0.2% of batteries fail when it is NOT true) | No error (1-β) |


Confidence Interval - Definition

• A confidence interval gives an estimated range of values which is likely to include an unknown population parameter, the estimated range being calculated from a given set of sample data.

• The width of the confidence interval gives us some idea about how uncertain we are about the unknown parameter. A very wide interval may indicate that more data should be collected before anything very definite can be said about the parameter.


Confidence Interval and Hypothesis Testing Outcome

• There is a close relationship between confidence intervals and hypothesis testing.

• Examples:

1. For a 95% confidence interval, all values in the interval are considered plausible values for the parameter being estimated. If the value of the parameter specified by the null hypothesis is contained in the 95% interval, then the null hypothesis cannot be rejected at the 0.05 level.

2. For a 99% confidence interval, values outside the interval are rejected at the 0.01 level.


2-tailed Confidence Interval for Problem Statement

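Based on the sampling of 5000 batteries, the 95% 2-tailed confidence interval for the mean proportion of batteries that will fail the specs is:

$\hat{p} - z_{0.025}\sqrt{\frac{\hat{p}\hat{q}}{n}} \le p \le \hat{p} + z_{0.025}\sqrt{\frac{\hat{p}\hat{q}}{n}}$

$0.0014 - 1.96\sqrt{\frac{0.0014(0.9986)}{5000}} \le p \le 0.0014 + 1.96\sqrt{\frac{0.0014(0.9986)}{5000}}$

$0.00036 \le p \le 0.00244$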


1-tailed Confidence Interval for Problem Statement

The 95% 1-tailed confidence interval is:

$0 \le p \le \hat{p} + z_{0.05}\sqrt{\frac{\hat{p}\hat{q}}{n}}$

$0 \le p \le 0.0014 + 1.65\sqrt{\frac{0.0014(0.9986)}{5000}}$

$0 \le p \le 0.00227$

Since the null hypothesis proportion value of 0.002 lies within the interval, we cannot reject H0 at the 0.05 level of significance.
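Both intervals can be recomputed from the raw counts. The sketch below uses the 7 failures out of 5000 batteries from the problem statement and takes the z values from the standard normal distribution rather than a table, so the one-sided bound differs slightly from a calculation with z rounded to 1.65. Variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n, failures, p0 = 5000, 7, 0.002
p_hat = failures / n
se = sqrt(p_hat * (1 - p_hat) / n)      # estimated standard error of p_hat

z_two = NormalDist().inv_cdf(0.975)     # about 1.96, for the two-sided interval
z_one = NormalDist().inv_cdf(0.95)      # about 1.645, for the one-sided interval

two_sided = (p_hat - z_two * se, p_hat + z_two * se)
one_sided_upper = p_hat + z_one * se

# H0 cannot be rejected if its value lies inside the interval
h0_inside_two_sided = two_sided[0] <= p0 <= two_sided[1]
h0_inside_one_sided = 0 <= p0 <= one_sided_upper

print([round(v, 5) for v in two_sided], round(one_sided_upper, 5))
print(h0_inside_two_sided, h0_inside_one_sided)
```

Checking whether 0.002 falls inside the interval reaches the same decision as the hypothesis test, which is the relationship between the two methods that the problem statement asks about.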


Learning Outcomes

• What Proportions are and their significance

• Hypothesis Testing a Proportion

• Assumptions when testing a claim about a Proportion

• Type I and Type II error

• Confidence Interval and Hypothesis Testing


P11 – TRUE OR NOT TRUE


True or Not True

It has been reported that Singapore youths (aged 15 – 24) spend the longest hours daily in the region on instant messaging. Attached are the data collected from 10 youths (age 15 – 24) from South Korea, a country known for superb IT infrastructure and high internet usage rates

P11-Students response.xls

How can we prove whether the report is true or not true with reasonable statistical confidence, assuming the populations under comparison have equal variances?


P11

True or Not True


Testing Between 2 Samples

• There are many cases where researchers wish to compare 2 sample means. For example:

– Is there a difference between the average lifetimes of 2 different brands of tires?

– Did the students from college A score better in a common exam compared with those from college B?

– How does the mean selling price of 4-room flats in one town compare with that in another?

– Have the soldiers' fitness levels improved after training?

• To answer the above questions, we would collect data for 2 samples and compare them by testing to see if there is a statistically significant difference between the means.


Case 1: 2 Sample z-test

• If we sample from 2 normal populations that are independent of each other (meaning no relationship between the subjects in each sample), and the standard deviation of each variable is known, then we use the z-test for comparing the 2 means:

$z = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$

• If the population standard deviations are unknown, both sample sizes must be 30 or more, and σ is replaced with the sample standard deviation, s.


z-Test : Example 1

• The same physical fitness test was given to a group of 100 scouts and 144 guides. The maximum score was 30. The guides obtained a mean score of 26.81 and the scouts obtained a mean score of 27.53. Assuming that the fitness scores are normally distributed with a common population standard deviation of 3.48, test at the 95% confidence level whether the guides did not do as well as the scouts in the fitness test.

Solution:

– Let X1 be the guides' score and let the population mean be µ1.

$X_1 \sim N(\mu_1, \sigma^2)$

– Let X2 be the scouts' score and let the population mean be µ2.

$X_2 \sim N(\mu_2, \sigma^2)$

– Given: σ = 3.48; n1 = 144, n2 = 100, x̄1 = 26.81, x̄2 = 27.53; 95% Confidence Level


z-Test : Example 1

Solution:

Hypothesis:

H0: µ1 - µ2 = 0

H1: µ1 - µ2 < 0 (1-tailed test, since we are interested in finding out whether the guides did not perform as well as the scouts)

At α = 0.05, the critical z-value is -1.645

Using:

$z = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}} = \frac{26.81 - 27.53}{\sqrt{\frac{3.48^2}{144} + \frac{3.48^2}{100}}} = -1.589$

Since the calculated z is > -1.645, we do not reject H0. Thus there is no evidence, at the 5% level, that the guides did not perform as well as the scouts in the fitness test.
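The scouts and guides example can be reproduced with a short function. This is a sketch: the name `two_sample_z` is made up for the illustration, and the inputs are the figures given in Example 1.

```python
from math import sqrt
from statistics import NormalDist

def two_sample_z(x1_bar, x2_bar, sigma1, sigma2, n1, n2):
    """z statistic for the difference of two means with known standard deviations."""
    return (x1_bar - x2_bar) / sqrt(sigma1**2 / n1 + sigma2**2 / n2)

# Guides are sample 1, scouts are sample 2
z = two_sample_z(26.81, 27.53, 3.48, 3.48, 144, 100)
z_crit = NormalDist().inv_cdf(0.05)   # left-tailed critical value at alpha = 0.05
reject_h0 = z < z_crit

print(round(z, 3), round(z_crit, 3), reject_h0)
```

Because H1 is µ1 - µ2 < 0, only values of z below the left-tail critical value would lead to rejection.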


Case 2: 2 Sample t-test

• If we sample from 2 independent normal populations with unknown variances and the sample sizes are small, then we use the 2-sample t-test for comparing the 2 means:

$t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\frac{(n_1-1)s_1^2 + (n_2-1)s_2^2}{n_1 + n_2 - 2}}\ \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$

where the degrees of freedom are equal to n1 + n2 – 2

• The above t-test assumes that the variances of the populations are equal.
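The pooled formula is easier to follow when it is split into the pooled variance and the standard error. The sketch below is illustrative only: the function name `pooled_t` and the two small data sets are made up for the example and are not the data attached to the problem.

```python
from math import sqrt
from statistics import mean, variance

def pooled_t(sample1, sample2):
    """2-sample t statistic assuming equal population variances, with its degrees of freedom."""
    n1, n2 = len(sample1), len(sample2)
    # Pooled variance: sample variances weighted by their degrees of freedom
    pooled_var = ((n1 - 1) * variance(sample1) + (n2 - 1) * variance(sample2)) / (n1 + n2 - 2)
    t = (mean(sample1) - mean(sample2)) / sqrt(pooled_var * (1 / n1 + 1 / n2))
    return t, n1 + n2 - 2

# Made-up observations for two groups (illustrative only)
group_a = [1.0, 2.0, 3.0, 4.0, 5.0]
group_b = [2.0, 4.0, 6.0, 8.0, 10.0]

t_stat, dof = pooled_t(group_a, group_b)
print(round(t_stat, 3), dof)
```

The resulting statistic would then be compared with a critical value from the t-distribution with n1 + n2 – 2 degrees of freedom, taken from a table or from Excel.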


Is it reasonable to assume same variances between

populations for t-Test?

• In the t-test, the population variances are unknown so often we do not know if the variances can be assumed to be equal.

• If the population variances are very different, the 2-sample t-test may not be accurate as the results may be influenced by the difference in the variances.

• However, the 2-sample t-test is not overly sensitive to small differences between population variances, so most of the time this test can be used.

Page 200: Statistical Methods for Engineering


2 Sample t-Test: One-sided Vs Two-sided

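Test                H0             H1             Rejection region
Left-tailed test    µ1 - µ2 ≥ 0    µ1 - µ2 < 0    left tail (area .10)
Right-tailed test   µ1 - µ2 ≤ 0    µ1 - µ2 > 0    right tail (area .10)
Two-tailed test     µ1 - µ2 = 0    µ1 - µ2 ≠ 0    both tails (area .05 each)

[Figure: three t-distribution curves centred at 0 with the rejection regions shaded]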

Page 201: Statistical Methods for Engineering


P-Value Method for Hypothesis Testing

• Hypothesis testing commonly uses a level of significance α of 0.05 or 0.1, which is the probability of a Type I error.

• The P-value is the calculated probability of getting a sample statistic at least as extreme as the one observed, assuming the null hypothesis is true. It is the actual tail area under the distribution curve.

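Example

Use the NORMSDIST(-1) function in Excel to find the one-sided P-value of the standard normal distribution for z = 1.0:

P(Z ≥ 1.0) = P(Z ≤ -1.0) = 0.159

[Figure: standard normal curve with the tail area beyond z = 1.0 shaded; P-value = 0.159]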
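The same tail area can be obtained outside Excel. The sketch below uses `statistics.NormalDist` from the Python standard library in place of NORMSDIST; the variable names are illustrative.

```python
from statistics import NormalDist

z = 1.0

# One-sided P-value: the area beyond z in one tail.
# By symmetry this equals the area to the left of -z, which is what NORMSDIST(-1) returns.
p_one_sided = NormalDist().cdf(-z)

# For a two-tailed test the tail area is counted on both sides.
p_two_sided = 2 * p_one_sided

print(round(p_one_sided, 3), round(p_two_sided, 3))
```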

Page 202: Statistical Methods for Engineering


Interpretation of P-Value

There are 2 ways to interpret the p-value. Assuming the null hypothesis is true, the p-value is:

• The probability of getting a test statistic like the one calculated or an even more extreme value

• The smallest level of significance α (the risk of rejecting the null hypothesis when it is true) at which the observed result would lead to rejection

It answers this question – ‘To what extent does the data

support the null hypothesis?’

The smaller the p-value, the less the data supports the null hypothesis.

Page 203: Statistical Methods for Engineering


Decision Making Based on P-Value

α Criteria

• If p-value is smaller than or equal to level of significance α, reject null hypothesis.

• If p-value is greater than level of significance α, do not reject null hypothesis

Page 204: Statistical Methods for Engineering


Conventional Interpretation of P-values

• P > 0.10: result is not significant

• 0.05 < P < 0.10: result is marginally significant

• 0.01 < P < 0.05: result is significant

• P < 0.01: result is highly significant

This is a rule-of-thumb interpretation without the need to set an α value.
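The rule of thumb can be written as a small lookup. This is only a sketch: the function name is hypothetical, and because the slide uses strict inequalities, the treatment of the exact boundary values (0.01, 0.05, 0.10) is an assumption that assigns them to the less significant band.

```python
def interpret_p_value(p):
    """Return the rule-of-thumb label for a p-value."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("a p-value must lie between 0 and 1")
    if p < 0.01:
        return "highly significant"
    if p < 0.05:
        return "significant"
    if p < 0.10:
        return "marginally significant"
    return "not significant"


for p in (0.00214, 0.03, 0.0846, 0.19):
    print(p, "->", interpret_p_value(p))
```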

Page 205: Statistical Methods for Engineering


Proposed Solution

Let Sample 1 be Singapore youths’ instant messaging time, and Sample 2 be Korean youths’ instant messaging time.

Hypothesis:

H0: µ1 - µ2 = 0

H1: µ1 - µ2 > 0 (1-tailed test since we are testing whether the report is true)

Page 206: Statistical Methods for Engineering


Proposed Solution

Given:

n1 = 20, n2 = 10, s1 = 100.85, s2 = 24.024, x̄1 = 268, x̄2 = 166.6

Using:

$$ t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} $$

t-statistic = 3.11 and p-value = 0.00214

[Excel function TDIST(3.11,28,1)]

Since the p-value is much smaller than 0.05, reject the null hypothesis.

I.e. the report that Singaporean youths spend the most time on internet messaging is supported by the data.
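The t-statistic quoted in this solution can be reproduced from the summary figures. The sketch below implements the pooled-variance formula with the standard library only; `pooled_t` is an illustrative name, and the p-value is left to Excel's TDIST as on the slide because the standard library has no t-distribution.

```python
from math import sqrt


def pooled_t(mean1, mean2, s1, s2, n1, n2):
    """Equal-variance 2-sample t statistic and its degrees of freedom."""
    dof = n1 + n2 - 2
    pooled_variance = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / dof
    t = (mean1 - mean2) / sqrt(pooled_variance * (1 / n1 + 1 / n2))
    return t, dof


# Sample 1: Singapore youths, Sample 2: Korean youths (figures from the slide).
t_stat, dof = pooled_t(268, 166.6, 100.85, 24.024, 20, 10)
print(f"t = {t_stat:.2f}, degrees of freedom = {dof}")
```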

Page 207: Statistical Methods for Engineering


Learning Outcomes

• To test the hypotheses about the difference

between two population means

• Test Statistic for the difference between two

means (independent normal populations)

– z-Test (known variances)

– 2 sample t-Test (unknown equal variances)

• p-Value calculation and significance

• 2 Sample t-Test

Page 208: Statistical Methods for Engineering


P12 – WHO TYPES FASTER

E214 : STATISTICAL METHODS FOR ENGINEERING

Page 209: Statistical Methods for Engineering

Who Types Faster

Some people believe that women in general can type faster than men since there are more female administrative staff than male ones. Others think that men have better hand-eye coordination and thus can type faster. Carry out a hypothesis test to determine if there is a basis to further investigate those beliefs. You may carry out an experiment in your class and collect the relevant data by making use of the typing tests provided on the following website: http://www.powertyping.com/typing_test/typing_test.shtml

A second thought about typing speed is that typing an article containing non-English names will decrease typing speed. To find out whether this is true, conduct another study using hypothesis testing. Should you use the same test statistic for the two hypothesis tests? That is, is there any difference between the two studies in relation to assumptions about the population distribution, the relationship of the samples and the parameter under testing?

Page 210: Statistical Methods for Engineering


P12 – Who Types Faster

Page 211: Statistical Methods for Engineering


Recall: Testing of 2 Sample Means

• If we are comparing the means of 2 independent normal populations with unknown variances and the sample sizes are small, the 2-sample t-test statistic can be applied:

$$ t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} $$

where the degrees of freedom are equal to n1 + n2 – 2.

• This t-test requires that the variances of the populations be equal.

Page 212: Statistical Methods for Engineering


What if population variances are different?

• Often there is reason to suspect that variances between

2 populations may be very different. For example, output

of a newly set-up process Vs a long-run stable process.

• We may examine the sample variances. As a rule of thumb, if one sample variance is 4 or more times the other, then we cannot assume the population variances to be equal.

• In this case, a more appropriate test, called the Smith-

Satterthwaite Test, can be used. It is also known as the

2-sample t-test with unequal variances.


Page 213: Statistical Methods for Engineering


Smith-Satterthwaite Test (2-Sample T-test with

Unequal Variances)

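• When comparing the means of 2 independent samples from normal populations whose variances are unknown and unequal, use the following test statistic:

$$ t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} $$

which is a random variable that approximately follows a t-distribution with degrees of freedom equal to (round down to the nearest integer):

$$ DOF = \frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{(s_1^2/n_1)^2}{n_1 - 1} + \dfrac{(s_2^2/n_2)^2}{n_2 - 1}} $$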

Page 214: Statistical Methods for Engineering


T-Test with Unequal Variances: An Example

A researcher wants to determine whether the salaries of professional nurses employed in private hospitals are higher than those of nurses employed by government hospitals.

Data collected:

                          Private    Government
Sample mean (X̄)           $26,800    $25,700
Sample std deviation (s)  $800       $400
Sample size (n)           10         8

At the 99% confidence level, can it be concluded that the private hospitals pay more than the government ones?

Page 215: Statistical Methods for Engineering


T-Test with Unequal Variances: An Example

Let μ1 and μ2 be the average salaries of nurses in private and government hospitals respectively.

H0: μ1 = μ2 and H1: μ1 > μ2 (right-tailed t-test)

Assuming variances are not equal,

$$ t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \frac{26800 - 25700}{\sqrt{\dfrac{800^2}{10} + \dfrac{400^2}{8}}} = 3.80, \qquad DOF = 13 $$

The p-value is 0.0011, which is smaller than α = 0.01. Hence reject the null hypothesis, i.e. private hospitals pay nurses more than government ones.
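The test statistic and degrees of freedom for the nurses example can be recomputed as follows. This is a sketch: `welch_t` is an illustrative name, and the degrees-of-freedom expression is the standard Welch-Satterthwaite approximation, which is assumed here because it reproduces the slide's value of 13.

```python
from math import floor, sqrt


def welch_t(mean1, mean2, s1, s2, n1, n2):
    """Unequal-variance t statistic and rounded-down degrees of freedom."""
    v1 = s1**2 / n1  # variance of the first sample mean
    v2 = s2**2 / n2  # variance of the second sample mean
    t = (mean1 - mean2) / sqrt(v1 + v2)
    dof = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t, floor(dof)


# Private hospitals (sample 1) versus government hospitals (sample 2).
t_stat, dof = welch_t(26800, 25700, 800, 400, 10, 8)
print(f"t = {t_stat:.2f}, degrees of freedom = {dof}")
```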

Page 216: Statistical Methods for Engineering


Dependent or Related Samples

• Sometimes, samples under study are related or they

contain the same subjects but under different conditions.

In this case, the samples are not independent of each

other and we cannot use 2-sample t-test.

• Examples of dependent samples:

– Performance of workers before and after a training program

– Effectiveness of a drug on patients

– Comparison of IQ scores of pairs of children matched with the

same age (to block out the differences in scores due to age)

Page 217: Statistical Methods for Engineering


Independent vs. Related Populations

Independent

• Independent data sources

• Use the difference between the 2 sample means: X̄1 - X̄2

Related

• Same data source

– Paired/Matched

– Repeated measures (Before/After)

• Use the difference between each pair of observations: Dn = X1n - X2n

Page 218: Statistical Methods for Engineering


Two Related Populations:

Paired Sample t-Test

• The paired sample t-test is used to test means of 2

related populations

– Paired or Matched samples

– Repeated Measures (Before/After)

• Eliminates variation among subjects in the same sample

• Assumptions

– If the sample is small, the distribution of difference scores

should be normally distributed

– Both populations are normally distributed

– If not normal, can be approximated by a normal distribution (n1 ≥ 30 and n2 ≥ 30)

Page 219: Statistical Methods for Engineering


Paired Sample t-Test Statistic

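In the paired sample t-test, we test the mean of the differences between each pair of subjects. The test statistic is:

$$ t = \frac{\bar{D} - \mu_D}{S_D / \sqrt{n}} \quad \text{with } df = n - 1 $$

Sample mean:

$$ \bar{D} = \frac{\sum_{i=1}^{n} D_i}{n} $$

Sample standard deviation:

$$ S_D = \sqrt{\frac{\sum_{i=1}^{n} (D_i - \bar{D})^2}{n - 1}} $$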

Page 220: Statistical Methods for Engineering


Paired-Sample t-Test: An Example

To ascertain the effectiveness of a training program, the following test score data is collected:

Name      Before Training   After Training
Sam       85                94
Tamika    94                87
Brian     78                79
Mike      87                88

At the 90% confidence level, determine the effectiveness of the training.

Page 221: Statistical Methods for Engineering


Paired-Sample t-Test:

Calculation of Test Statistic

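          Before   After   Difference (D)
Sam       85       94      -9
Tamika    94       87       7
Brian     78       79      -1
Mike      87       88      -1
Total                      -4

$$ \bar{D} = \frac{\sum D_i}{n} = \frac{-4}{4} = -1 $$

$$ S_D = \sqrt{\frac{(-9 - (-1))^2 + (7 - (-1))^2 + (-1 - (-1))^2 + (-1 - (-1))^2}{4 - 1}} = \sqrt{\frac{128}{3}} = 6.53 $$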

Page 222: Statistical Methods for Engineering


Paired Sample t-Test : Solution

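H0: µD ≥ 0 (D = B - A)

H1: µD < 0

α = 0.10

df = 4 - 1 = 3

Critical value: t = -1.6377 (reject H0 in the left tail, area .10)

Test statistic:

$$ t = \frac{\bar{D} - \mu_D}{S_D / \sqrt{n}} = \frac{-1 - 0}{6.53 / \sqrt{4}} = -0.306 $$

Decision: Do not reject H0 at α = 0.10, since -0.306 > -1.6377.

Conclusion: There is no evidence the training is effective.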
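The paired-sample calculation can be traced step by step in code. The sketch below uses the training scores from the example and the standard library only; the critical value is copied from the slide rather than computed.

```python
from math import sqrt
from statistics import mean, stdev

before = [85, 94, 78, 87]  # Sam, Tamika, Brian, Mike
after = [94, 87, 79, 88]

# D = Before - After for each person.
differences = [b - a for b, a in zip(before, after)]

n = len(differences)
d_bar = mean(differences)
s_d = stdev(differences)           # sample standard deviation (divides by n - 1)
t_stat = (d_bar - 0) / (s_d / sqrt(n))
dof = n - 1

t_critical = -1.6377               # left-tailed, alpha = 0.10, df = 3 (from the slide)
reject_h0 = t_stat < t_critical

print(f"D-bar = {d_bar}, S_D = {s_d:.2f}, t = {t_stat:.3f}, reject H0: {reject_h0}")
```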

Page 223: Statistical Methods for Engineering


Problem Statement

• “Some people believe that women in general can type faster

than men since there are more female administrative staff

compared to male ones. Others think that men have better

hand-eye coordination and thus type faster. Carry out a

hypothesis test to determine if there is basis to further

investigate those beliefs.”

• How should you formulate H0 and H1?

• Do you use z test or t-test?

• What assumptions do you make in this test?

Page 224: Statistical Methods for Engineering


Solution: 2 Sample t-Test with unequal variances

Let µ1 be the mean typing speed (words/min) for females

Let µ2 be the mean typing speed (words/min) for males

Test Hypotheses:

H0: µ1 - µ2 = 0

H1: µ1 - µ2 ≠ 0 (2-tailed test as the results may show men type faster!)

Are the population variances known? Are they the same?

No, we do not know what the population variances are or

whether they are the same or not. Let us apply 2 sample t-test

with unequal variances.

Page 225: Statistical Methods for Engineering


Calculations: 2 Sample t-Test with unequal variances

Assume the following has been calculated from data collected:

x̄1 = 31.42

x̄2 = 27.64

s1 = 6.35

s2 = 4.54

n1 = n2 = 10

$$ t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \frac{31.42 - 27.64}{\sqrt{\dfrac{6.35^2}{10} + \dfrac{4.54^2}{10}}} = 1.53, \qquad DOF = 16 $$

• From Excel function TDIST(1.53,16,2), P(T > t or T < -t) ≈ 0.15 (p-value)

• H0 cannot be rejected at level of significance α = 0.1

Page 226: Statistical Methods for Engineering


Paired t-Test

• How can we test whether typing an article containing non-English names affects the typing speed?

• In this case, since different people have different typing speeds, we should compare the speed of the same person typing two articles, one with and the other without non-English names.

• The paired t-test gives a more accurate result here because it works on the within-person differences (Xi1 – Xi2), which removes person-to-person variation, so it can detect a difference more readily than the 2-sample t-test.

Person   Article 1 typing speed   Article 2 typing speed
1        X11                      X12
2        X21                      X22
3        X31                      X32
4        X41                      X42

Page 227: Statistical Methods for Engineering


Summary of Learning Outcomes

• Perform t-test for 2 independent samples with unequal variances

• Understand the difference between independent and related (dependent) populations

• Perform paired t-test for 2 dependent samples

Page 228: Statistical Methods for Engineering


P13 – TEST IT FOR FAIRNESS

E214 : STATISTICAL METHODS FOR ENGINEERING


Page 229: Statistical Methods for Engineering

Test it for Fairness

You have just developed a program that simulates a six-sided die. You are thinking of selling this commercially as an embedded software application or an online tool. You want to test whether the die is really fair before you go ahead and launch it. How would you go about conducting a statistical test to decide whether the die program you have developed is fair?

Page 230: Statistical Methods for Engineering


P13

Test it for Fairness

Page 231: Statistical Methods for Engineering


Test a distribution for Goodness-of-Fit using

Chi-Square

• Previously, we used statistical hypothesis to test

single population parameters.

• For today’s problem, we use statistical hypothesis to

determine if a population has a specified theoretical

distribution.

• This test is based on how good a fit we have between:

– the frequency of occurrence of observations in an observed

sample

– the expected frequencies obtained from the hypothesized

distribution

Page 232: Statistical Methods for Engineering


The Multinomial Experiment

• The experiment consists of n identical, independent trials.

• The outcome of each trial falls in one of k categories.

• The probabilities associated with the k outcomes, denoted by π1, π2, …, πk, remain the same from trial to trial. Since there are only k possible outcomes, we have π1 + π2 + … + πk = 1

• The experimenter records the values of o1, o2, …, ok, where oj (j = 1, 2, …, k) is equal to the observed number of trials in which the outcome is in category j.

Note that n = o1 + o2 + … + ok

Page 233: Statistical Methods for Engineering


The χ² Test Statistic

The χ² test statistic measures the amount of disagreement between the observed data and the expected data.

χ² = ∑ (oj – ej)² / ej

where the sum is over all categories, with oj being

the observed frequency count and ej the

expected frequency count in category j.

Page 234: Statistical Methods for Engineering


Test Statistic and its Applicability

Test Statistic: χ² = ∑ (oj – ej)² / ej

with degrees of freedom equal to the number of categories minus 1 (right-tailed test), where

o = observed frequency

e = expected frequency

Assumptions for Chi-Square Goodness-of-Fit Test

1. The experiment satisfies the properties of a multinomial experiment.

2. No expected cell count, ej, is less than 5

Page 235: Statistical Methods for Engineering


Typical χ² Density Curve

The curve begins at zero and is skewed right. As the degrees of freedom increase, the distribution stretches out along the horizontal axis.

Page 236: Statistical Methods for Engineering


Step 1: State the Null and Alternative Hypotheses

H0: Newly developed die is fair

H1: Newly developed die is not fair

Page 237: Statistical Methods for Engineering


Step 2: Compute Expected (Ei) and Observed (Oi) Frequencies

Page 238: Statistical Methods for Engineering


Step 3: Decide on Rejection Criterion

Degrees of freedom

= Number of classes – number of restrictions

= 6 – 1 = 5

Test at 5% significance level,

Reject H0 if: χ²(calc) > χ²(5%, 5)

i.e. if χ²(calc) > 11.07

Page 239: Statistical Methods for Engineering


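χ² Distribution with 5 Degrees of Freedom

α = 0.05

χ²(.05, 5) = 11.07

[Figure: χ² density curve with 5 degrees of freedom; the right-tail area of 0.05 beyond 11.07 is shaded]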

Page 240: Statistical Methods for Engineering


Step 4: Conclusion

Since χ²(calc) = 2.12 < 11.07, H0 is not rejected.

Conclusion:

There is no evidence at the 5% significance level that the newly developed die is unfair, so you can release it commercially with reasonable confidence.
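The four steps can be sketched in code as follows. The observed counts below are hypothetical, invented purely for illustration (the frequency table from Step 2 is not reproduced in this text), so the resulting statistic differs from the 2.12 quoted above. The critical value 11.07 is the one given for 5 degrees of freedom at the 5% level.

```python
def chi_square_statistic(observed, expected):
    """Sum of (o - e)^2 / e over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts of faces 1 to 6 from 60 simulated rolls (illustration only).
observed = [8, 12, 9, 11, 10, 10]
n_rolls = sum(observed)

# A fair die gives each face probability 1/6.
expected = [n_rolls / 6] * 6

# The test assumes no expected cell count is less than 5.
assumption_ok = min(expected) >= 5

chi2 = chi_square_statistic(observed, expected)
dof = len(observed) - 1
chi2_critical = 11.07  # chi-square critical value for alpha = 0.05, df = 5
reject_h0 = chi2 > chi2_critical

print(f"chi-square = {chi2:.2f}, df = {dof}, reject H0: {reject_h0}")
```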

Page 241: Statistical Methods for Engineering


Learning Outcomes

The Chi-Square Goodness-of-Fit Test

• Understand the basic properties of the multinomial experiment

• Know how to calculate the expected number of outcomes to fall in categories of a multinomial experiment

• Know the assumptions required for a chi-square goodness-of-fit test

• Know how to conduct a chi-square goodness of fit test

Page 242: Statistical Methods for Engineering


P14 – Testing with Signs

E214 : STATISTICAL METHODS FOR ENGINEERING


Page 243: Statistical Methods for Engineering

Testing with Signs

Your statistics facilitator thinks that RP students have less sleep than a typical person in Singapore. Can you prove whether this is true with reasonable statistical confidence, if the median duration of sleep is known to be 7 hours in Singapore? Note that the distribution of sleep duration cannot be assumed normal. Consider a small sample size from your class and perform a sign test to test the claim.

Page 244: Statistical Methods for Engineering


P14

Testing with Signs

Page 245: Statistical Methods for Engineering


Nonparametric Tests

• Statistical tests such as z, t and F tests are called

parametric tests

• Parametric tests require the assumption that

sampling populations are normally distributed

• In situations where population distribution is not

normal, nonparametric (or distribution-free) tests

can be used

Page 246: Statistical Methods for Engineering


Pros and Cons of Nonparametric Tests

Advantages

• Variables under test need not be normally distributed

• Can be used to test hypotheses that do not involve population parameters, such as randomness of sample, relationship between 2 samples

• Computations are generally easier compared with parametric tests

Disadvantages

• Less sensitive than parametric tests when normality assumption is met. Thus, larger differences are needed before null hypothesis can be rejected

• Use less information than parametric tests

• Less efficient in the sense that larger sample size is required to overcome loss of information

Page 247: Statistical Methods for Engineering


Sign Test

• The sign test is used to test the value of a median

of a specific sample

• An alternative to 1-sample t-test or paired t-test

• Can be used for small sample size

• Assigns a ‘+’ to sample values above the

hypothesized median value and a ‘-’ to sample

values below the median

• Does not account for the size of the difference between values in the data and the median, only its sign

Page 248: Statistical Methods for Engineering


Procedure in Sign Test

1. From the problem, identify the claim.

2. State the null (H0) and alternative (H1) hypotheses and the significance level α.

3. Convert the data to signs:

– For a single-sample test, compare each value with the hypothesized median. If the value is larger, replace it with a ‘+’ sign. If it is smaller, replace it with a ‘-’ sign. If equal, discard the value.

– For a paired-sample test, subtract each after value from the before value and indicate the difference with a ‘+’ or ‘-’ sign or 0. Discard the ‘0’ value(s).

4. Count the number of ‘+’ (r) and the total number of signs (n).

5. Compute the p-value based on the binomial distribution with n, r and p = 0.5: P(X ≤ r) if H1 contains ‘<’, P(X ≥ r) if H1 contains ‘>’, while the p-value for a 2-sided test is twice the smaller p-value.

6. If the probability (p-value) is smaller than the significance level α, reject the null hypothesis. Conclude appropriately.
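The p-value step of this procedure can be sketched with the standard library's `math.comb`. The function names and the `alternative` labels ("less", "greater", "two-sided") are illustrative choices, not part of the course material; r is the number of ‘+’ signs and n the total number of signs.

```python
from math import comb


def binomial_cdf(r, n, p=0.5):
    """P(X <= r) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(r + 1))


def sign_test_p_value(r, n, alternative):
    """p-value of the sign test given r '+' signs out of n signs."""
    lower = binomial_cdf(r, n)            # P(X <= r), used when H1 contains '<'
    upper = 1.0 - binomial_cdf(r - 1, n)  # P(X >= r), used when H1 contains '>'
    if alternative == "less":
        return lower
    if alternative == "greater":
        return upper
    if alternative == "two-sided":
        return min(1.0, 2 * min(lower, upper))
    raise ValueError("alternative must be 'less', 'greater' or 'two-sided'")


print(round(sign_test_p_value(2, 14, "less"), 4))      # 2 '+' signs out of 14
print(round(sign_test_p_value(10, 13, "greater"), 4))  # 10 '+' signs out of 13
```

The two calls correspond to a left-tailed test with 2 plus signs out of 14 and a right-tailed test with 10 plus signs out of 13.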

Page 249: Statistical Methods for Engineering


Example 1: One-sample Sign Test

A researcher read that the median age for

viewers of the Singapore Idol show is 20 years.

To test the claim, 80 viewers were surveyed, and

30 were under the age of 20 years old while

exactly 4 were 20 years old.

At α = 0.05, test the claim. Give one reason why

an advertiser might like to know the result of this

study.

Page 250: Statistical Methods for Engineering


Example 1: Solution

The claim under test is that the median age of viewers (ū) is 20 years.

Null Hypothesis: H0: ū = 20

Alternative Hypothesis: H1: ū ≠ 20

Letting n = 76 (4 values are discarded), x = 30, p = 0.5, the probability of getting 30 or fewer ‘-’ signs is:

$$ P(X \le 30) = \sum_{r=0}^{30} \binom{76}{r} (0.5)^r (1 - 0.5)^{76 - r} = 0.0423 $$

The p-value is 0.0423 × 2 = 0.0846 as this is a 2-tailed test.

Since the p-value is greater than α = 0.05, there is not enough evidence to reject the null hypothesis, so we cannot reject the claim that the median age of viewers is 20 years.

Page 251: Statistical Methods for Engineering


Example 2: Paired-sample Sign Test

The following are the average weekly losses of

worker-hours due to accidents at 13 industrial

sites before and after a certain safety program

was put into operation:

23 and 35, 41 and 30, 20 and 8, 28 and 35, 45 and 24, 83

and 77, 26 and 24, 17 and 11, 55 and 58, 29 and 25, 15

and 10, 28 and 22 and 37 and 35.

Use 0.05 level of significance to test whether the

safety program is effective.

Page 252: Statistical Methods for Engineering


Example 2: Solution

Let ūd be the median difference in loss hours before and after the program.

Null Hypothesis: H0: ūd = 0 (safety program is not effective)

Alternative Hypothesis: H1: ūd > 0 (safety program is effective)

The 13 sample pairs yield: - + + - + + + + - + + + +

Letting n = 13, x = 10, p = 0.5, the probability of getting 10 or more ‘+’ signs is:

$$ P(X \ge 10) = 1 - P(X \le 9) = 1 - \sum_{r=0}^{9} \binom{13}{r} (0.5)^r (1 - 0.5)^{13 - r} = 0.0461 $$

The p-value is 0.0461 (1-tailed test).

Since the p-value is smaller than α = 0.05, reject the null hypothesis and conclude that the safety program is effective.
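The signs and the p-value in this solution can be derived directly from the before/after pairs. This is a minimal sketch with illustrative variable names; the data are the 13 pairs from the example.

```python
from math import comb

# (before, after) average weekly losses of worker-hours at the 13 sites.
pairs = [(23, 35), (41, 30), (20, 8), (28, 35), (45, 24), (83, 77), (26, 24),
         (17, 11), (55, 58), (29, 25), (15, 10), (28, 22), (37, 35)]

# Difference = before - after; zero differences would be discarded.
diffs = [before - after for before, after in pairs if before != after]
signs = "".join("+" if d > 0 else "-" for d in diffs)

n = len(diffs)
plus = signs.count("+")

# Right-tailed p-value: P(X >= plus) for X ~ Binomial(n, 0.5).
p_value = sum(comb(n, k) for k in range(plus, n + 1)) / 2**n

print(signs, n, plus, round(p_value, 4))
```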

Page 253: Statistical Methods for Engineering


Problem Statement: Solution Example

Suppose data (average sleep duration in hours) from 16 students are collected as follows:

6.5 7 6 5

5.5 5 6 6

6.5 7 8 6.5

5.5 5 7.5 6

Page 254: Statistical Methods for Engineering


Problem Statement: Solution Example

1. Claim is that the median sleep duration of RP students is less than 7 hours every night.

2. Null Hypothesis:

H0: ū = 7

Alternative Hypothesis:

H1: ū < 7

3. Converting values into positive and negative signs, we have 2 ‘+’, 12 ‘-’ and 2 discarded values (tie)

Page 255: Statistical Methods for Engineering


Problem Statement: Solution Example

4. Letting n = 14, x = 2, p = 0.5, the probability of getting 2 or fewer ‘+’ signs is

$$ P(X \le 2) = \sum_{r=0}^{2} \binom{14}{r} (0.5)^r (1 - 0.5)^{14 - r} = 0.0065 $$

The answer is calculated from the Excel function BINOMDIST(2,14,0.5,1)

5. Since the p-value is 0.0065 < α = 0.01, reject the null hypothesis

6. We can confidently say that RP students have less than 7 hours of sleep per night.
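The sign counts and p-value can be reproduced from the 16 sleep durations listed in the solution example. The sketch below stands in for BINOMDIST using `math.comb`; variable names are illustrative.

```python
from math import comb

sleep_hours = [6.5, 7, 6, 5,
               5.5, 5, 6, 6,
               6.5, 7, 8, 6.5,
               5.5, 5, 7.5, 6]
hypothesized_median = 7

plus = sum(1 for h in sleep_hours if h > hypothesized_median)
minus = sum(1 for h in sleep_hours if h < hypothesized_median)
ties = len(sleep_hours) - plus - minus  # values equal to the median are discarded
n = plus + minus

# Left-tailed p-value: P(X <= plus) for X ~ Binomial(n, 0.5).
p_value = sum(comb(n, k) for k in range(plus + 1)) / 2**n

alpha = 0.01
print(plus, minus, ties, round(p_value, 4), p_value < alpha)
```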

Page 256: Statistical Methods for Engineering


Learning Outcomes

The Sign Test (Nonparametric Test)

• Understand when to apply nonparametric tests

• Know how to apply the Sign Test for small sample size where the normality assumption is not valid

• Know that the sign test is used to test population median for both one-sample and paired-sample tests

Page 257: Statistical Methods for Engineering


P15 – CHOCOLATE ADVERTISEMENTS

E214 : STATISTICAL METHODS FOR ENGINEERING


Page 258: Statistical Methods for Engineering

Chocolate Advertisements

You are working in an advertising company. Recently, your company won a contract to design a poster advertisement to attract people to try a new series of chocolates launched by your customer. To kick off the project, you have been tasked to look into the current advertisement posters by other companies and evaluate their attractiveness. You have identified the 3 posters below. Conduct an investigation to evaluate if there is any significant difference in the attractiveness of the 3 advertisement posters.

Poster 1

Page 259: Statistical Methods for Engineering


Poster 2

Poster 3

Page 260: Statistical Methods for Engineering


P15 – Chocolate Advertisements

Page 261: Statistical Methods for Engineering


ANOVA

What is ANOVA?

Analysis of Variance (ANOVA) provides the tools to compare the means of several

populations with a single test.

The role of ANOVA is to perform a numerical test of significance that will test the equality

of all the means.

Page 262: Statistical Methods for Engineering


Underlying Assumptions for ANOVA

The F distribution is also used for testing whether two or more

sample means are from the same or equal populations.

This technique is called Analysis of Variance or ANOVA.

ANOVA requires the following conditions:

– The sampled populations follow the normal distribution.

– The populations have equal standard deviations.

– The samples are randomly selected and are

independent.

Page 263: Statistical Methods for Engineering


The F-Statistic

ANOVA is a procedure that compares the variability between the samples to the variability within the samples by computing the ratio

$$ F = \frac{\text{variance between the samples}}{\text{variance within the samples}} $$

The F-statistic is a numerical measure of how much the sample means differ.

Page 264: Statistical Methods for Engineering


Characteristics of the F Distribution

• Each member of the family is determined by two

parameters: the numerator degrees of freedom and

the denominator degrees of freedom.

• F cannot be negative, and is a continuous

distribution.

• The F distribution is positively skewed.

• Its value ranges from 0 to ∞. As F → ∞, the curve approaches the x-axis.

Page 265: Statistical Methods for Engineering


Procedure for the Analysis of Variance

• Null Hypothesis:

– Population means are the same.

• Alternative Hypothesis:

– At least one of the means is different.

• The test statistic follows the F distribution.

• Decision rule is to reject the null hypothesis if

Fcalculated > Fcritical

Page 266: Statistical Methods for Engineering


Procedure for the Analysis of Variance

• For k populations sampled, the numerator degrees of

freedom is (k – 1).

• For a total of n observations, the denominator degrees

of freedom is (n – k).

• The test statistic is computed by:

$$ F = \frac{MS(Tr)}{MSE} = \frac{SS(Tr) / (k - 1)}{SSE / (n - k)} $$

where MS(Tr) is the Mean Square for Treatments and MSE is the Mean Square Error.

Page 267: Statistical Methods for Engineering


Procedure for the Analysis of Variance

• SS(Tr) is the Treatment Sum of Squares:

$$ SS(Tr) = \sum \frac{T_c^2}{n_c} - \frac{(\sum X)^2}{n} $$

where

Tc is the column total

nc is the number of observations in each column

ΣX is the sum of all the observations

n is the total number of observations

Page 268: Statistical Methods for Engineering


Procedure for the Analysis of Variance

• SST is the Total Sum of Squares:

$$SST = \sum X^2 - \frac{(\sum X)^2}{n}$$

• SSE is the Sum of Squares Error:

$$SSE = SST - SS(Tr)$$
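As a brief aside, the shortcut formulas for the sums of squares agree with the more familiar sums of squared deviations. The symbols $\bar{X}$ (mean of all n observations) and $\bar{X}_c$ (mean of column c) are introduced here for illustration and are not defined on the slides. Using $\bar{X} = \sum X / n$ and $\bar{X}_c = T_c / n_c$:

$$SST = \sum (X - \bar{X})^2 = \sum X^2 - 2\bar{X}\sum X + n\bar{X}^2 = \sum X^2 - \frac{(\sum X)^2}{n}$$

$$SS(Tr) = \sum_c n_c(\bar{X}_c - \bar{X})^2 = \sum_c n_c\bar{X}_c^2 - 2\bar{X}\sum_c T_c + n\bar{X}^2 = \sum_c \frac{T_c^2}{n_c} - \frac{(\sum X)^2}{n}$$

So SS(Tr) measures how far the column means lie from the overall mean, and SSE = SST − SS(Tr) is the variation left over within the columns.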


Example 1

Specializing in meals for the elderly, a restaurant recently introduced vegetarian porridge at three of its branches.

Data on the number of bowls of vegetarian porridge ordered were collected over a period of 5 days. Assuming a 5% level of significance, determine if there is a difference in the mean number of bowls ordered per day at the three branches.

Day      Branch 1   Branch 2   Branch 3
Day 1       13         10         18
Day 2       12         12         16
Day 3       14         13         17
Day 4       12         11         17
Day 5       14         13         17


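Example 1: Proposed Solution

• SST:

$$SST = \sum X^2 - \frac{(\sum X)^2}{n} = 2999 - \frac{(209)^2}{15} \approx 87$$

• SS(Tr):

$$SS(Tr) = \sum \frac{T_c^2}{n_c} - \frac{(\sum X)^2}{n} = \frac{65^2}{5} + \frac{59^2}{5} + \frac{85^2}{5} - \frac{(209)^2}{15} \approx 74.2$$

(Both results take $(209)^2/15 \approx 2912.07$ rounded to 2912.)

• SSE: SSE = SST – SS(Tr)

= 87 – 74.2

= 12.8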


Example 1: Proposed Solution (continued)

• Step 1:

H0: The mean number of bowls of porridge sold is the same at all 3 branches

H1: At least one branch has a different mean number of bowls of porridge sold

• Step 2:

– H0 is rejected if F > Fcritical

– Fcritical = 3.89 as there are 2 df in the numerator and 12 df

in the denominator


Example 1: Proposed Solution (continued)

• Calculating the value of F:

$$F = \frac{SS(Tr)/(k-1)}{SSE/(n-k)} = \frac{74.2/(3-1)}{12.8/(15-3)} \approx 34.77$$

• Since F = 34.77 > Fcritical = 3.89, the decision is to reject the null hypothesis.

• The mean number of bowls of vegetarian porridge sold at the three locations is not the same.
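The Example 1 calculation can be reproduced with a short script. This is a minimal sketch using only the standard library; the variable names are illustrative, and the critical value 3.89 is simply copied from the slides rather than computed.

```python
# Bowls of vegetarian porridge ordered per day (Example 1 data).
branches = {
    "Branch 1": [13, 12, 14, 12, 14],
    "Branch 2": [10, 12, 13, 11, 13],
    "Branch 3": [18, 16, 17, 17, 17],
}

F_CRITICAL = 3.89  # table value quoted in the slides for 2 and 12 df at 5%

all_values = [x for column in branches.values() for x in column]
n = len(all_values)   # total number of observations
k = len(branches)     # number of populations sampled
correction = sum(all_values) ** 2 / n  # (sum of X)^2 / n

# Shortcut formulas for the sums of squares.
sst = sum(x * x for x in all_values) - correction
ss_tr = sum(sum(col) ** 2 / len(col) for col in branches.values()) - correction
sse = sst - ss_tr

# Mean squares and the F statistic.
ms_tr = ss_tr / (k - 1)
mse = sse / (n - k)
f_stat = ms_tr / mse

print(f"SST = {sst:.2f}, SS(Tr) = {ss_tr:.2f}, SSE = {sse:.2f}")
print(f"F = {f_stat:.2f}, reject H0: {f_stat > F_CRITICAL}")
```

Without intermediate rounding the script gives SST = 86.93, SS(Tr) = 74.13, SSE = 12.80 and F = 34.75. These differ slightly from the slide values (87, 74.2 and 34.77), which round along the way, but the decision to reject H0 is the same.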


Inferences About Treatment Means

When the null hypothesis that the means are

equal is rejected, it may be necessary to know

which treatment means differ.

One of the simplest procedures to determine

this is through the use of confidence intervals.


Confidence Interval for the

Difference Between Two Means

$$(\bar{X}_1 - \bar{X}_2) \pm t \sqrt{MSE\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$$

where

• t is obtained from the t table with (n - k) degrees of freedom,

• MSE (Mean Square Error) = SSE/(n - k)


Example 2

Continuing from Example 1, develop a 95% confidence

interval for the difference in the mean number of bowls of

vegetarian porridge sold in Branch 2 and Branch 3.

Can management conclude that there is a difference

between the two branches?


Example 2: Proposed Solution

• Confidence Interval:

(17 – 11.8) ± 2.179 √(1.067(1/5 + 1/5))

= 5.2 ± 1.424

= (3.776, 6.624)

Since zero is not in the interval, we conclude that this pair of means differs.

Hence, the mean number of bowls of vegetarian porridge sold in Branch 2 is different from that in Branch 3.
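The interval above can be sketched as a small helper. The function name `mean_difference_ci` and its parameter names are hypothetical, and the t value 2.179 is taken from the slide rather than computed, since the standard library has no t table.

```python
import math

def mean_difference_ci(mean_a, mean_b, n_a, n_b, mse, t_value):
    """Return (lower, upper) limits for the difference mean_a - mean_b."""
    margin = t_value * math.sqrt(mse * (1 / n_a + 1 / n_b))
    diff = mean_a - mean_b
    return diff - margin, diff + margin

# Branch 3 mean = 17, Branch 2 mean = 11.8, five days each,
# MSE = SSE / (n - k) = 12.8 / 12, t = 2.179 for 12 degrees of freedom.
low, high = mean_difference_ci(17, 11.8, 5, 5, 12.8 / 12, 2.179)
differs = not (low <= 0 <= high)  # zero outside the interval

print(f"({low:.2f}, {high:.2f}), means differ: {differs}")
```

To two decimals this gives (3.78, 6.62), matching the slide's interval up to rounding of MSE and the margin.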


Today’s Problem

• H0: The mean scores for the 3 posters are the same

• H1: The mean scores for the 3 posters are not the same

• Critical value = 3.124

• p-value = 0.0000056

• Since F = 14.37 > 3.124 and p-value < 0.05, we reject the null hypothesis and conclude that there is a significant difference in the mean scores for the 3 posters.


Today’s Problem

The ANOVA output from Excel is as follows:

Anova: Single Factor

SUMMARY

Groups      Count   Sum   Average   Variance
Poster 1    25      117   4.68      0.476666667
Poster 2    25      145   5.8       0.583333333
Poster 3    25      137   5.48      0.676666667

ANOVA

Source of Variation   SS      df   MS            F             P-value       F crit
Between Groups        16.64   2    8.32          14.37236084   5.59715E-06   3.123907449
Within Groups         41.68   72   0.578888889
Total                 58.32   74

Reading the table: the SS column gives SS(Tr) (Between Groups), SSE (Within Groups) and SST (Total); the df column gives k-1 and n-k; the MS column gives MS(Tr) and MSE; F is the F statistic.
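The ANOVA table can be rebuilt from the SUMMARY figures alone. This sketch assumes the Variance column is the sample variance (divisor n − 1), so that each group's within-group sum of squares is (count − 1) × variance; the variable names are illustrative.

```python
# (count, average, variance) per group, copied from the SUMMARY table.
summary = {
    "Poster 1": (25, 4.68, 0.476666667),
    "Poster 2": (25, 5.80, 0.583333333),
    "Poster 3": (25, 5.48, 0.676666667),
}

n = sum(count for count, _, _ in summary.values())  # total observations
k = len(summary)                                    # number of groups
grand_mean = sum(count * mean for count, mean, _ in summary.values()) / n

# Between-group SS: spread of the group means around the grand mean.
ss_between = sum(count * (mean - grand_mean) ** 2
                 for count, mean, _ in summary.values())
# Within-group SS: (count - 1) * sample variance, summed over groups.
ss_within = sum((count - 1) * var for count, _, var in summary.values())

ms_between = ss_between / (k - 1)   # MS(Tr)
ms_within = ss_within / (n - k)     # MSE
f_stat = ms_between / ms_within

print(f"SS(Tr) = {ss_between:.2f}, SSE = {ss_within:.2f}")
print(f"MS(Tr) = {ms_between:.2f}, MSE = {ms_within:.4f}, F = {f_stat:.2f}")
```

The results agree with the Excel output: SS(Tr) = 16.64, SSE = 41.68 and F ≈ 14.37.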


Today’s Problem

• 95% confidence interval for the difference between the mean scores for posters 2 and 1:

$$(\bar{X}_2 - \bar{X}_1) \pm t \sqrt{MSE\left(\frac{1}{n_2} + \frac{1}{n_1}\right)}$$

= 0.69 to 1.55

• Since 0 is not in the interval, we conclude that this pair of means differs. We further conclude that the mean score for poster 2 is significantly higher than that for poster 1.


Learning Outcomes

• What is Analysis of Variance (ANOVA)?

• Characteristics of the F Distribution

• Test for Equal Means (single-factor ANOVA test)

• Underlying Assumptions for ANOVA

• Confidence Interval for the Difference

Between Two Means