Introduction to Quantitative Research Methods

An Introduction to Quantitative Research Methods

Dr Iman Ardekani

Content

1. From Research Methodology to Hypothesis
2. From Hypothesis to Experiments
3. Basic Statistical Concepts
4. Experimental Design and Analysis: Factorial Experiments, Comparative Experiments

Part I: From Research Methodology to Hypothesis

[Figure: a Research Methodology containing Method 1 and Method 2, each linked to Research Questions]

An example for Research Methodology

Each step may involve several research methods.

Step 1: Planning and defining RQ

Step 2: Literature Review

Step 3: Survey Development

Step 4: Data Collection

Step 5: Data Analysis

Step 6: Documentation

Methodology Scopes (including but not limited to)

1. Descriptive research (aka statistical research): to describe data and characteristics of the variables of a phenomenon.

2. Correlational research: to explore the statistical relationship between variables.

3. Experimental research: to explore the cause-and-effect relationships between the variables in controlled environments.

4. Ex post facto research: to explore the cause-and-effect relationships between the variables when the environment is not under control.

5. Survey research: to assess thoughts and opinions.

What is a variable?

Something that changes, takes different values, and that we

can alter or measure. It has two types:

1. Independent Variables (e.g. the aspect of environment)

2. Dependent Variables (e.g. behaviours of systems)

Example: when studying the effect of distance on the

transmission delay in radio telecommunication, the distance is

an independent variable and the delay is a dependent

variable.




Difference Between Research Methods and Research Methodology

Research Methodology:

1. explains the methods by which you may proceed with your research.
2. involves the learning of the various techniques that can be used in conducting research, tests, experiments, surveys, etc.
3. aims at the employment of the correct procedures to find solutions.
4. paves the way for research methods to be conducted properly.

Research Methods:

1. are the methods by which you conduct research into a subject or a topic.
2. involve the conduct of experiments, tests, surveys, etc.
3. aim at finding solutions to research problems.

Classifications of Research Methods

1. Qualitative Research Methods

2. Quantitative Research Methods



Quantitative Research Methods

Examples are survey methods, laboratory experiments, formal methods (e.g. econometrics), numerical methods and mathematical modeling.

Qualitative methods produce information only on the particular cases studied, and any more general conclusions are only hypotheses. Quantitative methods can be used to verify which of these hypotheses are true.

A number of descriptive/relational studies show that people have difficulty navigating websites when the navigation bars are inconsistent in their locations throughout a website.

Inductive Reasoning?

Deductive Reasoning?

Variables?

Hypothesis?



Inductive Reasoning?

People need consistency in navigational mechanisms.

Deductive Reasoning?

People will have more difficulties with websites if the navigation is

inconsistent.

Independent variables?

Navigational Consistency: defined as characteristics of navigational bars

and their elements such as location, font, colour, etc.

Dependent variables?

Difficulty: defined as the efficiency of navigation by user. For example,

time taken to complete tasks, errors made, usage ratings.



Hypothesis?

People will take longer to complete tasks, make more errors,

and give lower ratings of acceptability on a website with a

navigation bar that varies in its location from screen to

screen in comparison to one in which the navigation bar

appears in a consistent position on all screens.

How to test this hypothesis?

By using experiments and based on hypothesis testing

approaches!


Part II: From Hypothesis to Experiments

What is a Hypothesis?

A statement that specifically explains the relationship between the variables of a system or process.

It is a proposed explanation.

It should be tested. How?

From Hypothesis to Experiment


Statistical Hypotheses – Definition

A statement either about the parameters of a probability distribution or the variables of a system.

This may be stated formally as

H0: A = B (Null Hypothesis)

H1: A ≠ B (Alternative Hypothesis)

where A and B are statistics of two experiments.

Statistical Hypotheses – Notes

Note 1: The alternative hypothesis specified here is called a two-sided alternative hypothesis because it would be true if A > B or if A < B.

Note 2: A and B are two statistics (random variables), so their statistical distributions should be considered when examining A = B or A ≠ B.

Statistical Hypotheses Testing

Testing a hypothesis involves

1. taking a random sample

2. computing an appropriate test statistic, and then

3. rejecting or failing to reject the null hypothesis H0.

Part of this procedure is specifying the set of values for the

test statistic that leads to rejection of H0. This set of values

is called the critical region or rejection region for the test.



Errors in Hypothesis Testing

Two kinds of errors may be committed when testing hypotheses:

Type 1: the null hypothesis is rejected but it is true.

α = P(type 1 error) = P(reject H0 | H0 is true)

Type 2: the null hypothesis is not rejected but it is false.

β = P(type 2 error) = P(fail to reject H0 | H0 is false)

Power of the test is defined as

Power = 1 - β = P(reject H0 | H0 is false)
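The two error probabilities can be estimated by simulation. The following is a minimal sketch, not part of the original slides: it assumes a two-sided z-test with known standard deviation as the test, and the function names, sample size, effect size, and seed are all hypothetical choices made for illustration.

```python
import random
from statistics import NormalDist, mean

def rejects_h0(sample, mu0, sigma, alpha):
    """Two-sided z-test of H0: mean == mu0, with known sigma (assumed test)."""
    z0 = (mean(sample) - mu0) / (sigma / len(sample) ** 0.5)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return abs(z0) > z_crit

def rejection_rate(true_mean, mu0=0.0, sigma=1.0, n=25, alpha=0.05,
                   trials=4000, seed=1):
    """Fraction of simulated experiments in which H0 is rejected."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        if rejects_h0(sample, mu0, sigma, alpha):
            rejections += 1
    return rejections / trials

# H0 is true here, so the rejection rate estimates the type 1 error probability.
type1_rate = rejection_rate(true_mean=0.0)
# H0 is false here, so the rejection rate estimates the power (1 - type 2 error).
power = rejection_rate(true_mean=0.5)
print(f"estimated type 1 rate = {type1_rate:.3f}, estimated power = {power:.3f}")
```

The first estimate should land near the chosen significance level of 0.05. The second depends on how far the true mean is from the hypothesised one and on the sample size.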

Significance Level

α, the probability of a type 1 error, is called the significance level.

The objective of a statistical test is to achieve a low significance level while still maintaining high test power.

Statistically Significant Hypotheses

A hypothesis supported by the statistical hypothesis testing method (that is, by rejecting H0 at the chosen significance level) is called statistically significant, since the observed result would be unlikely in a probability sense if H0 were true.

Experiment – Definition

An experiment is a test or a series of tests.

The hypothesis can describe the relationship between the x, z and y variables, and an experiment can verify this hypothesis.

How to plan, conduct and analyze an experiment?

Step 1 - Recognition of and statement of the problem

Step 2 - Selection of the response variable

Step 3 - Choice of factors, levels, and range

Step 4 - Choice of experimental design

Step 5 - Performing the experiment

Step 6 - Statistical analysis of the data

Step 7 - Conclusions and recommendations

Let's continue with the following example:

I really like to play golf. Unfortunately, I do not enjoy practicing, so I am

always looking for a simpler solution to lowering my score. Some of the

factors that I think may be important, or that may influence my golf score,

are as follows:

1. The type of driver used (oversized or regular sized)

2. The type of ball used (balata or three piece)

3. Walking and carrying the golf clubs or riding in a golf cart

4. Drinking water or drinking beer while playing



Best-guess Experiments

Change one or several factors for the next round, based on the

outcome of the current test, in order to improve the output.

Example:

Round 1: oversized driver, balata ball, walk, and water:

Score 87: Noticed several wayward shots with the big driver

Round 2: regular-sized driver, balata ball, walk, and water:

Score 80: Noticed that walking is tiring

Round 3: regular-sized driver, balata ball, golf cart, and water:

Score 78: Noticed that …

One-factor-at-a-time Experiments

Select a starting point (a baseline setting for each factor), then successively vary each factor over its range with the other factors held constant at the baseline level.

Example:

Starting point: oversized driver, balata ball, walking, and drinking water.

Example for one-factor-at-a-time approach:

Conclusion:

regular-sized driver, balata ball, riding, and drinking water

is the optimal combination.



Problem with one-factor-at-a-time approach

Interactions between factors are very common. If they occur, the one-factor-at-a-time approach will usually produce poor results.

To solve this problem, a factorial experiment design can be used.


Part III: Basic Statistical Concepts

Mean (μ): a measure of central tendency.

μ = E{y}

Variance (σ^2): a measure of how far a set of numbers is spread out.

σ^2 = V(y) = E{(y-μ)^2}

If c is a constant and y is a random variable with mean μ and variance σ^2, then

1. E(c) = c

2. E(y) = μ

3. E(cy) = c E(y) = cμ

4. V(c) = 0

5. V(y) = σ^2

6. V(cy) = c^2 V(y) = c^2 σ^2

If y1 is a random variable with mean μ1 and variance σ1^2, and y2 is another random variable with mean μ2 and variance σ2^2, then

1. E(y1+y2) = E(y1) + E(y2) = μ1 + μ2

2. E(y1-y2) = E(y1) - E(y2) = μ1 - μ2

3. V(y1+y2) = V(y1) + V(y2) = σ1^2 + σ2^2 (for independent y1 and y2)

4. V(y1-y2) = V(y1) + V(y2) = σ1^2 + σ2^2 (for independent y1 and y2)

5. E(y1y2) = E(y1) E(y2) = μ1 μ2 (for independent y1 and y2)
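These identities can be checked exactly on a small example. The sketch below is an illustration only: the two discrete distributions and the helper names are hypothetical, and the variables are taken to be independent by constructing their joint distribution as a product. Both example variables have non-zero means.

```python
from fractions import Fraction
from itertools import product

def expectation(dist):
    """dist maps each possible value to its probability."""
    return sum(p * v for v, p in dist.items())

def variance(dist):
    mu = expectation(dist)
    return sum(p * (v - mu) ** 2 for v, p in dist.items())

def combine(d1, d2, op):
    """Distribution of op(y1, y2) for independent y1 ~ d1 and y2 ~ d2."""
    out = {}
    for (v1, p1), (v2, p2) in product(d1.items(), d2.items()):
        value = op(v1, v2)
        out[value] = out.get(value, 0) + p1 * p2
    return out

third, half = Fraction(1, 3), Fraction(1, 2)
y1 = {1: third, 2: third, 3: third}   # hypothetical: mean 2, variance 2/3
y2 = {0: half, 10: half}              # hypothetical: mean 5, variance 25

total = combine(y1, y2, lambda a, b: a + b)
diff = combine(y1, y2, lambda a, b: a - b)
prod = combine(y1, y2, lambda a, b: a * b)

print(variance(total) == variance(y1) + variance(y2))
print(variance(diff) == variance(y1) + variance(y2))
print(expectation(prod) == expectation(y1) * expectation(y2))
```

Exact fractions are used so that the comparisons are not affected by floating-point rounding.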

Statistic: Statistical inference makes considerable

use of quantities computed from the observations

in the sample. We define a statistic as any

function of the observations in a sample that does

not contain unknown parameters:

1. Sample mean

2. Sample Variance

3. and even the random variable (quantity) itself!



Sample Mean (shown by ȳ)

$$ \bar{y} = \frac{1}{n} \sum_{i=1}^{n} y_i $$

Sample Variance (shown by S^2)

$$ S^2 = \frac{1}{n-1} \sum_{i=1}^{n} (y_i - \bar{y})^2 $$
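As a small sketch of how the two statistics are computed, the functions below work directly from the standard definitions (the mean of the observations, and the sum of squared deviations divided by n-1). The data values are hypothetical and the function names are chosen for illustration.

```python
def sample_mean(ys):
    """Arithmetic mean of the observations."""
    return sum(ys) / len(ys)

def sample_variance(ys):
    """Sum of squared deviations from the sample mean, divided by n - 1."""
    y_bar = sample_mean(ys)
    ss = sum((y - y_bar) ** 2 for y in ys)   # SS, the sum of squares
    return ss / (len(ys) - 1)

data = [4.0, 7.0, 5.0, 8.0]   # hypothetical observations
print(sample_mean(data), sample_variance(data))
```

Python's standard library offers `statistics.mean` and `statistics.variance`, which compute the same two quantities.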

Find the sample mean and sample variance for each data set.

ȳ1 = ?

ȳ2 = ?

S1^2 = ?

S2^2 = ?

Sampling Distribution

The probability distribution of a statistic is called a

sampling distribution. Important examples are:

1. Normal distribution

2. Chi-Square Distribution (χ^2 Distribution)

3. t Distribution



Normal Distribution

y ~ N(μ, σ^2)

In the general case, μ is the mean of the distribution and σ is the standard deviation.

An important special case is the standard normal distribution, where μ = 0 and σ = 1.

If y ~ N(μ, σ^2), then z = (y-μ)/σ always has a standard normal distribution.

The Central Limit Theorem

If y1, y2, …, yn is a sequence of n independent and identically distributed random variables with E(yi) = μ and V(yi) = σ^2 (both finite) and x = y1 + y2 + … + yn, then the following random variable has an approximately standard normal distribution for large n:

$$ z_n = \frac{x - n\mu}{\sqrt{n\sigma^2}} $$
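The theorem can be illustrated with a short simulation. This is a sketch under stated assumptions rather than a proof: the yi are taken to be Uniform(0, 1) draws (mean 1/2, variance 1/12), and n = 30, the number of repetitions, and the seed are arbitrary illustrative choices.

```python
import random
from statistics import mean, pstdev

def standardized_sum(n, rng):
    """z_n = (x - n*mu) / sqrt(n*sigma^2), with x the sum of n Uniform(0, 1) draws."""
    mu, sigma2 = 0.5, 1.0 / 12.0          # mean and variance of Uniform(0, 1)
    x = sum(rng.random() for _ in range(n))
    return (x - n * mu) / (n * sigma2) ** 0.5

rng = random.Random(0)
z_values = [standardized_sum(30, rng) for _ in range(5000)]

# For a standard normal variable, about 95% of values lie within +/- 1.96.
within_196 = sum(abs(z) < 1.96 for z in z_values) / len(z_values)
print(f"mean = {mean(z_values):.3f}, sd = {pstdev(z_values):.3f}, "
      f"P(|z| < 1.96) = {within_196:.3f}")
```

Although each yi is uniform rather than normal, the standardized sum should show a mean near 0, a standard deviation near 1, and roughly 95% of values inside ±1.96.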

Chi-Square Distribution

x ~ χ^2_k

If x can be obtained as the sum of the squares of k independent standard normal random variables, then x follows the chi-square distribution with k degrees of freedom.

As an example of a random variable that follows the chi-square distribution, suppose that y1, y2, …, yn is a random sample from an N(μ, σ^2) distribution. Then (SS = Sum of Squares)

$$ \frac{SS}{\sigma^2} = \frac{\sum_{i=1}^{n} (y_i - \bar{y})^2}{\sigma^2} \sim \chi^2_{n-1} $$

That is, SS/σ^2 is distributed as chi-square with n-1 degrees of freedom.

Since S^2 = SS/(n-1), the distribution of S^2 is

$$ S^2 \sim \frac{\sigma^2}{n-1}\, \chi^2_{n-1} $$

Thus, the sampling distribution of the sample variance is a constant times the chi-square distribution if the population is normally distributed.

t Distribution

If z ~ N(0,1) and χ^2_k is a chi-square variable with k degrees of freedom, independent of z, then the random variable

$$ t_k = \frac{z}{\sqrt{\chi^2_k / k}} $$

follows the t distribution with k degrees of freedom.

If y1, y2, …, yn is a random sample from the N(μ, σ^2) distribution, then the quantity

$$ t = \frac{\bar{y} - \mu}{S / \sqrt{n}} $$

is distributed as t with n-1 degrees of freedom.


Part IV: Experimental Design

Factorial Experiments

Factors are varied together, instead of one at a time.

A special kind of statistical experiment design.

2^2 Factorial Design (2 factors, each at 2 levels). For example:

Example for 2^2 factorial design

8 rounds, replicated twice for each driver-ball combination

Driver Effect?

$$ \text{Driver Effect} = \frac{92+94+93+91}{4} - \frac{88+91+88+90}{4} = 3.25 $$

That is, on average, switching from the oversized driver to the regular-sized driver increases the score by 3.25 strokes per round.

Ball Effect?

$$ \text{Ball Effect} = \frac{88+91+92+94}{4} - \frac{88+90+93+91}{4} = 0.75 $$

That is, on average, switching from the balata ball to the three-piece ball increases the score by 0.75 strokes per round.

Driver-Ball Interaction Effect?

$$ \text{Driver-Ball Interaction Effect} = \frac{92+94+88+90}{4} - \frac{88+91+93+91}{4} = 0.25 $$

That is, on average, switching both the ball and the driver increases the score by 0.25 strokes per round.

Finally, one can conclude that

Driver effect > Ball effect > Interaction
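The three effects can be recomputed from the eight scores. This is a sketch only. The slides give the scores inside the effect formulas rather than as a table, so the assignment of scores to driver-ball cells below is inferred from those formulas and should be read as an assumption, as should the helper name `level_mean`.

```python
from statistics import mean

# Scores keyed by (driver, ball). Cell assignment is inferred from the
# effect formulas on the slides, not taken from an explicit table.
scores = {
    ("oversized", "balata"): [88, 90],
    ("regular", "balata"): [93, 91],
    ("oversized", "three-piece"): [88, 91],
    ("regular", "three-piece"): [92, 94],
}

def level_mean(factor_index, level):
    """Average score over all rounds played at one level of one factor."""
    return mean(s for key, runs in scores.items()
                if key[factor_index] == level for s in runs)

# Main effect = average at one level minus average at the other level.
driver_effect = level_mean(0, "regular") - level_mean(0, "oversized")
ball_effect = level_mean(1, "three-piece") - level_mean(1, "balata")

# Interaction = average over one diagonal of the 2x2 square minus the other.
same_diagonal = scores[("regular", "three-piece")] + scores[("oversized", "balata")]
other_diagonal = scores[("oversized", "three-piece")] + scores[("regular", "balata")]
interaction_effect = mean(same_diagonal) - mean(other_diagonal)

print(driver_effect, ball_effect, interaction_effect)
```

Because every factor level is averaged over all levels of the other factor, each of the eight rounds contributes to every effect estimate, which is what distinguishes the factorial design from changing one factor at a time.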

2^3 Factorial Design (3 factors, each at 2 levels):

How to calculate the ball effect, driver effect, beverage effect and interaction effects?

Comparative Experiments

Comparative experiments compare two experimental conditions. For example, comparative experiments can be used to determine whether two different formulations of a product give equivalent results.

Apple tree 1 (AKL), apple weights (kg): 0.101, 0.111, 0.103, 0.102, 0.121, 0.102, 0.101

Apple tree 2 (ChCH), apple weights (kg): 0.102, 0.101, 0.105, 0.106, 0.111, 0.98, 0.110

Same cultivation conditions.

Hypothesis: same apple weights?

Data Model for Comparative Experiments

The following model for each data set is considered:

yi = μ + εi

μ is the mean of the data set

εi (noise) is assumed to be distributed as NID(0, σ^2)

Comparative Experiments Formulation:

In the general case, we have two data sets:

$$ y_{11}, y_{12}, \ldots, y_{1 n_1} \quad \text{and} \quad y_{21}, y_{22}, \ldots, y_{2 n_2} $$

The statistical hypothesis is formulated by

H0: μ1 = μ2 (Null Hypothesis)

H1: μ1 ≠ μ2 (Alternative Hypothesis)

Two Sample t-test

1. Assume that the variances of the two data sets are equal: σ1^2 = σ2^2

2. Form the following statistic

$$ t_0 = \frac{\bar{y}_1 - \bar{y}_2}{S_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} $$

where ȳ1 and ȳ2 are the sample means of data sets 1 and 2, and Sp^2 is the estimate of the common variance:

$$ S_p^2 = \frac{(n_1 - 1) S_1^2 + (n_2 - 1) S_2^2}{n_1 + n_2 - 2} $$

3. To determine whether to reject H0, compare t0 to the t distribution with n1+n2-2 degrees of freedom.

4. We would reject H0 if

$$ |t_0| > t_{\alpha/2,\, n_1 + n_2 - 2} $$
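The procedure can be sketched in a few lines. This is an illustration under assumptions: the pooled (equal-variance) form of the statistic is used, the critical value is passed in by the caller as if read from a printed t table, and the two small data sets and the function name are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def pooled_t_test(sample1, sample2, t_critical):
    """Return (t0, degrees of freedom, reject H0?) for the equal-variance t-test."""
    n1, n2 = len(sample1), len(sample2)
    df = n1 + n2 - 2
    # Pooled estimate of the common variance.
    sp2 = ((n1 - 1) * variance(sample1) + (n2 - 1) * variance(sample2)) / df
    t0 = (mean(sample1) - mean(sample2)) / (sqrt(sp2) * sqrt(1 / n1 + 1 / n2))
    return t0, df, abs(t0) > t_critical

# Hypothetical data; 2.776 is the tabulated t value for alpha/2 = 0.025 and 4
# degrees of freedom.
t0, df, reject = pooled_t_test([1.0, 2.0, 3.0], [2.0, 3.0, 4.0], t_critical=2.776)
print(t0, df, reject)
```

Passing the critical value as an argument mirrors the table lookup used on the slides and keeps the sketch free of third-party libraries.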

Are the bond strengths of the two cement mortars similar at the significance level of α = 0.05?

Using the provided table:

t0 = -2.2

For α = 0.05 and 18 degrees of freedom the tabulated critical value is 2.101, and |t0| = 2.2 exceeds it.

Thus we would reject H0 and we can conclude that the strengths of the two mortars are different.

Repeat the previous problem with α = 0.01.

Exercise

Two machines are used for filling plastic bottles with a net volume of 16.0 ounces. The filling process can be assumed to be normal. The quality engineering department suspects that both machines fill to the same net volume. An experiment is performed by taking a random sample from the output of each machine. Would you reject or fail to reject the quality engineering department's hypothesis?

P Value

The smallest level of significance α that would lead to the rejection of H0, given the observed test statistic t0 and the degrees of freedom v.

P Value Calculation for Previous Example

t0 = -2.2, v = 18: Min α = 0.0411
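A two-sided P value of this kind can be approximated without statistical tables by integrating the t density numerically. The sketch below is one possible approach, not the method used on the slides: it applies Simpson's rule to the standard Student t density, and the function names and step count are illustrative choices.

```python
import math

def t_pdf(x, df):
    """Probability density of the Student t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def two_sided_p_value(t0, df, steps=2000):
    """P(|T| >= |t0|), using Simpson's rule on [0, |t0|] (steps must be even)."""
    upper = abs(t0)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = total * h / 3          # approximates P(0 <= T <= |t0|)
    return 1.0 - 2.0 * central_area

p = two_sided_p_value(-2.2, 18)
print(f"P value for t0 = -2.2 with 18 degrees of freedom: {p:.4f}")
```

The result should be close to the value quoted on the slide. Small differences are expected because the slide's t0 is rounded to one decimal place.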

Confidence Interval

It is often preferable to provide an interval within which the parameter in question would be expected to lie. These interval statements are called confidence intervals.

For the two-sample problem, this interval estimates the difference between the two means and the accuracy of this estimate.

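Confidence Interval Calculation

The 100(1-α) percent confidence interval is

L ≤ μ1-μ2 ≤ U

$$ L = \bar{y}_1 - \bar{y}_2 - t_{\alpha/2,\, n_1+n_2-2}\, S_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}} $$

$$ U = \bar{y}_1 - \bar{y}_2 + t_{\alpha/2,\, n_1+n_2-2}\, S_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}} $$

where Sp^2 is the pooled estimate of the common variance of the two data sets.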
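The interval limits can be computed in the same way as the test statistic. The following is a minimal sketch assuming equal variances and a critical value supplied by the caller from a t table. The data and the function name are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def mean_difference_ci(sample1, sample2, t_critical):
    """Confidence limits (L, U) for mu1 - mu2 under the equal-variance assumption."""
    n1, n2 = len(sample1), len(sample2)
    # Pooled estimate of the common variance.
    sp2 = ((n1 - 1) * variance(sample1) + (n2 - 1) * variance(sample2)) / (n1 + n2 - 2)
    diff = mean(sample1) - mean(sample2)
    half_width = t_critical * sqrt(sp2) * sqrt(1 / n1 + 1 / n2)
    return diff - half_width, diff + half_width

# Hypothetical data; 2.776 is the tabulated t value for alpha/2 = 0.025 and 4
# degrees of freedom, giving a 95% interval.
lower, upper = mean_difference_ci([1.0, 2.0, 3.0], [2.0, 3.0, 4.0], t_critical=2.776)
contains_zero = lower <= 0.0 <= upper
print(lower, upper, contains_zero)
```

When the interval contains 0, as it does for this hypothetical data, H0 would not be rejected at the corresponding significance level.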

Confidence Interval

L ≤ μ1-μ2 ≤ U

μ1-μ2 = 0.5(L+U) ± 0.5(U-L)

It means the estimated mean difference is 0.5(L+U) and the accuracy of this estimate is 0.5(U-L).

If 0 is not in the interval, H0 would be rejected.

For the previous example:

The 95% confidence interval is

L = -0.55

U = -0.01

-0.55 ≤ μ1-μ2 ≤ -0.01 (μ1-μ2 = 0 is not in the interval)

μ1-μ2 = -0.28 ± 0.27

It means the difference between the two mortars' strengths is -0.28 with an accuracy of 0.27.

For the previous example, estimate the difference between the two mortars' strengths and the accuracy of your estimate by calculating the confidence interval of the t-test.

The difference between the two mortars' strengths is -0.28 with an accuracy of 0.27.

Some experiments involve comparing only one population mean to a specified value, say,

H0: μ1 = μ0

H1: μ1 ≠ μ0

This problem is a simplified version of the two-sample t-test problem. When the population variance is known, the test is called the one-sample Z-test.

One-Sample Z-test

1. Assume that the variance of the population, σ^2, is known.

2. Form the following statistic

$$ Z_0 = \frac{\bar{y} - \mu_0}{\sigma / \sqrt{n}} $$

3. If H0 is true, then the distribution of Z0 is N(0, 1). Therefore, we would reject H0 if |Z0| > Z_{α/2}.

4. Z_{α/2} should be obtained from a table.

Finding Z_{α/2} for α = 0.05:

1 - 0.5α = 0.975

Z_{α/2} = 1.96

In the population, the average IQ is 100 with a

standard deviation of 15. A team of scientists wants to

test a new medication to see if it has either a positive

or negative effect on intelligence, or no effect at all. A

sample of 30 participants who have taken the

medication has a mean of 140. Did the medication

affect intelligence, using alpha = 0.05?




If Z is less than -1.96, or greater than 1.96, reject the null hypothesis.

Result: Reject the null hypothesis.

Conclusion: Medication significantly affected intelligence, z = 14.61, p < 0.05.
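The calculation for this example can be reproduced with the standard library. This is a sketch: the function name and return shape are illustrative, and the critical value is taken from `statistics.NormalDist` instead of a printed table.

```python
from math import sqrt
from statistics import NormalDist

def one_sample_z_test(sample_mean, mu0, sigma, n, alpha=0.05):
    """Return (Z0, critical value, reject H0?) for a two-sided one-sample z-test."""
    z0 = (sample_mean - mu0) / (sigma / sqrt(n))
    z_critical = NormalDist().inv_cdf(1 - alpha / 2)
    return z0, z_critical, abs(z0) > z_critical

# Values from the IQ example: population mean 100, standard deviation 15,
# sample of 30 participants with mean 140.
z0, z_critical, reject = one_sample_z_test(sample_mean=140, mu0=100, sigma=15, n=30)
print(f"Z0 = {z0:.2f}, critical value = {z_critical:.2f}, reject H0: {reject}")
```

The statistic is far outside the critical region boundary, so the null hypothesis is rejected, in agreement with the result stated above.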

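The confidence interval of the one-sample Z-test is

$$ \bar{y} - Z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \le \mu \le \bar{y} + Z_{\alpha/2} \frac{\sigma}{\sqrt{n}} $$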

Find the confidence interval for the previous example.

L = 140 - 1.96 × 15 / √30 = 134.63

U = 140 + 1.96 × 15 / √30 = 145.37

140 ± 5.37
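The same limits can be checked numerically. This is a sketch using the example's values: the function name is illustrative, and the normal quantile comes from `statistics.NormalDist` rather than the rounded table value 1.96, so the last digit may differ slightly from a hand calculation.

```python
from math import sqrt
from statistics import NormalDist

def z_confidence_interval(sample_mean, sigma, n, alpha=0.05):
    """Two-sided 100(1 - alpha)% confidence limits for the mean, sigma known."""
    half_width = NormalDist().inv_cdf(1 - alpha / 2) * sigma / sqrt(n)
    return sample_mean - half_width, sample_mean + half_width

lower, upper = z_confidence_interval(sample_mean=140, sigma=15, n=30)
print(f"{lower:.2f} <= mu <= {upper:.2f}")
```

The hypothesised mean of 100 lies far outside this interval, which is consistent with rejecting H0 in the z-test.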

Violation of Assumptions in t-test

Two main assumptions are:

1. Normal distribution: In practice, the assumption of normal distribution can be violated to some extent without affecting the effectiveness of the t-test.

2. Equal variance: If this assumption is violated, other test techniques should be used.

