An Introduction to Quantitative Research Methods
Dr Iman Ardekani
Content
1. From Research Methodology to Hypothesis
2. From Hypothesis to Experiments
3. Basic Statistical Concepts
4. Experimental Design and Analysis: Factorial Experiments, Comparative Experiments
Part I: From Research Methodology to Hypothesis
Research Methodology
[Diagram: Research Methodology, comprising Method 1, Method 2, …, each linked to Research Questions]
An example of a Research Methodology
Each step may involve several research methods.
Step 1: Planning and defining the research questions (RQ)
Step 2: Literature Review
Step 3: Survey Development
Step 4: Data Collection
Step 5: Data Analysis
Step 6: Documentation
Methodology Scopes (including but not limited to)
1. Descriptive research (aka statistical research): to describe data
and characteristics of the variables of a phenomenon.
2. Correlational research: to explore the statistical relationship
between variables.
3. Experimental research: to explore cause-and-effect relationships
between the variables in a controlled environment.
4. Ex post facto research: to explore cause-and-effect relationships
between the variables when the environment is not under control.
5. Survey research: to assess thoughts and opinions.
What is a variable?
Something that changes, takes different values, and that we
can alter or measure. There are two types:
1. Independent variables (e.g. aspects of the environment)
2. Dependent variables (e.g. behaviours of systems)
Example: when studying the effect of distance on the
transmission delay in radio telecommunication, the distance is
an independent variable and the delay is a dependent
variable.
Difference Between Research Methods and Research Methodology
Research Methodology:
1. explains the methods by which you may proceed with your research.
2. involves learning the various techniques that can be used in conducting research, tests, experiments, surveys, etc.
3. aims at employing the correct procedures to find solutions.
4. paves the way for research methods to be conducted properly.
Research Methods:
1. are the methods by which you conduct research into a subject or a topic.
2. involve conducting experiments, tests, surveys, etc.
3. aim at finding solutions to research problems.
Classifications of Research Methods
1. Qualitative Research Methods
2. Quantitative Research Methods
Quantitative Research Methods
Examples are survey methods, laboratory experiments,
formal methods (e.g. econometrics), numerical methods
and mathematical modeling.
Qualitative methods produce information only on the
particular cases studied, and any more general
conclusions are only hypotheses. Quantitative methods
can be used to verify which of these hypotheses are
true.
A number of descriptive/relational studies show that people
have difficulty navigating websites when the navigation
bars are inconsistent in their locations throughout a website.
Inductive Reasoning?
Deductive Reasoning?
Variables?
Hypothesis?
Inductive Reasoning?
People need consistency in navigational mechanisms.
Deductive Reasoning?
People will have more difficulties with websites if the navigation is
inconsistent.
Independent variables?
Navigational Consistency: defined as characteristics of navigational bars
and their elements such as location, font, colour, etc.
Dependent variables?
Difficulty: defined as the efficiency of navigation by the user. For example,
time taken to complete tasks, errors made, usage ratings.
Hypothesis?
People will take longer to complete tasks, make more errors,
and give lower ratings of acceptability on a website with a
navigation bar that varies in its location from screen to
screen in comparison to one in which the navigation bar
appears in a consistent position on all screens.
How to test this hypothesis?
By using experiments and based on hypothesis testing
approaches!
Part II: From Hypothesis to Experiments
What is a Hypothesis?
A statement that specifically explains the
relationship between the variables of a system or
process.
It is a proposed explanation.
It should be tested. How?
Statistical Hypotheses – Definition
A statement either about the parameters of a probability
distribution or the variables of a system.
This may be stated formally as
H0: A = B (null hypothesis)
H1: A ≠ B (alternative hypothesis)
where A and B are statistics of two experiments.
Statistical Hypotheses – Notes
Note 1: The alternative hypothesis specified here is called a
two-sided alternative hypothesis because it would be true
if A > B or if A < B.
Note 2: A and B are statistics (random variables), so to
examine A = B or A ≠ B, their statistical distributions
should be considered.
Statistical Hypotheses Testing
Testing a hypothesis involves
1. taking a random sample
2. computing an appropriate test statistic, and then
3. rejecting or failing to reject the null hypothesis H0.
Part of this procedure is specifying the set of values for the
test statistic that leads to rejection of H0. This set of values
is called the critical region or rejection region for the test.
Errors in Hypothesis Testing
Two kinds of errors may be committed when testing hypotheses:
Type 1: the null hypothesis is rejected but it is true.
α = P(type 1 error) = P(reject H0 | H0 is true)
Type 2: the null hypothesis is not rejected but it is false.
β = P(type 2 error) = P(fail to reject H0 | H0 is false)
Power of the test is defined as
Power = 1 - β = P(reject H0 | H0 is false)
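The two error probabilities can also be estimated by simulation. The sketch below is an illustration only and is not taken from the slides: it assumes a two-sided test on the mean of normally distributed data with known standard deviation, and the sample size, the true means, and the name `rejection_rate` are hypothetical choices. When H0 is true, the rejection rate estimates the type 1 error probability; when H0 is false, it estimates the power.

```python
import math
import random
from statistics import NormalDist, fmean


def rejection_rate(true_mean, mu0=0.0, sigma=1.0, n=25,
                   alpha=0.05, trials=4000, seed=1):
    """Fraction of simulated samples for which H0: mean = mu0 is rejected."""
    rng = random.Random(seed)
    # Two-sided critical value of the standard normal distribution.
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        z0 = (fmean(sample) - mu0) / (sigma / math.sqrt(n))
        if abs(z0) > z_crit:
            rejections += 1
    return rejections / trials


# H0 is true: every rejection is a type 1 error.
alpha_hat = rejection_rate(true_mean=0.0)
# H0 is false (assumed true mean 0.5): every rejection is a correct decision.
power_hat = rejection_rate(true_mean=0.5)

print(f"estimated P(type 1 error): {alpha_hat:.3f}")
print(f"estimated power:           {power_hat:.3f}")
print(f"estimated P(type 2 error): {1 - power_hat:.3f}")
```

The first estimate should land near the chosen significance level of 0.05; the power depends on how far the true mean is from the hypothesised one and on the sample size.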
Significance Level
α, the probability of a type 1 error, is called the significance level.
The objective of a statistical test is to achieve a low
significance level while still maintaining high test
power.
Statistically Significant Hypotheses
A result established by rejecting the null hypothesis in a statistical
hypothesis test is called statistically significant: data this extreme
would be unlikely, in a probability sense, if the null hypothesis were true.
Experiment – Definition
An experiment is a test or a series of tests.
A hypothesis can describe the relationship between the variables x, z and y,
and an experiment can test this hypothesis.
How to plan, conduct and analyze an experiment?
Step 1 - Recognition of and statement of the problem
Step 2 - Selection of the response variable
Step 3 - Choice of factors, levels, and range
Step 4 - Choice of experimental design
Step 5 - Performing the experiment
Step 6 - Statistical analysis of the data
Step 7 - Conclusions and recommendations
Let's continue with the following example:
I really like to play golf. Unfortunately, I do not enjoy practicing, so I am
always looking for a simpler solution to lowering my score. Some of the
factors that I think may be important, or that may influence my golf score,
are as follows:
1. The type of driver used (oversized or regular sized)
2. The type of ball used (balata or three piece)
3. Walking and carrying the golf clubs or riding in a golf cart
4. Drinking water or drinking beer while playing
Best-guess Experiments
Change one or several factors for the next round, based on the
outcome of the current test, in order to improve the output.
Example:
Round 1: oversized driver, balata ball, walk, and water:
Score 87: Noticed several wayward shots with the big driver
Round 2: regular-sized driver, balata ball, walk, and water:
Score 80: Noticed that people easily get tired when walking
Round 3: regular-sized driver, balata ball, golf cart, and water:
Score 78: Noticed that …
One-factor-at-a-time Experiments
Select a starting point (a baseline setting for each factor), then
successively vary each factor over its range with the other factors
held constant at the baseline level.
Example:
Starting point: oversized driver, balata ball, walking, and
drinking water.
Example for one-factor-at-a-time approach:
Conclusion:
regular-sized driver, balata ball, riding, and drinking water
is the optimal combination.
Problem with one-factor-at-a-time approach
Interactions between factors are very common. If they occur, the
one-factor-at-a-time approach will usually produce poor results.
To solve this problem, a factorial experiment design can be used.
Part III: Basic Statistical Concepts
Mean (μ): a measure of central tendency.
μ = E{y}
Variance (σ²): a measure of how far a set of
numbers is spread out.
σ² = V(y) = E{(y-μ)²}
If c is a constant and y is a random variable with the
mean of μ and variance of σ², then
1. E(c) = c
2. E(y) = μ
3. E(cy) = c E(y) = cμ
4. V(c) = 0
5. V(y) = σ²
6. V(cy) = c² V(y) = c²σ²
If y1 is a random variable with the mean of μ1 and variance
of σ1², and y2 is another random variable with the mean of
μ2 and variance of σ2², then
1. E(y1+y2) = E(y1) + E(y2) = μ1 + μ2
2. E(y1-y2) = E(y1) - E(y2) = μ1 - μ2
3. V(y1+y2) = V(y1) + V(y2) = σ1² + σ2² (for independent y1 and y2)
4. V(y1-y2) = V(y1) + V(y2) = σ1² + σ2² (for independent y1 and y2)
5. E(y1y2) = E(y1) E(y2) = μ1 μ2 (for independent y1 and y2)
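These rules can be checked numerically. The following sketch is illustrative only: it assumes two independent normal variables with hypothetical parameters (means 2 and -1, standard deviations 3 and 2) and compares simulated moments with the values the rules predict. The means are deliberately non-zero to show that the variance rules need independence only.

```python
import random
from statistics import fmean, pvariance

rng = random.Random(0)
N = 20_000

# Hypothetical parameters: y1 ~ N(2, 3^2) and y2 ~ N(-1, 2^2), independent.
y1 = [rng.gauss(2.0, 3.0) for _ in range(N)]
y2 = [rng.gauss(-1.0, 2.0) for _ in range(N)]

sums = [a + b for a, b in zip(y1, y2)]
diffs = [a - b for a, b in zip(y1, y2)]
prods = [a * b for a, b in zip(y1, y2)]

mean_sum = fmean(sums)        # rule 1 predicts 2 + (-1) = 1
var_sum = pvariance(sums)     # rule 3 predicts 9 + 4 = 13
var_diff = pvariance(diffs)   # rule 4 predicts 9 + 4 = 13
mean_prod = fmean(prods)      # rule 5 predicts 2 * (-1) = -2

print(f"E(y1+y2) ~ {mean_sum:.2f}")
print(f"V(y1+y2) ~ {var_sum:.2f}")
print(f"V(y1-y2) ~ {var_diff:.2f}")
print(f"E(y1*y2) ~ {mean_prod:.2f}")
```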
Statistic: Statistical inference makes considerable
use of quantities computed from the observations
in the sample. We define a statistic as any
function of the observations in a sample that does
not contain unknown parameters:
1. Sample mean
2. Sample Variance
3. and even the random variable (quantity) itself!
Sample Mean (shown by ȳ)
$\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i$
Sample Variance (shown by S²)
$S^2 = \frac{1}{n-1}\sum_{i=1}^{n} (y_i - \bar{y})^2$
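As a small worked illustration, the two statistics can be computed directly from their definitions. The data values and the function names below are hypothetical and chosen only so that the arithmetic is easy to follow; the result is cross-checked against Python's standard `statistics.variance`, which also divides by n-1.

```python
import statistics


def sample_mean(ys):
    """Arithmetic mean of the observations."""
    return sum(ys) / len(ys)


def sample_variance(ys):
    """Sum of squared deviations from the sample mean, divided by n - 1."""
    y_bar = sample_mean(ys)
    ss = sum((y - y_bar) ** 2 for y in ys)
    return ss / (len(ys) - 1)


# Hypothetical observations, not taken from the slides.
data = [2, 4, 4, 4, 5, 5, 7, 9]

y_bar = sample_mean(data)       # 40 / 8
s2 = sample_variance(data)      # 32 / 7

print(f"sample mean     = {y_bar}")
print(f"sample variance = {s2:.4f}")
print(f"library value   = {statistics.variance(data):.4f}")
```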
Find the sample mean and sample
variance for each data set.
ȳ1 = ?
ȳ2 = ?
S1² = ?
S2² = ?
Sampling Distribution
The probability distribution of a statistic is called a
sampling distribution. Important examples are:
1. Normal distribution
2. Chi-Square Distribution (χ² Distribution)
3. t Distribution
Normal Distribution
y ~ N(μ, σ²)
In the general case, μ is the mean of the
distribution and σ is the standard
deviation.
An important special case is the
standard normal distribution, where
μ=0 and σ=1.
If y ~ N(μ, σ²), then z = (y-μ)/σ always
has a standard normal distribution.
The Central Limit Theorem
If y1, y2, …, yn is a sequence of n independent and
identically distributed random variables with E(yi) = μ
and V(yi) = σ², and x = y1 + y2 + … + yn, then the following
random variable has approximately a standard normal distribution
for large n (the approximation becomes exact as n → ∞):
$z_n = \frac{x - n\mu}{\sqrt{n\sigma^2}}$
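The theorem can be illustrated by simulation. In this sketch (an assumption-laden example, not part of the slides) each observation is drawn from a Uniform(0, 1) distribution, which has mean 1/2 and variance 1/12 and is clearly not normal; the choice of n = 30 and the function name `standardized_sums` are hypothetical. The standardized sums should nevertheless have mean close to 0, variance close to 1, and roughly 95% of their values inside ±1.96.

```python
import math
import random
from statistics import fmean, pvariance


def standardized_sums(n=30, trials=20_000, seed=0):
    """Simulate z_n = (x - n*mu) / sqrt(n*sigma^2) for sums of uniforms."""
    rng = random.Random(seed)
    mu, sigma2 = 0.5, 1.0 / 12.0  # mean and variance of Uniform(0, 1)
    values = []
    for _ in range(trials):
        x = sum(rng.random() for _ in range(n))
        values.append((x - n * mu) / math.sqrt(n * sigma2))
    return values


z = standardized_sums()
z_mean = fmean(z)
z_var = pvariance(z)
inside = sum(abs(v) < 1.96 for v in z) / len(z)

print(f"mean of z_n        ~ {z_mean:.3f}")
print(f"variance of z_n    ~ {z_var:.3f}")
print(f"P(|z_n| < 1.96)    ~ {inside:.3f}")
```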
Chi-Square Distribution
$x \sim \chi^2_k$
If x can be obtained as the sum
of the squares of k independent
standard normal random
variables, then x follows the
chi-square distribution with k
degrees of freedom.
As an example of a random variable that follows the chi-square
distribution, suppose that y1, y2, …, yn is a random sample from
an N(μ, σ²) distribution. Then (SS = Sum of Squares)
$\frac{SS}{\sigma^2} = \frac{\sum_{i=1}^{n} (y_i - \bar{y})^2}{\sigma^2} \sim \chi^2_{n-1}$
That is, SS/σ² is distributed as chi-square with n-1 degrees of
freedom.
Since S² = SS/(n-1), the distribution of S² is
$S^2 \sim \frac{\sigma^2}{n-1}\,\chi^2_{n-1}$
Thus, the sampling distribution of the sample
variance is a constant times the chi-square
distribution if the population is normally distributed.
t Distribution
If z ~ N(0,1) and $\chi^2_k$ is a chi-square
variable with k degrees of freedom, independent of z, then the random
variable
$t_k = \frac{z}{\sqrt{\chi^2_k / k}}$
follows the t distribution with k
degrees of freedom.
If y1, y2, …, yn is a random sample from the N(μ, σ²)
distribution, then the quantity
$t = \frac{\bar{y} - \mu}{S / \sqrt{n}}$
is distributed as t with n-1 degrees of freedom.
Part IV: Experimental Design
Factorial Experiments
Factors are varied together, instead of one at a time.
A special kind of statistical experiment design.
2² Factorial Design (2 factors, each at 2 levels). For example:
Example for 2² factorial design: 8 rounds, with each
driver-ball combination replicated twice
Driver Effect?
$\text{Driver Effect} = \frac{92+94+93+91}{4} - \frac{88+91+88+90}{4} = 3.25$
That is, on average, switching from the oversized driver to
the regular-sized driver increases the score by 3.25 strokes
per round.
Ball Effect?
$\text{Ball Effect} = \frac{88+91+92+94}{4} - \frac{88+90+93+91}{4} = 0.75$
That is, on average, switching from the balata ball to the three-piece ball
increases the score by 0.75 strokes per round.
Driver-Ball Interaction Effect?
$\text{Driver-Ball Interaction Effect} = \frac{92+94+88+90}{4} - \frac{88+91+93+91}{4} = 0.25$
That is, the effect of the driver changes only slightly with the ball used
(and vice versa); the interaction amounts to 0.25 strokes per round.
Finally, one can conclude that
Driver effect > Ball effect > Interaction effect
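The three effects of the 2² example can be reproduced with a few lines of code. This is a sketch under stated assumptions: the original figure is missing, so the assignment of scores to cells is inferred from the groups of numbers in the effect formulas, and the levels are coded as -1/+1. Driver +1 is the regular-sized driver; ball +1 is simply the ball level whose scores appear first in the ball-effect formula. The names `scores` and `effect` are hypothetical.

```python
from statistics import fmean

# Two replicate scores per (driver, ball) cell, levels coded as -1 / +1.
# Cell assignment inferred from the effect formulas (the figure is missing).
scores = {
    (+1, +1): [92, 94],
    (-1, +1): [88, 91],
    (+1, -1): [93, 91],
    (-1, -1): [88, 90],
}


def effect(sign_of):
    """Mean score where the contrast sign is +1 minus mean where it is -1."""
    plus = [s for cell, obs in scores.items() if sign_of(cell) > 0 for s in obs]
    minus = [s for cell, obs in scores.items() if sign_of(cell) < 0 for s in obs]
    return fmean(plus) - fmean(minus)


driver_effect = effect(lambda cell: cell[0])
ball_effect = effect(lambda cell: cell[1])
# The interaction contrast is the product of the two coded levels.
interaction_effect = effect(lambda cell: cell[0] * cell[1])

print(f"driver effect      = {driver_effect}")
print(f"ball effect        = {ball_effect}")
print(f"interaction effect = {interaction_effect}")
```

Coding the levels as -1/+1 means the same `effect` helper extends to designs with more factors: each main effect uses one coordinate and each interaction uses the product of the coordinates involved.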
2³ Factorial Design (3 factors, each at 2 levels):
How to calculate ball-effect, driver effect, beverage
effect and interaction effects?
Comparative Experiments
Comparative experiments compare two experimental conditions. For
example, comparative experiments can be used to determine whether two
different formulations of a product give equivalent results.
Example: two apple trees under the same cultivation conditions
Apple tree 1 (AKL), apple weights (kg): 0.101, 0.111, 0.103, 0.102, 0.121, 0.102, 0.101
Apple tree 2 (ChCH), apple weights (kg): 0.102, 0.101, 0.105, 0.106, 0.111, 0.98, 0.110
Hypothesis: same apple weights?
Data Model for Comparative Experiments
The following model for each data set is considered:
yi = μ + εi
μ is the mean of the data set
εi is the noise, assumed to be distributed as NID(0, σ²)
Comparative Experiments Formulation:
In the general case, we have two data sets:
$y_{11}, y_{12}, \ldots, y_{1n_1}$
and
$y_{21}, y_{22}, \ldots, y_{2n_2}$
The statistical hypothesis is formulated by
H0: μ1 = μ2 (null hypothesis)
H1: μ1 ≠ μ2 (alternative hypothesis)
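Two Sample t-test
1. Assume that the variances of the two data sets are
equal: σ1² = σ2²
2. Form the following statistic
$t_0 = \frac{\bar{y}_1 - \bar{y}_2}{S_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$
where $\bar{y}_1$ is the data set 1 sample mean, $\bar{y}_2$ is the data set 2 sample mean, and
$S_p^2 = \frac{(n_1 - 1) S_1^2 + (n_2 - 1) S_2^2}{n_1 + n_2 - 2}$ is the estimate of the common variance.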
3. To determine whether to reject H0, we would compare t0 to the t
distribution with n1+n2-2 degrees of freedom.
4. We would reject H0 if
$|t_0| > t_{\alpha/2,\, n_1+n_2-2}$
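The four steps of the two-sample t-test can be sketched as follows. The data are hypothetical and chosen so that the arithmetic is easy to verify by hand; the function name `pooled_t_statistic` is also an assumption. The critical value 2.306 is the tabulated two-sided 5% point of the t distribution with 8 degrees of freedom, entered by hand in the same way a printed table would be used.

```python
import math
from statistics import fmean, variance


def pooled_t_statistic(sample1, sample2):
    """Return (t0, degrees of freedom) for the equal-variance two-sample t-test."""
    n1, n2 = len(sample1), len(sample2)
    # Pooled estimate of the common variance.
    sp2 = ((n1 - 1) * variance(sample1) + (n2 - 1) * variance(sample2)) / (n1 + n2 - 2)
    t0 = (fmean(sample1) - fmean(sample2)) / math.sqrt(sp2 * (1 / n1 + 1 / n2))
    return t0, n1 + n2 - 2


# Hypothetical data sets, not taken from the slides.
a = [1.0, 2.0, 3.0, 4.0, 5.0]
b = [3.0, 4.0, 5.0, 6.0, 7.0]

t0, dof = pooled_t_statistic(a, b)

T_CRIT = 2.306  # tabulated t value for alpha/2 = 0.025 and 8 degrees of freedom
reject_h0 = abs(t0) > T_CRIT

print(f"t0 = {t0:.3f} with {dof} degrees of freedom")
print("reject H0" if reject_h0 else "fail to reject H0")
```

For these numbers both sample variances are 2.5, so the pooled variance is 2.5 and t0 is -2.0, which does not exceed the critical value in absolute terms.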
Example: Are the bond strengths of the two cement
mortars similar at the significance level
of α = 0.05?
Using provided table:
t0 = -2.2
Since |t0| = 2.2 exceeds the tabulated critical value of 2.101 (α/2 = 0.025, 18 degrees of freedom),
we would reject H0 and conclude that
the strengths of the two mortars are different.
Repeat the previous problem with α = 0.01.
Exercise
Two machines are used for filling plastic bottles with a net volume of 16.0 ounces.
The filling process can be assumed to be normal. The quality engineering
department suspects that both machines fill to the same net volume. An experiment
is performed by taking a random sample from the output of each machine. Would
you reject or fail to reject the quality engineering department's hypothesis?
P Value
The smallest level of significance α that would lead to
the rejection of the null hypothesis H0, given the observed
test statistic t0 and the degrees of freedom v.
P Value Calculation for Previous Example
t0 = -2.2, v = 18: Min α = 0.0411
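The P value quoted above can be approximated without a table. The sketch below is one possible approach, not the method used in the slides: it integrates the density of the t distribution numerically with Simpson's rule, using only the Python standard library. The function names and the number of integration steps are hypothetical choices.

```python
import math


def t_pdf(x, dof):
    """Density of the t distribution with `dof` degrees of freedom."""
    c = math.gamma((dof + 1) / 2) / (math.sqrt(dof * math.pi) * math.gamma(dof / 2))
    return c * (1 + x * x / dof) ** (-(dof + 1) / 2)


def two_sided_p_value(t0, dof, steps=2000):
    """P(|T| > |t0|), using Simpson's rule on [0, |t0|] (steps must be even)."""
    b = abs(t0)
    h = b / steps
    total = t_pdf(0.0, dof) + t_pdf(b, dof)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, dof)
    # The density is symmetric, so the central area is twice the area on [0, |t0|].
    central_area = 2 * (h / 3) * total
    return 1 - central_area


p_value = two_sided_p_value(-2.2, 18)
print(f"two-sided P value for t0 = -2.2, v = 18: {p_value:.4f}")
```

A P value below the chosen significance level leads to rejection of H0, which agrees with the table-based decision at the 0.05 level.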
Confidence Interval
It is often preferable to provide an interval within
which the value of the parameter in question would be expected to
lie. These interval statements are called confidence
intervals.
For two data sets, this interval estimates the difference between the
two means and the accuracy of this estimate.
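Confidence Interval Calculation
The 100(1-α) percent confidence interval is
L ≤ μ1-μ2 ≤ U
$L = \bar{y}_1 - \bar{y}_2 - t_{\alpha/2,\, n_1+n_2-2}\, S_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$
$U = \bar{y}_1 - \bar{y}_2 + t_{\alpha/2,\, n_1+n_2-2}\, S_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$
where $S_p^2$ is the pooled estimate of the common variance.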
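A minimal sketch of this calculation is given below. The two data sets are hypothetical, the function name `pooled_confidence_interval` is an assumption, and the critical value 2.306 is again the tabulated value for α/2 = 0.025 with 8 degrees of freedom.

```python
import math
from statistics import fmean, variance


def pooled_confidence_interval(sample1, sample2, t_crit):
    """Return (L, U) for the difference of means, assuming equal variances."""
    n1, n2 = len(sample1), len(sample2)
    sp2 = ((n1 - 1) * variance(sample1) + (n2 - 1) * variance(sample2)) / (n1 + n2 - 2)
    diff = fmean(sample1) - fmean(sample2)
    half_width = t_crit * math.sqrt(sp2 * (1 / n1 + 1 / n2))
    return diff - half_width, diff + half_width


# Hypothetical data sets, not taken from the slides.
a = [1.0, 2.0, 3.0, 4.0, 5.0]
b = [3.0, 4.0, 5.0, 6.0, 7.0]

low, high = pooled_confidence_interval(a, b, t_crit=2.306)
contains_zero = low <= 0.0 <= high

print(f"95% confidence interval: [{low:.3f}, {high:.3f}]")
print("0 is inside the interval" if contains_zero else "0 is outside the interval")
```

For these numbers the interval contains 0, so H0 would not be rejected at the 5% level.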
Confidence Interval
L ≤ μ1-μ2 ≤ U
μ1-μ2 = 0.5(L+U) ± 0.5(U-L)
It means the estimated mean difference is 0.5(L+U) and the
accuracy of this estimate is 0.5(U-L).
If 0 is not in the interval, H0 would be rejected.
For the previous example:
The 95% confidence interval is
L = -0.55
U = -0.01
-0.55 ≤ μ1-μ2 ≤ -0.01 (μ1-μ2 = 0 is not in the interval)
μ1-μ2 = -0.28 ± 0.27
It means the difference between the two mortars' strengths is
-0.28 with the accuracy of 0.27.
For the previous example, estimate the difference between the
two mortars' strengths and the accuracy of your estimate by
calculating the confidence interval of the t-test.
Answer: the difference between the two mortars' strengths is -0.28 with
the accuracy of 0.27.
Some experiments involve comparing only one
population mean to a specified value, say μ0:
H0: μ1 = μ0
H1: μ1 ≠ μ0
When the population variance is known, this problem is a simplified
version of the two-sample t-test problem, called the one-sample Z-test.
One-Sample Z-test
1. Assume that the variance of the population, σ², is known
2. Form the following statistic
$Z_0 = \frac{\bar{y} - \mu_0}{\sigma / \sqrt{n}}$
3. If H0 is true, then the distribution of Z0 is N(0, 1).
Therefore, we would reject H0 if $|Z_0| > Z_{\alpha/2}$
4. $Z_{\alpha/2}$ should be obtained from a table.
$Z_{\alpha/2}$ for α = 0.05: 1 - α/2 = 0.975, so Z = 1.96
In the population, the average IQ is 100 with a
standard deviation of 15. A team of scientists wants to
test a new medication to see if it has either a positive
or negative effect on intelligence, or no effect at all. A
sample of 30 participants who have taken the
medication has a mean of 140. Did the medication
affect intelligence, using alpha = 0.05?
If Z is less than -1.96, or greater than 1.96, reject the null hypothesis.
Z0 = (140 - 100) / (15 / √30) = 14.61
Result: Reject the null hypothesis.
Conclusion: Medication significantly affected intelligence, z = 14.61, p < 0.05.
The 100(1-α) percent confidence interval of the one-sample Z-test is
$\bar{y} - Z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{y} + Z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$
Find the confidence interval for the previous example.
L = 140 - 1.96 × 15 / √30 = 134.63
U = 140 + 1.96 × 15 / √30 = 145.37
140 ± 5.37
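The IQ example can be reproduced with the Python standard library, which provides the standard normal distribution through `statistics.NormalDist`. The function name `one_sample_z` is a hypothetical helper; the inputs (sample mean 140, hypothesised mean 100, standard deviation 15, sample size 30) come from the example.

```python
import math
from statistics import NormalDist


def one_sample_z(sample_mean, mu0, sigma, n, alpha=0.05):
    """Two-sided one-sample Z-test with known sigma, plus confidence interval."""
    std_normal = NormalDist()
    se = sigma / math.sqrt(n)                  # standard error of the mean
    z0 = (sample_mean - mu0) / se
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    p_value = 2 * (1 - std_normal.cdf(abs(z0)))
    interval = (sample_mean - z_crit * se, sample_mean + z_crit * se)
    return z0, z_crit, p_value, interval


z0, z_crit, p_value, (low, high) = one_sample_z(140, 100, 15, 30)

print(f"Z0 = {z0:.2f}, critical value = {z_crit:.2f}")
print("reject H0" if abs(z0) > z_crit else "fail to reject H0")
print(f"95% confidence interval: [{low:.2f}, {high:.2f}]")
```

The critical value is computed rather than read from a table, so small differences in the last decimal place compared with the hand calculation using 1.96 are expected.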
Violation of Assumptions in t-test
Two main assumptions are:
1. Normal distribution: In practice, the assumption of
normal distribution can be violated to some extent
without affecting the effectiveness of the t-test.
2. Equal variance: If this assumption is violated,
other test techniques should be used.