Upload
razif-shahril
View
3.024
Download
7
Embed Size (px)
Citation preview
KNOWLEDGE FOR THE BENEFIT OF HUMANITYKNOWLEDGE FOR THE BENEFIT OF HUMANITY
BIOSTATISTICS (HFS3283)
INTRODUCTION TO BIOSTATISTICS
Dr.Dr. MohdMohd RazifRazif ShahrilShahril
School of Nutrition & Dietetics School of Nutrition & Dietetics
Faculty of Health SciencesFaculty of Health Sciences
UniversitiUniversiti Sultan Sultan ZainalZainal AbidinAbidin
1
S C H O O L O F N U T R I T I O N A N D D I E T E T I C S • U N I V E R S I T I S U L T A N Z A I N A L A B I D I N
Synopsis “This course exposes students to different types of data,
measures of central tendency and dispersion, probability,
normal distribution and inferential analysis in biostatistics. It
also covers various basic statistical analyses including
descriptive, parametric and nonparametric tests,
association and prediction models commonly used in
research. This course provides hands-on experience for
students to perform statistical analysis using SPSS and
interpret their outcomes in answering specific research
question”
2
S C H O O L O F N U T R I T I O N A N D D I E T E T I C S • U N I V E R S I T I S U L T A N Z A I N A L A B I D I N
Learning Outcomes At the end of this course, students should be able to;
• Describe concepts of descriptive, inferential, parametric
and non-parametric tests in biostatistics.
• Describe concepts of categorical data analysis,
association, prediction, reliability and validity in
biostatistics.
• Choose statistical analysis of data based on types of
variables and objective of analysis using SPSS and
interpret their outcomes.
3
S C H O O L O F N U T R I T I O N A N D D I E T E T I C S • U N I V E R S I T I S U L T A N Z A I N A L A B I D I N
Teaching & Learning Strategy Blended Mode Learning (Face to Face + e-Learning)
• Interactive Lecture – offline + online
• Hands-on Practical
• Project
• Self-Directed Learning (Literature, Video)
4
S C H O O L O F N U T R I T I O N A N D D I E T E T I C S • U N I V E R S I T I S U L T A N Z A I N A L A B I D I N
Brief Course Outline 1. Introduction to biostatistics
2. Inferential statistics
3. Independent t test
4. ANOVA & Paired t-test
5. Nonparametric analysis
6. Categorical data analysis: Chi square & Fisher exact test
7. Categorical data analysis: OR and RR
8. Correlation
9. Regression
10. Validity analysis
11. Reliability analysis
5
S C H O O L O F N U T R I T I O N A N D D I E T E T I C S • U N I V E R S I T I S U L T A N Z A I N A L A B I D I N
Course Schedule
6
WEEK / DATE ACTIVITIES
1 MONDAY
(25/01/2016)
Lecture (8.00am - 10.00am) – Dr. Mohd Razif Shahril
INTRODUCTION TO BIOSTATISTICS
Practical (10.00am - 12.00pm) –Dr. Mohd Razif Shahril
INTRODUCTION TO SPSS
2 MONDAY
(01/02/2016)
Lecture (8.00am - 10.00am) –Dr. Mohd Razif Shahril
INFERENTIAL STATISTICS
Practical (10.00am - 12.00pm) –Dr. Mohd Razif Shahril
SPSS: DESCRIPTIVE ANALYSIS & NORMALITY
3 MONDAY
(08/02/2016)
CHINESE NEW YEAR (PUBLIC HOLIDAY)
4 MONDAY
(15/02/2016)
Lecture (8.00am - 10.00am) –Dr. Sharifah Wajihah Wafa
INDEPENDENT T-TEST
Practical (10.00am - 12.00pm) –Dr. Sharifah Wajihah Wafa
SPSS: INDEPENDENT T-TEST
5 MONDAY
(22/02/2016)
Lecture (8.00am - 10.00am) –Dr. Sharifah Wajihah Wafa
ANOVA & PAIRED T-TEST
Practical (10.00am - 12.00pm) –Dr. Sharifah Wajihah Wafa
SPSS: ANOVA & PAIRED T-TEST
S C H O O L O F N U T R I T I O N A N D D I E T E T I C S • U N I V E R S I T I S U L T A N Z A I N A L A B I D I N
Course Schedule (cont.)
7
WEEK / DATE ACTIVITIES
6 MONDAY
(29/02/2016)
Lecture (8.00am - 10.00am) –Dr. Mohd Razif Shahril
NONPARAMETRIC ANALYSIS
Practical (10.00am - 12.00pm) –Dr. Mohd Razif Shahril
SPSS: NONPARAMETRIC ANALYSIS
7 MONDAY
(07/03/2016)
Lecture (8.00am - 10.00am) –Dr. Mohd Razif Shahril
CATEGORICAL DATA ANALYSIS: CHI SQUARE & FISHER
EXACT TEST
Practical (10.00am - 12.00pm) –Dr. Mohd Razif Shahril
SPSS: CHI SQUARE & FISHER EXACT TEST
THURSDAY
(10/03/2016)
TEST (8.30am - 10.00am) –Dr. Mohd Razif Shahril
8 MONDAY
(14/03/2016)
Lecture (8.00am - 10.00am) –Dr. Mohd Razif Shahril
CATEGORICAL DATA ANALYSIS: OR AND RR
Practical (10.00am - 12.00pm) –Dr. Mohd Razif Shahril
SPSS: OR AND RR
MID SEMESTER BREAK (20/03/2016 – 26/03/2016)
S C H O O L O F N U T R I T I O N A N D D I E T E T I C S • U N I V E R S I T I S U L T A N Z A I N A L A B I D I N
Course Schedule (cont.)
8
WEEK / DATE ACTIVITIES
9 MONDAY
(28/03/2016)
Lecture (8.00am - 10.00am) –Dr. Mohd Razif Shahril
CORRELATION
Practical (10.00am - 12.00pm) –Dr. Mohd Razif Shahril
SPSS: CORRELATION
10 MONDAY
(04/04/2016)
Lecture (8.00am - 10.00am) –Dr. Mohd Razif Shahril
REGRESSION
Practical (10.00am - 12.00pm) –Dr. Mohd Razif Shahril
SPSS: REGRESSION
11 MONDAY
(11/04/2016)
Lecture (8.00am - 10.00am) –Dr. Mohd Razif Shahril
RELIABILITY
Practical (10.00am - 12.00pm) –Dr. Mohd Razif Shahril
SPSS: INTERNAL CONSISTENCY & INTRACLASS
CORRELATION
12 MONDAY
(18/04/2016)
Lecture (8.00am - 10.00am) –Dr. Mohd Razif Shahril
VALIDITY
Practical (10.00am - 12.00pm) –Dr. Mohd Razif Shahril
SPSS: FACTOR ANALYSIS
S C H O O L O F N U T R I T I O N A N D D I E T E T I C S • U N I V E R S I T I S U L T A N Z A I N A L A B I D I N
Course Schedule (cont.)
9
WEEK / DATE ACTIVITIES
13 MONDAY
(25/04/2016)
Project (8.00am - 11.00am) –Dr. Mohd Razif Shahril
REPORT PREPARATION
14 MONDAY
(02/05/2016)
Project (8.00am - 11.00am) –Dr. Mohd Razif Shahril
REPORT SUBMISSION
STUDY WEEK (06/05/2016 – 09/05/2016)
END OF SEMESTER EXAMINATION (10/05/2016 – 26/05/2016)
Location Makmal Komputer Khadijah
(Khadijah Computer Lab)
S C H O O L O F N U T R I T I O N A N D D I E T E T I C S • U N I V E R S I T I S U L T A N Z A I N A L A B I D I N
Assessment • Continuous Assessment
– Test (30%)
– Project Report (1500 words) (30%)
• Summative Assessment
– End of Semester Examination (40%)
10
S C H O O L O F N U T R I T I O N A N D D I E T E T I C S • U N I V E R S I T I S U L T A N Z A I N A L A B I D I N
Main References
11
S C H O O L O F N U T R I T I O N A N D D I E T E T I C S • U N I V E R S I T I S U L T A N Z A I N A L A B I D I N
12
ATTENDANCE
LESS THAN 80% BARRED FROM EXAMINATION
[NO MC FROM PRIVATE CLINICS!]
S C H O O L O F N U T R I T I O N A N D D I E T E T I C S • U N I V E R S I T I S U L T A N Z A I N A L A B I D I N
Topic Learning Outcomes At the end of this lecture, students should be able to;
• define data and types of data.
• define descriptive statistics, variables and scales.
• explain types of central tendency and dispersion
measurements.
13
S C H O O L O F N U T R I T I O N A N D D I E T E T I C S • U N I V E R S I T I S U L T A N Z A I N A L A B I D I N
What is DATA?
14
• Definition: A collection of items of information
• Types of data
– Qualitative
– Quantitative
S C H O O L O F N U T R I T I O N A N D D I E T E T I C S • U N I V E R S I T I S U L T A N Z A I N A L A B I D I N
What is DATA? (cont)
15
• Qualitative data
– Observation or information characterized by
measurement on a categorical scale (dichotomous,
nominal or ordinal scale).
– Data that describe a quality of the subject studied.
• E.g. gender, ethnic, death or survival, nationality etc.
– Generally describes in terms of percentages or
proportions.
– Mostly displayed by using contingency table, pie
chart, bar charts.
S C H O O L O F N U T R I T I O N A N D D I E T E T I C S • U N I V E R S I T I S U L T A N Z A I N A L A B I D I N
What is DATA? (cont)
16
• Quantitative data
– Data in numerical quantities such as continuous
measurements or counts.
– Observation for which the differences between
numbers have meaning on a numerical scale.
– They measure the quantity of something
– Types of numerical scales;
• Continuous scale (e.g. Age, height)
• Discrete scale (e.g. Number of pregnancy)
– Described in terms of means and standard deviation.
– Frequency tables and histograms are most often used
to display this type of information.
S C H O O L O F N U T R I T I O N A N D D I E T E T I C S • U N I V E R S I T I S U L T A N Z A I N A L A B I D I N
How to analyse DATA? (cont)
17
• Using STATISTICS!
– A small representative ‘sample’ is used to study a big
‘population’
• Why?
– Expensive to conduct very large study
– Impossible to collect information from everyone in the
population
• Types of statistics
– Descriptive statistics
– Inferential statistics
S C H O O L O F N U T R I T I O N A N D D I E T E T I C S • U N I V E R S I T I S U L T A N Z A I N A L A B I D I N
How to analyse DATA? (cont)
18
Population: A complete collection of data on the group under study Sample: A collection of sampling units selected from the population Sampling unit: A member of the population
Sampling unitSampling unit n=1n=1
SampleSample n= 200n= 200
PopulationPopulation N=20,000N=20,000
S C H O O L O F N U T R I T I O N A N D D I E T E T I C S • U N I V E R S I T I S U L T A N Z A I N A L A B I D I N
Descriptive statistics
19
• Describe the frequency and distribution to
characterize data collected from a group of
sample to represent the population.
• E.g.
– Percentage of patients attending diabetes clinic
– Gender, age group, education level of the patients
– Patients waiting time for doctors consultation
– Patients fasting glucose and HbA1c level
– Etc.
S C H O O L O F N U T R I T I O N A N D D I E T E T I C S • U N I V E R S I T I S U L T A N Z A I N A L A B I D I N
Variable
20
• A variable is a characteristics that can take on
different values for different members of the
group under study
– E.g. a group of university students will be found to
differ in gender, height, attitudes, intelligence and may
ways. These characteristics are called variables.
• Categories of variable
– Continuous vs. Discrete
– Independent vs. Dependent
S C H O O L O F N U T R I T I O N A N D D I E T E T I C S • U N I V E R S I T I S U L T A N Z A I N A L A B I D I N
Variable (cont.)
21
• Continuous variable – can take on any values on the measurement scale under
study
– Do not fit into a finite number or categories
– Referred to as measurement data
– E.g. weight, height, age, blood pressure etc.
• Discrete variable – only designated values or integer values i.e. 1, 2, 3…
– Fit into limited categories
– Referred as count data (dichotomous/ multichotomous) • E.g. dichotomous
– Male-Female
– Yes-No
• E.g. multichotomous – Malay-Chinese-Indian
– Man Utd-Arsenal-Chelsea-Man City-Liverpool
S C H O O L O F N U T R I T I O N A N D D I E T E T I C S • U N I V E R S I T I S U L T A N Z A I N A L A B I D I N
Variable (cont.)
22
• Independent variable (IV)
– Manipulated in accordance with the purpose of the
investigation
– Set by researcher
• Dependent variable (DV)
– Consequence of the independent variable
– Effected by independent variable
– Outcome
S C H O O L O F N U T R I T I O N A N D D I E T E T I C S • U N I V E R S I T I S U L T A N Z A I N A L A B I D I N
Scales
23
• Type of scales
– Nominal – classify observation into categories that
cannot be numerically arranged (no order)
– Ordinal – assign order to categories so that one
category is higher than another
– Interval / ratio –sequential ranking of values (as
ordinal scales)
S C H O O L O F N U T R I T I O N A N D D I E T E T I C S • U N I V E R S I T I S U L T A N Z A I N A L A B I D I N
Measurement of central tendency
24
Mean
Mode
Median
S C H O O L O F N U T R I T I O N A N D D I E T E T I C S • U N I V E R S I T I S U L T A N Z A I N A L A B I D I N
Mode
25
• Is the most frequent occurring value in a set of
discrete data
• Can be more than one mode if two or more
values are equally common
• E.g.
– 1,3,4,7,2,5,9,4,6,7,8,9,3,4,9,6,4,5,2,1,6,6,7,4,3
– 1,1, 2,2,3,3,3,4,4,4,4,4,5,5,6,6,6,6,7,7,7,8,9,9,9
Mode=4
S C H O O L O F N U T R I T I O N A N D D I E T E T I C S • U N I V E R S I T I S U L T A N Z A I N A L A B I D I N
Median
26
• The value halfway through the ordered data set
• Generally a good descriptive measure of the
location which works well for skewed data or
data with outliers
• E.g. (n=25)
– 3,4,7,2,5,1, 9,4,6,7,8,9,3,4,9,6,4,5,2,1,6,6,7,4,3
– 1,1, 2,2,3,3,3,4,4,4,4,4,5,5,6,6,6,6,7,7,7,8,9,9,9
Median=5
Ordered data
S C H O O L O F N U T R I T I O N A N D D I E T E T I C S • U N I V E R S I T I S U L T A N Z A I N A L A B I D I N
Mean
27
• The sample mean is an estimator available for
estimating the population mean.
• Its value depends equally on all of the data
which may include outliers.
• E.g. (N=10) (3, 4, 7, 2, 5, 7, 5, 5, 1, 2)
= 3+3+4+7+2+5+7+5+5+1+2
10
= 4.1
S C H O O L O F N U T R I T I O N A N D D I E T E T I C S • U N I V E R S I T I S U L T A N Z A I N A L A B I D I N
Measurement of dispersion
28
• Used to describe the variability (spread and
dispersion) in a given sample
• Dispersion measurement;
– Range
– Percentiles
– Variance
– Standard deviation
– Standard error
– Interquartile range
S C H O O L O F N U T R I T I O N A N D D I E T E T I C S • U N I V E R S I T I S U L T A N Z A I N A L A B I D I N
Measurement of dispersion (cont.)
29
• Range – Difference between highest and lowest value
• Percentiles – Indicate the % of individuals who have equal to/below a given
value.
• Variance – Provides information about how individuals differ within sample.
• Standard deviation (SD) – Gives information about the spread/variability of scores around
the mean.
• Standard error (SE) – Indicates about the certainty of the mean itself.
• Interquartile Range (IQR) – the distance between 1st and 3rd quartile
S C H O O L O F N U T R I T I O N A N D D I E T E T I C S • U N I V E R S I T I S U L T A N Z A I N A L A B I D I N
Range
30
• Is the difference between the smallest and
largest value in a set of observation
• Range = (the largest value – the smallest value)
– E.g. 3,5,6,7,9,10
– Range = 10 – 3 = 7
• Uses only extreme values and ignores the other
values in the data set.
S C H O O L O F N U T R I T I O N A N D D I E T E T I C S • U N I V E R S I T I S U L T A N Z A I N A L A B I D I N
Variance
31
• Measure spread or dispersion within a set of
sample data.
• E.g. for N observation X1, X2, X3,.... Xn with
sample mean:
• Therefore, the sample variance is
S C H O O L O F N U T R I T I O N A N D D I E T E T I C S • U N I V E R S I T I S U L T A N Z A I N A L A B I D I N
Standard deviation (SD)
32
• Measure of spread or dispersion of a set of data
• Calculated by taking square root of the variance
• The more widely the values are spread out, the
larger the standard deviation
S C H O O L O F N U T R I T I O N A N D D I E T E T I C S • U N I V E R S I T I S U L T A N Z A I N A L A B I D I N
Standard Error (of the Mean)
33
• The SEM quantifies the precision of the mean.
• A small SEM indicates that the sample mean is
likely to be quite close to the true population
mean.
• A large SEM indicates that the sample mean is
likely to be far from the true population mean
• A small SEM can be due to a large sample size
rather than due to tight data.
S C H O O L O F N U T R I T I O N A N D D I E T E T I C S • U N I V E R S I T I S U L T A N Z A I N A L A B I D I N
Interquartile Range (IQR)
34
• IQR is the distance between 1st and 3rd quartile.
• It is not sensitive to extreme values (outliers).
• Thus, it is usually described together with the
median in skewed distribution of observation.
• Formula: IQR = (Q3 – Q1)
S C H O O L O F N U T R I T I O N A N D D I E T E T I C S • U N I V E R S I T I S U L T A N Z A I N A L A B I D I N
Histogram
35
• Display the frequency distributions of one
variable
• Very similar to bar chart that are used for
categorical data
• Consists of a set of columns with no space
between each of them
– On horizontal axis, the variables of consideration
– On vertical axis, the scale of frequency of occurrence
– The area under the each column represents the
frequency of each class and thus the total under all
columns equals the total frequency
S C H O O L O F N U T R I T I O N A N D D I E T E T I C S • U N I V E R S I T I S U L T A N Z A I N A L A B I D I N
Histogram (cont.)
36
S C H O O L O F N U T R I T I O N A N D D I E T E T I C S • U N I V E R S I T I S U L T A N Z A I N A L A B I D I N
Histogram (cont.)
37
• Unimodal - bell shaped curve • Symmetric about its mean –
Mirror image • Mean = Median = Mode
S C H O O L O F N U T R I T I O N A N D D I E T E T I C S • U N I V E R S I T I S U L T A N Z A I N A L A B I D I N
Histogram (cont.)
38
S C H O O L O F N U T R I T I O N A N D D I E T E T I C S • U N I V E R S I T I S U L T A N Z A I N A L A B I D I N
Histogram (cont.)
39
S C H O O L O F N U T R I T I O N A N D D I E T E T I C S • U N I V E R S I T I S U L T A N Z A I N A L A B I D I N
Boxplot (box and whisker plot)
40
• It is a graphical display using percentile in an
ordered data
• The plot provides information about central
tendency and the variability of the distribution
• It allows detection of outliers and symmetry of
data set
– Outliers : if distance is away > (1.5 times IQR) above
and below of Median, it is shown as circle
– Extreme outliers : if distance is away > (3 times IQR)
above and below of Median, it is shown as asterisks
S C H O O L O F N U T R I T I O N A N D D I E T E T I C S • U N I V E R S I T I S U L T A N Z A I N A L A B I D I N
Boxplot (cont.)
41
S C H O O L O F N U T R I T I O N A N D D I E T E T I C S • U N I V E R S I T I S U L T A N Z A I N A L A B I D I N
Note
42
• In descriptive presentation use Mean (SD) or
Median (IQR)
– Normal distribution data, use Mean (SD)
– Skewed distribution data, use Median (IQR)
• In inferential results use Mean (SE)
Thank YouThank You
43