barry-gilmore
The exam lasts 2 hours and carries 40 marks.
The exam has two parts (Part I & Part II).
Part I has 20 questions. Answer any 15 questions. Each question carries 2 marks. Total 30 marks.
Part II has 15 questions. Answer any 10 questions. Each question carries 1 mark. Total 10 marks.
No MCQs. You should write the answers.
No major calculations. No need to memorize the formulas.
Bring your own calculator. Cell phones may not be used as calculators.
Study Designs
Levels of measurement (Type of data)
Sampling Distribution of Means and Proportions
Normal Distribution
Hypothesis testing & Z-test
Student’s t-test
Chi-square test, McNemar’s Chi-square test
Confidence Intervals
How to write a research paper?
----- (40 marks)
Nominal – qualitative classification of equal value: gender, race, color, city
Ordinal - qualitative classification which can be rank ordered: socioeconomic status of families
Interval - Numerical or quantitative data: can be rank ordered and differences in size compared: temperature
Ratio - Quantitative interval data with a true zero, so that ratios are meaningful: time, age.
The standard deviation (s) describes variability between individuals in a sample.
The standard error describes variation of a sample statistic.
The standard deviation describes how individuals differ.
The standard error of the mean describes the precision with which we can make inference about the true mean.
Standard error of the mean (sem):
$\text{sem} = s_{\bar{x}} = \dfrac{s}{\sqrt{n}}$
Comments:
n = sample size
even for large s, if n is large, we can get good precision for sem
always smaller than standard deviation (s)
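The relationship between s and sem can be checked numerically. The following is a minimal sketch only: the function name `standard_error_of_mean` and the hemoglobin readings are invented for illustration and do not come from the course material.

```python
import math
import statistics

def standard_error_of_mean(sample):
    """Return (s, sem): s is the sample standard deviation, sem = s / sqrt(n)."""
    n = len(sample)
    s = statistics.stdev(sample)  # sample SD, uses the n - 1 denominator
    return s, s / math.sqrt(n)

# Hypothetical hemoglobin readings (g/dL), invented for illustration only
readings = [13.1, 14.2, 12.8, 15.0, 13.9, 14.4, 13.5, 14.1]
s, sem = standard_error_of_mean(readings)
print(f"s = {s:.3f}, sem = {sem:.3f}")
```

Because sem divides s by the square root of n, collecting more readings shrinks sem even when s stays about the same.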
Two Steps in the Statistical Inference Process
1. Calculate “confidence intervals” from the sample mean and sample standard deviation, within which we can place the unknown population mean with some degree of probabilistic confidence.
2. Compute a “test of statistical significance” (risk statement), which assesses how probable a result at least as extreme as the one observed would be if the null hypothesis were true.
Many biologic variables follow this pattern: hemoglobin, cholesterol, serum electrolytes, blood pressure, age, weight, height.
One can use this information to define what is normal and what is extreme.
In clinical medicine, 95% or 2 standard deviations around the mean is normal.
Clinically, 5% of “normal” individuals are labeled as extreme/abnormal.
We just accept this and move on.
Symmetrical about the mean
Mean, median, and mode are equal
Total area under the curve above the x-axis is one square unit
1 standard deviation on both sides of the mean includes approximately 68% of the total area
2 standard deviations includes approximately 95%
3 standard deviations includes approximately 99.7%
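These areas can be reproduced from the normal curve itself. The sketch below uses the standard identity that the area within k standard deviations of the mean equals erf(k / √2); the function name `area_within` is an illustrative choice.

```python
import math

def area_within(k):
    """Area under the normal curve within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} SD: {area_within(k):.4f}")
```

The three areas come out at roughly 68%, 95% and 99.7% of the total.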
The normal distribution is completely determined by the parameters μ and σ.
Different values of μ shift the distribution along the x-axis.
Different values of σ determine the degree of flatness or peakedness of the graph.
The Z score makes it possible, under some circumstances, to compare scores that originally had different units of measurement.
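As a small illustration, a Z score is computed as z = (x - mean) / sd, which expresses a value in standard-deviation units. The reference means and standard deviations below are assumed values chosen for round numbers, not clinical reference ranges.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd

# Hypothetical reference values, invented for illustration only
z_chol = z_score(240, mean=200, sd=20)    # cholesterol in mg/dL
z_hb = z_score(11.0, mean=14.0, sd=1.5)   # hemoglobin in g/dL
print(z_chol, z_hb)
```

Although the two measurements have different units, both end up on the same scale: one value is two standard deviations above its mean and the other two standard deviations below.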
Hypothesis
simple, specific, stated in advance
Null hypothesis
‘The mean sodium concentrations in the two populations are equal.’
Alternative hypothesis
Logical alternative to the null hypothesis
‘The mean sodium concentrations in the two populations are different.’
One-tail test
Ho: μ = μo
Ha: μ > μo or μ < μo
Alternative Hypothesis: Mean systolic BP of Nephrology patients is significantly higher (or lower) than the mean systolic BP of normal patients.
[Figure: distribution curve, x-axis from 100 to 140, with an area of 0.05 in one tail]
Two-tail test
Ho: μ = μo
Ha: μ ≠ μo
Alternative Hypothesis: Mean systolic BP of Nephrology patients is significantly different from the mean systolic BP of normal patients.
[Figure: distribution curve, x-axis from 100 to 140, with an area of 0.025 in each tail]
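The tail areas for one-tail and two-tail tests translate into different cut-off values of Z. The sketch below is illustrative only: the helper name `critical_z` is an assumption, and it relies on `statistics.NormalDist` from the Python standard library (Python 3.8 or later).

```python
from statistics import NormalDist

def critical_z(alpha, two_tailed):
    """Critical value of Z that leaves a total tail area of alpha."""
    # A two-tail test splits alpha equally between the two tails
    tail_area = alpha / 2 if two_tailed else alpha
    return NormalDist().inv_cdf(1 - tail_area)

print(f"one-tail, alpha = 0.05: {critical_z(0.05, two_tailed=False):.3f}")
print(f"two-tail, alpha = 0.05: {critical_z(0.05, two_tailed=True):.3f}")
```

For the same alpha, the two-tail cut-off is further from the mean than the one-tail cut-off, because only half of alpha sits in each tail.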
Every decision-making process can commit two types of error.
We may conclude that the difference is significant when in fact there is no real difference in the population, and so reject the null hypothesis when it is true. This error is known as type-I error, whose magnitude is denoted by the Greek letter ‘α’.
On the other hand, we may conclude that the difference is not significant when in fact there is a real difference between the populations; that is, the null hypothesis is not rejected when it is actually false. This error is called type-II error, whose magnitude is denoted by ‘β’.
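The meaning of α can be illustrated with a small simulation: when the null hypothesis is true, a test carried out at the 5% level should reject it in about 5% of repeated experiments. This is a sketch under stated assumptions (standard normal data, known standard deviation of 1, a two-tail Z-test with cut-off 1.96); the function name and its parameters are hypothetical.

```python
import math
import random

def type_one_error_rate(trials=2000, n=30, z_crit=1.96, seed=0):
    """Fraction of trials that reject a TRUE null hypothesis (mean = 0, sd = 1)."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(0, 1) for _ in range(n)]
        # z = (sample mean - 0) / (sigma / sqrt(n)), with sigma known to be 1
        z = (sum(sample) / n) / (1 / math.sqrt(n))
        if abs(z) > z_crit:
            rejections += 1
    return rejections / trials

rate = type_one_error_rate()
print(f"observed type-I error rate: {rate:.3f}")
```

The observed rate should land close to 0.05, since every rejection in this simulation is by construction a type-I error.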
                 Disease (Gold Standard)
Test Result      Present                Absent                 Total
Positive         a (Correct)            b (False Positive)     a+b
Negative         c (False Negative)     d (Correct)            c+d
Total            a+c                    b+d                    a+b+c+d
This level of uncertainty is called the type 1 error or false-positive rate (α).
In practice it is assessed through the p-value.
In general, p ≤ 0.05 is the agreed-upon level.
In other words, the probability that the difference we observed in our sample occurred by chance is less than 5%.
Therefore we can reject the Ho.
Stating the Conclusions of our Results
When the p-value is small, we reject the null hypothesis or, equivalently, we accept the alternative hypothesis. “Small” is defined as a p-value ≤ α, where α = acceptable false (+) rate (usually 0.05).
When the p-value is not small, we conclude that we cannot reject the null hypothesis or, equivalently, there is not enough evidence to reject the null hypothesis. “Not small” is defined as a p-value > α, where α = acceptable false (+) rate (usually 0.05).
P-value
A standard device for reporting quantitative results in research where variability plays a large role.
Measures the dissimilarity between two or more sets of measurements or between one set of measurements and a standard.
“The probability of obtaining the study results by chance if the null hypothesis is true”
“The probability of obtaining a value as extreme as, or more extreme than, the observed value (study results)”
“The p-value is actually a probability, normally the probability of getting a result as extreme as or more extreme than the one observed if the dissimilarity is entirely due to variability of measurements or patients’ response, or to sum up, due to chance alone.”
Small p value - the rare event has occurred
Large p value - likely event
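For a Z-test, the p-value is the tail area beyond the observed statistic. The sketch below shows one way to compute it and to apply the "small versus not small" rule. The function names and the two example statistics are hypothetical, and the one-tail branch assumes the observed direction matches the alternative hypothesis.

```python
from statistics import NormalDist

def p_value_from_z(z, two_tailed=True):
    """Probability of a Z at least as extreme as the observed one, if Ho is true."""
    upper_tail = 1 - NormalDist().cdf(abs(z))
    return 2 * upper_tail if two_tailed else upper_tail

def decide(p, alpha=0.05):
    """Apply the decision rule: a p-value <= alpha counts as small."""
    return "reject Ho" if p <= alpha else "cannot reject Ho"

for z in (1.2, 2.3):  # hypothetical test statistics
    p = p_value_from_z(z)
    print(f"z = {z}: p = {p:.4f} -> {decide(p)}")
```

A statistic near the center of the curve gives a large p-value (a likely event), while one far out in the tail gives a small p-value (a rare event).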
[Figure: standard normal (Z) curve. Area = .025 in each tail beyond Z = -1.96 and Z = 1.96; Area = .005 in each tail beyond Z = -2.575 and Z = 2.575]
1. Test for single mean: Whether the sample mean is equal to the predefined population mean?
2. Test for difference in means: Whether the CD4 level of patients taking treatment A is equal to the CD4 level of patients taking treatment B?
3. Test for paired observations: Whether the treatment conferred any significant benefit?
t is a measure of: How difficult is it to believe the null hypothesis?
High t: Difficult to believe the null hypothesis - accept that there is a real difference.
Low t: Easy to believe the null hypothesis - have not proved any difference.
Student’s t-test is used when the sample size is small, for mean values, in the following situations:
(1) to compare a single sample mean with the population mean
(2) to compare the sample means of two independent samples
(3) to compare the sample means of paired samples
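The three situations can be sketched as three small functions that return the t statistic and its degrees of freedom. This is an illustrative sketch, not the course's own worked method: the function names are assumptions, the independent-samples version assumes equal variances (pooled variance), and looking up the p-value in a t table is left out.

```python
import math
import statistics

def one_sample_t(sample, mu0):
    """t for comparing a single sample mean with the population mean mu0."""
    n = len(sample)
    sem = statistics.stdev(sample) / math.sqrt(n)
    return (statistics.mean(sample) - mu0) / sem, n - 1

def independent_t(a, b):
    """t for two independent samples, using the pooled variance."""
    n1, n2 = len(a), len(b)
    pooled = ((n1 - 1) * statistics.variance(a) +
              (n2 - 1) * statistics.variance(b)) / (n1 + n2 - 2)
    se = math.sqrt(pooled * (1 / n1 + 1 / n2))
    return (statistics.mean(a) - statistics.mean(b)) / se, n1 + n2 - 2

def paired_t(before, after):
    """t for paired samples: a one-sample test on the within-pair differences."""
    diffs = [y - x for x, y in zip(before, after)]
    return one_sample_t(diffs, 0)

# Hypothetical measurements, invented for illustration only
t, df = paired_t(before=[10, 12, 14], after=[11, 14, 17])
print(f"paired t = {t:.3f} with {df} degrees of freedom")
```

The paired version reuses the single-mean test because pairing reduces the data to one column of differences, which is then compared with zero.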
BACKGROUND AND NEED OF THE TEST
Data collected in the field of medicine is often qualitative.
--- For example, the presence or absence of a symptom, classification of pregnancy as ‘high risk’ or ‘non-high risk’, the degree of severity of a disease (mild, moderate, severe).
The measure computed in each instance is a proportion, corresponding to the mean in the case of quantitative data such as height, weight, BMI, serum cholesterol.
Two or more proportions often need to be compared, and the test of significance employed for this purpose is called the “Chi-square test”.
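As an illustration of how two proportions are compared, the sketch below computes the usual chi-square statistic, the sum of (observed - expected)² / expected over all cells of a table of counts. The function names and the example counts are hypothetical, and no continuity correction is applied. The p-value helper covers only 1 degree of freedom (a 2 x 2 table), where the upper-tail area equals erfc(√(χ²/2)).

```python
import math

def chi_square(table):
    """Chi-square statistic and degrees of freedom for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if rows and columns were independent
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df

def p_value_df1(stat):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2))

# Hypothetical 2 x 2 table: rows = treatment A / B, columns = improved / not improved
stat, df = chi_square([[20, 10], [10, 20]])
print(f"chi-square = {stat:.3f}, df = {df}, p = {p_value_df1(stat):.4f}")
```

In this invented table every expected count is 15, so the requirement that expected frequencies exceed 5 is met.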
McNemar’s test
Situation:
Two paired binary variables that form a particular type of 2 x 2 table
e.g. matched case-control study or cross-over trial
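In its usual textbook form, McNemar’s statistic uses only the discordant pairs, that is, the pairs in which the two members disagree. The sketch below assumes the version without continuity correction, (b - c)² / (b + c) with 1 degree of freedom. The names `b` and `c` for the two discordant counts and the example numbers are illustrative choices.

```python
import math

def mcnemar(b, c):
    """McNemar chi-square (1 df, no continuity correction) from the discordant pairs."""
    stat = (b - c) ** 2 / (b + c)
    p = math.erfc(math.sqrt(stat / 2))  # upper tail of chi-square with 1 df
    return stat, p

# Hypothetical counts: 15 pairs discordant in one direction, 5 in the other
stat, p = mcnemar(b=15, c=5)
print(f"McNemar chi-square = {stat:.2f}, p = {p:.4f}")
```

Concordant pairs do not enter the formula, because they carry no information about which member of a pair is more likely to show the outcome.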
When both the study variables and outcome variables are categorical (qualitative), apply:
(i) Chi-square test
(ii) Fisher’s exact test (small samples)
(iii) McNemar’s test (for paired samples)
Z-test:
Study variable: Qualitative
Outcome variable: Quantitative or Qualitative
Comparison: two means or two proportions
Sample size: each group is > 50
Student’s t-test:
Study variable: Qualitative
Outcome variable: Quantitative
Comparison: sample mean with population mean; two means (independent samples); paired samples.
Sample size: each group < 50 (can be used even for large sample size)
Chi-square test:
Study variable: Qualitative
Outcome variable: Qualitative
Comparison: two or more proportions
Sample size: > 20
Expected frequency: > 5
Fisher’s exact test:
Study variable: Qualitative
Outcome variable: Qualitative
Comparison: two proportions
Sample size: < 20
McNemar’s test: (for paired samples)
Study variable: Qualitative
Outcome variable: Qualitative
Comparison: two proportions
Sample size: Any
Degrees of freedom
1. Number of observations that are free to vary after the sample statistic has been calculated
2. Example: the sum of 3 numbers is 6
X1 = 1 (or any number)
X2 = 2 (or any number)
X3 = 3 (cannot vary)
Sum = 6
degrees of freedom = n - 1 = 3 - 1 = 2
Two forms of estimation
Point estimation = single value, e.g., x-bar is an unbiased estimator of μ
Interval estimation = range of values, i.e. a confidence interval (CI)
A confidence interval consists of:
Estimation Process
Population: mean, μ, is unknown
Random sample: mean x-bar = 50
Conclusion: “I am 95% confident that μ is between 40 & 60.”
“We are 95% sure that the TRUE parameter value is in the 95% confidence interval”
“If we repeated the experiment many many times, 95% of the time the TRUE parameter value would be in the interval”
“the probability that the interval would contain the true parameter value was 0.95.”
CI 90% corresponds to p 0.10
CI 95% corresponds to p 0.05
CI 99% corresponds to p 0.01
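A confidence interval for a mean can be sketched as mean ± z × sem, where z depends on the chosen confidence level. In the sketch below, the mean of 50 follows the estimation example, while the standard error of 5.1 is an assumed value picked for illustration. The normal multiplier is used for simplicity; with small samples a t multiplier would normally replace it.

```python
from statistics import NormalDist

def confidence_interval(mean, sem, level=0.95):
    """Normal-approximation CI: mean +/- z * sem, with z set by the confidence level."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return mean - z * sem, mean + z * sem

# mean = 50 as in the estimation example; sem = 5.1 is an assumed value
for level in (0.90, 0.95, 0.99):
    low, high = confidence_interval(50, 5.1, level)
    print(f"{level:.0%} CI: {low:.1f} to {high:.1f}")
```

With these numbers the 95% interval runs from about 40 to about 60, and asking for more confidence widens it.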
Note:
p value: only for analytical studies
CI: for descriptive and analytical studies
If p value < 0.05, then the 95% CI:
excludes 0 (for a difference), because if A = B then A - B = 0 and p > 0.05
excludes 1 (for a ratio), because if A = B then A/B = 1 and p > 0.05
--The (im)precision of the estimate is indicated by the width of the confidence interval.
--The wider the interval, the less the precision.
THE WIDTH OF THE C.I. DEPENDS ON:
---- SAMPLE SIZE
---- VARIABILITY
---- DEGREE OF CONFIDENCE
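The effect of each of the three factors can be seen by varying them one at a time. This sketch assumes a normal-approximation interval for a mean, whose full width is 2 × z × s / √n. The function name `ci_width` and the baseline numbers are invented for illustration.

```python
import math
from statistics import NormalDist

def ci_width(s, n, level=0.95):
    """Full width of a normal-approximation CI for a mean: 2 * z * s / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return 2 * z * s / math.sqrt(n)

base = ci_width(s=10, n=25)
print(f"baseline (s=10, n=25, 95%): {base:.2f}")
print(f"larger sample (n=100):      {ci_width(s=10, n=100):.2f}")
print(f"more variability (s=20):    {ci_width(s=20, n=25):.2f}")
print(f"higher confidence (99%):    {ci_width(s=10, n=25, level=0.99):.2f}")
```

Quadrupling the sample size halves the width, whereas greater variability or a higher degree of confidence makes the interval wider.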
The p value (hypothesis testing) gives the probability that the result is merely caused by chance; it does not give the magnitude and direction of the difference.
The confidence interval (estimation) gives an estimate of the value in the population based on one result in the sample; it gives the magnitude and direction of the difference.