
Bio Statistics Pg CD m


Page 1: Bio Statistics Pg CD m


BIOSTATISTICS

Dr. Kanchan Chitnis


DEFINITION

Collection, organization, summarization and analysis of data, and drawing inferences
Data derived from biological sciences and medicine is analyzed = Biostatistics


CHARACTERISTICS

Quantitative
  Discrete (Ex: blood cell count)
  Continuous (Ex: height, weight)

Qualitative
  Nominal (Ex: male/female, dead/alive)
  Ordinal (Ex: mild/moderate/severe, better/same/worse)


POPULATION AND SAMPLE

Sample: Statistics
  Mean = x̄ (x bar)
  SD = s

Population: Parameters
  Mean = μ
  SD = σ


Characteristics of representative sample

Precision = nearness of sample statistics to population parameters
n = 4(SD)² / L², where L = allowable error

Problem: Mean pulse rate of population is 70/min, with SD of 8 beats. Calculate size of sample to verify this if L = ±2 beats, and when L = ±1 beat.
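The pulse-rate problem can be checked with a minimal Python sketch of n = 4(SD)² / L². The function name is made up for illustration. The factor 4 is read here as roughly 1.96², that is, 95% confidence; the slide does not state this.

```python
def sample_size_for_mean(sd, allowable_error):
    """Sample size from n = 4 * SD^2 / L^2 (L = allowable error)."""
    return 4 * sd ** 2 / allowable_error ** 2


# Pulse-rate problem from the slide: SD = 8 beats
print(sample_size_for_mean(8, 2))  # L = ±2 beats -> 64.0
print(sample_size_for_mean(8, 1))  # L = ±1 beat  -> 256.0
```

Halving the allowable error multiplies the required sample size by four.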


Characteristics of representative sample

Control and experimental group must be there if possible
Sample should be unbiased
  Ex: Survey of prevalence of TB in a well-to-do locality does not give correct picture


NORMAL DISTRIBUTION

Large number of observations, small group intervals
Mean ± 1 SD limits include 68% of obs
Mean ± 1.96 SD limits include 95% of obs
Mean ± 2.58 SD limits include 99% of obs
Binomial and Poisson distribution
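The 68% / 95% / 99% figures can be reproduced from the normal curve itself. This is a small sketch using only Python's standard `math.erf`; the helper name `coverage_within` is an illustrative choice.

```python
import math


def coverage_within(k):
    """Fraction of a normal distribution lying within mean ± k SD."""
    return math.erf(k / math.sqrt(2))


for k in (1, 1.96, 2.58):
    print(f"mean ± {k} SD covers {coverage_within(k):.1%} of observations")
```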


STANDARD ERROR

Chance variation from sample to sample or from sample to population
In any sampling distribution:
  μ ± 1 SE limits include 68% of sample means
  μ ± 1.96 SE limits include 95% of sample means
  μ ± 2.58 SE limits include 99% of sample means
This sampling distribution forms the basis of tests of significance
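A small simulation can illustrate the sampling distribution. It is a sketch under assumptions: the population is taken to be normal, SE is computed as σ/√n, and the population values (μ = 70, σ = 8, n = 25) are made-up numbers. The function name is hypothetical.

```python
import math
import random
import statistics


def fraction_of_means_within(mu, sigma, n, k, n_samples=2000, seed=0):
    """Draw repeated samples of size n and return the fraction of
    sample means that fall within mu ± k * SE, where SE = sigma / sqrt(n)."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    se = sigma / math.sqrt(n)
    inside = 0
    for _ in range(n_samples):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        if abs(statistics.fmean(sample) - mu) <= k * se:
            inside += 1
    return inside / n_samples


# Assumed population: mean 70, SD 8; samples of 25
for k in (1, 1.96, 2.58):
    print(k, fraction_of_means_within(70, 8, 25, k))
```

The printed fractions should come out close to 0.68, 0.95 and 0.99, with some run-to-run scatter because the simulation is finite.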


ESTIMATION OF POPULATION PARAMETER

Point estimate: Single statistic like mean
Confidence interval: Range with attached probability
  Ex: Mean ± 1.96 SE = 95% confidence interval; the population mean would lie in that range in 95% of cases
Any values outside that range are rare; probability of their occurrence by chance is 5% (p = 0.05)
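As a sketch of the mean ± 1.96 SE interval, the function below computes a 95% confidence interval from a sample. The pulse readings are invented for illustration, SE is taken as s/√n, and the multiplier 1.96 assumes a large sample (a short list is used here only to keep the example readable).

```python
import math
import statistics


def confidence_interval_95(sample):
    """Return (lower, upper) = sample mean ± 1.96 * SE, with SE = s / sqrt(n)."""
    mean = statistics.fmean(sample)
    se = statistics.stdev(sample) / math.sqrt(len(sample))
    return mean - 1.96 * se, mean + 1.96 * se


# Hypothetical pulse readings (beats/min), not taken from the slides
pulse = [68, 72, 70, 74, 66, 71, 69, 73, 70, 67, 72, 68]
low, high = confidence_interval_95(pulse)
print(f"point estimate = {statistics.fmean(pulse):.1f}")
print(f"95% CI = {low:.1f} to {high:.1f}")
```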


TESTING STATISTICAL HYPOTHESES

Null hypothesis (Ho): of no difference
Alternate hypothesis (H1): of significant difference
Zone of rejection
Zone of acceptance


TYPE I AND TYPE II ERRORS

Type I error (α error)
  Ho rejected although the estimate falls in the zone of acceptance of Ho (Ho is true)
  Happens if we change the level of significance from 5% to 6% or 8%
  Serious error
  Minimize by lowering level of significance

Type II error (β error)
  Ho accepted although the estimate falls in the zone of rejection (Ho is false)
  Happens if we change the level of significance from 5% to 4% or 3%
  Not serious error
  Needs confirmation by changing level and increasing size of sample


TO SUMMARIZE

Decision        Actual situation
                Ho is true      Ho is false
Ho rejected     Type I error    No error
Ho accepted     No error        Type II error


Type II error and Power of a test

β = type II error
1 - β = power of a test = probability that we reject a false Ho = we will take correct action when Ho is false
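To make 1 - β concrete, here is a sketch that computes the power of a one-sided Z test with known σ. The one-sided setting, the function name and the example numbers are all assumptions for illustration; the slide does not specify a test.

```python
from statistics import NormalDist


def power_one_sided_z(mu0, mu1, sigma, n, alpha=0.05):
    """Power (1 - beta) of a one-sided Z test of Ho: mean = mu0
    when the true mean is mu1 and the population SD is sigma."""
    z_alpha = NormalDist().inv_cdf(1 - alpha)      # edge of the zone of rejection
    shift = abs(mu1 - mu0) * n ** 0.5 / sigma      # true difference in SE units
    beta = NormalDist().cdf(z_alpha - shift)       # chance of accepting a false Ho
    return 1 - beta


# Hypothetical numbers: Ho mean 70, true mean 73, SD 8
for n in (16, 36, 64):
    print(n, round(power_one_sided_z(70, 73, 8, n), 3))
```

Power rises as the sample size increases, which is why a larger sample is the usual remedy for type II error.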


To determine sample size to control type II errors

n = [(Zo + Z1) σ / (μ0 - μ1)]²

Problem
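The slide's problem text is not preserved in this transcript, so the sketch below applies the formula to made-up numbers. It assumes Zo is the z value for the chosen α and Z1 the z value for the chosen β (one-sided); the slide does not define them. The function name is hypothetical.

```python
import math
from statistics import NormalDist


def sample_size_for_power(mu0, mu1, sigma, alpha=0.05, beta=0.10):
    """n = [(Zo + Z1) * sigma / (mu0 - mu1)]^2, rounded up.
    Assumption: Zo = z for alpha, Z1 = z for beta (one-sided test)."""
    z0 = NormalDist().inv_cdf(1 - alpha)
    z1 = NormalDist().inv_cdf(1 - beta)
    n = ((z0 + z1) * sigma / (mu0 - mu1)) ** 2
    return math.ceil(n)


# Hypothetical: detect a shift from 70 to 73 beats/min, SD 8,
# alpha = 0.05, beta = 0.10 (power 90%)
print(sample_size_for_power(70, 73, 8))
```

A smaller difference between μ0 and μ1, or a smaller β, pushes the required n up.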


To determine sample size to estimate means

Size of sample depends on σ, degree of reliability and desired interval width (d)
d = reliability coefficient × SE
If sampling is with replacement,
  d = z σ/√n; n = z² σ² / d² ........(1)
If sampling is without replacement,
  n = N z² σ² / [d² (N - 1) + z² σ²] ........(2)
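Formulas (1) and (2) can be compared side by side in a short sketch. The function names and the example values (z = 1.96, σ = 8, d = 2, N = 500) are illustrative assumptions, not figures from the slides.

```python
import math


def n_with_replacement(z, sigma, d):
    """Formula (1): n = z^2 * sigma^2 / d^2."""
    return z ** 2 * sigma ** 2 / d ** 2


def n_without_replacement(z, sigma, d, population_size):
    """Formula (2): n = N z^2 sigma^2 / [d^2 (N - 1) + z^2 sigma^2]."""
    N = population_size
    return N * z ** 2 * sigma ** 2 / (d ** 2 * (N - 1) + z ** 2 * sigma ** 2)


# Hypothetical inputs: 95% reliability (z = 1.96), SD 8, interval half-width d = 2
print(math.ceil(n_with_replacement(1.96, 8, 2)))          # formula (1)
print(math.ceil(n_without_replacement(1.96, 8, 2, 500)))  # formula (2), N = 500
```

Formula (2) always gives a smaller n than formula (1), and the two converge as N grows large.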


Sources of estimates for σ²

A pilot sample may be drawn from population and variance computed may be used
Estimates may be available from previous studies
Problem


To determine sample size to estimate proportions

p = proportion in population possessing the characteristic of interest
q = 1 - p = not possessing
n = z² pq / d²

Problem
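The original problem is not included in the transcript. As a stand-in, this sketch evaluates n = z² pq / d² for invented values (p = 0.30, d = 0.05, z = 1.96); the function name is hypothetical.

```python
import math


def sample_size_for_proportion(p, d, z=1.96):
    """n = z^2 * p * q / d^2 with q = 1 - p, rounded up."""
    q = 1 - p
    return math.ceil(z ** 2 * p * q / d ** 2)


# Hypothetical: expected proportion 30%, desired half-width 5 percentage points
print(sample_size_for_proportion(0.30, 0.05))
```

When no estimate of p is available, p = 0.5 is a common conservative choice because pq is largest there.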


TESTS OF SIGNIFICANCE

Parametric tests
  Student's t test
  Z-test
  Pearson correlation
  ANOVA

Non-parametric tests
  Chi square
  Fisher's exact test
  Spearman's r
  Wilcoxon signed rank


Z test

Sample size more than 30
Problem

Z    1.6    2.0    2.3    2.6
p    0.1    0.05   0.02   0.01
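The Z-to-p table can be regenerated with a sketch like the one below, which assumes the p values are two-sided. The function names and the worked numbers (sample mean 72, n = 64) are made up; the problem from the slide is not in the transcript.

```python
import math


def z_statistic(sample_mean, mu, sigma, n):
    """Z = (sample mean - mu) / (sigma / sqrt(n))."""
    return (sample_mean - mu) / (sigma / math.sqrt(n))


def two_sided_p(z):
    """Two-sided p value: probability of |Z| at least this large by chance."""
    return math.erfc(abs(z) / math.sqrt(2))


# Reproduce the slide's table (the slide rounds the p values)
for z in (1.6, 2.0, 2.3, 2.6):
    print(z, round(two_sided_p(z), 3))

# Hypothetical use: sample of 64 with mean 72, tested against mu = 70, sigma = 8
z = z_statistic(72, 70, 8, 64)
print(z, round(two_sided_p(z), 3))
```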


T test

Unpaired t test
  Experiment and control
  Problem

Paired t test
  Before and after treatment
  Problem
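The two problems are not preserved in the transcript. The sketch below shows how the two t statistics are usually computed, under stated assumptions: the unpaired version uses the pooled-variance (equal variance) form, the data are invented, and the functions return only t and the degrees of freedom, to be looked up in a t table.

```python
import math
import statistics


def paired_t(before, after):
    """Paired t: mean of the differences divided by its standard error."""
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    t = statistics.fmean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))
    return t, n - 1  # t statistic, degrees of freedom


def unpaired_t(group1, group2):
    """Unpaired t with pooled variance (assumes equal variances)."""
    n1, n2 = len(group1), len(group2)
    pooled_var = ((n1 - 1) * statistics.variance(group1)
                  + (n2 - 1) * statistics.variance(group2)) / (n1 + n2 - 2)
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (statistics.fmean(group1) - statistics.fmean(group2)) / se
    return t, n1 + n2 - 2


# Hypothetical data: systolic BP before/after treatment, and experiment vs control
print(paired_t([150, 160, 155, 165], [140, 152, 150, 153]))
print(unpaired_t([140, 152, 150, 153], [150, 160, 155, 165]))
```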


Chi square test

For discrete data
To find association of two events, goodness of fit to theoretical values
Problem
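As an illustration of the association use case, this sketch computes the chi square statistic for a contingency table of counts. The function name and the 2 × 2 counts are invented; the statistic is the usual sum of (observed - expected)² / expected, with expected counts taken from the row and column totals.

```python
def chi_square(table):
    """Chi square statistic and degrees of freedom for a table of counts
    (list of rows). Expected count = row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, df


# Hypothetical 2 x 2 table: rows = treated / untreated, columns = improved / not improved
print(chi_square([[10, 20], [20, 10]]))
```

The result is compared with the tabulated chi square value for the given degrees of freedom.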


OTHER TESTS OF SIGNIFICANCE

Correlation coefficient tells strength of association between two continuous variables
Regression coefficient helps to predict change in dependent variable
F ratio (ANOVA) used to compare means of more than 2 samples
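The first two points can be sketched together: Pearson's r and the regression coefficient (slope of y on x) share the same sums of squares. The function name and the height/weight values are illustrative assumptions, not data from the slides.

```python
import math
import statistics


def pearson_r_and_slope(x, y):
    """Return (r, b): Pearson correlation coefficient and the
    regression coefficient b of y on x (change in y per unit change in x)."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    r = sxy / math.sqrt(sxx * syy)
    slope = sxy / sxx
    return r, slope


# Hypothetical data: height (cm) and weight (kg)
height = [150, 155, 160, 165, 170, 175]
weight = [50, 54, 57, 62, 66, 69]
r, b = pearson_r_and_slope(height, weight)
print(f"r = {r:.3f}, regression coefficient = {b:.3f} kg per cm")
```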


Thank you…