
Biostatistics course, Part 17

Non-parametric methods

Dr. C. Nicolas Padilla Raygoza

Department of Nursing and Obstetrics

Division of Health Sciences and Engineering

Campus Celaya-Salvatierra

University of Guanajuato

Biosketch

Physician and surgeon, Universidad Autónoma de Guadalajara.

Pediatrician, certified by the Consejo Mexicano de Certificación en Pediatría.

Diploma in Epidemiology, London School of Hygiene and Tropical Medicine, University of London.

Master of Science with a focus on Epidemiology, Atlantic International University.

Doctorate in Science with a focus on Epidemiology, Atlantic International University.

Associate Professor C, Department of Nursing and Obstetrics, Division of Health Sciences and Engineering, Campus Celaya-Salvatierra, University of Guanajuato.


Competencies

The reader will know the non-parametric methods and when they can be used.

The reader will apply non-parametric methods appropriately.

The reader will be able to obtain a confidence interval in a non-parametric analysis.

The reader will apply the Wilcoxon rank sum test.

The reader will apply the Wilcoxon signed rank test.

The reader will apply Spearman's r.

Introduction

Parametric methods

They are based on means, standard deviations, or probabilities.

The Normal distribution is not always appropriate for studying:

Variables with few observations,

Non-symmetrical distributions, or

Variables that can take more than two values.

Introduction (contd…)

When this happens, we use other analysis methods.

Non-parametric methods

They are not based on the same assumptions as parametric methods, but they still make some assumptions.

Categories (ranking), means, medians

Some non-parametric methods use ranks instead of the actual values.

Ranks are used to compare data by their order rather than by their size.

Patient Glucose in blood (mg/dl)

1 135

2 225

3 70

4 100

5 110

6 150

7 90

8 100

9 170

10 60

11 80

Categories (ranking), means, median

Ranked in ascending order

Patient Glucose in blood (mg/dl) Ranking

10 60 1

3 70 2

11 80 3

7 90 4

4 100 5

8 100 6

5 110 7

1 135 8

6 150 9

9 170 10

2 225 11

Are the mean and the median equal?

For the mean and its confidence interval to be adequate, the distribution of values should be symmetric.

For the median and its confidence interval to be adequate, no assumptions about the distribution are needed.

Using the order (ranking) instead of the original values reduces the need for assumptions about the distribution, and the calculations are simpler and faster.

The disadvantage is that the original values are lost.

Thus, non-parametric methods are used mainly to test hypotheses rather than for estimation.
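As a quick check on the question in this slide's title, the mean and median of the 11 glucose values from the earlier table can be computed directly. This is only an illustrative sketch using Python's standard `statistics` module; the variable names are chosen here and are not part of the course material.

```python
from statistics import mean, median

# Blood glucose (mg/dl) for patients 1 to 11, as listed in the table.
glucose = [135, 225, 70, 100, 110, 150, 90, 100, 170, 60, 80]

glucose_mean = mean(glucose)      # pulled upward by the large value 225
glucose_median = median(glucose)  # middle value of the sorted data

print(f"mean   = {glucose_mean:.1f} mg/dl")
print(f"median = {glucose_median} mg/dl")
```

The mean (about 117.3 mg/dl) lies well above the median (100 mg/dl), which suggests that the distribution is not symmetric.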


Non-parametric methods

Situation                                Non-parametric method                Parametric method
One sample                               Wilcoxon signed rank test            Z statistic (t test)
Two independent samples                  Wilcoxon rank sum test               Z statistic for two independent samples (t test)
Two paired samples                       Wilcoxon signed rank test            Paired Z statistic (paired t test)
One sample, two quantitative variables   Spearman's correlation coefficient   Pearson's correlation coefficient

Data of one sample

The table shows blood glucose levels from 11 patients.

We want to know whether the mean is 100 mg/dl.

Patient Glucose in blood (mg/dl) Ranking

10 60 1

3 70 2

11 80 3

7 90 4

4 100 5

8 100 6

5 110 7

1 135 8

6 150 9

9 170 10

2 225 11

Data of one sample

A non-parametric alternative is the Wilcoxon signed rank test.

It can be used to evaluate whether the values in the sample are centered on 100 mg/dl.

This test does not require the data to be Normally distributed, but it does require the distribution to be symmetrical; it need not have the "bell" shape of the Normal distribution.

Data of one sample

The Wilcoxon signed rank test is calculated in six steps:

1. Calculate the difference between each observation and the value of interest, 100 mg/dl.
2. Exclude any difference equal to 0.
3. Rank the differences by magnitude, ignoring the sign.
4. Sum the ranks of the positive differences.
5. Sum the ranks of the negative differences.
6. Take the smaller of the two sums and call it T.

Data of one sample

Patient Glucose in blood (mg/dl) Difference from 100 mg/dl Ranking

10 60 -40 6

3 70 -30 4

11 80 -20 3

7 90 -10 2

4 100 0

8 100 0

5 110 10 1

1 135 35 5

6 150 50 7

9 170 70 8

2 225 125 9
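The six steps can be sketched in a few lines of Python. This is a minimal illustration, not the course's own procedure. The helper names `average_ranks` and `signed_rank_sums` are hypothetical, and the sketch assumes the common convention that tied magnitudes share the mean of their ranks. The table above instead gives ranks 2 and 1 to the tied differences -10 and 10, so its rank sums (15 and 30) differ by 0.5 from the ones computed here.

```python
def average_ranks(values):
    """1-based ascending ranks; tied values share the mean of their ranks (assumed convention)."""
    ordered = sorted(values)
    # ordered.index(v) is the 0-based position of the first copy of v;
    # the mean rank of count copies is that position + (count + 1) / 2.
    return [ordered.index(v) + (ordered.count(v) + 1) / 2 for v in values]


def signed_rank_sums(data, reference):
    """Return (sum of positive ranks, sum of negative ranks, T)."""
    diffs = [x - reference for x in data if x != reference]   # steps 1 and 2
    ranks = average_ranks([abs(d) for d in diffs])            # step 3
    positive = sum(r for d, r in zip(diffs, ranks) if d > 0)  # step 4
    negative = sum(r for d, r in zip(diffs, ranks) if d < 0)  # step 5
    return positive, negative, min(positive, negative)        # step 6


glucose = [135, 225, 70, 100, 110, 150, 90, 100, 170, 60, 80]
w_plus, w_minus, t_stat = signed_rank_sums(glucose, 100)
print(w_plus, w_minus, t_stat)  # 30.5 14.5 14.5
```

T would then be compared with the critical values of the Wilcoxon signed rank test for the 9 non-zero differences.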

Two independent groups

Thirty teenagers with acute appendicitis were allocated to two groups: 15 underwent traditional appendectomy and 15 underwent laparoscopic appendectomy.

In both groups, we evaluated post-surgical pain.

Post-surgical pain Traditional Laparoscopy

None 1 3

Slight 5 7

Moderate 5 4

Severe 4 1

Total 15 15


To compare post-surgical pain in the two groups, we can use the Wilcoxon rank sum test.

We define the null hypothesis H0: the two distributions overlap.

We define the alternative hypothesis H1: the two distributions do not overlap.


The Wilcoxon rank sum test has three steps:

1. Rank the values of both groups together in ascending order.
2. Calculate T as the sum of the ranks of the smaller sample, or of either sample if the sample sizes are equal.
3. Compare the T value with the critical values of the Wilcoxon rank sum test.


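Post-surgical pain   Group         Patients   Rankings
None                 Traditional   1          1+
None                 Laparoscopy   3          3
Slight               Traditional   5          9+
Slight               Laparoscopy   7          15
Moderate             Traditional   5          21+
Moderate             Laparoscopy   4          25
Severe               Traditional   4          29+
Severe               Laparoscopy   1          30
Total                Traditional   15
Total                Laparoscopy   15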
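Steps 1 and 2 of the rank sum test can be illustrated with the pain counts. The sketch below makes one assumption that the slides do not state: all patients in the same pain category are treated as tied and receive the mean of the ranks that the category occupies (a midrank). The variable names are illustrative only.

```python
# Number of patients per pain category, ordered None, Slight, Moderate, Severe.
traditional = [1, 5, 5, 4]
laparoscopy = [3, 7, 4, 1]

rank_sum_traditional = 0.0
rank_sum_laparoscopy = 0.0
next_rank = 1  # first rank not yet assigned

for n_trad, n_lap in zip(traditional, laparoscopy):
    tied = n_trad + n_lap
    # The category occupies ranks next_rank .. next_rank + tied - 1;
    # every patient in it gets the mean of those ranks.
    midrank = next_rank + (tied - 1) / 2
    rank_sum_traditional += n_trad * midrank
    rank_sum_laparoscopy += n_lap * midrank
    next_rank += tied

# The groups have equal size, so either sum can serve as T (step 2).
print(rank_sum_traditional, rank_sum_laparoscopy)  # 272.0 193.0
```

The traditional group has the larger rank sum, which reflects its higher pain scores. Step 3, the comparison with the critical values, still requires the table for the Wilcoxon rank sum test.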

Two paired groups

The table shows the hours of improvement given by two analgesics in 12 patients with rheumatoid arthritis.

To test whether the improvement is the same with both analgesics, we can use the paired t test or the Wilcoxon signed rank test.

With both methods, we calculate the difference in hours of improvement for each patient.

Patient Analgesic A Analgesic B

1 3.5 3.5

2 3.6 5.7

3 2.6 2.9

4 2.6 2.4

5 7.3 9.9

6 3.4 3.3

7 14.9 16.7

8 6.6 6.0

9 2.3 3.8

10 2.0 4.0

11 6.8 9.1

12 8.5 26.9

Two paired groups

The Wilcoxon signed rank test does not require Normality, but the differences should be distributed symmetrically on both sides of the median.

H0: difference in medians = 0

H1: difference in medians ≠ 0

Patient Analgesic A Analgesic B Difference (A - B) Rankings

1 3.5 3.5 0

2 3.6 5.7 -2.1 8

3 2.6 2.9 -0.3 3

4 2.6 2.4 0.2 2

5 7.3 9.9 -2.6 10

6 3.4 3.3 0.1 1

7 14.9 16.7 -1.8 6

8 6.6 6.0 0.6 4

9 2.3 3.8 -1.5 5

10 2.0 4.0 -2.0 7

11 6.8 9.1 -2.3 9

12 8.5 26.9 -18.4 11

Two paired groups

We calculate the Wilcoxon signed rank test for the differences as follows:

1.- Count the non-zero differences.
2.- Rank the differences by magnitude, ignoring the sign.
3.- Sum the ranks of the positive differences.
4.- Sum the ranks of the negative differences.
5.- Take the smaller of the two sums and call it T (sum of the ranks of negative differences = 59, sum of the ranks of positive differences = 7, T = 7).
6.- Compare the T value with the table of critical values for the Wilcoxon signed rank test. T = 7, p < 0.05.
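The steps above can be reproduced for the analgesic data with a short Python sketch. Steps 1 to 5 follow the slide. For step 6 the sketch swaps in a large-sample normal approximation instead of the table lookup. That approximation is an assumption added here, not part of the course material, and with only 11 non-zero differences it is rough, so the table of critical values remains the reference.

```python
import math

analgesic_a = [3.5, 3.6, 2.6, 2.6, 7.3, 3.4, 14.9, 6.6, 2.3, 2.0, 6.8, 8.5]
analgesic_b = [3.5, 5.7, 2.9, 2.4, 9.9, 3.3, 16.7, 6.0, 3.8, 4.0, 9.1, 26.9]

# Steps 1-2: non-zero differences (rounded to remove floating-point noise),
# sorted by magnitude. No magnitudes are tied here, so position = rank.
diffs = [round(a - b, 1) for a, b in zip(analgesic_a, analgesic_b)]
nonzero = sorted((d for d in diffs if d != 0), key=abs)
n = len(nonzero)

# Steps 3-5: rank sums by sign, and T as the smaller of the two.
positive_sum = sum(rank for rank, d in enumerate(nonzero, start=1) if d > 0)
negative_sum = sum(rank for rank, d in enumerate(nonzero, start=1) if d < 0)
t_stat = min(positive_sum, negative_sum)

# Step 6 (assumption): normal approximation to the distribution of T under H0.
expected = n * (n + 1) / 4
std_dev = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
z = (t_stat - expected) / std_dev
p_two_sided = math.erfc(abs(z) / math.sqrt(2))

print(positive_sum, negative_sum, t_stat)  # 7 59 7
print(f"z = {z:.2f}, approximate two-sided p = {p_two_sided:.3f}")
```

The rank sums match the values quoted in step 5, and the approximate p-value falls below 0.05, in line with the table-based conclusion.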

Spearman’s correlation of ranks

The table and graph show the incidence of colon cancer and the mean per capita meat intake in 13 countries.

Country Colon cancer incidence Mean meat intake

1 10 1

2 8 9

3 11 5

4 12 5

5 22 33

6 67 37

7 73 32

8 48 8

9 37 41

10 31 12

11 21 29

12 17 3

13 3 1

Spearman ranks correlation

It is appropriate for monotonic relationships, including non-linear ones.

It is calculated in the same way as Pearson's r, but using the ranks.

To calculate it, we need three steps:

1. Rank the values of the first variable.
2. Rank the values of the second variable.
3. Apply the formula for Pearson's r, using the ranks instead of the original values.

Spearman ranks correlation

Country Colon cancer incidence Mean meat intake Ranking of cancer Ranking of meat intake

1 10 1 3 1

2 8 9 2 7

3 11 5 4 5

4 12 5 5 4

5 22 33 8 11

6 67 37 12 12

7 73 32 13 10

8 48 8 11 6

9 37 41 10 13

10 31 12 9 8

11 21 29 7 9

12 17 3 6 3

13 3 1 1 2
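The three steps for Spearman's coefficient can be checked on the 13 countries with a small Python sketch. The helper names `average_ranks` and `pearson_r` are hypothetical. The sketch also assumes that tied values (the two meat intakes of 1 and the two of 5) share the mean of their ranks, whereas the table above gives them consecutive ranks.

```python
import math


def average_ranks(values):
    """1-based ascending ranks; tied values share the mean of their ranks (assumed convention)."""
    ordered = sorted(values)
    return [ordered.index(v) + (ordered.count(v) + 1) / 2 for v in values]


def pearson_r(x, y):
    """Pearson's correlation coefficient of two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


cancer = [10, 8, 11, 12, 22, 67, 73, 48, 37, 31, 21, 17, 3]
meat = [1, 9, 5, 5, 33, 37, 32, 8, 41, 12, 29, 3, 1]

r_pearson = pearson_r(cancer, meat)
# Steps 1-3: rank each variable, then apply Pearson's formula to the ranks.
r_spearman = pearson_r(average_ranks(cancer), average_ranks(meat))

print(f"Pearson r = {r_pearson:.2f}, Spearman r = {r_spearman:.2f}")
# Pearson r = 0.65, Spearman r = 0.74
```

Spearman's coefficient is higher than Pearson's for these data, as expected when a relationship is monotonic but not linear.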

Comparison of methods

Example                                            Parametric method                            Non-parametric method
Glucose in blood                                   t test for one sample, p>0.05                Wilcoxon signed rank test, p>0.2
Intensity of surgical pain                         t test for two independent samples, p<0.05   Wilcoxon rank sum test, p<0.05
Improvement of pain                                Paired t test, p>0.1                         Wilcoxon signed rank test, p<0.05
Correlation between colon cancer and meat intake   Pearson's r = 0.65                           Spearman's r = 0.74

