8
1 Some R Basics EPP 245/298 Statistical Analysis of Laboratory Data

1 Some R Basics EPP 245/298 Statistical Analysis of Laboratory Data

Embed Size (px)

Citation preview

Page 1: 1 Some R Basics EPP 245/298 Statistical Analysis of Laboratory Data

1

Some R Basics

EPP 245/298

Statistical Analysis of

Laboratory Data

Page 2: 1 Some R Basics EPP 245/298 Statistical Analysis of Laboratory Data

October 20, 2004 EPP 245 Statistical Analysis of Laboratory Data

2

Getting Data into R

• Many times the most direct method is to edit the data in Excel, Export as a txt file, then import to R using read.delim

• We will do this two ways for the energy data from Dalgaard

• Frequently, the data from studies I am involved in arrives in Excel

Page 3: 1 Some R Basics EPP 245/298 Statistical Analysis of Laboratory Data

October 20, 2004 EPP 245 Statistical Analysis of Laboratory Data

3

> eg <- read.delim("energy1.txt")> eg Obese Lean1 9.21 7.532 11.51 7.483 12.79 8.084 11.85 8.095 9.97 10.156 8.79 8.407 9.69 10.888 9.68 6.139 9.19 7.9010 NA 7.0511 NA 7.4812 NA 7.5813 NA 8.11

Page 4: 1 Some R Basics EPP 245/298 Statistical Analysis of Laboratory Data

October 20, 2004 EPP 245 Statistical Analysis of Laboratory Data

4

> class(eg)[1] "data.frame"> t.test(eg$Obese,eg$Lean)

Welch Two Sample t-test

data: eg$Obese and eg$Lean t = 3.8555, df = 15.919, p-value = 0.001411alternative hypothesis: true difference in means is not equal to 0 95 percent confidence interval: 1.004081 3.459167 sample estimates:mean of x mean of y 10.297778 8.066154

> mean(eg$Obese)-mean(eg$Lean)[1] NA> mean(eg$Obese[1:9])-mean(eg$Lean)[1] 2.231624>

Page 5: 1 Some R Basics EPP 245/298 Statistical Analysis of Laboratory Data

October 20, 2004 EPP 245 Statistical Analysis of Laboratory Data

5

> class(eg)[1] "data.frame"> t.test(eg$Obese,eg$Lean)

Welch Two Sample t-test

data: eg$Obese and eg$Lean t = 3.8555, df = 15.919, p-value = 0.001411alternative hypothesis: true difference in means is not equal to 0 95 percent confidence interval: 1.004081 3.459167 sample estimates:mean of x mean of y 10.297778 8.066154

> mean(eg$Obese)-mean(eg$Lean)[1] NA> mean(eg$Obese[1:9])-mean(eg$Lean)[1] 2.231624>

Page 6: 1 Some R Basics EPP 245/298 Statistical Analysis of Laboratory Data

October 20, 2004 EPP 245 Statistical Analysis of Laboratory Data

6

> eg2 <- read.delim("energy2.txt")> eg2 expend stature1 9.21 Obese2 11.51 Obese3 12.79 Obese4 11.85 Obese5 9.97 Obese6 8.79 Obese7 9.69 Obese8 9.68 Obese9 9.19 Obese10 7.53 Lean11 7.48 Lean12 8.08 Lean13 8.09 Lean14 10.15 Lean15 8.40 Lean16 10.88 Lean17 6.13 Lean18 7.90 Lean19 7.05 Lean20 7.48 Lean21 7.58 Lean22 8.11 Lean

Page 7: 1 Some R Basics EPP 245/298 Statistical Analysis of Laboratory Data

October 20, 2004 EPP 245 Statistical Analysis of Laboratory Data

7

> class(eg2)[1] "data.frame"> t.test(eg2$expend ~ eg2$stature)

Welch Two Sample t-test

data: eg2$expend by eg2$stature t = -3.8555, df = 15.919, p-value = 0.001411alternative hypothesis: true difference in means is not equal to 0 95 percent confidence interval: -3.459167 -1.004081 sample estimates: mean in group Lean mean in group Obese 8.066154 10.297778

> mean(eg2[eg2[,2]=="Lean",1])-mean(eg2[eg2[,2]=="Obese",1])[1] -2.231624

Page 8: 1 Some R Basics EPP 245/298 Statistical Analysis of Laboratory Data

October 20, 2004 EPP 245 Statistical Analysis of Laboratory Data

8

> mean(eg2[eg2[,2]=="Lean",1])-mean(eg2[eg2[,2]=="Obese",1])[1] -2.231624

> tapply(eg2[,1],eg2[,2],mean) Lean Obese 8.066154 10.297778 > tmp <-tapply(eg2[,1],eg2[,2],mean)> tmp Lean Obese 8.066154 10.297778 > class(tmp)[1] "array"> dim(tmp)[1] 2> tmp[1]-tmp[2] Lean -2.231624