
Using EpiCurve

jean.pierre.decorps@gmail.com

2020-08-26


Package EpiCurve

Description

EpiCurve allows the user to create epidemic curves from case-based and aggregated data.

Details

The EpiCurve function creates a graph of the number of cases by time of illness (for example, date of onset). Each case is represented by a square. The time unit for the x-axis can be hourly, daily, weekly or monthly. The hourly interval can be split into 1, 2, 3, 4, 6, 8 or 12 hour time units.

EpiCurve works on both case-based data (one case per line) and aggregated data (a count of cases for each date). With aggregated data, you need to specify the variable holding the count of cases in the “freq” parameter.

With case-based (non-aggregated) data, the date format for EpiCurve can be:

• hourly: YYYY-MM-DD HH:MM or YYYY-MM-DD HH:MM:SS
• daily: YYYY-MM-DD
• monthly: YYYY-MM

If the date format is daily or hourly, you can force the aggregation period on the graph by setting the “period” parameter to “day”, “week” or “month”.

For aggregated data, the date formats can be as above, but they can also be weekly: YYYY-Wnn. Here, the “period” parameter must state how the data are aggregated. To aggregate the data further for the epidemic curve (e.g. move from daily aggregated cases to weekly aggregated cases), specify the “to.period” parameter.

When the date format is hourly, the dataset is considered case-based, whether or not the “freq” parameter of the EpiCurve function is supplied.
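To illustrate the two layouts described above, the following base R sketch turns a small case-based data frame (one row per case) into an aggregated one (one row per date, with a count). The column names “date” and “value” and the sample dates are illustrative choices, not names required by the package.

```r
# Case-based data: one row per case (dates are illustrative).
cases <- data.frame(
  date = c("2016-10-26", "2016-11-02", "2016-11-03",
           "2016-11-05", "2016-11-05", "2016-11-05", "2016-11-05",
           "2016-11-06", "2016-11-06", "2016-11-06",
           "2016-11-07", "2016-11-07"),
  stringsAsFactors = FALSE
)

# Aggregated data: one row per date, with the number of cases on that date.
counts <- table(cases$date)
agg <- data.frame(
  date  = names(counts),
  value = as.integer(counts),
  stringsAsFactors = FALSE
)

print(agg)
```

A data frame shaped like agg corresponds to the aggregated case: the count column would be named in the “freq” parameter and the aggregation level in “period”, for example EpiCurve(agg, date = "date", freq = "value", period = "day").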

The EpiCurve function

EpiCurve(
  x,
  date = NULL,
  freq = NULL,
  cutvar = NULL,
  period = NULL,
  to.period = NULL,
  split = 1,
  cutorder = NULL,
  colors = NULL,
  title = NULL,
  xlabel = NULL,
  ylabel = NULL,
  note = NULL,
  square = TRUE
)


Arguments

Parameter   Description
x           data.frame with at least one column with dates
date        character, name of the date column
freq        character, name of a column with a value to display
cutvar      character, name of a column with factors
period      character, one of c(“hour”, “day”, “week”, “month”)
to.period   character, converts the date period to another period, only for aggregated data.
            If period is “day”, to.period can be “week” or “month”.
            If period is “week”, to.period can be “month”.
split       integer, one of c(1, 2, 3, 4, 6, 8, 12), value for the hourly split
cutorder    character, vector of factors
colors      character, vector of colors
title       character, title of the plot
xlabel      character, label for the x axis
ylabel      character, label for the y axis
note        character, adds a note under the graph
square      boolean. If TRUE (default), squares are used to plot the curve.
            If the number of cases is too high, use square = FALSE.
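To make the “split” argument concrete, here is a minimal base R sketch of how timestamps could be grouped into blocks of “split” hours. It is only an illustration under the assumption that blocks are counted from midnight; it is not the package's internal code, and the helper name hour_block is hypothetical. The allowed values are the divisors of 24 below 24, so a day always divides into whole blocks.

```r
# Hypothetical helper: start hour of the block of `split` hours that
# contains each timestamp, assuming blocks are counted from midnight.
hour_block <- function(x, split = 1) {
  stopifnot(split %in% c(1, 2, 3, 4, 6, 8, 12))
  t <- as.POSIXct(x, format = "%Y-%m-%d %H:%M", tz = "UTC")
  hour <- as.integer(format(t, "%H"))
  (hour %/% split) * split
}

times <- c("2017-04-12 16:31", "2017-04-12 17:35",
           "2017-04-12 18:06", "2017-04-12 19:41")

hour_block(times, split = 1)  # one block per hour
hour_block(times, split = 3)  # blocks 15:00-17:59 and 18:00-20:59

# Every allowed value divides 24 evenly.
all(24 %% c(1, 2, 3, 4, 6, 8, 12) == 0)
```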

Depends

ggplot2, dplyr, ISOweek, scales, timeDate


Plot non-aggregated cases

Daily - non-aggregated cases

library(EpiCurve)
library(knitr)  # provides kable()

DF <- read.csv("daily_unaggregated_cases.csv", stringsAsFactors = FALSE)
kable(head(DF, 12))

UTS         V1     V2
2016-10-26  7.20   188
2016-11-02  7.03   95
2016-11-03  5.14   160
2016-11-05  9.89   165
2016-11-05  9.69   109
2016-11-05  4.15   154
2016-11-05  4.97   144
2016-11-06  8.97   187
2016-11-06  4.45   120
2016-11-06  6.60   116
2016-11-07  7.68   141
2016-11-07  10.08  126

EpiCurve(DF,
         date = "UTS",
         period = "day",
         colors = "#9900ef",
         xlabel = sprintf("From %s to %s", min(DF$UTS), max(DF$UTS)))

## Day
## 'data.frame': 57 obs. of 4 variables:
##  $ Freq: num 1 0 0 0 0 0 0 1 1 0 ...
##  $ Cut : chr "1 cas" NA NA NA ...
##  $ Day : chr "2016-10-26" "2016-10-27" "2016-10-28" "2016-10-29" ...
##  $ Date: Date, format: "2016-10-26" "2016-10-27" ...
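The xlabel in the call above is built with min() and max() applied to a character column. This works because dates written as YYYY-MM-DD sort as text in the same order as they do chronologically, as the following base R check illustrates (the sample dates are taken from the table above).

```r
# Dates stored as text in YYYY-MM-DD form (sample values).
uts <- c("2016-11-05", "2016-10-26", "2016-11-07", "2016-11-02")

# Text comparison and date comparison agree for this format.
min(uts) == format(min(as.Date(uts)))
max(uts) == format(max(as.Date(uts)))

# The label used for the x axis.
label <- sprintf("From %s to %s", min(uts), max(uts))
print(label)
```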

[Figure: daily epidemic curve, one square per case. x-axis label: From 2016-10-26 to 2016-12-21, weekly ticks from 10-24 to 12-19; y-axis: 0 to 12.]
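The printed structure above lists 57 rows even though cases occur on far fewer distinct dates: every day between the first and last onset date appears, with a frequency of 0 when there is no case. The following base R sketch reproduces that idea for the first rows of the example. It is an illustration, not the package's internal code, and the variable names are arbitrary.

```r
# Onset dates of the first cases in the example (one entry per case).
onset <- as.Date(c("2016-10-26", "2016-11-02", "2016-11-03",
                   "2016-11-05", "2016-11-05", "2016-11-05", "2016-11-05"))

# Every calendar day from the first to the last onset date.
days <- seq(min(onset), max(onset), by = "day")

# Count cases per day; days without cases get 0.
freq <- as.integer(table(factor(format(onset), levels = format(days))))

filled <- data.frame(Day = format(days), Freq = freq,
                     stringsAsFactors = FALSE)
print(filled)

# The full example spans 2016-10-26 to 2016-12-21.
n_days <- length(seq(as.Date("2016-10-26"), as.Date("2016-12-21"), by = "day"))
print(n_days)
```

The first ten frequencies of this sketch match the Freq column printed by EpiCurve, and the number of days in the full range matches the 57 observations reported.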

With no squares

EpiCurve(DF,
         date = "UTS",
         period = "day",
         colors = "#9900ef",
         xlabel = sprintf("From %s to %s", min(DF$UTS), max(DF$UTS)),
         square = FALSE)

## Day
## 'data.frame': 57 obs. of 4 variables:
##  $ Freq: num 1 0 0 0 0 0 0 1 1 0 ...
##  $ Cut : chr "1 cas" NA NA NA ...
##  $ Day : chr "2016-10-26" "2016-10-27" "2016-10-28" "2016-10-29" ...
##  $ Date: Date, format: "2016-10-26" "2016-10-27" ...

[Figure: the same daily epidemic curve drawn without squares. x-axis label: From 2016-10-26 to 2016-12-21, weekly ticks from 10-24 to 12-19; y-axis: 0 to 12.]

Hourly - non-aggregated cases

DF <- read.csv("hourly_unaggregated_cases.csv", stringsAsFactors = FALSE)
kable(head(DF, 12))

UTS               X1     X2
2017-04-12 16:31  5.17   166
2017-04-12 17:35  8.69   101
2017-04-12 17:38  6.81   140
2017-04-12 18:06  4.95   120
2017-04-12 18:36  10.92  189
2017-04-12 18:38  7.02   185
2017-04-12 18:43  8.03   175
2017-04-12 19:05  6.39   102
2017-04-12 19:11  4.61   126
2017-04-12 19:24  6.36   188
2017-04-12 19:37  7.80   112
2017-04-12 19:41  6.18   123

EpiCurve(DF,
         date = "UTS",
         period = "hour",
         split = 1,
         colors = "#339933",
         ylabel = "Number of cases",
         xlabel = sprintf("From %s to %s", min(DF$UTS), max(DF$UTS)))

[Figure: hourly epidemic curve, one square per case. x-axis label: From 2017-04-12 16:31 to 2017-04-13 03:23, one tick per hour from 12 00:00-00:59 to 13 23:00-23:59; y-axis: Number of cases, 0 to 18.]

Hourly - non-aggregated cases with factors

DF <- read.csv("hourly_unaggregated_cases_factors.csv", stringsAsFactors = FALSE)
kable(head(DF, 12))

UTS               X1     X2   Confirmed
2017-04-12 16:31  5.17   166  YES
2017-04-12 17:35  8.69   101  YES
2017-04-12 17:38  6.81   140  NO
2017-04-12 18:06  4.95   120  NO
2017-04-12 18:36  10.92  189  NO
2017-04-12 18:38  7.02   185  YES
2017-04-12 18:43  8.03   175  NO
2017-04-12 19:05  6.39   102  NO
2017-04-12 19:11  4.61   126  NO
2017-04-12 19:24  6.36   188  YES
2017-04-12 19:37  7.80   112  NO
2017-04-12 19:41  6.18   123  NO

EpiCurve(DF,
         date = "UTS",
         period = "hour",
         split = 1,
         cutvar = "Confirmed",
         colors = c("#339933", "#eebb00"),
         xlabel = sprintf("From %s to %s", min(DF$UTS), max(DF$UTS)))

[Figure: hourly epidemic curve colored by the Confirmed column (legend: YES, NO). x-axis label: From 2017-04-12 16:31 to 2017-04-13 03:23, one tick per hour from 12 00:00-00:59 to 13 23:00-23:59; y-axis: 0 to 18.]
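With “cutvar”, each time unit is broken down by the levels of the chosen column. The following base R sketch tabulates the twelve rows shown above by hour and by the Confirmed column, which is the kind of count the colored squares represent. It is an illustration only and does not reproduce the package's internals.

```r
# The twelve rows of the preview (time of onset and confirmation status).
DF <- data.frame(
  UTS = c("2017-04-12 16:31", "2017-04-12 17:35", "2017-04-12 17:38",
          "2017-04-12 18:06", "2017-04-12 18:36", "2017-04-12 18:38",
          "2017-04-12 18:43", "2017-04-12 19:05", "2017-04-12 19:11",
          "2017-04-12 19:24", "2017-04-12 19:37", "2017-04-12 19:41"),
  Confirmed = c("YES", "YES", "NO", "NO", "NO", "YES",
                "NO", "NO", "NO", "YES", "NO", "NO"),
  stringsAsFactors = FALSE
)

# Hour of onset, taken from the "HH" part of the timestamp.
DF$hour <- substr(DF$UTS, 12, 13)

# Cases per hour and per level of the cut variable.
counts <- table(hour = DF$hour, Confirmed = DF$Confirmed)
print(counts)
```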


With no squares

EpiCurve(DF,
         date = "UTS",
         period = "hour",
         split = 1,
         cutvar = "Confirmed",
         colors = c("#339933", "#eebb00"),
         xlabel = sprintf("From %s to %s", min(DF$UTS), max(DF$UTS)),
         square = FALSE)

[Figure: the same hourly epidemic curve colored by the Confirmed column (legend: YES, NO), drawn without squares. x-axis label: From 2017-04-12 16:31 to 2017-04-13 03:23; y-axis: 0 to 18.]

Plot aggregated data

Daily

Without factors

date        value
2016-03-01  5
2016-03-03  5
2016-03-05  5
2016-03-06  1
2016-03-07  3
2016-03-14  4
2016-03-15  1
2016-03-16  10
2016-03-27  3
2016-03-28  2
2016-03-30  4
2016-03-31  2
2016-04-01  3
2016-04-03  1
2016-04-04  6
2016-04-07  6
2016-04-08  9
2016-04-09  15
2016-04-13  2
2016-04-14  1
2016-04-15  3
2016-04-16  6
2016-04-17  4
2016-04-18  17
2016-04-27  13
2016-04-29  3
2016-04-30  6

EpiCurve(DF,
         date = "date",
         freq = "value",
         period = "day",
         ylabel = "Number of cases",
         xlabel = sprintf("From %s to %s", min(DF$date), max(DF$date)),
         title = "Epidemic Curve",
         note = "Daily epidemic curve")

## Day
## 'data.frame': 61 obs. of 4 variables:
##  $ Freq: num 5 0 5 0 5 1 3 0 0 0 ...
##  $ Cut : chr "1 cas" NA "1 cas" NA ...
##  $ Day : chr "2016-03-01" "2016-03-02" "2016-03-03" "2016-03-04" ...
##  $ Date: Date, format: "2016-03-01" "2016-03-02" ...

[Figure: “Epidemic Curve”, daily aggregated cases. x-axis label: From 2016-03-01 to 2016-04-30, weekly ticks from 02-29 to 05-02; y-axis: Number of cases, 0 to 16; note under the graph: Daily epidemic curve.]

With factors

date        value  factor
2016-03-01  3      Validated Case
2016-03-01  2      Unvalidated Case
2016-03-03  4      Validated Case
2016-03-03  1      Unvalidated Case
2016-03-05  5      Validated Case
2016-03-06  1      Unvalidated Case
2016-03-07  2      Validated Case
2016-03-07  1      Unvalidated Case
2016-03-14  4      Validated Case
2016-03-15  1      Validated Case
2016-03-16  7      Validated Case
2016-03-16  3      Unvalidated Case
2016-03-27  3      Unvalidated Case
2016-03-28  1      Validated Case
2016-03-28  1      Unvalidated Case
2016-03-30  4      Validated Case
2016-03-31  2      Validated Case
2016-04-01  3      Validated Case
2016-04-03  1      Validated Case
2016-04-04  2      Validated Case
2016-04-04  4      Unvalidated Case
2016-04-07  6      Unvalidated Case
2016-04-08  9      Validated Case
2016-04-09  7      Validated Case
2016-04-09  8      Unvalidated Case
2016-04-13  2      Validated Case
2016-04-14  1      Validated Case
2016-04-15  3      Validated Case
2016-04-16  6      Validated Case
2016-04-17  4      Validated Case
2016-04-18  8      Validated Case
2016-04-18  9      Unvalidated Case
2016-04-27  11     Validated Case
2016-04-27  2      Unvalidated Case
2016-04-29  3      Validated Case
2016-04-30  6      Validated Case

EpiCurve(DF,
         date = "date",
         freq = "value",
         cutvar = "factor",
         period = "day",
         ylabel = "Number of cases",
         xlabel = sprintf("From %s to %s", min(DF$date), max(DF$date)),
         title = "Epidemic Curve",
         note = "Daily epidemic curve")

## Day
## 'data.frame': 70 obs. of 4 variables:
##  $ Freq: num 3 2 0 4 1 0 5 1 2 1 ...
##  $ Cut : Factor w/ 2 levels "Unvalidated Case",..: 2 1 NA 2 1 NA 2 1 2 1 ...
##  $ Day : chr "2016-03-01" "2016-03-01" "2016-03-02" "2016-03-03" ...
##  $ Date: Date, format: "2016-03-01" "2016-03-01" ...

[Figure: “Epidemic Curve”, daily aggregated cases colored by factor (legend: Validated Case, Unvalidated Case). x-axis label: From 2016-03-01 to 2016-04-30, weekly ticks from 02-29 to 05-02; y-axis: Number of cases, 0 to 16; note under the graph: Daily epidemic curve.]

Weekly

Without factors

date      value
2016-W10  3
2016-W11  4
2016-W12  1
2016-W13  5
2016-W14  2
2016-W15  7
2016-W16  1
2016-W17  4
2016-W18  3

# The x-axis label is written in French: "Du ... au ..." means "From ... to ...".
EpiCurve(DF,
         date = "date",
         freq = "value",
         period = "week",
         colors = c("#990000"),
         ylabel = "Number of cases",
         xlabel = sprintf("Du %s au %s", min(DF$date), max(DF$date)),
         title = "Epidemic Curve\n")

[Figure: “Epidemic Curve”, weekly aggregated cases. x-axis label: Du 2016-W10 au 2016-W18, one tick per week from 2016-W10 to 2016-W18; y-axis: Number of cases, 0 to 7.]

With factors

date      value  factor
2016-W10  3      Valid
2016-W10  2      Invalid
2016-W11  4      Valid
2016-W12  1      Invalid
2016-W13  5      Valid
2016-W14  2      Valid
2016-W14  1      Invalid
2016-W15  7      Valid
2016-W15  3      Invalid
2016-W17  1      Valid
2016-W17  1      Invalid
2016-W18  4      Valid
2016-W18  2      Invalid
2016-W20  3      Valid

EpiCurve(DF,
         date = "date",
         freq = "value",
         period = "week",
         cutvar = "factor",
         colors = c("Blue", "Red"),
         ylabel = "Cases",
         xlabel = sprintf("From %s to %s", min(DF$date), max(DF$date)),
         title = "Epidemic Curve\n")

[Figure: “Epidemic Curve”, weekly aggregated cases colored by factor (legend: Valid, Invalid). x-axis label: From 2016-W10 to 2016-W20, one tick per week from 2016-W10 to 2016-W20; y-axis: Cases, 0 to 10.]

Monthly

Without factors

date     value
2016-02  3
2016-03  4
2016-04  1
2016-05  5
2016-07  2
2016-08  7
2016-10  1
2016-11  4
2016-12  3

EpiCurve(DF,
         date = "date",
         freq = "value",
         period = "month",
         ylabel = "Number of cases",
         xlabel = sprintf("From %s to %s", min(DF$date), max(DF$date)),
         title = "Epidemic Curve\n")

[Figure: “Epidemic Curve”, monthly aggregated cases. x-axis label: From 2016-02 to 2016-12, one tick per month from 2016-02 to 2016-12; y-axis: Number of cases, 0 to 7.]
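In the monthly data above, 2016-06 and 2016-09 have no row, yet the axis of the plot runs through every month from 2016-02 to 2016-12. The base R sketch below shows one way such gaps can be made explicit. It assumes nothing about the package internals, and the variable names are arbitrary.

```r
# Monthly aggregated data as in the example (no rows for 2016-06 and 2016-09).
DF <- data.frame(
  date  = c("2016-02", "2016-03", "2016-04", "2016-05", "2016-07",
            "2016-08", "2016-10", "2016-11", "2016-12"),
  value = c(3, 4, 1, 5, 2, 7, 1, 4, 3),
  stringsAsFactors = FALSE
)

# "YYYY-MM" is not a complete date, so append a day before parsing.
first_days <- as.Date(paste0(DF$date, "-01"))

# Every month between the first and the last one.
months <- format(seq(min(first_days), max(first_days), by = "month"), "%Y-%m")

# Months missing from the data, then a complete series with zeros.
missing_months <- setdiff(months, DF$date)
full <- merge(data.frame(date = months, stringsAsFactors = FALSE), DF,
              by = "date", all.x = TRUE)
full$value[is.na(full$value)] <- 0

print(missing_months)
print(full)
```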


With factors

date     value  factor
2016-02  3      Valid
2016-02  2      Invalid
2016-03  4      Valid
2016-04  1      Invalid
2016-05  5      Valid
2016-06  2      Valid
2016-06  1      Invalid
2016-07  7      Valid
2016-07  3      Invalid
2016-09  1      Valid
2016-09  1      Invalid
2016-11  4      Valid
2016-11  2      Invalid
2016-12  3      Valid

EpiCurve(DF,
         date = "date",
         freq = "value",
         cutvar = "factor",
         period = "month",
         ylabel = "Number of cases",
         xlabel = sprintf("From %s to %s", min(DF$date), max(DF$date)),
         title = "Epidemic Curve\n")

[Figure: “Epidemic Curve”, monthly aggregated cases colored by factor (legend: Valid, Invalid). x-axis label: From 2016-02 to 2016-12, one tick per month from 2016-02 to 2016-12; y-axis: Number of cases, 0 to 10.]

Converted period (aggregated cases)

“day” to “week”

date        value
2016-03-01  5
2016-03-03  5
2016-03-05  5
2016-03-06  1
2016-03-07  3
2016-03-14  4
2016-03-15  1
2016-03-16  10
2016-03-27  3
2016-03-28  2
2016-03-30  4
2016-03-31  2
2016-04-01  3
2016-04-03  1
2016-04-04  6
2016-04-07  6
2016-04-08  9
2016-04-09  15
2016-04-13  2
2016-04-14  1
2016-04-15  3
2016-04-16  6
2016-04-17  4
2016-04-18  17
2016-04-27  13
2016-04-29  3
2016-04-30  6

EpiCurve(DF,
         date = "date",
         freq = "value",
         period = "day",
         to.period = "week",
         ylabel = "Number of cases",
         xlabel = sprintf("From %s to %s", min(DF$date), max(DF$date)),
         title = "Epidemic Curve",
         note = "Daily epidemic curve")

[Figure: “Epidemic Curve”, daily cases aggregated by week. x-axis label: From 2016-03-01 to 2016-04-30, one tick per week from 2016-W09 to 2016-W17; y-axis: Number of cases, 0 to 35; note under the graph: Daily epidemic curve.]
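The effect of to.period = "week" can be checked by hand. The sketch below sums the daily counts of the table above by ISO 8601 week using base R only. It is not the package's code (the package lists ISOweek among its dependencies); it relies on the %G and %V output formats of format(), which give the ISO year and the ISO week number.

```r
# Daily aggregated counts from the example.
DF <- data.frame(
  date = c("2016-03-01", "2016-03-03", "2016-03-05", "2016-03-06",
           "2016-03-07", "2016-03-14", "2016-03-15", "2016-03-16",
           "2016-03-27", "2016-03-28", "2016-03-30", "2016-03-31",
           "2016-04-01", "2016-04-03", "2016-04-04", "2016-04-07",
           "2016-04-08", "2016-04-09", "2016-04-13", "2016-04-14",
           "2016-04-15", "2016-04-16", "2016-04-17", "2016-04-18",
           "2016-04-27", "2016-04-29", "2016-04-30"),
  value = c(5, 5, 5, 1, 3, 4, 1, 10, 3, 2, 4, 2, 3, 1,
            6, 6, 9, 15, 2, 1, 3, 6, 4, 17, 13, 3, 6),
  stringsAsFactors = FALSE
)

# ISO 8601 week label (e.g. "2016-W09") for each date.
DF$week <- format(as.Date(DF$date), "%G-W%V")

# Total number of cases per ISO week.
weekly <- tapply(DF$value, DF$week, sum)
print(weekly)
```

Replacing the format string with "%Y-%m" gives monthly totals in the same way.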


“day” to “month”

EpiCurve(DF,
         date = "date",
         freq = "value",
         period = "day",
         to.period = "month",
         ylabel = "Number of cases",
         xlabel = sprintf("From %s to %s", min(DF$date), max(DF$date)),
         title = "Epidemic Curve",
         note = "Daily epidemic curve")

[Figure: “Epidemic Curve”, daily cases aggregated by month. x-axis label: From 2016-03-01 to 2016-04-30, ticks 2016-03 and 2016-04; y-axis: Number of cases, 0 to 95; note under the graph: Daily epidemic curve.]

“week” to “month”

date      value
2016-W10  3
2016-W11  4
2016-W12  1
2016-W13  5
2016-W14  2
2016-W15  7
2016-W16  1
2016-W17  4
2016-W18  3

# The x-axis label is written in French: "Du ... au ..." means "From ... to ...".
EpiCurve(DF,
         date = "date",
         freq = "value",
         period = "week",
         to.period = "month",
         colors = c("#990000"),
         ylabel = "Number of cases",
         xlabel = sprintf("Du %s au %s", min(DF$date), max(DF$date)),
         title = "Epidemic Curve\n")

[Figure: “Epidemic Curve”, weekly cases aggregated by month. x-axis label: Du 2016-W10 au 2016-W18, ticks 2016-03, 2016-04 and 2016-05; y-axis: Number of cases, 0 to 14.]
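Converting weeks to months requires deciding which month a week belongs to, since a week can straddle two months. This document does not state the rule EpiCurve applies, so the sketch below makes an explicit assumption: a week is assigned to the month containing its Monday. The helper iso_week_monday is hypothetical and written in base R; it uses the fact that 4 January always falls in ISO week 1.

```r
# Hypothetical helper: Monday of an ISO 8601 week given as "YYYY-Wnn".
iso_week_monday <- function(label) {
  year <- as.integer(substr(label, 1, 4))
  week <- as.integer(substr(label, 7, 8))
  jan4 <- as.Date(sprintf("%d-01-04", year))
  # ISO weekday of 4 January: 1 = Monday, ..., 7 = Sunday.
  wday <- as.integer(format(jan4, "%u"))
  jan4 - (wday - 1) + (week - 1) * 7
}

# Weekly aggregated counts from the example.
DF <- data.frame(
  date  = sprintf("2016-W%02d", 10:18),
  value = c(3, 4, 1, 5, 2, 7, 1, 4, 3),
  stringsAsFactors = FALSE
)

# Assumption: a week counts toward the month that contains its Monday.
DF$month <- format(iso_week_monday(DF$date), "%Y-%m")

monthly <- tapply(DF$value, DF$month, sum)
print(monthly)
```

Under this assumption the three monthly totals stay within the 0 to 14 range of the y axis in the figure, but a different assignment rule (for example, the month containing the Thursday of the week) could give different totals on other data.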


“week” to “month” with factors

date      value  factor
2016-W10  3      Valid
2016-W10  2      Invalid
2016-W11  4      Valid
2016-W12  1      Invalid
2016-W13  5      Valid
2016-W14  2      Valid
2016-W14  1      Invalid
2016-W15  7      Valid
2016-W15  3      Invalid
2016-W17  1      Valid
2016-W17  1      Invalid
2016-W18  4      Valid
2016-W18  2      Invalid
2016-W20  3      Valid

EpiCurve(DF,
         date = "date",
         freq = "value",
         period = "week",
         to.period = "month",
         cutvar = "factor",
         colors = c("Blue", "Red"),
         ylabel = "Cases",
         xlabel = sprintf("From %s to %s", min(DF$date), max(DF$date)),
         title = "Epidemic Curve\n")

[Figure: “Epidemic Curve”, weekly cases aggregated by month and colored by factor (legend: Valid, Invalid). x-axis label: From 2016-W10 to 2016-W20, ticks 2016-03, 2016-04 and 2016-05; y-axis: Cases, 0 to 14.]
