Upload
others
View
2
Download
0
Embed Size (px)
Citation preview
Package EpiCurve
Description
EpiCurve allows the user to create epidemic curves from case-based and aggregated data.
Details
The EpiCurve function creates a graph of number of cases by time of illness (for example date of onset).Each case is represented by a square. EpiCurve allows the time unit for the x-axis to have hourly, daily,weekly or monthly intervals. The hourly interval can be split into 1, 2, 3, 4, 6, 8 or 12 hour time units.
EpiCurve works on both case-based (one case per line) or aggregated data (where there is a count of casesfor each date). With aggregated data, you need to specify the variable for the count of cases in the “freq”parameter.
With case-based (non-aggregated data), the date format for EpiCurve can be:
• hourly: YYYY-MM-DD HH:MM or YYYY-mm-DD HH:MM:SS• daily: YYYY-MM-DD• monthly: YYYY-MM
If the date format is daily or hourly, you can change and force the period for aggregation on the graph withthe parameter “period” setted with “day”, “week” or “month”.
For aggregated data, the date formats can be as above, but they can also be weekly: YYYY-Wnn. Here, weneed to specify how the data are aggregated in the parameter “period”. If we want to further aggregate theaggregated data for the epidemic curve (e.g. move from daily aggregated cases to weekly aggregated cases),we can specify the parameter “to.period”.
When the date format is hourly, the dataset is considered case-based, whether the “freq” parameter of theEpiCurve function is supplied or not.
The EpiCurve function
EpiCurve (
x,date = NULL,freq = NULL,cutvar = NULL,period = NULL,to.period = NULL,split = 1,cutorder = NULL,colors = NULL,title = NULL,xlabel = NULL,ylabel = NULL,note = NULL,square = TRUE
)
2
Arguments
Parameter Descriptionx data.frame with at least one column with datesdate character, name of date columnfreq character, name of a column with a value to displaycutvar character, name of a column with factorsperiod character, c(“hour”, “day”,“week”, “month”)to.period character, Convert date period to another period only for aggregated data. If period is
“day”, to.period can be “week” or “month”. If period is “week”, to.period can be“month”.
split integer, c(1,2,3,4,6,8,12) value for hourly splitcutorder character vector of factorscolors character, vector of colorsstitle character, title of the plotxlabel character, label for x axisylabel character, label for y axisnote character, add a note under the graphsquare boolean, If TRUE (default) squares are used to plot the curve, else if the number of cases
is too hight please use square = FALSE.
Depends
ggplot2, dplyr, ISOweek, scales, timeDate
3
Plot non-aggregated cases
Daily - non-aggregated cases
DF <- read.csv("daily_unaggregated_cases.csv", stringsAsFactors=FALSE)kable(head(DF, 12))
UTS V1 V22016-10-26 7.20 1882016-11-02 7.03 952016-11-03 5.14 1602016-11-05 9.89 1652016-11-05 9.69 1092016-11-05 4.15 1542016-11-05 4.97 1442016-11-06 8.97 1872016-11-06 4.45 1202016-11-06 6.60 1162016-11-07 7.68 1412016-11-07 10.08 126
EpiCurve(DF,date = "UTS", period = "day", colors ="#9900ef",xlabel=sprintf("From %s to %s", min(DF$UTS), max(DF$UTS)))
## Day## ’data.frame’: 57 obs. of 4 variables:## $ Freq: num 1 0 0 0 0 0 0 1 1 0 ...## $ Cut : chr "1 cas" NA NA NA ...## $ Day : chr "2016-10-26" "2016-10-27" "2016-10-28" "2016-10-29" ...## $ Date: Date, format: "2016-10-26" "2016-10-27" ...
4
02468
1012
10−
24
10−
31
11−
07
11−
14
11−
21
11−
28
12−
05
12−
12
12−
19
From 2016−10−26 to 2016−12−21
With no squaresEpiCurve(DF,
date = "UTS",period = "day",colors ="#9900ef",xlabel=sprintf("From %s to %s", min(DF$UTS), max(DF$UTS)),square = F)
## Day## ’data.frame’: 57 obs. of 4 variables:## $ Freq: num 1 0 0 0 0 0 0 1 1 0 ...## $ Cut : chr "1 cas" NA NA NA ...## $ Day : chr "2016-10-26" "2016-10-27" "2016-10-28" "2016-10-29" ...## $ Date: Date, format: "2016-10-26" "2016-10-27" ...
5
0
2
4
6
8
10
1210
−24
10−
31
11−
07
11−
14
11−
21
11−
28
12−
05
12−
12
12−
19
From 2016−10−26 to 2016−12−21
6
Hourly - non-aggregated cases
DF <- read.csv("hourly_unaggregated_cases.csv", stringsAsFactors=FALSE)kable(head(DF, 12))
UTS X1 X22017-04-12 16:31 5.17 1662017-04-12 17:35 8.69 1012017-04-12 17:38 6.81 1402017-04-12 18:06 4.95 1202017-04-12 18:36 10.92 1892017-04-12 18:38 7.02 1852017-04-12 18:43 8.03 1752017-04-12 19:05 6.39 1022017-04-12 19:11 4.61 1262017-04-12 19:24 6.36 1882017-04-12 19:37 7.80 1122017-04-12 19:41 6.18 123
EpiCurve(DF,date = "UTS",period = "hour",split = 1,colors ="#339933",ylabel="Number of cases",xlabel=sprintf("From %s to %s", min(DF$UTS), max(DF$UTS)))
0
2
4
6
8
10
12
14
16
18
12 0
0:00
−00
:59
12 0
1:00
−01
:59
12 0
2:00
−02
:59
12 0
3:00
−03
:59
12 0
4:00
−04
:59
12 0
5:00
−05
:59
12 0
6:00
−06
:59
12 0
7:00
−07
:59
12 0
8:00
−08
:59
12 0
9:00
−09
:59
12 1
0:00
−10
:59
12 1
1:00
−11
:59
12 1
2:00
−12
:59
12 1
3:00
−13
:59
12 1
4:00
−14
:59
12 1
5:00
−15
:59
12 1
6:00
−16
:59
12 1
7:00
−17
:59
12 1
8:00
−18
:59
12 1
9:00
−19
:59
12 2
0:00
−20
:59
12 2
1:00
−21
:59
12 2
2:00
−22
:59
12 2
3:00
−23
:59
13 0
0:00
−00
:59
13 0
1:00
−01
:59
13 0
2:00
−02
:59
13 0
3:00
−03
:59
13 0
4:00
−04
:59
13 0
5:00
−05
:59
13 0
6:00
−06
:59
13 0
7:00
−07
:59
13 0
8:00
−08
:59
13 0
9:00
−09
:59
13 1
0:00
−10
:59
13 1
1:00
−11
:59
13 1
2:00
−12
:59
13 1
3:00
−13
:59
13 1
4:00
−14
:59
13 1
5:00
−15
:59
13 1
6:00
−16
:59
13 1
7:00
−17
:59
13 1
8:00
−18
:59
13 1
9:00
−19
:59
13 2
0:00
−20
:59
13 2
1:00
−21
:59
13 2
2:00
−22
:59
13 2
3:00
−23
:59
From 2017−04−12 16:31 to 2017−04−13 03:23
Num
ber
of c
ases
7
Hourly - non-aggregated cases with factors
DF <- read.csv("hourly_unaggregated_cases_factors.csv", stringsAsFactors=FALSE)kable(head(DF, 12))
UTS X1 X2 Confirmed2017-04-12 16:31 5.17 166 YES2017-04-12 17:35 8.69 101 YES2017-04-12 17:38 6.81 140 NO2017-04-12 18:06 4.95 120 NO2017-04-12 18:36 10.92 189 NO2017-04-12 18:38 7.02 185 YES2017-04-12 18:43 8.03 175 NO2017-04-12 19:05 6.39 102 NO2017-04-12 19:11 4.61 126 NO2017-04-12 19:24 6.36 188 YES2017-04-12 19:37 7.80 112 NO2017-04-12 19:41 6.18 123 NO
EpiCurve(DF,date = "UTS",period = "hour",split = 1,cutvar = "Confirmed",colors = c("#339933","#eebb00"),xlabel=sprintf("From %s to %s", min(DF$UTS), max(DF$UTS)))
02468
1012141618
12 0
0:00
−00
:59
12 0
1:00
−01
:59
12 0
2:00
−02
:59
12 0
3:00
−03
:59
12 0
4:00
−04
:59
12 0
5:00
−05
:59
12 0
6:00
−06
:59
12 0
7:00
−07
:59
12 0
8:00
−08
:59
12 0
9:00
−09
:59
12 1
0:00
−10
:59
12 1
1:00
−11
:59
12 1
2:00
−12
:59
12 1
3:00
−13
:59
12 1
4:00
−14
:59
12 1
5:00
−15
:59
12 1
6:00
−16
:59
12 1
7:00
−17
:59
12 1
8:00
−18
:59
12 1
9:00
−19
:59
12 2
0:00
−20
:59
12 2
1:00
−21
:59
12 2
2:00
−22
:59
12 2
3:00
−23
:59
13 0
0:00
−00
:59
13 0
1:00
−01
:59
13 0
2:00
−02
:59
13 0
3:00
−03
:59
13 0
4:00
−04
:59
13 0
5:00
−05
:59
13 0
6:00
−06
:59
13 0
7:00
−07
:59
13 0
8:00
−08
:59
13 0
9:00
−09
:59
13 1
0:00
−10
:59
13 1
1:00
−11
:59
13 1
2:00
−12
:59
13 1
3:00
−13
:59
13 1
4:00
−14
:59
13 1
5:00
−15
:59
13 1
6:00
−16
:59
13 1
7:00
−17
:59
13 1
8:00
−18
:59
13 1
9:00
−19
:59
13 2
0:00
−20
:59
13 2
1:00
−21
:59
13 2
2:00
−22
:59
13 2
3:00
−23
:59
From 2017−04−12 16:31 to 2017−04−13 03:23
Confirmed
YES
NO
8
With no squaresEpiCurve(DF,
date = "UTS",period = "hour",split = 1,cutvar = "Confirmed",colors = c("#339933","#eebb00"),xlabel=sprintf("From %s to %s", min(DF$UTS), max(DF$UTS)),square = FALSE)
0
2
4
6
8
10
12
14
16
18
12 0
0:00
−00
:59
12 0
1:00
−01
:59
12 0
2:00
−02
:59
12 0
3:00
−03
:59
12 0
4:00
−04
:59
12 0
5:00
−05
:59
12 0
6:00
−06
:59
12 0
7:00
−07
:59
12 0
8:00
−08
:59
12 0
9:00
−09
:59
12 1
0:00
−10
:59
12 1
1:00
−11
:59
12 1
2:00
−12
:59
12 1
3:00
−13
:59
12 1
4:00
−14
:59
12 1
5:00
−15
:59
12 1
6:00
−16
:59
12 1
7:00
−17
:59
12 1
8:00
−18
:59
12 1
9:00
−19
:59
12 2
0:00
−20
:59
12 2
1:00
−21
:59
12 2
2:00
−22
:59
12 2
3:00
−23
:59
13 0
0:00
−00
:59
13 0
1:00
−01
:59
13 0
2:00
−02
:59
13 0
3:00
−03
:59
13 0
4:00
−04
:59
13 0
5:00
−05
:59
13 0
6:00
−06
:59
13 0
7:00
−07
:59
13 0
8:00
−08
:59
13 0
9:00
−09
:59
13 1
0:00
−10
:59
13 1
1:00
−11
:59
13 1
2:00
−12
:59
13 1
3:00
−13
:59
13 1
4:00
−14
:59
13 1
5:00
−15
:59
13 1
6:00
−16
:59
13 1
7:00
−17
:59
13 1
8:00
−18
:59
13 1
9:00
−19
:59
13 2
0:00
−20
:59
13 2
1:00
−21
:59
13 2
2:00
−22
:59
13 2
3:00
−23
:59
From 2017−04−12 16:31 to 2017−04−13 03:23
Confirmed
YES
NO
9
Plot aggregated data
Daily
Without factors
date value2016-03-01 52016-03-03 52016-03-05 52016-03-06 12016-03-07 32016-03-14 42016-03-15 12016-03-16 102016-03-27 32016-03-28 22016-03-30 42016-03-31 22016-04-01 32016-04-03 12016-04-04 62016-04-07 62016-04-08 92016-04-09 152016-04-13 22016-04-14 12016-04-15 32016-04-16 62016-04-17 42016-04-18 172016-04-27 132016-04-29 32016-04-30 6
10
EpiCurve(DF,date = "date",freq = "value",period = "day",ylabel="Number of cases",xlabel=sprintf("From %s to %s", min(DF$date), max(DF$date)),title = "Epidemic Curve",note = "Daily epidemic curve")
## Day## ’data.frame’: 61 obs. of 4 variables:## $ Freq: num 5 0 5 0 5 1 3 0 0 0 ...## $ Cut : chr "1 cas" NA "1 cas" NA ...## $ Day : chr "2016-03-01" "2016-03-02" "2016-03-03" "2016-03-04" ...## $ Date: Date, format: "2016-03-01" "2016-03-02" ...
02468
10121416
02−
29
03−
07
03−
14
03−
21
03−
28
04−
04
04−
11
04−
18
04−
25
05−
02
From 2016−03−01 to 2016−04−30
Daily epidemic curve
Num
ber
of c
ases
Epidemic Curve
11
With factors
date value factor2016-03-01 3 Validated Case2016-03-01 2 Unvalidated Case2016-03-03 4 Validated Case2016-03-03 1 Unvalidated Case2016-03-05 5 Validated Case2016-03-06 1 Unvalidated Case2016-03-07 2 Validated Case2016-03-07 1 Unvalidated Case2016-03-14 4 Validated Case2016-03-15 1 Validated Case2016-03-16 7 Validated Case2016-03-16 3 Unvalidated Case2016-03-27 3 Unvalidated Case2016-03-28 1 Validated Case2016-03-28 1 Unvalidated Case2016-03-30 4 Validated Case2016-03-31 2 Validated Case2016-04-01 3 Validated Case2016-04-03 1 Validated Case2016-04-04 2 Validated Case2016-04-04 4 Unvalidated Case2016-04-07 6 Unvalidated Case2016-04-08 9 Validated Case2016-04-09 7 Validated Case2016-04-09 8 Unvalidated Case2016-04-13 2 Validated Case2016-04-14 1 Validated Case2016-04-15 3 Validated Case2016-04-16 6 Validated Case2016-04-17 4 Validated Case2016-04-18 8 Validated Case2016-04-18 9 Unvalidated Case2016-04-27 11 Validated Case2016-04-27 2 Unvalidated Case2016-04-29 3 Validated Case2016-04-30 6 Validated Case
12
EpiCurve(DF,date = "date",freq = "value",cutvar = "factor",period = "day",ylabel="Number of cases",xlabel=sprintf("From %s to %s", min(DF$date), max(DF$date)),title = "Epidemic Curve",note = "Daily epidemic curve")
## Day## ’data.frame’: 70 obs. of 4 variables:## $ Freq: num 3 2 0 4 1 0 5 1 2 1 ...## $ Cut : Factor w/ 2 levels "Unvalidated Case",..: 2 1 NA 2 1 NA 2 1 2 1 ...## $ Day : chr "2016-03-01" "2016-03-01" "2016-03-02" "2016-03-03" ...## $ Date: Date, format: "2016-03-01" "2016-03-01" ...
02468
10121416
02−
29
03−
07
03−
14
03−
21
03−
28
04−
04
04−
11
04−
18
04−
25
05−
02
From 2016−03−01 to 2016−04−30
Daily epidemic curve
Num
ber
of c
ases
factor
Validated Case
Unvalidated Case
Epidemic Curve
13
Weekly
Without factors
date value2016-W10 32016-W11 42016-W12 12016-W13 52016-W14 22016-W15 72016-W16 12016-W17 42016-W18 3
EpiCurve(DF,date = "date",freq = "value",period = "week",colors=c("#990000"),ylabel="Number of cases",xlabel=sprintf("Du %s au %s", min(DF$date), max(DF$date)),title = "Epidemic Curve\n")
0
1
2
3
4
5
6
7
2016
−W
10
2016
−W
11
2016
−W
12
2016
−W
13
2016
−W
14
2016
−W
15
2016
−W
16
2016
−W
17
2016
−W
18
Du 2016−W10 au 2016−W18
Num
ber
of c
ases
Epidemic Curve
14
With factors
date value factor2016-W10 3 Valid2016-W10 2 Invalid2016-W11 4 Valid2016-W12 1 Invalid2016-W13 5 Valid2016-W14 2 Valid2016-W14 1 Invalid2016-W15 7 Valid2016-W15 3 Invalid2016-W17 1 Valid2016-W17 1 Invalid2016-W18 4 Valid2016-W18 2 Invalid2016-W20 3 Valid
EpiCurve(DF,date = "date",freq = "value",period = "week",cutvar = "factor",colors=c("Blue", "Red"),ylabel="Cases",xlabel=sprintf("From %s to %s", min(DF$date), max(DF$date)),title = "Epidemic Curve\n")
0
2
4
6
8
10
2016
−W
10
2016
−W
11
2016
−W
12
2016
−W
13
2016
−W
14
2016
−W
15
2016
−W
16
2016
−W
17
2016
−W
18
2016
−W
19
2016
−W
20
From 2016−W10 to 2016−W20
Cas
es
factor
Valid
Invalid
Epidemic Curve
15
Monthly
Without factors
date value2016-02 32016-03 42016-04 12016-05 52016-07 22016-08 72016-10 12016-11 42016-12 3
EpiCurve(DF,date = "date",freq = "value",period = "month",ylabel="Number of cases",xlabel=sprintf("From %s to %s", min(DF$date), max(DF$date)),title = "Epidemic Curve\n")
0
1
2
3
4
5
6
7
2016
−02
2016
−03
2016
−04
2016
−05
2016
−06
2016
−07
2016
−08
2016
−09
2016
−10
2016
−11
2016
−12
From 2016−02 to 2016−12
Num
ber
of c
ases
Epidemic Curve
16
With factors
date value factor2016-02 3 Valid2016-02 2 Invalid2016-03 4 Valid2016-04 1 Invalid2016-05 5 Valid2016-06 2 Valid2016-06 1 Invalid2016-07 7 Valid2016-07 3 Invalid2016-09 1 Valid2016-09 1 Invalid2016-11 4 Valid2016-11 2 Invalid2016-12 3 Valid
EpiCurve(DF,date = "date",freq = "value",cutvar = "factor",period = "month",ylabel="Number of cases",xlabel=sprintf("From %s to %s", min(DF$date), max(DF$date)),title = "Epidemic Curve\n")
0
2
4
6
8
10
2016
−02
2016
−03
2016
−04
2016
−05
2016
−06
2016
−07
2016
−08
2016
−09
2016
−10
2016
−11
2016
−12
From 2016−02 to 2016−12
Num
ber
of c
ases
factor
Valid
Invalid
Epidemic Curve
17
Converted period (aggragated cases)
“day” to “week”
date value2016-03-01 52016-03-03 52016-03-05 52016-03-06 12016-03-07 32016-03-14 42016-03-15 12016-03-16 102016-03-27 32016-03-28 22016-03-30 42016-03-31 22016-04-01 32016-04-03 12016-04-04 62016-04-07 62016-04-08 92016-04-09 152016-04-13 22016-04-14 12016-04-15 32016-04-16 62016-04-17 42016-04-18 172016-04-27 132016-04-29 32016-04-30 6
EpiCurve(DF,date = "date",freq = "value",period = "day",to.period = "week",ylabel="Number of cases",xlabel=sprintf("From %s to %s", min(DF$date), max(DF$date)),title = "Epidemic Curve",note = "Daily epidemic curve")
18
0
5
10
15
20
25
30
35
2016
−W
0920
16−
W10
2016
−W
1120
16−
W12
2016
−W
1320
16−
W14
2016
−W
1520
16−
W16
2016
−W
17
From 2016−03−01 to 2016−04−30
Daily epidemic curve
Num
ber
of c
ases
Epidemic Curve
19
“day” to “month”
EpiCurve(DF,date = "date",freq = "value",period = "day",to.period = "month",ylabel="Number of cases",xlabel=sprintf("From %s o %s", min(DF$date), max(DF$date)),title = "Epidemic Curve",note = "Daily epidemic curve")
20
0
5
10
15
20
25
30
35
40
45
50
55
60
65
70
75
80
85
90
95
2016
−03
2016
−04
From 2016−03−01 o 2016−04−30
Daily epidemic curve
Num
ber
of c
ases
Epidemic Curve
21
“week” to “month”
date value2016-W10 32016-W11 42016-W12 12016-W13 52016-W14 22016-W15 72016-W16 12016-W17 42016-W18 3
EpiCurve(DF,date = "date",freq = "value",period = "week",to.period = "month",colors=c("#990000"),ylabel="Number of cases",xlabel=sprintf("Du %s au %s", min(DF$date), max(DF$date)),title = "Epidemic Curve\n")
0
2
4
6
8
10
12
14
2016
−03
2016
−04
2016
−05
Du 2016−W10 au 2016−W18
Num
ber
of c
ases
Epidemic Curve
22
“week” to “month” with factors
date value factor2016-W10 3 Valid2016-W10 2 Invalid2016-W11 4 Valid2016-W12 1 Invalid2016-W13 5 Valid2016-W14 2 Valid2016-W14 1 Invalid2016-W15 7 Valid2016-W15 3 Invalid2016-W17 1 Valid2016-W17 1 Invalid2016-W18 4 Valid2016-W18 2 Invalid2016-W20 3 Valid
EpiCurve(DF,date = "date",freq = "value",period = "week",to.period = "month",cutvar = "factor",colors=c("Blue", "Red"),ylabel="Cases",xlabel=sprintf("From %s to %s", min(DF$date), max(DF$date)),title = "Epidemic Curve\n")
23
0
2
4
6
8
10
12
14
2016
−03
2016
−04
2016
−05
From 2016−W10 to 2016−W20
Cas
es
factor
Valid
Invalid
Epidemic Curve
24