38
Outline Objectives Motivation Development of the ACF Plot Final Result Closing Denver User Group Creating an Autocorrelation Plot in ggplot2 Peter DeWitt [email protected] 18 January 2011 Peter DeWitt [email protected] Creating an Autocorrelation Plot in ggplot2

Denver User Group - Meetup3 Development of the ACF Plot The Data Set For One Variable A More General ACF Plot Function 4 Final Result Finished qacf() Function Examples More Complex

  • Upload
    others

  • View
    1

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Denver User Group - Meetup3 Development of the ACF Plot The Data Set For One Variable A More General ACF Plot Function 4 Final Result Finished qacf() Function Examples More Complex

Outline Objectives Motivation Development of the ACF Plot Final Result Closing

Denver User Group

Creating an Autocorrelation Plot in ggplot2

Peter [email protected]

18 January 2011

Peter DeWitt [email protected] Creating an Autocorrelation Plot in ggplot2

Page 2: Denver User Group - Meetup3 Development of the ACF Plot The Data Set For One Variable A More General ACF Plot Function 4 Final Result Finished qacf() Function Examples More Complex

Outline Objectives Motivation Development of the ACF Plot Final Result Closing

1 Objectives

2 Motivation

3 Development of the ACF PlotThe Data SetFor One VariableA More General ACF Plot Function

4 Final ResultFinished qacf() FunctionExamplesMore Complex Examples

3 Series Example4 Series Example

One Variable Plots

5 ClosingThings to doSome Resources for ggplot2 and Sweave

Peter DeWitt [email protected] Creating an Autocorrelation Plot in ggplot2

Page 3: Denver User Group - Meetup3 Development of the ACF Plot The Data Set For One Variable A More General ACF Plot Function 4 Final Result Finished qacf() Function Examples More Complex

Outline Objectives Motivation Development of the ACF Plot Final Result Closing

Show the thought process and coding to make a customggplot2 style ACF plot.

Show some of the errors and solutions to them.Give an overview of the construction of the plots using layers.Show how to use the base functions to generate the dataneeded for the ggplot2 graphics.

Provide an example of creating a LATEX, in this case a Beamer,document using Sweave.

Peter DeWitt [email protected] Creating an Autocorrelation Plot in ggplot2

Page 4: Denver User Group - Meetup3 Development of the ACF Plot The Data Set For One Variable A More General ACF Plot Function 4 Final Result Finished qacf() Function Examples More Complex

Outline Objectives Motivation Development of the ACF Plot Final Result Closing

Need an Autocorrelation Plot

While working on a homework assignment for an introductoryBayesian course I needed an autocorrelation plot toinvestigate the convergence of a MCMC algorithm.

I had used ggplot2 for creating all the plots in the report, butwhen looking for the autocorrelation in the simulatedparameter values I found that there was no option in theggplot2 library for ACF Plots.

Not wanting to use the base acf() plot I though I shouldmake my own using ggplot2.

The function in this talk shows the one I used for theassignment and then a more generalized function I started towrite for future use.

Peter DeWitt [email protected] Creating an Autocorrelation Plot in ggplot2

Page 5: Denver User Group - Meetup3 Development of the ACF Plot The Data Set For One Variable A More General ACF Plot Function 4 Final Result Finished qacf() Function Examples More Complex

Outline Objectives Motivation Development of the ACF Plot Final Result Closing

The Data Set

The Data Set and some Syntax

The goal of the assignment was to find the posterior distributionfor two parameters θ and τ . The simulated values from the MCMCsimulations where stored in a data frame called params.

> dim(params)

[1] 25001 2

> head(params)

theta tau

1 0.01000000 1.0000000

2 0.01238948 0.7005268

3 0.01180436 0.6339757

4 0.01200773 0.7051600

5 0.01648895 0.5985906

6 0.01013832 0.6795824

Peter DeWitt [email protected] Creating an Autocorrelation Plot in ggplot2

Page 6: Denver User Group - Meetup3 Development of the ACF Plot The Data Set For One Variable A More General ACF Plot Function 4 Final Result Finished qacf() Function Examples More Complex

Outline Objectives Motivation Development of the ACF Plot Final Result Closing

For One Variable

> acf(with(params, theta))

0 10 20 30 40

0.0

0.2

0.4

0.6

0.8

1.0

Lag

AC

F

Series with(params, theta)

> acf(with(params, tau))

0 10 20 30 400.

00.

20.

40.

60.

81.

0

Lag

AC

F

Series with(params, tau)

Small note: the use of the with command is used here because of color

coding issues in the text editor I use and the LATEX command $.

Peter DeWitt [email protected] Creating an Autocorrelation Plot in ggplot2

Page 7: Denver User Group - Meetup3 Development of the ACF Plot The Data Set For One Variable A More General ACF Plot Function 4 Final Result Finished qacf() Function Examples More Complex

Outline Objectives Motivation Development of the ACF Plot Final Result Closing

For One Variable

Where to start?

The ACF Plots in the base package could be improved on.

What do I need to create one?

Calculate the correlation for each lag between 1 and, uh. . . , abig numberStore the different correlations in a data frame and then. . .

WAIT! Before wasting time to build a whole new function seewhat is already generated from the acf() function.

Peter DeWitt [email protected] Creating an Autocorrelation Plot in ggplot2

Page 8: Denver User Group - Meetup3 Development of the ACF Plot The Data Set For One Variable A More General ACF Plot Function 4 Final Result Finished qacf() Function Examples More Complex

Outline Objectives Motivation Development of the ACF Plot Final Result Closing

For One Variable

What is in the acf() function?

> p <- acf(with(params, theta), plot = FALSE)

> summary(p)

Length Class Mode

acf 44 -none- numeric

type 1 -none- character

n.used 1 -none- numeric

lag 44 -none- numeric

series 1 -none- character

snames 0 -none- NULL

> p

Autocorrelations of series aAYwith(params, theta)aAZ, by lag

0 1 2 3 4 5 6 7 8 9 10

1.000 0.563 0.332 0.200 0.111 0.064 0.037 0.021 0.009 0.000 -0.001

11 12 13 14 15 16 17 18 19 20 21

0.002 0.001 0.003 0.004 0.004 0.004 0.007 0.007 0.008 0.005 0.003

22 23 24 25 26 27 28 29 30 31 32

0.004 0.015 0.011 0.012 0.012 0.006 -0.001 -0.008 -0.012 -0.018 -0.015

33 34 35 36 37 38 39 40 41 42 43

-0.017 -0.014 -0.010 -0.009 -0.010 -0.003 0.003 0.008 0.011 0.014 0.006

Peter DeWitt [email protected] Creating an Autocorrelation Plot in ggplot2

Page 9: Denver User Group - Meetup3 Development of the ACF Plot The Data Set For One Variable A More General ACF Plot Function 4 Final Result Finished qacf() Function Examples More Complex

Outline Objectives Motivation Development of the ACF Plot Final Result Closing

For One Variable

Start of the acf plotData Manipulation and first steps with a bar plot

> baseACF <- with(p, data.frame(lag, acf))

> head(baseACF, 3)

lag acf

1 0 1.0000000

2 1 0.5630029

3 2 0.3322593

> qplot(lag, acf, data = baseACF, geom = "bar")

stat_bin: binwidth defaulted to range/30. Use 'binwidth = x' to adjust this.

Error in pmin(y, 0) : object 'y' not found

Fix this error by using the argument: stat_identity. This will keep the lags as is andwill not bin them together.

> qplot(lag, acf, data = baseACF, geom = "bar", stat = "identity")

Warning message:

Stacking not well defined when ymin != 0

Fix this error with the argument: position_identity.

Peter DeWitt [email protected] Creating an Autocorrelation Plot in ggplot2

Page 10: Denver User Group - Meetup3 Development of the ACF Plot The Data Set For One Variable A More General ACF Plot Function 4 Final Result Finished qacf() Function Examples More Complex

Outline Objectives Motivation Development of the ACF Plot Final Result Closing

For One Variable

First Iteration of the ggplot2 style ACF Plot

> q <- qplot(x = lag, y = acf, data = baseACF, geom = "bar", stat = "identity",

+ position = "identity")

> print(q)

lag

acf

0.0

0.2

0.4

0.6

0.8

1.0

0 10 20 30 40

> acf(with(params, theta))

0 10 20 30 40

0.0

0.2

0.4

0.6

0.8

1.0

Lag

AC

F

Series with(params, theta)

Peter DeWitt [email protected] Creating an Autocorrelation Plot in ggplot2

Page 11: Denver User Group - Meetup3 Development of the ACF Plot The Data Set For One Variable A More General ACF Plot Function 4 Final Result Finished qacf() Function Examples More Complex

Outline Objectives Motivation Development of the ACF Plot Final Result Closing

For One Variable

Adding Confidence Intervals

It will be beneficial to start writing a function.

> qacf <- function(x, conf.level = 0.95) {

+ ciline <- qnorm((1 - conf.level)/2)/sqrt(length(x))

+ bacf <- acf(x, plot = FALSE)

+ bacfdf <- with(bacf, data.frame(lag, acf))

+ q <- qplot(lag, acf, data = bacfdf, geom = "bar", stat = "identity",

+ position = "identity", ylab = "Autocorrelation")

+ q <- q + geom_hline(yintercept = -ciline, color = "blue",

+ size = 0.2)

+ q <- q + geom_hline(yintercept = ciline, color = "blue",

+ size = 0.2)

+ q <- q + geom_hline(yintercept = 0, color = "red", size = 0.3)

+ return(q)

+ }

The syntax has each layer added to the plot one at a time. Sweave does

not preserve linebreaks in a single line of code. This is because of the use

of the parse() and deparse() functions calls in .

Peter DeWitt [email protected] Creating an Autocorrelation Plot in ggplot2

Page 12: Denver User Group - Meetup3 Development of the ACF Plot The Data Set For One Variable A More General ACF Plot Function 4 Final Result Finished qacf() Function Examples More Complex

Outline Objectives Motivation Development of the ACF Plot Final Result Closing

For One Variable

What does it look like now?

> p <- qacf(with(params, theta))

> print(p)

lag

Aut

ocor

rela

tion

0.0

0.2

0.4

0.6

0.8

1.0

0 10 20 30 40

> acf(with(params, theta))

0 10 20 30 40

0.0

0.2

0.4

0.6

0.8

1.0

Lag

AC

F

Series with(params, theta)

Peter DeWitt [email protected] Creating an Autocorrelation Plot in ggplot2

Page 13: Denver User Group - Meetup3 Development of the ACF Plot The Data Set For One Variable A More General ACF Plot Function 4 Final Result Finished qacf() Function Examples More Complex

Outline Objectives Motivation Development of the ACF Plot Final Result Closing

For One Variable

How many lags to display?

How many lags should be displayed? There is a default setting inthe acf() function.A few lines of code from the base acf function:

> acf

function (x, lag.max = NULL, type = c("correlation", "covariance",

"partial"), plot = TRUE, na.action = na.fail, demean = TRUE,

...)

{

sampleT <- nrow(x)

nser <- ncol(x)

if (is.null(lag.max))

lag.max <- floor(10 * (log10(sampleT) - log10(nser)))

lag.max <- min(lag.max, sampleT - 1)

if (lag.max < 0)

stop("'lag.max' must be at least 0")

Peter DeWitt [email protected] Creating an Autocorrelation Plot in ggplot2

Page 14: Denver User Group - Meetup3 Development of the ACF Plot The Data Set For One Variable A More General ACF Plot Function 4 Final Result Finished qacf() Function Examples More Complex

Outline Objectives Motivation Development of the ACF Plot Final Result Closing

For One Variable

What about the lags?

> qacf <- function(x, conf.level = 0.95, max.lag = NULL, min.lag = 0) {

+ ciline <- qnorm((1 - conf.level)/2)/sqrt(length(x))

+ bacf <- acf(x, plot = FALSE, lag.max = max.lag)

+ bacfdf <- with(bacf, data.frame(lag, acf))

+ if (min.lag > 0) {

+ bacfdf <- bacfdf[-seq(1, min.lag), ]

+ }

+ q <- qplot(lag, acf, data = bacfdf, geom = "bar", stat = "identity",

+ position = "identity", ylab = "Autocorrelation")

+ q <- q + geom_hline(yintercept = -ciline, color = "blue",

+ size = 0.2)

+ q <- q + geom_hline(yintercept = ciline, color = "blue",

+ size = 0.2)

+ q <- q + geom_hline(yintercept = 0, color = "red", size = 0.3)

+ return(q)

+ }

Peter DeWitt [email protected] Creating an Autocorrelation Plot in ggplot2

Page 15: Denver User Group - Meetup3 Development of the ACF Plot The Data Set For One Variable A More General ACF Plot Function 4 Final Result Finished qacf() Function Examples More Complex

Outline Objectives Motivation Development of the ACF Plot Final Result Closing

For One Variable

A Couple Quick ExamplesExample 1

> p <- qacf(with(params, theta), conf.level = 0.99, max.lag = 15)

> print(p)

lag

Aut

ocor

rela

tion

0.0

0.2

0.4

0.6

0.8

1.0

0 5 10 15

> acf(with(params, theta), ci = 0.99, lag.max = 15)

0 5 10 15

0.0

0.2

0.4

0.6

0.8

1.0

Lag

AC

F

Series with(params, theta)

Peter DeWitt [email protected] Creating an Autocorrelation Plot in ggplot2

Page 16: Denver User Group - Meetup3 Development of the ACF Plot The Data Set For One Variable A More General ACF Plot Function 4 Final Result Finished qacf() Function Examples More Complex

Outline Objectives Motivation Development of the ACF Plot Final Result Closing

For One Variable

A Couple Quick ExamplesExample 2

> p <- qacf(with(params, theta), conf.level = 0.98, min.lag = 1,

+ max.lag = 15)

> print(p)

lag

Aut

ocor

rela

tion

0.0

0.1

0.2

0.3

0.4

0.5

2 4 6 8 10 12 14

> p <- p + ylim(-0.1, 0.75)

> print(p)

lag

Aut

ocor

rela

tion

0.0

0.2

0.4

0.6

2 4 6 8 10 12 14

Peter DeWitt [email protected] Creating an Autocorrelation Plot in ggplot2

Page 17: Denver User Group - Meetup3 Development of the ACF Plot The Data Set For One Variable A More General ACF Plot Function 4 Final Result Finished qacf() Function Examples More Complex

Outline Objectives Motivation Development of the ACF Plot Final Result Closing

For One Variable

Which Lags are Significance?

Mark the lags which are significantly different from 0. The abilityto added a title to the plot has also been added.> qacf <- function(x, conf.level = 0.95, max.lag = NULL, min.lag = 0,

+ title = "") {

+ ciline <- qnorm((1 - conf.level)/2)/sqrt(length(x))

+ bacf <- acf(x, plot = FALSE, lag.max = max.lag)

+ bacfdf <- with(bacf, data.frame(lag, acf))

+ if (min.lag > 0) {

+ bacfdf <- bacfdf[-seq(1, min.lag), ]

+ }

+ significant <- (abs(bacfdf[, 2]) > abs(ciline))^2

+ bacfdf <- cbind(bacfdf, significant)

+ q <- qplot(lag, acf, data = bacfdf, geom = "bar", stat = "identity",

+ position = "identity", ylab = "Autocorrelation", main = title,

+ fill = factor(significant))

+ q <- q + geom_hline(yintercept = -ciline, color = "blue",

+ size = 0.2)

+ q <- q + geom_hline(yintercept = ciline, color = "blue",

+ size = 0.2)

+ q <- q + geom_hline(yintercept = 0, color = "red", size = 0.3)

+ q <- q + scale_fill_hue(name = paste("Significant at the\n",

+ conf.level, "level"), breaks = 0:1, labels = c("False",

+ "True"))

+ return(q)

+ }

Peter DeWitt [email protected] Creating an Autocorrelation Plot in ggplot2

Page 18: Denver User Group - Meetup3 Development of the ACF Plot The Data Set For One Variable A More General ACF Plot Function 4 Final Result Finished qacf() Function Examples More Complex

Outline Objectives Motivation Development of the ACF Plot Final Result Closing

For One Variable

Example

> p <- qacf(with(params, theta), title = expression(paste("ACF Plot for ",

+ theta)))

> print(p)

ACF Plot for θ

lag

Aut

ocor

rela

tion

0.0

0.2

0.4

0.6

0.8

1.0

0 10 20 30 40

Significant at the 0.95 level

False

True

ACF Plot for θ

lag

Aut

ocor

rela

tion

0.0

0.2

0.4

0.6

0.8

1.0

0 2 4 6 8 10 12

Significant at the 0.95 level

False

True

Peter DeWitt [email protected] Creating an Autocorrelation Plot in ggplot2

Page 19: Denver User Group - Meetup3 Development of the ACF Plot The Data Set For One Variable A More General ACF Plot Function 4 Final Result Finished qacf() Function Examples More Complex

Outline Objectives Motivation Development of the ACF Plot Final Result Closing

A More General ACF Plot Function

ACF plots for more than one variable

The qacf() function so far willonly work for a single vector.

What if we are interested in adata frame with more than oneseries in it? The plot shown onthe right is the base acf()

plot for two series.

How can we create a similarplot in ggplot2?

Use some datamanipulation and a fewtweaks to the qacf()

function.

> acf(params)

0 10 20 30 40

−0.

50.

00.

51.

0

Lag

AC

F

theta

0 10 20 30 40

−0.

50.

00.

51.

0

Lag

theta & tau

−40 −30 −20 −10 0

−0.

50.

00.

51.

0

Lag

AC

F

tau & theta

0 10 20 30 40

−0.

50.

00.

51.

0

Lag

tau

Peter DeWitt [email protected] Creating an Autocorrelation Plot in ggplot2

Page 20: Denver User Group - Meetup3 Development of the ACF Plot The Data Set For One Variable A More General ACF Plot Function 4 Final Result Finished qacf() Function Examples More Complex

Outline Objectives Motivation Development of the ACF Plot Final Result Closing

A More General ACF Plot Function

What do we get from the base acf() function?

> summary(acf(params, plot = FALSE))

Length Class Mode

acf 164 -none- numeric

type 1 -none- character

n.used 1 -none- numeric

lag 164 -none- numeric

series 1 -none- character

snames 2 -none- character

> with(acf(params, plot = FALSE), acf)

, , 1

[,1] [,2]

[1,] 1.0000000000 -0.7111475314

[2,] 0.5630029497 -0.4355914620

[3,] 0.3322593196 -0.2702013561

[4,] 0.2000358269 -0.1619737468

[5,] 0.1106916075 -0.0954319788

[6,] 0.0639472781 -0.0532004509

[7,] 0.0370311938 -0.0336147073

[8,] 0.0209297569 -0.0174571412

[9,] 0.0090523897 -0.0063593840

[10,] -0.0003122346 0.0005081987

[11,] -0.0012644767 0.0011341353

[12,] 0.0023562378 -0.0020786249

[13,] 0.0007609590 -0.0015092209

[14,] 0.0034975304 -0.0037055447

[15,] 0.0044867895 -0.0101140986

[16,] 0.0042916448 -0.0074159149

[17,] 0.0044016651 -0.0075213010

[18,] 0.0071416469 -0.0132521103

[19,] 0.0068978391 -0.0169607774

[20,] 0.0079556824 -0.0144812212

[21,] 0.0053649637 -0.0101400927

[22,] 0.0031969812 -0.0068433998

[23,] 0.0037121955 -0.0039823442

[24,] 0.0151086739 -0.0040824815

[25,] 0.0107885610 -0.0021925182

[26,] 0.0116725589 -0.0030037089

[27,] 0.0122952747 0.0018921981

[28,] 0.0061210449 0.0025381202

[29,] -0.0013522764 0.0097648077

[30,] -0.0079829400 0.0125894193

[31,] -0.0124203881 0.0144607267

[32,] -0.0180933406 0.0096532545

[33,] -0.0152697304 0.0148591984

[34,] -0.0167297588 0.0161117846

[35,] -0.0144280384 0.0073966086

[36,] -0.0098165547 0.0016020995

[37,] -0.0087571450 0.0012412412

[38,] -0.0102627629 0.0034002519

[39,] -0.0028486640 -0.0024740142

[40,] 0.0025860819 -0.0076409806

[41,] 0.0075348356 -0.0105179622

, , 2

[,1] [,2]

[1,] -0.7111475314 1.0000000000

[2,] -0.7150865781 0.6066417025

[3,] -0.4358645801 0.3748833509

[4,] -0.2687528106 0.2283673415

[5,] -0.1661994254 0.1377873449

[6,] -0.0954677885 0.0821213527

[7,] -0.0515022143 0.0462729553

[8,] -0.0286139943 0.0294912747

[9,] -0.0163796861 0.0133459750

[10,] -0.0049572500 0.0011774312

[11,] 0.0042247851 -0.0044597932

[12,] 0.0032366721 -0.0033555696

[13,] 0.0007301501 0.0005701342

[14,] 0.0023295880 -0.0051499948

[15,] 0.0008874252 0.0030868415

[16,] -0.0062277067 0.0123283351

[17,] -0.0147545652 0.0154230543

[18,] -0.0103192202 0.0124411284

[19,] -0.0064722304 0.0142430908

[20,] -0.0026188694 0.0119120241

[21,] -0.0013944064 0.0071459158

[22,] 0.0025763453 0.0093446299

[23,] -0.0043223782 0.0144876606

[24,] -0.0168617225 0.0111589897

[25,] -0.0190781913 0.0111929427

[26,] -0.0151178655 0.0040777423

[27,] -0.0101731809 0.0010981319

[28,] -0.0067579135 0.0051610565

[29,] -0.0059629999 0.0013550177

[30,] -0.0045449954 -0.0011250824

[31,] -0.0017097920 -0.0048490327

[32,] 0.0041384910 -0.0037383189

[33,] 0.0105563035 -0.0066771356

[34,] 0.0121207002 -0.0108815611

[35,] 0.0129820946 -0.0118054072

[36,] 0.0123304659 -0.0050040737

[37,] 0.0094296668 -0.0047287260

[38,] 0.0132117650 -0.0037253249

[39,] 0.0090078428 -0.0072689496

[40,] 0.0052783533 -0.0007635038

[41,] 0.0011076210 0.0033790422

Peter DeWitt [email protected] Creating an Autocorrelation Plot in ggplot2

Page 21: Denver User Group - Meetup3 Development of the ACF Plot The Data Set For One Variable A More General ACF Plot Function 4 Final Result Finished qacf() Function Examples More Complex

Outline Objectives Motivation Development of the ACF Plot Final Result Closing

A More General ACF Plot Function

What do we get from the base acf() function?

First bit of good news, the names for the object returned by theacf() function are the same as when only one series was passedinto the function.

acf: a 3D array with the numeric values for the type of plot.The [

”1] is the first column of plots in the produced graphic.

type: is the plot for correlation? covariance? or a partialautocorrelation plot?n.used: length of the serieslag: a vector of the lags to be plotted with the correct valuestored in acf.series: name of the object passed into the acf() function.snames: names of the columns in the data frame passed intothe acf() function.

By manipulating the data in acf and lag into a new data frame weshould be able to generate a similar graphic using ggplo2.

Peter DeWitt [email protected] Creating an Autocorrelation Plot in ggplot2

Page 22: Denver User Group - Meetup3 Development of the ACF Plot The Data Set For One Variable A More General ACF Plot Function 4 Final Result Finished qacf() Function Examples More Complex

Outline Objectives Motivation Development of the ACF Plot Final Result Closing

Finished qacf() Function

The finished qacf() function

The code for the qacf() function has change quite a bit toaccount for the data sets and some plotting options.

The final code for the acf() function is too long for aBeamer slide. We’ll look over the code in the .R file extractedby the Sweave function Stangle().

Peter DeWitt [email protected] Creating an Autocorrelation Plot in ggplot2

Page 23: Denver User Group - Meetup3 Development of the ACF Plot The Data Set For One Variable A More General ACF Plot Function 4 Final Result Finished qacf() Function Examples More Complex

Outline Objectives Motivation Development of the ACF Plot Final Result Closing

Examples

qacf() vs. acf()

> p <- qacf(params, show.sig = TRUE)

> print(p)

params

lag

corr

elat

ion

−0.5

0.0

0.5

1.0

−0.5

0.0

0.5

1.0

theta, theta

0 10 20 30 40tau, theta

−40 −30 −20 −10 0

theta, tau

0 10 20 30 40tau, tau

0 10 20 30 40

Significant at the 0.95 level

False

True

> acf(params)

0 10 20 30 40

−0.

50.

00.

51.

0

Lag

AC

F

theta

0 10 20 30 40

−0.

50.

00.

51.

0

Lag

theta & tau

−40 −30 −20 −10 0

−0.

50.

00.

51.

0

Lag

AC

F

tau & theta

0 10 20 30 40

−0.

50.

00.

51.

0

Lag

tau

Peter DeWitt [email protected] Creating an Autocorrelation Plot in ggplot2

Page 24: Denver User Group - Meetup3 Development of the ACF Plot The Data Set For One Variable A More General ACF Plot Function 4 Final Result Finished qacf() Function Examples More Complex

Outline Objectives Motivation Development of the ACF Plot Final Result Closing

Examples

qacf() vs. acf()

> p <- qacf(params, lag.max = 15, show.sig = TRUE)

> print(p)

params

lag

corr

elat

ion

−0.5

0.0

0.5

1.0

−0.5

0.0

0.5

1.0

theta, theta

0 5 10 15tau, theta

−15 −10 −5 0

theta, tau

0 5 10 15tau, tau

0 5 10 15

Significant at the 0.95 level

False

True

> acf(params, lag.max = 15)

0 5 10 15

−0.

50.

00.

51.

0

Lag

AC

F

theta

0 5 10 15

−0.

50.

00.

51.

0

Lag

theta & tau

−15 −10 −5 0

−0.

50.

00.

51.

0

Lag

AC

F

tau & theta

0 5 10 15

−0.

50.

00.

51.

0

Lag

tau

Peter DeWitt [email protected] Creating an Autocorrelation Plot in ggplot2

Page 25: Denver User Group - Meetup3 Development of the ACF Plot The Data Set For One Variable A More General ACF Plot Function 4 Final Result Finished qacf() Function Examples More Complex

Outline Objectives Motivation Development of the ACF Plot Final Result Closing

More Complex Examples

A random set of three series to use:> set.seed(42)

> n <- 250

> alpha <- c(5)

> beta <- c(5)

> gamma <- c(5)

> Z.1 <- rnorm(n, 0, 1)

> Z.2 <- rnorm(n, 0, 2)

> Z.3 <- rnorm(n, 0, 5)

> for (i in 2:n) {

+ alpha[i] <- alpha[i - 1] + 2 * Z.1[i] + 3.14 * Z.1[i - 1]

+ beta[i] <- beta[i - 1] - 2 * Z.2[i] + Z.2[i - 1]

+ gamma[i] <- gamma[i - 1] * 0.2 * Z.3[i] + 4.31 * Z.3[i -

+ 1]

+ }

> testdf <- data.frame(alpha, beta, gamma)

Peter DeWitt [email protected] Creating an Autocorrelation Plot in ggplot2

Page 26: Denver User Group - Meetup3 Development of the ACF Plot The Data Set For One Variable A More General ACF Plot Function 4 Final Result Finished qacf() Function Examples More Complex

Outline Objectives Motivation Development of the ACF Plot Final Result Closing

More Complex Examples

> p <- qacf(testdf)

> print(p)

testdf

lag

corr

elat

ion

−0.20.00.20.40.60.81.0

−0.20.00.20.40.60.81.0

−0.20.00.20.40.60.81.0

alpha, alpha

0 5 10 15beta, alpha

−15 −10 −5 0gamma, alpha

−15 −10 −5 0

alpha, beta

0 5 10 15beta, beta

0 5 10 15gamma, beta

−15 −10 −5 0

alpha, gamma

0 5 10 15beta, gamma

0 5 10 15gamma, gamma

0 5 10 15

> acf(testdf)

0 5 10 15

−0.

20.

20.

61.

0

Lag

AC

F

alpha

0 5 10 15

−0.

20.

20.

61.

0

Lag

alph & beta

0 5 10 15

−0.

20.

20.

61.

0

Lag

alph & gamm

−15 −10 −5 0

−0.

20.

20.

61.

0

Lag

AC

F

beta & alph

0 5 10 15

−0.

20.

20.

61.

0

Lag

beta

0 5 10 15

−0.

20.

20.

61.

0

Lag

beta & gamm

−15 −10 −5 0

−0.

20.

20.

61.

0

Lag

AC

F

gamm & alph

−15 −10 −5 0

−0.

20.

20.

61.

0

Lag

gamm & beta

0 5 10 15

−0.

20.

20.

61.

0

Lag

gamma

Peter DeWitt [email protected] Creating an Autocorrelation Plot in ggplot2

Page 27: Denver User Group - Meetup3 Development of the ACF Plot The Data Set For One Variable A More General ACF Plot Function 4 Final Result Finished qacf() Function Examples More Complex

Outline Objectives Motivation Development of the ACF Plot Final Result Closing

More Complex Examples

> p <- qacf(testdf, show.sig = TRUE)

> print(p)

testdf

lag

corr

elat

ion

−0.20.00.20.40.60.81.0

−0.20.00.20.40.60.81.0

−0.20.00.20.40.60.81.0

alpha, alpha

0 5 10 15beta, alpha

−15 −10 −5 0gamma, alpha

−15 −10 −5 0

alpha, beta

0 5 10 15beta, beta

0 5 10 15gamma, beta

−15 −10 −5 0

alpha, gamma

0 5 10 15beta, gamma

0 5 10 15gamma, gamma

0 5 10 15

Significant at the 0.95 level

False

True

> acf(testdf)

0 5 10 15

−0.

20.

20.

61.

0

Lag

AC

F

alpha

0 5 10 15

−0.

20.

20.

61.

0

Lag

alph & beta

0 5 10 15

−0.

20.

20.

61.

0

Lag

alph & gamm

−15 −10 −5 0

−0.

20.

20.

61.

0

Lag

AC

F

beta & alph

0 5 10 15

−0.

20.

20.

61.

0

Lag

beta

0 5 10 15

−0.

20.

20.

61.

0

Lag

beta & gamm

−15 −10 −5 0

−0.

20.

20.

61.

0

Lag

AC

F

gamm & alph

−15 −10 −5 0

−0.

20.

20.

61.

0

Lag

gamm & beta

0 5 10 15

−0.

20.

20.

61.

0

Lag

gamma

Peter DeWitt [email protected] Creating an Autocorrelation Plot in ggplot2

Page 28: Denver User Group - Meetup3 Development of the ACF Plot The Data Set For One Variable A More General ACF Plot Function 4 Final Result Finished qacf() Function Examples More Complex

Outline Objectives Motivation Development of the ACF Plot Final Result Closing

More Complex Examples

> p <- qacf(testdf, type = "covariance")

> print(p)

testdf

lag

cova

rianc

e

0100020003000400050006000

0100020003000400050006000

0100020003000400050006000

alpha, alpha

0 5 10 15beta, alpha

−15 −10 −5 0gamma, alpha

−15 −10 −5 0

alpha, beta

0 5 10 15beta, beta

0 5 10 15gamma, beta

−15 −10 −5 0

alpha, gamma

0 5 10 15beta, gamma

0 5 10 15gamma, gamma

0 5 10 15

> acf(testdf, type = "covariance")

0 5 10 15

020

0060

00

Lag

AC

F (

cov)

alpha

0 5 10 15

020

0060

00

Lag

alph & beta

0 5 10 15

020

0060

00

Lag

alph & gamm

−15 −10 −5 0

020

0060

00

Lag

AC

F (

cov)

beta & alph

0 5 10 15

020

0060

00

Lag

beta

0 5 10 15

020

0060

00

Lag

beta & gamm

−15 −10 −5 0

020

0060

00

Lag

AC

F (

cov)

gamm & alph

−15 −10 −5 0

020

0060

00

Lag

gamm & beta

0 5 10 15

020

0060

00

Lag

gamma

Peter DeWitt [email protected] Creating an Autocorrelation Plot in ggplot2

Page 29: Denver User Group - Meetup3 Development of the ACF Plot The Data Set For One Variable A More General ACF Plot Function 4 Final Result Finished qacf() Function Examples More Complex

Outline Objectives Motivation Development of the ACF Plot Final Result Closing

More Complex Examples

> p <- qacf(testdf, type = "partial")

> print(p)

testdf

lag

part

ial

−3−2−1

0123

−3−2−1

0123

−3−2−1

0123

alpha, alpha

5 10 15beta, alpha

−15 −10 −5gamma, alpha

−15 −10 −5

alpha, beta

5 10 15beta, beta

5 10 15gamma, beta

−15 −10 −5

alpha, gamma

5 10 15beta, gamma

5 10 15gamma, gamma

5 10 15

> acf(testdf, type = "partial")

5 10 15

−3

−1

12

3

Lag

Par

tial A

CF

alpha

5 10 15

−3

−1

12

3

Lag

alph & beta

5 10 15

−3

−1

12

3

Lag

alph & gamm

−15 −10 −5

−3

−1

12

3

Lag

Par

tial A

CF

beta & alph

5 10 15

−3

−1

12

3

Lag

beta

5 10 15

−3

−1

12

3

Lag

beta & gamm

−15 −10 −5

−3

−1

12

3

Lag

Par

tial A

CF

gamm & alph

−15 −10 −5

−3

−1

12

3

Lag

gamm & beta

5 10 15

−3

−1

12

3

Lag

gamma

Peter DeWitt [email protected] Creating an Autocorrelation Plot in ggplot2

Page 30: Denver User Group - Meetup3 Development of the ACF Plot The Data Set For One Variable A More General ACF Plot Function 4 Final Result Finished qacf() Function Examples More Complex

Outline Objectives Motivation Development of the ACF Plot Final Result Closing

More Complex Examples

> p <- qacf(testdf, type = "partial", show.sig = TRUE)

> print(p)

testdf

lag

part

ial

−3−2−1

0123

−3−2−1

0123

−3−2−1

0123

alpha, alpha

5 10 15beta, alpha

−15 −10 −5gamma, alpha

−15 −10 −5

alpha, beta

5 10 15beta, beta

5 10 15gamma, beta

−15 −10 −5

alpha, gamma

5 10 15beta, gamma

5 10 15gamma, gamma

5 10 15

Significant at the 0.95 level

False

True

> acf(testdf, type = "partial")

5 10 15

−3

−1

12

3

Lag

Par

tial A

CF

alpha

5 10 15

−3

−1

12

3

Lag

alph & beta

5 10 15

−3

−1

12

3

Lag

alph & gamm

−15 −10 −5

−3

−1

12

3

Lag

Par

tial A

CF

beta & alph

5 10 15

−3

−1

12

3

Lag

beta

5 10 15

−3

−1

12

3

Lag

beta & gamm

−15 −10 −5

−3

−1

12

3

Lag

Par

tial A

CF

gamm & alph

−15 −10 −5

−3

−1

12

3

Lag

gamm & beta

5 10 15

−3

−1

12

3

Lag

gamma

Peter DeWitt [email protected] Creating an Autocorrelation Plot in ggplot2

Page 31: Denver User Group - Meetup3 Development of the ACF Plot The Data Set For One Variable A More General ACF Plot Function 4 Final Result Finished qacf() Function Examples More Complex

Outline Objectives Motivation Development of the ACF Plot Final Result Closing

More Complex Examples

4 Series Example

> n <- 250

> alpha <- c(5)

> beta <- c(5)

> gamma <- c(5)

> delta <- c(5)

> Z.1 <- rnorm(n, 0, 1)

> Z.2 <- rnorm(n, 0, 2)

> Z.3 <- rnorm(n, 0, 5)

> for (i in 2:n) {

+ alpha[i] <- alpha[i - 1] + Z.1[i] - Z.1[i - 1] + delta[i -

+ 1] - beta[i - 1]

+ beta[i] <- beta[i - 1] - 2 * Z.2[i] + Z.2[i - 1] - delta[i -

+ 1]

+ gamma[i] <- gamma[i - 1] + beta[i - 1] + 0.2 * Z.3[i] + Z.3[i -

+ 1]

+ delta[i] <- delta[i - 1] + runif(1, 0.5, 1.5) * delta[i -

+ 1]

+ }

> testdf <- data.frame(alpha, beta, gamma, delta)

Peter DeWitt [email protected] Creating an Autocorrelation Plot in ggplot2

Page 32: Denver User Group - Meetup3 Development of the ACF Plot The Data Set For One Variable A More General ACF Plot Function 4 Final Result Finished qacf() Function Examples More Complex

Outline Objectives Motivation Development of the ACF Plot Final Result Closing

More Complex Examples

> p <- qacf(testdf)

> print(p)

testdf

lag

corr

elat

ion

−0.50.00.51.0

−0.50.00.51.0

−0.50.00.51.0

−0.50.00.51.0

alpha, alpha

0 5 10 15beta, alpha

−15 −10 −5 0gamma, alpha

−15 −10 −5 0delta, alpha

−15 −10 −5 0

alpha, beta

0 5 10 15beta, beta

0 5 10 15gamma, beta

−15 −10 −5 0delta, beta

−15 −10 −5 0

alpha, gamma

0 5 10 15beta, gamma

0 5 10 15gamma, gamma

0 5 10 15delta, gamma

−15 −10 −5 0

alpha, delta

0 5 10 15beta, delta

0 5 10 15gamma, delta

0 5 10 15delta, delta

0 5 10 15

> acf(testdf)

0 5 10 15

−1.

00.

01.

0

Lag

AC

F

alpha

0 5 10 15

−1.

00.

01.

0

Lag

alph & beta

0 5 10 15

−1.

00.

01.

0

Lag

alph & gamm

0 5 10 15

−1.

00.

01.

0

Lag

alph & delt

−15 −10 −5 0

−1.

00.

01.

0

Lag

AC

F

beta & alph

0 5 10 15

−1.

00.

01.

0

Lag

beta

0 5 10 15

−1.

00.

01.

0

Lag

beta & gamm

0 5 10 15

−1.

00.

01.

0

Lag

beta & delt

−15 −10 −5 0

−1.

00.

01.

0

Lag

AC

F

gamm & alph

−15 −10 −5 0

−1.

00.

01.

0

Lag

gamm & beta

0 5 10 15

−1.

00.

01.

0

Lag

gamma

0 5 10 15

−1.

00.

01.

0

Lag

gamm & delt

−15 −10 −5 0

−1.

00.

01.

0

Lag

AC

Fdelt & alph

−15 −10 −5 0

−1.

00.

01.

0

Lag

delt & beta

−15 −10 −5 0

−1.

00.

01.

0

Lag

delt & gamm

0 5 10 15

−1.

00.

01.

0

Lag

delta

Peter DeWitt [email protected] Creating an Autocorrelation Plot in ggplot2

Page 33: Denver User Group - Meetup3 Development of the ACF Plot The Data Set For One Variable A More General ACF Plot Function 4 Final Result Finished qacf() Function Examples More Complex

Outline Objectives Motivation Development of the ACF Plot Final Result Closing

More Complex Examples

> p <- qacf(testdf, show.sig = TRUE)

> print(p)

testdf

lag

corr

elat

ion

−0.50.00.51.0

−0.50.00.51.0

−0.50.00.51.0

−0.50.00.51.0

alpha, alpha

0 5 10 15beta, alpha

−15−10 −5 0gamma, alpha

−15−10 −5 0delta, alpha

−15−10 −5 0

alpha, beta

0 5 10 15beta, beta

0 5 10 15gamma, beta

−15−10 −5 0delta, beta

−15−10 −5 0

alpha, gamma

0 5 10 15beta, gamma

0 5 10 15gamma, gamma

0 5 10 15delta, gamma

−15−10 −5 0

alpha, delta

0 5 10 15beta, delta

0 5 10 15gamma, delta

0 5 10 15delta, delta

0 5 10 15

Significant at the 0.95 level

False

True

> acf(testdf)

0 5 10 15

−1.

00.

01.

0

Lag

AC

F

alpha

0 5 10 15

−1.

00.

01.

0

Lag

alph & beta

0 5 10 15

−1.

00.

01.

0

Lag

alph & gamm

0 5 10 15

−1.

00.

01.

0

Lag

alph & delt

−15 −10 −5 0

−1.

00.

01.

0

Lag

AC

F

beta & alph

0 5 10 15

−1.

00.

01.

0

Lag

beta

0 5 10 15

−1.

00.

01.

0

Lag

beta & gamm

0 5 10 15

−1.

00.

01.

0

Lag

beta & delt

−15 −10 −5 0

−1.

00.

01.

0

Lag

AC

F

gamm & alph

−15 −10 −5 0

−1.

00.

01.

0

Lag

gamm & beta

0 5 10 15

−1.

00.

01.

0

Lag

gamma

0 5 10 15

−1.

00.

01.

0

Lag

gamm & delt

−15 −10 −5 0

−1.

00.

01.

0

Lag

AC

Fdelt & alph

−15 −10 −5 0

−1.

00.

01.

0

Lag

delt & beta

−15 −10 −5 0

−1.

00.

01.

0

Lag

delt & gamm

0 5 10 15

−1.

00.

01.

0

Lag

delta

Peter DeWitt [email protected] Creating an Autocorrelation Plot in ggplot2

Page 34: Denver User Group - Meetup3 Development of the ACF Plot The Data Set For One Variable A More General ACF Plot Function 4 Final Result Finished qacf() Function Examples More Complex

Outline Objectives Motivation Development of the ACF Plot Final Result Closing

One Variable Plots

One Variable PlotsAutocorrelation

> p <- with(params, qacf(theta))

> print(p)

theta

lag

corr

elat

ion

0.0

0.2

0.4

0.6

0.8

1.0

0 10 20 30 40

> with(params, acf(theta))

0 10 20 30 40

0.0

0.2

0.4

0.6

0.8

1.0

Lag

AC

F

Series theta

Peter DeWitt [email protected] Creating an Autocorrelation Plot in ggplot2

Page 35: Denver User Group - Meetup3 Development of the ACF Plot The Data Set For One Variable A More General ACF Plot Function 4 Final Result Finished qacf() Function Examples More Complex

Outline Objectives Motivation Development of the ACF Plot Final Result Closing

One Variable Plots

One Variable PlotsPartial Autocorrelation

> p <- with(params, qacf(theta, type = "partial", show.sig = TRUE))

> print(p)

theta

lag

part

ial

0.0

0.1

0.2

0.3

0.4

0.5

10 20 30 40

Significant at the 0.95 level

False

True

> with(params, acf(theta, type = "partial"))

0 10 20 30 40

0.0

0.1

0.2

0.3

0.4

0.5

Lag

Par

tial A

CF

Series theta

Peter DeWitt [email protected] Creating an Autocorrelation Plot in ggplot2

Page 36: Denver User Group - Meetup3 Development of the ACF Plot The Data Set For One Variable A More General ACF Plot Function 4 Final Result Finished qacf() Function Examples More Complex

Outline Objectives Motivation Development of the ACF Plot Final Result Closing

One Variable Plots

One Variable PlotsCovariance

> p <- with(params, qacf(theta, type = "covariance"))

> p <- p + opts(title = expression(paste("Covariance plot for ",

+ theta)))

> print(p)

Covariance plot for θ

lag

cova

rianc

e

0e+00

1e−06

2e−06

3e−06

4e−06

5e−06

6e−06

7e−06

0 10 20 30 40

> with(params, acf(theta, type = "covariance"))

0 10 20 30 40

0e+

002e

−06

4e−

066e

−06

Lag

AC

F (

cov)

Series theta

Peter DeWitt [email protected] Creating an Autocorrelation Plot in ggplot2

Page 37: Denver User Group - Meetup3 Development of the ACF Plot The Data Set For One Variable A More General ACF Plot Function 4 Final Result Finished qacf() Function Examples More Complex

Outline Objectives Motivation Development of the ACF Plot Final Result Closing

Things to do

Add error handling

Add control on the facet labels - this is functionality needed inggplot2 before it can be added to my qacf() function.

Suppress the “Using as id variables” message generated fromthe melt() function.

Peter DeWitt [email protected] Creating an Autocorrelation Plot in ggplot2

Page 38: Denver User Group - Meetup3 Development of the ACF Plot The Data Set For One Variable A More General ACF Plot Function 4 Final Result Finished qacf() Function Examples More Complex

Outline Objectives Motivation Development of the ACF Plot Final Result Closing

Some Resources for ggplot2 and Sweave

For ggplot2

ggplot2 Google Group: groups.google.com/group/ggplot2

Hadley Wickham’s website: http://had.co.nz/ggplot2

“ggplot2 Elegant Graphics for Data Analysis”, by HadleyWickham, Springer, 2009

For Sweave

User manual is actually very helpful for learning the options.Best to see examples from other users and trial and error.Remember that when using Sweave the code chunks will notwork in many of the different environments in LATEXunless youhave made the environment fragile.Great example on R-Bloggers: http:

//pineda-krch.com/2011/01/17/the-joy-of-sweave

Peter DeWitt [email protected] Creating an Autocorrelation Plot in ggplot2