An Introduction to R POL 51 October 13, 2008 Malcolm Easton Office hours: Thursday 3:30-5:30 in Rm 245 Email: mreaston@ucdavis.edu


An Introduction to R

POL 51
October 13, 2008
Malcolm Easton
Office hours: Thursday 3:30-5:30 in Rm 245
Email: mreaston@ucdavis.edu


Section Outline

• What is R?
• Presentation of some R basics
• Set of exercises done in class if time permits


What can R do?

• R can act as an overgrown calculator
• R can run some very high-end statistical models
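A minimal sketch of both uses, typed at the R prompt. The x and y values are made up purely for illustration, and lm() stands in here as one simple example of a statistical model.

```r
# R as a calculator
2 + 3 * 4        # multiplication happens before addition
sqrt(16)         # square root
10 %/% 3         # integer division
10 %% 3          # remainder after division

# R as a modeling tool: a simple linear regression of y on x
# (x and y are hypothetical numbers, not course data)
x <- c(1, 2, 3, 4, 5)
y <- c(2.1, 3.9, 6.2, 7.8, 10.1)
fit <- lm(y ~ x)
coef(fit)        # estimated intercept and slope
```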


Loading R

• Go to Class website: http://psfaculty.ucdavis.edu/bsjjones/pol51fall2008.html

• Follow instructions to load R


Why R is so user-friendly

• R is an object oriented environment!
• Everything is an object. You write code to create objects or variables and then define how these objects relate to each other.
• xbar<-mean(x)
• xbar=mean(x)
• You create a variable name (xbar) and then you describe its “value”, in this case the mean of whatever x is.
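As a small sketch of the two assignment forms, the lines below create x from made-up numbers (the slide does not say what x holds) and store its mean under two names. The name xbar2 is introduced here only so the two results can be compared.

```r
x <- c(2, 4, 6, 8)       # hypothetical values, for illustration only

xbar  <- mean(x)         # assignment with the arrow
xbar2 =  mean(x)         # assignment with the equals sign

xbar                     # typing an object's name prints its value
identical(xbar, xbar2)   # both forms stored the same number
class(x)                 # every object has a class; here "numeric"
ls()                     # lists the objects created so far
```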


Inputting data manually

• It is easy!
• weight<-c(60, 72, 57, 90, 95, 72)
• c(…) is used to define a vector of numbers.
• You can generate a matrix (a two-dimensional array of numbers, with rows and columns) by binding two vectors together.
• height<-c(6, 5, 7, 5, 7, 5)
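Once the two vectors exist, a few standard operations show how R treats them. This is only a sketch using the weight and height values from the slide.

```r
weight <- c(60, 72, 57, 90, 95, 72)
height <- c(6, 5, 7, 5, 7, 5)

length(weight)        # number of elements in the vector
weight[1]             # first element (R counts from 1)
weight[2:3]           # second and third elements
weight / height       # arithmetic is done element by element
weight[height > 5]    # weights for the cases where height exceeds 5
```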


cbind and rbind

• You can now “glue” both of these vectors together either as two columns with the variable name on top or two rows with the variable names on the left hand side.

• xmat1<-cbind(weight, height)
• xmat2<-rbind(weight, height)


This is what it looks like

• xmat1=cbind(weight, height)
• xmat1
       weight height
  [1,]     60      6
  [2,]     72      5
  [3,]     57      7
  [4,]     90      5
  [5,]     95      7
  [6,]     72      5
• xmat2=rbind(weight, height)
• xmat2
         [,1] [,2] [,3] [,4] [,5] [,6]
  weight   60   72   57   90   95   72
  height    6    5    7    5    7    5
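To check the shape of the two matrices and pull values out of them, something like the following sketch should work. It rebuilds the vectors so it runs on its own.

```r
weight <- c(60, 72, 57, 90, 95, 72)
height <- c(6, 5, 7, 5, 7, 5)
xmat1 <- cbind(weight, height)   # 6 rows, 2 columns
xmat2 <- rbind(weight, height)   # 2 rows, 6 columns

dim(xmat1)                 # rows first, then columns
dim(xmat2)
xmat1[3, "weight"]         # row 3 of the weight column
xmat2["height", 5]         # fifth entry of the height row
xmat1[, "height"]          # leaving the row index empty returns the whole column
all.equal(t(xmat1), xmat2) # transposing one layout gives the other
```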


How to ask for help

• What if you want to use data that is provided on the website? Specifically the Congressional Control Pricing data in excel format.
• You have to “read” the data into R.
• Well, if you are a beginner just ask R how to read in data.
• Ex: ?read.table
• If you cannot be specific (why did he write .table?) you can use a built-in search by typing help.search("table")
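The help commands can be sketched as below. All of these are standard R functions; args() is an extra not mentioned on the slide, included as a quick way to see a function's arguments.

```r
# Open the help page for a function whose exact name you know
?read.table
help("read.table")     # the same request written as a function call

# Search the installed documentation when you only have a keyword
help.search("table")
??table                # shorthand for help.search("table")

# Peek at a function's arguments without opening the full help page
args(read.csv)
```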


Ok, now this is how you read a .csv file into R

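• General code: data.set<-read.table("C:/file name.ext", header=TRUE)

• read.table splits columns on whitespace by default; for a comma-separated file use read.csv, as below, or add sep="," to read.table.

• Note another oddity about R: in Windows file paths you must use forward slashes (or doubled backslashes), not single backslashes.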

• congress<-read.csv("C:/Documents and Settings/Malcolm Easton/Desktop/Pol 51/congressprice.csv", header=TRUE)

• Also notice that header=TRUE is just telling R that the first line is a header containing the names of variables in the file.
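Since the course file and its path exist only on the instructor's machine, here is a self-contained sketch that writes a tiny CSV to a temporary file and reads it back. The column name rhdsprice comes from the slides; the year column and every number are invented for illustration.

```r
# Create a small comma-separated file so the example runs anywhere
tmp <- tempfile(fileext = ".csv")
writeLines(c("year,rhdsprice",
             "2000,0.45",
             "2001,0.52",
             "2002,0.61"), tmp)

# read.csv assumes commas between columns
congress <- read.csv(tmp, header = TRUE)

# read.table reads the same file once it is told the separator
congress2 <- read.table(tmp, header = TRUE, sep = ",")

names(congress)   # variable names taken from the header line
str(congress)     # one line per variable: type and first values
head(congress)    # first few rows of the data frame
```

Checking names() and str() right after reading is a cheap way to confirm that the header line and the column types came through as expected.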


Now you can play with summary statistics!

• summary(congress$rhdsprice)

• Or if writing all of that is a pain you can choose to assign that column to an object.

• price1<-congress$rhdsprice
• summary(price1)

• Or easier still, just “attach” your data set which tells R to look for objects among the variables in a given data frame.
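A runnable sketch of the first two approaches, using a stand-in data frame because the real congress data is not reproduced here. The rhdsprice values are hypothetical.

```r
# Stand-in for the course data set (made-up prices)
congress <- data.frame(rhdsprice = c(0.45, 0.52, 0.61, 0.58, 0.49))

# The $ operator picks one variable out of a data frame
summary(congress$rhdsprice)

# Storing the column under a shorter name gives the same result
price1 <- congress$rhdsprice
summary(price1)
```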


Attaching a data frame

• If you type attach(congress) you can now summarize your data by typing summary(rhdsprice)
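A short sketch of attaching, again with a hypothetical stand-in for the congress data. The detach() and with() lines are additions not on the slide: detach() undoes the attach when you are finished, and with() is one way to get the same short names for a single command.

```r
# Stand-in for the course data set (made-up prices)
congress <- data.frame(rhdsprice = c(0.45, 0.52, 0.61, 0.58, 0.49))

attach(congress)      # R now searches congress for variable names
summary(rhdsprice)    # no congress$ prefix needed
detach(congress)      # remove it from the search path when done

# Same short names, limited to one expression
with(congress, summary(rhdsprice))
```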


More fun with summary stats

• The summary() function gives you some basic summary stats, but you can get more specific if you like.

• Mean: mean(rhdsprice)
• Median: median(rhdsprice)
• Maximum: max(rhdsprice)
• Minimum: min(rhdsprice)
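The four functions can be tried on a small made-up vector. The values below are hypothetical and simply stand in for the rhdsprice variable; range() is an extra that returns the minimum and maximum together.

```r
rhdsprice <- c(0.45, 0.52, 0.61, 0.58, 0.49)   # hypothetical prices

mean(rhdsprice)     # arithmetic average
median(rhdsprice)   # middle value once sorted
max(rhdsprice)      # largest value
min(rhdsprice)      # smallest value
range(rhdsprice)    # minimum and maximum in one call
```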


Is that all R can do?

• No, that is just the tip of the iceberg.
• You can code functions into R or use a large number of pre-coded functions.
• You can use R to calculate the variance and standard deviation of a variable, and it offers a slew of graphical options as well.
• Built in code: var(rhdsprice)
• Manually coding: n=length(rhdsprice)
• Manual.var=(1/(n-1))*sum((rhdsprice-mean(rhdsprice))^2)
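To see that the manual formula and the built-in function agree, the sketch below runs both on a small hypothetical vector. The numbers are invented; only the formula comes from the slide.

```r
rhdsprice <- c(2, 4, 4, 6, 9)    # hypothetical values, mean is 5

# Manual sample variance: squared deviations from the mean,
# summed and divided by n - 1
n <- length(rhdsprice)
Manual.var <- (1 / (n - 1)) * sum((rhdsprice - mean(rhdsprice))^2)

Manual.var                          # 28 / 4 = 7
var(rhdsprice)                      # built-in version
all.equal(Manual.var, var(rhdsprice))

# The standard deviation is the square root of the variance
sd(rhdsprice)
sqrt(Manual.var)
```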
