An Introduction to R

POL 51

October 13, 2008

Malcolm Easton

Office hours: Thursday 3:30-5:30 in Rm 245

Email: mreaston@ucdavis.edu

Section Outline

• What is R?

• Presentation of some R basics

• Set of exercises done in class if time permits

What can R do?

• R can act as an overgrown calculator

• R can run some very high-end statistical models
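
To make the "overgrown calculator" point concrete, here is a minimal sketch of a few arithmetic expressions typed at the R prompt. The numbers and the object names (a, b, root, pow) are arbitrary illustrations, not part of the course materials.

```r
# Arithmetic follows the usual operator precedence
a <- 2 + 3 * 4        # multiplication happens before addition
b <- (2 + 3) * 4      # parentheses change the order
root <- sqrt(16)      # built-in square root function
pow <- 2^10           # exponentiation

# Print all four results together as one vector
print(c(a, b, root, pow))
```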

Loading R

• Go to Class website: http://psfaculty.ucdavis.edu/bsjjones/pol51fall2008.html

• Follow instructions to load R

Why R is so user-friendly

• R is an object-oriented environment!

• Everything is an object. You write code to create objects or variables and then define how these objects relate to each other.

• xbar<-mean(x)

• xbar=mean(x)

• Either line works: you create a variable name (xbar) and then you describe its “value”, in this case the mean of whatever x is.
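
The slide does not say what x is, so the sketch below makes up a small vector purely for illustration and shows that both assignment forms store the same value. The name xbar2 is also an assumption, added only so the two results can be compared.

```r
# Hypothetical data: x is not defined on the slide, so we invent one
x <- c(2, 4, 6, 8)

xbar  <- mean(x)   # assignment with the arrow
xbar2 = mean(x)    # assignment with the equals sign

print(xbar)                    # typing the name shows the stored value
print(identical(xbar, xbar2))  # both forms produce the same object
```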

Inputting data manually

• It is easy!

• weight<-c(60, 72, 57, 90, 95, 72)

• c(…) is used to define a vector of numbers.

• You can generate a matrix (a two-dimensional array of numbers, with rows and columns) by binding two vectors together.

• height<-c(6, 5, 7, 5, 7, 5)
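
Once a vector exists, you can inspect it before binding it to anything. The following sketch uses the two vectors from this slide; the helper names (n_weight, first_w, heavy) and the cutoff of 70 are assumptions chosen only for the example.

```r
weight <- c(60, 72, 57, 90, 95, 72)
height <- c(6, 5, 7, 5, 7, 5)

n_weight <- length(weight)        # number of elements in the vector
first_w  <- weight[1]             # R counts positions starting at 1
heavy    <- weight[weight > 70]   # keep only the values above 70

print(n_weight)
print(first_w)
print(heavy)
```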

cbind and rbind

• You can now “glue” both of these vectors together, either as two columns with the variable names on top or as two rows with the variable names on the left-hand side.

• xmat1<-cbind(weight, height)

• xmat2<-rbind(weight, height)

This is what it looks like

• xmat1=cbind(weight, height)

• xmat1

         weight height
    [1,]     60      6
    [2,]     72      5
    [3,]     57      7
    [4,]     90      5
    [5,]     95      7
    [6,]     72      5

• xmat2=rbind(weight, height)

• xmat2

           [,1] [,2] [,3] [,4] [,5] [,6]
    weight   60   72   57   90   95   72
    height    6    5    7    5    7    5
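
The two matrices hold the same numbers in different orientations. As a rough sketch of how to check that, the code below rebuilds both, asks for their dimensions, and pulls out a single cell; the names w2 and same_values are assumptions introduced for the example.

```r
weight <- c(60, 72, 57, 90, 95, 72)
height <- c(6, 5, 7, 5, 7, 5)

xmat1 <- cbind(weight, height)   # vectors become columns
xmat2 <- rbind(weight, height)   # vectors become rows

print(dim(xmat1))   # 6 rows, 2 columns
print(dim(xmat2))   # 2 rows, 6 columns

# Pick one cell: second row of the "weight" column
w2 <- as.numeric(xmat1[2, "weight"])

# Transposing the column version gives the same values as the row version
same_values <- all(t(xmat1) == xmat2)
print(same_values)
```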

How to ask for help

• What if you want to use data that is provided on the website? Specifically, the Congressional Control Pricing data in Excel format.

• You have to “read” the data into R.

• Well, if you are a beginner, just ask R how to read in data.

• Ex: ?read.table

• If you cannot be specific (why did he write .table?), you can use the built-in search by typing help.search("table")

Ok, now this is how you read a .csv file into R

• General code: data.set<-read.table("C:/file name.ext", header=TRUE)

• Note another oddity about R: in file paths you must use forward slashes (or doubled backslashes), even on Windows.

• congress<-read.csv("C:/Documents and Settings/Malcolm Easton/Desktop/Pol 51/congressprice.csv", header=TRUE)

• Also notice that header=TRUE just tells R that the first line is a header containing the names of the variables in the file.
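
If you do not have the class file handy, you can still try the reading step. The sketch below writes a tiny made-up CSV to a temporary file and reads it back. The column name rhdsprice is borrowed from the slides, but the year column, the values, and the object names (tmp, demo, demo2) are invented for illustration and are not the real congressprice.csv.

```r
# Write a small, made-up CSV to a temporary file so the example is self-contained
tmp <- tempfile(fileext = ".csv")
writeLines(c("year,rhdsprice",
             "2006,1.5",
             "2007,2.0",
             "2008,2.5"), tmp)

# read.csv assumes comma-separated values
demo <- read.csv(tmp, header = TRUE)

# read.table needs to be told the separator explicitly for a CSV
demo2 <- read.table(tmp, header = TRUE, sep = ",")

print(names(demo))   # column names taken from the header line
print(nrow(demo))    # number of data rows (the header is not counted)

unlink(tmp)          # remove the temporary file
```

For a real file, replace tmp with the path to the file in quotes, using forward slashes.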

Now you can play with summary statistics!

• summary(congress$rhdsprice)

• Or, if writing all of that is a pain, you can assign that column to an object.

• price1<-congress$rhdsprice

• summary(price1)

• Or, easier still, just “attach” your data set, which tells R to look for objects among the variables in a given data frame.

Attaching a data frame

• If you type attach(congress), you can now summarize your data by typing summary(rhdsprice)
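
Here is a minimal sketch of attaching, using a small made-up data frame (congress_demo, with invented values) in place of the class data. It also shows detach(), which undoes the attach, and with(), a base R alternative that looks up column names for a single expression only. Neither of those appears on the slides; they are included as optional extras.

```r
# Hypothetical stand-in for the class data set; the values are invented
congress_demo <- data.frame(rhdsprice = c(1.5, 2.0, 2.5, 4.0))

attach(congress_demo)            # columns can now be used by bare name
m_attached <- mean(rhdsprice)
detach(congress_demo)            # undo the attach when you are done

# with() evaluates one expression inside the data frame, no attach needed
m_with <- with(congress_demo, mean(rhdsprice))

print(m_attached)
print(m_with)
```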

More fun with summary stats

• The summary() function gives you some basic summary stats, but you can get more specific if you like.

• Mean: mean(rhdsprice)

• Median: median(rhdsprice)

• Maximum: max(rhdsprice)

• Minimum: min(rhdsprice)
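
To see these functions side by side, the sketch below runs them on a short made-up vector. The name price and its values are assumptions for illustration, not the class data.

```r
# Hypothetical prices; not the real rhdsprice values
price <- c(1.5, 2.0, 2.5, 4.0, 10.0)

print(summary(price))   # minimum, quartiles, median, mean, maximum at once

# The same quantities requested one at a time
stats <- c(mean   = mean(price),
           median = median(price),
           max    = max(price),
           min    = min(price))
print(stats)
```

Notice that the single large value pulls the mean above the median, which is one reason to look at both.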

Is that all R can do?

• No, that is just the tip of the iceberg.

• You can code functions into R or use a large number of pre-coded functions.

• You can use R to calculate the variance and standard deviation of a variable, and it offers a slew of graphical options as well.

• Built-in code: var(rhdsprice)

• Manually coding: n=length(rhdsprice)

• Manual.var=(1/(n-1))*sum((rhdsprice-mean(rhdsprice))^2)
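
The manual formula and the built-in function should agree. As a quick check, the sketch below runs both on a made-up vector (price, with invented values) and then takes the square root to get the standard deviation, comparing it with the built-in sd().

```r
# Hypothetical prices; not the real rhdsprice values
price <- c(1.5, 2.0, 2.5, 4.0, 10.0)

n <- length(price)

# Sample variance by hand: squared deviations from the mean, divided by n - 1
manual_var  <- (1 / (n - 1)) * sum((price - mean(price))^2)
builtin_var <- var(price)

# Standard deviation is the square root of the variance
manual_sd <- sqrt(manual_var)

print(all.equal(manual_var, builtin_var))  # TRUE when the two agree
print(all.equal(manual_sd, sd(price)))
```

all.equal() is used instead of == because it tolerates tiny floating-point differences.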
