Data Transformation
Summer Data Jam
Chris Orwa
14th July 2015
Principal Component Analysis
Principal component analysis (PCA) is a technique used
to emphasize variation and bring out strong patterns in a
dataset. It's often used to make data easy to explore and
visualize.
Statistically, the principal components are the eigenvectors
of the data's covariance matrix, ordered by decreasing
eigenvalue.
Let us Look at Some Concepts
Covariance
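The covariance of two variables x and y in a data sample
measures how the two variables vary together. It is positive
when they tend to increase together and negative when one
tends to decrease as the other increases.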
R code
duration = faithful$eruptions
waiting = faithful$waiting
cov(duration, waiting)
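To make the definition concrete, here is a minimal sketch that computes the sample covariance by hand and compares it with cov(). The name manual_cov is only illustrative; the n - 1 denominator matches what R's cov() uses.

```r
# Sample covariance computed directly from its definition
duration <- faithful$eruptions
waiting <- faithful$waiting
n <- length(duration)

# Sum of products of deviations from the means, divided by n - 1
manual_cov <- sum((duration - mean(duration)) * (waiting - mean(waiting))) / (n - 1)

# Should agree with the built-in function
all.equal(manual_cov, cov(duration, waiting))
```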
Covariance Matrix
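As an illustration, calling cov() on a whole data frame returns the covariance matrix: each diagonal entry is the variance of one variable and each off-diagonal entry is the covariance of a pair. The sketch below uses the built-in faithful data set; the name S is an arbitrary choice.

```r
# Covariance matrix of all columns in the faithful data set
S <- cov(faithful)
S

# The matrix is symmetric, since cov(x, y) equals cov(y, x)
isSymmetric(S)

# The diagonal holds the variance of each variable
all.equal(unname(diag(S)), c(var(faithful$eruptions), var(faithful$waiting)))
```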
Eigenvectors
An eigenvector of a square matrix is a nonzero vector whose
direction is unchanged by the associated linear
transformation: the matrix only scales it, by a factor called
the eigenvalue.
R code
B <- matrix(1:9, 3)
eigen(B)
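One way to check the definition is to confirm that multiplying the matrix by an eigenvector gives the same result as scaling that vector by its eigenvalue. This is a small sketch using the first eigenpair returned by eigen(); the names e, v1 and lambda1 are illustrative.

```r
B <- matrix(1:9, 3)
e <- eigen(B)

# First eigenvalue and its eigenvector (stored as a column of e$vectors)
lambda1 <- e$values[1]
v1 <- e$vectors[, 1]

# B %*% v1 should equal lambda1 * v1, up to floating-point error
all.equal(as.vector(B %*% v1), lambda1 * v1)
```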
Principal Component Analysis
R code
# load data
a <- read.csv('my_data.csv')
# perform PCA
pca <- prcomp(a)
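Since my_data.csv is a placeholder, the following sketch uses the built-in faithful data set instead to show the link between prcomp() and the eigendecomposition of the covariance matrix. It assumes the default prcomp() settings (centering, no scaling); eigenvectors are compared in absolute value because their sign is arbitrary.

```r
# PCA with prcomp() and by eigendecomposition of the covariance matrix
pca <- prcomp(faithful)
eig <- eigen(cov(faithful))

# Variances of the principal components equal the eigenvalues
all.equal(pca$sdev^2, eig$values)

# Loadings equal the eigenvectors, up to sign
all.equal(abs(unname(pca$rotation)), abs(eig$vectors))

# Proportion of variance explained by each component
pca$sdev^2 / sum(pca$sdev^2)
```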