From W1-S16

Page 1

From W1-S16

Page 2

Node failure

• The probability that at least one node fails is: f = 1 − (1 − p)^n
• When n = 1, then f = p.
• Suppose p = 0.0001 but n = 10000; then f = 1 − (1 − 0.0001)^10000 ≈ 0.63. [Why/how? Because (1 − p)^n ≈ e^(−np) for small p, so f ≈ 1 − e^(−1) ≈ 0.63.]
• This is one of the most important formulas to know (in general). A numerical check follows this list.
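A minimal numerical check of the formula, as a Python sketch using the slide's values:

    import math

    p = 0.0001   # probability that a single node fails
    n = 10000    # number of nodes

    # Exact: probability that at least one of the n nodes fails
    f = 1 - (1 - p) ** n
    print(f)                      # ~0.632

    # Approximation: (1 - p)^n ~ e^(-np) for small p, so f ~ 1 - 1/e
    print(1 - math.exp(-n * p))   # ~0.632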

From W2-S9

Page 3

Example

• For example, suppose the hash function maps {to, Java, road} to one node. Then:
  – (to, 1) remains (to, 1)
  – (Java, 1); (Java, 1); (Java, 1) → (Java, [1, 1, 1])
  – (road, 1); (road, 1) → (road, [1, 1])
• Now the REDUCE function converts (Java, [1, 1, 1]) → (Java, 3), etc.
• Remember, this is a very simple example… the challenge is to take complex tasks and express them as Map and Reduce! (A word-count sketch follows this list.)
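A minimal word-count sketch of the same grouping-and-reducing step, as pure-Python stand-ins for the map, shuffle, and reduce phases (not an actual MapReduce runtime):

    from collections import defaultdict

    def map_phase(words):
        # Emit a (word, 1) pair for every word, as on the slide.
        return [(word, 1) for word in words]

    def shuffle_phase(pairs):
        # Group values by key: (Java,1);(Java,1);(Java,1) -> (Java, [1, 1, 1])
        grouped = defaultdict(list)
        for key, value in pairs:
            grouped[key].append(value)
        return grouped

    def reduce_phase(grouped):
        # (Java, [1, 1, 1]) -> (Java, 3)
        return {key: sum(values) for key, values in grouped.items()}

    pairs = map_phase(["to", "Java", "Java", "Java", "road", "road"])
    print(reduce_phase(shuffle_phase(pairs)))   # {'to': 1, 'Java': 3, 'road': 2}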

From W2-S15

Page 4

Similarity Example [2]

Notice that it requires some ingenuity to come up with key-value pairs. This is key to using map-reduce effectively.

From W2-S19

Page 5

K-means algorithm

Let C = initial k cluster centroids (often selected randomly)
Mark C as unstable
While <C is unstable>
    Assign all data points to their nearest centroid in C.
    Compute the centroids of the points assigned to each element of C.
    Update C as the set of new centroids.
    Mark C as stable or unstable by comparing with the previous set of centroids.
End While

Complexity: O(nkdI), where n = number of points, k = number of clusters, d = dimension, I = number of iterations. Take-away: the complexity is linear in n. (A runnable sketch follows.)
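Here is a minimal runnable sketch of the loop above, in plain Python (points and centroids are assumed to be tuples; math.dist needs Python 3.8+):

    import math

    def kmeans(points, centroids, max_iters=100):
        # Lloyd's algorithm: centroids is the initial set C from the slide.
        for _ in range(max_iters):
            # Assign every point to its nearest centroid in C.
            clusters = [[] for _ in centroids]
            for point in points:
                nearest = min(range(len(centroids)),
                              key=lambda i: math.dist(point, centroids[i]))
                clusters[nearest].append(point)
            # Compute the centroid of each cluster (keep the old one if empty).
            new_centroids = [
                tuple(sum(c) / len(cluster) for c in zip(*cluster))
                if cluster else centroids[i]
                for i, cluster in enumerate(clusters)
            ]
            # C is stable once the centroids stop moving.
            if new_centroids == centroids:
                break
            centroids = new_centroids
        return centroids, clusters

Each pass of the loop costs O(nkd) (n points, each compared against k centroids in d dimensions), giving the O(nkdI) total above.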

From W3-S14

Page 6

Example: 2 Clusters

[Figure: four points A(−1, 2), B(1, 2), C(−1, −2), D(1, −2), symmetric about the origin (0, 0); the candidate centroids are marked c.]

K-means Problem: the solution is (0, 2) and (0, −2), and the clusters are {A, B} and {C, D}.

K-means Algorithm: suppose the initial centroids are (−1, 0) and (1, 0); then {A, C} and {B, D} end up as the two clusters. (A check of this run follows.)
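Re-running this example with the kmeans sketch from Page 5 (the function name is the one assumed there) shows the algorithm converging immediately to the sub-optimal split:

    points = [(-1, 2), (1, 2), (-1, -2), (1, -2)]   # A, B, C, D
    centroids, clusters = kmeans(points, [(-1.0, 0.0), (1.0, 0.0)])
    print(centroids)   # [(-1.0, 0.0), (1.0, 0.0)] -- the centroids never move
    print(clusters)    # [[A, C], [B, D]], not the optimal {A, B}, {C, D}

This illustrates the slide's point: k-means only converges to a local optimum, so the result depends on the initial centroids.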

From W3-S16

Page 7

Bayes Rule

[Figure: the slide shows Bayes' rule with the prior and posterior labelled. In standard notation: P(H | D) = P(D | H) P(H) / P(D), where P(H) is the prior and P(H | D) is the posterior.]

From W4-S21

Page 8

Example: Iris Flower

• F = Flower; SL = Sepal Length; SW = Sepal Width; PL = Petal Length; PW = Petal Width
• Data: [training table shown on the slide; not recoverable in this preview]
• Query: SL = Large, SW = Small, PL = Medium, PW = Small, F = ?: compute the posterior for each flower class and choose the maximum. (A sketch follows.)
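A minimal naive Bayes sketch for this kind of categorical query. The training rows below are made up for illustration, since the slide's actual table is not recoverable here; the method itself (prior times the product of per-feature likelihoods, then choose the maximum) is the one from the slide:

    from collections import Counter, defaultdict

    # Hypothetical training data: (SL, SW, PL, PW) -> flower class F.
    data = [
        (("Large", "Small", "Medium", "Small"), "versicolor"),
        (("Small", "Large", "Small", "Small"), "setosa"),
        (("Large", "Small", "Large", "Large"), "virginica"),
        (("Small", "Large", "Small", "Small"), "setosa"),
    ]

    class_counts = Counter(label for _, label in data)
    feature_counts = defaultdict(int)   # (feature index, value, class) -> count
    for features, label in data:
        for i, value in enumerate(features):
            feature_counts[(i, value, label)] += 1

    def score(features, label):
        # P(F) * prod_i P(feature_i | F); the shared denominator P(data) cancels.
        # (Real uses add Laplace smoothing so zero counts don't zero the score.)
        result = class_counts[label] / len(data)
        for i, value in enumerate(features):
            result *= feature_counts[(i, value, label)] / class_counts[label]
        return result

    query = ("Large", "Small", "Medium", "Small")
    print(max(class_counts, key=lambda label: score(query, label)))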

From W4-S25

Page 9

Confusion Matrix

                       Actual Label (1)       Actual Label (-1)
Predicted Label (1)    True Positives (N1)    False Positives (N2)
Predicted Label (-1)   False Negatives (N3)   True Negatives (N4)

Label 1 is called Positive; label -1 is called Negative.

Let the number of test samples be N, so N = N1 + N2 + N3 + N4.

True Positive Rate (TPR) = N1 / (N1 + N3)
True Negative Rate (TNR) = N4 / (N4 + N2)
False Positive Rate (FPR) = N2 / (N2 + N4)
False Negative Rate (FNR) = N3 / (N1 + N3)
Accuracy = (N1 + N4) / (N1 + N2 + N3 + N4)
Precision = N1 / (N1 + N2)
Recall = N1 / (N1 + N3)

(A small sketch computing these follows.)
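A small sketch computing these rates directly from the four counts (plain Python; the argument names mirror the table, and the counts in the call are illustrative):

    def confusion_metrics(n1, n2, n3, n4):
        # n1 = TP, n2 = FP, n3 = FN, n4 = TN, exactly as in the table above.
        return {
            "TPR / recall": n1 / (n1 + n3),
            "TNR": n4 / (n4 + n2),
            "FPR": n2 / (n2 + n4),
            "FNR": n3 / (n1 + n3),
            "accuracy": (n1 + n4) / (n1 + n2 + n3 + n4),
            "precision": n1 / (n1 + n2),
        }

    print(confusion_metrics(n1=3, n2=2, n3=1, n4=3))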

From W5-S7

Page 10

ROC (Receiver Operating Characteristic) Curves

• Generally a learning algorithm A will return a real number… but what we want is a label {1 or -1}.
• We can apply a threshold T: predict 1 when the score is at least T, and -1 otherwise.

A (score):    0.7   0.6   0.5   0.2   0.1   0.09  0.08  0.02  0.01
True label:     1     1    -1    -1     1     1    -1    -1    -1
T = 0.1:        1     1     1     1     1    -1    -1    -1    -1   →  TPR = 3/4, FPR = 2/5
T = 0.2:        1     1     1     1    -1    -1    -1    -1    -1   →  TPR = 2/4, FPR = 2/5

Each threshold yields one (FPR, TPR) point; sweeping T traces out the ROC curve. (A sketch recomputing these points follows.)
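A minimal sketch that recomputes these two ROC points from the scores and labels above (plain Python; it predicts 1 when score >= T):

    scores = [0.7, 0.6, 0.5, 0.2, 0.1, 0.09, 0.08, 0.02, 0.01]
    labels = [1, 1, -1, -1, 1, 1, -1, -1, -1]

    def roc_point(threshold):
        preds = [1 if s >= threshold else -1 for s in scores]
        tp = sum(p == 1 and y == 1 for p, y in zip(preds, labels))
        fp = sum(p == 1 and y == -1 for p, y in zip(preds, labels))
        return tp / labels.count(1), fp / labels.count(-1)   # (TPR, FPR)

    print(roc_point(0.1))   # (0.75, 0.4) -> TPR = 3/4, FPR = 2/5
    print(roc_point(0.2))   # (0.5, 0.4)  -> TPR = 2/4, FPR = 2/5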

From W5-S9

Page 11

Random Variable

• A random variable X can take values in a set which is:
  – Discrete and finite. Example: toss a coin and let X = 1 if it is a head and X = 0 if it is a tail; X is a random variable.
  – Discrete and infinite (countable). Example: let X be the number of accidents in Sydney in a day; then X = 0, 1, 2, …
  – Infinite (uncountable). Example: let X be the height of a Sydney-sider; X could be 150, 150.11, 150.112, …

From W5-S13

Page 12

From W7-S2

These slides are from Pang-Ning Tan, Steinbach and Kumar.

Page 13

From W7-S7

Page 14

From W7-S8

Page 15

From W9-S9

Page 16

From W9-S12

Page 17

From W9-S21

Page 18

From W9-S26

Page 19

The Key Idea

• Decompose the User × Movie rating matrix R into:
• User × Movie = (User × Genre) × (Genre × Movie)
  – The number of genres is typically small.
• Or: R ≈ UV
• Find U and V such that ||R − UV|| is minimized…
  – Almost like k-means clustering… why? (A gradient-descent sketch follows this list.)
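A minimal sketch of minimizing ||R − UV|| by gradient descent (NumPy; the matrix sizes, learning rate, and iteration count are illustrative assumptions, not values from the slides):

    import numpy as np

    rng = np.random.default_rng(0)
    n_users, n_movies, k = 5, 4, 2             # k plays the role of "number of genres"
    R = rng.integers(1, 6, size=(n_users, n_movies)).astype(float)

    U = rng.normal(scale=0.1, size=(n_users, k))    # user x genre
    V = rng.normal(scale=0.1, size=(k, n_movies))   # genre x movie

    lr = 0.01
    for _ in range(2000):
        E = R - U @ V                                   # residual matrix
        U, V = U + lr * (E @ V.T), V + lr * (U.T @ E)   # gradient steps on ||R - UV||^2

    print(np.linalg.norm(R - U @ V))   # error shrinks toward the best rank-k fit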

From W11-S9

Page 20

UV Computation…

From W11-S15

This example is from Rajaraman, Leskovec and Ullman: see the textbook.