
6-hop myrrh example (from Damian).

Market agency targeting advertising to friends of customers:

Entities:
1. advertisements
2. markets
3. merchants
4. products
5. reviews
6. customers
7. friends

The hop logic would be:

1. Advertisements target specific markets.
2. Markets have particular merchants.
3. Merchants sell individual products.
4. Products have reviews.
5. Reviews are provided by customers.
6. Customers have friends.

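Relationship matrices (each 4×4, one per hop). Entity axes and their index labels:

D = Ads (2 3 4 5)
E = Markets (1 2 3 4)
F = Merchants (2 3 4 5)
G = Products (1 2 3 4)
H = Rating (2 3 4 5)
I = Customers (1 2 3 4)
J = Person (2 3 4 5)

Targets = Q(D,E):
1 1 0 1
0 0 0 1
1 1 0 1
1 1 0 0

IsUsedBy = R(E,F) and Sell = S(F,G). The two matrices are listed in their order of appearance on the slide; which one belongs to which relationship is not recoverable from the layout:
0 0 0 1
0 0 1 0
0 0 0 1
0 1 0 0

1 0 0 1
0 1 1 1
1 0 0 0
1 1 0 0

RatedAt(G,H):
0 0 0 1
1 0 1 0
0 0 0 1
0 1 0 1

AreGivenBy = U(H,I):
1 0 0 1
0 1 0 1
1 0 0 0
1 1 0 0

IsAFriend?(I,J):
0 0 0 1
1 0 1 0
0 0 0 1
0 1 0 1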
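The six hops can be chained by multiplying the relationship matrices with Boolean (AND/OR) arithmetic, which yields an Ads-by-Person matrix saying which ads reach which friends. The following is a minimal sketch, not the author's method: the helper names `bool_matmul` and `compose` are hypothetical, rows are assumed to index the first entity of each relationship, and the assignment of the two unlabeled matrices to R(E,F) and S(F,G) is an assumption.

```python
def bool_matmul(a, b):
    """Boolean matrix product: out[i][j] is 1 if some k has a[i][k] and b[k][j] both 1."""
    inner = len(b)
    return [[int(any(a[i][k] and b[k][j] for k in range(inner)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def compose(relations):
    """Chain a list of relationship matrices from left to right."""
    result = relations[0]
    for rel in relations[1:]:
        result = bool_matmul(result, rel)
    return result


# Bit matrices as they appear on the slide.
# ASSUMPTION: rows index the first entity, columns the second.
targets = [[1, 1, 0, 1], [0, 0, 0, 1], [1, 1, 0, 1], [1, 1, 0, 0]]       # Q(D,E)
# ASSUMPTION: the slide does not make clear which matrix is R and which is S.
is_used_by = [[0, 0, 0, 1], [0, 0, 1, 0], [0, 0, 0, 1], [0, 1, 0, 0]]    # R(E,F)
sell = [[1, 0, 0, 1], [0, 1, 1, 1], [1, 0, 0, 0], [1, 1, 0, 0]]          # S(F,G)
rated_at = [[0, 0, 0, 1], [1, 0, 1, 0], [0, 0, 0, 1], [0, 1, 0, 1]]      # RatedAt(G,H)
given_by = [[1, 0, 0, 1], [0, 1, 0, 1], [1, 0, 0, 0], [1, 1, 0, 0]]      # U(H,I)
is_friend = [[0, 0, 0, 1], [1, 0, 1, 0], [0, 0, 0, 1], [0, 1, 0, 1]]     # IsAFriend?(I,J)

# ads_to_friends[i][j] == 1 means ad i reaches person j through all six hops.
ads_to_friends = compose([targets, is_used_by, sell, rated_at, given_by, is_friend])
for row in ads_to_friends:
    print(row)
```

Because only reachability matters here, the product uses OR in place of addition; an ordinary integer product would instead count the number of distinct 6-hop paths.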


[Figure: scatter plot of r and v sample points with their class means mr and mv.]

FAUST Oblique formula: $P_{(X \circ d) < a}$, where X is any set of vectors.

To separate the r's from the v's using the midpoint of the means as the cutpoint, calculate a as follows.

Let $D \equiv \overrightarrow{m_r m_v}$ (the vector from mr to mv) and let $d = D/|D|$.

Viewing mr and mv as vectors (e.g., mr is the vector from the origin to the point mr),

$$a = \left(m_r + \frac{m_v - m_r}{2}\right) \circ d = \frac{m_r + m_v}{2} \circ d$$

What if d points away from the intersection of the cut-hyperplane (cut-line in this 2-D case) and the d-line, as it does for class = v, where $d = \overrightarrow{m_v m_r}/|\overrightarrow{m_v m_r}|$? Then a is the negative of the distance shown (the angle is obtuse, so its cosine is negative). But each $v \circ d$ is more negative than $a = \frac{m_r + m_v}{2} \circ d$, so we still want

$$v \circ d < \frac{1 \cdot m_v + 1 \cdot m_r}{1 + 1} \circ d, \quad \text{or} \quad v \circ d < \tfrac{1}{2}(m_v + m_r) \circ d$$

Next we will take a std-based alternative linear combination of mv and mr.
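The midpoint-of-means cutpoint can be sketched in a few lines. This is an illustrative sketch only: the function names (`unit_direction`, `midpoint_cut`, `classify`) and the 2-D sample values are hypothetical, and the code assumes mr and mv are distinct so that |D| is nonzero.

```python
import math


def dot(x, y):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))


def unit_direction(mr, mv):
    """d = D/|D| with D the vector from mr to mv (assumes mr != mv)."""
    D = [b - a for a, b in zip(mr, mv)]
    norm = math.sqrt(dot(D, D))
    return [c / norm for c in D]


def midpoint_cut(mr, mv):
    """Return (d, a) with a = ((mr + mv)/2) o d."""
    d = unit_direction(mr, mv)
    midpoint = [(a + b) / 2 for a, b in zip(mr, mv)]
    return d, dot(midpoint, d)


def classify(x, d, a):
    """Predict class r when x o d < a, otherwise class v."""
    return "r" if dot(x, d) < a else "v"


# Hypothetical class means, chosen so the arithmetic is easy to follow.
mr, mv = [0.0, 0.0], [2.0, 0.0]
d, a = midpoint_cut(mr, mv)
print(d, a)
print(classify([0.5, 3.0], d, a), classify([1.5, -2.0], d, a))
```

Swapping the roles of mr and mv flips the sign of d and of a together, so the same inequality still picks out the same side of the cut-hyperplane, which is the point made in the text about d pointing away from the intersection.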


Oblique FAUST (means midpoint, level-0 case):

Class means:
Class   R       G        ir1      ir2
1       62.83   95.29    108.12   89.50
2       48.84   39.91    113.89   118.31
3       87.48   105.50   110.60   87.46
4       77.41   90.94    95.61    75.35
5       59.59   62.27    83.02    69.95
7       69.01   77.42    81.59    64.13

Class stds:
Class   R    G    ir1   ir2
1       8    15   13    9
2       8    13   13    19
3       5    7    7     6
4       6    8    8     7
5       6    12   13    13
7       5    8    9     7

Oblique level-0 (Oblique without eliminating classes as they are predicted):
                   1's   2's   3's   4's   5's   7's
True Positives:    322   199   344   145   174   353
False Positives:    28     3    80   171   107    74

NonOblique level-0:
                   1's   2's   3's   4's   5's   7's
True Positives:     99   193   325   130   151   257

Class Totals:      461   224   397   211   237   470

NonOblique level-1 50%:
                   1's   2's   3's   4's   5's   7's
True Positives:    212   183   314   103   157   330
False Positives:    14     1    42   103    36   189


[Figure: scatter plot of r and v sample points with their class means mr and mv.]

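FAUST Oblique, coordinate-wise std ratio, level-0: $P_{X \circ d < a} = P_{\sum_i d_i X_i < a}$, with $D \equiv \overrightarrow{m_r m_v}$ and $d = D/|D|$.

To separate r from v using the coordinate-wise std-ratio cutpoint, calculate a as follows.

Viewing mr and mv as vectors, a = ( mr·[std ratio] + mv·[std ratio] ) ∘ d, where the two coordinate-wise std ratios are $\frac{std_r}{std_r + std_v}$ and $\frac{std_v}{std_r + std_v}$.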

Just as there is no median for a set of vectors, there is no std either. What the std expression above means is this: for each coordinate (dimension), calculate the stds of that coordinate's values over the r and v vector sets, then apply the resulting ratios to the corresponding coordinate values of mr and mv. Is that the same as projecting the r-set and v-set onto the d-line (using R∘d and V∘d) and then using the stds of those projection ("shadow") lengths to adjust the cutpoint? It is different, and the projection-based calculation appears to be the better way to do it. The next slide shows that approach.


[Figure: r and v sample points, their class means mr and mv, and the projections of the points and means onto the d-line, with pmr and pmv marked.]

FAUST Oblique: $P_{X \circ d < a} = P_{\sum_i d_i X_i < a}$, where X is any set of vectors, $D \equiv \overrightarrow{m_r m_v}$ and $d = D/|D|$.

To separate r from v using the means and stds of the projections as the cutpoint, calculate a as follows:


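$$a = \mathrm{pmr} + (\mathrm{pmv} - \mathrm{pmr})\,\frac{\mathrm{pstdr}}{\mathrm{pstdr} + \mathrm{pstdv}} = \frac{\mathrm{pmr}\cdot\mathrm{pstdr} + \mathrm{pmr}\cdot\mathrm{pstdv} + \mathrm{pmv}\cdot\mathrm{pstdr} - \mathrm{pmr}\cdot\mathrm{pstdr}}{\mathrm{pstdr} + \mathrm{pstdv}}$$

By pmr we mean the projected distance $m_r \circ d$ shown in the figure, which is also $\mathrm{mean}\{r \circ d \mid r \in R\}$. By pstdr we mean $\mathrm{std}\{r \circ d \mid r \in R\}$.

Next?

$$a = \mathrm{pmr} + (\mathrm{pmv} - \mathrm{pmr})\,\frac{2\,\mathrm{pstdr}}{\mathrm{pstdv} + 2\,\mathrm{pstdr}} = \frac{\mathrm{pmr}\cdot 2\,\mathrm{pstdr} + \mathrm{pmr}\cdot\mathrm{pstdv} + \mathrm{pmv}\cdot 2\,\mathrm{pstdr} - \mathrm{pmr}\cdot 2\,\mathrm{pstdr}}{2\,\mathrm{pstdr} + \mathrm{pstdv}}$$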
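The projection-based cutpoint, including the variant that multiplies pstdr by 2, can be sketched as below. This is a hedged illustration: the function name `projection_cut`, the multiplier argument `k`, and the sample points are hypothetical, and the population standard deviation (`statistics.pstdev`) is an assumption, since the slides do not say whether a population or sample std is used. The sketch also assumes the two projected stds are not both zero.

```python
import math
from statistics import mean, pstdev


def dot(x, y):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))


def unit_direction(mr, mv):
    """d = D/|D| with D the vector from mr to mv (assumes mr != mv)."""
    D = [b - a for a, b in zip(mr, mv)]
    norm = math.sqrt(dot(D, D))
    return [c / norm for c in D]


def projection_cut(r_points, v_points, k=1.0):
    """Return (d, a) with a = pmr + (pmv - pmr) * k*pstdr / (k*pstdr + pstdv).

    k = 1 gives the first formula above, k = 2 the doubled-pstdr variant.
    """
    dims = len(r_points[0])
    mr = [mean([p[i] for p in r_points]) for i in range(dims)]
    mv = [mean([p[i] for p in v_points]) for i in range(dims)]
    d = unit_direction(mr, mv)
    r_proj = [dot(p, d) for p in r_points]   # {r o d | r in R}
    v_proj = [dot(p, d) for p in v_points]   # {v o d | v in V}
    pmr, pmv = mean(r_proj), mean(v_proj)
    weighted_pstdr = k * pstdev(r_proj)      # ASSUMPTION: population std
    pstdv = pstdev(v_proj)
    a = pmr + (pmv - pmr) * weighted_pstdr / (weighted_pstdr + pstdv)
    return d, a


# Hypothetical points lying on the x-axis so the projections are easy to read.
r_points = [(0.0, 0.0), (2.0, 0.0)]
v_points = [(9.0, 0.0), (11.0, 0.0)]
d1, a1 = projection_cut(r_points, v_points, k=1.0)
d2, a2 = projection_cut(r_points, v_points, k=2.0)
print(a1, a2)
```

With equal projected stds and k = 1 the cutpoint reduces to the midpoint of the projected means; raising k moves the cutpoint toward pmv and so widens the region assigned to r.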

In this case the predicted classes will overlap (i.e., a given sample point may be assigned to multiple classes), so we will have to order the class predictions.
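One way to read "ordering the class predictions" is: test the classes in a fixed order and, once a sample has been claimed by a class, eliminate it from consideration for later classes. The sketch below assumes that reading, which the slides do not spell out; the function name `assign_with_elimination` and the toy masks are hypothetical.

```python
def assign_with_elimination(masks, order):
    """Assign each sample to the first class, in the given order, whose mask claims it.

    masks: dict mapping class label -> list of booleans, one per sample
           (True where that class's cut predicate holds). Masks may overlap.
    order: class labels in the order they are predicted and eliminated.
    Returns a list with one label per sample, or None if no class claims it.
    """
    n = len(next(iter(masks.values())))
    assigned = [None] * n
    for cls in order:
        for i, hit in enumerate(masks[cls]):
            # A sample already claimed by an earlier class is skipped.
            if hit and assigned[i] is None:
                assigned[i] = cls
    return assigned


# Toy example: sample 1 is claimed by both classes, sample 3 by neither.
masks = {
    1: [True, True, False, False],
    2: [False, True, True, False],
}
print(assign_with_elimination(masks, order=[2, 1]))
print(assign_with_elimination(masks, order=[1, 2]))
```

The overlapping sample goes to whichever class comes first, so the chosen order directly affects the true-positive and false-positive counts.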


Oblique FAUST (level-0 case):

Class means:
Class   R       G        ir1      ir2
1       62.83   95.29    108.12   89.50
2       48.84   39.91    113.89   118.31
3       87.48   105.50   110.60   87.46
4       77.41   90.94    95.61    75.35
5       59.59   62.27    83.02    69.95
7       69.01   77.42    81.59    64.13

Class stds:
Class   R    G    ir1   ir2
1       8    15   13    9
2       8    13   13    19
3       5    7    7     6
4       6    8    8     7
5       6    12   13    13
7       5    8    9     7

Oblique level-0 using midpoints of means:
                   1's   2's   3's   4's   5's   7's
True Positives:    322   199   344   145   174   353
False Positives:    28     3    80   171   107    74

NonOblique level-0:
                   1's   2's   3's   4's   5's   7's
True Positives:     99   193   325   130   151   257

Class Totals:      461   224   397   211   237   470

NonOblique level-1 50%:
                   1's   2's   3's   4's   5's   7's
True Positives:    212   183   314   103   157   330
False Positives:    14     1    42   103    36   189

Oblique level-0 using means and stds of projections (without class elimination):
                   1's   2's   3's   4's   5's   7's
True Positives:    359   205   332   144   175   324
False Positives:    29    18    47   156   131    58

Oblique level-0, means and stds of projections (with class elimination in 2,3,4,5,7,1 order). Note that no elimination occurs!
                   1's   2's   3's   4's   5's   7's
True Positives:    359   205   332   144   175   324
False Positives:    29    18    47   156   131    58


FAUST Oblique: $P_{X \circ d < a} = P_{\sum_i d_i X_i < a}$, where X is any set of vectors, $D \equiv \overrightarrow{m_r m_v}$ and $d = D/|D|$.

To separate r from v, use the means and stds of the projections as the cutpoint, with pstdr doubled:

[Figure: r and v sample points, their class means mr and mv, and the projections of the points and means onto the d-line, with pmr and pmv marked.]

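$$a = \mathrm{pmr} + (\mathrm{pmv} - \mathrm{pmr})\,\frac{2\,\mathrm{pstdr}}{\mathrm{pstdv} + 2\,\mathrm{pstdr}} = \frac{\mathrm{pmr}\cdot 2\,\mathrm{pstdr} + \mathrm{pmr}\cdot\mathrm{pstdv} + \mathrm{pmv}\cdot 2\,\mathrm{pstdr} - \mathrm{pmr}\cdot 2\,\mathrm{pstdr}}{2\,\mathrm{pstdr} + \mathrm{pstdv}}$$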

Oblique level-0 using means and stds of projections:
                   1's   2's   3's   4's   5's   7's
True Positives:    359   205   332   144   175   324
False Positives:    29    18    47   156   131    58

Oblique level-0 using means and stds of projections, doubling pstdr as above:
                   1's   2's   3's   4's   5's   7's
True Positives:    410   212   277   179   199   324
False Positives:   114    40   113   259   235    58

Class Totals:      461   224   397   211   237   470

Oblique level-0, means and stds of projections, doubling pstdr, classifying and eliminating in 2,3,4,5,7,1 order:
                   1's   2's   3's   4's   5's   7's
True Positives:    309   212   277   154   163   248
False Positives:    22    40    65   211   196    27

So the number of FPs is drastically reduced and the number of TPs is somewhat reduced. Is that better? If we parameterize the multiplier (2 above) and adjust it to maximize TPs and minimize FPs, what is the optimal multiplier value?
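To make the "is that better?" comparison concrete, the reported counts can be summed and turned into precision and recall. The sketch below uses only the TP, FP and class-total figures from the tables; the choice of precision, recall and F1 as the criteria is an assumption added here for illustration, not something the slides prescribe, and the name `summarize` is hypothetical.

```python
# Class totals for classes 1, 2, 3, 4, 5, 7, as reported on the slide.
TOTALS = [461, 224, 397, 211, 237, 470]

# True/false positive counts per class, copied from the result tables.
RUNS = {
    "projections": {
        "tp": [359, 205, 332, 144, 175, 324],
        "fp": [29, 18, 47, 156, 131, 58],
    },
    "doubled pstdr": {
        "tp": [410, 212, 277, 179, 199, 324],
        "fp": [114, 40, 113, 259, 235, 58],
    },
    "doubled pstdr + elimination": {
        "tp": [309, 212, 277, 154, 163, 248],
        "fp": [22, 40, 65, 211, 196, 27],
    },
}


def summarize(tp, fp, totals=TOTALS):
    """Aggregate counts over all classes (micro-averaged precision, recall, F1)."""
    tp_sum, fp_sum = sum(tp), sum(fp)
    precision = tp_sum / (tp_sum + fp_sum)
    recall = tp_sum / sum(totals)
    f1 = 2 * precision * recall / (precision + recall)
    return {"tp": tp_sum, "fp": fp_sum,
            "precision": precision, "recall": recall, "f1": f1}


SUMMARY = {name: summarize(run["tp"], run["fp"]) for name, run in RUNS.items()}

for name, s in SUMMARY.items():
    print(f"{name:30s} TP={s['tp']:5d} FP={s['fp']:5d} "
          f"P={s['precision']:.3f} R={s['recall']:.3f} F1={s['f1']:.3f}")
```

A search for the best multiplier could reuse `summarize` as the objective, re-running the classifier for each candidate value and keeping the one with the highest score under whichever criterion is chosen.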

Code question: In a case statement, if the 1st case is true, does the 2nd case get evaluated? It doesn't if coded as:

If (case 1) then action1
Else if (case 2) then action2
Else if (case 3) then action3, etc.
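The if / else-if behavior can be checked directly. The sketch below uses Python as a stand-in, since the slides do not say which language the classifier is written in; `check` and `dispatch` are hypothetical helpers that record which conditions were actually evaluated.

```python
evaluated = []


def check(name, result):
    """Record that a condition was evaluated, then return its truth value."""
    evaluated.append(name)
    return result


def dispatch():
    # Every condition would be true, but only the first one is ever tested.
    if check("case 1", True):
        return "action1"
    elif check("case 2", True):
        return "action2"
    elif check("case 3", True):
        return "action3"
    return None


chosen = dispatch()
print(chosen, evaluated)
```

In an if / else-if chain the later conditions are skipped once one branch is taken. A native case or switch construct may behave differently depending on the language (C's `switch`, for example, falls through to the next case unless a `break` is written), so the chain form is the safer way to guarantee that classes are tested in order and only until the first match.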