
DETERMINANTAL POINT PROCESSES FOR NATURAL LANGUAGE PROCESSING

Jennifer Gillenwater
Joint work with Alex Kulesza and Ben Taskar

OUTLINE

Motivation & background on DPPs
Large-scale settings
Structured summarization
Other potential NLP applications

MOTIVATION & BACKGROUND

SUMMARIZATION

Quality: relevance to the topic
Diversity: coverage of core ideas

SUBSET SELECTION

AREA AS SET-GOODNESS

feature space: items i and j represented by vectors B_i, B_j

quality = \sqrt{B_i^\top B_i}
similarity = B_i^\top B_j

area = \sqrt{\|B_i\|_2^2 \|B_j\|_2^2 - (B_i^\top B_j)^2}

VOLUME AS SET-GOODNESS

area = \sqrt{\|B_i\|_2^2 \|B_j\|_2^2 - (B_i^\top B_j)^2}

length = \|B_i\|_2
volume = base \times height
vol(B) = \|B_1\|_2 \, vol(proj_{\perp B_1}(B_{2:N}))

AREA AS A DET

area = \sqrt{\|B_i\|_2^2 \|B_j\|_2^2 - (B_i^\top B_j)^2}
     = \det \begin{pmatrix} \|B_i\|_2^2 & B_i^\top B_j \\ B_i^\top B_j & \|B_j\|_2^2 \end{pmatrix}^{1/2}
     = \det\big( [B_i \; B_j]^\top [B_i \; B_j] \big)^{1/2}

VOLUME AS A DET

vol(B_{\{i,j\}}) = \det\big( [B_i \; B_j]^\top [B_i \; B_j] \big)^{1/2}

vol(B) = \det\big( [B_1 \cdots B_N]^\top [B_1 \cdots B_N] \big)^{1/2}

vol(B)^2 = \det(B^\top B) = \det(L)
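The last identity is easy to check numerically. A minimal sketch (Python with NumPy; the random B is purely illustrative), verifying both vol(B)^2 = det(B^T B) and the base-times-height recursion from the previous slide:

    import numpy as np

    rng = np.random.default_rng(0)
    D, N = 5, 3                      # feature dimension, number of items
    B = rng.normal(size=(D, N))      # columns B_1, ..., B_N (illustrative)

    L = B.T @ B                      # the DPP kernel L, with L_ij = B_i^T B_j
    vol_squared = np.linalg.det(L)

    # Cross-check via base x height: ||B_1|| times the volume of the
    # remaining columns projected orthogonally to B_1.
    b1 = B[:, 0]
    P = np.eye(D) - np.outer(b1, b1) / (b1 @ b1)   # projector onto b1's complement
    rest = P @ B[:, 1:]
    vol_rec = np.linalg.norm(b1) * np.sqrt(np.linalg.det(rest.T @ rest))

    assert np.isclose(vol_squared, vol_rec ** 2)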

COMPLEX STATISTICS

N items ⟹ 2^N sets

EFFICIENT COMPUTATION

\sum_Y \det(L_Y) over all 2^N sets: computable in O(N^3)
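Concretely, the exponential sum collapses to a single determinant, \sum_Y \det(L_Y) = \det(L + I) (the normalizer that appears two slides below), which costs O(N^3). A minimal brute-force check for small N, assuming NumPy:

    import numpy as np
    from itertools import combinations

    rng = np.random.default_rng(0)
    N = 8
    B = rng.normal(size=(5, N))
    L = B.T @ B

    # Sum det(L_Y) over all 2^N subsets Y (det of the 0x0 matrix is 1).
    brute = sum(
        np.linalg.det(L[np.ix_(Y, Y)])
        for k in range(N + 1)
        for Y in combinations(range(N), k)
    )
    assert np.isclose(brute, np.linalg.det(L + np.eye(N)))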

POINT PROCESSES

Y = {1, . . . , N}

A point process assigns a probability to every subset of Y, e.g. P(·) = 0.2 for one particular subset.

DETERMINANTAL

A determinantal point process scores a set by the determinant of the corresponding submatrix of an N × N kernel L:

P({2, 3, 5}) \propto \det \begin{pmatrix} L_{22} & L_{23} & L_{25} \\ L_{32} & L_{33} & L_{35} \\ L_{52} & L_{53} & L_{55} \end{pmatrix}

P({2, 3, 5}) = \det(L_{\{2,3,5\}}) / \det(L + I)
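A minimal sketch (assuming NumPy) of this subset probability; the kernel here is random and purely illustrative:

    import numpy as np

    rng = np.random.default_rng(0)
    B = rng.normal(size=(4, 5))
    L = B.T @ B                              # a valid (PSD) 5x5 DPP kernel

    Y = [1, 2, 4]                            # items {2, 3, 5}, zero-indexed
    L_Y = L[np.ix_(Y, Y)]
    p = np.linalg.det(L_Y) / np.linalg.det(L + np.eye(5))
    print(p)                                 # P({2, 3, 5})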

EFFICIENT INFERENCE

Normalizing:    P_L(\mathbf{Y} = Y)
Marginalizing:  P(Y \subseteq \mathbf{Y})
Conditioning:   P_L(\mathbf{Y} = B \mid A \subseteq \mathbf{Y}),  P_L(\mathbf{Y} = B \mid A \cap \mathbf{Y} = \emptyset)
Sampling:       Y \sim P_L

All computable in O(N^3).
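The slides do not spell out how marginalization works; one standard route (an assumption here, though well established for DPPs) is the marginal kernel K = L(L + I)^{-1}, for which P(A \subseteq \mathbf{Y}) = \det(K_A). A minimal sketch assuming NumPy:

    import numpy as np

    rng = np.random.default_rng(0)
    B = rng.normal(size=(4, 6))
    L = B.T @ B
    N = L.shape[0]

    # Marginal kernel K = L (L + I)^{-1}, one O(N^3) computation.
    K = L @ np.linalg.inv(L + np.eye(N))
    A = [0, 3]
    print(np.linalg.det(K[np.ix_(A, A)]))    # P({1, 4} is included in Y)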

LARGE-SCALE SETTINGS

DUAL KERNEL
KULESZA AND TASKAR (NIPS 2010)

B = [B_1 B_2 B_3 ... B_N]   (D × N feature matrix)

L = B^\top B   (N × N)

C = B B^\top   (D × D)

DUAL INFERENCE

L = V \Lambda V^\top,   C = \hat{V} \Lambda \hat{V}^\top,   V = B^\top \hat{V} \Lambda^{-1/2}

Normalizing: \sum_Y \det(L_Y) in O(D^3)
Marginalizing & Conditioning: O(D^3 + D^2 k^2)
Sampling: Y \sim P_L in O(N D^2 k)
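A minimal sketch (assuming NumPy) of the dual trick: eigendecompose the small D × D kernel C instead of the N × N kernel L; the nonzero eigenvalues agree, and L's eigenvectors are recovered as V = B^\top \hat{V} \Lambda^{-1/2}. The dimensions are illustrative:

    import numpy as np

    rng = np.random.default_rng(0)
    D, N = 3, 1000
    B = rng.normal(size=(D, N))

    C = B @ B.T                                  # D x D dual kernel
    lam, V_hat = np.linalg.eigh(C)               # O(D^3) instead of O(N^3)
    V = B.T @ V_hat @ np.diag(lam ** -0.5)       # eigenvectors of L = B^T B

    L = B.T @ B                                  # built here only to verify
    assert np.allclose(L @ V, V * lam)           # (lam, V) are eigenpairs of L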

EXPONENTIAL N

N = O({sentence length}^{sentence length})

We want to select a diverse set of parses.

N = O({node degree}^{path length})

STRUCTURE FACTORIZATIONKULESZA AND TASKAR (NIPS 2010)

i =

STRUCTURE FACTORIZATION

Bi = q(i)�(i)

KULESZA AND TASKAR (NIPS 2010)

quality similarity

i =

STRUCTURE FACTORIZATION

Bi = q(i)�(i)

i = {i↵}↵2F↵

c = 1

KULESZA AND TASKAR (NIPS 2010)

quality similarity

i =

STRUCTURE FACTORIZATION

Bi = q(i)�(i)

i = {i↵}↵2F↵

c = 2

KULESZA AND TASKAR (NIPS 2010)

quality similarity

i =

STRUCTURE FACTORIZATION

i = {i↵}↵2F

Bi =

Q↵2F

q(i↵)

��(i)

c = 2

KULESZA AND TASKAR (NIPS 2010)

i =

STRUCTURE FACTORIZATION

i = {i↵}↵2F

Bi =

Q↵2F

q(i↵)

� P↵2F

�(i↵)

c = 2

KULESZA AND TASKAR (NIPS 2010)

i =

STRUCTURE FACTORIZATION

i = {i↵}↵2F

Bi =

Q↵2F

q(i↵)

� P↵2F

�(i↵)

c = 2

O(ND2k)Y ⇠ PL

KULESZA AND TASKAR (NIPS 2010)

STRUCTURE FACTORIZATION

Bi =

Q↵2F

q(i↵)

� P↵2F

�(i↵)

M = R =

c = 2

Y ⇠ PL O(D2k3 +Dk2M cR)

KULESZA AND TASKAR (NIPS 2010)

STRUCTURE FACTORIZATION

Bi =

Q↵2F

q(i↵)

� P↵2F

�(i↵)

M = R =

c = 2

Y ⇠ PL O(D2k3 +Dk2M cR)

KULESZA AND TASKAR (NIPS 2010)

M cR = 42 ⇤ 12 = 192 ⌧ N = 412 = 16,777,216
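A minimal sketch (assuming NumPy) of the factored construction of B_i: quality multiplies over factors, similarity features add over factors. The unit-normalization of \phi, which keeps quality and similarity separate, is an assumed convention here, and the factor values are toy numbers:

    import numpy as np

    def structure_vector(factor_qualities, factor_features):
        """B_i = (prod over factors of q(i_a)) * (sum over factors of phi(i_a))."""
        q = np.prod(factor_qualities)            # q(i) multiplies over factors
        phi = np.sum(factor_features, axis=0)    # phi(i) adds over factors
        phi = phi / np.linalg.norm(phi)          # unit-normalize (assumed convention)
        return q * phi

    # e.g. a structure with three factors, each carrying a local quality
    # score and a local feature vector (toy values):
    B_i = structure_vector([0.9, 0.7, 0.8], np.eye(3))
    print(B_i)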

LARGE FEATURE SETS?

                           N = # of items
                           Large    Exponential
D = # of      Small        dual     dual + structure
features      Large        ?        ?

RANDOM PROJECTIONS
GILLENWATER, KULESZA, AND TASKAR (EMNLP 2012)

Reduce the feature dimension from D to d ≪ D by multiplying the feature matrix (D × M^c R in the structured case) by a random d × D matrix G.

VOLUME PRESERVATION

JOHNSON AND LINDENSTRAUSS (1984): projecting to d = O(log N) random dimensions approximately preserves pairwise distances.

MAGEN AND ZOUZIAS (2008): a similar random projection approximately preserves the volumes spanned by small subsets of points.

DPP PRESERVATION
GILLENWATER, KULESZA, AND TASKAR (EMNLP 2012)

vol^2 = det, so preserving volumes for k = 1, 2, 3, ... preserves the determinants that define DPP probabilities.

d = O\Big( \max\Big\{ \frac{k}{\epsilon}, \; \frac{\log(1/\delta) + \log N}{\epsilon^2} + k \Big\} \Big)

(k = subset size, N = total # of items)

w.p. 1 - \delta:   \| P^k - \tilde{P}^k \|_1 \le e^{6 k \epsilon} - 1

[Plot: L1 variational distance between the original and projected k-DPP, and memory use in bytes, as a function of projection dimension]
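The figure's message can be reproduced in miniature: project the features with a random Gaussian G and check that subset determinants barely move. A minimal sketch assuming NumPy (the dimensions are illustrative, not the paper's):

    import numpy as np

    rng = np.random.default_rng(0)
    D, d, N = 1000, 100, 50
    B = rng.normal(size=(D, N)) / np.sqrt(D)  # columns with roughly unit norm

    G = rng.normal(size=(d, D)) / np.sqrt(d)  # random Gaussian projection
    B_proj = G @ B

    Y = [3, 7, 19]                            # a fixed subset of size k = 3
    for M in (B, B_proj):
        L_Y = (M.T @ M)[np.ix_(Y, Y)]
        print(np.linalg.det(L_Y))             # roughly preserved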

STRUCTURED SUMMARIZATION

NEWS THREADING

March 28: Health officials confirm Ebola outbreak in Guinea's capital
August 8: World Health Organization declares Ebola epidemic an international health emergency
September 2: GlaxoSmithKline begins Ebola vaccine drug trial

N ≈ 10^360 possible threads over M ≈ 35,000 articles

PROJECTING NEWS FEATURES

\phi(i), D = 36,356   →   G\phi(i), d = 50
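A minimal sketch (assuming NumPy) of this projection step at the slide's dimensions; the features are random stand-ins for the real news features, and the 1/\sqrt{d} scaling of G is an assumption, since the slides do not give it:

    import numpy as np

    rng = np.random.default_rng(0)
    D, d, N = 36_356, 50, 200
    phi = rng.random((D, N))                   # stand-in for news features phi(i)

    G = rng.normal(size=(d, D)) / np.sqrt(d)   # random projection (scaling assumed)
    B = G @ phi                                # d x N, used in place of phi
    print(B.shape)                             # (50, 200)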

DPP THREADS

Timeline: Jan 08, Jan 28, Feb 17, Mar 09, Mar 29, Apr 18, May 08, May 28, Jun 17

pope vatican church parkinson
israel palestinian iraqi israeli gaza abbas baghdad
owen nominees senate democrats judicial filibusters
social tax security democrats rove accounts
iraq iraqi killed baghdad arab marines deaths forces

Example thread (pope vatican church parkinson):
Feb 24: Parkinson's Disease Increases Risks to Pope
Feb 26: Pope's Health Raises Questions About His Ability to Lead
Mar 13: Pope Returns Home After 18 Days at Hospital
Apr 01: Pope's Condition Worsens as World Prepares for End of Papacy
Apr 02: Pope, Though Gravely Ill, Utters Thanks for Prayers
Apr 18: Europeans Fast Falling Away from Church
Apr 20: In Developing World, Choice [of Pope] Met with Skepticism
May 18: Pope Sends Message with Choice of Name

QUANTITATIVE RESULTS

System        k-means   DTM      DPP
ROUGE-1F      16.5      14.7     17.2
R-SU4F        3.76      3.44     3.98
Coherence     2.73      3.2      3.3
Runtime (s)   626       19,434   252

OTHER POTENTIAL NLP APPLICATIONS

POTENTIAL APP: RE-RANKING

• Parser: a simple model with local features defines basic scores for all possible parse trees

• Re-ranker: a more complex model with non-local features provides more refined scores

• Typical pipeline: find the k highest-scoring parses under the simple model, then score these k with the more complex model and output the best

• Issue: the k parses may be largely redundant, so the re-ranker never gets to consider significantly different parses

IDEA: USE DPPS FOR SELECTING RE-RANKER INPUT

N = O({sentence length}^{sentence length})

We want to select a diverse set of parses.

Quality: standard parser scores
Diversity: edge lengths, POS pairs, etc.
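A minimal sketch (assuming NumPy) of the kernel this suggests, combining quality and diversity via B_i = q(i) \phi(i) as in the structure-factorization slide. All values below are random stand-ins for real parser output:

    import numpy as np

    rng = np.random.default_rng(0)
    N, D = 100, 20                           # candidate parses, diversity features
    scores = rng.normal(size=N)              # stand-in for log parser scores
    q = np.exp(scores)                       # quality: positive parser scores
    phi = rng.normal(size=(D, N))            # stand-in for edge-length/POS features
    phi /= np.linalg.norm(phi, axis=0)       # unit-normalize diversity features

    B = phi * q                              # B_i = q(i) * phi(i)
    L = B.T @ B                              # L_ij = q_i q_j phi_i^T phi_j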

POTENTIAL APP: WORD SENSE INDUCTION

• Goal: identify all possible senses of ambiguous words (e.g. river bank vs. bank deposit)

• Typical approach: unsupervised clustering, where cluster centers represent word senses

• Why DPPs fit: finding cluster centers can be re-expressed as the problem of finding a high-quality, diverse set

Quality: centrality (density of points nearby)
Diversity: same as standard WSI features

QUESTIONS?
