Update - 26 Jan 2009

  • 8/14/2019 Update - 26 Jan 2009

    Product Review Summarization

    Ly Duy Khang

  • Recall: Summarization system

    2. Product facet identification

    3. Facet-oriented sentence clustering

    4. In-focused snippet summarization

  • Recall: Product facet identification (1/2)

    Statistical measurement (ARM) to extract frequent

    Noise + facet coverage + no implicit facets
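As a rough illustration of frequency-based extraction, the sketch below counts sentence-level support for each term and keeps the terms above a threshold. It assumes ARM refers to association rule mining and only covers the support-counting step; the function name `frequent_terms`, the `min_support` parameter, and the sample sentences are hypothetical and not taken from the slides.

```python
from collections import Counter


def frequent_terms(sentences, min_support=0.5):
    """Return terms whose sentence-level support is at least min_support."""
    n = len(sentences)
    counts = Counter()
    for sentence in sentences:
        # Count each term once per sentence (support, not raw frequency).
        counts.update(set(sentence.lower().split()))
    return {term for term, c in counts.items() if c / n >= min_support}


# Hypothetical review sentences, for illustration only.
reviews = [
    "battery life is great",
    "battery drains fast",
    "screen is small",
    "battery life could be better",
]
print(sorted(frequent_terms(reviews, min_support=0.5)))
```

In this toy run a function word such as "is" passes the threshold alongside "battery" and "life", which is one way the noise problem noted above can show up when no filtering is applied.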

  • Recall: Product facet identification (2/2)

    Identify facets manually

    would trigger that facet

    > Noise + facet coverage + no implicit facets

  • Recall: Facet-oriented sentence clustering

    Facet is labeled based on keyword occurrence
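A minimal sketch of labeling by keyword occurrence might look like the following. The keyword lists in `FACET_KEYWORDS` and the function name `label_facets` are assumptions made for illustration; the slides do not say how the keywords are stored or matched.

```python
def label_facets(sentence, facet_keywords):
    """Return the facets that have at least one keyword in the sentence."""
    # Crude tokenization: lowercase, strip commas and periods, split on spaces.
    cleaned = sentence.lower().replace(",", " ").replace(".", " ")
    tokens = set(cleaned.split())
    return sorted(
        facet for facet, keywords in facet_keywords.items() if tokens & keywords
    )


# Hypothetical keyword lists, for illustration only.
FACET_KEYWORDS = {
    "battery": {"battery", "charge"},
    "screen": {"lcd", "screen", "display"},
}

print(label_facets("the lcd screen is a little too small.", FACET_KEYWORDS))
```

A sentence can receive several labels under this scheme, or none at all when no keyword occurs.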

  • In-focused snippet summarization: Problem formulation (1/2)

    Cluster sentences belonging to each facet.

    Output: Sentence ranking + grouping

    Sentence polarity

    In-focused snippet representation

  • In-focused snippet summarization: Problem formulation (2/2)

    Battery:
    ##sen1
    ##sen2
    ##sen3
    ##sen4
    ##sen5
    ##sen6
    ##senN

    Battery:
    +sen1 (and 5 similarities)
    sen3 (and 0 similarities)
    +sen5 (and 1 similarities)
    sen6 (and 2 similarities)
    +sen2 (and 2 similarities)
    Ungrouped:
    +sen7
    sen10

    the drawbacks were that they were not user friendly for the casual photographer, the lcd screen is a little too small.

    the lcd screen is too small.

  • In-focused snippet summarization: Methodology

    Editing

    Clustering

    Ranking

  • Methodology: Editing (1/13)

    Jing, H. (2000). Sentence reduction for automatic

    Machine learning technique

    Features:

    Grammar checking (Integrated lexicon)

    Context information

  • Methodology: Editing (2/13)

    Knight, K., & Marcu, D. (2000). Statistics-based

    Noisy channel model

    The model is learned by machine learning

    Syntactic tree

  • Methodology: Editing (3/13)

    General-purpose compression

    targeted part of the sentence.

  • Methodology: Editing (4/13)

    Initialize a focused part of the sentence

    minimum yet meaningful.

  • Methodology: Editing (5/13)

    the drawbacks were that they were not user friendly for the casual photographer, the lcd screen is a little too small.

    the lcd screen is too small.

  • Methodology: Editing (6/13)

    Consider that a sentence can be represented as a sequence of words in the following form:

    NNNNNN00111000NNN

    1: We want to keep the word at this position

    0: We want to remove the word at this position

    N: We haven't decided whether to keep the word at this position or not
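To make the 1/0/N encoding concrete, here is a small sketch that applies such a mask to a tokenized sentence. The helper name `apply_mask` and the particular mask are assumptions for illustration; the mask was chosen by hand so that the kept words match the reduced LCD example used earlier in the deck.

```python
def apply_mask(words, mask):
    """Split words into (kept, removed, undecided) lists using a 1/0/N mask."""
    if len(words) != len(mask):
        raise ValueError("mask must have one symbol per word")
    groups = {"1": [], "0": [], "N": []}
    for word, symbol in zip(words, mask):
        # Any symbol other than 1, 0 or N raises KeyError here.
        groups[symbol].append(word)
    return groups["1"], groups["0"], groups["N"]


words = "the lcd screen is a little too small .".split()
# Hand-picked mask: drop "a" and "little", keep everything else.
kept, removed, undecided = apply_mask(words, "111100111")
print(" ".join(kept))
```

A fully decided mask contains no N symbols, so the undecided list is empty in this example.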

  • Methodology: Editing (7/13)

    [Figure: search tree of partially decided sequences]

    T=0: N1NN

    T=1: N10N, N11N, N1N0, N1N1, 01NN

    T=2: N100, N101, 010N, 110N, 11N1
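One way to read the step from T to T+1 is that exactly one undecided position gets a 0 or a 1. The sketch below enumerates those successor states under that assumption; the function name `successors` is hypothetical, and the slides may prune some of these states.

```python
def successors(state):
    """All states reachable by deciding exactly one undecided ('N') position."""
    result = []
    for i, symbol in enumerate(state):
        if symbol == "N":
            for decision in "01":
                # Replace the N at position i with the chosen decision.
                result.append(state[:i] + decision + state[i + 1:])
    return result


print(successors("N1NN"))
print(successors("N10N"))
```

Each state with k undecided positions has 2k successors, so the unpruned tree grows quickly with sentence length.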

  • Methodology: Editing (8/13)

    How large is the search space?
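For a rough feel of the numbers, the sketch below counts two quantities for a sentence of n words, assuming each step decides exactly one position: the number of fully decided 0/1 masks (2^n) and the number of orders in which the positions can be decided (n!). The function names are hypothetical, and this is only one way of measuring the space.

```python
from math import factorial


def count_final_masks(n):
    """Number of fully decided keep/remove masks for n words."""
    return 2 ** n


def count_decision_orders(n):
    """Number of orders in which n positions can be decided one at a time."""
    return factorial(n)


for n in (5, 10, 17):
    print(n, count_final_masks(n), count_decision_orders(n))
```

Even for short sentences the number of decision orders grows much faster than the number of final masks, which is why some pruning heuristic is needed.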

  • Methodology: Editing (9/13)

    The search space: Factorial

    Heuristic: Dependency links

    Ex: The battery life of the camera is impressive.

    det(life-3, The-1)

    nn(life-3, battery-2)

    det(camera-6, the-5)

    prep_of(life-3, camera-6)

  • Methodology: Editing (10/13)

    Only words that have at least one link with the
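The dependency-link heuristic can be sketched as a filter over candidate words: only undecided words that share a dependency with an already decided word are considered next. This reading is an assumption, since the slide text is cut off; the function name `candidates` is hypothetical, and the four dependency pairs come from the battery-life example, written as (head, dependent) with word positions attached.

```python
def candidates(decided, dependencies):
    """Undecided words sharing at least one dependency link with a decided word."""
    linked = set()
    for head, dependent in dependencies:
        if head in decided and dependent not in decided:
            linked.add(dependent)
        if dependent in decided and head not in decided:
            linked.add(head)
    return linked


# (head, dependent) pairs for "The battery life of the camera is impressive."
DEPS = [
    ("life-3", "The-1"),
    ("life-3", "battery-2"),
    ("camera-6", "the-5"),
    ("life-3", "camera-6"),
]

print(sorted(candidates({"life-3"}, DEPS)))
print(sorted(candidates({"life-3", "camera-6"}, DEPS)))
```

Starting from "life-3", the word "the-5" only becomes a candidate once "camera-6" has been decided, so the heuristic expands the decided set along the dependency structure instead of trying every order.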

  • Methodology: Editing (11/13)

    We want to compute $P(S_{t+1} \mid S_t)$, where $S_t$ is a sequence at time t

    The Viterbi algorithm is used to find the maximum sequence.

    We use the following formula:

  • Methodology: Editing (12/13)

    move the state from t to t+1.

    Let D(X) = 1 if X is kept, otherwise D(X) = 0.

    Let E be the set of words that have been decided at time t, and R(X), a subset of E, be the set of words that have a dependency with X.

    $$P(S_{t+1} \mid S_t) = P(D(X) \mid E) = C(X, E) + \alpha \, \text{N-gram}(X, R(X)) + (1 - \alpha) \, \text{Stat}(X, R(X)), \quad \text{where } \alpha \in [0, 1]$$

  • Methodology: Editing (13/13)

    $C(X, E) = P(\mathrm{Re}(X, E) \mid \mathrm{Re}(E))$

    Re(E) represents the existing set of dependencies

    Re(X, E) represents the expanded set of dependencies

    N-gram(X, R(X)) can be modeled with proximity search + Web PMI

    Stat(X, R(X)): review corpus evidence
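Since PMI is named as one way to model the N-gram term, a minimal sketch of pointwise mutual information from raw counts may help. It uses the standard definition log2(p(x, y) / (p(x) p(y))); the function name `pmi` and the counts in the example are made up for illustration, and how the slides obtain counts from web or proximity search is not specified.

```python
from math import log2


def pmi(count_xy, count_x, count_y, total):
    """Pointwise mutual information log2(p(x, y) / (p(x) * p(y))) from counts."""
    if min(count_xy, count_x, count_y, total) <= 0:
        raise ValueError("all counts must be positive")
    p_xy = count_xy / total
    p_x = count_x / total
    p_y = count_y / total
    return log2(p_xy / (p_x * p_y))


# Hypothetical hit counts: the pair co-occurs far more often than chance.
print(pmi(count_xy=50, count_x=100, count_y=200, total=10000))
```

A positive value means the two words co-occur more often than independence would predict, and a value near zero means they look independent.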

  • measurement adapted from baseline 1

    threshold will tell us how many clusters of

  • formulation of the editing part by considering

    Next meeting: Wednesday … am, Singapore time