Peer-review analysis



  • Peer-review analysis

    Comprehensive exam

    Presented by: Wenting Xiong

    Committee: Diane Litman, Rebecca Hwa, Jingtao Wang

  • Motivation

    Goal: mine useful information in peer feedback and represent it in an intuitive and concise way

    Tasks and related research topics:

    Identify review helpfulness (NLP: review analysis)

    Summarize reviewers' comments (NLP: paraphrasing and summarization)

    Make sense of review comments via interactive review exploration (HCI: visual text analytics)

  • Part 1: NLP -- Review Analysis

  • Outline

    Review helpfulness analysis

    Sentiment analysis (opinion mining): aspect detection, sentiment orientation, sentiment classification & extraction


  • 1 Review helpfulness analysis

    Automatic prediction: learning techniques, feature utilities, the ground truth

    Analysis of perceived review helpfulness: user bias when voting for helpfulness; influence of the other reviews of the same product

  • 1.1 -- Learning techniques

    Problem formalization. Input: textual reviews; output: helpfulness score

    Learning algorithms: supervised learning, regression; product reviews (e.g. electronics)
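The supervised-regression formulation above can be sketched end to end. This is a minimal toy illustration, not any of the surveyed systems: the two features (token count and exclamation count), the training data, and the hand-rolled two-feature ridge solver are all assumptions made for the example.

```python
# Toy sketch: review-helpfulness prediction as supervised regression.
# Real systems use the richer feature categories discussed on the next slide.

def extract_features(review):
    """Two illustrative low-level features: token count and '!' count."""
    return [float(len(review.split())), float(review.count("!"))]

def fit_ridge(X, y, lam=0.1):
    """Solve (X'X + lam*I) w = X'y in closed form for a 2-feature design."""
    a = sum(x[0] * x[0] for x in X) + lam
    b = sum(x[0] * x[1] for x in X)
    d = sum(x[1] * x[1] for x in X) + lam
    u = sum(x[0] * t for x, t in zip(X, y))
    v = sum(x[1] * t for x, t in zip(X, y))
    det = a * d - b * b
    return [(d * u - b * v) / det, (a * v - b * u) / det]

def predict(w, review):
    return sum(wi * xi for wi, xi in zip(w, extract_features(review)))

# Invented training set: helpfulness scores in [0, 1].
train = [
    ("great", 0.2),
    ("works well and lasted a long time", 0.6),
    ("detailed comparison of battery life noise and build quality "
     "across three rival models after six months of daily use", 0.9),
]
w = fit_ridge([extract_features(r) for r, _ in train],
              [s for _, s in train])
```

On this toy data, longer reviews are predicted to be more helpful, which mirrors the length features commonly reported as useful in the surveyed work.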

  • 1.1 -- Feature utilities

    Features used to model review helpfulness

    Controversial results about the effectiveness of subjectivity features: term-based counts are not useful, while category-based counts show that positive words correlate with greater helpfulness

    Data sparsity issues?

    Category: low level -- feature types: linguistic (unigrams, bigrams), structural, syntactic, semantic (1) domain lexicons, 2) subjectivity), sentiment analysis, readability metrics

    Category: high level -- feature types: social factors (reviewer profile, product ratings)

  • 1.1 -- The ground truth

    Various gold standards of review helpfulness: aggregated helpfulness votes (perceived helpfulness); manual annotations of helpfulness (real helpfulness)

    Problems: the percentage of helpful votes is not consistent with annotators' judgments based on helpfulness specifications; error rate of preference pairs < 0.5

  • 1 Review helpfulness analysis

    Automatic prediction: learning techniques, feature utilities, the ground truth

    Analysis of perceived review helpfulness: biased voting of review helpfulness on Amazon.com; the perceived helpfulness is not determined by the textual content alone


  • 1.2 Analysis of perceived review helpfulness

    Biased voting of review helpfulness on Amazon.com: imbalanced votes, winner-circle bias, early-bird bias; x/y (helpful votes out of total votes) does not capture the true helpfulness of reviews

    The perceived helpfulness is not determined by the textual content alone: influence of the other reviews of the same product; individual bias


  • 1 Review helpfulness analysis -- Summary

    Effective features for identifying review helpfulness; perceived helpfulness vs. real helpfulness

    Comments: new features could introduce domain knowledge and information from other dimensions

    Data sparsity problem: high-level features; deep learning from low-level features

    Other machine learning techniques: theory-based generative models

  • Outline

    Review helpfulness analysis

    Sentiment analysis (opinion mining)


  • 2 Sentiment analysis (opinion mining)

    How do people feel, and about what? Aspect detection, sentiment orientation, sentiment classification & extraction

  • 2.1 Aspect detection

    Frequency-based approach: most frequent noun phrases + sentiment-pivot expansion; PMI (pointwise mutual information) with meronymy discriminators + WordNet

    Generative approach: LDA, MG-LDA, sentence-level local LDA, multiple-aspect sentiment model, content-attitude model
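The PMI scoring used to filter aspect candidates can be illustrated with sentence-level co-occurrence counts. A minimal sketch under toy assumptions: the corpus is a handful of invented token sets, and the bare word "camera" stands in for the part-whole discriminator phrases used in the actual work.

```python
import math

def pmi(corpus, w1, w2):
    """PMI(w1, w2) from sentence co-occurrence:
    log2 P(w1, w2) / (P(w1) P(w2)), with -inf when the pair never co-occurs.
    corpus is a list of token sets."""
    n = len(corpus)
    p1 = sum(w1 in s for s in corpus) / n
    p2 = sum(w2 in s for s in corpus) / n
    p12 = sum(w1 in s and w2 in s for s in corpus) / n
    if p12 == 0:
        return float("-inf")
    return math.log2(p12 / (p1 * p2))

# Toy review corpus: score candidate aspects against a product discriminator.
corpus = [
    {"the", "battery", "of", "the", "camera", "is", "great"},
    {"camera", "battery", "lasts", "long"},
    {"nice", "weather", "today"},
    {"the", "camera", "zoom", "works"},
]
```

A genuine aspect like "battery" scores higher against "camera" than an off-topic word like "weather", which is the filtering signal the frequency-based approach relies on.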


  • 2.2 Sentiment orientation

    Aggregating from subjective terms: manually constructed subjective lexicons

    Bootstrapping with PMI: adjective & adverb opinion-bearing words

    Graph-based approach: relaxation labeling, scoring

    Domain adaptation: SCL algorithm

    Through topic models: MAS -- aspect-independent + aspect-dependent; content-attitude models -- predicted posterior of the sentiment distribution
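The PMI bootstrapping idea above (Turney-style SO-PMI) can be sketched directly: a word's semantic orientation is its PMI with a positive seed word minus its PMI with a negative seed word. The toy corpus, the seed choices, and the add-epsilon smoothing constant are assumptions made for this example.

```python
import math

def so_pmi(corpus, word, pos_seed="excellent", neg_seed="poor", eps=0.01):
    """Semantic orientation = PMI(word, pos_seed) - PMI(word, neg_seed).
    corpus is a list of token sets; eps smooths unseen co-occurrences."""
    n = len(corpus)
    def cnt(a):
        return sum(a in s for s in corpus) + eps
    def co(a, b):
        return sum(a in s and b in s for s in corpus) + eps
    def pmi(a, b):
        return math.log2(co(a, b) * n / (cnt(a) * cnt(b)))
    return pmi(word, pos_seed) - pmi(word, neg_seed)

# Invented corpus: adjectives co-occurring with the seed words.
corpus = [
    {"excellent", "durable", "build"},
    {"durable", "and", "excellent"},
    {"poor", "flimsy", "design"},
    {"flimsy", "feels", "poor"},
]
```

Words that keep company with the positive seed come out positive, and vice versa, which is how the bootstrap grows a lexicon from a few seeds.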


  • 2.3 Sentiment classification and extraction

    Classification: binary; finer-grained, e.g. metric labeling; data sparsity: bag-of-words vs. bag-of-opinions

    Opinion-oriented extraction: topic of interest may be pre-defined, automatically learned, or user-specified


  • 2 Summary

    Comparing review helpfulness and sentiment: in terms of automatic prediction, both are metric-inference problems that can be formalized as standard ML problems with the same input X but different output Y

    The learned knowledge about opinion topics and the associated sentiments would help model the general utility of reviews

  • Part 2: NLP -- Paraphrasing & Summarization

  • Outline

    Paraphrasing -- paraphrases are semantically equivalent to each other: paraphrase recognition, paraphrase generation

    Summarization -- a shorter representation of the same semantic information as the input text: informativeness computation, extractive summarization of evaluative text


  • 1.1 Paraphrase recognition

    Discriminative approach: various string-similarity metrics at different levels of abstraction of the textual strings

    Question: are there useful existing resources for identifying equivalent semantic information? Word level: dictionary, WordNet; phrase level: ?; sentence level: ?
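Two of the simplest string-similarity metrics used as features in such discriminative recognizers can be sketched in plain Python: word-level Jaccard overlap and character-level edit distance. The example sentence pairs are invented for illustration.

```python
def jaccard(a, b):
    """Word-overlap similarity in [0, 1] between two sentences."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb)

def edit_distance(a, b):
    """Levenshtein distance via the standard two-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]
```

A paraphrase pair ("he bought a car" / "he purchased a car") scores higher on Jaccard than an unrelated pair, but surface metrics like these miss lexical substitutions entirely, which is exactly why the slide asks about phrase- and sentence-level semantic resources.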


  • 1.2 Paraphrase generation

    Corpora: monolingual vs. bilingual

    Methods: distributional-similarity based, corpora based

    Evaluation: intrinsic evaluation vs. extrinsic evaluation

  • 1.2 -- Corpora

    Monolingual corpora -- parallel corpora: translation candidates, definitions of the same term; comparable corpora: summaries of the same event, documents on the same topic

    Bilingual parallel corpora


  • 1.2 -- Methods.1

    Distributional-similarity based methods

    DIRT: paths that frequently occur with the same words at their ends; uses a single monolingual corpus; MI measures the association strength between a slot and its arguments

    Sentence lattices: argument similarity of multiple slots on sentence lattices; uses a comparable monolingual corpus; hierarchical clustering groups similar sentences; MSA induces the lattices


  • 1.2 -- Methods.2

    Corpora-based methods

    Monolingual parallel corpus: monolingual MT, merging partial parse trees, FSA, paraphrasing from definitions

    Monolingual comparable corpus: MSR paraphrase corpus (edit distance, journalism conventions), sentence lattices

    Bilingual parallel corpus: pivot approach, random-walk based, HTP

  • 1.2 -- Evaluation

    Intrinsic evaluation: responsiveness (can assess precision, but not recall); standard test references; manually aligned corpus (lower-bound precision & relative recall)

    Extrinsic evaluation: alignment tasks in monolingual translation; alignment error rate; alignment precision, recall, F-measure

    Model-specific evaluation: FSA


  • 2 Summarization

    Tasks in automatic summarization: content selection, information ordering, automatic editing, information fusion

    Focus of this talk -- informativeness computation, information selection (and generation), summarization evaluation


  • 2.1 Computing informativeness

    Semantic information (topic identification)

    Word level: frequency, TF-IDF, topic signatures, PMI(w, topic), external domain knowledge

    Sentence level: HMM content models, category classification + sentence clustering

    Summary level: sentiment-aspect match model + KL divergence

    Opinion-based sentiment scores for evaluative texts: sentiment polarity, intensity, mismatch, diversity

    Discriminative approach to predict informativeness: combine statistical, semantic, and sentiment features in linear or log-linear models
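The word-level frequency and TF-IDF weighting mentioned above can be sketched over a toy set of review sentences. The `+1` smoothing terms are a common convention, not taken from any specific paper surveyed here.

```python
import math
from collections import Counter

def tfidf_weights(docs):
    """Word-level informativeness = corpus frequency * smoothed IDF.
    docs is a list of token lists."""
    n = len(docs)
    df = Counter()                      # document frequency per word
    for d in docs:
        df.update(set(d))
    tf = Counter(w for d in docs for w in d)  # corpus frequency per word
    return {w: tf[w] * math.log((n + 1) / (df[w] + 1)) for w in tf}

# Invented mini-corpus of review sentences.
docs = [
    ["the", "battery", "life", "is", "good"],
    ["the", "battery", "dies", "fast"],
    ["the", "screen", "is", "bright"],
]
weights = tfidf_weights(docs)
```

A word appearing in every document ("the") gets weight zero, while content words like "battery" keep positive weight, which is the signal sentence-level selectors aggregate.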

  • 2.2 Information selection & generation

    Extraction: rank-based sentence selection; aggregation of word informativeness weights (+ discourse features), optimized by Maximal Marginal Relevance

    Topic-based selection: HMM content model; language-model based clustering of informative phrases; summarizing citations based on category-cluster-sentence

    Structured evaluative summary: aspect + overall rating; aspect + pros and cons; hierarchical aspects + sentiment phrasal expressions

    Abstraction: generate evaluative arguments based on aggregation of extracted information; graph-based summarization using an adjacency matrix to model dialogue structure
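The Maximal Marginal Relevance step named above can be sketched as a greedy loop: each pick maximizes relevance to the query minus redundancy with sentences already chosen. The word-overlap similarity, the lambda value, and the example sentences are assumptions for illustration.

```python
def overlap(a, b):
    """Jaccard word overlap, standing in for any sentence-similarity function."""
    sa, sb = set(a.split()), set(b.split())
    return len(sa & sb) / len(sa | sb)

def mmr_select(candidates, query, k, lam=0.7, sim=overlap):
    """Greedy MMR: balance query relevance against redundancy with the
    sentences selected so far."""
    selected, pool = [], list(candidates)
    while pool and len(selected) < k:
        def score(c):
            redundancy = max((sim(c, s) for s in selected), default=0.0)
            return lam * sim(c, query) - (1 - lam) * redundancy
        best = max(pool, key=score)
        selected.append(best)
        pool.remove(best)
    return selected

sentences = [
    "battery life is great",
    "battery life is really great",
    "the screen is sharp",
]
picked = mmr_select(sentences, "battery life screen quality", k=2)
```

Pure relevance ranking would pick the two near-duplicate battery sentences; the redundancy penalty makes MMR pick the screen sentence second instead.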


  • 2.3 Summarization evaluation

    Pyramid (empirical): multiple human-written gold standards; SCUs (summary content units)

    ROUGE: automatically compares with gold standards; considers overlap based on unigrams, bigrams, and longest common subsequence

    Fully automatic: a good summary should be similar to the input; KL divergence, JS divergence

    User preference for sentiment summarizers

    Gold-standard requirements: Pyramid and ROUGE need manual summaries; responsiveness needs manual ratings; divergence-based measures are fully automatic
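The fully automatic divergence idea above can be sketched by comparing a summary's unigram distribution to the input's with Jensen-Shannon divergence, which stays symmetric and finite even when the two vocabularies differ. The texts are invented for the example.

```python
import math
from collections import Counter

def unigram_dist(text, vocab):
    """Unigram distribution of text over a fixed shared vocabulary."""
    c = Counter(text.split())
    total = sum(c[w] for w in vocab) or 1
    return {w: c[w] / total for w in vocab}

def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2), bounded in [0, 1]."""
    m = {w: 0.5 * (p[w] + q[w]) for w in p}
    def kl(a):
        return sum(a[w] * math.log2(a[w] / m[w]) for w in a if a[w] > 0)
    return 0.5 * kl(p) + 0.5 * kl(q)

def input_summary_js(input_text, summary):
    vocab = set(input_text.split()) | set(summary.split())
    return js_divergence(unigram_dist(input_text, vocab),
                         unigram_dist(summary, vocab))

src = "battery life is good and the screen is sharp"
good = "battery life good screen sharp"
bad = "shipping was slow and the box arrived damaged"
```

A summary that covers the input's words scores a lower divergence than an off-topic one, with no gold standard needed.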

  • Paraphrasing and summarization -- Summary

    Common theme: semantic equivalence

    Related to sentiment analysis in computing the informativeness of reviews: aspect-dependent sentiment orientation; overall vs. distribution statistics; aspect coverage; computed by scoring or by measuring the distribution divergence of probabilistic models


  • Part 3: HCI -- Visual text analytics

  • Outline

    Text visualization: inner-set visualization for abstraction; intra-set visualization for comparison

    Interactive exploration: design principles and examples

  • 1 Text visualization

    Inner-set visualization for abstraction: semantic information; sentiment information (opinions)

    Intra-set visualization for comparison


  • 1.1 Inner-set visualization techniques

    Semantic information

    Original text with highlighted keywords: the most detailed information

    Topic-based representations: list of target entities (Jigsaw), haystack (Themail), tag cloud (OpinionSeer, TIARA, ReviewSpotlight)

    Vector-based re