
Engineering Knowledge Base Query Agents


Page 1: Engineering Knowledge Base Query Agents

Lecture 6

Engineering Knowledge Base Query Agents

1. Why is natural language (English) so hard?
2. Idea 1: Domain-Independent Templates
3. Idea 2: Automatic Property-Level Annotation
4. Neural Paraphrasing
5. Idea 3: Automatic Sentence-Level Paraphraser
6. Evaluation

Silei Xu, Sina Semnani, Monica Lam

Stanford CS224v Course: Conversational Virtual Assistants with Deep Learning

Page 2: Engineering Knowledge Base Query Agents

Genie: Open Pretrained Assistant

[Architecture diagram: Pretrained Language Models; Grounding Primitives (APIs, DB schemas; read form-filling instructions, fill web forms); Generic Dialogue Models (abstract API dialogues, DB access dialogues, transaction dialogue state machine); Agents (customer support, restaurants, play songs, turn on lights)]

Page 3: Engineering Knowledge Base Query Agents


Summary: Methodology

• Complete everything that the computer can do

• What if the computer is missing functionality?


John F. Kennedy: “Ask not what your country can do for you, but what you can do for your country.”

Our motto: “Ask not what your user wants to know, but what the computer can tell the user.”

Page 4: Engineering Knowledge Base Query Agents


Summary: ThingTalk Query Representation

• Completeness: map SQL to ThingTalk deliberately
  • Drop rename: joins can run on columns with two different names
  • Unions, differences: supported through an inheritance supertable
• Match the user's mental model: more natural for humans to understand
  • SQL is not the most intuitive
• Precise, unambiguous, canonicalized: facilitates training accuracy
• Interoperates with action APIs (to be discussed later)

Page 5: Engineering Knowledge Base Query Agents


Summary: Paraphrasing Methodology

• Method
  • Synthesize canonical sentences: (synthetic text, logical form)
  • Humans provide paraphrases: (paraphrased text, logical form)
  • No need to annotate
• Big plus: reduces the cost of data acquisition
• Limitations:
  • Still expensive
  • Inaccurate paraphrases
  • Lack of variety: resembles the original terminology
  • Does not work with real input

Page 6: Engineering Knowledge Base Query Agents


Outline

1. Why is natural language (English) so hard?
2. Idea 1: Domain-Independent Templates
3. Idea 2: Automatic Property-Level Annotation
4. Neural Paraphrasing
5. Idea 3: Automatic Sentence-Level Paraphraser
6. Evaluation

Page 7: Engineering Knowledge Base Query Agents


Why is NL so Hard?

• Alternatives for just 1 fact: “Dr. Smith is Ann’s doctor”

Relation | POS | Unknown: Ann | Unknown: Dr. Smith
Doctor | Has-a | Who has Dr. Smith as a doctor? | Who does Ann have as a doctor?
Doctor | Is-a | Who is Dr. Smith a doctor of? | Who is a doctor of Ann?
Doctor | Active verb | Whom does Dr. Smith treat? | Who treats Ann?
Doctor | Passive verb | Who is treated by Dr. Smith? | By whom is Ann treated?
Patient | Has-a | Who does Dr. Smith have as a patient? | Who has Ann as a patient?
Patient | Is-a | Who is a patient of Dr. Smith? | Who is Ann a patient of?
Patient | Active verb | Who consults with Dr. Smith? | With whom does Ann consult?
Patient | Passive verb | By whom is Dr. Smith consulted? | Who is consulted by Ann?

Page 8: Engineering Knowledge Base Query Agents


Why is NL so Hard?

• Alternatives for just 1 fact: “Dr. Smith is Ann’s doctor”

• Type-based terminology. Example: the operator “>=”
  • Date/time: “later than”, “after”, …
  • Temperature: “higher than”, “warmer than”, “hotter than”, “over”, …
  • Weight: “heavier than”, “over”, “more than”, …
  • Distance: “farther than”, “longer than”, …

Page 9: Engineering Knowledge Base Query Agents


Why is NL so Hard?

• Alternatives for just 1 fact: “Dr. Smith is Ann’s doctor”
• Type-based terminology: “>=”
• Alternative phrasing. Example: “restaurants with the highest rating”
  • “restaurants with best reviews”
  • “top-rated restaurants”
  • “best restaurants”
• Domain-specific “shortcuts” in NL
  • Word-level: “my father’s brother” → “uncle”
  • Sentence-level: “Send a message to the sender of some message” → “reply”

Page 10: Engineering Knowledge Base Query Agents


Why is NL so Hard?

• Expressiveness: all database queries
• Variety in saying the same thing, at the property level and at the sentence level

Domain-independent: Idea 1 (property and sentence level)
Domain-dependent: Idea 2 (property level), Idea 3 (sentence level)

Page 11: Engineering Knowledge Base Query Agents


Outline

1. Why is natural language (English) so hard?
2. Idea 1: Domain-Independent Templates
3. Idea 2: Automatic Property-Level Annotation
4. Neural Paraphrasing
5. Idea 3: Automatic Sentence-Level Paraphraser
6. Evaluation

Page 12: Engineering Knowledge Base Query Agents


DB Constructs: Ground English to ThingTalk

Operation | English Template | ThingTalk | Example
Selection | table with fname equal to value | table, fname = value | restaurants with rating equal to 3
Projection | the fname of table | [fname] of table | the cuisine of restaurants
Subquery | the table1 of table2 | table1, in_array(id, any(table1 of table2)) | reviews of restaurant X
Join | table1 with their table2 | table1 join table2 | restaurants with their reviews
Aggregate | the number of table | count(table) | the number of restaurants
Aggregate | the op fname in table | op(fname of table) | the average rating of restaurants
Aggregate & group by | the number of table in each fname | count(table by fname) | the number of restaurants
Aggregate & group by | the op fname1 in table in each fname2 | op(fname1 of table by fname2) | the average rating of restaurants
Ranking | the n table with the min fname | sort(fname asc of table)[1:n] | the 3 restaurants with the min rating
Quantifier | table1 with table2 | table1, contains(table2, any(table2)) | restaurants with review with …
Quantifier | table1 with no table2 | table1, !contains(table2, any(table2)) | restaurants with no review with …
Row-wise function | the distance of table from location | [distance(geo, location)] of table | the distance of restaurants from here
Row-wise function | the number of fname in table | [count(fname)] of table | the number of reviews in restaurants

Canonical English templates cover all queries.

Page 13: Engineering Knowledge Base Query Agents


Idea 1

Add more domain-independent templates for variety

Page 14: Engineering Knowledge Base Query Agents


Standard Variation in NL

Property Type | Comparison
Weight | lighter, heavier
Height | taller, shorter
Age | older, younger
Length | shorter, longer
Size | smaller, bigger
Price | cheaper, more expensive
Speed | slower, faster
Temperature | colder, hotter
Time | earlier, before, later, after
Duration | shorter, longer
Distance | closer, nearer, farther, more distant

Subject Type | Interrogative Words
People | who
Object | what
Time | when
Location | where

Sentence Purpose | Example Grammar
Declarative | I am looking for …
Imperative | Search for …
Interrogative | What is …
Exclamatory | —

From English grammar books

Page 15: Engineering Knowledge Base Query Agents


NL: Connectives

cuisine == “Italian” && rating == 5:

• restaurant that serves Italian cuisine and was rated 5 stars
• restaurant with rating 5 and Italian cuisine
• 5-star restaurant that serves Italian cuisine
• Italian restaurant with 5 stars
• 5-star Italian restaurant

Page 16: Engineering Knowledge Base Query Agents


Property-Level Templates

alumniOf property in the people table:

POS | Annotation | Example Template | Example Utterance
Is-a noun | alumni of <value> | table who are [noun phrase] value | people who are alumni of Stanford
Has-a noun | a <value> degree | table with a value [noun phrase] | people with a Stanford degree
Active verb | graduated from <value> | table who [verb phrase] value | people who graduated from Stanford
Passive verb | educated at <value> | table [passive verb phrase] value | people educated at Stanford
Adjective | <value> | value table | Stanford people
Prepositional | from <value> | table [prepositional phrase] value | people from Stanford

Based on POS (part of speech).
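To make the mechanics concrete, here is a minimal self-contained Python sketch of how such POS-based templates could be instantiated. The dictionaries and function names are illustrative assumptions, not Genie's actual code:

```python
# Illustrative sketch: instantiating the POS-based templates above for
# the alumniOf property of a "people" table. All names are hypothetical.
POS_TEMPLATES = {
    "is-a noun":     "{table} who are {annotation}",
    "has-a noun":    "{table} with {annotation}",
    "active verb":   "{table} who {annotation}",
    "passive verb":  "{table} {annotation}",
    "adjective":     "{annotation} {table}",
    "prepositional": "{table} {annotation}",
}

# Property-level annotations; {value} is later filled with a real entity.
ALUMNI_OF = {
    "is-a noun":     "alumni of {value}",
    "has-a noun":    "a {value} degree",
    "active verb":   "graduated from {value}",
    "passive verb":  "educated at {value}",
    "adjective":     "{value}",
    "prepositional": "from {value}",
}

def utterances(table, annotations, value):
    # Combine each annotation with its matching POS template.
    for pos, template in POS_TEMPLATES.items():
        phrase = annotations[pos].format(value=value)
        yield pos, template.format(table=table, annotation=phrase)

for pos, utt in utterances("people", ALUMNI_OF, "Stanford"):
    print(f"{pos:13} -> {utt}")
# is-a noun     -> people who are alumni of Stanford
# adjective     -> Stanford people
# ...
```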

Page 17: Engineering Knowledge Base Query Agents


Genie Templates for English

• Kinds of templates
  • Canonical templates: ground English to all possible queries
  • Templates for attributes, based on POS
  • Templates based on kinds of sentences, types, connectives
• Total: 900 templates

Page 18: Engineering Knowledge Base Query Agents


Quiz

• 900 templates! How much work is it?

• Is it worth it?

• How many templates are there in other languages?


Page 19: Engineering Knowledge Base Query Agents


Synthesis: Using Property Annotations (POS)

POS | Annotation
Is-a noun | alumni of <value>
Has-a noun | a <value> degree
Active verb | graduated from <value>
Passive verb | educated at <value>
Adjective | <value>
Prepositional | from <value>

Template <table> who are <is-a noun>, with <table> → “people” and is-a noun → “alumni of Stanford”:
<filtered table> → “people who are alumni of Stanford”

Page 20: Engineering Knowledge Base Query Agents


Synthesis: Using Property Annotations (POS)

Templates <table> with <has-a noun> and <table> who have <has-a noun>, with has-a noun → “a Stanford degree”:
<filtered table> → “people with a Stanford degree”, “people who have a Stanford degree”

Page 21: Engineering Knowledge Base Query Agents


Synthesis: Using Property Annotations (POS)

Template <table> who <active verb>, with active verb → “graduated from Stanford”:
<filtered table> → “people who graduated from Stanford”

Page 22: Engineering Knowledge Base Query Agents


Synthesis: Using Property Annotations (POS)

Templates <table> <passive verb> and <table> who were <passive verb>, with passive verb → “educated at Stanford”:
<filtered table> → “people educated at Stanford”, “people who were educated at Stanford”

Page 23: Engineering Knowledge Base Query Agents


Synthesis: Using Property Annotations (POS)

Template <adjective> <table>, with adjective → “Stanford”:
<filtered table> → “Stanford people”

Page 24: Engineering Knowledge Base Query Agents


Synthesis: Using Property Annotations (POS)

Template <table> <prepositional>, with prepositional → “from Stanford”:
<filtered table> → “people from Stanford”


Page 26: Engineering Knowledge Base Query Agents


Synthesis: Multiple Filters

<filtered table> (1): alumni of Stanford · people who are alumni of Stanford · people with a Stanford degree · people who have a Stanford degree · people who graduated from Stanford · people educated at Stanford · people who were educated at Stanford · Stanford people · people from Stanford

<filtered table> (2): employee of Apple · Apple as their employer · works for Apple · employed by Apple

<filtered table> (1) & (2):
• alumni of Stanford who are employee of Apple
• alumni of Stanford who have Apple as their employer
• alumni of Stanford who works for Apple
• alumni of Stanford employed by Apple
• alumni of Stanford who are employed by Apple
• people who are alumni of Stanford and employee of Apple
• people who are alumni of Stanford and have Apple as their employer
• people who are alumni of Stanford and works for Apple
• people who are alumni of Stanford and are employed by Apple
• employee of Apple with a Stanford degree
• people with a Stanford degree who have Apple as their employer
• people with a Stanford degree that works for Apple
• …
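A sketch of why the phrasings multiply: the synthesizer can take the cross product of the two filters' phrasings under a conjunction template. The lists and the single template below are illustrative; only POS-compatible phrasings are shown so the outputs stay grammatical:

```python
from itertools import product

# Illustrative cross product of two filters' phrasings (a subset whose
# POS is compatible with the "who are ... and ..." template).
stanford = ["alumni of Stanford", "educated at Stanford"]
apple = ["employee of Apple", "employed by Apple"]

for f1, f2 in product(stanford, apple):
    print(f"people who are {f1} and {f2}")
# 2 x 2 phrasings already yield 4 sentences for one logical form; with
# all 9 x 4 phrasings and more connective templates, one two-filter
# query produces dozens of distinct utterances.
```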

Page 27: Engineering Knowledge Base Query Agents


Synthesis: Add <search> To Get to Full Questions

<generic verb for search>: “search for”, “find”, “get”

<generic verb for search> <filtered table> → <questions>:
• Search for alumni of Stanford
• Search for people who are alumni of Stanford
• Search for people who have a Stanford degree
• Search for people who graduated from Stanford
• Search for people educated at Stanford
• Search for Stanford people
• Search for people from Stanford
• Find alumni of Stanford
• Find people who are alumni of Stanford
• Find people who have a Stanford degree
• Find people who graduated from Stanford
• Find people educated at Stanford
• Find Stanford people
• Find people from Stanford
• Get alumni of Stanford
• Get people who are alumni of Stanford
• Get people who have a Stanford degree
• Get people who graduated from Stanford
• Get people educated at Stanford
• Get Stanford people
• Get people from …

Page 28: Engineering Knowledge Base Query Agents


Template Syntax

Target: the target non-terminal for the template; $root is the top-level non-terminal for a command.
Expansion: a list of literals or non-terminals that compose the target.
Semantic function: builds the ThingTalk abstract syntax tree of the target.

$root : “show me” $filtered_table => return filtered_table;
$filtered_table : $table “who” $verb_filter => addFilter(table, verb_filter)

Natural language on the left; the semantic function on the right of “=>” builds the ThingTalk abstract syntax tree; $-prefixed names are non-terminals.
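In Python terms, a template rule could be represented roughly as follows. The Rule class and the addFilter stub are assumptions for illustration, not Genie's actual implementation:

```python
from dataclasses import dataclass
from typing import Callable, List

def add_filter(table, flt):
    # Stub: attach a filter node to a table node of the ThingTalk AST.
    return ("filter", table, flt)

@dataclass
class Rule:
    target: str            # non-terminal being defined, e.g. "$root"
    expansion: List[str]   # literals and non-terminals, in order
    semantics: Callable    # builds the ThingTalk AST for the target

RULES = [
    Rule("$filtered_table",
         ["$table", "who", "$verb_filter"],
         lambda table, verb_filter: add_filter(table, verb_filter)),
    Rule("$root",
         ["show me", "$filtered_table"],
         lambda filtered_table: filtered_table),
]
```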

Page 29: Engineering Knowledge Base Query Agents


Template Syntax

$filtered_table : $table “who” $verb_filter => addFilter(table, verb_filter)
$root : “show me” $filtered_table => return filtered_table;

“people” : @people()
“graduated from Stanford” : alumniOf == entity(“Stanford”)
“people who graduated from Stanford” : @people() filter alumniOf == entity(“Stanford”)
“show me people who graduated from Stanford” : @people() filter alumniOf == entity(“Stanford”)

Page 30: Engineering Knowledge Base Query Agents


Synthesis Algorithm (bottom-up generation)

1. Load templates and manifest.
2. depth = 0
3. While depth < max_depth:
   depth++
   For each template whose non-terminals have all been expanded:
   a. Exhaustively generate (NL, ThingTalk) pairs using non-terminals from lower depths, requiring at least one non-terminal from depth - 1 to avoid duplicates.
   b. Apply the semantic function and filter out rejected expansions (e.g., conflicting filters like “rating > 3 && rating > 4”).
   c. Sample from the generated results based on pruning_size.
   d. Save them for the target non-terminal.
4. For each phrase generated for $root, replace each constant placeholder with real values sampled from a parameter dataset of matching type (augmenting synthetic data with real-world values).

Note: all results from lower depths are memoized to save time (at the cost of memory).
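Below is a condensed Python sketch of this bottom-up loop, reusing the Rule shape from the template-syntax sketch. It is an assumed simplification: it omits the depth - 1 constraint and the final value substitution, and all names are hypothetical:

```python
import random
from itertools import product

def expand(expansion, derivations):
    """Cross product of sub-derivations; literal tokens carry no AST."""
    options = []
    for symbol in expansion:
        if symbol.startswith("$"):                  # non-terminal
            options.append(derivations.get(symbol, []))
        else:                                       # literal
            options.append([(symbol, None)])
    yield from product(*options)

def synthesize(rules, terminals, max_depth, pruning_size):
    # derivations[nt] = list of (sentence, ast) pairs built so far
    derivations = {nt: list(ds) for nt, ds in terminals.items()}
    for _ in range(max_depth):
        new = {}
        for rule in rules:
            candidates = []
            for parts in expand(rule.expansion, derivations):
                sentence = " ".join(p[0] for p in parts)
                args = [p[1] for p in parts if p[1] is not None]
                ast = rule.semantics(*args)
                if ast is not None:   # semantic fn may reject, e.g.
                    candidates.append((sentence, ast))  # conflicting filters
            k = min(pruning_size, len(candidates))
            new.setdefault(rule.target, []).extend(random.sample(candidates, k))
        for nt, ds in new.items():    # memoize for higher depths
            derivations.setdefault(nt, []).extend(ds)
    return derivations.get("$root", [])
```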

Page 31: Engineering Knowledge Base Query Agents


Neural Semantic Parser Model
• Pre-trained BERT encoder
• LSTM decoder

Schema2QA: High-Quality and Low-Cost Q&A Agents for the Structured Web. Silei Xu, Giovanni Campagna, Jian Li, and Monica S. Lam. In Proceedings of the 29th ACM International Conference on Information and Knowledge Management, October 2020.

Page 32: Engineering Knowledge Base Query Agents

Comparison with SEMPRE

SEMPRE (Overnight paper) | Genie Schema2QA
Manual: annotate properties (same POS) | Manual: annotate properties (different POS)
Automatic: grammar-based synthesis (canonical only) | Automatic: grammar-based synthesis (with 900 templates)
Manual: paraphrase synthesized sentences | Manual: paraphrase 2% of synthesized sentences
Train with only paraphrased sentences | Train with synthesized + few-shot paraphrased data

Page 33: Engineering Knowledge Base Query Agents


Overnight Dataset [Wang 2015]

• Train & evaluation: manual paraphrases
• 8 domains
• 26K examples

Page 34: Engineering Knowledge Base Query Agents


Results: Comparison with SEMPRE


Page 35: Engineering Knowledge Base Query Agents


Evaluate on Realistic User Input

• Schema2QA [Xu 2020a]

• Based on real-world Schema.org crawls

• Evaluation: much more realistic user input

[Diagram: a restaurant schema (name, cuisine, reviews, …) is annotated to generate questions]

Page 36: Engineering Knowledge Base Query Agents


Evaluate on Realistic User Input

• Schema2QA [Xu 2020a]

• Based on real-world Schema.org crawls

• Evaluation: much more realistic user input
  • Over 2/3 of the questions have at least 2 properties in them
  • Contains values unseen during training

Split | Restaurant | People | Movie | Book | Music | Hotel | Average
Dev | 528 | 499 | 389 | 362 | 326 | 433 | 424.5
Test | 524 | 500 | 413 | 410 | 288 | 528 | 443.8

Page 37: Engineering Knowledge Base Query Agents


Training Data

 | Restaurant | People | Movie | Book | Music | Hotel | Average
# of properties | 25 | 13 | 16 | 15 | 19 | 18 | 17.7
Schema2QA human annotations | 122 | 95 | 111 | 96 | 103 | 83 | 101.7
Schema2QA synthetic | 270K | 270K | 270K | 270K | 270K | 270K | 270K
Schema2QA human paraphrase | 6.4K | 7.1K | 3.8K | 3.9K | 3.6K | 3.3K | 4.7K

Page 38: Engineering Knowledge Base Query Agents


Evaluation Result

[Bar chart, accuracy 0–100% per domain (Restaurants, People, Movies, Books, Music, Hotels, Average): baseline (templates only) vs. templates + manual annotations & paraphrases]

Page 39: Engineering Knowledge Base Query Agents


Quiz

• Template-based generation: 900 templates

• Is it worth the work?

• Do we need to repeat for every language?


Page 40: Engineering Knowledge Base Query Agents


Outline

1. Why is natural language (English) so hard?
2. Idea 1: Domain-Independent Templates
3. Idea 2: Automatic Property-Level Annotation
4. Neural Paraphrasing
5. Idea 3: Automatic Sentence-Level Paraphraser
6. Evaluation

Page 41: Engineering Knowledge Base Query Agents


Idea 2

Property Annotation with an Automatic Paraphraser

Page 42: Engineering Knowledge Base Query Agents


Auto-Annotator

alumniOf property in the people table:

1. Generate a canonical annotation based on the property name, and assign its type with a POS tagger.
2. Construct simple example sentences with templates.
3. Paraphrase them with a neural paraphrase model.
4. Parse the paraphrases with a POS-based parser to extract annotations (see the table and the sketch after it).

POS | Annotation | Example Template | Example Utterance
Is-a noun | alumni of <value> | table who are [noun phrase] value | people who are alumni of Stanford
Has-a noun | a <value> degree | table with a value [noun phrase] | people with a Stanford degree
Active verb | graduated from <value> | table who [verb phrase] value | people who graduated from Stanford
Passive verb | educated at <value> | table [passive verb phrase] value | people educated at Stanford
Adjective | <value> | value table | Stanford people
Prepositional | from <value> | table [prepositional phrase] value | people from Stanford
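A schematic of steps 2–4 in Python, with the paraphraser and POS parser passed in as black boxes; every name here is hypothetical, not the actual AutoQA implementation:

```python
# Hypothetical sketch of auto-annotation: paraphrase simple seed
# sentences, then harvest any paraphrase that a POS-based parser can
# match back to one of the template categories above.
def auto_annotate(table, canonical, values, paraphrase, pos_parse):
    # Step 2: simple example sentences, one property at a time,
    # always with real-world values for context.
    seeds = [f"{table} with {canonical} {v}" for v in values]
    annotations = set()
    for seed in seeds:
        for sentence in paraphrase(seed):    # step 3: neural paraphraser
            parsed = pos_parse(sentence)     # step 4: POS-based parser
            if parsed is not None:           # a template matched
                pos, phrase = parsed         # e.g. ("active verb",
                annotations.add((pos, phrase))  # "graduated from <value>")
    return annotations
```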

Page 43: Engineering Knowledge Base Query Agents


Auto-Annotator

• Use simple sentences, one property at a time
  • Fewer mistakes
  • Focus on the property to obtain more variety
• Always use real-world values
  • More context for the language model
• All annotations are amplified with templates to compose different questions

Page 44: Engineering Knowledge Base Query Agents


Training

 | Restaurant | People | Movie | Book | Music | Hotel | Average
# of properties | 25 | 13 | 16 | 15 | 19 | 18 | 17.7
Schema2QA manual annotations | 122 | 95 | 111 | 96 | 103 | 83 | 101.7
Schema2QA synthetic | 270K | 270K | 270K | 270K | 270K | 270K | 270K
Schema2QA human paraphrase | 6.4K | 7.1K | 3.8K | 3.9K | 3.6K | 3.3K | 4.7K
AutoQA auto annotations | 151 | 121 | 157 | 150 | 144 | 160 | 147.2
AutoQA synthetic | 270K | 270K | 270K | 270K | 270K | 270K | 270K

Page 45: Engineering Knowledge Base Query Agents


Evaluation on Schema2QA Data

[Bar chart, accuracy 0–100% per domain: templates only vs. with auto-annotator vs. with manual annotations & paraphrases]

Page 46: Engineering Knowledge Base Query Agents


Evaluation on Schema2QA Dataset

[Same chart, annotated: with the auto-annotator, accuracy goes up by ~19% over templates only]

Page 48: Engineering Knowledge Base Query Agents


Outline

1. Why is natural language (English) so hard?
2. Idea 1: Domain-Independent Templates
3. Idea 2: Automatic Property-Level Annotation
4. Neural Paraphrasing
5. Idea 3: Automatic Sentence-Level Paraphraser
6. Evaluation

Page 49: Engineering Knowledge Base Query Agents


Neural Paraphrasing

Page 50: Engineering Knowledge Base Query Agents


Paraphraser

• Fine-tune a seq2seq model on a paraphrasing dataset (how hard can it be?)
• A paraphrasing dataset: sentence pairs (Xi, Yi) where Xi and Yi are paraphrases of each other
• What is the loss function in training?
  • Predict the next token of the gold sentence
  • Negative log-likelihood
• What is the metric for evaluation?

Page 51: Engineering Knowledge Base Query Agents


BLEU Score: Validation Metric

• Bilingual Evaluation Understudy (2002)
• Compares machine generations to one or several human-written references
• Computes a similarity score by matching n-grams of the generated text against the references:

$\mathrm{BLEU} = \beta \left( \prod_{n=1}^{k} p_n \right)^{1/k}$

• $\beta$ is a function of the length of the generated text, to penalize short generations
• n-grams: overlapping spans of n words
• $p_n$ is the n-gram precision: (# matched n-grams) / (# n-grams in the generated text)
  • an n-gram in the reference can be matched only once
  • k is usually 4
• A popular, but lame, metric
• BERTScore replaces exact match with soft matching, using BERT pre-trained representations
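For concreteness, a minimal single-reference BLEU in Python (an assumed simplification with no smoothing; production scorers such as sacreBLEU handle many more cases):

```python
import math
from collections import Counter

def ngrams(tokens, n):
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(hypothesis, reference, k=4):
    hyp, ref = hypothesis.split(), reference.split()
    log_precision = 0.0
    for n in range(1, k + 1):
        h, r = ngrams(hyp, n), ngrams(ref, n)
        # Clipped counts: each reference n-gram can be matched only once.
        matched = sum(min(count, r[g]) for g, count in h.items())
        if matched == 0:
            return 0.0
        log_precision += math.log(matched / sum(h.values())) / k
    # Brevity penalty: beta < 1 when the hypothesis is shorter than the reference.
    beta = math.exp(min(0.0, 1.0 - len(ref) / len(hyp)))
    return beta * math.exp(log_precision)

print(bleu("find people who graduated from Stanford",
           "search for people who graduated from Stanford"))  # ~0.64
```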

Page 52: Engineering Knowledge Base Query Agents


Paraphrase Generation Datasets

• How the model performs depends on the dataset
• What defines good paraphrases?

Page 53: Engineering Knowledge Base Query Agents


What is a Paraphrase?

1. Two sentences are paraphrases if their representations are similar, according to some model (Lewis et al, 2020)

2. Two sentences that describe the same picture are paraphrases (Prakash et al, 2016)

• a dog makes a face while rolling on the ground.
• a brown and white dog laying on his back smiling
• a dog is on it's back on the grass with an open mouth.
• a dog laying on its back in the grass with its mouth open.
• a dog with its mouth open lays in the grass

(from the MSCOCO dataset)

Page 54: Engineering Knowledge Base Query Agents


What is a Paraphrase?

3. Two sentences translated from the same source sentence: the ParaBank 2 Dataset (Hu et al., 2019)

• Example: “L'homme est né libre, et partout il est dans les fers.” (Rousseau)
• Google translations:
  • Man is born free, but everywhere he is in chains
  • Man was born free, and everywhere he is in chains
• Human translations:
  • Man is born free, and everywhere he is in shackles
  • People are born free, but they are in iron chains throughout

Page 55: Engineering Knowledge Base Query Agents


The ParaBank 2 Dataset

• Created using a bilingual (English-Czech) corpus of news, books, movie subtitles, etc.
• English translations of Czech → paraphrases of the English counterpart
• Lots of tricks to improve the grammaticality and diversity of machine-translated sentences
• Scores ~90% on grammaticality and 84% on semantic similarity of pairs, according to human judges

Page 56: Engineering Knowledge Base Query Agents


Our Paraphraser

• Fine-tune a BART pre-trained seq2seq model with the ParaBank 2 dataset

Page 57: Engineering Knowledge Base Query Agents


BART Pre-trained Seq2Seq Model

• BERT: masked language model, not generative
• GPT: next-word prediction, not bidirectional
• BART: seq2seq denoising model, bidirectional and generative

Graphic courtesy of: BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension. Mike Lewis*, Yinhan Liu*, Naman Goyal*, et al. arXiv:1910.13461.

Page 58: Engineering Knowledge Base Query Agents


BART: Pre-trained Model

• Denoising objectives:
  • Token masking
  • Token deletion
  • Text-span masking: replace a span with one mask token
  • Sentence permutation
  • Document rotation: start at a random position
• Model: 6 transformer layers each for encoder and decoder (BART-large: 12 layers)
• Useful for downstream tasks: question answering, entailment, summarization, response generation, translation
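As a toy illustration of the text-infilling objective (an assumed simplification; BART actually samples span lengths from a Poisson distribution and can mask multiple spans per document):

```python
import random

def mask_span(tokens, max_len=3):
    # Replace one random span with a single <mask> token; a length-0
    # span means a mask is inserted with nothing removed.
    start = random.randrange(len(tokens))
    length = random.randint(0, max_len)
    return tokens[:start] + ["<mask>"] + tokens[start + length:]

print(mask_span("people who graduated from Stanford".split()))
# e.g. ['people', 'who', '<mask>', 'Stanford']
```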

Page 59: Engineering Knowledge Base Query Agents


Quiz

• In the task of generating paraphrases for ThingTalk, what is the best definition of a paraphrase?

• What properties do we want the paraphrases to have?

• Pre-training improves the performance on many downstream tasks. Does pre-training (e.g. BART) help paraphrase generation?


Page 60: Engineering Knowledge Base Query Agents


Outline

1. Why is natural language (English) so hard?
2. Idea 1: Domain-Independent Templates
3. Idea 2: Automatic Property-Level Annotation
4. Neural Paraphrasing
5. Idea 3: Automatic Sentence-Level Paraphraser
6. Evaluation

Page 61: Engineering Knowledge Base Query Agents


Idea 3

Automatic sentence-level paraphrases

Page 62: Engineering Knowledge Base Query Agents


Auto-Paraphrase

Synthetic → Paraphrased (BART-based paraphraser):

• “Search some cafeteria that have greater star than 3, and do not have smoking.” → “Search for a restaurant that has more than 3 stars and doesn't smoke.”
• “Find restaurants close to my home.” → “Find restaurants near me.”
• “Search for people who are employed by Stanford.”
  • [greedy] “Look for people employed at Stanford.”
  • [temperature=0.3] “Look for people who work at Stanford.”
  • [temperature=1.2] “Find people at Stanford.”
  • [temperature=1.5] “Actually, look for those who are currently employed at Stanford.”
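Sampling at different temperatures could look like the following with the Hugging Face API; "facebook/bart-large" is a stand-in checkpoint, not the lecture's ParaBank 2 fine-tuned paraphraser:

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_name = "facebook/bart-large"  # stand-in; the real paraphraser is
                                    # BART fine-tuned on ParaBank 2
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

inputs = tokenizer("Search for people who are employed by Stanford.",
                   return_tensors="pt")
for temperature in (0.3, 1.2, 1.5):
    # Higher temperature flattens the token distribution: more diverse,
    # but noisier, paraphrases.
    output = model.generate(**inputs, do_sample=True,
                            temperature=temperature, max_new_tokens=32)
    print(temperature, tokenizer.decode(output[0], skip_special_tokens=True))
```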

Page 63: Engineering Knowledge Base Query Agents


Self-Training

We need to filter out the noisy paraphrases:

1. Train a parser on the synthetic dataset.
2. Generate potentially noisy paraphrases of the synthetic dataset.
3. Use the parser from (1) to parse the paraphrases from (2).
4. Remove all paraphrases where the new parse does not match the label.
5. Add the filtered paraphrases to the training set.
6. Repeat.

[Diagram: Auto-Paraphraser loop. The paraphraser rewrites the synthetic sentences; semantic parser i parses and filters the paraphrases; the surviving paraphrases, paired with the original logical forms, are added to the training data]
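One round of this loop could be sketched as follows; parser, paraphraser, and train are hypothetical stand-ins:

```python
# Hypothetical sketch of one self-training round: keep a paraphrase
# only if the current parser maps it back to the original logical form.
def self_training_round(parser, paraphraser, synthetic):
    kept = []
    for sentence, logical_form in synthetic:
        for paraphrase in paraphraser(sentence):
            if parser(paraphrase) == logical_form:  # filter noisy paraphrases
                kept.append((paraphrase, logical_form))
    return kept

# parser_0 = train(synthetic)
# parser_{i+1} = train(synthetic + self_training_round(parser_i, paraphraser, synthetic))
```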

Page 64: Engineering Knowledge Base Query Agents


Genie Summary

[Pipeline diagram: database schema and values → Auto-Annotator (paraphraser + POS-based annotation extraction) → attribute annotations → template-based data synthesizer (English-grammar-based comprehensive templates) → Auto-Paraphraser (paraphraser, semantic parser i, filter) → paraphrases + original logical forms for training]

Page 65: Engineering Knowledge Base Query Agents


Quiz

• Why bother with self-training if we only accept paraphrases that are already parsed correctly?

• Do we need to filter noise on property-level paraphrases?

• Can we skip property-level paraphrases?


Page 66: Engineering Knowledge Base Query Agents


Outline

1. Why is natural language (English) so hard?
2. Idea 1: Domain-Independent Templates
3. Idea 2: Automatic Property-Level Annotation
4. Neural Paraphrasing
5. Idea 3: Automatic Sentence-Level Paraphraser
6. Evaluation

Page 67: Engineering Knowledge Base Query Agents


Results on Overnight Dataset

[Bar chart, accuracy 0–100% per domain (Basketball*, Blocks, Calendar, Housing, Publications, Recipes, Restaurants, Social, Average), testing on paraphrased data: SOTA with out-of-domain human data (Herzig and Berant, 2018) vs. Genie with no human data vs. SOTA with in-domain human data (Cao et al., 2019)]

* Herzig and Berant did not report Basketball numbers.

Building a Semantic Parser Overnight. Yushi Wang, Jonathan Berant, Percy Liang. In Proceedings of the 53rd Annual Meeting of the ACL, 2015.

AutoQA: From Databases to Q&A Semantic Parsers with Only Synthetic Training Data. Silei Xu*, Sina J. Semnani*, Giovanni Campagna, Monica S. Lam. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, November 2020.

Page 68: Engineering Knowledge Base Query Agents

Schema2QA: Training Set

• Training takes about 3 hours on a V100 for 30K iterations

 | Restaurant | People | Movie | Book | Music | Hotel | Average
# of properties | 25 | 13 | 16 | 15 | 19 | 18 | 17.7
Schema2QA # of annotations | 122 | 95 | 111 | 96 | 103 | 83 | 101.7
Schema2QA synthetic | 270K | 270K | 270K | 270K | 270K | 270K | 270K
Schema2QA human paraphrase | 6.4K | 7.1K | 3.8K | 3.9K | 3.6K | 3.3K | 4.7K
AutoQA # of annotations | 151 | 121 | 157 | 150 | 144 | 160 | 147.2
AutoQA synthetic | 270K | 270K | 270K | 270K | 270K | 270K | 270K
AutoQA auto paraphrase | 281K | 299K | 331K | 212K | 341K | 285K | 292K

Page 69: Engineering Knowledge Base Query Agents


Evaluation Result

[Bar chart, accuracy 0–100% per domain: templates only; with auto-annotator; auto-annotator + naive paraphraser; AutoQA (auto-annotator + auto-paraphraser); (Schema2QA) manual annotations & paraphrases]

Page 70: Engineering Knowledge Base Query Agents


Evaluation Result

[Same chart, annotated: with the auto-annotator, accuracy goes up by ~19%]

Page 71: Engineering Knowledge Base Query Agents


Evaluation Result

[Same chart, annotated: with the naive paraphraser (no filtering on paraphrases), accuracy goes down by ~10%!]

Page 72: Engineering Knowledge Base Query Agents


Evaluation Result

[Same chart, annotated: with the auto-paraphraser, accuracy goes up by ~8%]

Page 73: Engineering Knowledge Base Query Agents


Evaluation Result

[Same chart, annotated: there is a ~6% gap between AutoQA and manual annotations & paraphrases]

Page 74: Engineering Knowledge Base Query Agents


Auto-Annotator & Auto-Paraphraser are Complementary

• Auto-annotator: phrase-level, generic
• Auto-paraphraser: sentence-level, value-specific

[Bar chart, accuracy 0–100% per domain: auto-annotator vs. auto-paraphraser vs. auto-annotator + auto-paraphraser]

Page 75: Engineering Knowledge Base Query Agents


Change the BERT-LSTM to Fine-Tuned BART

[Bar chart, accuracy 0–100% per domain: the five configurations above plus auto-annotator + auto-paraphraser with a fine-tuned BART parser]

Page 76: Engineering Knowledge Base Query Agents


Quiz

• Now that we can automate everything, should we generate as much data as possible?


Page 77: Engineering Knowledge Base Query Agents


Quiz

• Now that we can automate everything, how much data should we generate?
• Accuracy grows logarithmically with the amount of data
• [Oren et al 2021]: on the Schema2QA dataset, a carefully sampled dataset of 5K examples achieves accuracy (83.4%) comparable to a model trained with 1M examples (85%)
• Find the sweet spot that balances accuracy and computation cost!

Page 78: Engineering Knowledge Base Query Agents


Quiz

• Is the performance good enough?

• How do we improve the performance?


Page 79: Engineering Knowledge Base Query Agents


Conclusions

• Paraphraser: BART fine-tuned on the ParaBank 2 dataset
• Self-training: use model i to label more data to train model i+1
• Data synthesis:
  1. Property-level paraphrases to extract POS annotations
  2. Domain-independent templates (900)
  3. Sentence-level paraphrases, with noise filtering via self-training
• Importance of testing with real data
• Fully automatic tool: schema → question-answering semantic parser

Page 80: Engineering Knowledge Base Query Agents


References

• [Wang 2015] Building a Semantic Parser Overnight
• [Su 2017] Cross-domain Semantic Parsing via Paraphrasing
• [Campagna 2019] Genie: A Generator of Natural Language Semantic Parsers for Virtual Assistant Commands
• [Xu 2020a] Schema2QA: High-Quality and Low-Cost Q&A Agents for the Structured Web
• [Xu 2020b] AutoQA: From Databases to Q&A Semantic Parsers with Only Synthetic Training Data
• [Marion 2021] Structured Context and High-Coverage Grammar for Conversational Question Answering over Knowledge Graphs
• [Oren et al 2021] Finding Needles in a Haystack: Sampling Structurally-Diverse Training Sets from Synthetic Data for Compositional Generalization