Dialogue State Induction Using Neural Latent Variable Models
Qingkai Min [1,2], Libo Qin [3], Zhiyang Teng [1,2], Xiao Liu [4] and Yue Zhang [1,2]
[1] School of Engineering, Westlake University
[2] Institute of Advanced Technology, Westlake Institute for Advanced Study
[3] Research Center for Social Computing and Information Retrieval, Harbin Institute of Technology
[4] School of Computer Science and Technology, Beijing Institute of Technology


Page 1

Dialogue State Induction Using Neural Latent Variable Models

Qingkai Min [1,2], Libo Qin [3], Zhiyang Teng [1,2], Xiao Liu [4] and Yue Zhang [1,2]

[1] School of Engineering, Westlake University
[2] Institute of Advanced Technology, Westlake Institute for Advanced Study
[3] Research Center for Social Computing and Information Retrieval, Harbin Institute of Technology
[4] School of Computer Science and Technology, Beijing Institute of Technology

Page 2

Outline

Motivation

Method

Experiments

Conclusion

Page 3

Motivation

Page 4

CHAPTER 1 What is a task-oriented dialogue system?

Pic from: Gao J., Galley M., Li L. Neural Approaches to Conversational AI. In Proceedings of the 41st International ACM SIGIR Conference on Research & Development in Information Retrieval, 2018: 1371-1374.

Typical modular architecture

Assist user in solving a task

Page 5

CHAPTER 1 What is a task-oriented dialogue system?

End-to-end architecture

Assist user in solving a task

Pic from: Gao J., Galley M., Li L. Neural Approaches to Conversational AI. In Proceedings of the 41st International ACM SIGIR Conference on Research & Development in Information Retrieval, 2018: 1371-1374.

Page 8

CHAPTER 1 What is dialogue state tracking (DST)?

The dialogue state represents what the user is looking for at the current turn of the conversation.

User: I want an expensive restaurant that serves Turkish food.
System: Anatolia serves Turkish food.
User: What is the area?

Turn 1: inform(price=expensive, food=Turkish)
Turn 2: inform(price=expensive, food=Turkish); request(area)

Page 11

CHAPTER 1 Current DST scenarios

Traditional DST: utterance → NLU → DST → dialogue state (uncertainty from the NLU module propagates into the tracker)

End-to-end DST: utterance → DST → dialogue state (a single DST component)

Page 16

CHAPTER 1 End-to-end DST

User: I want an expensive restaurant that serves Turkish food.
Manual labeling: inform(price=expensive, food=Turkish)

System: Anatolia serves Turkish food.
User: What is the area?
Manual labeling: inform(price=expensive, food=Turkish); request(area)

Ontology (optional):
price: [cheap, expensive, moderate, ...]
food: [Turkish, Italian, Polish, ...]
area: [south, north, center, ...]

DST output:
Turn 1: inform(price=expensive, food=Turkish)
Turn 2: inform(price=expensive, food=Turkish); request(area)
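The turn-level states above accumulate across the dialogue: informed slot-value pairs persist, and requested slots are added on top. A minimal sketch of that update (the dict layout is an illustrative assumption, not the paper's internal format):

```python
def update_state(state, inform=None, request=None):
    """Accumulate a dialogue state across turns: informed slot-value
    pairs persist, and requested slots join the request set."""
    new_state = {"inform": dict(state["inform"]),
                 "request": set(state["request"])}
    if inform:
        new_state["inform"].update(inform)
    if request:
        new_state["request"].update(request)
    return new_state

state = {"inform": {}, "request": set()}
# Turn 1: inform(price=expensive, food=Turkish)
state = update_state(state, inform={"price": "expensive", "food": "Turkish"})
# Turn 2: request(area); the earlier informs are carried over
state = update_state(state, request={"area"})
```

After the second turn, `state` holds both informed slots and the pending `area` request, matching the Turn 2 labeling above.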

Page 22

CHAPTER 1 Limitations of end-to-end DST

End-to-end DST paradigm: corpus → manual labeling → train DST model → test

Limitations of end-to-end DST:

• Costly and slow: 8,438 dialogues annotated by 1,249 workers in the MultiWOZ 2.0 dataset

• Error-prone: annotation errors affect around 40% of dialogue states in MultiWOZ 2.0 [Eric et al., 2019] and over 30% in MultiWOZ 2.1 [Zhang et al., 2019] (now updated to MultiWOZ 2.2)

[Eric et al., 2019] Mihail Eric, Rahul Goel, Shachi Paul, Abhishek Sethi, Sanchit Agarwal, Shuyang Gao, and Dilek Hakkani-Tur. MultiWOZ 2.1: Multi-domain dialogue state corrections and state tracking baselines. arXiv, 2019.
[Zhang et al., 2019] Jian-Guo Zhang, Kazuma Hashimoto, Chien-Sheng Wu, Yao Wan, Philip S. Yu, Richard Socher, and Caiming Xiong. Find or classify? Dual strategy for slot-value predictions on multi-domain dialog state tracking. arXiv, 2019.

Page 27

CHAPTER 1 What is the problem?

End-to-end DST is successful in narrow domains with large annotated datasets, but the resulting models are limited to the domains they are trained on and do not generalize to new domains.

Page 29

CHAPTER 1 Dialogue State Induction (DSI)

Definition:

What is given? A set of customer service records without annotation.

User: I want an expensive restaurant that serves Turkish food.
System: Anatolia serves Turkish food.
User: What is the area?

What is the target? Automatically discover the information that the user is looking for at each turn.

Turn 1: inform(price=expensive, food=Turkish)
Turn 2: inform(price=expensive, food=Turkish); request(area)

Page 37

CHAPTER 1 Dialogue State Induction vs DST

DST input: dialogues with manual labeling, plus an optional ontology.
User: I want an expensive restaurant that serves Turkish food.
Manual labeling: inform(price=expensive, food=Turkish)
System: Anatolia serves Turkish food.
User: What is the area?
Manual labeling: inform(price=expensive, food=Turkish); request(area)
Ontology (optional):
price: [cheap, expensive, moderate, ...]
food: [Turkish, Italian, Polish, ...]
area: [south, north, center, ...]

DSI input: the same dialogues, with no manual labeling and no ontology.

Output in both settings:
Turn 1: inform(price=expensive, food=Turkish)
Turn 2: inform(price=expensive, food=Turkish); request(area)

Page 41

CHAPTER 1 DSI vs zero-shot DST

Zero-shot DST: support unseen domains (services)

Motivation: different domains (services) share similar schemas, with slots accompanied by natural language descriptions.

Example from the SGD dataset [Rastogi et al., 2019]:

Service: service_name: "Flights"
  description: "Search for cheap flights across multiple providers"
Slots:
  name: "origin", description: "City of origin for the flight"
  name: "destination", description: "City of destination for the flight"

Service: service_name: "Trains"
  description: "Service to find train journeys between cities"
Slots:
  name: "from", description: "Starting city for train journey"
  name: "to", description: "Ending city for the train journey"

[Rastogi et al., 2019] Rastogi A., Zang X., Sunkara S., et al. Towards scalable multi-domain conversational agents: The schema-guided dialogue dataset. arXiv preprint arXiv:1909.05855, 2019.

Page 51

CHAPTER 1 DSI vs zero-shot DST

Service: service_name: "Financial"
  description: "Search for financial products across multiple providers"
Slots:
  name: "deposit_rate", description: "the standard for calculating deposit interest"
  name: "loan_rate", description: "the interest rate that banks charge to borrowers"

Service: service_name: "Trains"
  description: "Service to find train journeys between cities"
Slots:
  name: "from", description: "Starting city for train journey"
  name: "to", description: "Ending city for the train journey"

Zero-shot DST limitations:
• Requires highly qualified (consistent) human annotation of schemas
• Fails to transfer to distant domains (services), where schemas are no longer consistent ✗

DSI features:
• Relieves the human annotation burden
• Data-driven: dialogue states are discovered automatically

Page 52

Method

Page 55

CHAPTER 2 How do we solve DSI?

Two steps:

Utterance: I need to take a train out of Chicago, I will be leaving Dallas on Wednesday.

• Candidate (value) extraction (POS tagging, NER, coreference): train, Chicago, Dallas, Wednesday
• Slot assignment with two neural latent variable models (DSI-base and DSI-GM): inform{train-departure=Chicago, train-destination=Dallas, train-leaveat=Wednesday}; train=None
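The first step can be approximated with a toy extractor. This is a rough stand-in for the POS/NER/coreference pipeline named above, not the paper's system: capitalized tokens approximate named entities, and a small hypothetical keyword list approximates domain nouns.

```python
import re

# Purely illustrative stand-in for a domain-noun lexicon; the real
# pipeline uses a POS tagger, NER, and coreference resolution instead.
DOMAIN_NOUNS = {"train", "restaurant", "hotel", "flight"}

def extract_candidates(utterance):
    """Collect candidate values: domain nouns and capitalized tokens
    (a crude proxy for named entities), in order of first appearance."""
    candidates = []
    for tok in re.findall(r"[A-Za-z]+", utterance):
        if tok.lower() in DOMAIN_NOUNS or (tok[0].isupper() and tok.lower() != "i"):
            if tok not in candidates:
                candidates.append(tok)
    return candidates

utt = "I need to take a train out of Chicago, I will be leaving Dallas on Wednesday."
print(extract_candidates(utt))  # ['train', 'Chicago', 'Dallas', 'Wednesday']
```

The second step, slot assignment, is what the latent variable models on the following slides handle.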

Page 56

CHAPTER 2 What is a VAE?

[Figure: autoencoder vs. variational autoencoder]

Pics from https://www.jianshu.com/p/ffd493e10751
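The VAE machinery the models build on reduces to two pieces: the reparameterization trick (so sampling stays differentiable) and a closed-form KL term against the standard-normal prior. A minimal plain-Python sketch for a diagonal Gaussian, not tied to the paper's implementation:

```python
import math
import random

def reparameterize(mu, log_var, rng=random.Random(0)):
    """Reparameterization trick: z = mu + sigma * eps, eps ~ N(0, 1),
    so gradients can flow through mu and log_var."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]

def kl_to_standard_normal(mu, log_var):
    """KL(q(z|x) || N(0, I)) for a diagonal Gaussian, in closed form:
    -0.5 * sum(1 + log_var - mu^2 - exp(log_var))."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv)
                      for m, lv in zip(mu, log_var))

z = reparameterize([0.0, 1.0], [0.0, 0.0])  # one latent sample
kl = kl_to_standard_normal([0.0, 0.0], [0.0, 0.0])  # zero at the prior
```

Training minimizes the reconstruction loss plus this KL term (the negative ELBO); the slides that follow instantiate the encoder side for DSI.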

Page 63: taolusi.github.io · Dialogue State Induction Using Neural Latent Variable Models. Qingkai Min. 1;2 , Libo Qin. 3 , Zhiyang Teng. 1;2 , Xiao Liu. 4. and Yue Zhang. 1;2. 1. School

CHAPTER 2 Model: generative DSI-base

I need to take a train out of Chicago, I will be leaving Dallas on Wednesday.

0 1 0 1 … 0 0 1 1

vocab length (all candidates)

one-hot Pre-trained ELMo

train Chicago Dallas Wednesdaycontextualized

embedding

LT+SoftPlus+Dropout

AvgPooling+LT+SoftPlus+Dropout

encoded oh encoded ce

Encoder

Page 64: taolusi.github.io · Dialogue State Induction Using Neural Latent Variable Models. Qingkai Min. 1;2 , Libo Qin. 3 , Zhiyang Teng. 1;2 , Xiao Liu. 4. and Yue Zhang. 1;2. 1. School

CHAPTER 2 Model: generative DSI-base

I need to take a train out of Chicago, I will be leaving Dallas on Wednesday.

0 1 0 1 … 0 0 1 1

vocab length (all candidates)

one-hot Pre-trained ELMo

train Chicago Dallas Wednesdaycontextualized

embedding

LT+SoftPlus+Dropout

AvgPooling+LT+SoftPlus+Dropout

encoded oh encoded ce

LT+BatchNorm LT+BatchNorm

posterior µ 2posterior logσ

Encoder

Page 65: taolusi.github.io · Dialogue State Induction Using Neural Latent Variable Models. Qingkai Min. 1;2 , Libo Qin. 3 , Zhiyang Teng. 1;2 , Xiao Liu. 4. and Yue Zhang. 1;2. 1. School
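The two-branch encoder above can be sketched in a few lines of numpy; LT is a plain linear transform here, all weights are random stand-ins, and Dropout/BatchNorm are omitted:

```python
# Minimal sketch of the DSI-base encoder: one-hot branch, averaged-embedding
# branch, then two linear heads for posterior mu and log sigma^2.
import numpy as np

rng = np.random.default_rng(0)
vocab_len, feat_dim, hid, lat = 8, 4, 6, 3   # toy sizes

def softplus(x):
    return np.log1p(np.exp(x))

def encode(oh, ce):
    """oh: (vocab_len,) bag of candidates; ce: (num_cands, feat_dim)
    contextualized embeddings. Returns posterior mu and log sigma^2."""
    W_oh = rng.normal(size=(vocab_len, hid))
    W_ce = rng.normal(size=(feat_dim, hid))
    enc_oh = softplus(oh @ W_oh)               # LT + SoftPlus (Dropout omitted)
    enc_ce = softplus(ce.mean(axis=0) @ W_ce)  # AvgPooling + LT + SoftPlus
    h = np.concatenate([enc_oh, enc_ce])
    W_mu = rng.normal(size=(2 * hid, lat))
    W_ls = rng.normal(size=(2 * hid, lat))
    return h @ W_mu, h @ W_ls                  # LT (BatchNorm omitted)

oh = np.array([0, 1, 0, 1, 0, 0, 1, 1], dtype=float)
ce = rng.normal(size=(4, feat_dim))   # train, Chicago, Dallas, Wednesday
mu, log_var = encode(oh, ce)
```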

CHAPTER 2 Model: generative DSI-base — Sampling and Decoder

Sampling (reparameterization trick): z = μ + σ ∗ ε, ε ~ N(0, I)
z: latent state vector

z → LT + Softmax → s: latent slot vector (dimension: slot_num)

Decoder:
s → LT + BatchNorm → reconstructed oh, e.g. 0.01 0.2 0.01 0.08 … 0.03 0.02 0.3 0.4, over the vocab length (all candidates)
s → LT + BatchNorm → recon_μ (contextualized feature dimension)
s → LT + BatchNorm → recon_log σ² (contextualized feature dimension)
(recon_μ, recon_log σ²): reconstructed parameters to generate ce
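The sampling step is the standard reparameterization trick; a minimal sketch with toy shapes:

```python
# Reparameterization trick: z = mu + sigma * eps, eps ~ N(0, I).
# Sampling this way keeps z differentiable w.r.t. mu and log_var.
import numpy as np

def reparameterize(mu, log_var, rng):
    sigma = np.exp(0.5 * log_var)
    eps = rng.standard_normal(mu.shape)   # eps ~ N(0, I)
    return mu + sigma * eps

rng = np.random.default_rng(0)
z = reparameterize(np.zeros(3), np.zeros(3), rng)   # one draw from N(0, I)
```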

CHAPTER 2 Loss

Loss = reconstruction term + regularization term

Reconstruction term:
• Cross-entropy between the reconstructed oh (0.01 0.2 0.01 0.08 … 0.03 0.02 0.3 0.4) and the input oh (0 1 0 1 … 0 0 1 1), over the vocab length (all candidates)
• Negative log-likelihood of each candidate's contextualized embedding under the multivariate Gaussian with parameters (recon_μ, recon_log σ²): the sum of -dist.log_prob(ce[train]), -dist.log_prob(ce[Chicago]), -dist.log_prob(ce[Dallas]), -dist.log_prob(ce[Wednesday])

Regularization term:
• KL divergence between posterior and prior: KL(N(μ, σ), N(0, 1))

Pic from https://www.datalearner.com/blog/1051485590815771
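A minimal numpy sketch of this loss, assuming a diagonal Gaussian for the embedding term and toy tensors throughout:

```python
# ELBO-style loss from the slide: one-hot cross-entropy + diagonal-Gaussian
# negative log-likelihood for the embeddings + closed-form KL to N(0, I).
import numpy as np

def cross_entropy(oh, recon_oh):
    return -np.sum(oh * np.log(recon_oh + 1e-12))

def gaussian_nll(ce, mu, log_var):
    """Sum over candidates of -log N(ce; mu, diag(sigma^2))."""
    var = np.exp(log_var)
    ll = -0.5 * (np.log(2 * np.pi) + log_var + (ce - mu) ** 2 / var)
    return -np.sum(ll)

def kl_to_standard_normal(mu, log_var):
    return -0.5 * np.sum(1 + log_var - mu ** 2 - np.exp(log_var))

oh = np.array([0., 1., 0., 1.])
recon_oh = np.array([0.1, 0.4, 0.1, 0.4])
ce = np.zeros((4, 3))                      # 4 candidates, feature dim 3
loss = (cross_entropy(oh, recon_oh)
        + gaussian_nll(ce, np.zeros(3), np.zeros(3))
        + kl_to_standard_normal(np.zeros(2), np.zeros(2)))
```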

CHAPTER 2 DSI-base inference

Utterance: I need to take a train out of Chicago, I will be leaving Dallas on Wednesday.

Encoder → posterior μ, posterior log σ² → Sampling → z: latent state vector
z → LT + Softmax → s → p(s): slot distribution, [slot_num, 1]

For each candidate in {train, Chicago, Dallas, Wednesday}:

CHAPTER 2 What does the model learn? (oh branch)

s: slot vector (0.7 0.1 0.01 0.08 …), dimension slot_num
W: parameter matrix, slot_num × vocab_len

s × W = reconstructed oh (0.01 0.2 0.01 0.8 … 0.03 0.02 0.3 0.4)
Cross-entropy loss against oh (0 1 0 1 … 0 0 1 1)

After training, the rows of W give slot_num different slot-candidate distributions: p(c|si), i = 1, 2, …, slot_num
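This view can be made concrete with toy numbers, assuming each row of W is softmax-normalized so it acts as a slot-candidate distribution p(c|si):

```python
# Each (softmax-normalized) row of W is a distribution over candidates;
# the slot vector s mixes these rows into the reconstructed one-hot vector.
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(0)
slot_num, vocab_len = 3, 5
W = rng.normal(size=(slot_num, vocab_len))
p_c_given_s = softmax(W)          # one distribution over candidates per slot

s = np.array([0.7, 0.1, 0.2])     # latent slot vector (sums to 1)
recon_oh = s @ p_c_given_s        # mixture over slots -> reconstructed oh
```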

CHAPTER 2 DSI-base inference (continued)

For each candidate in {train, Chicago, Dallas, Wednesday}:

oh: the candidate's one-hot vector, 0 0 0 1 … 0 0 0 0 (length vocab_len)
oh with W (slot_num × vocab_len) → p(oh|s): [slot_num, 1]

ce: the candidate's contextualized embedding (e.g. Chicago)

CHAPTER 2 What does the model learn? (ce branch)

Decoder: s (slot vector, dimension slot_num) × W (parameter matrix, slot_num × feature_dim) = recon_μ; a second matrix of the same shape gives recon_log σ² — the reconstructed parameters to generate ce.

Row i of these matrices gives (μi, σi), so the model learns slot_num different multivariate Gaussian distributions (μi, σi), i = 1, 2, …, slot_num.
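Scoring one candidate's embedding under each of the slot_num Gaussians can be sketched as follows (diagonal covariance and toy values assumed):

```python
# Score a candidate embedding ce under each slot's Gaussian (mu_i, sigma_i).
import numpy as np

def log_prob_diag_gaussian(x, mu, log_var):
    """log N(x; mu, diag(sigma^2)), one value per row of mu/log_var."""
    var = np.exp(log_var)
    return -0.5 * np.sum(np.log(2 * np.pi) + log_var + (x - mu) ** 2 / var,
                         axis=-1)

slot_num, feat_dim = 4, 3
recon_mu = np.zeros((slot_num, feat_dim))
recon_log_var = np.zeros((slot_num, feat_dim))
recon_mu[2] = 1.0                     # slot 2's mean matches ce below

ce = np.ones(feat_dim)                # candidate embedding, e.g. "Chicago"
log_p_ce_given_s = log_prob_diag_gaussian(ce, recon_mu, recon_log_var)
best_slot = int(np.argmax(log_p_ce_given_s))   # -> 2
```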

CHAPTER 2 Multivariate Gaussian probability density

Each slot i defines a multivariate Gaussian with mean μi and scale σi over the contextualized feature dimension. Each candidate's contextualized embedding is scored under it: -dist.log_prob(ce[train]), -dist.log_prob(ce[Chicago]), -dist.log_prob(ce[Dallas]), -dist.log_prob(ce[Wednesday])

Pic from https://www.datalearner.com/blog/1051485590815771

CHAPTER 2 DSI-base inference (putting it together)

Utterance: I need to take a train out of Chicago, I will be leaving Dallas on Wednesday.

Encoder + Sampling → z: latent state vector → LT + Softmax → s: latent slot vector (dimension: slot_num)
p(s): [slot_num, 1]

For each candidate in {train, Chicago, Dallas, Wednesday}, e.g. Chicago:
• oh (0 0 0 1 … 0 0 0 0) with W (slot_num × vocab_len) → p(oh|s): [slot_num, 1]
• ce with (μi, σi), i = 1, 2, …, slot_num → p(ce|si): [slot_num, 1]
• p(s|Chicago) ∝ p(s) ∗ p(oh|s) ∗ p(ce|s): [slot_num, 1]
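The whole scoring pipeline for one candidate, with toy stand-ins for every tensor (random parameters, not the trained model's):

```python
# Combine the slot prior p(s), the one-hot likelihood p(oh|s) read off the
# decoder matrix, and the embedding likelihood p(ce|s_i), then take argmax.
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def log_prob_diag_gaussian(x, mu, log_var):
    var = np.exp(log_var)
    return -0.5 * np.sum(np.log(2 * np.pi) + log_var + (x - mu) ** 2 / var,
                         axis=-1)

rng = np.random.default_rng(1)
slot_num, vocab_len, feat_dim = 3, 6, 2

p_s = softmax(rng.normal(size=slot_num))                     # p(s)
W = softmax(rng.normal(size=(slot_num, vocab_len)), axis=1)  # p(c|s_i) rows
recon_mu = rng.normal(size=(slot_num, feat_dim))
recon_log_var = np.zeros((slot_num, feat_dim))

cand_idx = 3                                 # the candidate's one-hot index
ce = rng.normal(size=feat_dim)               # its contextualized embedding

p_oh_given_s = W[:, cand_idx]                                # [slot_num]
p_ce_given_s = np.exp(log_prob_diag_gaussian(ce, recon_mu, recon_log_var))
score = p_s * p_oh_given_s * p_ce_given_s    # proportional to p(s|candidate)
p_s_given_cand = score / score.sum()
assigned_slot = int(np.argmax(p_s_given_cand))
```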

CHAPTER 2 Post-processing: slot mapping

predictions: Chicago → slot 0, Houston → slot 0, Washington → slot 0, Boston → slot 0, Magdalene College → slot 0
references: Chicago, Houston, Washington, Boston → train-departure (4 times); Magdalene College → attraction-name (1 time)

Mapping from slot indexes to labels? Majority vote: {0: train-departure}

With this mapping, every candidate assigned to slot 0 — including Magdalene College — is labeled train-departure.
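The majority-vote mapping itself is a few lines:

```python
# Majority vote from induced slot indexes to reference labels: slot 0
# co-occurs with train-departure 4 times and attraction-name once.
from collections import Counter, defaultdict

def map_slots(pred_ref_pairs):
    """pred_ref_pairs: iterable of (slot_index, reference_label)."""
    votes = defaultdict(Counter)
    for slot, label in pred_ref_pairs:
        votes[slot][label] += 1
    return {slot: c.most_common(1)[0][0] for slot, c in votes.items()}

pairs = [(0, "train-departure")] * 4 + [(0, "attraction-name")]
mapping = map_slots(pairs)   # -> {0: 'train-departure'}
```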

CHAPTER 2 What is the problem with DSI-base?

Multiple domains: train (departure, destination, day, …) vs. taxi (leave at, arrive by, …) — similar slots across domains.

Similar contexts: "I need to arrive by 19:45 and departs from Cambridge."

CHAPTER 2 How does DSI-GM work?

A single Gaussian prior: z ~ N(μ, σ). All slots (train-arrive by, train-leave at, train-day, train-destination, train-departure, taxi-arrive by, taxi-leave at, taxi-destination, taxi-departure) share one latent space, so similar slots from different domains overlap.

A Mixture-of-Gaussians prior: z drawn from components (μ1, σ1), …, (μK, σK) with mixture weights π. Each component can capture one domain, separating train-* slots from taxi-* slots in the latent space.
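Sampling from the two priors differs only in the extra component draw; a sketch of the Mixture-of-Gaussians prior with made-up component parameters:

```python
# Mixture-of-Gaussians prior on z: first draw a component (domain) from pi,
# then draw z from that component's Gaussian.
import numpy as np

def sample_gmm_prior(pi, mus, sigmas, rng):
    k = rng.choice(len(pi), p=pi)        # pick a domain component
    z = mus[k] + sigmas[k] * rng.standard_normal(mus[k].shape)
    return k, z

rng = np.random.default_rng(0)
pi = np.array([0.5, 0.5])                # e.g. train vs. taxi
mus = np.array([[-3.0, -3.0], [3.0, 3.0]])
sigmas = np.array([[0.1, 0.1], [0.1, 0.1]])
k, z = sample_gmm_prior(pi, mus, sigmas, rng)
```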

CHAPTER 2 How does DSI-GM work?

Utterance: I need to take a train out of Chicago, I will be leaving Dallas on Wednesday.

DSI-base: Encoder + Sampling → z → Prediction → Chicago: taxi-departure (wrong domain)

DSI-GM: Encoder + Sampling → z, under a Mixture-of-Gaussians prior → first identify the domain (Domain: train) → Prediction → Chicago: departure (i.e. train-departure)

CHAPTER 2

I need to take a train out of Chicago, I will be leaving Dallas on Wednesday.

Encoder+Sampling

z

Prediction

Chicago: taxi-departureDSI-base

Encoder+Sampling

z

Prediction Chicago: departure

DSI-GM

A Mixture-of-Gaussians

Domain: train

Chicago: train-departure

How does DSI-GM work?

Experiments

CHAPTER 3 DSI results


CHAPTER 3 DSI-Based Response Generation

[Chen et al., 2019] Wenhu Chen, Jianshu Chen, Pengda Qin, Xifeng Yan, and William Yang Wang. Semantically conditioned dialog response generation via hierarchical disentangled self-attention. In ACL, 2019.

CHAPTER 3 Analysis

Domain-level comparison of the latent representation z (figure).
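One common way to produce such a domain-level comparison is to project sampled latent vectors to 2-D. The sketch below is a generic illustration with synthetic z samples, not the paper's figure: the component means, latent dimension, and PCA projection are all assumptions. The point it demonstrates is that if the mixture components for two domains are well separated in latent space, they remain separated after projection.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical latent samples: utterances from two domains, each drawn from
# its own Gaussian component (as in a mixture-of-Gaussians prior).
z_train = rng.normal(loc=0.0, scale=0.5, size=(50, 8))
z_taxi = rng.normal(loc=3.0, scale=0.5, size=(50, 8))
Z = np.vstack([z_train, z_taxi])

# Project to 2-D with PCA (computed via SVD of the centered data).
Zc = Z - Z.mean(axis=0)
_, _, vt = np.linalg.svd(Zc, full_matrices=False)
Z2 = Zc @ vt[:2].T

# The two components stay separated along the first principal component.
sep = abs(Z2[:50, 0].mean() - Z2[50:, 0].mean())
print(Z2.shape, sep > 1.0)
```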

Conclusion

CHAPTER 4 Conclusion

• Dialogue state induction: a novel task to automatically identify dialogue states

• DSI-base/DSI-GM: two neural generative models with latent variables

• Challenging and promising: the unsupervised setting is very practical

• IJCAI review: "This problem is important and interesting, and this area should attract more attention. This work has great potential to motivate follow-up research."

Contact: [email protected]

Paper: https://www.ijcai.org/Proceedings/2020/0532.pdf

GitHub: https://github.com/taolusi/dialogue-state-induction
