DisenQNet: Disentangled Representation Learning for Educational Questions
Zhenya Huang, Xin Lin, Hao Wang, Qi Liu, Enhong Chen*, Jianhui Ma, Yu Su, Wei Tong
University of Science and Technology of China; iFLYTEK Co., Ltd.
The 27th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD'2021)
Outline
¨ Introduction
¨ Preliminary
¨ Our Method
¨ Experiment
¨ Conclusion
Introduction
¨ Online learning systems
  o Collect millions of learning materials
    n Courses, questions, tests, etc.
  o Provide intelligent services to improve the learning experience
    n Students select suitable questions or courses to acquire knowledge
    n Systems provide personalized recommendations

(Figure: examples of a question and a course)
Introduction
¨ Real-world challenges with millions of learning materials
  o How to organize, search, and recommend questions?
  o How to promote question-based applications?
    n Search questions to find similar ones
    n Recommend questions with proper difficulty (for students)
    n ……
¨ Fundamental topic in AI education
  o Question understanding (automatic)
  o Goal: learning informative representations of questions
Related work
¨ Traditional NLP work (earlier)
  o Lexical analysis or semantic analysis
    n Design fine-grained rules or grammars
  o Representation: explicit trees or templates
¨ NLP-based work
  o End-to-end frameworks
    n Understand question content
    n Learn from application tasks, e.g., difficulty estimation, similarity search
  o Representation: latent semantic vector
¨ Recent pre-training work
  o Pre-training on a large question corpus
    n Enhance question semantics learning
  o Representation: latent semantic vector
Related work—Limitations
¨ Supervised manner
  o Requires sufficient labeled data
    n E.g., question difficulty, question-pair similarity
  o Scarcity of high-quality labels
    n E.g., difficulty must be examined in standardized tests (GRE)
Related work—Limitations
¨ Task-dependent representation
  o Different models for the same questions in different application tasks
  o Poor transferability across tasks

(Figure: one question requires separate models A, B, C for tasks A, B, C, with poor transfer between them)
Related work—Limitations
¨ One unified vector representation
  o All the information is integrated together
  o Questions with the same concept can still be quite different
    n Concept
    n Personal properties (difficulty, semantics)

(Figure: different questions sharing the same concept)
Introduction
¨ Ideal question representation model
  o Gets rid of labels in specific tasks
    n Learns the information of questions on its own
  o Distinguishes the different characteristics of questions
    n Reduces noise
    n Explicit, so it has good interpretability
  o Question representations should be flexible
    n Can be applied in different downstream tasks
    n Improve the applications in online learning systems
Our work—main idea
¨ Disentangled representation
  o Disentangle question information into two representations
    n Concept representation
    n Individual representation
  o Concept representation
    n High dependency on concept information (knowledge)
  o Individual representation
    n High dependency on individual information (difficulty, semantics, etc.)
  o The two representations are highly independent of each other
    n Contain no information from each other
Problem definition
¨ Unsupervised question representation learning
  o Given: a question corpus Q = {q_1, q_2, …, q_N}, where each question q = {w_1, w_2, …, w_M} is labeled with concepts k ∈ K
  o Goal: disentangled question representations
    n Concept representation v_k ∈ R^d
    n Individual representation v_i ∈ R^d
¨ Question-based supervised tasks
  o Given Q = Q_L ∪ Q_U with |Q_L| ≪ |Q_U|
    n Labeled Q_L = {q_1, q_2, …, q_L} with labels {y_1, y_2, …, y_L}
    n Unlabeled Q_U = {q_1, q_2, …, q_U}
  o Goal: predict properties of unseen questions
    n e.g., the difficulty of a question
    n e.g., the similarity of a question pair
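The data setup above can be sketched in a few lines of Python; the class and field names are illustrative, not from the paper's code:

```python
from dataclasses import dataclass

@dataclass
class Question:
    words: list          # token sequence w_1 ... w_M
    concepts: set        # concept labels k in K
    label: float = None  # task label y (e.g., difficulty); None if unlabeled

# A tiny corpus: Q = Q_L (labeled) + Q_U (unlabeled), with |Q_L| << |Q_U| in practice
q1 = Question(words=["solve", "f(x)=2x+1"], concepts={"function"}, label=0.4)
q2 = Question(words=["find", "the", "solution", "set"], concepts={"set"})
labeled = [q for q in (q1, q2) if q.label is not None]
unlabeled = [q for q in (q1, q2) if q.label is None]
```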
DisenQNet: glance
¨ Disentangled Question Network (DisenQNet)
  o Unsupervised model without labels
  o Question encoder
    n Learns to disentangle a question into two ideal representations
  o Self-supervised optimization
    n Three information estimators

(Figure: model architecture and model optimization)
DisenQNet
¨ Question Encoder
  o Learns to disentangle a question into two ideal representations
    n Concept representation v_k
    n Individual representation v_i
  o Key: they focus on different content
    n e.g., concept: "function"; individual: "f(-1)=2"

Integrated semantics: x_q = Enc({x_w : w ∈ q}), where Enc is, e.g., a TextCNN or LSTM
Disentangled semantics (attention network): v_q = (v_k, v_i), with
  v_k = Σ_{m=1}^{M} α_m x_m, α_m = Softmax(MLP([x_m; x_q]))
  (v_i is computed analogously with its own attention head)
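The attention pooling above can be sketched in numpy; the dimensions, random features, and single-layer scorers standing in for the MLP heads are illustrative assumptions, not the paper's exact architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

M, d = 5, 8                       # M words, d-dimensional features
X = rng.normal(size=(M, d))       # word features x_1..x_M (stand-ins for TextCNN/LSTM outputs)
x_q = X.mean(axis=0)              # integrated question semantics x_q = Enc({x_w})

# One attention head per representation: alpha_m = Softmax(MLP([x_m; x_q])).
# A single linear scorer stands in for each head's MLP here.
w_head_k = rng.normal(size=2 * d)
w_head_i = rng.normal(size=2 * d)
scores_k = np.array([np.concatenate([x, x_q]) @ w_head_k for x in X])
scores_i = np.array([np.concatenate([x, x_q]) @ w_head_i for x in X])
v_k = softmax(scores_k) @ X       # concept representation: v_k = sum_m alpha_m x_m
v_i = softmax(scores_i) @ X       # individual representation, from its own head
```

Because the two heads score words differently, v_k and v_i pool the same word features with different attention weights, which is what lets them specialize in different content.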
DisenQNet
¨ How to optimize? — Self-supervised optimization
  o Three estimators measure the information dependencies

MI Estimator
Ø v_q = (v_k, v_i) should contain all the information of a question
Ø Maximize the mutual information between v_q and each word w ∈ q
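One common way to maximize such a mutual information term is a Jensen-Shannon-style neural lower bound (Deep InfoMax flavor), which scores true (v_q, word) pairs against mismatched ones. The sketch below assumes a bilinear scorer and random features; the paper's exact estimator may differ:

```python
import numpy as np

rng = np.random.default_rng(1)

def softplus(z):
    return np.logaddexp(0.0, z)

def jsd_mi_lower_bound(v_q, pos_words, neg_words, W):
    """JSD-style MI lower bound: score true (v_q, word) pairs high,
    mismatched pairs (words from another question) low."""
    pos = np.array([v_q @ W @ w for w in pos_words])
    neg = np.array([v_q @ W @ w for w in neg_words])
    return -softplus(-pos).mean() - softplus(neg).mean()

d = 8
v_q = rng.normal(size=2 * d)            # concatenation of (v_k, v_i)
W = rng.normal(size=(2 * d, d))         # bilinear scorer (stand-in for a small net)
own_words = rng.normal(size=(5, d))     # words w in q -> positive pairs
other_words = rng.normal(size=(5, d))   # words from another question -> negatives
loss = -jsd_mi_lower_bound(v_q, own_words, other_words, W)  # minimized in training
```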
Concept Estimator
Ø v_k contains the given concept meaning explicitly
Ø Multi-label concept classification task: predict the concepts from v_k
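A minimal sketch of that multi-label classification loss, assuming a linear classifier head and binary cross-entropy over independent per-concept sigmoids (the standard multi-label setup; the concrete head is an assumption):

```python
import numpy as np

rng = np.random.default_rng(2)
d, num_concepts = 8, 4
v_k = rng.normal(size=d)                 # concept representation of one question
W = rng.normal(size=(d, num_concepts))   # stand-in for the classifier head

probs = 1.0 / (1.0 + np.exp(-(v_k @ W)))  # sigmoid: independent probability per concept
y = np.array([1.0, 0.0, 1.0, 0.0])        # multi-hot concept labels for the question
eps = 1e-12
bce = -(y * np.log(probs + eps) + (1 - y) * np.log(1 - probs + eps)).mean()
```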
Disentangling Estimator
Ø Keep v_k and v_i independent: v_k must not contain the information carried by v_i
Ø Minimize the mutual information between v_k and v_i (it cannot be learned directly)
Ø Method: WGAN-like adversarial training
Ø Minimize the Wasserstein distance between P(v_k, v_i) and P(v_k) ⊗ P(v_i)
   ⇒ P(v_k) ⊗ P(v_i) ≈ P(v_k, v_i) ⇒ v_k and v_i independent
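The adversarial step can be sketched as follows: samples of the product of marginals P(v_k) ⊗ P(v_i) are obtained by shuffling v_i within a batch, and a critic estimates the Wasserstein distance as the gap between its mean scores on the two sample sets. The linear critic and random features are simplifying assumptions:

```python
import numpy as np

rng = np.random.default_rng(3)
B, d = 16, 8
vk = rng.normal(size=(B, d))                     # batch of concept representations
vi = 0.8 * vk + 0.2 * rng.normal(size=(B, d))    # deliberately correlated with vk

w = rng.normal(size=2 * d)                       # linear critic (stand-in for a 1-Lipschitz net)

joint = np.concatenate([vk, vi], axis=1)                     # samples from P(v_k, v_i)
prod = np.concatenate([vk, vi[rng.permutation(B)]], axis=1)  # shuffled: P(v_k) (x) P(v_i)

# The critic maximizes this gap (a Wasserstein-distance estimate);
# the encoder minimizes it, pushing v_k and v_i toward independence.
w_gap = (joint @ w).mean() - (prod @ w).mean()
```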
DisenQNet+
¨ DisenQNet+ — question-based supervised tasks
  o Transfer v_i from DisenQNet to improve different applications
    n e.g., difficulty estimation, similarity search
  o Key: the individual representation v_i focuses more on unique information

Traditionally: end-to-end methods
Ø May suffer from overfitting due to insufficient labeled data
Our improvement: inject v_i from DisenQNet into the task model via mutual information maximization
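One way to realize that transfer is to add an MI-maximization regularizer between the task model's hidden state and the pretrained v_i to the supervised loss. The sketch below uses the same JSD-style bound as before; the weighting and scorer are illustrative assumptions, not the paper's exact objective:

```python
import numpy as np

rng = np.random.default_rng(4)
d = 8

def softplus(z):
    return np.logaddexp(0.0, z)

h_task = rng.normal(size=d)    # hidden state of the downstream task model for question q
v_i_pos = rng.normal(size=d)   # pretrained individual representation of q (from DisenQNet)
v_i_neg = rng.normal(size=d)   # individual representation of a different question

W = rng.normal(size=(d, d))    # bilinear scorer for the MI estimator
mi_lb = -softplus(-(h_task @ W @ v_i_pos)) - softplus(h_task @ W @ v_i_neg)

supervised_loss = 0.5          # placeholder loss on the few labeled questions
lam = 0.1                      # weight of the transfer regularizer
total_loss = supervised_loss - lam * mi_lb   # fit labels while maximizing MI(h_task, v_i)
```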
Experiment
¨ Datasets
  o System1: high-school-level questions
  o System2: middle-school-level questions
    n Concepts: "Function", "Triangle", "Set", etc.
  o Math23K: elementary-school-level questions
    n Concepts (five operations): +, −, ×, ÷, ∧
Experiment
¨ DisenQNet evaluation (v_k and v_i)
  o Task: concept prediction performance
  o Baselines: text models, NLP pre-trained models, a question pre-trained model

Disentangled representation learning is necessary
Ø DisenQNet-v_k predicts concepts well: v_k captures the concept information of questions
Ø DisenQNet-v_i fails to predict concepts: v_i removes the concept information
Experiment
¨ DisenQNet visualization (v_k and v_i)
Ø v_k vectors are easier to group by concept
Ø v_i vectors are scattered
Ø v_k attends more to concept words ("odd function", "solution set", "inequality")
Ø v_i focuses more on mathematical expressions ("f(-1) = 2")
Experiment
¨ DisenQNet+ evaluation
  o Two tasks: similarity ranking and difficulty ranking

Ø Disentangled learning is better than integrated learning
Ø v_i improves the application performance most
Ø It preserves the personal information of questions
Ø It transfers well across different tasks
Conclusion
¨ Summary
  o Disentangled representation learning for educational questions
  o Unsupervised DisenQNet
    n Distinguishes concept and individual information of questions
    n Good interpretability
  o Semi-supervised DisenQNet+
    n Improves the performance of different downstream tasks
    n Good transferability
¨ Future work
  o More sophisticated models for the disentanglement implementation
  o Heterogeneous questions, e.g., geometry
  o Deeper knowledge transfer
Thanks!
[email protected]@mail.ustc.edu.cn