Upload
others
View
4
Download
0
Embed Size (px)
Citation preview
Distillation of Knowledge from the Research Literatures on Alzheimer’s
Dementia
Wutthipong Kongburan, Mark Chignell, and Jonathan H. Chan
School of Information TechnologyKing Mongkut's University of Technology Thonburi
JSCI 2017 1
Agenda
INTRODUCTION
PROPOSED METHOD
EXPERIMENT, RESULT AND DISCUSSION
CONCLUSION AND FUTURE WORK
2
Agenda
INTRODUCTION
PROPOSED METHOD
EXPERIMENTAL RESULT AND DISCUSSION
CONCLUSION AND FUTURE WORK
3
4
Alzheimer Disease 5
Number of publications related to Alzheimer treatment
added into PubMed every year
0
500
1000
1500
2000
2500
300019
9519
9619
9719
9819
9920
0020
0120
0220
0320
0420
0520
0620
0720
0820
0920
1020
1120
1220
1320
1420
15
#Publications
6
Text Mining Research & Applications 7
“the process of automatic deriving high-quality information from natural language text”
Overview of Information ExtractionOmega-3 fatty acids in fish may help prevent cognitive decline in Alzheimer patient.
8
Named Entity Recognition (NER)
Classify tokens in to predefined categories such as person name, organizations, diseases, proteins.
9
NER Features and Models
Campos, David, José Luís Oliveira, and Sérgio Matos. Biomedical named entity recognition: a survey of machine-learning tools. INTECH Open Access Publisher, 2012.
10
NER Features
http://nlp.stanford.edu/software/jenny-ner-2007.pdf
11
Conditional Random Fields (CRFs) 12
a type of discriminative probabilistic model.
O is observation words sequence S is label sequence
Linear chain CRFs define the conditional probability of a state sequence given an input sequence to be:
Z0 is a normalization factor of all state sequences [0, 1] fj(si−1 , si , o, i) is one of m functions that describes a feature j is a learned weight for each such feature function (define) If j < 0 : CRFs to avoid using this feature If j = 0 : this feature does not affect the probability If j > 0 : improve probability of S
si−1 , si represent previous and current label for observation words position i
Conditional Random Fields (CRFs) 13
fj(si−1 , si , o, i) 1 if si = Intervention and o begins with
memantin- 0 Otherwise
= 1 Z0 is a normalization factor
memantine Interventionmemantine monotherapy Interventionmemantine treatment Intervention
Morphological featurePrefixmemantin-…
Testing tokenmemantinchlorid
P(Disease | memantinchlorid) = 0.0P(Other | memantinchlorid) = 0.0P(Intervention | memantinchlorid) = 1.0
Classifier/Model
Training Corpus
Extract feature/Train model NER
memantinchlorid Intervention
Distillation of Knowledge from the Research Literatures on Alzheimer’s Dementia
14
“Text mining technology may assist in distilling knowledge from the vast corpus of research literature on Alzheimer’s dementia.”
Agenda
INTRODUCTION
PROPOSED METHOD
EXPERIMENTAL RESULT AND DISCUSSION
CONCLUSION AND FUTURE WORK
15
Proposed method
Manual annotation
AD Intervention
corpus
Comparison
Evaluation results
Annotationresults
Named entity recognition
Medical abstracts
Train classifier
Manually annotated sentence
16
Corpus construction 50 abstracts
“Alzheimer treatment” “Alzheimer treatment” “Nonpharmacologictreatment for Alzheimer”
20 abstracts 10 abstracts 20 abstracts
TitleAbstract
17
Annotation Procedure
The three labels assigned are Disease, Intervention, and Other.
All entities to be annotated were nouns.
A general term will be excluded.
Pronouns and co-references were excluded.
Special characters (e.g., quote, dash, or parenthesis) at the beginning or end of entities were not considered.
18
Disease entity
Disease included both synonymous mentions and abbreviated forms of AD.
19
Intervention entity
Any object or action that is performed by doctor or other clinician targeted in order to help the AD patient.Treatment, prevention, diagnosis, cause, risk factor
Such typical treatment - the side effects.Medical interventions
alternative treatment increasing patient satisfaction lead to better health outcomes
20
Agenda
INTRODUCTION
PROPOSED METHOD
EXPERIMENTAL RESULT AND DISCUSSION
CONCLUSION AND FUTURE WORK
21
Experimental Design
Manual annotation
AD Intervention
corpus
Comparison
Evaluation results
Annotationresults
Named entity recognition
Medical abstracts
Train classifier
Manually annotated sentence
22
Experiment (1)
Manual annotation
AD Intervention
corpus
Comparison
Evaluation results
Annotationresults
Named entity recognition
Medical abstracts
Train classifier
Manually annotated sentence
“Alzheimer treatment”
first 10 webpages
23
Experiment (1)Measure PerformancePrecision 0.987Recall 0.397F1-score 0.566Accuracy 0.982
24
Discussion
Further analysis on confusion matrix, false positive and false negative tokens of intervention entity errors between intervention and other class -> such
intervention entities are not in corpus (e.g., Aromatherapy, relaxing song, art therapy, coconut oil, pet therapy, Rivastigmine or Exelon, Galantamine or Razadyne). errors between intervention and disease class -> entity's
boundary. For instance, Disease Alzheimer’s should be labeled as Disease, however when the token is Disease Alzheimer’s drugs, they must be labeled as Intervention. a case sensitive in configuration. Although we can exclude
a proper noun from annotation, such as Alzheimer’s Healthcare Center, a mislabeled entity can occur for example, Memantine is not annotated as Intervention.
25
Experiment (2)“Alzheimer treatment”
Search criteria Unseen/new InterventionNormal search unfamiliar music, dance therapy, environmental
cueing, Chinese medicine, coral calcium, animal-assisted therapy, multi-sensory therapy, massage
physical therapy, intranasal insulin therapy
physical therapy, intranasal insulin therapy, antiepileptic drugs, light therapy, flashing light therapy, mouse’s spineflashing light therapy, routine screeningbiogen therapy, nilvadipine, BACE inhibitors
26
Agenda
INTRODUCTION
PROPOSED METHOD
EXPERIMENTAL RESULT AND DISCUSSION
CONCLUSION AND FUTURE WORK
27
Conclusion and Future work 28
29