Dr. Dickson Lukose
Artificial Intelligence Lab. 26th November 2015
Reaping Benefits from Social Network Data
BIG DATA EXECUTIVE TRAINING, PENANG
Contents
• Emergence of Social Web
• World of Data
• Knowledge Audit, Ontology
• Knowledge Portal Challenges
• Semantic Technology Platform
• Artificial Intelligence Stack
– Text Understanding
– Natural Language Query
– Data Harmonization
• Social Media Intelligence
• Social Network Intelligence
• Conclusions
© 2015 MIMOS Berhad. All Rights Reserved.
Emergence of Social Web
Organization Units
Partners
Ad Hoc Teams
Communities of Interest
Communities
Communities of Practice
Social Networks (e.g. LinkedIn and Facebook)
Engineered Emergent
Purpose Drives
Interest Drives
Gartner, 2009
Knowledge Audit
Enterprise Data
Linked Open Data
Sensor Web
ACTIONABLE KNOWLEDGE ASSETS
Structured, Semi-Structured & Unstructured
Unstructured
Structured & Semi-Structured
Ontology
Technical Writing
Term Databanks
Machine Translation
Human Translation
Information Retrieval
Knowledge Engineering
Consumer Information
R&D
Standardisation
Nomenclature
Terminology
AGROVOC
MYGMO
PADIPEDIA
AGRIS
HERBAL MEDICINE
SNOMED-CT
STW
Knowledge Portal Challenges
Knowledge Portal
Natural Interface
High Density Visualization
Integration and Insight into Social Media
CoP Network Analytics
Collaborative Problem Solving
NOT What is the Content, BUT What is in the Content
Making Sense of Social Big Data
Big Data Challenges: Velocity, Diversity, Volume
Seamless Integration of SME in Problem Solving
Artificial Intelligence Stack
KNOWLEDGE REPRESENTATION & REASONING
MACHINE LEARNING
STATISTICS
GENETIC ALGORITHM
NEURAL NETWORK
SUBJECTIVE ANALYTICS
NATURAL LANGUAGE PROCESSING
VIDEO ANALYTICS
Mi-SEMANTICS
Mi-SP
Mi-CLIP
NETWORK ANALYTICS
ACCELERATION TECHNOLOGY
Mi-INTELLIGENCE
Mi-VISUALITIC
IMAGE ANALYTICS
Mi-TARGET
Mi-HARMONY
Mi-AVComm
Mi-BIS
ALGORITHM
Mi-DSS
Mi-AccLib
Mi-AVSafe
Finance
What is Text Understanding?
English Text: John is going to the bank by bus
Conceptualization:
Concept types: Person, Human, Animate; Financial institution; Road vehicle; Inanimate
Relations: Agent, Destination, using Instrument
Knowledge Graph: go
male-person:
“John” go: * bank: *
bus: *
agnt dest
inst
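The graph fragment above can be read as a set of labelled relations around the verb concept "go". A minimal sketch of that reading follows, assuming a simple (source, label, target) triple layout; the `Relation` type and the `role_of` helper are hypothetical names used for illustration and are not part of any MIMOS product.

```python
from typing import NamedTuple, Optional


class Relation(NamedTuple):
    source: str  # concept the relation starts from
    label: str   # relation type as written on the slide (agnt, dest, inst)
    target: str  # concept the relation points to


# Conceptual graph for "John is going to the bank by bus".
# The relation labels come from the slide; the triple layout is an assumption.
GRAPH = [
    Relation("go", "agnt", 'male-person: "John"'),
    Relation("go", "dest", "bank"),
    Relation("go", "inst", "bus"),
]


def role_of(graph, verb: str, label: str) -> Optional[str]:
    """Return the concept that fills a given role of a verb, or None."""
    for rel in graph:
        if rel.source == verb and rel.label == label:
            return rel.target
    return None


destination = role_of(GRAPH, "go", "dest")
instrument = role_of(GRAPH, "go", "inst")
```

With this layout, a question such as "where is John going?" becomes a lookup of the `dest` relation on the `go` concept.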
What is Text Understanding?
Malay Text: Penduduk mendapat bantuan daripada kerajaan (Residents receive assistance from the government)
Conceptualization:
Concept types: Human, Animate; resource; organization; Inanimate
Relations: Agent, Resource, originating Source
Knowledge Graph: mendapat (receive)
What is Text Understanding?
Mandarin Text: 大卫 在 图书馆 等了 3小时。 (David waited at the library for 3 hours.)
Conceptualization:
Concept types: Person, Human, Animate; Library, Location; 3 hours
Relations: Agent, Location, for Time
Knowledge Graph: wait
Malay NLP
Sentence:
Meningkatkan harga barang dan minyak kerana inflasi negara. (Raising the prices of goods and oil because of national inflation.)
Annotated sentence:
Meningkatkan_VB harga_NN barang_NN dan_CC minyak_NN
kerana_CC inflasi_NN negara_NN
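The annotated sentence uses a `word_TAG` convention, so it can be split back into (word, tag) pairs with plain string handling. The sketch below assumes only that convention; `parse_tagged` is a hypothetical helper, not the API of the Malay NLP component.

```python
def parse_tagged(sentence):
    """Split 'word_TAG' tokens into (word, tag) pairs."""
    pairs = []
    for token in sentence.split():
        # rpartition splits on the LAST underscore, so a word that itself
        # contains an underscore keeps its full form.
        word, _, tag = token.rpartition("_")
        pairs.append((word, tag))
    return pairs


ANNOTATED = (
    "Meningkatkan_VB harga_NN barang_NN dan_CC minyak_NN "
    "kerana_CC inflasi_NN negara_NN"
)

tagged = parse_tagged(ANNOTATED)
nouns = [word for word, tag in tagged if tag == "NN"]
```

Filtering on the `NN` tag gives the noun candidates (harga, barang, minyak, inflasi, negara) that a later step could turn into concepts in the knowledge graph.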
Knowledge Graph: [figure]
Mandarin NLP
Text → Segmented Text → Part-of-Speech Results
Mandarin NLP
Text → Segmented Text → Entity Recognition Results
Traditional Search
Too many results
Pattern matching
Knowledge Base Preparation
Mi-CLIP
Mi-HARVESTER
Mi-KRAKEN
Mi-NLP
KNOWLEDGE BASE
World Wide Web
Linked Open Data
Enterprise Database
Question Answering System
Mi-CLIP™
Text Understanding
SQ VSQ
Knowledge
Graph
Texts
Question
Mi-Reasoner
Answer
English NLP
Malay NLP
Mandarin NLP
SNOMED CT terminology
311,000+ concepts
Hypertensive complication [SCT_449759005]
Kidney disease [SCT_90708001]
Neoplastic disease [SCT_55342001]
Hypertensive renal disease [SCT_38481006]
Neoplasm of kidney [SCT_126880001]
Renal impairment [SCT_236423003]
Chronic renal impairment [SCT_236425005]
Kidney disease
Concept ID: 90708001
Preferred term: Kidney disease
Synonym(s): - Renal Disease - Nephrosis - Disorder of kidney - Nephropathy - Disease of kidney - Renal disorder
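Concepts in a terminology like this are typically linked by "is a" relationships, which allows a query for a general concept to also match its more specific descendants. The sketch below illustrates that subsumption check. The concept IDs and names are the ones listed on the slide, but the parent/child edges are assumed for illustration and are not taken from a SNOMED CT release.

```python
# Child -> parents ("is a") edges.
# ASSUMPTION: these edges are illustrative only; IDs and names are from the slide.
PARENTS = {
    "38481006": ["449759005", "90708001"],  # Hypertensive renal disease
    "126880001": ["55342001", "90708001"],  # Neoplasm of kidney
    "236423003": ["90708001"],              # Renal impairment
    "236425005": ["236423003"],             # Chronic renal impairment
}


def ancestors(concept_id):
    """Return every concept reachable by following 'is a' edges upwards."""
    seen = set()
    stack = [concept_id]
    while stack:
        for parent in PARENTS.get(stack.pop(), []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def is_a(concept_id, ancestor_id):
    """True if concept_id equals ancestor_id or is one of its descendants."""
    return concept_id == ancestor_id or ancestor_id in ancestors(concept_id)


KIDNEY_DISEASE = "90708001"
chronic_is_kidney_disease = is_a("236425005", KIDNEY_DISEASE)
```

Under these assumed edges, a record coded as Chronic renal impairment would be counted when reporting on Kidney disease, without anyone listing every subtype by hand.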
4,060,716+ triples, 1,360,000+ relationships
Logical formula: Disorder of abdomen, Kidney finding, Disorder of body cavity, Disorder of kidney and/or ureter, Finding_site.Kidney structure
What can SNOMED CT be used for?
Data Harmonization Process
HIS1
HIS2
HIS3
DB2
DB3
DB1
Schema Tables
Tables Schema
Schema Tables
IT Staff (Consolidate the data)
Subject Matter Expert (Interpret the data)
- Time Consuming
- Limited Resources
- Skills-dependent
- Prone to errors
Automate processes
Enable semantics
Improve reporting accuracy
SNOMED CT Big Data: Freetext Challenge
ID | Patient Name | Gender | Symptoms            | Diagnosis
1  | xx           | M      | Fatigue, Chest pain | Heart failure
2  | yy           | F      | Breathlessness      | Asthma

ID | Nama Pesakit | Jantina | Tanda-tanda       | Diagnosis
1  | aa           | L       | Letih, Sakit dada | Lemah jantung
2  | bb           | P       | Sesak nafas       | Penyakit lelah

ID | Pat.Name | Sex    | Disorder                 | Diagnosis
1  | kk       | Male   | Tiredness, Pain in chest | Cardiac failure
2  | ll       | Female | Dyspnea                  | BHR
HIS1
HIS2
HIS3 SNOMED CT (Refsets)
Challenge of freetext
SNOMED CT codes (each shown once per system, HIS1 to HIS3): [371484003], [346741003], [263495000], [243814003], [439401001], [276179002], [84229001], [29857009], [84114007], [267036007]
Query: how many cases of Dyspnea?
1 3
[195967001] [195967001] [195967001]
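The query illustrates why free text needs harmonizing: a literal search for "Dyspnea" finds one record, while mapping each term to a shared concept code finds the same symptom in all three systems. The sketch below uses the symptom columns from the three tables. The term-to-code mapping is an assumption made for illustration (the codes appear on the slide, but the slide does not state which term maps to which code), and reading the "1" and "3" as before and after counts is also an assumption.

```python
# Symptom columns copied from the three hospital information systems.
SYMPTOMS = {
    "HIS1": ["Fatigue, Chest pain", "Breathlessness"],
    "HIS2": ["Letih, Sakit dada", "Sesak nafas"],
    "HIS3": ["Tiredness, Pain in chest", "Dyspnea"],
}

# ASSUMPTION: hypothetical mapping from free-text terms to concept codes.
TERM_TO_CONCEPT = {
    "fatigue": "84229001", "letih": "84229001", "tiredness": "84229001",
    "chest pain": "29857009", "sakit dada": "29857009",
    "pain in chest": "29857009",
    "breathlessness": "267036007", "sesak nafas": "267036007",
    "dyspnea": "267036007",
}


def naive_count(term):
    """Count records whose free text literally contains the term."""
    return sum(
        term.lower() in cell.lower()
        for cells in SYMPTOMS.values()
        for cell in cells
    )


def harmonized_count(term):
    """Count records containing any term mapped to the same concept code."""
    target = TERM_TO_CONCEPT[term.lower()]
    count = 0
    for cells in SYMPTOMS.values():
        for cell in cells:
            codes = {
                TERM_TO_CONCEPT.get(part.strip().lower())
                for part in cell.split(",")
            }
            if target in codes:
                count += 1
    return count


before = naive_count("Dyspnea")
after = harmonized_count("Dyspnea")
```

In this sketch `before` is 1 and `after` is 3, because "Breathlessness", "Sesak nafas" and "Dyspnea" all resolve to the same code.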
Challenges and Solution
Internet
Manual Searching and Monitoring: Expensive, Lengthy and subject to Error
Automatic: Eliminate the manual process, Save time and cost
Aid in Decision Making
Centralized Analytics
Dashboards
Social Media Analytics
Internet
1. Automated Data Harvesting: Harvest content from the internet (world wide web, social web, content web)
2. Social Media Analysis: Social Media Analytics to generate insights about the topic of interest
3. Visual reports on Insights presented to decision maker for further action
Domain Ontology
Mi-Intelligence Process Overview
Expansion
Harvesting
Processing
Subjective Analytics
Visual Analytics
Domain Ontology
Insights
GATHER DATA
SEMANTIFY DATA
DISCOVER KNOWLEDGE
IDENTIFY PATTERNS
Domain Ontology
ESTABLISH SEARCH SPACE
Automated Data Harvesting
1. Harvest content from the internet (world wide web, social web, content web)
Add new topics to analyse
Summary of data harvested and processed
Search topics
Content Insights
Where are the posts coming from?
Drill down to discover detailed information about posts
Who are the users?
Social Media Analysis
Sentiments
Emotions
Anxiety
Domain Ontology: machine understandable domain knowledge
Social Media Analysis
2. Social Media Analytics to generate insights about the topic of interest
Sub topics related to the main topic
Sentiments from social media
Sentiments from different regions
Social Media Analysis
2. Social Media Analytics to generate insights about the topic of interest
Drill down to discover detailed information about posts
Sentiments from different regions
Narrow down to the individual posts of interest
Social Network Analysis (Wikipedia)
Social network analysis (SNA) is the process of investigating social structures through the use of network and graph theories.
It characterizes networked structures in terms of nodes (individual actors, people, or things within the network) and the ties or edges (relationships or interactions) that connect them.
It is a key technique in modern sociology. It has also gained a significant following in anthropology, biology, communication studies, economics, geography, history, information science, organizational studies, political science, social psychology, development studies, and sociolinguistics.
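The node-and-edge description above maps directly onto a small data structure. As a minimal sketch, the code below builds an undirected network from a list of ties and computes normalized degree centrality, that is, the share of other nodes each node is directly connected to. The edge list and node names are made up for illustration.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Normalized degree centrality for an undirected network.

    Each node's score is (number of distinct neighbours) / (n - 1),
    where n is the number of nodes that appear in at least one edge.
    """
    neighbours = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    n = len(neighbours)
    if n < 2:
        return {node: 0.0 for node in neighbours}
    return {node: len(adj) / (n - 1) for node, adj in neighbours.items()}


# Hypothetical ties between four actors.
EDGES = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C")]

centrality = degree_centrality(EDGES)
most_connected = max(centrality, key=centrality.get)
```

Here actor A is tied to all three other actors and scores 1.0, while D, with a single tie, scores one third.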
Mi-Visualitics™
Unstructured Data
Structured Data
Filtering Visualizing
Centrality Analysis Ego-group Extraction
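Ego-group extraction usually means taking one actor (the ego), its neighbours within a few hops, and the ties among them. The sketch below shows that general idea on a made-up edge list; it is not a description of how Mi-Visualitics implements the feature, and `ego_group` is a hypothetical helper name.

```python
def ego_group(edges, ego, radius=1):
    """Return (members, ties): nodes within `radius` hops of `ego`,
    and the edges whose two endpoints are both members."""
    neighbours = {}
    for a, b in edges:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)

    members = {ego}
    frontier = {ego}
    for _ in range(radius):
        # Expand one hop outwards, skipping nodes already collected.
        frontier = {
            n for node in frontier for n in neighbours.get(node, ())
        } - members
        members |= frontier

    ties = [(a, b) for a, b in edges if a in members and b in members]
    return members, ties


# Hypothetical network.
EDGES = [("A", "B"), ("B", "C"), ("C", "D"), ("A", "E")]

members_1, ties_1 = ego_group(EDGES, "A")
members_2, ties_2 = ego_group(EDGES, "A", radius=2)
```

Raising `radius` widens the group from direct contacts to contacts of contacts, which is one way to control how much of the network an analyst sees around a person of interest.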
Mi-Visualitic™
Clustering
Clique Discovery
Path Searching
Degree Centrality
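Of the analyses listed, path searching is the simplest to illustrate: given two actors, find a shortest chain of ties that connects them. The sketch below uses breadth-first search on an undirected, unweighted network. It shows the general technique only, under the assumption that ties are unweighted; the edge list is made up.

```python
from collections import deque


def shortest_path(edges, start, goal):
    """Return one shortest path from start to goal as a list of nodes,
    or None if the two nodes are not connected."""
    neighbours = {}
    for a, b in edges:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)

    previous = {start: None}  # node -> node it was reached from
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            # Walk back through `previous` to rebuild the path.
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for nxt in sorted(neighbours.get(node, ())):
            if nxt not in previous:
                previous[nxt] = node
                queue.append(nxt)
    return None


# Hypothetical network with two routes from A to D.
EDGES = [("A", "B"), ("B", "C"), ("C", "D"), ("A", "E"), ("E", "D")]

path = shortest_path(EDGES, "A", "D")
```

Breadth-first search visits nodes in order of hop distance, so the first time it reaches the goal it has found a path with the fewest ties.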
Mi-Visualitic™
Suspect Data Preparations
Harvest, Extract, Harmonize, and Transform Data
Information related to Suspects
Drugs Cases
JKDM
WCO
Rule Mining (Machine Learning)
SME (Risk Officer)
JKDM HQ
Suspect KB
Rules
Social Network Identification (Mi-Target)
750 Suspects, 30 Million Facts
Suspect Network Analysis
Harvest, Extract, Harmonize, and Transform Data
Information related to Suspects
Suspect Network Analysis Engine (Mi-Visualitics)
Potential High Risk Individuals
• Most Influential
• Top Connectors
Intelligence Officers
Suspect Watchlist
Drugs Cases
JKDM
WCO
Suspect KB
Intelligence Officers
Other Information
Rule Mining (Machine Learning)
SME (Risk Officer)
Passenger Risk Profiling
Adaptive Personal Demographic Analysis (Mi-Target)
Observation:
• Travelling Purpose
• Duration of Stay
• Profession
• Physical Appearance
Flight Information:
• Flight Code
• Airline Operator
• Departing Airport
• Arrival Airport
Personal Information:
• Name
• Passport Number
• Nationality
• Age
Suspect Watchlist
JKDM HQ
Airports
Suspect KB
Rules
Profiling Rules using Ego-Group
Intelligence Officers
Suspect Watchlist
Hector B.L
Hector Beltran Leyva
Adaptive Personal Demographic Analysis (Mi-Target)
Suspect KB
Rules
Suspect Network Analysis Engine (Mi-Visualitics)
So, What are we doing in AI-Lab, MIMOS?
Are we doing Big Data Analytics?
ANS: YES (but focused on unstructured data)
Specifically, Social Media Intelligence & Social Network Intelligence
Are we working on Cognitive Computing?
ANS: YES, for the last 8 years, and it will continue to be one of the main R&D focuses in RM-11.
Are we working on Prescriptive Analytics?
ANS: Not yet, but it is the MAIN focus of R&D in RM-11.