For efficient and innovative use of big data, it is important to integrate multiple databases across domains. For example, various public databases have been developed in the life sciences, and finding novel scientific results with them is an essential technique. In social and business areas, open data strategies in many countries promote the diversity of public data, and how to combine big data and open data is a big challenge. That is, the diversity of datasets is a problem to be solved for big data. Ontology provides systematized knowledge for integrating multiple datasets across domains together with their semantics. Linked Data also provides techniques to interlink datasets based on Semantic Web technologies. We consider that combinations of ontology and Linked Data based on ontological engineering can contribute to solving the diversity problem in big data. In this talk, I discuss how ontological engineering could be applied to big data, with some trial examples.
Ontology Engineering for Big Data
Kouji Kozaki
The Institute of Scientific and Industrial Research (I.S.I.R),
Osaka University, Japan
2013/09/03 1
Ontology and Semantic Web for Big Data (ONSD2013) Workshop in the 2013 International Computer Science and Engineering Conference ( ICSEC2013), Bangkok, Thailand, 5th Sep. 2013
ONSD2013@ICEC2013
Ontological topics
Some examples of topics I work on:
• Definition of disease
  – What is a “disease”? What is a “causal chain”? Is it an object or a process?
• Role theory
  – What is the ontological difference among the following concepts? Person, Teacher, Walker, Murderer, Mother, …
  – Labels in the figure: Natural type; Role (dependent concept)
Self introduction: Kouji KOZAKI
Brief biography
• 2002: Received Ph.D. from Graduate School of Engineering, Osaka University.
• 2002-: Assistant Professor; 2008-: Associate Professor in ISIR, Osaka University.
Specialty
• Ontological Engineering
Main research topics
• Fundamental theories of ontological engineering
• Ontology development tool based on the ontological theories
• Ontology development in several domains and ontology-based applications
Hozo (法造): an environment for ontology building/using (1996-)
• Software to support ontology (= 法) building (= 造) and use
• Available at http://www.hozo.jp as free software
• Registered users: 3,500 (June 2012)
• Java API for application development is provided.
• Supported formats: original format, RDF(S), OWL.
• Linked Data publishing support is coming soon.
My history on Ontology Building
2002-2007 Nano technology ontology Supported by NEDO(New Energy and Industrial Technology Development Organization)
2006- Clinical Medical ontology Supported by Ministry of Health, Labour and Welfare, Japan Cooperated with: Graduate School of Medicine, The University of Tokyo.
2007-2009 Sustainable Science ontology Cooperated with: Research Institute for Sustainability Science, Osaka Univ.
2007-2010 IBMD(Integrated Bio Medical Database) Supported by MEXT through "Integrated Database Project". Cooperated with: Tokyo Medical and Dental University, Graduate School of Medicine, Osaka U.
2008-2012 Protein Experiment Protocol ontology Cooperated with: Institute for Protein Research, Osaka Univ.
2008-2010 Bio Fuel ontology Supported by the Ministry of Environment, Japan.
2009-2012 Disaster Risk ontology Cooperated with: NIED (National Research Institute for Earth Science and Disaster Prevention)
2012- Bio mimetic ontology Supported by JSPS KAKENHI Grant-in-Aid for Scientific Research on Innovative Areas
2012- Ontology of User Action on Web Cooperated with: Consumer first Corp.
2013- Information Literacy ontology Supported by JSPS KAKENHI
Agenda
(1) Motivation
  • Ontology vs. Big Data
  • How can we use ontology for big data?
(2) Case Studies towards Ontology Engineering for Big Data
  • Ontology exploration according to the users' viewpoints
  • A disease ontology developed in the Japanese Medical Ontology Project
(3) Concluding Remarks
Ontology vs. Big Data
Question: Is ontology useful for Big Data?
My answer: (I believe) Yes. Combining ontology and Big Data could provide new solutions for many problems.
Ontology
• Not so big (though some are big).
• Built by hand.
• Used based on semantics, by reasoning.
Big Data
• Very big.
• Collected automatically.
• Used without semantics, by machine learning or data mining.
How to combine Ontology and Big Data
Basic technology
• Mapping ontology to database
  – Mapping classes (concepts) defined in the ontology to the database schema
  – Mapping classes/instances defined in the ontology to data in the DB
• Adding metadata to data using vocabulary defined in the ontology
  – e.g. annotation of documents such as web pages, papers, etc.
• Converting a database (e.g. RDB) to an ontology-based (RDF) database
  – e.g. linked data such as DBpedia, some bioinformatics DBs, etc.
You can choose among these technologies according to your purpose.
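As a rough illustration of the third option (converting relational data into an ontology-based, RDF-style form), the sketch below turns table rows into subject-predicate-object triples. It is only a sketch under stated assumptions: the `http://example.org/` namespace, the `patient` table, its columns and the class mapping are all hypothetical and do not come from the talk.

```python
from typing import Dict, Iterable, List, Tuple

Triple = Tuple[str, str, str]

# Hypothetical namespace and table-to-class mapping (assumptions for illustration).
BASE = "http://example.org/"
CLASS_MAP = {"patient": BASE + "ontology/Patient"}
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def rows_to_triples(table: str, key: str,
                    rows: Iterable[Dict[str, object]]) -> List[Triple]:
    """Map each row to one typing triple plus one triple per non-key column."""
    triples: List[Triple] = []
    for row in rows:
        subject = f"{BASE}{table}/{row[key]}"
        # The table name is mapped to an ontology class.
        triples.append((subject, RDF_TYPE, CLASS_MAP[table]))
        for column, value in row.items():
            if column == key:
                continue
            # Each remaining column becomes a property named after the column.
            triples.append((subject, f"{BASE}ontology/{column}", str(value)))
    return triples


rows = [{"id": 1, "blood_pressure_mmHg": 180}]
triples = rows_to_triples("patient", "id", rows)
for triple in triples:
    print(triple)
```

A real conversion would normally rely on an RDF library and a declared mapping rather than hand-built strings; the point here is only the shape of the row-to-triple correspondence.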
Case Study: A method for mapping the Abnormality Ontology (in the medical domain) to a medical database
Classification of Abnormality Representations 1
Various types of abnormality representations are used in the medical domain:
• blood pressure 200 mmHg / blood pressure is high / hypertension
• blood glucose level 150 mg/dL / blood glucose level is high / hyperglycemia
Classification of Abnormality Representations 2
• Quantitative representation: blood pressure 200 mmHg; blood glucose level 150 mg/dL
• Qualitative representation: blood pressure is high; blood glucose level is high
• Property representation: hypertension; hyperglycemia
☑ Diagnosis: identify a concrete value for each patient in clinical tests
☑ Definition of disease
※ Based on the quality and quantity ontologies in the upper ontology “YAMATO”.
[Figure: the Abnormality Ontology is mapped to the Medical Database.]
Is-a hierarchy of Abnormality Ontology
• Layer 1: Generic Abnormal States (object-independent)
  – Nodes shown: Structural abnormality, Material abnormality, Size abnormality, Formational abnormality, Conformational abnormality, Topological abnormality, Small in size, Small in line, Small in area, Small in volume, Large in size, …
• Layer 2: Object-dependent Abnormal States
  – Nodes shown: Narrowing tube, Narrowing of valve, Vascular stenosis, Arterial stenosis, Coronary stenosis, Gastrointestinal tract stenosis, Intestinal stenosis, Esophageal stenosis, …
• Layer 3: Specific context-dependent Abnormal States
  – Nodes shown: Coronary stenosis in Angina pectoris, Coronary stenosis in Arteriosclerosis, Intestinal stenosis in Ileus, Esophageal stenosis in Esophagitis
• Dependency labels in the figure: Tube-dependent, Blood vessel dependent, disease dependent
How can we deal with clinical test data?
• In hospitals, a huge volume of diagnostic/clinical test data has been accumulated.
• Most are quantitative data: e.g., blood pressure 180 mmHg, blood cross-sectional area 40 mm².
• A quantitative value (Vqt) is converted into a qualitative value (Vql) by comparing it with a threshold value: e.g., with a threshold of 140 mmHg, blood pressure 180 mmHg (Vqt) becomes blood pressure “high” (Vql).
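The threshold-based conversion from a quantitative value (Vqt) to a qualitative value (Vql) can be sketched as follows. The 140 mmHg threshold is the example value from the slide; the table name, the function name, the "not high" label and the use of a strict comparison are assumptions made for illustration, not the project's actual rules.

```python
# Example threshold taken from the slide ("e.g., 140 mmHg"); the dictionary
# itself is a hypothetical stand-in for clinically defined thresholds.
THRESHOLDS = {"blood pressure": 140.0}  # unit: mmHg


def to_qualitative(attribute: str, quantity: float) -> str:
    """Convert a quantitative value (Vqt) into a qualitative value (Vql)."""
    threshold = THRESHOLDS[attribute]
    # Assumption: values strictly above the threshold count as "high".
    return "high" if quantity > threshold else "not high"


print(to_qualitative("blood pressure", 180.0))
print(to_qualitative("blood pressure", 120.0))
```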
Basic policy for definition of abnormal states
• A property is decomposed into a tuple <Attribute (A), Attribute Value (V)> in a qualitative form.
  – Example: hypertension (Property, P) = <blood pressure (Attribute, A), high (Value, V)>
• A qualitative representation can be converted into a property representation.
[Figure: from clinical test data to abnormality knowledge]
• Quantity (clinical test data): blood pressure 180 mmHg; cross-section area xx cm²
• Quality: blood pressure high; cross-section area small
• Property (abnormality knowledge): Hypertension; Narrowing
Our model enables “interoperability” from clinical test data to conceptual knowledge about abnormal states.
Quantitative data can be converted, via a qualitative representation, into a property representation.
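The step from a qualitative <Attribute, Value> tuple to a property can be pictured as a simple lookup. The three entries below are the examples that appear on the slides; the table and function names are hypothetical, and a real system would draw these correspondences from the ontology rather than from a hard-coded dictionary.

```python
from typing import Optional, Tuple

# <Attribute (A), Value (V)> -> Property (P), using only the slide examples.
PROPERTY_TABLE = {
    ("blood pressure", "high"): "hypertension",
    ("blood glucose level", "high"): "hyperglycemia",
    ("cross-section area", "small"): "narrowing",
}


def to_property(pair: Tuple[str, str]) -> Optional[str]:
    """Return the property for an <A, V> tuple, or None if it is not defined."""
    return PROPERTY_TABLE.get(pair)


print(to_property(("blood pressure", "high")))
print(to_property(("blood pressure", "low")))
```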
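Case Study: Annotation on web browsing history of users based on Web User Action Ontology
[Figure: Changes in types of usage by conference. Legend: (1) Common Vocabulary, (2) Search, (3) Index, (4) Data Schema, (5) Knowledge Sharing, (6) Semantic Analysis, (7) Information Extraction, (8) Knowledge Modeling, (9) Knowledge Systematization. Axes: the number of papers surveyed in each conference (9, 19, 18, 24, 25, 11, 23, 26, 17, 18); the number of types of usage.]
Annotation items (table columns: item, description, correspondence to the Web User Action Ontology)
• Title: title of the web site indicated by the URL. Correspondence: -
• Web site category (or instance): kind of web site (e.g., accommodation portal, individual site of an accommodation facility, news site, blog, etc.) or, for a major site, the instance name (e.g., Rakuten Travel, Yahoo!)
• Action sequence name: name of the partial action sequence in the CV (conversion) task; one of CV trigger action sequence, pre-CV action sequence, CV target sequence, post-CV action sequence
• Constituent action (RH): role of the partial action within the action sequence
• Constituent action (CC): the "kind of action" that forms the partial action within the action sequence
• Partial action of the constituent action: when the constituent action is a composite action (currently only information-gathering actions), its role within that composite action; currently one of obtaining an information source, entering conditions, browsing a list of candidate information, browsing individual information
• Target object: name of the instance that the web site describes. ※ To be considered (the "target" of the web page)
• Target category: kind of thing that the web site describes. ※ To be considered (add a class?)
• Target information category: kind of information targeted by the action (mainly information-gathering actions are assumed)
• Place name (prefecture): geographic information about what the web site describes (prefecture-name level is assumed). ※ To be considered?
• Landmark: landmarks within the geographic information about what the web site describes (names of tourist spots, facilities, etc.). ※ To be considered?
• CV conditions (out of scope this time; results of URL analysis are used): conditions set when performing the conversion (stay date, price, kind of facility, etc.). ※ Assumes use of the results of URL analysis. Correspondence: -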
Web browsing history (access logs) of users
List of all URLs each user accessed, for 130M users × 2 years
Web User Action Ontology
Analysis of consumption behavior
Annotation on web browsing history of users based on ontology
This is collaborative work with Consumer first, Inc.
Basic Idea
• Format of the access logs (web browsing history) of users provided by Consumer first, Inc.
  – User ID, access date and time, URL, …
• Problem
  – A URL is a meaningless string for humans, although one can guess its contents if it is a famous site.
  – The access logs are diverse. In order to analyze them, we need consistent meaning.
• Annotations on the access log
  – We tried to add metadata that present a human-understandable meaning for each URL.
  – We also developed a prototype for automatic annotation. Its recall and relevance rate are about 0.7 to 0.9. We think this result is not bad for statistical analysis.
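The annotation idea can be illustrated with a toy example that attaches a site category to each access-log entry. This is not the prototype described above: the host names, the category table and the field names are hypothetical, and a real annotator would take its vocabulary from the Web User Action Ontology.

```python
from typing import Dict
from urllib.parse import urlparse

# Hypothetical host-to-category table standing in for ontology-based vocabulary.
SITE_CATEGORIES = {
    "travel.example.com": "accommodation portal",
    "news.example.com": "news site",
}


def annotate(entry: Dict[str, str]) -> Dict[str, str]:
    """Return a copy of a log entry with a human-readable site category added."""
    host = urlparse(entry["url"]).hostname
    category = SITE_CATEGORIES.get(host, "unknown")
    return {**entry, "site_category": category}


log_entry = {
    "user_id": "u001",
    "accessed_at": "2013-09-03T10:15:00",
    "url": "https://travel.example.com/hotels?id=1",
}
annotated = annotate(log_entry)
print(annotated["site_category"])
```

Entries whose host is not in the table are marked "unknown", which keeps unannotated URLs visible for later statistical analysis instead of silently dropping them.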
Ontology Engineering for Big Data
Basic technology = How to combine Ontology and Big Data
• Mapping ontology to database
• Adding metadata to data using vocabulary defined in the ontology
• Converting a database (e.g. RDB) to an ontology-based (RDF) database
How to use combinations of Ontology and Big Data
• Ontology can provide semantics to add to raw data.
• Generalized concepts in an ontology can connect data at various concept levels across domains.
• We can use ontology as given (and authorized) knowledge to analyze big data.
Ontology Engineering for Big Data
Features of ontology at the class level
• It reflects an understanding of the target world.
• Well-organized ontologies have rich, generalized knowledge based on consistent semantics.
• Ontologies are systematized knowledge of domains.
Two possible ways to use ontology for big data
• Use ontology to bridge datasets across domains
• Use ontology to combine deep domain knowledge and raw data
[Figure labels: Ontology; Metadata; LOD (Linked Open Data); Big Data]
Case studies
• Use ontology to bridge datasets across domains
  – Understanding an Ontology through Divergent Exploration, presented at ESWC2011
• Use ontology to combine deep domain knowledge and raw data
  – Japanese Medical Ontology project
  – Disease ontology and Ontology of Abnormal State, presented at ICBO (International Conference on Biomedical Ontology) 2011, 2012 and 2013
Use ontology to bridge datasets across domains
Basic technology
• Terms (classes/instances) defined in the ontology are used as common vocabulary for searching data.
• If the ontology has mappings to multiple DBs, the user can search across them.
Motivation and issue
• Combinations of multiple datasets could be valuable for Big Data analysis.
  – e.g. climate and agriculture, healthcare and life science, etc.
• However, obtaining all combinations across multiple Big Data sets is not realistic because of their size.
• Requests by users also differ greatly according to their interests.
• It is important to consider an efficient method to obtain meaningful combinations.
[Figure: the ontology serves as common vocabulary for searching across multiple DBs (documents / raw data).]
A method to obtain meaningful combinations using ontology exploration
[Figure labels: Layer 0 to Layer 4; Problem Setting, Problem Solution, Innovation; Contents Management using the Metadata; Ontology-based Information Retrieval; Divergent Exploration; Map Generation Depending on Viewpoints; Comparison and Convergence of multiple Maps; Context Based Convergence]
• An ontology presents an explicit essential understanding of the target world. It provides base knowledge to be shared among the users.
• Users explore the ontology according to their viewpoints and generate conceptual maps as the result. These maps represent understanding from their own viewpoints.
• They can use the maps as viewpoints (combinations) to get data from multiple DBs.
(Divergent) Ontology exploration tool
1) Exploration of multi-perspective conceptual chains
2) Visualization of conceptual chains
• An ontology built with “Hozo” (ontology editor) is explored.
• Multi-perspective conceptual chains represent the explorer's understanding of the ontology from a specific viewpoint.
• The chains are visualized as conceptual maps from different viewpoints.
[Figure: ontology representation in Hozo]
• A node represents a concept (= rdfs:Class).
• A slot represents a relationship (= rdf:Property).
• Is-a (sub-class-of) relationship
• Referring to another concept
[Figure: screenshot of the exploration tool]
• Aspect dialog: kinds of aspects, property names, constriction tracing classes
• Option settings for exploration
• Conceptual map visualizer: selected relationships are traced and shown as links in the conceptual map.
• Explore the focused (selected) path.
Functions for ontology exploration
• Exploration using the aspect dialog: divergent exploration from one concept, using the aspect dialog at each step
• Search path: exploration of paths from a starting point to ending points
• Post-hoc editing: the tool allows users to edit the map afterwards to extract only its interesting portions.
• Change view: the tool can highlight specified paths of conceptual chains on the generated map according to given viewpoints.
• Comparison of maps: the system can compare generated maps and show the conceptual chains common to both maps.
[Figure labels: Manual exploration; Machine exploration]
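The "comparison of maps" function can be thought of as finding the chains that two maps share. The sketch below represents each map as a collection of conceptual chains (tuples of concept names) and intersects them; this representation and the concept names are assumptions for illustration, not the tool's internal data model.

```python
from typing import Iterable, List, Tuple

Chain = Tuple[str, ...]


def common_chains(map_a: Iterable[Chain], map_b: Iterable[Chain]) -> List[Chain]:
    """Return the conceptual chains that appear in both maps, in sorted order."""
    return sorted(set(map_a) & set(map_b))


# Illustrative maps generated from two different viewpoints (hypothetical data).
map_economist = [
    ("biofuel", "feedstock demand", "food price"),
    ("biofuel", "job creation"),
]
map_ecologist = [
    ("biofuel", "feedstock demand", "food price"),
    ("biofuel", "land use change", "biodiversity loss"),
]

shared = common_chains(map_economist, map_ecologist)
print(shared)
```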
Search Path
• Select a starting point and ending points (1), (2), (3).
• The tool finds all possible paths from the starting point to the ending points.
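Finding all paths from a starting point to a set of ending points can be sketched as a depth-first search over a directed graph of concepts. This is a generic technique rather than the tool's actual algorithm; the graph, the concept names and the `max_links` limit are assumptions for illustration.

```python
from typing import Dict, List, Set


def find_paths(graph: Dict[str, List[str]], start: str, ends: Set[str],
               max_links: int = 5) -> List[List[str]]:
    """Collect every cycle-free path from start to any node in ends."""
    paths: List[List[str]] = []

    def walk(node: str, path: List[str]) -> None:
        if node in ends and len(path) > 1:
            paths.append(list(path))
        if len(path) - 1 >= max_links:
            return  # stop extending paths that already use max_links links
        for nxt in graph.get(node, []):
            if nxt not in path:  # avoid revisiting a concept (no cycles)
                path.append(nxt)
                walk(nxt, path)
                path.pop()

    walk(start, [start])
    return paths


# Illustrative concept graph (not the actual ontology).
graph = {
    "farmland development": ["deforestation", "sugarcane harvesting"],
    "deforestation": ["intentional burning"],
    "sugarcane harvesting": ["intentional burning"],
    "intentional burning": ["air pollution"],
}
paths = find_paths(graph, "farmland development", {"air pollution"})
for path in paths:
    print(" -> ".join(path))
```

The depth limit matters in practice: in a densely connected ontology the number of simple paths grows quickly, so an unbounded search would produce maps too large to read.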
[Figure: Search Path result for the selected ending points]
What does the result mean?
• Problem
• Kinds of method to solve the problem
• Possible combinations of them
DEMO: Ontology Exploration
Usage and evaluation of ontology exploration tool
• Step 1: Usage for knowledge structuring in sustainability science
• Step 2: Verification of the exploring abilities of the ontology exploration tool
• Step 3: Experiments for evaluating the ontology exploration tool
Sustainability Science
Sustainability Science probes interactions between global, social, and human systems, the complex mechanisms that lead to degradation of these systems, and concomitant risks to human well-being.
The journal provides a platform for building sustainability science as a new academic discipline.
These include endeavors to simultaneously understand phenomena and solve problems, uncertainty and application of the precautionary principle, the co-evolution of knowledge and recognition of problems, and trade-offs between global and local problem solving.
Volume 1 / 2006 - Volume 8 / 2013
Editor-in-Chief: Kazuhiko Takeuchi
Managing Editor: Osamu Saito
ISSN: 1862-4065 (print version)
ISSN: 1862-4057 (electronic version)
Knowledge Structuring in Sustainability Science
Sustainability Science (SS)
– We aimed at establishing a new interdisciplinary scheme that serves as a basis for constructing a vision that will lead global society to a sustainable one.
– It requires an integrated understanding of the entire field instead of domain-wise knowledge structuring.
Sustainability science ontology
– Developed in collaboration with domain experts at the Osaka University Research Institute for Sustainability Science (RISS).
– Number of concepts: 649; number of slots: 1,075
Usage of the ontology exploration tool
– It was confirmed that the exploration was fun for them and that the tool had a certain utility for achieving knowledge structuring in sustainability science. [Kumazawa 2009]
http://en.ir3s.u-tokyo.ac.jp/about_sus
Biofuel Use Strategies for Sustainable Development (BforSD, FY2008-FY2010)
One of the sub-themes: development of an ontology-based mapping system which creates comprehensive views of problems and policy measures on biofuel
(1) Structuring biofuel problems: develop the biofuel ontology, which explicitly conceptualizes biofuel problems through literature review and interviews.
(2) Develop an ontology exploration tool which interactively generates conceptual maps with paths between concepts in the biofuel ontology.
(3) In collaboration with other sub-themes, develop an application method of this map tool for policy-making support to find, frame and prioritize relevant problems and policy measures.
(source) US DOE
Verification of Ontology Exploration Tool
Verification methods
1) Enrichment of SS ontology
We enriched the SS ontology on the basis of 29 typical scenarios (cases) structured by domain experts in biofuel through literature review and interviews.
[Figure: 29 scenarios (cases) and 27 conceptual maps]
Positive and negative effects of biofuel
(+): Positive effects, (−): Negative effects, (+/−): Both positive and negative effects
(Source) 1: UN-Energy (2007), 2: FAO (2008), 3: CBD (2008), 4: Martinelli et al. (2008)
1) Energy services for the poor
(+/−) Competition of biomass energy systems with the present use of biomass resources (such as agricultural residues) in applications such as animal feed and bedding, fertilizer, and construction materials [1]
(−) In many developing countries, small-scale biomass energy projects face challenges obtaining finance from traditional financing institutions [1]
(−) Liquid biofuels are likely to replace only a small share of global energy supplies and cannot alone eliminate our dependence on fossil fuels [2]
2) Agro-industrial development and job creation
(+) Biofuel is powering new small- and large-scale agro-industrial development and spawning new industries in industrialized and developing countries [1]
(+/−) In the short-to-medium term, bioenergy use will depend heavily on feedstock costs and reliability of supply, cost and availability of competing energy sources, and government policy decisions [1]
(+) In the longer term, the economics of biofuel will probably improve as agricultural productivity and agro-industrial efficiency improve, more supportive agricultural and energy policies are adopted, carbon markets mature and expand, and new methodologies for carbon sequestration accounting are developed [1]
(+) In the longer term, expanded demand and increased prices for agricultural commodities may represent opportunities for agricultural and rural development [2]
(+) Biofuel industries create jobs, including highly skilled science, engineering, and business-related employment; medium-level technical staff; low-skill industrial plant jobs; and unskilled agricultural labor [1]
(+/−) Small-scale and labor-intensive production often leads to trade-offs between production efficiency and economic competitiveness [1]
3) Health and gender
(−) Market opportunities cannot overcome existing social and institutional barriers to equitable growth, with exclusion factors such as gender, ethnicity, and political powerlessness, and may even worsen them [2]
(−) Forest burning for development of feedstock plantation and sugarcane burning to facilitate manual harvesting result in air pollution, higher surface water runoff, soil erosion, and unintended forest fires [3,4]
(−) Exploitation of cheap labor (plantation and migrant workers) [4]
(−) Increased use of pesticides could create health hazards for laborers and communities living near areas of feedstock production [1,3]
4) Agricultural structure
(−) The demand for land to grow biofuel crops could put pressure on competing land usage for food crops, resulting in an increase in food prices [1,2]
(+/−) Significant economies of scale can be gained from processing and distributing biofuels on a large scale. The transition to liquid biofuels can be harmful to farmers who do not own their own land, and to the rural and urban poor who are net buyers of food [1]
(−) While global market forces could lead to new and stable income streams, they could also increase marginalization of poor and indigenous people and affect traditional ways of living if they end up driving small farmers without clear titles from their land and destroying their livelihood [1]
5) Food security
(−) Demand for agricultural feedstock for liquid biofuels will be a significant factor for agricultural markets and world agriculture over the next decade and perhaps beyond [2]
(−) Rapidly growing demand for biofuel feedstock has contributed to higher food prices, which poses an immediate threat to the food security of poor net food buyers in both urban and rural areas [2]
(+/−) The effect of biofuels on food security is context-specific, depending on the particular technology and country characteristics involved [1]
6) Government budget
(−) Because ethanol is used largely as a substitute for gasoline, providing a large tax reduction for blending ethanol and gasoline reduces government revenue from this tax, mainly targeting the non-poor [1]
(−) Production of biofuels in many countries, except sugarcane-based ethanol production in Brazil, is not currently economically viable without subsidies, given existing agricultural production and biofuel-processing technologies and recent relative prices of commodity feedstock and crude oil [2]
(−) Policy interventions, especially in the form of subsidies and mandated blending of biofuels with fossil fuels, are driving the rush to liquid biofuels, which leads to high economic, social, and environmental costs in both developed and developing countries [2]
7) Trade, foreign exchange balance, and energy security
(+) Diversifying global fuel supplies could have beneficial effects on the global oil market and many developing countries because fossil fuel dependence has become a major risk for many developing economies [1]
(+/−) Rapidly rising demand for ethanol has had an impact on the price of sugar and maize in recent years, bringing substantial rewards to farmers not only in Brazil and the United States but around the world [1,2]
(−) Linking of agricultural prices to the vicissitudes of the world oil market clearly presents risks; however, it is an essential transition to the development of a biofuel industry that does not rely on major food commodity crops [1]
8) Biodiversity and natural resource management
(+/−) Depending on the types of crop grown, what they replaced, and the methods of cultivation and harvesting, biofuels can have negative and positive effects on land use, soil and water quality, and biodiversity [1,3]
(−) Problems with water availability and use may represent a limitation on agricultural biofuel production [1,3]
(−) Introduction of criteria, standards, and certification schemes for biofuels may generate indirect negative environmental and biodiversity effects, passively in other countries [3]
(−) If the production of biofuel feedstock requires increased fertilizer and pesticide use, there could be additional detrimental effects such as increases in GHG emissions and eutrophicating nutrients, and biodiversity loss [3]
(−) Wild biodiversity is threatened by loss of habitat when the area under crop production is expanded, whereas agricultural biodiversity is vulnerable in the case of large-scale monocropping, which is based on a narrow pool of genetic material, and can also lead to reduced use of traditional varieties [2,3]
(+) If crops are grown on degraded or abandoned land, such as previously deforested areas or degraded crop- and grasslands, and if soil disturbances are minimized, feedstock production for biofuels can have a positive impact on biodiversity by restoring or conserving habitat and ecosystem function [3]
9) Climate change
(+/−) Full lifecycle GHG emissions of biofuel vary widely based on land use changes, choice of feedstock, agricultural practices, refining or conversion processes, and end-use practices [1,2]
(−) Land use change associated with production of biofuel feedstock can affect GHG emissions; draining wetlands and clearing land with fire are detrimental with regard to GHG emissions and air quality [2,3]
(−) The greatest potential for reducing GHG emissions comes from replacement of coal rather than petroleum fuels [1]
(+) Biofuels offer the only realistic near-term renewable option for displacing and supplementing liquid transport fuels [1]
Verification of Ontology Exploration Tool
The concepts appearing in these scenarios were extracted and generalized to be added to the ontology.
Example: Air pollution, a cause of forest fire, soil deterioration, and water pollution are attributed to intentional burning when forest is logged or sugarcane is harvested in farmland development for biofuel crops.
Concepts extracted from this example:
• burn agriculture (deforestation, soil deterioration caused by farmland development for biofuel crops)
• ⇒ harvest sugarcane (air pollution caused by intentional burning)
• disruption of ecosystem caused by deforestation (water pollution)
Verification of Ontology Exploration Tool
Verification methods
1) Enrichment of SS ontology
We enriched the SS ontology on the basis of 29 typical scenarios (cases) structured by domain experts in biofuel through literature review and interviews.
2) Verification of scenario reproducing operations
We verified whether the ontology exploration tool could generate conceptual maps which represent the original scenarios.
Result:
– 93% (27/29) of the scenarios were successfully reproduced as conceptual maps.
[Figure: 29 scenarios (cases) and 27 conceptual maps]
Usage and evaluation of ontology exploration tool
• Step 1: Usage for knowledge structuring in sustainability science
• Step 2: Verification of the exploring abilities of the ontology exploration tool
• Step 3: Experiments for evaluating the ontology exploration tool
  1) Whether maps meaningful to domain experts were obtained.
  2) Whether meaningful maps other than anticipated maps were obtained.
Anticipated maps: maps representing the contents of the scenarios anticipated by the ontology developers at the time of ontology construction.
Note: the subjects do not know which scenarios are anticipated.
Experiment for evaluating ontology exploration tool
Experimental method
1) The four experts generated conceptual maps with the tool in accordance with the condition settings of given tasks.
2) They removed paths that were apparently inappropriate from the paths of conceptual chains included in the generated maps.
3) They selected paths according to their interests and entered a four-level general evaluation with free comments.
The subjects: 4 experts in different fields.
• A: Agricultural economics
• B: Social science (stakeholder analysis)
• C: Risk analysis
• D: Metropolitan environmental planning
Four-level general evaluation:
• A: Interesting
• B: Important but ordinary
• C: Neither good nor poor
• D: Obviously wrong
Experimental results (1)
Number of selected paths and path distribution based on general evaluation (per-row evaluation counts are listed left to right across columns A to D; empty cells are not shown)
Task | Expert | Number of selected paths | Evaluation counts
Task 1 | Expert A | 2 | 2
Task 1 | Expert A (second time) | 1 | 1
Task 1 | Expert B | 7 | 4, 1, 2
Task 1 | Expert B (second time) | 6 | 3, 3
Task 1 | Expert C | 8 | 1, 5, 2
Task 1 | Expert D | 3 | 1, 1, 1
Task 2 | Expert A | 1 | 1
Task 2 | Expert B | 6 | 5, 1
Task 2 | Expert C | 7 | 2, 4, 1
Task 2 | Expert D | 5 | 3, 1, 1
Task 3 | Expert B | 8 | 4, 2, 2
Task 3 | Expert C | 4 | 2, 2
Task 3 | Expert D | 3 | 3
Total | | 61 | A: 30, B: 22, C: 8, D: 1
[Fig. 7: The rate of paths. Venn diagram of (N) nodes and links included in the paths of anticipated maps and (M) nodes and links included in the paths generated and selected by the experts. The area of each circle represents the number of nodes and links included in the paths; the numbers in the circles are not actual counts but the ratios between the paths.]
Number of maps generated: 13
Number of paths evaluated: 61
A: Interesting 30 (49%)
B: Important but ordinary 22 (36%)
C: Neither good nor poor 8 (13%)
D: Obviously wrong 1 (2%)
A + B: 85%
We can conclude that the tool could generate maps or paths sufficiently meaningful for experts.
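The percentages reported for the four evaluation levels can be checked with a few lines of arithmetic. The following is only a sketch: the counts are the totals reported on the slide, while the variable and function names are made up for illustration.

```python
# Counts per evaluation level, taken from the totals reported on the slide.
counts = {
    "A: Interesting": 30,
    "B: Important but ordinary": 22,
    "C: Neither good nor poor": 8,
    "D: Obviously wrong": 1,
}
total_paths = sum(counts.values())


def percent(part, whole):
    """Share of `part` in `whole`, rounded to the nearest whole percent."""
    return round(100 * part / whole)


# Share of each evaluation level among all evaluated paths.
shares = {level: percent(n, total_paths) for level, n in counts.items()}

# Paths rated A or B are the ones the experts found interesting or important.
meaningful_share = percent(
    counts["A: Interesting"] + counts["B: Important but ordinary"], total_paths
)

for level, share in shares.items():
    print(f"{level}: {counts[level]} ({share}%)")
print(f"A + B: {meaningful_share}%")
```

The combined share of levels A and B is the figure that supports the conclusion that the generated paths were meaningful for the experts.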
Experimental results (2)
Quantitative comparison of the anticipated maps with the maps generated by the subjects
(N) Nodes and links included in the paths of anticipated maps
(M) Nodes and links included in the paths generated and selected by the experts
Relative sizes: N only 50, N∩M 50, M only 150
About 75% of the paths in the generated maps are new paths that were not anticipated from the typical scenarios.
This is meaningful enough to claim positive support for the developed tool. It suggests that the tool has sufficient potential to present unexpected contents and stimulate conception by the user.
About half (50%) of the paths included in the anticipated maps were included in the maps generated by the experts.
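The two rates quoted for the Venn diagram follow from its three relative sizes. The sketch below assumes the reading "N only = 50, N∩M = 50, M only = 150" of the figure's numbers, which is the assignment consistent with the 75% and 50% stated on the slide; the variable names are illustrative.

```python
# Relative sizes read from the Venn diagram (assumed assignment, see lead-in).
n_only = 50    # nodes/links only in the anticipated maps (N)
both = 50      # nodes/links in both N and M (the intersection)
m_only = 150   # nodes/links only in the expert-generated maps (M)

# Share of the generated maps that was NOT anticipated by the ontology developers.
new_path_rate = m_only / (m_only + both)

# Share of the anticipated maps that the experts actually reproduced.
anticipated_coverage = both / (n_only + both)

print(f"New paths in generated maps: {new_path_rate:.0%}")
print(f"Anticipated paths covered:   {anticipated_coverage:.0%}")
```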
Summary: Use ontology to bridge datasets across domains
Basic technology
Terms (classes/instances) defined in an ontology are used as a common vocabulary for searching data.
If the ontology has mappings to multiple DBs, the user can search across them.
Motivation and issue
Combinations of multiple datasets could be valuable for Big Data analysis.
However, obtaining all combinations across multiple Big Data sets is not realistic because of their size.
Requests by users differ greatly according to their interests.
Ontology Engineering for Big Data to solve the issue
Ontology exploration contributes to obtaining meaningful combinations (= viewpoints) according to the users' interests.
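The "basic technology" point, using ontology terms as a common vocabulary mapped to several databases, can be illustrated with a toy example. This is a minimal sketch, not the tool described in the talk: the database names, columns, local codes, and records are all hypothetical stand-ins.

```python
# Hypothetical mapping: ontology term -> list of (database, column, local value).
# Each database uses its own local vocabulary for the same concept.
TERM_MAPPING = {
    "Biofuel": [
        ("energy_db", "fuel_type", "BIO"),
        ("agriculture_db", "crop_use", "biofuel crop"),
    ],
}

# Toy stand-ins for two independent databases (illustrative records only).
DATABASES = {
    "energy_db": [
        {"fuel_type": "BIO", "plant": "plant-1"},
        {"fuel_type": "COAL", "plant": "plant-2"},
    ],
    "agriculture_db": [
        {"crop_use": "biofuel crop", "crop": "sugarcane"},
        {"crop_use": "food crop", "crop": "rice"},
    ],
}


def search_by_term(term):
    """Return (database name, record) pairs matching an ontology term."""
    hits = []
    for db_name, column, local_value in TERM_MAPPING.get(term, []):
        for record in DATABASES[db_name]:
            if record.get(column) == local_value:
                hits.append((db_name, record))
    return hits


for db_name, record in search_by_term("Biofuel"):
    print(db_name, record)
```

The user asks for one ontology term and receives records from both databases, even though neither database stores that term literally.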
Case studies
Use ontology to bridge datasets across domains
Understanding an Ontology through Divergent Exploration
Presented at ESWC2011
Use ontology to combine deep domain knowledge and raw data
Japanese Medical Ontology project
Disease ontology and Ontology of Abnormal State
Presented at ICBO (International Conference on Biomedical Ontology) 2011, 2012 and 2013
Medical ontology project in Japan
Developed ontologies
Disease ontology: Definitions of diseases as causal chains of abnormal states. 6,000+ diseases
Anatomy ontology: Connections between blood vessels, nerves, and bones: 10,000+
They are based on ontological frameworks (upper-level ontology) which can be applied to other domains
Models for causal chains
Abnormal state ontology for data integration
General framework to define complicated structures
Disease Ontology
Definition of the disease ontology
How to connect the disease ontology to medical database
An example of the causal chains that constitute diabetes.
Legend: Disorder (nodes); Causal relationship; Core causal chain of a disease (each color represents a disease); Possible causes and effects
Diseases shown: Diabetes; Type I diabetes; Steroid diabetes; Diabetes-related blindness
Abnormal states shown: Destruction of pancreatic beta cells; Lack of insulin in the blood; Deficiency of insulin; Long-term steroid treatment; Elevated level of glucose in the blood; Loss of sight
Is-a relations between diseases are derived from the chain-inclusion relationship between causal chains.
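The chain-inclusion idea can be sketched as a subset test over causal links. This sketch assumes one reading of the slide: a disease is a subtype of another when its causal chain contains every link of the other's core chain. The disease and state names come from the diabetes example, but the specific links are illustrative assumptions, not the ontology's actual definitions.

```python
# Illustrative causal chains as sets of (cause, effect) links.
# The links are assumptions for this sketch, not the ontology's actual content.
CORE_CHAINS = {
    "Diabetes": {
        ("Deficiency of insulin", "Elevated level of glucose in the blood"),
    },
    "Type I diabetes": {
        ("Destruction of pancreatic beta cells", "Lack of insulin in the blood"),
        ("Lack of insulin in the blood", "Deficiency of insulin"),
        ("Deficiency of insulin", "Elevated level of glucose in the blood"),
    },
    "Steroid diabetes": {
        ("Long-term steroid treatment", "Deficiency of insulin"),
        ("Deficiency of insulin", "Elevated level of glucose in the blood"),
    },
}


def is_a(sub, sup, chains=CORE_CHAINS):
    """True if every link of `sup`'s chain also appears in `sub`'s chain."""
    return sub != sup and chains[sup] <= chains[sub]


def super_diseases(disease, chains=CORE_CHAINS):
    """All diseases whose chain is included in the chain of `disease`."""
    return sorted(d for d in chains if is_a(disease, d, chains))


print(super_diseases("Type I diabetes"))
print(super_diseases("Steroid diabetes"))
```

Under this reading, the is-a hierarchy does not have to be asserted by hand: it falls out of comparing the chains.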
Is-a hierarchy of Abnormality Ontology
Layer 1: Generic abnormal states (object-independent)
Structural abnormality; Size abnormality; Formational abnormality; Conformational abnormality; Material abnormality; Topological abnormality; Small in size; Large in size; Small in line; Small in area; Small in volume; ...
Layer 2: Object-dependent abnormal states
Tube-dependent: Narrowing tube; Narrowing of valve; ...
Blood vessel dependent: Vascular stenosis; Arterial stenosis; Coronary stenosis; ...
Gastrointestinal tract stenosis; Intestinal stenosis; Esophageal stenosis; ...
Layer 3: Specific context-dependent (disease-dependent) abnormal states
Coronary stenosis in Angina pectoris
Coronary stenosis in Arteriosclerosis
Intestinal stenosis in Ileus
Esophageal stenosis in Esophagitis
Medical department | No. of abnormal states | No. of diseases
Allergy and Rheumatoid | 1,195 | 87
Cardiovascular Medicine | 3,052 | 546
Diabetes and Metabolic Diseases | 1,989 | 445
Orthopedic Surgery | 1,883 | 198
Nephrology and Endocrinology | 1,706 | 198
Neurology | 2,960 | 396
Digestive Medicine | 1,125 | 233
Respiratory Medicine | 1,739 | 788
Ophthalmology | 1,306 | 561
Hematology and Oncology | 354 | 415
Dermatology | 908 | 1,086
Pediatrics | 2,334 | 879
Otorhinolaryngology | 1,118 | 470
Total | 21,669 | 6,302
Tools: Disease chains graphical tool; Hozo ontology editor
Clinicians from 13 medical departments describe causal chains of diseases:
• 6,302 diseases
• 21,669 abnormal states
Each clinician defines diseases in terms of causal chains at his/her division.
[Figure: causal relationships among abnormal states for Myocardial Infarction (disease)]
• Using the three-layer model of the abnormality ontology
• Combining causal chains that include the same or related abnormal states by consulting the is-a hierarchy
⇒ Generic causal chains can be generated.
[Figure: causal relationships among abnormal states for Myocardial Infarction (disease), arranged across Layer 1, Layer 2, and Layer 3]
Each clinician describes the definition of a disease (causal chains of the disease) at a particular department.
From 13 medical divisions: all 21,000 abnormal states can be visualized with possible causal relationships.
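The combination step described on these slides, lifting disease-specific abnormal states to a shared upper layer via the is-a hierarchy and then joining chains, might look like the sketch below. Everything in it is an assumption for illustration: the division names, the two causal links, and the states "Hardening of artery wall" and "Myocardial ischemia" are hypothetical, and only the Layer 3 to Layer 2 pairs reuse names from the abnormality hierarchy slide.

```python
from collections import defaultdict

# Hypothetical is-a links from Layer 3 (disease-dependent) to Layer 2 states.
IS_A = {
    "Coronary stenosis in Angina pectoris": "Coronary stenosis",
    "Coronary stenosis in Arteriosclerosis": "Coronary stenosis",
}

# Hypothetical causal links (cause, effect) described separately by two divisions.
DIVISION_CHAINS = {
    "division_1": [("Hardening of artery wall", "Coronary stenosis in Arteriosclerosis")],
    "division_2": [("Coronary stenosis in Angina pectoris", "Myocardial ischemia")],
}


def generalize(state):
    """Replace a disease-specific state with its more generic parent, if any."""
    return IS_A.get(state, state)


def merged_links(chains, lift=True):
    """Union of all divisions' links, optionally lifted to the generic layer."""
    links = set()
    for division_links in chains.values():
        for cause, effect in division_links:
            if lift:
                cause, effect = generalize(cause), generalize(effect)
            links.add((cause, effect))
    return links


def downstream(state, links):
    """All states reachable from `state` by following causal links."""
    graph = defaultdict(set)
    for cause, effect in links:
        graph[cause].add(effect)
    seen, stack = set(), [state]
    while stack:
        for nxt in graph[stack.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


print(downstream("Hardening of artery wall", merged_links(DIVISION_CHAINS, lift=False)))
print(downstream("Hardening of artery wall", merged_links(DIVISION_CHAINS, lift=True)))
```

Without lifting, the two divisions' chains never meet because they name different disease-specific states; after lifting both to "Coronary stenosis", a single generic chain spans both divisions.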
Knowledge provided by the Disease Ontology
Definition of disease
It can answer the following questions:
What abnormal state could be a cause of which diseases?
What conditions may occur in a patient with the disease?
That is, it can provide base knowledge for analyzing big data related to diseases.
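The two questions above map naturally onto lookups over causal chains. The following is a minimal sketch under stated assumptions: the chains are illustrative links built from the state names in the diabetes example, and the function names are hypothetical, not part of the disease ontology or of Hozo.

```python
# Illustrative causal links (cause, effect) per disease; assumptions for this sketch.
DISEASE_CHAINS = {
    "Type I diabetes": [
        ("Destruction of pancreatic beta cells", "Deficiency of insulin"),
        ("Deficiency of insulin", "Elevated level of glucose in the blood"),
    ],
    "Steroid diabetes": [
        ("Long-term steroid treatment", "Deficiency of insulin"),
        ("Deficiency of insulin", "Elevated level of glucose in the blood"),
    ],
    "Diabetes-related blindness": [
        ("Elevated level of glucose in the blood", "Loss of sight"),
    ],
}


def diseases_caused_by(state):
    """Diseases whose chain contains `state` as the cause of some link."""
    return sorted(
        disease
        for disease, links in DISEASE_CHAINS.items()
        if any(cause == state for cause, _ in links)
    )


def possible_conditions(disease):
    """Abnormal states appearing in the disease's chain, in first-seen order."""
    seen = []
    for cause, effect in DISEASE_CHAINS[disease]:
        for state in (cause, effect):
            if state not in seen:
                seen.append(state)
    return seen


print(diseases_caused_by("Deficiency of insulin"))
print(possible_conditions("Diabetes-related blindness"))
```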
DEMO:
Visualization of the abnormal state ontology with possible causal relationships
Java client application developed with the HOZO API.
Disease Chain LOD
Linked Open Data converted from the disease ontology.
SPARQL endpoint (web API for queries) and visualization tool for disease chains in HTML5.
http://lodc.med-ontology.jp/
SPARQL Endpoint
(a) The user can make simple SPARQL queries by selecting a property and an object from lists.
(b) When the user selects a resource shown as a query result, triples connected to the resource are visualized.
(c) The user can also browse connected triples by clicking rectangles that represent the objects.
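Step (a), building a simple query from a selected property and object, can be sketched as follows. This is an illustration only: the endpoint address and the IRIs are hypothetical placeholders under example.org, not the vocabulary or URL of the Disease Chain LOD service. The `query` request parameter is the one defined by the SPARQL Protocol.

```python
from urllib.parse import urlencode

# Hypothetical endpoint and IRIs, used only to show the shape of a request.
ENDPOINT = "http://example.org/sparql"
SELECTED_PROPERTY = "http://example.org/vocab/hasCause"
SELECTED_OBJECT = "http://example.org/state/coronary_stenosis"

# Characters that must not appear inside an IRI written between angle brackets.
FORBIDDEN_IN_IRI = set('<>"{}|^`\\ ')


def build_select_query(property_iri, object_iri, limit=50):
    """Build a SELECT query for all subjects linked to an object by a property."""
    for iri in (property_iri, object_iri):
        if FORBIDDEN_IN_IRI & set(iri):
            raise ValueError(f"not a valid IRI: {iri!r}")
    return (
        "SELECT ?subject WHERE { "
        f"?subject <{property_iri}> <{object_iri}> . "
        f"}} LIMIT {int(limit)}"
    )


query = build_select_query(SELECTED_PROPERTY, SELECTED_OBJECT)
request_url = ENDPOINT + "?" + urlencode({"query": query})

print(query)
print(request_url)
```

The sketch stops at constructing the request URL; sending it and rendering the returned triples, as in steps (b) and (c), is left out.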
[Figure: knowledge (abnormal state is-a hierarchy, abnormal state layers, anomaly representation, generic chains, disease chains) connected to data (clinical DB) through attribute ⇔ property interoperability]
Summary (2): Disease Ontology
Disease Ontology
Provides domain knowledge described by medical experts.
Medical DB (Big Data)
Provides evidential data from medical information systems such as electronic medical records.
It could be a good example of combining ontology and Big Data.
[Figure labels: Existing Knowledge; Evidence / New Knowledge]
Concluding Remarks
Ontology Engineering for Big Data
Combining them is good!
Basic technology: how to combine ontology with big data
Mapping ontology to database
Add metadata to data using vocabulary defined in the ontology
Convert database (e.g. RDB) to ontology-based (RDF) database
How to use combinations of ontology and Big Data: two possible approaches
Use ontology to bridge datasets across domains
Ontology exploration method to obtain meaningful combinations (= viewpoints)
Use ontology to combine deep domain knowledge and raw data
Future plan
Generalizing our approaches and feeding them back as new functions of Hozo
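The "convert database (e.g. RDB) to ontology-based (RDF) database" item can be illustrated by turning one relational row into N-Triples lines. This is a minimal sketch, assuming a hypothetical base IRI, table name, and column-to-property mapping; it is not the conversion used in the project, and it emits every value as a plain string literal.

```python
# Hypothetical base IRI and column-to-property mapping (illustrative only).
BASE = "http://example.org/"
COLUMN_TO_PROPERTY = {
    "name": BASE + "vocab/name",
    "department": BASE + "vocab/department",
}


def escape_literal(value):
    """Escape a value for use inside a double-quoted N-Triples literal."""
    return (
        str(value)
        .replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
    )


def row_to_ntriples(table, key_column, row, mapping=COLUMN_TO_PROPERTY):
    """Convert one relational row (a dict) into a list of N-Triples lines."""
    subject = f"<{BASE}{table}/{row[key_column]}>"
    lines = []
    for column, property_iri in mapping.items():
        if row.get(column) is not None:
            literal = escape_literal(row[column])
            lines.append(f'{subject} <{property_iri}> "{literal}" .')
    return lines


row = {"id": 1, "name": "Myocardial Infarction", "department": "Cardiovascular Medicine"}
for line in row_to_ntriples("disease", "id", row):
    print(line)
```

Using property IRIs defined in an ontology for the mapping is what lets rows from different databases be queried with the same vocabulary.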
Acknowledgements
A part of this work was supported by JSPS KAKENHI Grant Numbers 24120002 and 22240011.
A part of the research on medical ontology is supported by the Ministry of Health, Labour and Welfare, Japan, through its “Research and development of medical knowledge base databases for medical information systems” and by the Japan Society for the Promotion of Science (JSPS) through its “Funding Program for World-Leading Innovative R&D on Science and Technology (FIRST Program)”.
I am also grateful to all collaborators in each study.
Thank you for your attention!
Hozo Support Site: http://www.hozo.jp/
Contact: [email protected]