Knowledge Patterns: Design and Extraction Aldo Gangemi 1,2 , joint work with Andrea Giovanni Nuzzolese 2 , Valentina Presutti 2 , Diego Reforgiato Recupero 2,3 1 LIPN, Paris Nord University, CNRS UMR7030, France 2 Semantic Technology Lab, ISTC-CNR, Rome, Italy 3 Department of Informatics, University of Cagliari, Italy [email protected], {andrea.nuzzolese,valentina.presutti,diego.reforgiato}@istc.cnr.it

Knowledge Patterns SSSW2016



Invariances

• “The important things in the world appear as invariants […] of […] transformations” (P. Dirac, The principles of quantum mechanics, 1947)

• “A property or relationship is objective when it is invariant under the appropriate transformations” (R. Nozick, Invariances, 2001)

• Multiple presentations (≈ under mapping)

• Multiple stages (≈ under change)

• Multiple contexts / perceivers / interpreters (≈ under different reference frameworks)

Patterns in general

• “Invariances across observed data or objects”

• They exist in natural, social, cognitive, or abstract worlds

• Mathematical pattern science is about symbols, i.e. non-interpreted information objects

• Objects of knowledge engineering are interpreted (cognitively, and, by derivation, formally)

• Mutual support/dependencies

• Gibson (1966), Shepard (1987,1992): invariances in stimulus-energy pair permanent (“projectable”) properties in the environment (affordances)

• E.g. colors, shapes, features of entities can constitute value-added references for behaviour

Knowledge as memory of (value-laden) observable (ir)regularities?

[Figure: the Cure frame with roles Healer, Medication, Patient]

1985

At the origins of modern ontologies: Pat Hayes’ naïve physics manifesto

A Translation Approach to Portable Ontology Specifications. T. R. Gruber, Knowledge Acquisition, 5(2):199-220, 1993.

15459 citations!!!

CLib Attach component

(Attach has
  (superclasses (Action))
  (required-slot (object base))
  (primary-slot (agent)))

(every Attach has
  (object ((exactly 1 Tangible-Entity) (a Tangible-Entity)))
  (base ((exactly 1 Tangible-Entity) (a Tangible-Entity))))

(every Attach has
  (preparatory-event ((:default
    (a Make-Contact with
      (object ((the object of Self)))
      (base ((the base of Self))))
    (a Detach with
      (object ((the object of Self)))
      (base ((the base of Self))))))))

RCC-8 Spatial Ontology

RCC: a calculus for region based qualitative spatial reasoning AG Cohn, B Bennett, JM Gooday, N Gotts - GeoInformatica, 1997

A library of generic concepts for composing knowledge bases K Barker, B Porter, P Clark, 2001

DOLCE (S5) foundational ontology patterns

Sowa’s Peirce-inspired top-level ontology

Sweetening ontologies with DOLCE A Gangemi, N Guarino, C Masolo, A Oltramari, …, 2002 - Springer

Knowledge Representation: Logical, Philosophical, and Computational Foundations. J. Sowa, Brooks/Cole, 2000

Evidence of knowledge patterns

• In linguistic resources

– Sentence forms

– Sub-categorization frames

– Lexico-syntactic patterns

– Conceptual frames

– Question patterns

– (Bounded sets of) selectional preferences

• In data

– Data patterns

– Data models (xsd, rdb)

– Query types and views

– Microformats

– Infoboxes

• In interaction

– Interaction patterns

– Lenses

– HTML templates

• In semantic resources

– Competency questions

– n-ary relations

– OWL/RDFS classes with (locally complete?) sets of restrictions or properties

– KM Component Library

– Content ontology design patterns (CPs)

– Knowledge patterns discovered from datasets

ConceptNet MIT OpenMind common sense project

AtLocation(dog, kennel)
CapableOf(dog, bark)
CapableOf(dog, guard house)
CapableOf(dog, pet)
CapableOf(dog, run)
Desires(dog, bone)
Desires(dog, chew bone)
Desires(dog, pet)
Desires(dog, play)
HasA(dog, flea)
HasA(dog, four leg)
HasA(dog, fur)
HasProperty(dog, loyal)
IsA(dog, canine)
IsA(dog, domesticate animal)
IsA(dog, loyal friend)
IsA(dog, mammal)
IsA(dog, man best friend)
IsA(dog, pet)
UsedFor(dog, companionship)

ConceptNet—a practical commonsense reasoning tool-kit.

Liu, Hugo, and Push Singh, BT technology journal, 2004

http://framenet.icsi.berkeley.edu/

Cure

Healer

Medication

Patient

FrameNet Cure frame

The Berkeley Framenet project. Baker, Fillmore, Lowe, Association for Computational Linguistics, 1998.

VerbNet Motion verb class

VerbNet: A broad-coverage, comprehensive verb lexicon. Schuler, KK, 2005

Knowledge patterns as expertise units

• Evidence that units of expertise are larger than what we have from average linked data triples, or ontology learning

• Cf. cognitive scientist Dedre Gentner: “uniform relational representation is a hallmark of expertise”

• We need to create expertise-oriented boundaries unifying multiple triples

– “Competency questions” are used to link ontology design patterns to requirements:

•Which objects take part in a certain event?

•Which tasks should be executed in order to achieve a certain goal?

• What’s the function of that artifact?

•What norms are applicable to a certain case?

•What inflammation is active in what body part with what morphology?

– Sometimes exception conditions should be added

– Task-based ontology evaluation can be performed with unit tests against ontologies trying to satisfy competency questions
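The idea of unit-testing an ontology against competency questions can be sketched in a few lines. This is a minimal illustration, assuming a toy set of triples and a hypothetical `hasParticipant` property; all identifiers are invented for the example and do not come from any published pattern.

```python
# Toy ABox as (subject, predicate, object) triples; all names are illustrative.
TRIPLES = {
    ("surgery_1", "rdf:type", "Event"),
    ("surgery_1", "hasParticipant", "scalpel_1"),
    ("surgery_1", "hasParticipant", "patient_1"),
    ("scalpel_1", "rdf:type", "Object"),
    ("patient_1", "rdf:type", "Object"),
}


def objects_in_event(triples, event):
    """Competency question: which objects take part in a certain event?"""
    return {
        o
        for s, p, o in triples
        if s == event
        and p == "hasParticipant"
        and (o, "rdf:type", "Object") in triples
    }


def test_participation_cq():
    # The modelling "passes" when the query returns the expected answer.
    expected = {"scalpel_1", "patient_1"}
    assert objects_in_event(TRIPLES, "surgery_1") == expected


test_participation_cq()
```

In a real setting the query would typically be SPARQL run against the ontology plus test data; the set comprehension here only stands in for that query.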

Relational language and the development of relational mapping. Loewenstein, J, Gentner, D. Cognitive psychology, 2005

Ontology Design Patterns

An ontology design pattern is a reusable successful solution to a recurrent modeling problem

Visit www.ontologydesignpatterns.org

Maximal ontology design requirement: What are we talking about, and why?

Generic Competency Questions | Specific Modelling Use Case
Who does what, when and where? | Production reports, schedules
Which objects take part in a certain event? | Resource allocation, biochemical pathways
What are the parts of something? | Component schemas, warehouse management
What’s an object made of? | Drug and food composition, e.g. for safety (comp.)
What’s the place of something? | Geographic systems, resource allocation
What’s the time frame of something? | Dynamic knowledge bases
What technique, method, practice is being used? | Instructions, enterprise know-how database
Which tasks should be executed in order to achieve a certain goal? | Planning, workflow management
Does this behaviour conform to a certain rule? | Control systems, legal reasoning services
What’s the function of that artifact? | System description
How is that object built? | Control systems, quality check
What’s the design of that artifact? | Project assistants, catalogues
How did that phenomenon happen? | Diagnostic systems, physical models
What’s your role in that transaction? | Activity diagrams, planning, organizational models
What is that information about? How is it realized? | Information and content modelling, computational models, subject directories
What argumentation model are you adopting for negotiating an agreement? | Cooperation systems
What’s the degree of confidence that you give to this axiom? | Ontology engineering tools

Layered pattern morphisms

An ontology design pattern describes a formal expression that can be exemplified, morphed, instantiated, and expressed in order to solve a domain modelling problem

• owl:Class:_:x rdfs:subClassOf owl:Restriction:_:y

• Inflammation rdfs:subClassOf (localizedIn some BodyPart)

• Colitis rdfs:subClassOf (localizedIn some Colon)

• John’s_colitis isLocalizedIn John’s_colon

• “John’s colon is inflamed”, “John has got colitis”, “Colitis is the inflammation of colon”

[Figure: layers LogicalPattern (MBox), GenericContentPattern (TBox), SpecificContentPattern (TBox), DataPattern (ABox), LinguisticPattern; links exemplifiedAs, morphedAs, instantiatedAs, expressedAs; axis labels Logic, Meaning, Reference, Expression, Abstraction]

Aldo Gangemi, Valentina Presutti: Ontology Design Patterns. Handbook on Ontologies 2nd ed. (2009)

Problem example: Temporal n-ary patterns

• Temporal indexing pattern

– (R(a,b))+t sentence indexing

• quads, external time stamps

– R(a,b)+t relation indexing

• reified n-ary relations (3D frames)

– R(a+t,b+t) individual indexing

• fluents, 4D, tropes, “context slices” (4D frames)

– tR name nesting

• ad hoc naming of binary relations

• More indexes for additional arguments
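The four temporal indexing options listed above can be compared by encoding the same fact, R(a,b) holding at time t, in each style. The sketch below is only illustrative: the helper property names (`arg1`, `arg2`, `atTime`, `sliceOf`) and the `@` notation for time slices are assumptions made for the example, not vocabulary from the cited papers.

```python
def sentence_indexing(r, a, b, t):
    # (R(a,b))+t: the whole statement is stamped, e.g. as the fourth element of a quad
    return [(a, r, b, t)]


def relation_indexing(r, a, b, t, node="rel_1"):
    # R(a,b)+t: the relation is reified as an n-ary node carrying the time
    return [
        (node, "rdf:type", r),
        (node, "arg1", a),
        (node, "arg2", b),
        (node, "atTime", t),
    ]


def individual_indexing(r, a, b, t):
    # R(a+t,b+t): the relation holds between time slices of the individuals
    a_t, b_t = f"{a}@{t}", f"{b}@{t}"
    return [
        (a_t, "sliceOf", a),
        (b_t, "sliceOf", b),
        (a_t, "atTime", t),
        (b_t, "atTime", t),
        (a_t, r, b_t),
    ]


def name_nesting(r, a, b, t):
    # tR: the time is packed into an ad hoc property name
    return [(a, f"{r}_{t}", b)]


for encode in (sentence_indexing, relation_indexing, individual_indexing, name_nesting):
    print(encode.__name__, encode("worksFor", "john", "acme", "2016"))
```

Counting the statements each function returns gives a rough feel for the trade-off between compactness and how much can later be said about the relation itself.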

A Multi-dimensional Comparison of Ontology Design Patterns for Representing n-ary Relations. A Gangemi, V Presutti. SOFSEM 2013: 86-105

An Empirical Perspective on Representing Time. A Scheuermann, E Motta, P Mulholland, A Gangemi and V Presutti. K-CAP 2013

Formal Unifying Standards for the Representation of Spatiotemporal Knowledge. P. Hayes, Advanced Decision Architectures Alliance, 2004

Procedural patterns

• Precise

– Classification

– Subsumption

– Inheritance

– Materialization

– Rule firing

– Constructive query

• Approximate

– Fuzzy classification

– Information extraction (NER, RE)

– Similarity induction (e.g. alignment)

– Taxonomy induction

– Relevance detection

– Latent semantic indexing

• Thesaurus to SKOS

• Relational DB to RDF

• WordNet RDB to OWL

• XML to RDF

• FrameNet XML to RDF

• Microformat to RDF

• NER entities to ABox

• NLP to RDF

Reasoning patterns

Alignment patterns

Reengineering patterns

Anti-patterns (1/2)

• Partonomies or subject classifications as subsumption hierarchies

• *City subClassOf Country

• City subClassOf (partOf some Country)

• *City subClassOf Geography

• City broader Geography (e.g. in SKOS)

• Linguistic disjunction as class disjointness

• Dead or alive

• *Dead or Alive

• Dead disjointWith Alive

• Linguistic conjunction as class disjunction

• Pen and paper

• *Pen and Paper

• Pen or Paper | Collection subClassOf (hasMember some Paper ; some Pen)
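Why the partonomy-as-subsumption anti-pattern is harmful shows up as soon as types are inferred along subClassOf. The following is a minimal sketch, assuming a dictionary-based class hierarchy; `PartOfSomeCountry` is a hypothetical named stand-in for the anonymous restriction (partOf some Country).

```python
def superclasses(cls, subclass_of):
    """All classes reachable from cls via subClassOf (transitive closure)."""
    seen, stack = set(), [cls]
    while stack:
        for parent in subclass_of.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def inferred_types(individual, types, subclass_of):
    """Asserted types plus everything inherited through subsumption."""
    asserted = set(types[individual])
    inherited = {sup for c in asserted for sup in superclasses(c, subclass_of)}
    return asserted | inherited


TYPES = {"Rome": ["City"]}
ANTI = {"City": ["Country"]}              # *City subClassOf Country
FIXED = {"City": ["PartOfSomeCountry"]}   # City subClassOf (partOf some Country)

print(inferred_types("Rome", TYPES, ANTI))   # Rome is wrongly typed as a Country
print(inferred_types("Rome", TYPES, FIXED))  # Rome is only said to be part of some country
```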

A catalogue of OWL ontology antipatterns. Roussey, Corcho, Vilches-Blázquez, ACM, 2009.

A user oriented owl development environment designed to implement common patterns and minimise common errors. Horridge, Rector, Drummond, Springer, 2004.

Anti-patterns (2/2)

• Causality as entailment

• Kaupthing bank behavior caused Iceland crisis

• *KaupthingBankBehavior subClassOf IcelandCrisis

• KaupthingBankBehavior isCauseOf IcelandCrisis

• Expressions as instances of the class representing their meaning

• *dog(word) rdf:type Dog

• dog(word) expresses Dog (with punning)

• Multiple domains or ranges of properties as intersection

• *hasInflammation rdfs:domain Epithelium ; Endothelium

• hasInflammation rdfs:domain (Epithelium or Endothelium)
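The last anti-pattern follows from the RDFS entailment rule for rdfs:domain: every domain statement applies independently, so two of them behave as an intersection. A minimal sketch of that rule over toy data (the individuals `tissue_1` and `inflammation_1` are invented for the example):

```python
def infer_domain_types(triples, domains):
    """RDFS domain rule: if p rdfs:domain C and (s p o) holds, then s rdf:type C."""
    inferred = set()
    for s, p, _ in triples:
        for cls in domains.get(p, ()):
            inferred.add((s, "rdf:type", cls))
    return inferred


# *hasInflammation rdfs:domain Epithelium ; Endothelium
DOMAINS = {"hasInflammation": ["Epithelium", "Endothelium"]}
DATA = [("tissue_1", "hasInflammation", "inflammation_1")]

for triple in sorted(infer_domain_types(DATA, DOMAINS)):
    print(triple)
```

Both types are inferred for `tissue_1`, which is the unintended reading; a single domain stated as a class union avoids it.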

Putting the pieces together

“Mine & Design”

• pattern induction from data

• cleaning data by using (foundational) patterns

• pattern-based knowledge extraction from text

• axiom induction from knowledge graphs …

Pattern induction from data: centrality discovery in datasets

[Figure: graph with node types mo:Track, mo:MusicArtist, mo:Playlist, mo:Torrent, mo:ED2K, tags:Tag, mo:Record, rdfs:Literal and properties foaf:maker, dc:title, dc:date, mo:image, dc:description, mo:track, tags:taggedWithTag, mo:available_as]

Extracting Core Knowledge from Linked Data. Presutti, Aroyo et al., COLD2011.

Serving DBpedia with DOLCE–More than Just Adding a Cherry on Top. Paulheim, Gangemi, ISWC, 2015.
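One simple way to look for central types in a dataset is to lift instance triples to type-level paths and rank types by how many paths touch them. The sketch below is an assumption-laden illustration, not the measure used in the cited work: it uses plain degree counting, treats untyped objects as literals, and runs on invented sample data.

```python
from collections import Counter


def type_paths(triples, types):
    """Lift instance triples to (subject type, property, object type) paths."""
    paths = Counter()
    for s, p, o in triples:
        for s_type in types.get(s, ()):
            # Simplification: an object without a known type is counted as a literal.
            for o_type in types.get(o, ["rdfs:Literal"]):
                paths[(s_type, p, o_type)] += 1
    return paths


def central_types(paths):
    """Rank types by the number of path occurrences they take part in."""
    degree = Counter()
    for (s_type, _, o_type), n in paths.items():
        degree[s_type] += n
        if o_type != "rdfs:Literal":
            degree[o_type] += n
    return degree.most_common()


TYPES = {
    "t1": ["mo:Track"],
    "t2": ["mo:Track"],
    "a1": ["mo:MusicArtist"],
    "r1": ["mo:Record"],
}
TRIPLES = [
    ("t1", "foaf:maker", "a1"),
    ("t2", "foaf:maker", "a1"),
    ("r1", "mo:track", "t1"),
    ("t1", "dc:title", "some title"),
]

print(central_types(type_paths(TRIPLES, TYPES)))
```

On this toy data mo:Track comes out first; a pattern around the top-ranked type would then be assembled from the paths that include it.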

Information Extraction and the SW

• Historically SW mainly worked on ontology learning

– unconvincing results: sparseness, core knowledge difficult to catch, etc. (cf. analyses in Coppola et al. 2009, Blomqvist, 2009)

– natural language understanding known to be an AI-complete problem

• The paradigm of Open Information Extraction (Etzioni, 2006) fits the lightweight and/or data-driven trend of current SW

• Semantic technologies need hybridization


Google Knowledge Graph

IBM Watson

QA, NL querying on LD,full text search jointly with queries, ...

Apple Siri, Google Now, SRI startup Desti

Facebook Social Graph Microsoft Cortana

OIE, NELL, BabelNet, ...

Stochasticity does it well?

• Purely stochastic approaches to NLU attempt to learn models that solve one specific problem, but how to compose the different models? How to hybridise those models with logic/knowledge-based approaches?

• Cf. NCM chatbot, MetaMind neural QA

Google’s “Neural Conversational Model”, one year ago on arXiv

mixed magic and massive stupidity in this model deeply learnt from open movie scripts

similar vs. opposite semantics, but algorithm gives same semantic similarity

We need intelligent hybridisation!

E.g. how to do deep semantic parsing?

• The Black Hand might not have decided to barbarously assassinate Franz Ferdinand after he arrived in Sarajevo on June 28th, 1914

[Figure: the sentence annotated with event, negation, modality, participants, more participants, quality, coreference, event relation]

deep semantic parsing: not just annotation, but formal knowledge extraction

Open Information Extraction

pc5: NLPapps mac$ java -Xmx512m -jar reverb-latest.jar <<<"The Black Hand might not have decided to barbarously assassinate Franz Ferdinand after he arrived in Sarajevo on June 28th, 1914."

Initializing ReVerb extractor...Done.

Initializing confidence function...Done.

Initializing NLP tools...Done.

Starting extraction.

stdin 1 he arrived in Sarajevo 13 14 14 16 1610.2200632195721161 The Black Hand might not have decided to barbarously assassinate Franz Ferdinand after he arrived in Sarajevo on June 28th , 1914 . DT NNP NNP MD RB VB VBN TO RB VB NNP NNP IN PRP VBD IN NNP IN NNP JJ , CD . B-NP I-NP I-NP B-VP I-VP I-VP I-VP I-VP I-VP I-VP B-NP I-NP B-SBAR B-NP B-VP B-PP B-NP B-PP B-NP I-NP I-NP I-NP O he arrive in sarajevo

Done with extraction.

Summary: 1 extractions, 1 sentences, 0 files, 1 seconds

http://ai.cs.washington.edu/projects/open-information-extraction

Open Knowledge Extraction

• Open Knowledge Extraction (OKE) is a hybrid approach to knowledge graph production that exploits some of the assumptions of Open Information Extraction (open-domain, unsupervised), together with formal semantic reengineering of NLP output, Semantic Web and Linked Data patterns, entity linking, word-sense disambiguation linked to Linguistic Linked Data

• The result of OKE is a two-layered OWL-RDF knowledge graph that (1) lifts the content of a text into entities grounded into public web identities, with formal axioms, and (2) deeply annotates the text

• OKE can be used as a semantic middleware between content and knowledge management: querying, annotating, classifying, detecting, …

LOD and ODP design

Aligned to WordNet, VerbNet, FrameNet, DOLCE+DnS,

DBpedia, schema.org, BabelNet

RESTful or motif-based Python query interface

Earmark

RDF, OWL

Apache Stanbol

Neo-Davidsonian, DRT- and Frame-based

High EE and RE accuracy

FRED integrates NER, SenseTagging, WSD, Taxonomy Induction, Relation/Event/Role Extraction

NIF

ALCO(D) DL language

http://wit.istc.cnr.it/stlab-tools/fred

OKE w. FRED

“The Black Hand might not have decided to barbarously assassinate Franz Ferdinand after he arrived in Sarajevo on June 28th, 1914”

[Figure: FRED graph for the sentence, annotated with type induction, negation, modality, taxonomy induction, semantic roles, entity linking, events, qualities, tense representation, second order relations, role propagation, predicate-argument structures, coreference resolution]

+ configurable namespaces, Earmark text spans with semiotic relations to graph entities (denotes, hasInterpretant), NIF annotations and text segmentation

Sample translation table

Parsing pattern | Logical pattern | Example
Named entity | <owl:NamedIndividual_i> | :BarackObama
Implicit discourse referent | <owl:NamedIndividual_i> rdf:type <owl:Class_j> | :doctor_1 rdf:type :Doctor
Entity resolution (NE) | <owl:NamedIndividual_i> owl:sameAs <owl:NamedIndividual_j> | :BarackObama owl:sameAs dbr:Barack_Obama
Entity coreference | <owl:NamedIndividual_i> owl:sameAs <owl:NamedIndividual_j> | :John owl:sameAs :doctor_1
Term | <owl:Class_i> || <owl:ObjectProperty_j> || <owl:DatatypeProperty_k> | :Doctor
Sense tag | <owl:NamedIndividual_i> rdf:type <owl:Class_j> | :BarackObama rdf:type dbo:Person
Sense disambiguation | <owl:Class_i> owl:equivalentClass <owl:Class_j> | :Doctor owl:equivalentClass n30:synset-doctor-noun-1
Compositional semantics | <owl:Class_i> rdfs:subClassOf <owl:Class_j> && <owl:NamedIndividual_i> dul:associatedWith <owl:NamedIndividual_k> | :BrassInstrument rdfs:subClassOf :Instrument . :brass_1 dul:associatedWith :instrument_1
Extracted (binary) relationship | <owl:NamedIndividual_i> <owl:ObjectProperty> || <owl:DatatypeProperty> <owl:NamedIndividual_j> | :Cabeza :survivorOf :expedition_1
Semantic role | <semrole_i> rdf:type (owl:ObjectProperty || owl:DatatypeProperty) | vn.role:Agent rdf:type owl:ObjectProperty
Event | <dul:Event_i> <semrole_j> <Entity_j> . <dul:Event_i> rdf:type <Event.type> | :visit_1 vn.role:Agent :doctor_1 . :visit_1 rdf:type :Visit
Frame | <Event.type_i> rdfs:subClassOf dul:Event || <ff:Frame_j> | :Visit rdfs:subClassOf vn.data:Visit_36030100 || ff:Visiting
Neo-Davidsonian situation | <boxing:Situation_i> boxing:involves <owl:NamedIndividual_j> | :situation_1 boxing:involves :John , :Dog ; boxing:hasTruthValue :False
Quantified expression | <owl:NamedIndividual_i> quant:hasQuantifier <quant:Quantifier_j> | :doctor_1 quant:hasQuantifier quant:some
Negation | <dul:Event_i> boxing:hasTruthValue :False | :visit_1 boxing:hasTruthValue :False
Modality | <dul:Event_i> boxing:hasModality <boxing:Modality_j> | :visit_1 boxing:hasModality :Possible
Regular adjectival semantics | <owl:NamedIndividual_i> dul:hasQuality <dul:Quality_j> | :John dul:hasQuality :Smart
Alternative adjectival semantics | <owl:Class_i> dul:hasQuality <dul:Quality> . <owl:Class_i> dul:associatedWith <owl:Class_j> | :AllegedDoctor dul:hasQuality :Alleged ; dul:associatedWith :Doctor
Disjunction of individuals | <owl:NamedIndividual_i> boxing:union <owl:NamedIndividual_j> |
Factual entailment | <Event_i> boxing:entails <Event_j> |
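A few rows of the translation table (Event, Semantic role, Negation, Modality) can be read as rewrite rules from parse features to triples. The sketch below is a hypothetical illustration of that reading, not FRED's implementation; the function name and its parameters are invented, and only the property and prefix names are taken from the table.

```python
def event_to_triples(event_id, event_type, roles, negated=False, modality=None):
    """Apply the Event, Semantic role, Negation and Modality rows to one event."""
    triples = [(event_id, "rdf:type", event_type)]
    for role, filler in roles.items():
        # Semantic roles become properties linking the event to its participants.
        triples.append((event_id, f"vn.role:{role}", filler))
    if negated:
        triples.append((event_id, "boxing:hasTruthValue", ":False"))
    if modality:
        triples.append((event_id, "boxing:hasModality", modality))
    return triples


# "The doctor might not visit": one event with an agent, negation and modality
for triple in event_to_triples(
    ":visit_1", ":Visit", {"Agent": ":doctor_1"}, negated=True, modality=":Possible"
):
    print(" ".join(triple), ".")
```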

FRED’s OKE pipeline

Semantic Web Machine Reading with FRED. Gangemi, Presutti, Reforgiato Recupero, Nuzzolese, et al. Semantic Web Journal, 2016, http://semantic-web-journal.org/system/files/swj1379.pdf

Semiotic driftin' with OKE: an example

John Coltrane played with Miles Davis in Kind of Blue

(ROOT (S (NP (NNP John) (NNP Coltrane)) (VP (VBD played) (PP (IN with) (NP (NNP Miles) (NNP Davis))) (PP (IN in) (NP (NP (NNP Kind)) (PP (IN of) (NP (NNP Blue))))))))

root(ROOT-0, played-3)
nn(Coltrane-2, John-1)
nsubj(played-3, Coltrane-2)
nn(Davis-6, Miles-5)
prep_with(played-3, Davis-6)
prep_in(played-3, Kind-8)
prep_of(Kind-8, Blue-10)

  ___________________________   _____________
 |x0 x1 x2 x3                | |e4           |
 |...........................| |.............|
(|named(x0,john_coltrane,per)|A|play(e4)     |)
 |named(x1,kind,loc)         | |Actor1(e4,x0)|
 |named(x2,blue,loc)         | |of(x1,x2)    |
 |named(x3,miles_davis,loc)  | |in(e4,x1)    |
 |___________________________| |Actor2(e4,x3)|
                               |_____________|

DRT

Dependencies

LLD (VerbNet)

Entity Linking

<http://dbpedia.org/resource/Kind_of_Blue> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://schema.org/MusicAlbum> .
<http://dbpedia.org/resource/Kind_of_Blue> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://schema.org/CreativeWork> .
<http://www.ontologydesignpatterns.org/ont/fred/domain.owl#Play> <http://www.w3.org/2002/07/owl#equivalentClass> <http://www.ontologydesignpatterns.org/ont/vn/data/Play_36030100> .
<http://www.ontologydesignpatterns.org/ont/fred/domain.owl#Play> <http://www.w3.org/2000/01/rdf-schema#subClassOf> <http://www.ontologydesignpatterns.org/ont/dul/DUL.owl#Event> .
<http://dbpedia.org/resource/John_Coltrane> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://schema.org/MusicGroup> .
<http://dbpedia.org/resource/John_Coltrane> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://schema.org/Person> .
<http://www.ontologydesignpatterns.org/ont/fred/domain.owl#play_1> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.ontologydesignpatterns.org/ont/fred/domain.owl#Play> .
<http://www.ontologydesignpatterns.org/ont/fred/domain.owl#play_1> <http://www.ontologydesignpatterns.org/ont/vn/abox/role/Actor1> <http://www.ontologydesignpatterns.org/ont/fred/domain.owl#John_coltrane> .
<http://www.ontologydesignpatterns.org/ont/fred/domain.owl#play_1> <http://www.ontologydesignpatterns.org/ont/vn/abox/role/Actor2> <http://www.ontologydesignpatterns.org/ont/fred/domain.owl#Miles_davis> .
<http://www.ontologydesignpatterns.org/ont/fred/domain.owl#play_1> <http://www.ontologydesignpatterns.org/ont/fred/domain.owl#locatedIn> <http://www.ontologydesignpatterns.org/ont/fred/domain.owl#Kind> .
<http://dbpedia.org/resource/Miles_Davis> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://schema.org/Person> .
<http://dbpedia.org/resource/Miles_Davis> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://schema.org/MusicGroup> .
<http://www.ontologydesignpatterns.org/ont/fred/domain.owl#Kind> <http://www.w3.org/2002/07/owl#sameAs> <http://dbpedia.org/resource/Kind_of_Blue> .
<http://www.ontologydesignpatterns.org/ont/fred/domain.owl#Kind> <http://www.ontologydesignpatterns.org/ont/fred/domain.owl#of> <http://www.ontologydesignpatterns.org/ont/fred/domain.owl#Blue> .
<http://www.ontologydesignpatterns.org/ont/fred/domain.owl#Miles_davis> <http://www.w3.org/2002/07/owl#sameAs> <http://dbpedia.org/resource/Miles_Davis> .
<http://www.ontologydesignpatterns.org/ont/fred/domain.owl#John_coltrane> <http://www.w3.org/2002/07/owl#sameAs> <http://dbpedia.org/resource/John_Coltrane> .

select ?p where {dbpedia:Kind_of_Blue ?p dbpedia:Miles_Davis}

p
http://dbpedia.org/ontology/artist
http://dbpedia.org/property/writer
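The step from the extracted graph to the query above can be sketched as follows: collect the owl:sameAs links that ground local identifiers in public URIs, then build a query asking which properties connect two grounded entities. The helper names are hypothetical and the code only builds the query string; sending it to an endpoint is left out.

```python
FRED = "http://www.ontologydesignpatterns.org/ont/fred/domain.owl#"
DBR = "http://dbpedia.org/resource/"
SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def linked_entities(triples):
    """Map local identifiers to the public URIs they are owl:sameAs."""
    return {s: o for s, p, o in triples if p == SAME_AS}


def relation_query(subject_uri, object_uri):
    """SPARQL query asking which properties link the two entities."""
    return f"SELECT ?p WHERE {{ <{subject_uri}> ?p <{object_uri}> }}"


# Two of the owl:sameAs triples from the extracted graph
GRAPH = [
    (FRED + "Kind", SAME_AS, DBR + "Kind_of_Blue"),
    (FRED + "Miles_davis", SAME_AS, DBR + "Miles_Davis"),
]

links = linked_entities(GRAPH)
print(relation_query(links[FRED + "Kind"], links[FRED + "Miles_davis"]))
```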

Semantic subgraph

Query

<http://dbpedia.org/resource/Kind_of_Blue> <http://www.w3.org/2000/01/rdf-schema#comment> "Kind of Blue is a studio album by American jazz musician Miles Davis, released on August 17, 1959, by Columbia Records. Recording sessions for the album took place at Columbia's 30th Street Studio in New York City on March 2 and April 22, 1959. The sessions featured Davis's ensemble sextet, with pianist Bill Evans, drummer Jimmy Cobb, bassist Paul Chambers, and saxophonists John Coltrane and Julian Cannonball Adderley."@en .

describe dbpedia:Kind_of_Blue

<http://dbpedia.org/resource/Kind_of_Blue> <http://dbpedia.org/ontology/producer> <http://dbpedia.org/resource/Irving_Townsend> .
<http://dbpedia.org/resource/Kind_of_Blue> <http://purl.org/dc/terms/subject> <http://dbpedia.org/resource/Category:Albums_certified_gold_by_the_British_Phonographic_Industry> .
<http://dbpedia.org/resource/Kind_of_Blue> <http://dbpedia.org/ontology/recordDate> "1958-05-25+02:00"^^<http://www.w3.org/2001/XMLSchema#date> .
<http://dbpedia.org/resource/Kind_of_Blue> <http://dbpedia.org/ontology/recordedIn> <http://dbpedia.org/resource/CBS_30th_Street_Studio> .
<http://dbpedia.org/resource/Kind_of_Blue> <http://dbpedia.org/ontology/recordedIn> <http://dbpedia.org/resource/New_York_City> .
<http://dbpedia.org/resource/Kind_of_Blue> <http://dbpedia.org/ontology/recordLabel> <http://dbpedia.org/resource/Legacy_Recordings> .
<http://dbpedia.org/resource/Kind_of_Blue> <http://dbpedia.org/ontology/genre> <http://dbpedia.org/resource/Modal_jazz> .
<http://dbpedia.org/resource/Kind_of_Blue> <http://dbpedia.org/property/title> <http://dbpedia.org/resource/Blue_in_Green> .
<http://dbpedia.org/resource/Kind_of_Blue> <http://dbpedia.org/property/producer> <http://dbpedia.org/resource/Irving_Townsend> .
<http://dbpedia.org/resource/Kind_of_Blue> <http://dbpedia.org/property/title> <http://dbpedia.org/resource/All_Blues> .
<http://dbpedia.org/resource/Kind_of_Blue> <http://dbpedia.org/property/title> <http://dbpedia.org/resource/Freddie_Freeloader> .

Entity Linking again

OKE again

“CBS 30th Street Studio was an American recording studio operated by Columbia Records, and located at 207 East 30th Street, between Second and Third Avenues in Manhattan, New York City.”

<http://dbpedia.org/resource/Columbia_Records> <http://www.ontologydesignpatterns.org/ont/fred/domain.owl#operateBetweenAvenueLocatedIn> <http://dbpedia.org/resource/New_York_City> .

etc. …

Legalo synthetic path finder

Complexity of FRED’s models and computational time

• typically, a 100-word sentence takes less than 1 second to be processed, also considering the lag due to the Web API, and the load of a diagram when using the Web application

• the expressivity of an OKE knowledge graph dataset produced by FRED calculated from a composite text corpus from four different textual types (approx. 19,000 axioms) is equivalent to an ALCO(D) (Attributive Language with Complements, Object value restrictions and Data properties) DL language

• ALCO(D) includes atomic negation, class intersection, universal and existential restrictions, and nominals (closed world classes)

• reasoning in ALCO(D) is PSpace-complete, and the language enjoys both the finite model and tree model properties (cf. http://www.cs.man.ac.uk/~ezolin/dl/ for a complexity navigator)

Challenges for OKE

Four classes of semantic problems

1. partial accuracy of specific NLP components leads to global errors

2. “in praesentia” semantics besides literal interpretation: various kinds of coercion, met* phenomena

3. “in absentia” semantics: implicatures, presuppositions, tacit knowledge, reference to the physical context

4. higher-order phenomena: emergent frames, social (and legal) norms, attitudes, argumentation, cultural frames and narratives, discourse marks, text types

1. Partial accuracy of specific NLP components leads to global errors

• N/V POS tagging (especially for English)

David Moyes shares Manchester United fans' frustration

• Complex multiword extraction

Myeloid hepatosplenomegaly is an enlargement of liver and kidney due to myelofibrosis.

• Coordinations are difficult

Uncaria est une liane des jungles tropicales de l'Amérique du Sud et Centrale. (French: “Uncaria is a liana of the tropical jungles of South and Central America.”)

Aristotle was a Greek philosopher, a student of Plato and teacher of Alexander the Great.

• Citations and titles need to be treated differently

Anna Karenina is also mentioned in R. L. Stine's Goosebumps series Don't Go To Sleep.

• Plural coreferences are hard

When Carol helps Bob and Bob helps Carol, they can accomplish any task.

• The path descended abruptly

• The road runs along the coast for two hours

• The fence zigzags from the plateau to the valley

• The highway crawls through the city

• The road leads us to Bordeaux

• Need for “type coercion” to satisfy hidden frame

• highway is actually a path that “can be crawled”, therefore the crawling frame here is descriptive of a state, not of an action

• fence is actually an object whose shape “can be followed by zigzagging”

• road is actually an object that “can be followed as an indication” to our destination

• sometimes an inversion of roles: the path descends because it can be descended

2. “in praesentia” semantics besides literal interpretation

E.g. fictive motion and coercion (Talmy, Welty, Retoré)

adjective semantics

• Carmelo is a Sicilian surgeon

• Carmelo is an arsonist

• ⊨ Carmelo is a Sicilian arsonist

• Carmelo is a skilful surgeon

• ⊭ Carmelo is a skilful arsonist

• Carmelo is the alleged surgeon

• ⊭ Carmelo is the alleged arsonist

• ?⊨ Carmelo is a surgeon

• Carmelo is a fake surgeon

• ⊭ Carmelo is a fake arsonist

• ⊭ Carmelo is a surgeon

• How many frames? (FrameNet, VerbNet, etc. have a small coverage), roles are often partly covered, mapping between frame resources and linguistic constructions is seriously incomplete

• Interaction between in praesentia (traditional machine reading) and in absentia (SW-machine reading) knowledge is complicated: how about relevance, novelty, situatedness, etc.?

• Mario, please pass me the glass over there

• Mario, I feel sick … how about the meal we had yesterday?

• Mario, we are in last year’s situation

3. “in absentia” semantics: implicatures, presuppositions, tacit knowledge, reference to the physical world

• I saw the Coliseum in my tourist guide and wanted to go there

• artifact vs. place

• Actually relevant?

• Power of ambiguity (“systematic polysemy”)

• Minimal effort seems to count in human evolution of lexical knowledge, but only if we can easily reconstruct the context (or frame, relation, ...)

• “The communicative function of ambiguity in language” (Piantadosi, Tily, Gibson, @PLoS): ambiguity allows for greater ease of processing by permitting efficient linguistic units to be re-used. “All efficient communication systems will be ambiguous, assuming that context is informative about meaning”

• Also in science: inflammation has several interrelated meanings

Dot objects, co-predication (Pustejovsky, Asher)

4. higher-order phenomena: emergent frames, social (and legal) norms, attitudes, argumentation, cultural frames and narratives, discourse marks, text types

• Weak results for automatic extraction of discourse marks and their related semantics

• Attitude/argumentation lenses over basic semantics

• Bhatkal's father: I'm glad he has been arrested

• I disagree with the comments of reviewer 1, but reviewer 2 should provide a stronger basis to his low rating

• Text types

• A cat is on the mat / A cat is a mammal

• Norms

• I should/feel obliged/want/obey/fear … it’s required/acceptable/convenient/proper/suggested …

• Complex frames/narratives

• We need tax relief vs. Taxes are investments

Example: Extracting opinion graphs

Filtering FRED’s graphs with opinions

People hope that the President will be condemned by the judges

Triggering event

Main topic

Subtopics

Holder

Sentilo opinion ontology

http://wit.istc.cnr.it/sentilo-release/sentilo

Result in triples

People hope that the President will be condemned by the judges

Aldo Gangemi, Valentina Presutti, Diego Reforgiato Recupero. Frame-based detection of opinion holders and topics: a model and a tool. IEEE Computational Intelligence Magazine, 9(1), 2014

Example: Extracting adjectival qualities

Approximating adjective semantics

• New adjective ontology derived by reasoning on top of an integrated resource including (Onto)WordNet and FrameNet-RDF

• Four ontology design patterns for the four main semantics identified

Adjective Semantics in Open Knowledge Extraction. A. Gangemi, A.G. Nuzzolese, V. Presutti and D. Reforgiato, Formal Ontology in Information Systems Conference (FOIS 2016), IOS Press, 2016.

Approximating adjective semantics: patterns

• Base: --> create taxonomy and intensional quality

:SkilfulSurgeon rdfs:subClassOf :Surgeon .
:surgeon_1 a :SkilfulSurgeon .
:SkilfulSurgeon dul:hasQuality :Skilful .

• Extensional (Base + individual Quality): --> create taxonomy and individual+intensional quality

:CanadianSurgeon rdfs:subClassOf :Surgeon .
:surgeon_1 a :CanadianSurgeon .
:surgeon_1 dul:hasQuality :Canadian .
:CanadianSurgeon dul:hasQuality :Canadian .

• Modal: --> create association and intensional modality

:surgeon_1 a :AllegedSurgeon .
:AllegedSurgeon dul:associatedWith :Surgeon .
:AllegedSurgeon boxing:hasModality :Alleged .

• Privative: --> create association and intensional quality

:surgeon_1 a :Fake_surgeon .
:Fake_surgeon dul:associatedWith :Surgeon .
:Fake_surgeon dul:hasQuality :Fake .
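The four patterns differ only in which triples are emitted for an adjective-noun pair, which a small generator makes explicit. This is a sketch under assumptions: the function and the `kind` labels are invented for the example, the adjective class is assumed to be already known (deciding it is the hard part), and class names are built uniformly as `:AdjNoun`, whereas the privative example above writes `:Fake_surgeon`.

```python
def adjective_triples(kind, adj, noun, individual):
    """Emit the triples of one adjective pattern for an adjective-noun pair."""
    cls = f":{adj}{noun}"
    triples = [(individual, "a", cls)]
    if kind in ("base", "extensional"):
        # Taxonomy plus intensional quality on the class
        triples.append((cls, "rdfs:subClassOf", f":{noun}"))
        triples.append((cls, "dul:hasQuality", f":{adj}"))
        if kind == "extensional":
            # The quality also holds of the individual itself
            triples.append((individual, "dul:hasQuality", f":{adj}"))
    elif kind == "modal":
        triples.append((cls, "dul:associatedWith", f":{noun}"))
        triples.append((cls, "boxing:hasModality", f":{adj}"))
    elif kind == "privative":
        triples.append((cls, "dul:associatedWith", f":{noun}"))
        triples.append((cls, "dul:hasQuality", f":{adj}"))
    else:
        raise ValueError(f"unknown adjective pattern: {kind}")
    return triples


for kind, adj in [("base", "Skilful"), ("extensional", "Canadian"),
                  ("modal", "Alleged"), ("privative", "Fake")]:
    print(kind, adjective_triples(kind, adj, "Surgeon", ":surgeon_1"))
```

Note that only the base and extensional patterns produce a subClassOf axiom, which is what blocks the inference from “alleged surgeon” or “fake surgeon” to “surgeon”.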

Approximating adjective semantics: example

The alleged doctor failed to transplant the fake organ into the nice patient that borrowed a Canadian car

Passing the baton

• We have seen knowledge engineering on the SW as a kind of pattern science

• Reusable patterns

• Procedural practices

• Discoverable patterns

• Pattern-based formal knowledge extraction

• How can logical and statistical techniques be formally hybridised, leveraging the legacy of Pat Hayes and David Mumford?

Other useful links

• FRED web application

• http://wit.istc.cnr.it/stlab-tools/fred/demo

• FRED API documentation

• http://wit.istc.cnr.it/stlab-tools/fred/api

• A FRED benchmark in N-Quads

• complete with annotations

• https://www.dropbox.com/s/q6b47dxmwyyseij/goldfrombenchmark.nq?dl=0

• only the semantic subgraph

• https://www.dropbox.com/s/p7w8nojb2g2yf8k/goldfrombenchmark_semtriples.nq?dl=0