Cyc-Gate workflow
Blaz Fortuna, Luka Bradesko
Cycorp Europe, Slovenia
Copyright 2007

LarKC Tutorial at ISWC 2009 - Second Hands-on Scenario


DESCRIPTION

The aim of the EU FP7 Large-Scale Integrating Project LarKC is to develop the Large Knowledge Collider (LarKC for short, pronounced “lark”), a platform for massive distributed incomplete reasoning that will remove the scalability barriers of existing reasoning systems for the Semantic Web. The LarKC platform is available at larkc.sourceforge.net. This is one of two hands-on sessions that introduce participants to working directly with LarKC code.



Goal

• Demonstrate reasoning over non-structured input data

• Learn how to correctly annotate a new plug-in

• Learn how to add a new plug-in to the platform


External tools used

• GATE
  – Information Extraction framework
  – Used here for extraction of named entities from articles

• ResearchCyc
  – Common-sense knowledge base (~300,000 concepts, 1.3M assertions)
  – Reasoning engine


Pipeline diagram

[Diagram: Query → Identify → Transform → Select → Reason → Result. External components shown: Internet, GATE, ResearchCyc.]


Example


Query

PREFIX cyc: <http://www.cycfoundation.org/concepts/>

SELECT ?company WHERE {
  ?company cyc:mentionedInArticle "http://shodan.ijs.si:8080/GateServer/news.txt" .
  ?company cyc:isa cyc:PubliclyHeldCorporation
}


Identify

• Find links to HTML documents and retrieve them using the ArticleIdentifier plug-in.
  – Returns a text document: http://shodan.ijs.si:8080/GateServer/news.txt
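As a rough illustration of the identification step, the sketch below pulls the quoted article URLs out of a SPARQL query string with a regular expression. It is not the ArticleIdentifier source, which the slides do not show: the class name ArticleUrlSketch and the method extractArticleUrls are hypothetical, and actually downloading the documents is left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the LarKC code base.
final class ArticleUrlSketch {
    // An http(s) URL inside a double-quoted literal; stray whitespace
    // just inside the quotes is tolerated.
    private static final Pattern QUOTED_URL =
            Pattern.compile("\"\\s*(https?://[^\"\\s]+)\\s*\"");

    /** Returns the quoted URLs in order of first appearance, without duplicates. */
    static List<String> extractArticleUrls(String sparql) {
        List<String> urls = new ArrayList<>();
        Matcher m = QUOTED_URL.matcher(sparql);
        while (m.find()) {
            String url = m.group(1);
            if (!urls.contains(url)) {
                urls.add(url);
            }
        }
        return urls;
    }

    public static void main(String[] args) {
        String query = "PREFIX cyc: <http://www.cycfoundation.org/concepts/> "
                + "SELECT ?company WHERE { "
                + "?company cyc:mentionedInArticle "
                + "\"http://shodan.ijs.si:8080/GateServer/news.txt\" . "
                + "?company cyc:isa cyc:PubliclyHeldCorporation }";
        System.out.println(extractArticleUrls(query));
    }
}
```

The namespace in the PREFIX line is enclosed in angle brackets rather than quotes, so it is not picked up as an article.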


Transform

• Use GATE to extract organizations
  – Returns a SetOfStatements of the form:

article-0 urn:hasUrl "http://shodan.ijs.si:8080/GateServer/news.txt"
company-0 urn:nameString "Microsoft"
company-0 urn:mentionedInArticle article-0
company-1 urn:nameString "Ford"
company-1 urn:mentionedInArticle article-0

Query:

?company cyc:mentionedInArticle "http://shodan.ijs.si:8080/GateServer/news.txt"
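The shape of the transformer output can be sketched in plain Java as follows. This assumes the organization names have already been extracted (GATE itself is not called here) and represents each statement as a simple "subject predicate object" string rather than a LarKC SetOfStatements; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not the GateTransformer plug-in.
final class TransformSketch {
    /** Builds one hasUrl statement for the article and two statements per organization. */
    static List<String> toStatements(String articleUrl, List<String> organizationNames) {
        List<String> statements = new ArrayList<>();
        String article = "article-0";
        statements.add(article + " urn:hasUrl \"" + articleUrl + "\"");
        for (int i = 0; i < organizationNames.size(); i++) {
            String company = "company-" + i;
            statements.add(company + " urn:nameString \"" + organizationNames.get(i) + "\"");
            statements.add(company + " urn:mentionedInArticle " + article);
        }
        return statements;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("Microsoft", "Ford");
        for (String s : toStatements("http://shodan.ijs.si:8080/GateServer/news.txt", names)) {
            System.out.println(s);
        }
    }
}
```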


Select

• Select only the companies with a corresponding concept in the ResearchCyc KB
    company-0 → #$MicrosoftInc
    company-1 → #$FordMotors

• Replace URIs with Cyc concepts
    cyc:mentionedInArticle → #$mentionedInArticle

• Output:

#$MicrosoftInc #$mentionedInArticle #$article-0
#$FordMotors #$mentionedInArticle #$article-0
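The selection logic can be illustrated with a small stand-alone sketch. The fixed lookup table below stands in for a real search of the ResearchCyc KB, which the slides do not detail, and the class and method names are hypothetical. Companies without a matching concept are dropped, and the predicate is rewritten to its Cyc form.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not the CycSelecter plug-in.
final class SelectSketch {
    // Assumption: a fixed table replaces the lookup in the ResearchCyc KB.
    static final Map<String, String> NAME_TO_CONCEPT = Map.of(
            "Microsoft", "#$MicrosoftInc",
            "Ford", "#$FordMotors");

    /**
     * companyNames maps company ids (company-0, ...) to extracted name strings.
     * Only companies with a known concept produce an output statement.
     */
    static List<String> select(Map<String, String> companyNames, String articleConcept) {
        List<String> output = new ArrayList<>();
        for (Map.Entry<String, String> entry : companyNames.entrySet()) {
            String concept = NAME_TO_CONCEPT.get(entry.getValue());
            if (concept != null) {
                output.add(concept + " #$mentionedInArticle " + articleConcept);
            }
        }
        return output;
    }

    public static void main(String[] args) {
        Map<String, String> names = new LinkedHashMap<>(); // keeps insertion order
        names.put("company-0", "Microsoft");
        names.put("company-1", "Ford");
        names.put("company-2", "Some Unknown Startup");
        select(names, "#$article-0").forEach(System.out::println);
    }
}
```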


Reason

• Reason
  – Load the triples with Cyc concept names into the ResearchCyc KB
  – Transform the SPARQL query into a Cyc query
  – Execute the query and retrieve the results
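The query transformation step might look roughly like the sketch below, which rewrites SPARQL triple patterns into CycL-style prefix clauses. This is an assumption about the general idea, not the CycReasoner code: the class name is hypothetical, patterns are passed in already split into subject, predicate and object, and the article is assumed to have been resolved to #$article-0 beforehand.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not the CycReasoner plug-in.
final class CycQuerySketch {
    /** ?var becomes ?VAR, cyc:Name becomes #$Name, anything else is kept unchanged. */
    static String toCycTerm(String term) {
        if (term.startsWith("?")) {
            return term.toUpperCase(Locale.ROOT);
        }
        if (term.startsWith("cyc:")) {
            return "#$" + term.substring("cyc:".length());
        }
        return term;
    }

    /** Each pattern is {subject, predicate, object}; CycL puts the predicate first. */
    static String toCycQuery(List<String[]> patterns) {
        List<String> clauses = new ArrayList<>();
        for (String[] p : patterns) {
            clauses.add("(" + toCycTerm(p[1]) + " " + toCycTerm(p[0]) + " " + toCycTerm(p[2]) + ")");
        }
        // Several patterns are combined into a conjunction.
        return clauses.size() == 1
                ? clauses.get(0)
                : "(#$and " + String.join(" ", clauses) + ")";
    }

    public static void main(String[] args) {
        List<String[]> patterns = new ArrayList<>();
        patterns.add(new String[] {"?company", "cyc:mentionedInArticle", "#$article-0"});
        patterns.add(new String[] {"?company", "cyc:isa", "cyc:PubliclyHeldCorporation"});
        System.out.println(toCycQuery(patterns));
    }
}
```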


Run the workflow on your computer!

Main class: eu.larkc.core.LarkcVM
VM arguments: -Xmx512m


Run SPARQL client

• In Windows: double-click SPARQLClient.jar

• In Linux: java -jar SPARQLClient.jar


Run example query

• Execute query in SPARQL Client

• Walk through the output of the program

• Go through the plug-ins’ .java files


Other interesting queries

PREFIX cyc: <http://www.cycfoundation.org/concepts/>
SELECT ?company WHERE {
  ?company cyc:mentionedInArticle "http://shodan.ijs.si:8080/GateServer/news.txt" .
  ?company cyc:isa cyc:PubliclyHeldCorporation
}

PREFIX cyc: <http://www.cycfoundation.org/concepts/>
SELECT ?company WHERE {
  ?company cyc:mentionedInArticle "http://shodan.ijs.si:8080/GateServer/news.txt" .
  ?company cyc:isa cyc:SoftwareVendor
}

PREFIX cyc: <http://www.cycfoundation.org/concepts/>
SELECT ?company WHERE {
  ?company cyc:mentionedInArticle "http://shodan.ijs.si:8080/GateServer/news2.txt" .
  ?company cyc:isa cyc:SoftwareVendor
}

PREFIX cyc: <http://www.cycfoundation.org/concepts/>
SELECT ?company WHERE {
  ?company cyc:mentionedInArticle "http://shodan.ijs.si:8080/GateServer/news.txt" .
  ?company cyc:mentionedInArticle "http://shodan.ijs.si:8080/GateServer/news2.txt" .
  ?company cyc:isa cyc:Business
}


Other interesting queries

PREFIX cyc: <http://www.cycfoundation.org/concepts/>

SELECT ?company WHERE

{ ?company cyc:mentionedInArticle "http://shodan.ijs.si:8080/GateServer/news2.txt" .

?company cyc:makesProductType cyc:CellularTelephone }

PREFIX cyc: <http://www.cycfoundation.org/concepts/>

SELECT ?company WHERE

{ ?company cyc:mentionedInArticle "http://shodan.ijs.si:8080/GateServer/news2.txt" .

?company cyc:makesProductType cyc:CellularTelephone .

?company cyc:stockTickerSymbol ?ticker }

PREFIX cyc: <http://www.cycfoundation.org/concepts/>

SELECT ?company WHERE

{ ?company cyc:mentionedInArticle "http://shodan.ijs.si:8080/GateServer/news2.txt" .

?program cyc:programAuthor ?company }

PREFIX cyc: <http://www.cycfoundation.org/concepts/>

SELECT ?company WHERE

{ ?company cyc:mentionedInArticle "http://shodan.ijs.si:8080/GateServer/news2.txt" .

?competitor cyc:competitors ?company .

?competitor cyc:makesProductType cyc:CellularTelephone }


Plug-in SAWSDL description

<wsdl:description>

  <!-- COMMON TO ALL IDENTIFIERS -->
  <wsdl:interface name="identifier"
      sawsdl:modelReference="http://larkc.eu/plugin#Identifier">
  </wsdl:interface>

  <wsdl:binding name="larkcbinding" type="http://larkc.eu/wsdl-binding" />

  <!-- SPECIFIC TO THIS IDENTIFIER -->
  <wsdl:service
      name="urn:eu.larkc.plugin.identify.article.ArticleIdentifier"
      interface="identifier"
      sawsdl:modelReference="http://larkc.eu/plugin#ArticleIdentifier" >
    <wsdl:endpoint
        location="java:eu.larkc.plugin.identify.article.ArticleIdentifier" />
  </wsdl:service>

</wsdl:description>


Plug-in ontology

@prefix larkc: <http://larkc.eu/plugin#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .

larkc:ArticleIdentifier
    rdf:type rdfs:Class ;
    rdfs:subClassOf larkc:Identifier ;
    larkc:hasInputType larkc:SPARQLQuery ;
    larkc:hasOutputType larkc:NaturalLanguageDocument .


Scripted decider

// Register the four plug-ins in pipeline order: identify, transform, select, reason.
Pipeline pipeline = new Pipeline();
pipeline.addPlugIn(new URIImpl("urn:eu.larkc.plugin.identify.article.ArticleIdentifier"));
pipeline.addPlugIn(new URIImpl("urn:eu.larkc.plugin.transform.gate.GateTransformer"));
pipeline.addPlugIn(new URIImpl("urn:eu.larkc.plugin.select.cycselecter.CycSelecter"));
pipeline.addPlugIn(new URIImpl("urn:eu.larkc.plugin.reason.cycreasoner.CycReasoner"));

try {
    // Start the pipeline on the incoming query.
    pipeline.start(theQuery);
} catch (Exception e) {
    // error: the pipeline could not be started
}

// Take the result from the end of the pipeline.
return (VariableBinding) pipeline.take();


Write a new plug-in

• Create new project
  – New Folder
  – Link bin directory
  – Make source directory
  – Add libraries

• Prepare code:
  – Copy-paste GateTransformer.java
  – Rename it to SimpleNamedEntitiyExtractor
  – Insert code available in SimpleNamedEntitiyExtractor.txt

• Prepare/update meta-data files
  – SimpleNamedEntitiyExtractor.wsdl
  – SimpleNamedEntitiyExtractor.rdf

• Update CycGateDecider

• Clean, Build and Run!
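To give a feel for what a very simple named entity extractor could do, here is a stand-alone sketch that looks up names from a fixed list (a gazetteer) in the article text. It is not the content of SimpleNamedEntitiyExtractor.txt, which is not reproduced in these slides, and it does not use the LarKC plug-in interfaces; the class name, method name and gazetteer entries are all made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical stand-alone example, not the tutorial's plug-in code.
final class NaiveEntityExtractorSketch {
    /** Returns the gazetteer names that occur as whole words in the text, in gazetteer order. */
    static List<String> extract(String text, List<String> gazetteer) {
        List<String> found = new ArrayList<>();
        for (String name : gazetteer) {
            // Pattern.quote keeps characters such as '.' in a name from acting as regex syntax.
            Pattern wholeWord = Pattern.compile("\\b" + Pattern.quote(name) + "\\b");
            if (wholeWord.matcher(text).find()) {
                found.add(name);
            }
        }
        return found;
    }

    public static void main(String[] args) {
        List<String> gazetteer = Arrays.asList("Microsoft", "Ford", "Nokia");
        String text = "Microsoft and Ford reported earnings; Fordham University did not.";
        System.out.println(extract(text, gazetteer));
    }
}
```

Inside a real plug-in, the names found this way would then be emitted as statements in the same style as the GateTransformer output.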