Ask Data Anything: How to make the most of your data using Semantic Technologies

The company, product and service names used in this web site are for identification purposes only.

© Cognitum 2014. All trademarks and registered trademarks are the property of their respective owners.

Overview

Why do semantics matter?

Ontologies and Ask Data Anything dimensions

Data exploration through Ask Data Anything

Data mining for attributes (if time permits)

Why do semantics matter?

Flat data is void: the types of queries you can perform over flat data are restricted to the vocabulary contained in that specific data.

[Figure: table with columns Item, City, Date, Inv. stat, labelled "Vocabulary"]

Why do semantics matter? (cont.)

What about relational data? For example, how do you make complex relational queries that involve simple relations, such as the country to which a city belongs?

[Figure: a table with columns Item, City, Date, Inv. stat, and a table with columns Item, Country]

Why do semantics matter? (cont.)

The key feature offered by semantics, and by Ask Data Anything in particular, is the ability to add layers on top of the data that are not explicitly in the data itself. For example, you can ask for results over cities, countries and continents when the data only contains information about cities.

One way to achieve this is to define a structure of knowledge (a taxonomy) for the domain, with:

nouns representing classes of objects

verbs representing relations between objects
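As a rough illustration of this layering (a sketch only, not how Ask Data Anything is implemented), the snippet below keeps city-level rows apart from a set of `is-located-in` relations, then answers a country-level question that the rows alone cannot express. The names `LOCATED_IN`, `rows` and `sum_by_country`, and the quantities, are hypothetical.

```python
from collections import defaultdict

# Relations (verbs) between objects; the flat data knows nothing about them.
LOCATED_IN = {
    "Krakow": "Poland",
    "Warsaw": "Poland",
    "Berlin": "Germany",
    "Hamburg": "Germany",
}

# Flat data: only cities appear (made-up quantities for illustration).
rows = [
    {"city": "Krakow", "quantity": 10},
    {"city": "Warsaw", "quantity": 5},
    {"city": "Berlin", "quantity": 7},
]


def sum_by_country(rows, located_in):
    """Aggregate city-level quantities at the country level."""
    totals = defaultdict(int)
    for row in rows:
        totals[located_in[row["city"]]] += row["quantity"]
    return dict(totals)


totals = sum_by_country(rows, LOCATED_IN)
print(totals)
```

The word "country" never occurs in `rows`; it becomes queryable only because the relation layer sits on top of the data.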

Ontologies & Ada dimensions (Taxonomy embodiment)

[Diagram: ada dimensions]

localization: city, country, continent

temporal: day, month, year

hierarchical: goods, clothing, shoe, sport-shoes, high-heel-shoes, pants, jeans, shorts

Poland is a country. Germany is a country.
Krakow is a city. Warsaw is a city.
Krakow is-located-in Poland. Warsaw is-located-in Poland.
Hamburg is a city. Berlin is a city.
Hamburg is-located-in Germany. Berlin is-located-in Germany.
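To make the shape of these statements concrete, here is a toy parser that turns the two sentence forms above into (subject, predicate, object) triples. It is only a sketch for these two patterns and is not the Ontorion controlled natural language grammar; `parse_facts` and the regular expressions are hypothetical helpers.

```python
import re

FACTS = (
    "Poland is a country. Germany is a country. "
    "Krakow is a city. Warsaw is a city. "
    "Krakow is-located-in Poland. Warsaw is-located-in Poland. "
    "Hamburg is a city. Berlin is a city. "
    "Hamburg is-located-in Germany. Berlin is-located-in Germany."
)

# "X is a Y" declares an instance; "X is-located-in Y" declares a relation.
INSTANCE = re.compile(r"^(\S+) is an? (\S+)$")
RELATION = re.compile(r"^(\S+) (is-located-in) (\S+)$")


def parse_facts(text):
    """Split on '.' and turn each sentence into a (subject, predicate, object) triple."""
    triples = []
    for sentence in text.split("."):
        sentence = sentence.strip()
        if not sentence:
            continue
        match = INSTANCE.match(sentence)
        if match:
            triples.append((match.group(1), "is-a", match.group(2)))
            continue
        match = RELATION.match(sentence)
        if match:
            triples.append((match.group(1), match.group(2), match.group(3)))
            continue
        raise ValueError(f"unrecognised sentence: {sentence!r}")
    return triples


triples = parse_facts(FACTS)
cities_in_poland = sorted(
    s for s, p, o in triples if p == "is-located-in" and o == "Poland"
)
print(cities_in_poland)
```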

Every column in the input data is marked with one of the data dimensions and with a concept (e.g., localization and city).
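One possible way to picture this marking is a mapping from column name to a (dimension, concept) pair, checked against the concepts listed for each dimension. This is a sketch under assumptions: the column names, the particular marks and the `check_marks` helper are hypothetical, and the slide does not say how the marking is stored.

```python
# Concepts per dimension, as listed in the dimensions diagram.
DIMENSIONS = {
    "localization": {"city", "country", "continent"},
    "temporal": {"day", "month", "year"},
    "hierarchical": {
        "goods", "clothing", "shoe", "sport-shoes",
        "high-heel-shoes", "pants", "jeans", "shorts",
    },
}

# Hypothetical column annotations: column name -> (dimension, concept).
COLUMN_MARKS = {
    "City": ("localization", "city"),
    "Date": ("temporal", "day"),
    "Item": ("hierarchical", "goods"),
}


def check_marks(marks, dimensions):
    """Return the columns whose (dimension, concept) pair is not in the taxonomy."""
    return [
        column
        for column, (dimension, concept) in marks.items()
        if concept not in dimensions.get(dimension, set())
    ]


invalid = check_marks(COLUMN_MARKS, DIMENSIONS)
print(invalid)
```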

Taxonomy modeling in Fluent Editor

Every location is an ada-dimension. Every country is a location. Every city is a location. Every continent is a location.

Every hierarchical is an ada-dimension. Every brand is hierarchical. Every status is a hierarchical. Every vendor is hierarchical. Every good is hierarchical.

Every temporal is an ada-dimension. Every second is temporal. Every month is temporal. Every year is temporal.
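Sentences of the form "Every X is (a) Y." read as subclass statements, so a class inherits every ancestor of its parent. The sketch below parses a few of the sentences above and walks that chain. It assumes each class has a single parent and handles only this one sentence pattern; `parse_taxonomy` and `ancestors` are hypothetical helpers, not part of Fluent Editor.

```python
import re

SENTENCES = [
    "Every location is an ada-dimension.",
    "Every country is a location.",
    "Every city is a location.",
    "Every continent is a location.",
    "Every temporal is an ada-dimension.",
    "Every month is temporal.",
    "Every year is temporal.",
]

# "Every X is [a|an] Y." states that class X is a subclass of class Y.
PATTERN = re.compile(r"^Every (\S+) is (?:an? )?(\S+)\.$")


def parse_taxonomy(sentences):
    """Map each class to its direct parent (single-parent assumption)."""
    parents = {}
    for sentence in sentences:
        match = PATTERN.match(sentence)
        if match is None:
            raise ValueError(f"unrecognised sentence: {sentence!r}")
        parents[match.group(1)] = match.group(2)
    return parents


def ancestors(cls, parents):
    """Follow parent links upwards, nearest ancestor first."""
    chain = []
    while cls in parents:
        cls = parents[cls]
        chain.append(cls)
    return chain


parents = parse_taxonomy(SENTENCES)
print(ancestors("city", parents))
```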

OWL 2.0 full compliance.

Uses Ontorion controlled natural language (OCNL) | OWL2 --> OWL2 + SWRL

Ask Data Anything

Technically, Ask Data Anything is capable of performing projection, sub-setting and aggregation operations, providing answers for queries involving the following information:

What: which quantitative field to use.

How: how the output is to be displayed.

Where (optional): where to restrict the results; makes use of the containment relation.

By (optional): what to aggregate by; requires an aggregating operation to be provided.

Aggregating operation (optional): tied to the type (e.g. double, string) of What.

When (optional).
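The parts of a query listed above can be pictured as a small record with two consistency rules: grouping requires an aggregating operation, and the operation must suit the type of the "what" field. This is a sketch only; the class `AdaQuery`, its field names and the sets of allowed operations are assumptions, not the product's API.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed operation sets; the slide only says the operation depends on the type.
NUMERIC_OPS = {"sum", "average", "min", "max", "count"}
STRING_OPS = {"count"}


@dataclass
class AdaQuery:
    what: str                        # quantitative field to use
    what_type: type                  # Python type of that field
    how: str = "table"               # how the output is displayed
    where: Optional[str] = None      # optional restriction
    by: Optional[str] = None         # optional concept to aggregate by
    operation: Optional[str] = None  # optional aggregating operation
    when: Optional[str] = None       # optional temporal restriction

    def validate(self):
        """Apply the two rules stated in the list above; return self if valid."""
        if self.by is not None and self.operation is None:
            raise ValueError("'by' needs an aggregating operation")
        if self.operation is not None:
            allowed = NUMERIC_OPS if self.what_type in (int, float) else STRING_OPS
            if self.operation not in allowed:
                raise ValueError(
                    f"{self.operation!r} is not valid for {self.what_type.__name__}"
                )
        return self


query = AdaQuery(
    what="quantity", what_type=int, where="Poland", by="city", operation="sum"
).validate()
print(query)
```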

Ask Data Anything (Architecture)

Data and models are tightly coupled: models provide the interface for querying the data, so everything that is to be queried must be modeled in the underlying ontology.

Ask Data Anything

Let us consider the following data source (extract):

Invoice | Item | Vendor | Status | Quantity | City
TID-0001 | Cafepress Men's Bass Guitar Guy Graphic Tee | Vendor-A | Rejected | 133 | Berlin
 | | | | | Warsaw
TID-0002 | Shirt-1 | Vendor-C | Accepted | 114 | Berlin
TID-0003 | Cafepress Men's 2 Daughters Graphic Tee | Vendor-A | Pending | 34 | Munchen
TID-0004 | Faded Glory Women's Knit Polo 2-Pack | Vendor-D | Accepted | 110 | Munchen
TID-0005 | Cafepress Men's Biceps Graphic Tee | Vendor-B | Pending | 112 | Berlin
TID-0006 | Faded Glory Men's Long Sleeve 2 Pocket Flannel Shirt | Vendor-B | Rejected | 37 | Krakow
TID-0007 | Aqua Blues Women's Twist Back Top | Vendor-A | Paid | 63 | Berlin

Vendor-A is a vendor. Vendor-B is a vendor. Vendor-C is a vendor. Vendor-D is a vendor.

Rejected is a status. Accepted is a status. Pending is a status. Paid is a status.

Every status is a hierarchical. Every vendor is hierarchical.

Example queries in ADA

Sum quantity in Poland by city
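To show what such a query has to compute, the sketch below evaluates it by hand over part of the extract from the data source slide. It is not the ADA engine. Assumptions: TID-0001 is left out because the extract shows two city values next to it; the `Munchen` entry in `LOCATED_IN` is added here (the ontology facts in the slides list only Krakow, Warsaw, Hamburg and Berlin); `sum_quantity` is a hypothetical helper.

```python
from collections import defaultdict

# (invoice, vendor, status, quantity, city) for TID-0002 to TID-0007.
ROWS = [
    ("TID-0002", "Vendor-C", "Accepted", 114, "Berlin"),
    ("TID-0003", "Vendor-A", "Pending", 34, "Munchen"),
    ("TID-0004", "Vendor-D", "Accepted", 110, "Munchen"),
    ("TID-0005", "Vendor-B", "Pending", 112, "Berlin"),
    ("TID-0006", "Vendor-B", "Rejected", 37, "Krakow"),
    ("TID-0007", "Vendor-A", "Paid", 63, "Berlin"),
]

LOCATED_IN = {
    "Krakow": "Poland",
    "Warsaw": "Poland",
    "Hamburg": "Germany",
    "Berlin": "Germany",
    "Munchen": "Germany",  # assumption: not among the facts shown in the slides
}


def sum_quantity(rows, located_in, where=None, by="city"):
    """Sum quantity per city or country, optionally restricted to a place."""
    totals = defaultdict(int)
    for _invoice, _vendor, _status, quantity, city in rows:
        country = located_in[city]
        # Containment: a row matches if its city or its country is the place.
        if where is not None and where not in (city, country):
            continue
        key = city if by == "city" else country
        totals[key] += quantity
    return dict(totals)


in_poland = sum_quantity(ROWS, LOCATED_IN, where="Poland", by="city")
by_country = sum_quantity(ROWS, LOCATED_IN, by="country")
print(in_poland)
print(by_country)
```

The `where="Poland"` filter works only through the `is-located-in` relation, since no row mentions Poland directly.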

Example queries in ADA (cont.)

Sum quantity by country on map

Example queries in ADA (cont.)

Summarize status by vendor on piechart

Material sub-setting (Data mining for attributes)

For CSV data, the Ontorion Text Mining AddIn for Excel extracts meaningful information from raw data, which helps in the taxonomy creation process.

It also allows attribute-based sub-setting.
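As a generic illustration of mining attributes from raw item names (this is a plain token-frequency sketch and makes no claim about how the Ontorion add-in works), the snippet below finds words shared by several items in the extract and uses one of them to select a subset. The helpers `tokens`, `candidate_attributes` and `subset`, and the `min_count` threshold, are hypothetical.

```python
import re
from collections import Counter

ITEMS = [
    "Cafepress Men's Bass Guitar Guy Graphic Tee",
    "Shirt-1",
    "Cafepress Men's 2 Daughters Graphic Tee",
    "Faded Glory Women's Knit Polo 2-Pack",
    "Cafepress Men's Biceps Graphic Tee",
    "Faded Glory Men's Long Sleeve 2 Pocket Flannel Shirt",
    "Aqua Blues Women's Twist Back Top",
]


def tokens(item):
    """Lower-case words of an item name; digits and hyphens act as separators."""
    return set(re.findall(r"[a-z']+", item.lower()))


def candidate_attributes(items, min_count=2):
    """Words that occur in at least min_count items, with their item counts."""
    counts = Counter()
    for item in items:
        counts.update(tokens(item))  # each word counted once per item
    return {word: n for word, n in counts.items() if n >= min_count}


def subset(items, attribute):
    """Attribute-based sub-setting: items whose name contains the word."""
    return [item for item in items if attribute in tokens(item)]


attributes = candidate_attributes(ITEMS)
tees = subset(ITEMS, "tee")
print(sorted(attributes))
print(tees)
```

Words such as "tee" or "shirt" that recur across items are natural candidates for taxonomy classes, which a person would still need to review before adding them to the ontology.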

Thank you!

Jesus Nuñez
Junior Data Engineer at Cognitum

[email protected]

Cognitum Sp. z o.o.
Wal Miedzeszynski 630, Warsaw, Poland

http://www.cognitum.eu/, [email protected]

Windows Azure Circle Partner

MEMBER