TAIPAN: Automatic Property Mapping for Tabular Data

Ivan Ermilov and Axel-Cyrille Ngonga Ngomo

November 22nd, 2016

Web Scale Data Mining from Web Tables

Web Data Commons

Dresden Table Dataset

Other tables

The Web

TAIPAN

● Structured
● Schemaless
● Not using standards*

● SPARQL
● RDFS
● OWL

TAIPAN Approach Overview

Step 1: Identify Subject Column

Step 2: Atomize a Table

Step 3: Identify Property for Each Table

Step 4: Return Mappings

TAIPAN Approach Overview (example)

The Core of TAIPAN

Subject Column Identification

● Unsupervised ML
● Structural features
● Semantic features
  ○ Support of a column
  ○ Connectivity

● Retrieve seed entities
● Rank entities
● Return top entity

Property Mapping
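The slides give no detail on how a property is chosen for a column. One plausible reading, sketched here purely as an assumption, is a vote: for each row, look up which knowledge-base properties connect the subject-column entity to the cell value, then return the most frequent property. The triples and the name `map_property` are hypothetical.

```python
from collections import Counter
from typing import List, Optional, Set, Tuple

# Hypothetical toy triples (subject, property, object); not real DBpedia data.
TRIPLES: Set[Tuple[str, str, str]] = {
    ("Berlin", "country", "Germany"),
    ("Paris", "country", "France"),
    ("Berlin", "twinCity", "Paris"),
}


def map_property(subjects: List[str], values: List[str]) -> Optional[str]:
    """Return the property that most often links a subject to the value in
    the same row, or None when no triple matches any row."""
    votes: Counter = Counter()
    for subj, val in zip(subjects, values):
        for s, prop, obj in TRIPLES:
            if s == subj and obj == val:
                votes[prop] += 1
    if not votes:
        return None
    return votes.most_common(1)[0][0]


mapped = map_property(["Berlin", "Paris"], ["Germany", "France"])
unmapped = map_property(["Berlin", "Paris"], ["3.6", "2.1"])
```

Against a real endpoint the inner loop would be replaced by a query (for example in SPARQL) rather than a scan over an in-memory set.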

Experimental setup

For T2K: 128GB, 4 Cores, Ubuntu 14.04

For TAIPAN: 16GB, 4 Cores, Ubuntu 14.04

Dataset 1: curated T2D gold standard (T2D)

Dataset 2: DBpedia table dataset (DBD)

Subject Column Identification Experiments

Rule-based approach achieves only 51.72% accuracy

Using support and connectivity increases precision

Observations

Can be further improved using ML techniques

Property Mapping Experiments

TAIPAN achieves better recall, but lower precision than T2K

On the DBD dataset T2K could match only 1 property

Observations

Overall TAIPAN performs better than the state of the art
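For readers unfamiliar with the metrics behind these observations, the sketch below shows how precision and recall could be computed for property mappings against a gold standard. The column indices and property names are made-up toy values, not figures from the experiments, and `precision_recall` is a hypothetical helper.

```python
from typing import Dict, Tuple


def precision_recall(predicted: Dict[int, str],
                     gold: Dict[int, str]) -> Tuple[float, float]:
    """Precision = correct / predicted mappings; recall = correct / gold mappings."""
    correct = sum(1 for col, prop in predicted.items() if gold.get(col) == prop)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall


# Toy values only (assumption): column index -> property name.
gold = {0: "country", 1: "population", 2: "mayor", 3: "area"}
predicted = {0: "country", 1: "populationTotal"}

p, r = precision_recall(predicted, gold)
```

A system that proposes few mappings can score high precision with low recall, while one that proposes a mapping for most columns tends toward the reverse trade-off.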

Conclusions & Future Work

Curated T2D & DBD datasets

Novel TAIPAN approach

Open Table Extraction

Table Extraction Benchmark (HOBBIT)

Integration of TAIPAN into GEISER project

Thank you! Follow us on Twitter :)

Ivan Ermilov <iermilov@informatik.uni-leipzig.de>

@hobbit_project
