ML Schema: Machine Learning Schema
Agnieszka Lawrynowicz
Poznan University of Technology, Poland
OpenML2016
March 17, 2016
Agnieszka Lawrynowicz ML Schema: Machine Learning Schema 1 / 18
W3C Machine Learning Schema Community Group
https://www.w3.org/community/ml-schema/
Goals
To define a simple shared schema of data mining/machine learning (DM/ML) algorithms, datasets, and experiments that may be used in many different formats: XML, RDF, OWL, spreadsheet tables.
Collect use cases from the academic community and industry
Use this schema as a basis to align existing DM/ML ontologies and develop more specific ontologies with specific purposes/applications
Prevent a proliferation of incompatible DM/ML ontologies
Turn machine learning algorithms and results into linked open data
Promote the use of this schema, including involving stakeholders like ML tool developers
Apply for funding (e.g. EU COST, UK Research Councils, Horizon2020 Coordination and Support Actions) to organize workshops, and for dissemination
Goals
Use this schema as a basis to align existing DM/ML ontologies and develop more specific ontologies with specific purposes/applications
Prevent a proliferation of incompatible DM/ML ontologies
ML ontologies and vocabularies
OntoDM
DMOP
Expose
MEX vocabulary
others: KDDONTO, KD, DMWF, ...
most have several hundred classes; some are highly axiomatized
OntoDM
Pance Panov, Larisa N. Soldatova, Saso Dzeroski: Ontology of core data mining entities. Data Min. Knowl. Discov. 28(5-6): 1222-1265 (2014)
built in compliance with the upper-level ontologies BFO, OBI, and IAO; modularized
incorporates structured data mining
Use case: generic, middle-level ontology for ML; representing QSAR entities for drug design, used by the Eve Robot Scientist
DMOP: Data Mining Optimization Ontology
C. Maria Keet, Agnieszka Lawrynowicz, Claudia d’Amato, Alexandros Kalousis, Phong Nguyen, Raul Palma, Robert Stevens, Melanie Hilario: The Data Mining OPtimization Ontology. J. Web Sem. 32: 43-53 (2015)
development started in e-LICO EU FP7 project (2009-2012)
detailed algorithm internal characteristics (’qualities’)
Use case: meta-learning (’whitebox’), meta-mining, used to produce the Intelligent Discovery Assistant for RapidMiner
Expose
Joaquin Vanschoren, Hendrik Blockeel, Bernhard Pfahringer, Geoffrey Holmes: Experiment databases - A new way to share, organize and learn from experiments. Machine Learning 87(2): 127-158 (2012)
re-uses OntoDM (at top-level) and DMOP (at bottom level)
superseded by OpenML DB schema
Use case: experiment databases, ExpML markup
MEX vocabulary
Diego Esteves, Diego Moussallem, Ciro Baron Neto, Tommaso Soru, Ricardo Usbeck, Markus Ackermann, Jens Lehmann: MEX vocabulary: a lightweight interchange format for machine learning experiments. SEMANTICS 2015: 169-176
lightweight interchange format
maps to PROV
Use case: annotating ML experiments and interchanging ML metadata
Previous step towards aligning DM/ML ontologies
DMO Ontology Jamboree, Jožef Stefan Institute, Slovenia, 2010
Goals
To define a simple shared schema of data mining/machine learning (DM/ML) algorithms, datasets, and experiments that may be used in many different formats: XML, RDF, OWL, spreadsheet tables.
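A minimal sketch of what "one schema, many formats" could look like in practice: a single experiment record rendered both as XML and as a spreadsheet row. The `RunRecord` class, its field names, and the sample values are all hypothetical placeholders chosen for illustration; none of them are terms from the ML Schema draft.

```python
import csv
import io
import xml.etree.ElementTree as ET
from dataclasses import dataclass, asdict


@dataclass
class RunRecord:
    # All field names are hypothetical; they are not ML Schema terms.
    algorithm: str
    dataset: str
    measure: str
    value: float


def to_xml(record: RunRecord) -> str:
    """Serialize the record as a flat XML element with one child per field."""
    root = ET.Element("run")
    for key, val in asdict(record).items():
        ET.SubElement(root, key).text = str(val)
    return ET.tostring(root, encoding="unicode")


def to_csv(record: RunRecord) -> str:
    """Serialize the record as a header line plus one spreadsheet row."""
    fields = asdict(record)
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(fields), lineterminator="\n")
    writer.writeheader()
    writer.writerow(fields)
    return buf.getvalue()


# Made-up sample values, used only to exercise the two serializers.
record = RunRecord("C4.5", "iris", "accuracy", 0.94)
xml_text = to_xml(record)
csv_text = to_csv(record)
```

Both outputs carry the same four fields, which is the point of a shared schema: the format changes while the vocabulary stays fixed.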
The current draft of ML Schema
OpenML2016, Lorentz Center, Netherlands, 2016 (our work may be found at https://github.com/ML-Schema/core)
Goals
Turn machine learning algorithms and results into linked open data
OpenML2016 plan for integrating OpenML with ML Schema and Linked Data
Assign URIs to OpenML classes and properties
Align OpenML vocabulary to ML-Schema
Complete an initial specification of ML-Schema v1.0
Develop a tool to provide each OpenML entity with RDF data (JSON-LD)
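The plan above could be sketched roughly as follows: give an OpenML run a URI and emit a JSON-LD description that links it to its flow and dataset. This is an illustration under stated assumptions, not the tool the group built. Both namespace URIs are `example.org` placeholders, and the terms `mls:Run`, `mls:executes`, and `mls:hasInput`, as well as the `run_to_jsonld` helper and the URI layout, are hypothetical names rather than confirmed parts of ML Schema or OpenML.

```python
import json

# Assumption: placeholder namespaces, not the real ML Schema or OpenML URIs.
MLS = "http://example.org/ml-schema#"
OPENML = "http://example.org/openml/"


def run_to_jsonld(run_id: int, flow_id: int, dataset_id: int) -> dict:
    """Build a JSON-LD node for one run, linking it to a flow and a dataset.

    The property and class names are hypothetical stand-ins for whatever
    terms the aligned vocabulary would define.
    """
    return {
        "@context": {"mls": MLS},
        "@id": f"{OPENML}run/{run_id}",
        "@type": "mls:Run",
        "mls:executes": {"@id": f"{OPENML}flow/{flow_id}"},
        "mls:hasInput": {"@id": f"{OPENML}dataset/{dataset_id}"},
    }


doc = run_to_jsonld(run_id=1, flow_id=2, dataset_id=3)
print(json.dumps(doc, indent=2))
```

Because every entity is referenced by `@id` rather than embedded, the run, flow, and dataset descriptions can be published separately and still link together as linked data.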
Goals
Collect use cases from the academic community and industry
Use cases
Experiment/model sharing
Workflow design/planning
Meta learning
Text mining
Experiment reproducibility in publications
Comparison of ML algorithms
Education
Call for use cases!
Goals
Promote the use of this schema, including involving stakeholders like ML tool developers
You are invited to join the W3C ML Schema group!