Datasets by David Semeria

Transcript
Page 1: Datasets by David Semeria

LM Datasets

Promote data and code sharing

on the web

David Semeria

[email protected]

@hymanroth

Ruby Social Club Milano

16th December 2010

Page 5: Datasets by David Semeria

LM Datasets 5

Objects

Properties (data): Data abstraction (BAD)

Methods (code): Functional abstraction (GOOD)

Interface

Context: web services

Interoperability is key

Page 6: Datasets by David Semeria

LM Datasets 6

Interoperability

Browser

Twitter, Facebook, Flickr, Bit.ly

Page 8: Datasets by David Semeria

LM Datasets 8

How Much Glue Code?

Twitter → Facebook
Twitter → Flickr
Twitter → Bit.ly
Facebook → Twitter
Facebook → Flickr
Facebook → Bit.ly
Flickr → Twitter
Flickr → Facebook
Flickr → Bit.ly
Bit.ly → Twitter
Bit.ly → Facebook
Bit.ly → Flickr

12 sets of code

N² - N (here N = 4: 4² - 4 = 12)
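
The count above can be checked with a short sketch. It enumerates every ordered pair of distinct services, on the assumption that each direction needs its own glue code; the function names gluePairs and glueCount are illustrative and not part of the talk.

```javascript
// Enumerate every ordered (source, target) pair of distinct services.
// Each pair needs its own glue code, giving N * (N - 1) = N^2 - N sets.
function gluePairs(services) {
  var pairs = [];
  for (var i = 0; i < services.length; i++) {
    for (var j = 0; j < services.length; j++) {
      if (i !== j) {
        pairs.push(services[i] + ' -> ' + services[j]);
      }
    }
  }
  return pairs;
}

// Closed form of the same count.
function glueCount(n) {
  return n * n - n;
}

var pairs = gluePairs(['Twitter', 'Facebook', 'Flickr', 'Bit.ly']);
console.log(pairs.length);   // 12
console.log(glueCount(100)); // 9900
```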

Page 10: Datasets by David Semeria

LM Datasets 10

The General Case

Browser

Choose from N options

Service A

Choose from N options

Service B

For N = 100: N² - N = 100² - 100 = 9,900

Page 11: Datasets by David Semeria

LM Datasets 11

The Problem

APIs are better than nothing, but they remain a major impediment to a fully writable Web.

(The same applies to corporate intranets)

Page 12: Datasets by David Semeria

LM Datasets 12

Datasets

A generic representation for hierarchical data

Global data definitions

Permissions

LIBRARY (front and back end)

Key word: GENERIC

Page 13: Datasets by David Semeria

LM Datasets 13

Hierarchical Structures

[Diagram: a tree with the root at the top, nodes branching below it, and leaves at the ends of the branches]

Page 14: Datasets by David Semeria

LM Datasets 14

A 'people' tree

root
    people
        sport
            soccer
                id: maldini, name: "Paolo Maldini"
                id: gerrard, name: "Steven Gerrard"
            formula1
                id: alonso, name: "Fernando Alonso"
                id: hamilton, name: "Lewis Hamilton"
        music
            id: bowie, name: "David Bowie"
            id: clapton, name: "Eric Clapton"

Page 15: Datasets by David Semeria

LM Datasets 15

Generic Representation

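[Diagram: the tree is split into two flat maps. S holds, for the root and for each node, the ids of its children (nodes and leaves). R holds one record per id: node 1 record, node 2 record, leaf 1 record, leaf 2 record.]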

Page 16: Datasets by David Semeria

LM Datasets 16

JSON Example

var ds = {
  // s: for each node, the ids of its children
  s: {
    root:     { people: 1 },
    people:   { music: 1, sport: 1 },
    sport:    { soccer: 1, formula1: 1 },
    music:    { bowie: 1, clapton: 1 },
    soccer:   { maldini: 1, gerrard: 1 },
    formula1: { alonso: 1, hamilton: 1 }
  },
  // r: one record per id (nodes and leaves alike)
  r: {
    people:   { name: "People", color: "green" },
    music:    { name: "Music", color: "black" },
    sport:    { name: "Sport", color: "white" },
    soccer:   { name: "Soccer", color: "red" },
    formula1: { name: "Formula One", color: "yellow" },
    bowie:    { name: "David Bowie", color: "black" },
    clapton:  { name: "Eric Clapton", color: "black" },
    maldini:  { name: "Paolo Maldini", color: "red" },
    gerrard:  { name: "Steven Gerrard", color: "red" },
    alonso:   { name: "Fernando Alonso", color: "red" },
    hamilton: { name: "Lewis Hamilton", color: "silver" }
  }
};
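
Because JavaScript property names are case-sensitive, an id must be spelled identically in s and in r. The following is a minimal sketch of a consistency check, run on a trimmed-down dataset in the same shape; missingRecords is a hypothetical helper, not part of the library presented in the talk.

```javascript
// Trimmed-down dataset in the same s/r shape as the slide.
var ds = {
  s: {
    root:   { people: 1 },
    people: { music: 1 },
    music:  { bowie: 1, clapton: 1 }
  },
  r: {
    people:  { name: 'People', color: 'green' },
    music:   { name: 'Music', color: 'black' },
    bowie:   { name: 'David Bowie', color: 'black' },
    Clapton: { name: 'Eric Clapton', color: 'black' } // wrong case on purpose
  }
};

// Return the child ids listed in s that have no record in r.
function missingRecords(dataset) {
  var missing = [];
  for (var nodeId in dataset.s) {
    for (var childId in dataset.s[nodeId]) {
      if (!Object.prototype.hasOwnProperty.call(dataset.r, childId)) {
        missing.push(childId);
      }
    }
  }
  return missing;
}

console.log(missingRecords(ds)); // [ 'clapton' ]
```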

Page 17: Datasets by David Semeria

LM Datasets 17

Some Code Examples

➔ Leverage structure

➔ No need for recursive tree walking

➔ Leverage native operations

➔ Object property look-up much faster than array iteration.
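
As one illustration of these points, every leaf can be listed in a single flat pass, with no recursive walk: a leaf is any id that has a record but no entry in the structure map. This is a sketch on a small sample dataset; leafIds is an assumed name, not a function from the talk.

```javascript
// Small sample dataset: s maps node ids to child ids, r maps ids to records.
var ds = {
  s: {
    root:   { people: 1 },
    people: { music: 1, sport: 1 },
    music:  { bowie: 1 },
    sport:  { maldini: 1 }
  },
  r: {
    people:  { name: 'People' },
    music:   { name: 'Music' },
    sport:   { name: 'Sport' },
    bowie:   { name: 'David Bowie' },
    maldini: { name: 'Paolo Maldini' }
  }
};

// A leaf is an id with a record in r but no children entry in s.
function leafIds(dataset) {
  var leaves = [];
  for (var id in dataset.r) {
    if (!Object.prototype.hasOwnProperty.call(dataset.s, id)) {
      leaves.push(id);
    }
  }
  return leaves;
}

console.log(leafIds(ds)); // [ 'bowie', 'maldini' ]
```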

Page 18: Datasets by David Semeria

LM Datasets 18

ID Exists ?

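// True when a record with this id exists in ds.r.
// hasOwnProperty avoids false positives from inherited names such as 'toString'.
function IdExists (id) {
  return Object.prototype.hasOwnProperty.call(ds.r, id);
}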

Page 19: Datasets by David Semeria

LM Datasets 19

Node or Leaf ?

// Assumes id exists.
function nodeOrLeaf (id) {
  return ds.s[id] ? 'node' : 'leaf';
}

Page 20: Datasets by David Semeria

LM Datasets 20

Node contains id ?

// Assumes nodeId exists and is a node.
function contains (nodeId, id) {
  return Boolean(ds.s[nodeId][id]);
}

Page 21: Datasets by David Semeria

LM Datasets 21

Parent Node

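// Returns the id of the node that directly contains id.
function parentNode (id) {
  for (var k in ds.s) {
    if (ds.s[k][id]) {
      return k;
    }
  }
  throw new Error('No parent found for id: ' + id);
}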
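
The parent look-up scans every node on each call. If it is called often, one possible refinement is to build a reverse index once and then read parents with a plain property look-up. This is a sketch only; buildParentIndex is a hypothetical helper, and the index has to be rebuilt or updated whenever an item is moved.

```javascript
// Sample structure map: node id -> child ids.
var ds = {
  s: {
    root:   { people: 1 },
    people: { music: 1, sport: 1 },
    music:  { bowie: 1, clapton: 1 },
    sport:  { maldini: 1 }
  }
};

// Build a child id -> parent id map in one pass over the structure.
function buildParentIndex(structure) {
  var parentOf = {};
  for (var nodeId in structure) {
    for (var childId in structure[nodeId]) {
      parentOf[childId] = nodeId;
    }
  }
  return parentOf;
}

var parentOf = buildParentIndex(ds.s);
console.log(parentOf.bowie); // music
console.log(parentOf.music); // people
console.log(parentOf.root);  // undefined (the root has no parent)
```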

Page 22: Datasets by David Semeria

LM Datasets 22

Move Item

// Assumes all ids exist.
function move (toNodeId, id) {
  delete ds.s[parentNode(id)][id];
  ds.s[toNodeId][id] = 1;
}
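
The move operation relies on the caller passing valid ids. A more defensive variant might look like the sketch below, which also refuses to move a node into itself or into one of its own descendants. The names parentOf, isWithin and safeMove are assumptions made for this example.

```javascript
var ds = {
  s: {
    root:   { people: 1 },
    people: { music: 1, sport: 1 },
    music:  { bowie: 1 },
    sport:  { maldini: 1 }
  }
};

// Id of the node that directly contains id, or null if there is none.
function parentOf(id) {
  for (var k in ds.s) {
    if (ds.s[k][id]) {
      return k;
    }
  }
  return null;
}

// True when id is ancestorId itself or sits anywhere below it.
// Walks upwards with a loop, so no recursion is needed.
function isWithin(ancestorId, id) {
  for (var cur = id; cur !== null; cur = parentOf(cur)) {
    if (cur === ancestorId) {
      return true;
    }
  }
  return false;
}

function safeMove(toNodeId, id) {
  var from = parentOf(id);
  if (from === null) {
    throw new Error('No parent found for id: ' + id);
  }
  if (!ds.s[toNodeId]) {
    throw new Error('Target is not a node: ' + toNodeId);
  }
  if (isWithin(id, toNodeId)) {
    throw new Error('Cannot move an item into itself or its descendants');
  }
  delete ds.s[from][id];
  ds.s[toNodeId][id] = 1;
}

safeMove('sport', 'bowie');
console.log(ds.s.sport); // { maldini: 1, bowie: 1 }
```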

Page 23: Datasets by David Semeria

LM Datasets 23

Templates

DATASET + TEMPLATES → FLOW → HTML

Page 25: Datasets by David Semeria

LM Datasets 25

Flowing Templates

NODE TEMPLATE:

<DIV style="border: 2px solid {color}; padding: 10px"></DIV>

LEAF TEMPLATE:

<P><SPAN style="color:{color}">{name}</SPAN></P>

OUTPUT:

Eric Clapton
David Bowie
Paolo Maldini
Steven Gerrard
Fernando Alonso
Lewis Hamilton
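
How a dataset and the two templates might be combined can be sketched as follows. This is not the library's actual flow implementation: the {children} slot in the node template, and the fill and flow functions, are assumptions introduced for the example. The sketch recurses over the nesting for simplicity and does no HTML escaping.

```javascript
var ds = {
  s: {
    root:  { music: 1 },
    music: { bowie: 1, clapton: 1 }
  },
  r: {
    music:   { name: 'Music', color: 'black' },
    bowie:   { name: 'David Bowie', color: 'black' },
    clapton: { name: 'Eric Clapton', color: 'black' }
  }
};

// Assumption: a {children} slot marks where nested markup goes.
var NODE_TEMPLATE =
  '<DIV style="border: 2px solid {color}; padding: 10px">{children}</DIV>';
var LEAF_TEMPLATE = '<P><SPAN style="color:{color}">{name}</SPAN></P>';

// Replace each {key} with values[key]; unknown keys are left untouched.
function fill(template, values) {
  return template.replace(/\{(\w+)\}/g, function (match, key) {
    return Object.prototype.hasOwnProperty.call(values, key)
      ? values[key]
      : match;
  });
}

// Render an id: leaves use the leaf template, nodes wrap their children.
function flow(id) {
  var record = ds.r[id];
  if (!ds.s[id]) {
    return fill(LEAF_TEMPLATE, record);
  }
  var inner = '';
  for (var childId in ds.s[id]) {
    inner += flow(childId);
  }
  return fill(NODE_TEMPLATE, {
    name: record.name,
    color: record.color,
    children: inner
  });
}

console.log(flow('music'));
```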

Page 26: Datasets by David Semeria

LM Datasets 26

Demo 1

Page 27: Datasets by David Semeria

LM Datasets 27

Data Definitions

EXAMPLE DEFINITIONS

Age
    type          integer
    minVal        0
    maxVal        150

Name
    type          string
    minLen        1
    maxLen        50
    canBeNumeric  false
    regex         (\w| )*
    function      checkName
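
A rough sketch of how such definitions could drive validation is shown below. The property names mirror the slide, but the validate function, the check key, the anchoring of the regex to the whole value, and the body of checkName are all assumptions made for the example.

```javascript
var definitions = {
  age: { type: 'integer', minVal: 0, maxVal: 150 },
  name: {
    type: 'string',
    minLen: 1,
    maxLen: 50,
    canBeNumeric: false,
    regex: /^(\w| )*$/, // assumption: the pattern must match the whole value
    // Hypothetical custom check; the slide only names the function.
    check: function checkName(value) {
      return value.trim().length > 0;
    }
  }
};

// Validate a value against a named definition; returns true or false.
function validate(defName, value) {
  var def = definitions[defName];
  if (!def) {
    return false;
  }
  if (def.type === 'integer') {
    return typeof value === 'number' && isFinite(value) &&
      Math.floor(value) === value &&
      value >= def.minVal && value <= def.maxVal;
  }
  if (def.type === 'string') {
    if (typeof value !== 'string') { return false; }
    if (value.length < def.minLen || value.length > def.maxLen) { return false; }
    if (def.canBeNumeric === false && /^\d+$/.test(value)) { return false; }
    if (def.regex && !def.regex.test(value)) { return false; }
    if (def.check && !def.check(value)) { return false; }
    return true;
  }
  return false;
}

console.log(validate('age', 42));               // true
console.log(validate('age', 151));              // false
console.log(validate('name', 'Paolo Maldini')); // true
console.log(validate('name', '12345'));         // false
```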

Page 28: Datasets by David Semeria

LM Datasets 28

Inheritance

PEOPLE PLACES THINGS ......

BASIC INFO

DETAILED INFO EMAIL INFO

DETAILED & EMAIL INFO

Page 29: Datasets by David Semeria

LM Datasets 29

Inheritance Across Root Types

PEOPLE / BASIC INFO / DETAILED INFO

SERVICE / TWITTER / TWITTER INFO

TWITTER USER is a sub-type of both:

SERVICE / TWITTER / TWITTER INFO

PEOPLE / BASIC INFO
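
One way to picture a type with two parents is to merge the fields of every ancestor, as in the sketch below. The type ids, the field names (name, handle, followers) and the resolveFields function are invented for illustration and are not taken from the talk.

```javascript
// Each type lists its parents and the fields it adds itself.
var types = {
  'people':                 { parents: [], fields: {} },
  'people/basicInfo':       { parents: ['people'], fields: { name: 'string' } },
  'service':                { parents: [], fields: {} },
  'service/twitter':        { parents: ['service'], fields: {} },
  'service/twitter/info':   { parents: ['service/twitter'],
                              fields: { handle: 'string' } },
  'twitterUser':            { parents: ['service/twitter/info', 'people/basicInfo'],
                              fields: { followers: 'integer' } }
};

// Collect inherited fields first, then apply the type's own fields,
// so that a sub-type can override what it inherits.
function resolveFields(typeId) {
  var type = types[typeId];
  var fields = {};
  for (var i = 0; i < type.parents.length; i++) {
    var inherited = resolveFields(type.parents[i]);
    for (var key in inherited) {
      fields[key] = inherited[key];
    }
  }
  for (var own in type.fields) {
    fields[own] = type.fields[own];
  }
  return fields;
}

console.log(resolveFields('twitterUser'));
// { handle: 'string', name: 'string', followers: 'integer' }
```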

Page 30: Datasets by David Semeria

LM Datasets 30

Inheritance

Demo 2

Page 31: Datasets by David Semeria

LM Datasets 31

Normalization

Just like in the relational model, Dataset normalization means we don't store the same information twice.

Page 32: Datasets by David Semeria

LM Datasets 32

Viewsets and Recordsets

[Diagram: on the SERVER, VIEWSET A and VIEWSET B hold sparse refs to records in RECORD SET 1 and RECORD SET 2]

Page 33: Datasets by David Semeria

LM Datasets 33

Demo 3

SERVER: RECORD SET FOOTBALLERS

View sets: VS - LIVERPOOL, VS - MILAN, VS - DREAM TEAM

Windows: LIVERPOOL, MILAN #1, MILAN #2, DREAM TEAM
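
The arrangement in this demo can be approximated in a few lines: each record is stored once in a record set, and each view set holds only ids that refer to it, so a change made once shows up in every view. The object layout and the namesIn function below are assumptions for illustration, not the library's actual structures.

```javascript
// One record set: each footballer is stored exactly once.
var footballers = {
  gerrard: { name: 'Steven Gerrard' },
  maldini: { name: 'Paolo Maldini' }
};

// View sets hold references (ids) only, never copies of the records.
var viewSets = {
  liverpool: { gerrard: 1 },
  milan:     { maldini: 1 },
  dreamTeam: { gerrard: 1, maldini: 1 }
};

// Resolve a view set's ids against the shared record set.
function namesIn(viewSetId) {
  var names = [];
  for (var id in viewSets[viewSetId]) {
    names.push(footballers[id].name);
  }
  return names;
}

// Update the record once; both views that reference it see the change.
footballers.gerrard.name = 'Steven G. Gerrard';

console.log(namesIn('liverpool')); // [ 'Steven G. Gerrard' ]
console.log(namesIn('dreamTeam')); // [ 'Steven G. Gerrard', 'Paolo Maldini' ]
```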

Page 42: Datasets by David Semeria

LM Datasets 42

Summary

➔ Don't hide your data in objects

➔ APIs can be an obstacle (representation)

➔ Above all, KEEP IT GENERIC !!

Questions are welcome:

[email protected]

@hymanroth

