Clustered Serialization with Fuel

Preview:

DESCRIPTION

Presentation of the research paper entitled "Clustered Serialization with Fuel" in ESUG International Workshop on Smalltalk Technologies (IWST 2011), Edinburgh, Scotland.

Citation preview

RMod

1

Mariano Martinez Peckmarianopeck@gmail.com

http://marianopeck.wordpress.com/

Monday, August 22, 2011

THANKS a lot!!!

2

SummerTalk 2011

Student: Martin DiasMentor: Mariano Martínez Peck

Monday, August 22, 2011

Object references

3

Monday, August 22, 2011

Object graph

4

Monday, August 22, 2011

5

An objet graph serializer.

Monday, August 22, 2011

Serialize

6

Input: an object graph

Output: stream of bytes

Monday, August 22, 2011

Materialize(deserialize)

7

Output: an object graph

Input: stream of bytes

Monday, August 22, 2011

Once serialized...

8

Database File Memory Socket

Stream of bytes

Monday, August 22, 2011

Fuel’s main goals

Provide fast object serialization and materialization.

Be flexible and easy to customize.

Have a good OO design, well tested and benchmarked.

No need of special support from the VM.

Be a general purpose serializer.

Allow tools to be built on top of Fuel.9

Monday, August 22, 2011

Key features Fast serialization and materialization.

Class reshape support.

Serialization of any kind of object.

Cycles support.

Global objects references.

Buffered writing.

Support for some “hook methods”.10

Monday, August 22, 2011

Key Characteristics

Pickle format.

Objects grouped in clusters.

Analysis phase before writing.

Stack over recursion.

Two phases for writing instances and references.

Iterative graph recreation.11

Monday, August 22, 2011

Pickle format

12

Invest more time in serialization so that objects can then be materialized much faster.

Monday, August 22, 2011

Grouping objects in clusters

13

“Similar” objects (they share writing/loading information) are grouped together in clusters. The most common case, yet not the

only one, takes place when a class is a cluster for its instances.

Monday, August 22, 2011

14

Monday, August 22, 2011

14

Each jar has a specific type of element

Monday, August 22, 2011

14

Each jar has a specific type of element

Jars are in order

Monday, August 22, 2011

14

Label: - What’s inside? - How much?Each jar has

a specific type of element

Jars are in order

Monday, August 22, 2011

14

Label: - What’s inside? - How much?

Different sizes and different amounts of elements

Each jar has a specific type of element

Jars are in order

Monday, August 22, 2011

14

Label: - What’s inside? - How much?

Different sizes and different amounts of elements

Each jar has a specific type of element

Jars are in order

s/jar/cluster

Monday, August 22, 2011

14

Label: - What’s inside? - How much?

Different sizes and different amounts of elements

Each jar has a specific type of element

Jars are in order

s/jar/cluster

Monday, August 22, 2011

14

Label: - What’s inside? - How much?

Different sizes and different amounts of elements

Jars are in order

s/jar/cluster

Each cluster has a specific type of object

Monday, August 22, 2011

14

Label: - What’s inside? - How much?

Different sizes and different amounts of elements

Jars are in order

s/jar/cluster

Each cluster has a specific type of object

Monday, August 22, 2011

14

Label: - What’s inside? - How much?

Different sizes and different amounts of elements

s/jar/cluster

Each cluster has a specific type of object

Clusters are in order

Monday, August 22, 2011

14

Label: - What’s inside? - How much?

Different sizes and different amounts of elements

s/jar/cluster

Each cluster has a specific type of object

Clusters are in order

Monday, August 22, 2011

14

Different sizes and different amounts of elements

s/jar/cluster

Each cluster has a specific type of object

Clusters are in order

Label: - Cluster ID - Amount of

objects

Monday, August 22, 2011

14

Different sizes and different amounts of elements

s/jar/cluster

Each cluster has a specific type of object

Clusters are in order

Label: - Cluster ID - Amount of

objects

Monday, August 22, 2011

14

s/jar/cluster

Each cluster has a specific type of object

Clusters are in order

Label: - Cluster ID - Amount of

objects

Different sizes and different amounts of

objects

Monday, August 22, 2011

Pickle format basic

15

Stream

Monday, August 22, 2011

Pickle format basic

15

Stream

Monday, August 22, 2011

Pickle format basic

15

Stream

Monday, August 22, 2011

Pickle format basic

15

Stream

Monday, August 22, 2011

16

Why the pickle format is so fast in materialization?

Standard serializers Fuel pickle format

Monday, August 22, 2011

16

Why the pickle format is so fast in materialization?

Standard serializers Fuel pickle format

Recursive

materialization Iterative

materialization

Monday, August 22, 2011

Pickle advantages

Batch/Bulk/Iterative materialization.

Efficient since types are stored and fetch only once.

Fast because at materialization we know the size of everything.

The generated stream is smaller.

More next....

17

Monday, August 22, 2011

There is no silver bullet...

18

Fast serialization

(without pickle)

Slow serialization(with pickle)

Monday, August 22, 2011

Fuel requires

Traversing the object graph.

Mapping each object to a specific cluster.

19

This is done in a phase before serialization called “Analysis”.

Monday, August 22, 2011

Analysis phase

20

cluster1 aPointcluster2 x y

......

... ...

key (a cluster) value (a set)

1) Traverse (#fuelAccept: aVisitor)2) Fill dictionary(#fuelSerializer)

Monday, August 22, 2011

Serialization

21

key (a cluster) value (a set)

serialize: anObject on: aStreammaterializeFrom: aStream

IDCluster

A cluster defines how its objects are serialized and materialized.

cluster1 aPointcluster2 x y

......

... ...

Monday, August 22, 2011

Stack over recursion

22

To traverse the object graph, Fuel uses a custom stack implementation rather than a recursion.

Monday, August 22, 2011

Basic stepsSerialization

1. Analyze.2. Serialize header.3. Serialize instances.4. Serialize references.5. Serialize root.

Materialization1. Materialize header.2. Materialize instances.3. Materialize references.4. Materialize root.

23

Monday, August 22, 2011

Fuel for software(so far)

Moose export utility.

SandstoneDB persistence.

Pier kernel persistence.

Newspeak language.

Marea (my own research project!).

24

Monday, August 22, 2011

Future Work

Continue efforts on performance optimization.

Create a tool for loading class and trait packages.

Support user-defined Singletons.

Fast statistics/brief info extraction of a stored graph.

Partial loading of a stored graph.25

Monday, August 22, 2011

Future work 2

Enable to deploy serialization and materialization behavior independently.

Support object replacement for serialization and materialization.

Allow cycle detections to be disabled.

Partial loading.

26

Monday, August 22, 2011

Serialization of primitive objects

27

Memory based stream

Monday, August 22, 2011

Serialization of primitive objects

27

Memory based stream

StOMPSRPFuel

Monday, August 22, 2011

Serialization of primitive objects

27

Memory based stream File based stream

StOMPSRPFuel

Monday, August 22, 2011

Serialization of primitive objects

27

Memory based stream File based stream

StOMPSRPFuel

FuelSRPStOMP

Monday, August 22, 2011

28

Materialization of primitive objects

Monday, August 22, 2011

28

Fuel

ImageSegment

StOMP

Materialization of primitive objects

Monday, August 22, 2011

Non primitive objects

29

Monday, August 22, 2011

Non primitive objects

29

StOMPSRPFuel

Monday, August 22, 2011

Non primitive objects

29

StOMPSRPFuel

Monday, August 22, 2011

Non primitive objects

29

FuelImageSegmentStOMP

StOMPSRPFuel

Monday, August 22, 2011

Excellent performance without special support from VM and good OO design.

Conclusionfor us

31

Monday, August 22, 2011

Fuel is a vehicle. It is infrastructure. You can build cool stuff on top of it.

Conclusionfor YOU

32

Monday, August 22, 2011

Thanks!

RMod

Mariano Martinez Peckmarianopeck@gmail.com

http://marianopeck.wordpress.com/

Monday, August 22, 2011

34

Concrete example

Monday, August 22, 2011

35

Monday, August 22, 2011

Analysis phase

3620

cluster1 aRectanglecluster2 anOrigin aCornercluster3 10

key (a cluster) value (a IdentitySet)

20 30 40

Analysis phase

Monday, August 22, 2011

Serializationinstances

phase

cluster1 aRectangle

cluster2 anOrigin aCorner

cluster3 10 20 30 40

Monday, August 22, 2011

Serializationinstances

phase

cluster1 aRectangle

cluster2 anOrigin aCorner

cluster3 10 20 30 40

Monday, August 22, 2011

Serializationinstances

phase

cluster1 aRectangle

cluster2 anOrigin aCorner

cluster3 10 20 30 40

1 Rectangle 2 origi corne 1

instance variable names

instance count

instance variable count

class name

cluster ID

Monday, August 22, 2011

Serializationinstances

phase

cluster1 aRectangle

cluster2 anOrigin aCorner

cluster3 10 20 30 40

1 Rectangle 2 origi corne 1

instance variable names

instance count

instance variable count

class name

cluster ID

Monday, August 22, 2011

Serializationinstances

phase

cluster1 aRectangle

cluster2 anOrigin aCorner

cluster3 10 20 30 40

1 Rectangle 2 origi corne 1

1 Point 2 x y 2

instance variable names

instance count

instance variable count

class name

cluster ID

Monday, August 22, 2011

Serializationinstances

phase

cluster1 aRectangle

cluster2 anOrigin aCorner

cluster3 10 20 30 40

1 Rectangle 2 origi corne 1

1 Point 2 x y 2

54 10 20 30 40

instance variable names

instance count

instance variable count

class name

cluster ID

Monday, August 22, 2011

cluster1 aRectangle

cluster2 anOrigin aCorner

cluster3 10 20 30 40

aRectangle anOrigiaCorner20 40 10 30instancesIndex

(IdentityDictionary)

1 5 7 3 2 4 6

54 10 20 30 40

2

Serializationreferences

phase

instances

references

Monday, August 22, 2011

cluster1 aRectangle

cluster2 anOrigin aCorner

cluster3 10 20 30 40

aRectangle anOrigiaCorner20 40 10 30instancesIndex

(IdentityDictionary)

1 5 7 3 2 4 6

54 10 20 30 40

2

2

3

Serializationreferences

phase

instances

references

Monday, August 22, 2011

cluster1 aRectangle

cluster2 anOrigin aCorner

cluster3 10 20 30 40

aRectangle anOrigiaCorner20 40 10 30instancesIndex

(IdentityDictionary)

1 5 7 3 2 4 6

54 10 20 30 40

2

2

3

4 5 6 7

Serializationreferences

phase

instances

references

Monday, August 22, 2011

cluster1 aRectangle

cluster2 anOrigin aCorner

cluster3 10 20 30 40

aRectangle anOrigiaCorner20 40 10 30instancesIndex

(IdentityDictionary)

1 5 7 3 2 4 6

54 10 20 30 40

2

2

3

4 5 6 7

Serializationreferences

phase

instances

references

Monday, August 22, 2011

2

2

3

4 5 6 7

final stream

1 Rectangle 2 origi corne 1

1 Point 2 x y 2

54 10 20 30 40

{instances

{references

Monday, August 22, 2011

2

2

3

4 5 6 7

final stream

1 Rectangle 2 origi corne 1

1 Point 2 x y 2

54 10 20 30 40

{instances

{references

1{trailer

7 3{header clustersCount

objectsCounts

rootMonday, August 22, 2011

2

Materialization

instances {2 3

4 5 6 7

1 Rectangle 2 origin corner 1

1 Point 2 x y 2

5 4 1 2 3 4

{references

1{trailer

7 3{header clustersCount

objectsCounts

root

Monday, August 22, 2011

41

Monday, August 22, 2011