History, Architecture, and Implementation of the CLR Serialization and Formatter Classes Peter de Jong April 24, 2003

Page 1

History, Architecture, and Implementation of the CLR Serialization and Formatter Classes

Peter de Jong
April 24, 2003

Page 2

History

  • J++ DCOM: 1997
  • J++ SOAP: 1998
  • CLR .Net Remoting: Spring 1999
  • CLR Serialization Classes: Spring 1999
  • CLR SoapFormatter: Spring 1999
  • CLR BinaryFormatter: December 1999
  • CLR V1: January 2002

Page 3

J++ Soap

  • Original Soap Spec (Bob Atkinson) 1997
  • Protocol
      • HTTP
      • Bi-Directional: "Give me a call", a server callback using the response from a hanging HTTP request
  • XML
      • No namespaces, no xsd
  • RPC
      • Soap Header root for Soap Headers and parameter graph
      • No Envelope
  • J++ Proxy/Stub for serialization/deserialization of Interface parameters

[Diagram labels: HttpServer, Client, Soap Root, Parameters, Soap Headers]

Page 4

CLR Soap

  • Soap .9 spec
      • Section 5 specifies how to map objects
      • Namespaces, no xsd
      • Soap Envelope
  • Rpc: rooted Headers and Parameters
  • Serialization: root of object graph
  • Most annoying part
      • Headers are really an array of objects
      • For XML beauty, specified as XML field elements
      • Led to specification of the root attribute

Page 5

Soap Moving Target

  • Original Soap
  • Soap .9
  • Soap as a cottage industry
      • Easy to produce a subset of Soap
      • Microsoft had 5 or so implementations
      • Individuals and companies set up Soap Web sites
  • Soap Interop Meeting (IBM 2000-2001)
      • Soap application benchmarks
      • Led to Web sites which implemented the applications
      • ~15 sites to test interoperability
  • Soap 1.0
      • Standards effort which included many of the Soap producers
      • Envelope, body: no header or parameter root
      • Moved Section 5 to an appendix
  • Soap 1.1
      • Nest top level object

Page 6

Serialization Classes

Page 7

Architecture

[Diagram labels: BinaryFormatter, SoapFormatter, Binary Stream, Soap XML Stream, Serializer/Parser, Object Reader/Object Writer, Serialization Classes]

Page 8

Serialization Classes

  • Designed to make it easy to produce Formatters
      • True for a subset of the CLR
      • False for the complete CLR object model
  • SoapFormatter and BinaryFormatter are the only Serialization/Deserialization engines which support the complete CLR model

Page 9

Serialization Classes Services

  • System controlled serialization (Serializable, NotSerialized)
  • User controlled serialization (ISerializable)
  • Type substitution (ISerializationSurrogate, ISurrogateSelector)
  • Object Substitution (IObjectReference)
  • Object Sharing
  • Fixups

Page 10

System Controlled Serialization

  • Serialization
      • Serializable custom attribute
      • NotSerialized custom attribute
      • public, internal, and private fields are serialized
  • Deserialization
      • Creates an uninitialized object
      • Populates the fields
      • Constructor is not called
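The deserialization steps on this slide (allocate an uninitialized object, populate its fields, never run the constructor) can be sketched outside the CLR. The following is a Python analogy only, not CLR code: `Point`, `capture_fields`, and `restore` are hypothetical names, and `object.__new__` stands in for the role that `FormatterServices.GetUninitializedObject` plays in .NET.

```python
class Point:
    constructor_calls = 0  # counts how often __init__ actually runs

    def __init__(self, x, y):
        Point.constructor_calls += 1
        self.x = x
        self._y = y  # "private" by convention; still captured below


def capture_fields(obj):
    """Serialization side: copy every instance field, whatever its visibility."""
    return dict(vars(obj))


def restore(cls, fields):
    """Deserialization side: allocate without calling __init__, then fill fields."""
    obj = object.__new__(cls)    # uninitialized instance, constructor skipped
    obj.__dict__.update(fields)  # populate the fields directly
    return obj


original = Point(3, 4)
copy = restore(Point, capture_fields(original))
```

After this runs, `Point.constructor_calls` is still 1: only the original was constructed, which is why invariants established in a constructor cannot be relied on after system controlled deserialization.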

Page 11

User Controlled Serialization

  • Implements ISerializable
  • Serialization: GetObjectData gives name/value pairs to the serializer
  • Deserialization: a constructor is used to retrieve the name/value pairs and populate the object
      • The constructor is not in the interface, so the compiler can’t check whether it is present
      • The constructor isn’t inherited, so each subclass needs its own constructor
      • An earlier version used SetObjectData instead of a constructor
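The name/value-pair contract described above might look roughly like this. It is a Python sketch under stated assumptions: `SerializationInfo` here is a hypothetical dictionary-backed stand-in, `Span` and `round_trip` are invented for illustration, and Python (unlike the CLR) does inherit constructors, so the subclass caveat from the slide is not reproduced.

```python
class SerializationInfo:
    """Hypothetical stand-in for the bag of name/value pairs."""

    def __init__(self):
        self._values = {}

    def add_value(self, name, value):
        self._values[name] = value

    def get_value(self, name):
        return self._values[name]


class Span:
    def __init__(self, start=0, end=0, info=None):
        # Deserialization path: the constructor pulls the pairs back out.
        if info is not None:
            start = info.get_value("start")
            end = start + info.get_value("length")
        self.start, self.end = start, end

    def get_object_data(self, info):
        # Serialization path: the object chooses which pairs to emit,
        # and they need not mirror its fields one to one.
        info.add_value("start", self.start)
        info.add_value("length", self.end - self.start)


def round_trip(obj):
    info = SerializationInfo()
    obj.get_object_data(info)
    return type(obj)(info=info)
```

Note that the stream stores `length` rather than `end`: the object, not the serializer, decides its wire shape, which is the point of user controlled serialization.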

Page 12

Surrogates

  • Type substitution
      • Objects of a specified type are replaced by a new object of a different type

[Diagram labels: MarshalByRefObject, ObjRef, Proxy]

Page 13

Object Substitution

  • IObjectReference
      • GetRealObject method returns the deserialized object
      • When the object is returned, it and its descendants are completely deserialized
  • Used extensively for returning singleton system objects
      • Types, Delegates
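A minimal sketch of the substitution idea, in Python rather than CLR code: a placeholder travels in the stream, and once it is fully deserialized the deserializer asks it for the real object. `Registry`, `RegistryRef`, and `finish_deserialization` are hypothetical names chosen for this example.

```python
class Registry:
    """A singleton that deserialization must never duplicate."""
    instance = None


Registry.instance = Registry()


class RegistryRef:
    """Placeholder written to the stream in place of the singleton."""

    def get_real_object(self):
        return Registry.instance


def finish_deserialization(obj):
    # Once obj is completely populated, swap in the real object if it offers one.
    get_real = getattr(obj, "get_real_object", None)
    return get_real() if callable(get_real) else obj
```

Every reference that pointed at the placeholder ends up pointing at the one existing `Registry.instance`, which is the behavior the slide describes for singleton system objects.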

Page 14

Object Fixup

  • Reference before object
  • Serialization swizzles objref to integer
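The two ideas on this slide can be illustrated with a toy graph of dictionaries. This is a Python sketch, not the formatter's algorithm: the record layout, the `("ref", id)` tagging, and the function names are all assumptions made for the example. Serialization replaces each object reference with an integer id; deserialization records a fixup whenever a reference is read before its target object.

```python
def serialize(root):
    """Flatten a graph of dict nodes into records; references become integer ids."""
    ids, records, pending = {id(root): 1}, [], [root]
    while pending:
        node = pending.pop(0)
        fields = {}
        for name, value in node.items():
            if isinstance(value, dict):              # an object reference
                if id(value) not in ids:
                    ids[id(value)] = len(ids) + 1
                    pending.append(value)
                fields[name] = ("ref", ids[id(value)])  # swizzled to an integer
            else:
                fields[name] = ("val", value)
        records.append((ids[id(node)], fields))
    return records


def deserialize(records):
    objects, fixups = {}, []
    for obj_id, fields in records:
        node = objects[obj_id] = {}
        for name, (kind, payload) in fields.items():
            if kind == "val":
                node[name] = payload
            elif payload in objects:                 # target already read
                node[name] = objects[payload]
            else:                                    # reference before object
                fixups.append((node, name, payload))
    for node, name, target_id in fixups:             # resolve once all are read
        node[name] = objects[target_id]
    return objects[1]
```

A cyclic graph such as `a = {"name": "a"}; b = {"name": "b", "peer": a}; a["peer"] = b` round-trips with its cycle and object sharing intact, because the forward reference from `a` to `b` is patched only after `b` exists.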

Page 15

Object Fixup Complications

  • Value classes must be fixed up before they are boxed
  • ISerializable: directly referenced object graphs must be deserialized one level
  • IObjectReference: object graph must be completely deserialized

Page 16

IDeserializationCallback

  • Used to signal that deserialization is complete
  • E.g. Hashtable can’t create hashes until all the objects are deserialized
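The Hashtable case can be sketched as follows. This is a Python analogy with hypothetical names (`KeyedTable`, `Key`, `deserialize_table`), not the CLR Hashtable: keys are created empty and filled in later, so hashing them early would give the wrong bucket, and the table defers building its buckets until a completion callback fires.

```python
class Key:
    def __init__(self):
        self.name = None  # filled in after the object is created

    def __hash__(self):
        return hash(self.name)

    def __eq__(self, other):
        return isinstance(other, Key) and self.name == other.name


class KeyedTable:
    def __init__(self):
        self.pairs = []      # raw (key, value) pairs as read from the stream
        self.buckets = None  # built only when deserialization is complete

    def on_deserialization(self):
        self.buckets = {key: value for key, value in self.pairs}


def deserialize_table(raw_pairs):
    table = KeyedTable()
    callbacks = [table.on_deserialization]
    created = []
    for name, value in raw_pairs:
        key = Key()                     # key exists but is still empty
        table.pairs.append((key, value))
        created.append((key, name))
    for key, name in created:           # fields populated later, like fixups
        key.name = name
    for callback in callbacks:          # signal: the whole graph is complete
        callback()
    return table
```

Had `buckets` been filled while the keys were still empty, every key would have hashed as `None`; deferring the work to the callback avoids that.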

Page 17

Formatter Classes

Page 18

IFormatter

  • Object Graph
      • Serialize(Stream s, Object graph)
      • Object Deserialize(Stream s)
  • Properties
      • ISurrogateSelector
      • SerializationBinder (type substitution when deserializing)
      • StreamingContext
          • CrossProcess, CrossMachine, File, Persistence, Remoting, Other, Clone, CrossAppDomain, All
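The StreamingContext states listed above are bit flags, so a type can test where a stream is headed before deciding what to write. The sketch below mirrors them in Python; the numeric values follow the .NET `StreamingContextStates` enumeration, while `may_share_live_handle` is a purely hypothetical policy added for illustration.

```python
from enum import IntFlag


class StreamingContextStates(IntFlag):
    CROSS_PROCESS = 0x01
    CROSS_MACHINE = 0x02
    FILE = 0x04
    PERSISTENCE = 0x08
    REMOTING = 0x10
    OTHER = 0x20
    CLONE = 0x40
    CROSS_APP_DOMAIN = 0x80
    ALL = 0xFF


def may_share_live_handle(context):
    """Hypothetical policy: a live handle only survives same-process copies."""
    same_process = (StreamingContextStates.CLONE
                    | StreamingContextStates.CROSS_APP_DOMAIN)
    # True only if at least one flag is set and none lies outside same_process.
    return int(context) != 0 and (int(context) & ~int(same_process)) == 0
```

For example, `may_share_live_handle(StreamingContextStates.CLONE)` is true, while a `FILE` or `ALL` context is rejected because the stream may leave the process.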

Page 19

IRemotingFormatter - RPC

  • Serialize(Stream s, Object graph, Header[] headers)
      • Two serializations
          • Graph (parameter array)
          • Headers (Header array)
  • Object Deserialize(Stream s, HeaderHandler handler)
      • Delegate Object HeaderHandler(Header[] headers)
      • Headers are handed to the delegate; the delegate returns the object into which the parameters are deserialized

Page 20

Formatter Property Enums

  • FormatterTypeStyle
      • TypesWhenNeeded: types outputted for
          • Arrays of Objects
          • Object fields, inheritable fields
          • ISerializable
      • TypesAlways
          • Version compatibility
          • MemberInfo -> ISerializable
  • FormatterAssemblyStyle
      • Simple: no version information
      • Full: full assembly name
  • Defaults
      • Remoting: Serialization Full, Deserialization Simple
      • Non-Remoting: Serialization Full, Deserialization Full

Page 21

SoapFormatter additional Properties

  • ISoapMessage: alternate way of specifying Parameter/Header serialization
      • ParamNames
      • ParamValues
      • ParamTypes
      • MethodName
      • XmlNameSpace
      • Header[] headers

Page 22

BinaryFormatter Binary Stream Format

  • Design
      • Primitive types are written directly
      • Array of primitives: bytes are copied directly from the CLR (100x faster than using reflection)
      • All other types are written as records
  • Basic record types
      • SerializedStreamHeader, Object, ObjectWithMap, ObjectWithMapAssemId, ObjectWithMapTyped, ObjectWithMapTypedAssemId, ObjectString, Array, MemberPrimitiveTyped, MemberReference, ObjectNull, MessageEnd, Assembly
  • Record types added later for performance
      • ObjectNullMultiple256, ObjectNullMultiple, ArraySinglePrimitive, ArraySingleObject, ArraySingleString, CrossAppDomainMap, CrossAppDomainString, CrossAppDomainAssembly, MethodCall, MethodReturn
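To show why records such as ObjectNullMultiple256 and ObjectNullMultiple help, here is a small Python sketch of run-length encoding consecutive nulls. The slides do not give the wire layout, so everything concrete here is an assumption for illustration: the tag byte values, the one-byte count for runs under 256, and the four-byte little-endian count for longer runs.

```python
import struct

# Assumed one-byte record tags; the slides do not list the numeric values.
OBJECT_NULL = 0x0A
OBJECT_NULL_MULTIPLE_256 = 0x0D
OBJECT_NULL_MULTIPLE = 0x0E


def write_nulls(run_length):
    """Encode a run of consecutive null references as a single record."""
    if run_length < 1:
        raise ValueError("run_length must be at least 1")
    if run_length == 1:
        return bytes([OBJECT_NULL])
    if run_length < 256:
        # Short run: tag plus a one-byte count.
        return bytes([OBJECT_NULL_MULTIPLE_256, run_length])
    # Long run: tag plus a four-byte little-endian count.
    return bytes([OBJECT_NULL_MULTIPLE]) + struct.pack("<i", run_length)
```

Under these assumptions an array of 1000 null references costs 5 bytes instead of 1000 single-null records, which is the kind of saving a performance-motivated record type is after.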

Page 23

Serialization

[Diagram: objects numbered 1 to 10]

Page 24

Serialization Complications

  • MethodCall/MethodReturn
  • CrossAppDomain
  • Determine when Type information is needed
  • Value classes are nested; non-value classes are top level
  • Arrays: mix of jagged and multi-dimensional, e.g. [][,,][]
  • Array of primitives copied to stream as a collection of bytes
  • Surrogates
  • ISerializable

Page 25

Deserialization

[Diagram: objects numbered 1 to 10]

Fixups

  • Process 1, fixups 2, 3, 4
  • Process 2, fixups 5, 6
  • Process 3, fixups 7
  • Process 4, fixups 8, 9

Page 26

Deserialization Binary Parsing

  • Record headers specify what is coming next in the stream
  • Primitives do not have headers, so the parser needs to use previously encountered record headers as a map for reading primitives
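The header-as-map idea can be sketched with a toy format. This is a Python illustration, not the BinaryFormatter layout: the one-byte type codes, the `FORMATS` table, and both function names are assumptions. The header lists the member types once; the primitive values then follow with no tags of their own, so the reader can only interpret them by walking the header it saw earlier.

```python
import struct

# Assumed type table; position in the table doubles as the one-byte type code.
FORMATS = {"int32": "<i", "double": "<d", "bool": "<?"}
TYPE_NAMES = list(FORMATS)


def write_object(field_types, values):
    """Header: member count plus one type code each. Body: untagged primitives."""
    header = bytes([len(field_types)]) + bytes(
        TYPE_NAMES.index(t) for t in field_types)
    body = b"".join(struct.pack(FORMATS[t], v)
                    for t, v in zip(field_types, values))
    return header + body


def read_object(stream):
    count = stream[0]
    types = [TYPE_NAMES[code] for code in stream[1:1 + count]]
    offset, values = 1 + count, []
    for t in types:                       # the header is the map for the body
        fmt = FORMATS[t]
        (value,) = struct.unpack_from(fmt, stream, offset)
        values.append(value)
        offset += struct.calcsize(fmt)
    return types, values
```

Writing an int32, a double, and a bool this way takes 17 bytes (4 for the header, 13 for the values), and without the header the 13 value bytes would be uninterpretable.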

Page 27

Deserialization Complications

  • Remoting MethodCall/MethodReturn optimization
  • CrossAppDomain
  • Value Type
  • ISerializable
  • Surrogate

Page 28

Retrospective

Page 29

What Went Wrong - 1

  • Beta1 gave the GC a workout
      • Object oriented style is dangerous for plumbing. Lots of objects created.
  • Solution
      • Use object singletons (or a fixed number)
      • Object pools
      • Start with larger storage for growing objects such as ArrayLists
      • Special cases: primitive parameters; serialization classes aren’t used, so aren’t initialized

Page 30

What Went Wrong - 2

  • Performance is never good enough
  • Reflection is slow
      • Boxes value types
      • Interpretive
  • Serialization classes are slow
      • Boxes value types
      • Keeps lots of state around in resizable arrays

Page 31

What Went Wrong - 3

  • Formatters are slow
      • Object type and field information inflates the size of the stream (reflection and versioning requirement)
      • Lots of irregular cases
          • CLR: value types, singletons, transformations
          • Serialization: ISerializable, resolving graph rules
      • Code is more general than it has to be, now we know, but during development the underlying system kept changing
          • CLR object model (variants, reflection, security, BCL, etc.)
          • Serialization model (ISerializable underwent many changes)
          • Soap spec kept changing
          • Binary format changed for perf reasons
      • Fixups used too much: strings and value classes are put in the stream when encountered; object references are put in the stream, with the object coming later
          • Soap 1.2 nests reference objects
          • BinaryFormatter should be changed to nest objects

Page 32

What Went Wrong - 4

  • Why didn’t we use Reflection.Emit
      • 1200 serializations needed to make up the cost
      • Couldn’t serialize private and internal fields
  • BinaryFormatter primitive arrays use array copy rather than reflection
      • 100x faster when the switch was made
  • Cross AppDomain smuggling
      • Primitives and strings bypass the BinaryFormatter, which results in faster times than COM cross process
  • BinaryFormatter prototyped an option to omit type information in the stream
      • 4 byte point class serialized in 10 bytes instead of 125 bytes
  • Future versions of the Formatters will be much faster
      • Improvements to Reflection.Emit
      • Cross AppDomain serialization prototype implemented in the EE

Page 33

What Went Wrong - 5

  • Web Services
      • The BinaryFormatter and SoapFormatter existed before the Web Service classes
      • Serialization, Formatter, and Remoting classes are based on object oriented programming, RPC and COM models
      • Web Services started to gain importance late in the development of the .Net Frameworks
      • Future releases will combine the two models, using the same custom attributes and underlying messaging model
  • SoapFormatter
      • Specify shape of stream to some extent
      • Object WSDL: added additional schema information to WSDL to allow generation of the CLR object model in client proxies
      • Object WSDL is the only way in .Net Frameworks V1 to copy CLR metadata without copying the dll which includes code

Page 34

The Formatters are Great (at least useful)

  • Only way to make a deep copy of an object graph with complete fidelity
  • Integrated with .Net Remoting
  • Combines the CLR Object Model with the Web Services Model
  • Version resilient (at least the attempt is made)
  • Secure
  • Perf isn’t all that bad