History, Architecture, and Implementation of the CLR Serialization and Formatter Classes
Peter de Jong, April 24, 2003
History
  J++ DCOM: 1997
  J++ SOAP: 1998
  CLR .Net Remoting: Spring 1999
  CLR Serialization Classes: Spring 1999
  CLR SoapFormatter: Spring 1999
  CLR BinaryFormatter: December 1999
  CLR V1: January 2002
J++ Soap
  Original Soap Spec (Bob Atkinson), 1997
  Protocol
    HTTP
    Bi-directional
      "Give me a call": server callback using the response from a hanging HTTP request
  XML
    No namespaces, no xsd
  RPC
    Soap Header root for Soap Headers and parameter graph
    No Envelope
  J++ Proxy/Stub for serialization/deserialization of interface parameters
  [Diagram: Client and Server connected over HTTP; Soap Root containing Parameters and Soap Headers]
CLR Soap
  Soap .9 spec
    Section 5 specifies how to map objects
    Namespaces, no xsd
    Soap Envelope
  Rpc: rooted Headers and Parameters
  Serialization: root of object graph
  Most annoying part
    Headers are really an array of objects
    For XML beauty they were specified as XML field elements
    This led to the specification of the root attribute
Soap Moving Target
  Original Soap
  Soap .9
  Soap as a cottage industry
    Easy to produce a subset of Soap
    Microsoft had 5 or so implementations
    Individuals and companies set up Soap Web sites
  Soap Interop Meeting (IBM, 2000-2001)
    Soap application benchmarks
    Led to Web sites which implemented the applications
    ~15 sites to test interoperability
  Soap 1.0
    Standards effort which included many of the Soap producers
    Envelope, body: no header or parameter root
    Moved Section 5 to an appendix
  Soap 1.1
    Nest top level object
Serialization Classes
Architecture
[Diagram: BinaryFormatter (binary stream) and SoapFormatter (Soap XML stream), each with a Serializer and Parser and an Object Writer and Object Reader, layered on the Serialization Classes]
Serialization Classes
  Designed to make it easy to produce Formatters
    True for a subset of the CLR
    False for the complete CLR object model
  SoapFormatter and BinaryFormatter are the only serialization/deserialization engines which support the complete CLR model
Serialization Classes Services
  System controlled serialization (Serializable, NonSerialized)
  User controlled serialization (ISerializable)
  Type substitution (ISerializationSurrogate, ISurrogateSelector)
  Object substitution (IObjectReference)
  Object sharing
  Fixups
System Controlled Serialization
Serialization Serialization Custom Attribute NotSerialized Customer Attribute public, internal, private fields serialized
Deserialization Creates Uninitialized object Populates the fields Constructor is not called
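The deserialization steps on this slide (allocate an uninitialized object, populate its fields, never run the constructor) can be illustrated with a small analogy. The sketch below is Python rather than CLR code, and `Point`, `serialize_fields`, and `deserialize_fields` are hypothetical names chosen for the illustration; it only shows the idea of bypassing the constructor, not how the CLR does it.

```python
class Point:
    """Hypothetical type used only for illustration."""

    constructor_calls = 0  # counts how often __init__ runs

    def __init__(self, x, y):
        Point.constructor_calls += 1
        self.x = x
        self._secret = y  # non-public fields are captured too


def serialize_fields(obj):
    # Capture every instance field, regardless of visibility.
    return dict(vars(obj))


def deserialize_fields(cls, fields):
    # Allocate the object without running __init__, then populate the fields.
    obj = object.__new__(cls)
    obj.__dict__.update(fields)
    return obj


original = Point(1, 2)
copy = deserialize_fields(Point, serialize_fields(original))
```

After this runs, `copy` carries the same field values as `original`, while `Point.constructor_calls` still reflects only the one explicit construction.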
User Controlled Serialization
  Implements ISerializable
  Serialization: GetObjectData gives name/value pairs to the serializer
  Deserialization: a constructor is used to retrieve the name/value pairs and populate the object
    The constructor is not in the interface, so the compiler can’t check whether it is present
    The constructor isn’t inherited, so each subclass needs its own constructor
    An earlier version used SetObjectData instead of a constructor
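The name/value-pair protocol described above can be sketched in a few lines. This is a Python analogy under stated assumptions, not the CLR API: `SerializationInfo` here is a hypothetical stand-in for a bag of named values, `get_object_data` plays the role of GetObjectData, and the `from_info` class method plays the role of the special deserialization constructor.

```python
class SerializationInfo:
    """Hypothetical stand-in for a bag of name/value pairs."""

    def __init__(self):
        self._values = {}

    def add_value(self, name, value):
        self._values[name] = value

    def get_value(self, name):
        return self._values[name]


class Temperature:
    """Hypothetical type that controls its own serialized form."""

    def __init__(self, celsius):
        self.celsius = celsius

    def get_object_data(self, info):
        # Serialization: the object decides which name/value pairs to emit.
        # Here it stores whole kelvins instead of its in-memory field.
        info.add_value("kelvin", self.celsius + 273)

    @classmethod
    def from_info(cls, info):
        # Deserialization: read the pairs back and rebuild the object.
        return cls(info.get_value("kelvin") - 273)


info = SerializationInfo()
Temperature(20).get_object_data(info)
restored = Temperature.from_info(info)
```

The point of the design is that the stream holds whatever the type chooses to write ("kelvin"), which need not match its field layout. Unlike the CLR constructor convention noted above, a Python class method is inherited, so this sketch does not reproduce the per-subclass constructor requirement.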
Surrogates
  Type substitution
    Objects of a specified type are replaced by a new object of a different type
    Example: MarshalByRefObject, ObjRef, Proxy
Object Substitution
  IObjectReference
    GetRealObject method returns the deserialized object
    When the object is returned, it and its descendants are completely deserialized
  Used extensively for returning singleton system objects
    Types, Delegates
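Object substitution for singletons can be illustrated as follows. This is a Python analogy, not CLR code: `Registry`, `RegistryReference`, and `resolve` are hypothetical names, and `get_real_object` stands in for GetRealObject. The placeholder object is what travels in the stream; after deserialization it is swapped for the one real instance.

```python
class Registry:
    """Hypothetical process-wide singleton."""

    instance = None


Registry.instance = Registry()


class RegistryReference:
    """Placeholder written to the stream in place of the singleton."""

    def get_real_object(self):
        # Return the existing singleton instead of a fresh copy.
        return Registry.instance


def resolve(obj):
    # After an object is deserialized, give it a chance to substitute itself.
    getter = getattr(obj, "get_real_object", None)
    return getter() if callable(getter) else obj


resolved = resolve(RegistryReference())
```

Objects that do not offer a substitution method pass through `resolve` unchanged.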
Object Fixup
  Reference before object
  Serialization swizzles objref to integer
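A minimal sketch of the two ideas on this slide: on the way out, object references are swizzled to integer ids; on the way in, a reference may appear before the object it names, so the reader records a fixup and patches it once the object exists. This is illustrative Python using plain dicts as objects, with a hypothetical record layout, and is not the formatters' actual algorithm.

```python
def serialize(root):
    """Flatten a graph of dict nodes into records; references become integer ids."""
    ids = {id(root): 1}   # identity of node -> assigned integer id
    records = []
    pending = [root]
    while pending:
        node = pending.pop(0)
        fields = {}
        for name, value in node.items():
            if isinstance(value, dict):
                if id(value) not in ids:
                    ids[id(value)] = len(ids) + 1
                    pending.append(value)
                fields[name] = ("ref", ids[id(value)])  # swizzled reference
            else:
                fields[name] = ("val", value)
        records.append((ids[id(node)], fields))
    return records


def deserialize(records):
    """Rebuild the graph, deferring references to objects not yet read."""
    objects = {}   # integer id -> rebuilt node
    fixups = []    # (holder, field name, id of the object still to come)
    for obj_id, fields in records:
        node = {}
        objects[obj_id] = node
        for name, (kind, payload) in fields.items():
            if kind == "val":
                node[name] = payload
            elif payload in objects:
                node[name] = objects[payload]
            else:
                fixups.append((node, name, payload))
    for holder, name, target in fixups:
        holder[name] = objects[target]  # patch once the target exists
    return objects[1]


a = {"name": "a"}
b = {"name": "b", "peer": a}
a["peer"] = b                      # cycle: a -> b -> a
restored = deserialize(serialize(a))
```

In the example, the record for `a` refers to id 2 before the record for `b` has been read, so that field is filled in by a fixup; the cycle survives the round trip.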
Object Fixup Complications
  Value classes must be fixed up before being boxed
  Object graphs directly referenced by an ISerializable object must be deserialized one level
  IObjectReference object graph must be completely deserialized
IDeserializationCallback
  Used to signal that deserialization is complete
  E.g. Hashtable can’t create hashes until all the objects are deserialized
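The Hashtable case can be sketched like this: keys are allocated before their fields are filled in, so the table keeps the raw pairs and only builds its hash lookup when it is told the whole graph is complete. This is a Python analogy; `Key`, `Table`, `on_deserialization`, and `deserialize` are hypothetical names, not the CLR types.

```python
class Key:
    """Hypothetical key whose hash depends on a field filled in during deserialization."""

    def __init__(self):
        self.name = None

    def __hash__(self):
        return hash(self.name)

    def __eq__(self, other):
        return isinstance(other, Key) and self.name == other.name


class Table:
    def __init__(self):
        self.pairs = []     # raw (key, value) pairs read from the stream
        self.lookup = None  # built only after deserialization completes

    def on_deserialization(self):
        # Safe to hash now: every key has its final field values.
        self.lookup = dict(self.pairs)


def deserialize(raw_pairs):
    table = Table()
    keys = [Key() for _ in raw_pairs]  # allocated, fields not yet set
    table.pairs = [(key, value) for key, (_, value) in zip(keys, raw_pairs)]
    for key, (name, _) in zip(keys, raw_pairs):
        key.name = name                # fields populated later
    table.on_deserialization()         # signal: the whole graph is complete
    return table


table = deserialize([("a", 1), ("b", 2)])
probe = Key()
probe.name = "b"
```

Had the lookup been built while the keys were still empty, every key would have hashed identically and the later lookups would fail.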
Formatter Classes
IFormatter
  Object Graph
    Serialize(Stream s, Object graph)
    Object Deserialize(Stream s)
  Properties
    ISurrogateSelector
    SerializationBinder (type substitution when deserializing)
    StreamingContext
      CrossProcess, CrossMachine, File, Persistence, Remoting, Other, Clone, CrossAppDomain, All
IRemotingFormatter - RPC
  Serialize(Stream s, Object graph, Header[] headers)
    Two serializations
      Graph (parameter array)
      Headers (Header array)
  Object Deserialize(Stream s, HeaderHandler handler)
    Delegate Object HeaderHandler(Header[] headers)
    Headers are handed to the delegate; the delegate returns the object into which parameters are deserialized
Formatter Property Enums
  FormatterTypeStyle
    TypesWhenNeeded: types are output for
      Arrays of Objects
      Object fields, inheritable fields
      ISerializable
    TypesAlways
      Version compatibility
      MemberInfo -> ISerializable
  FormatterAssemblyStyle
    Simple: no version information
    Full: full assembly name
  Defaults
    Remoting: serialization Full, deserialization Simple
    Non-Remoting: serialization Full, deserialization Full
SoapFormatter additional Properties
  ISoapMessage: alternate way of specifying Parameter/Header serialization
    ParamNames
    ParamValues
    ParamTypes
    MethodName
    XmlNameSpace
    Header[] headers
BinaryFormatter
  Binary Stream Format Design
    Primitive types are written directly
    Array of primitives: bytes are copied directly from the CLR (100x faster than using reflection)
    All other types are written as records
  Basic record types
    SerializedStreamHeader, Object, ObjectWithMap, ObjectWithMapAssemId, ObjectWithMapTyped, ObjectWithMapTypedAssemId, ObjectString, Array, MemberPrimitiveTyped, MemberReference, ObjectNull, MessageEnd, Assembly
  Record types added later for performance
    ObjectNullMultiple256, ObjectNullMultiple, ArraySinglePrimitive, ArraySingleObject, ArraySingleString, CrossAppDomainMap, CrossAppDomainString, CrossAppDomainAssembly, MethodCall, MethodReturn
Serialization
[Diagram: object graph with nodes numbered 1 to 10]
Serialization Complications
  MethodCall/MethodReturn
  CrossAppDomain
  Determine when type information is needed
  Value classes are nested; non-value classes are top level
  Arrays: mix of jagged and multi-dimensional, e.g. [][,,][]
  Array of primitives copied to the stream as a collection of bytes
  Surrogates
  ISerializable
Deserialization
[Diagram: object graph with nodes numbered 1 to 10]
Fixups
  Process 1, fixups 2, 3, 4
  Process 2, fixups 5, 6
  Process 3, fixups 7
  Process 4, fixups 8, 9
Deserialization
  Binary Parsing
    Record headers specify what is coming next in the stream
    Primitives do not have headers, so previously encountered record headers must be used as a map for reading primitives
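The parsing rule above (record headers announce what follows; headerless primitives are read using a map taken from an earlier record) can be sketched with a toy stream. The layout below is hypothetical and is not the BinaryFormatter wire format: record type 1 defines a field map, record type 2 is an object whose primitive values follow with no per-value headers.

```python
import io
import struct

# Hypothetical layout, not the BinaryFormatter wire format:
#   record 1 = type map: map id byte, field count byte, one format char per field
#   record 2 = object:   map id byte, then headerless primitive values
MAP_RECORD, OBJECT_RECORD = 1, 2


def write_stream():
    out = io.BytesIO()
    # Map 7 describes two fields: int32 ("i") and float64 ("d").
    out.write(struct.pack("<BBB", MAP_RECORD, 7, 2) + b"id")
    out.write(struct.pack("<BB", OBJECT_RECORD, 7) + struct.pack("<id", 3, 0.5))
    out.write(struct.pack("<BB", OBJECT_RECORD, 7) + struct.pack("<id", 4, 1.5))
    return out.getvalue()


def read_stream(data):
    stream = io.BytesIO(data)
    maps, objects = {}, []
    while True:
        header = stream.read(1)
        if not header:
            break  # end of stream
        kind = header[0]
        if kind == MAP_RECORD:
            map_id, count = struct.unpack("<BB", stream.read(2))
            maps[map_id] = stream.read(count).decode("ascii")
        elif kind == OBJECT_RECORD:
            (map_id,) = struct.unpack("<B", stream.read(1))
            # The primitives carry no headers; the earlier map says how to read them.
            fmt = "<" + maps[map_id]
            objects.append(struct.unpack(fmt, stream.read(struct.calcsize(fmt))))
        else:
            raise ValueError(f"unknown record type {kind}")
    return objects


decoded = read_stream(write_stream())
```

Writing the map once and reusing it for every object of that shape keeps the per-object overhead to two bytes in this toy layout, at the cost of making the reader stateful.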
Deserialization Complications
  Remoting MethodCall/MethodReturn optimization
  CrossAppDomain
  Value Type
  ISerializable
  Surrogate
Retrospective
What Went Wrong - 1
  Beta1 gave the GC a workout
    Object oriented style is dangerous for plumbing: lots of objects are created
  Solution
    Use object singletons (or a fixed number)
    Object pools
    Start with larger storage for growing objects such as ArrayLists
    Special cases: primitive parameters - serialization classes aren’t used, so aren’t initialized
What Went Wrong - 2
  Performance is never good enough
  Reflection is slow
    Boxes value types
    Interpretive
  Serialization classes are slow
    Boxes value types
    Keeps lots of state around in resizable arrays
What Went Wrong - 3
  Formatters are slow
    Object type and field information inflates the size of the stream (reflection and versioning requirement)
    Lots of irregular cases
      Clr: value types, singletons, transformations
      Serialization: ISerializable, resolving graph rules
    Code is more general than it has to be, now that we know, but during development the underlying system kept changing
      Clr object model (variants, reflection, security, BCL, etc.)
      Serialization model (ISerializable underwent many changes)
      Soap spec kept changing
      Binary format changed for perf reasons
    Fixups used too much: strings and value classes are put in the stream when encountered; object references are put in the stream, with the object coming later
      Soap 1.2 nests reference objects
      BinaryFormatter should be changed to nest objects
What Went Wrong - 4
  Why didn’t we use Reflection.Emit?
    1200 serializations needed to make up the cost
    Couldn’t serialize private and internal fields
  BinaryFormatter primitive arrays use array copy rather than reflection
    100x faster when the switch was made
  Cross AppDomain smuggling
    Primitives and strings bypass the BinaryFormatter, which results in faster times than COM cross process
  BinaryFormatter prototyped an option to omit type information in the stream
    4 byte point class serialized in 10 bytes instead of 125 bytes
  Future versions of the Formatters will be much faster
    Improvements to Reflection.Emit
    Cross AppDomain serialization prototype implemented in the EE
What Went Wrong - 5
  Web Services
    The BinaryFormatter and SoapFormatter existed before the Web Service classes
    Serialization, Formatter, and Remoting classes are based on object oriented programming, RPC and COM models
    Web Services started to gain importance late in the development of the .Net Frameworks
    Future releases will combine the two models, using the same custom attributes and underlying messaging model
  SoapFormatter
    Specify shape of stream to some extent
    Object WSDL: added additional schema information to WSDL to allow generation of the CLR object model in client proxies
    Object WSDL is the only way in .Net Frameworks V1 to copy CLR metadata without copying the dll, which includes code
The Formatters are Great (at least useful)
  Only way to make a deep copy of an object graph with complete fidelity
  Integrated with .Net Remoting
  Combines the CLR Object Model with the Web Services Model
  Version resilient (at least the attempt is made)
  Secure
  Perf isn’t all that bad