COM: A Brief Introduction
Dan Berger [email protected]
Outline
A Brief History of COM
Objects vs. Components
COM as a (C++)++
The What and How of COM
COM as CORBA-light
The Definitive References
A Brief History of COM
The brainchild of Anthony Williams, outlined in two (internal) MS papers:
Object Architecture: Dealing with the Unknown or Type Safety in a Dynamically Extensible Class (1988)
On Inheritance: What It Means and How To Use it (1990)
History (Cont.)
The origins of COM were in OLE (Object Linking and Embedding) 1, which shipped with Windows 3.1 (1992).
The first public version of COM shipped in OLE 2.0 (1993).
DCOM (Distributed COM) was released in 1996 in answer to CORBA. (We’ll ignore it.)
COM+ was released along with Windows 2000 and was primarily concerned with MTS. The DCOM moniker was dropped.
Objects vs. Components
“Object Oriented Programming = Polymorphism + (Some) Late Binding + (Some) Encapsulation + Inheritance
Component Oriented Programming = Polymorphism + (Really) Late Binding + (Real, Enforced) Encapsulation + Interface Inheritance + Binary Reuse”
Charlie Kindel, “COM Guy”, Microsoft Corp., 9/97
COM as (C++)++ (Adapted From [2])
If you “get” this, it’s all downhill from here.
In C++, in particular, the linkage model makes binary distribution and reuse difficult.
Consider: You’ve written a class that’s part of a C++ class library.
Challenges of Distribution
Imagine we distribute the source for the class (as is common in C++ class libraries).
If each application that uses it statically links it, it gets duplicated (waste) and is impossible to update/fix in the field without redistributing a new application.
If it’s packaged as a shared library/object, the lack of binary standardization moves the problem to one of interoperation.
The DLL model (but not the .so model) can actually deal with this lack of standardization, but it’s not pretty.
Challenges of Encapsulation
Assume we side-step the compiler/linker trouble. The coast still isn’t clear.
The C++ standard also lacks any standard definition for binary encapsulation.
So changes to the internals of an object that don’t change its interface can (and do) break code that uses the object. Consider adding private member variables: the object’s size and layout change, but clients compiled against the old definition still allocate the old size.
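To make the private-member problem concrete, here is a minimal sketch. The class names FastStringV1 and FastStringV2 are hypothetical stand-ins for two releases of the same library class; nothing here comes from a real library.

```cpp
#include <cassert>
#include <cstddef>

// Version 1 of a hypothetical library class, as clients saw it at compile time.
class FastStringV1 {
public:
    int Length() const { return len_; }
private:
    int len_ = 0;
};

// Version 2: identical public interface, one extra private member.
class FastStringV2 {
public:
    ~FastStringV2() { delete[] cache_; }
    int Length() const { return len_; }
private:
    int len_ = 0;
    char* cache_ = nullptr;  // new private state added by the library author
};

// A client that creates the object itself (on the stack, with new, or as a
// member) has sizeof(FastStringV1) baked into its compiled code.
inline std::size_t ClientAssumedSize() { return sizeof(FastStringV1); }

// The updated library's member functions expect the larger V2 layout.
inline std::size_t LibraryExpectedSize() { return sizeof(FastStringV2); }

// True when old client binaries and the new library disagree about layout.
inline bool LayoutMismatch() {
    return ClientAssumedSize() != LibraryExpectedSize();
}

// Usage: the mismatch exists even though the public interface never changed.
inline void DemoLayoutMismatch() {
    assert(LayoutMismatch());
}
```

The source-level interface is unchanged, yet an old client would hand the new library an object that is too small.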
Versioned Libraries
A quick look in /usr/lib or %WINDIR% will likely reveal a number of “identical” libraries with different versions:
libFoo.so.1
libFoo.so.2
With enough diligence the library developer can insulate applications from change by explicitly versioning the library.
I think we all agree this is a sub-optimal solution.
Interface v. Implementation
C++ supports separation of interface and implementation at the syntax level, not at the binary level.
So changes to the implementation are “seen” by clients.
We could hide the actual implementing class behind an opaque pointer in the interface exposed to the client and delegate interface calls through this pointer to the “real” object.
Easy for simple interfaces, cumbersome for complex ones.
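The opaque-pointer approach can be sketched as follows. FastString and FastStringImpl are hypothetical names, and in a real library the two halves would live in separate files (a public header and the library's own source).

```cpp
#include <cassert>
#include <string>

// --- What the client sees (would live in the public header) ---
class FastStringImpl;  // opaque: clients never learn its size or layout

class FastString {
public:
    explicit FastString(const char* s);
    ~FastString();
    FastString(const FastString&) = delete;
    FastString& operator=(const FastString&) = delete;
    int Length() const;
    int Find(const char* s) const;
private:
    FastStringImpl* impl_;  // the client-visible layout is always one pointer
};

// --- Library side (would live in the library's source file) ---
class FastStringImpl {
public:
    explicit FastStringImpl(const char* s) : text_(s) {}
    int Length() const { return static_cast<int>(text_.size()); }
    int Find(const char* s) const {
        std::string::size_type pos = text_.find(s);
        return pos == std::string::npos ? -1 : static_cast<int>(pos);
    }
private:
    std::string text_;  // free to change without affecting clients
};

// Every interface method is forwarded by hand to the real object.
FastString::FastString(const char* s) : impl_(new FastStringImpl(s)) {}
FastString::~FastString() { delete impl_; }
int FastString::Length() const { return impl_->Length(); }
int FastString::Find(const char* s) const { return impl_->Find(s); }

// Usage: clients only ever touch the one-pointer handle class.
inline void DemoHandle() {
    FastString s("hello world");
    assert(s.Length() == 11);
    assert(s.Find("world") == 6);
}
```

The hand-written forwarding functions are the cumbersome part: each method added to the interface needs another one.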
Abstract Classes as Interfaces
With three assumptions, we can use abstract classes to solve these problems:
1. C-style structs are represented identically across (C++) compilers.
2. All compilers can be forced to use a common calling convention.
3. All compilers on a platform use equivalent virtual call implementations.
vtbls and vptrs
Assumption 3 is critical, and turns out to be not unfounded, as nearly all C++ compilers use vptrs and vtbls.
For each class with virtual functions, the compiler generates a (static) array of function pointers to its virtual member functions (its vtbl).
Each instance of such a class has a (hidden) member that points to the vtbl (its vptr).
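To see what the compiler is doing, the vtbl/vptr mechanism can be written out by hand with plain structs and function pointers. This is only an illustration of the idea: the names Searchable, SearchableVtbl and the helper functions are made up, and real compilers differ in the details.

```cpp
#include <cassert>
#include <cstring>

struct Searchable;  // forward declaration for the function pointer types

// The "vtbl": one static table of function pointers per class.
struct SearchableVtbl {
    int (*Length)(const Searchable* self);
    int (*Find)(const Searchable* self, const char* s);
};

// The object: its first member plays the role of the hidden vptr.
struct Searchable {
    const SearchableVtbl* vptr;
    const char* text;
};

// "Member functions" of a hypothetical SString implementation.
static int SString_Length(const Searchable* self) {
    return static_cast<int>(std::strlen(self->text));
}
static int SString_Find(const Searchable* self, const char* s) {
    const char* hit = std::strstr(self->text, s);
    return hit ? static_cast<int>(hit - self->text) : -1;
}

// The single vtbl shared by every SString instance.
static const SearchableVtbl kSStringVtbl = { SString_Length, SString_Find };

inline Searchable MakeSString(const char* text) {
    Searchable obj = { &kSStringVtbl, text };
    return obj;
}

// Roughly what a virtual call does: load the vptr, index the vtbl, call.
inline int CallLength(const Searchable* obj) { return obj->vptr->Length(obj); }
inline int CallFind(const Searchable* obj, const char* s) {
    return obj->vptr->Find(obj, s);
}

// Usage: the caller never names SString_Length or SString_Find directly.
inline void DemoVtbl() {
    Searchable s = MakeSString("component");
    assert(CallLength(&s) == 9);
    assert(CallFind(&s, "pon") == 3);
}
```

A client that agrees on this layout and calling convention can call the object without knowing which compiler built the implementation.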
Example
    class ISearchableString {
    public:
        virtual int Length(void) const = 0;
        virtual int Find(const char *s) = 0;
    };
Diagram: an ISearchableString object holds a vptr; its vtbl entries are Length (null) and Find (null).
Example (cont.)
    class SString : public ISearchableString {
    public:
        SString(const char *s);
        ~SString(void);
        int Length(void) const;
        int Find(const char *s);
    };
Diagram: an SString object holds a vptr; its vtbl entries are SString::Length and SString::Find.
Instantiating an Abstract Class
Clearly the client can’t instantiate an ISearchableString (it’s pure abstract), nor do we want them instantiating an SString, since that breaks (binary) encapsulation.
So we need a factory method, and we can force it (using extern "C") to be accessible to all clients.
Virtual Destructors
Unfortunately, there’s a problem: our class lacks a virtual d’tor, so calls to delete will use the (default) d’tor on the ISearchableString class.
We can’t add a virtual d’tor to the abstract class because different compilers put d’tors in different places in the vtbl. (blech)
So we add a virtual “Delete” method to the abstract class.
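Putting the factory function and the Delete method together gives something like the sketch below. The factory name CreateSearchableString and the LiveCount helper are assumptions made for illustration. The protected non-virtual d'tor on the interface is an extra safeguard, not something the slides call for.

```cpp
#include <cassert>
#include <cstring>

// Interface as seen by clients. Delete stands in for a virtual d'tor.
class ISearchableString {
public:
    virtual int Length() const = 0;
    virtual int Find(const char* s) = 0;
    virtual void Delete() = 0;
protected:
    ~ISearchableString() {}  // non-virtual: takes no vtbl slot, blocks client delete
};

// Library-side implementation; clients never see this definition.
class SString final : public ISearchableString {
public:
    explicit SString(const char* s) : text_(new char[std::strlen(s) + 1]) {
        std::strcpy(text_, s);
        ++live_;
    }
    int Length() const override { return static_cast<int>(std::strlen(text_)); }
    int Find(const char* s) override {
        const char* hit = std::strstr(text_, s);
        return hit ? static_cast<int>(hit - text_) : -1;
    }
    // Runs the real d'tor inside the library that allocated the object.
    void Delete() override { delete this; }
    static int LiveCount() { return live_; }  // hypothetical helper for the demo
private:
    ~SString() { delete[] text_; --live_; }
    char* text_;
    static int live_;
};
int SString::live_ = 0;

// extern "C" keeps the exported symbol name free of compiler-specific mangling.
extern "C" ISearchableString* CreateSearchableString(const char* s) {
    return new SString(s);
}

// Usage: the client only ever names the interface and the factory.
inline void DemoFactory() {
    ISearchableString* str = CreateSearchableString("hello world");
    assert(str->Length() == 11);
    assert(str->Find("world") == 6);
    str->Delete();
    assert(SString::LiveCount() == 0);
}
```

Because Delete is an ordinary virtual method, its vtbl slot is as predictable as those of Length and Find.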
What is COM
A “substrate” for building re-usable components.
Language neutral: it’s easier to use in C++, but can be used from any language that can generate/grok vtbls and vptrs.
Interfaces are defined in COM IDL (IDL + COM extensions for inheritance and polymorphism).
OS neutral: there are commercial Unix implementations, and MS supports COM on Mac System (OS X?).
Using only the COM spec, we (OMKT) rolled our own.
Interfaces
Interfaces are uniquely identified by UUIDs (often called GUIDs; the terms are equivalent) called their IID (interface ID).
Implementers of an interface are uniquely identified by a UUID called their CLSID (class ID).
All COM objects implement the IUnknown interface.
IUnknown
Provides three methods:
    HRESULT QueryInterface(IID iid, void **ppv);
    ULONG AddRef(void);
    ULONG Release(void);
AddRef and Release are for resource management (reference counting). We’ll mostly ignore them.
QueryInterface
QueryInterface is essentially a run-time cast: it allows you to ask a component if it implements a specific interface.
If it does, it returns a pointer to that interface in ppv.
Think of it as a compiler/language neutral dynamic_cast operation.
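A portable sketch of the idea, deliberately not using the real COM headers: InterfaceId is a small integer standing in for a 128-bit IID, HResult and its two constants stand in for HRESULT values, and IUnknownLike mimics the shape of IUnknown. All of these names are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstring>

typedef int HResult;                 // stand-in for HRESULT
const HResult kOk = 0;               // stand-in for S_OK
const HResult kNoInterface = -1;     // stand-in for E_NOINTERFACE

typedef int InterfaceId;             // stand-in for a 128-bit IID
const InterfaceId IID_Unknown = 0;
const InterfaceId IID_Searchable = 1;
const InterfaceId IID_Persist = 2;   // an interface SString does not implement

struct IUnknownLike {
    virtual HResult QueryInterface(InterfaceId iid, void** ppv) = 0;
    virtual unsigned long AddRef() = 0;
    virtual unsigned long Release() = 0;
protected:
    ~IUnknownLike() {}
};

struct ISearchable : IUnknownLike {
    virtual int Length() const = 0;
protected:
    ~ISearchable() {}
};

class SString final : public ISearchable {
public:
    explicit SString(const char* s) : text_(s), refs_(1) {}
    // The run-time cast: hand back the requested interface, or refuse.
    HResult QueryInterface(InterfaceId iid, void** ppv) override {
        if (iid == IID_Unknown) {
            *ppv = static_cast<IUnknownLike*>(this);
        } else if (iid == IID_Searchable) {
            *ppv = static_cast<ISearchable*>(this);
        } else {
            *ppv = nullptr;
            return kNoInterface;
        }
        AddRef();  // every pointer handed out carries its own reference
        return kOk;
    }
    unsigned long AddRef() override { return ++refs_; }
    unsigned long Release() override {
        unsigned long left = --refs_;
        if (left == 0) delete this;  // last reference gone: self-destruct
        return left;
    }
    int Length() const override { return static_cast<int>(std::strlen(text_)); }
private:
    ~SString() {}
    const char* text_;
    unsigned long refs_;
};

// Hypothetical factory; the caller owns the initial reference.
inline IUnknownLike* CreateSString(const char* s) { return new SString(s); }
```

A client would call QueryInterface(IID_Searchable, &pv), cast pv to ISearchable*, use it, and call Release on each pointer it was given.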
HRESULT
This is language neutral, so no exceptions.
HRESULTs are a packed bit field return value used all over COM. Honestly it’s one of the ugliest parts of COM.
The most used return value is S_OK (success, ok); the most common failure value is E_FAIL (error, failure), but there are others.
There are macros SUCCEEDED() and FAILED() that take an HRESULT and report success or failure.
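The packing can be sketched without the Windows headers. The layout below (severity in the top bit, facility above a 16-bit code) and the numeric value of E_FAIL follow the documented HRESULT format, but the type and function names are local stand-ins, not the real macros.

```cpp
#include <cassert>
#include <cstdint>

typedef std::int32_t HResult;  // stand-in for HRESULT, a 32-bit signed value

const HResult kS_OK = 0;
// 0x80004005 is the documented value of E_FAIL. The conversion to a signed
// type wraps to a negative number on the two's complement platforms assumed here.
const HResult kE_FAIL = static_cast<HResult>(0x80004005u);

// Equivalent in spirit to SUCCEEDED() and FAILED(): test the sign (severity) bit.
inline bool Succeeded(HResult hr) { return hr >= 0; }
inline bool Failed(HResult hr) { return hr < 0; }

// Field extraction: bit 31 is severity, bits 16 and up hold the facility,
// and the low 16 bits hold the code.
inline unsigned Severity(HResult hr) {
    return (static_cast<std::uint32_t>(hr) >> 31) & 0x1u;
}
inline unsigned Facility(HResult hr) {
    return (static_cast<std::uint32_t>(hr) >> 16) & 0x1fffu;
}
inline unsigned Code(HResult hr) {
    return static_cast<std::uint32_t>(hr) & 0xffffu;
}

// Usage: check success first, then pick the value apart if needed.
inline void DemoHResult() {
    assert(Succeeded(kS_OK));
    assert(Failed(kE_FAIL));
    assert(Code(kE_FAIL) == 0x4005u);
}
```

Because success is just "non-negative", there can be several distinct success codes as well as many failure codes.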
Instantiating Objects
So as developers, we have interfaces (defined in IDL) for the components available in a library/on the system.
How do we actually obtain an instance of an object we want to use?
In COM this is termed Activation. There are three basic types, and each involves the SCM (service control manager).
Activation and the SCM
The SCM manages the mapping between IIDs, CLSIDs, and implementations.
You can ask the SCM for a particular CLSID and it will instantiate an instance and return its interface pointer.
CoGetClassObject()
There’s an additional layer of indirection through ProgIDs – strings of the form libraryname.classname.version that map to CLSIDs.
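As a toy model only, the two lookups (ProgID to CLSID, then CLSID to an instance) can be pictured as a pair of maps. ToyScm, IWidget and the string-valued ClassId are all invented for this sketch; the real SCM is a system service and is not implemented this way.

```cpp
#include <cassert>
#include <map>
#include <string>

typedef std::string ClassId;  // stand-in for a 128-bit CLSID

// Minimal hypothetical interface returned by activation.
struct IWidget {
    virtual const char* Name() const = 0;
    virtual void Delete() = 0;
protected:
    ~IWidget() {}
};

typedef IWidget* (*FactoryFn)();

class ToyScm {
public:
    // Record both the CLSID -> factory and ProgID -> CLSID mappings.
    void Register(const ClassId& clsid, const std::string& progId, FactoryFn make) {
        factories_[clsid] = make;
        progIds_[progId] = clsid;
    }
    // The extra layer of indirection: a readable string resolves to a CLSID.
    bool ClsidFromProgId(const std::string& progId, ClassId* out) const {
        std::map<std::string, ClassId>::const_iterator it = progIds_.find(progId);
        if (it == progIds_.end()) return false;
        *out = it->second;
        return true;
    }
    // Activation: find the implementation for a CLSID and instantiate it.
    IWidget* CreateInstance(const ClassId& clsid) const {
        std::map<ClassId, FactoryFn>::const_iterator it = factories_.find(clsid);
        return it == factories_.end() ? nullptr : it->second();
    }
private:
    std::map<ClassId, FactoryFn> factories_;
    std::map<std::string, ClassId> progIds_;
};

// A hypothetical component and its factory.
class SString final : public IWidget {
public:
    const char* Name() const override { return "SString"; }
    void Delete() override { delete this; }
private:
    ~SString() {}
};
inline IWidget* MakeSString() { return new SString; }
```

A client would resolve "StringLib.SString.1" (a made-up ProgID in the libraryname.classname.version form) to a ClassId and pass that to CreateInstance.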
Activation (cont.)
Sometimes you want “an implementation of the following interface that meets some set of constraints”. Enter category IDs (CATIDs).
You can define a set of categories, and each COM class can advertise the categories it implements.
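In the same toy spirit, category advertisement amounts to a lookup from a category to the classes that claim it. ToyCategoryRegistry and the string-valued IDs are invented for illustration and say nothing about how COM actually stores this information.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

typedef std::string ClassId;     // stand-in for a CLSID
typedef std::string CategoryId;  // stand-in for a CATID

class ToyCategoryRegistry {
public:
    // A class advertises that it implements a category.
    void Advertise(const ClassId& clsid, const CategoryId& catid) {
        byCategory_[catid].insert(clsid);
    }
    // A client asks for every class that advertises a category (sorted).
    std::vector<ClassId> ClassesIn(const CategoryId& catid) const {
        std::map<CategoryId, std::set<ClassId> >::const_iterator it =
            byCategory_.find(catid);
        if (it == byCategory_.end()) return std::vector<ClassId>();
        return std::vector<ClassId>(it->second.begin(), it->second.end());
    }
private:
    std::map<CategoryId, std::set<ClassId> > byCategory_;
};

// Usage: two hypothetical classes advertise the same made-up category.
inline void DemoCategories() {
    ToyCategoryRegistry reg;
    reg.Advertise("{clsid-sstring}", "{catid-searchable}");
    reg.Advertise("{clsid-rope}", "{catid-searchable}");
    assert(reg.ClassesIn("{catid-searchable}").size() == 2);
}
```

The client picks a class from the returned list and then activates it by CLSID as usual.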
COM as CORBA-light
COM provides a very efficient in-process component model.
Once past the initial CoCreateInstance and QueryInterface calls, each method call is simply a call-by-func-pointer call, essentially free.
Instantiating a component doesn’t require any out of process, or shared memory operations – it’s all DLL (or Shared Object) magic.
There’s Much, Much More
COM is specific about many topics that C++ (and other languages) are not. It specifies:
Execution environment options (so-called Apartments)
Inter-process Marshalling
Remote Object Activation mechanism and protocols
Threading models
The Definitive References
[1] The Component Object Model Specification, Microsoft and Digital Equipment Corp, 1992-1995. www.microsoft.com/com/resources/comdocs.asp
[2] Essential COM, Don Box, Addison Wesley, ISBN 0-201-63446-5