COM: A Brief Introduction

Dan Berger [email protected]@cs.ucr.edu

Outline

A Brief History of COM

Objects vs. Components

COM as a (C++)++

The What and How of COM

COM as CORBA-light

The Definitive References

A Brief History of COM

The brainchild of Anthony Williams, outlined in two (internal) MS papers:

Object Architecture: Dealing with the Unknown or Type Safety in a Dynamically Extensible Class (1988)

On Inheritance: What It Means and How To Use It (1990)

History (Cont.)

The origins of COM lie in OLE (Object Linking and Embedding) 1, which shipped with Windows 3.1 (1992).

The first public version of COM shipped in OLE 2.0 (1993).

DCOM (Distributed COM) was released in 1996 in answer to CORBA. (We’ll ignore it.)

COM+ was released along with Windows 2000 and was primarily concerned with MTS (Microsoft Transaction Server). The DCOM moniker was dropped.

Objects vs. Components

“Object Oriented Programming = Polymorphism + (Some) Late Binding + (Some) Encapsulation + Inheritance

Component Oriented Programming = Polymorphism + (Really) Late Binding + (Real, Enforced) Encapsulation + Interface Inheritance + Binary Reuse”

Charlie Kindel, “COM Guy”, Microsoft Corp., 9/97

COM as (C++)++ (Adapted From [2])

If you “get” this, it’s all downhill from here.

In C++, in particular, the linkage model makes binary distribution and reuse difficult.

Consider: You’ve written a class that’s part of a C++ class library.

Challenges of Distribution

Imagine we distribute the source for the class (as is common in C++ class libraries).

If each application that uses it statically links it, it gets duplicated (waste) and is impossible to update/fix in the field without redistributing a new application.

If it’s packaged as a shared library/object, the lack of binary standardization moves the problem to one of interoperation.

The DLL model (but not the .so model) can actually deal with this lack of standardization, but it’s not pretty.

Challenges of Encapsulation

Assume we side-step the compiler/linker trouble. The coast still isn’t clear.

The C++ standard also lacks any standard definition for binary encapsulation.

So changes to the internals of an object that don’t change its interface can (and do) break code that uses the object. Consider adding private member variables: the object’s size and layout change, yet clients compiled against the old header still assume the old ones.
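To make the private-member problem concrete, here is a minimal sketch. The class name FastString and the v1/v2 namespaces are hypothetical stand-ins for two releases of the same library class; they are not taken from the slides.

```cpp
#include <cstddef>

// Two releases of the "same" class, kept in namespaces so both fit in one file.
namespace v1 {
class FastString {
public:
    int Length() const;
private:
    char *m_psz;         // the only data member in release 1
};
}

namespace v2 {
class FastString {
public:
    int Length() const;  // the public interface is unchanged
private:
    char *m_psz;
    int m_cch;           // new private member: a cached length
};
}

// A client compiled against release 1 reserves this many bytes per object...
constexpr std::size_t kClientAllocates = sizeof(v1::FastString);
// ...but the release 2 library code reads and writes this many.
constexpr std::size_t kLibraryExpects = sizeof(v2::FastString);

static_assert(kLibraryExpects > kClientAllocates,
              "adding a private member changed the object's size");
```

Nothing in the client's source mentions m_cch, yet a client built against release 1 and run against release 2 would hand the library objects that are too small.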

Versioned Libraries

A quick look in /usr/lib or %WINDIR% will likely reveal a number of “identical” libraries with different versions: libFoo.so.1, libFoo.so.2.

With enough diligence the library developer can insulate applications from change by explicitly versioning the library.

I think we all agree this is a sub-optimal solution.

Interface v. Implementation

C++ supports separation of interface and implementation at the syntax level, not at the binary level. So changes to the implementation are “seen” by clients.

We could hide the actual implementing class behind an opaque pointer in the interface exposed to the client and delegate interface calls through this pointer to the “real” object.

This is easy for simple interfaces but cumbersome for complex ones.
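The opaque-pointer approach might look like the sketch below. The names SearchableStringHandle and SearchableStringImpl are hypothetical, and the use of std::string inside the implementation is just an assumption to keep the example short.

```cpp
#include <string>

// ---- what the client sees (the header) ----
class SearchableStringImpl;           // opaque: its layout is hidden from clients

class SearchableStringHandle {
public:
    explicit SearchableStringHandle(const char *s);
    ~SearchableStringHandle();
    SearchableStringHandle(const SearchableStringHandle &) = delete;
    SearchableStringHandle &operator=(const SearchableStringHandle &) = delete;
    int Length() const;
    int Find(const char *s) const;
private:
    SearchableStringImpl *m_impl;     // the handle is always one pointer wide
};

// ---- what stays inside the library (the source file) ----
class SearchableStringImpl {
public:
    explicit SearchableStringImpl(const char *s) : m_text(s) {}
    int Length() const { return static_cast<int>(m_text.size()); }
    int Find(const char *s) const {
        std::string::size_type pos = m_text.find(s);
        return pos == std::string::npos ? -1 : static_cast<int>(pos);
    }
private:
    std::string m_text;               // free to change without touching clients
};

// Every interface method needs a forwarding stub like these.
SearchableStringHandle::SearchableStringHandle(const char *s)
    : m_impl(new SearchableStringImpl(s)) {}
SearchableStringHandle::~SearchableStringHandle() { delete m_impl; }
int SearchableStringHandle::Length() const { return m_impl->Length(); }
int SearchableStringHandle::Find(const char *s) const { return m_impl->Find(s); }
```

The handle's size never changes, so the implementation can grow freely. The cost is one hand-written forwarding stub per method, which is what makes the approach cumbersome for large interfaces.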

Abstract Classes as Interfaces

With three assumptions, we can use abstract classes to solve these problems:

1. C-style structs are represented identically across (C++) compilers.

2. All compilers can be forced to use a common calling convention.

3. All compilers on a platform use equivalent virtual call implementations.

vtbls and vptrs

Assumption 3 is critical, and turns out to be not unfounded, as nearly all C++ compilers use vptrs and vtbls.

For each class with virtual functions, the compiler generates a (static) array of function pointers to its virtual member functions (its vtbl).

Each instance of such a class has a (hidden) member that points to the vtbl (its vptr).
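The layout just described can be written out by hand with plain structs and function pointers. This is only an illustration of the idea, not what any particular compiler emits; Str, StrVtbl, and the helper functions are hypothetical names.

```cpp
#include <cstring>

struct Str;                                    // forward declaration

// The vtbl: one static table of function pointers per class.
struct StrVtbl {
    int (*Length)(const Str *self);
    int (*Find)(const Str *self, const char *s);
};

// The object: the vptr comes first, followed by the data members.
struct Str {
    const StrVtbl *vptr;
    const char *text;
};

int StrLength(const Str *self) {
    return static_cast<int>(std::strlen(self->text));
}

int StrFind(const Str *self, const char *s) {
    const char *hit = std::strstr(self->text, s);
    return hit ? static_cast<int>(hit - self->text) : -1;
}

const StrVtbl kStrVtbl = { StrLength, StrFind };

// What a constructor does behind the scenes: set the vptr, then the data.
Str MakeStr(const char *text) {
    Str s = { &kStrVtbl, text };
    return s;
}

// What a virtual call boils down to: load the vptr, index the table, call.
int CallLength(const Str *s) { return s->vptr->Length(s); }
int CallFind(const Str *s, const char *needle) { return s->vptr->Find(s, needle); }
```

Because the layout is just a struct holding a pointer to a table of function pointers, any language that can build and follow such a structure can make these calls.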

Example

class ISearchableString {
public:
    virtual int Length(void) const = 0;
    virtual int Find(const char *s) = 0;
};

[Diagram: an ISearchableString object holds a vptr that points to a vtbl whose Length and Find entries are both null.]

Example (cont.)

class SString : public ISearchableString {
public:
    SString(const char *s);
    ~SString(void);
    int Length(void) const;
    int Find(const char *s);
};

[Diagram: an SString object holds a vptr that points to a vtbl whose entries are SString::Length and SString::Find.]

Instantiating an Abstract Class

Clearly the client can’t instantiate an ISearchableString, since it’s pure abstract. Nor do we want them instantiating an SString, because that breaks (binary) encapsulation.

So we need a factory function, and we can force it (using extern "C") to be accessible to all clients.

Virtual Destructors

Unfortunately, there’s a problem: our class lacks a virtual d’tor, so calls to delete through the interface pointer will use the (default) d’tor of the ISearchableString class, and SString’s own cleanup never runs.

We can’t add a virtual d’tor to the abstract class because different compilers put d’tors in different places in the vtbl. (blech)

So we add a virtual “Delete” method to the abstract class.
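Putting the factory and the Delete method together gives roughly the following. The factory name CreateString and the std::string member are assumptions made for this sketch, and the protected destructor on the interface is an optional extra that the slides do not mention.

```cpp
#include <string>

// ---- shared header: all the client ever sees ----
class ISearchableString {
public:
    virtual void Delete() = 0;        // stands in for a virtual d'tor
    virtual int Length() const = 0;
    virtual int Find(const char *s) = 0;
protected:
    // Non-virtual, so it takes no vtbl slot; protected, so clients
    // cannot call delete on an interface pointer by mistake.
    ~ISearchableString() {}
};

// The factory has C linkage, so its exported name is not mangled.
extern "C" ISearchableString *CreateString(const char *s);

// ---- library side: never exposed to the client ----
class SString final : public ISearchableString {
public:
    explicit SString(const char *s) : m_text(s) {}
    void Delete() override { delete this; }   // runs SString's own d'tor
    int Length() const override { return static_cast<int>(m_text.size()); }
    int Find(const char *s) override {
        std::string::size_type pos = m_text.find(s);
        return pos == std::string::npos ? -1 : static_cast<int>(pos);
    }
private:
    std::string m_text;
};

extern "C" ISearchableString *CreateString(const char *s) {
    return new SString(s);
}
```

A client would call CreateString("hello"), use Length and Find through the returned pointer, and finish with Delete(). It never names SString and never calls new or delete on it, so allocation and destruction both happen inside the library.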

What is COM

A “substrate” for building re-usable components.

Language neutral: it’s easier to use in C++, but can be used from any language that can generate/grok vtbls and vptrs.

Interfaces are defined in COM IDL (IDL + COM extensions for inheritance and polymorphism).

OS neutral: there are commercial Unix implementations, and MS supports COM on Mac System (OS X?). Using only the COM spec, we (OMKT) rolled our own.

Interfaces

Interfaces are uniquely identified by UUIDs (often called GUIDs; the terms are equivalent). An interface’s UUID is called its IID (interface ID).

Implementers of an interface are uniquely identified by a UUID called their CLSID (class ID).

All COM objects implement the IUnknown interface.

IUnknown

Provides three methods:

HRESULT QueryInterface(IID iid, void **ppv);
ULONG AddRef(void);
ULONG Release(void);

AddRef and Release are for resource management (reference counting). We’ll mostly ignore them.

QueryInterface

QueryInterface is essentially a run-time cast: it allows you to ask a component if it implements a specific interface.

If it does, it returns a pointer to that interface in ppv.

Think of it as a compiler/language neutral dynamic_cast operation.
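The following portable sketch shows how the three IUnknown methods could fit together. It does not use the real Windows headers: everything lives in a hypothetical sketch namespace, the Guid and HResult types are stand-ins, and the IIDs other than IUnknown's are invented. The IID is passed by const reference here rather than by value as on the slide, and the reference count is not thread safe.

```cpp
#include <cstdint>
#include <cstring>

namespace sketch {

struct Guid { std::uint32_t d1; std::uint16_t d2, d3; std::uint8_t d4[8]; };
inline bool operator==(const Guid &a, const Guid &b) {
    return std::memcmp(&a, &b, sizeof(Guid)) == 0;
}

using HResult = std::int32_t;
const HResult kOk = 0;                                           // S_OK
const HResult kNoInterface = static_cast<HResult>(0x80004002u);  // E_NOINTERFACE

const Guid IID_Unknown    = {0x00000000, 0x0000, 0x0000, {0xC0, 0, 0, 0, 0, 0, 0, 0x46}};
const Guid IID_Searchable = {0x11111111, 0x2222, 0x3333, {1, 2, 3, 4, 5, 6, 7, 8}};  // invented
const Guid IID_Printable  = {0x99999999, 0x8888, 0x7777, {8, 7, 6, 5, 4, 3, 2, 1}};  // invented

struct IUnknown {
    virtual HResult QueryInterface(const Guid &iid, void **ppv) = 0;
    virtual std::uint32_t AddRef() = 0;
    virtual std::uint32_t Release() = 0;
};

struct ISearchable : IUnknown {
    virtual int Length() const = 0;
};

class SString final : public ISearchable {
public:
    explicit SString(const char *s)
        : m_refs(1), m_len(static_cast<int>(std::strlen(s))) {}

    HResult QueryInterface(const Guid &iid, void **ppv) override {
        if (iid == IID_Unknown) {
            *ppv = static_cast<IUnknown *>(this);
        } else if (iid == IID_Searchable) {
            *ppv = static_cast<ISearchable *>(this);
        } else {
            *ppv = nullptr;           // the "cast" failed
            return kNoInterface;
        }
        AddRef();                     // each pointer handed out owns a reference
        return kOk;
    }
    std::uint32_t AddRef() override { return ++m_refs; }
    std::uint32_t Release() override {
        std::uint32_t left = --m_refs;
        if (left == 0) delete this;   // the last reference destroys the object
        return left;
    }
    int Length() const override { return m_len; }

private:
    std::uint32_t m_refs;             // starts at 1 for the creator's reference
    int m_len;
};

}  // namespace sketch
```

Asking for IID_Searchable succeeds and bumps the count; asking for IID_Printable, which SString does not implement, returns the failure code and a null pointer. Each successful QueryInterface must eventually be paired with a Release.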

HRESULT

This is language neutral, so no exceptions. HRESULTs are a packed bit field return value used all over COM. Honestly it’s one of the ugliest parts of COM.

The most used return value is S_OK (success, ok); the generic failure value is E_FAIL (error, failure), but there are many others.

There are macros SUCCEEDED() and FAILED() that take an HRESULT and report success or failure.
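As a rough illustration of the packing, the sketch below re-creates the success/failure test and the field extraction in portable C++. The sketch namespace and function names are hypothetical; the numeric values are those of the real S_OK, S_FALSE, E_FAIL, and E_ACCESSDENIED constants, and the 13-bit facility mask follows the HRESULT_FACILITY macro in winerror.h.

```cpp
#include <cstdint>

namespace sketch {

using HResult = std::int32_t;          // a real HRESULT is a 32-bit signed value

const HResult kOk           = 0;                                  // S_OK
const HResult kFalse        = 1;                                  // S_FALSE
const HResult kFail         = static_cast<HResult>(0x80004005u);  // E_FAIL
const HResult kAccessDenied = static_cast<HResult>(0x80070005u);  // E_ACCESSDENIED

// The top bit is the severity flag, so a failure is simply a negative value.
inline bool Succeeded(HResult hr) { return hr >= 0; }
inline bool Failed(HResult hr) { return hr < 0; }

// Facility: which subsystem produced the value (bits 16 and up).
inline unsigned Facility(HResult hr) {
    return (static_cast<std::uint32_t>(hr) >> 16) & 0x1fffu;
}

// Code: the subsystem-specific status (low 16 bits).
inline unsigned Code(HResult hr) {
    return static_cast<std::uint32_t>(hr) & 0xffffu;
}

}  // namespace sketch
```

S_FALSE is why callers should test with SUCCEEDED() rather than compare against S_OK: it is a success value that is not zero.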

Instantiating Objects

So as developers, we have interfaces (defined in IDL) for the components available in a library/on the system.

How do we actually obtain an instance of an object we want to use?

In COM this is termed Activation. There are three basic types, and each involves the SCM (service control manager).

Activation and the SCM

The SCM manages the mapping between IIDs, CLSIDs, and implementations.

You can ask the SCM for a particular CLSID and it will instantiate an instance and return its interface pointer. The entry points are CoGetClassObject(), which returns the class object (factory) for a CLSID, and CoCreateInstance(), which wraps the common create-one-instance case.

There’s an additional layer of indirection through ProgIDs: strings of the form libraryname.classname.version that map to CLSIDs.
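The two levels of lookup (ProgID to CLSID, then CLSID to implementation) can be mimicked with a toy in-memory registry. This is only an analogy for the mapping, not how the real SCM or registry works: ToyScm, IWidget, AcmeWidget, and the string-valued Clsid are all hypothetical. The real lookups go through calls such as CLSIDFromProgID and CoCreateInstance.

```cpp
#include <map>
#include <string>

namespace sketch {

typedef std::string Clsid;             // stand-in: a real CLSID is a 128-bit UUID

struct IWidget {
    virtual std::string Describe() const = 0;
    virtual void Delete() = 0;
protected:
    ~IWidget() {}
};

typedef IWidget *(*Factory)();         // creates one instance of a class

class ToyScm {
public:
    void Register(const std::string &progId, const Clsid &clsid, Factory make) {
        m_clsidByProgId[progId] = clsid;
        m_factoryByClsid[clsid] = make;
    }
    // First level of indirection: human-readable ProgID to CLSID.
    bool ClsidFromProgId(const std::string &progId, Clsid *out) const {
        auto it = m_clsidByProgId.find(progId);
        if (it == m_clsidByProgId.end()) return false;
        *out = it->second;
        return true;
    }
    // Second level: CLSID to a freshly created instance (null if unknown).
    IWidget *CreateInstance(const Clsid &clsid) const {
        auto it = m_factoryByClsid.find(clsid);
        return it == m_factoryByClsid.end() ? nullptr : it->second();
    }
private:
    std::map<std::string, Clsid> m_clsidByProgId;
    std::map<Clsid, Factory> m_factoryByClsid;
};

class AcmeWidget final : public IWidget {
public:
    std::string Describe() const override { return "AcmeWidget"; }
    void Delete() override { delete this; }
};

IWidget *MakeAcmeWidget() { return new AcmeWidget; }

}  // namespace sketch
```

A client that only knows the string "Acme.Widget.1" resolves it to a CLSID first and then asks for an instance, so the implementing class can be swapped by changing the registration alone.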

Activation (cont.)

Sometimes you want “an implementation of the following interface that meets some set of constraints”. Enter category IDs (CATIDs).

You can define a set of categories, and each COM class can advertise the categories it implements.

COM as CORBA-light

COM provides a very efficient in-process component model.

Once past the initial CoCreateInstance and QueryInterface calls, each method call is simply a call through a function pointer, essentially free.

Instantiating an in-process component doesn’t require any out of process, or shared memory operations; it’s all DLL (or Shared Object) magic.

There’s Much, Much More

COM is specific about many topics that C++ (and other languages) are not. It specifies:

Execution environment options (so-called Apartments)

Inter-process Marshalling

Remote Object Activation mechanism and protocols

Threading models

The Definitive References

[1] The Component Object Model Specification, Microsoft and Digital Equipment Corp, 1992-1995. www.microsoft.com/com/resources/comdocs.asp

[2] Essential COM, Don Box, Addison Wesley, ISBN 0-201-63446-5
