
Building enterprise-wide resilience by integrating business continuity capability into day-to-day business culture and technology

Patrick Alesi
Received (in revised form): 29th January, 2008
Lehman Brothers, Inc., 745 7th Avenue, 12th Floor, New York, NY 10019, USA
Tel: +1 212 526 1734; E-mail: [email protected]

Journal of Business Continuity & Emergency Planning, Vol. 2 No. 3, pp. 214–220, © Henry Stewart Publications, 1749-9216

Patrick Alesi is Senior Vice President and Co-Manager of Business Continuity Management at Lehman Brothers. He has held this position since March 2002, but had previously worked for Lehman on its business continuity plans as Assistant Vice President from 1997 until 2000. Patrick’s current responsibilities include incident response management and strategic planning as well as regulatory compliance for business continuity. Between his stints at Lehman, Patrick worked at the New York Mercantile Exchange as its Director of Systems, Operations and Database Administration. He has a broad range of technology experience in systems analysis, and voice and data communications. Mr Alesi is currently Chairman of the Securities Industry and Financial Markets Association Business Continuity Committee and a member of the Futures Industry Association BCP Committee.

ABSTRACT

This paper follows the development of the business continuity planning (BCP) programme at Lehman Brothers following the events of September 11th. Previous attempts to implement a ‘traditional’ form of BCP had been ineffective, but following the events, the firm began to look at BCP in a new light. This paper deals with three main themes: creating a culture of resiliency, leveraging technology, and building flexible plans. Distributing accountability for BCP to business line managers, integrating BCP change management into the normal course of business, and providing every employee with personalised BCP information breed a culture of resiliency where people are empowered to react to events without burdensome, hierarchical response and recovery procedures. Building a strong relationship with one’s application development community can result in novel, customised BCP solutions; existing systems and data structures can be used to enhance an existing BCP. Even the best plans are often challenged by events; understanding that flexibility is essential to effective incident response is a critical element in the development of a proper business continuity plan.

Keywords: resiliency, business continuity, BCP, incident response

INTRODUCTION

‘Change alone is unchanging.’ – Heraclitus (1)

Business continuity planning (BCP), like all human endeavours, is in a constant state of change and development. Sometimes these changes are slow and almost imperceptible, but sometimes change is rapid and far-reaching in its effect. When change occurs suddenly, it is often accompanied by an unforeseen, external event. Such a sudden change occurred in BCP at Lehman Brothers, and the external event was September 11th.

This paper will examine the significant changes in BCP at Lehman Brothers, as spurred on by the events of September 11th. Five years onward, the effects of that day are still being felt: the ‘new normalcy’, as the think-tanks and mass media outlets like to refer to it. Part of this new normal is a new way of thinking about incidents, response and recovery that has changed BCP from a tedious planning exercise to a ‘forward leaning’ activity that emphasises flexibility, portability and technological integration.

A BLANK SHEET OF PAPER

When asked what Lehman Brothers’ business continuity plan was on September 11th, the CIO answered by holding up a blank sheet of paper. While the firm had the technological foundation built (redundant data centres and connectivity), there were no useful plans for mustering and directing people during a disaster.

What Lehman Brothers did have, however, was a group of intelligent, motivated and empowered people, and whenever a group like that is brought together to solve a problem, great things can happen. Without any playbook to work from, Lehman Brothers built two trading floors in a week, and was able to participate in all of the major financial markets when they opened.

What Lehman Brothers staff did on the days following September 11th provides a powerful lesson for any business continuity planner: they simply did whatever they needed to do to get their jobs done, without lengthy approval processes, and without any micromanagement.

That blank sheet of paper certainly served the firm well, and provided some lessons on how to do a better job of BCP.

In early 2002, senior management began to contemplate how the BCP function could be reinvented. Although the firm had made the proverbial silk purse from a sow’s ear, it might not be so lucky next time. As such, it was agreed to get the BCP right this time.

Three themes emerged in the process of rebuilding the business continuity group: creating a culture of resiliency, leveraging internal technology assets and creating flexible plans.

CREATING A CULTURE OF RESILIENCY

The empowerment of Lehman’s people that enabled a successful recovery from September 11th needed to be reinforced and incorporated into the corporate culture. The traditional model of a centralised BCP group that creates and maintains plans for a select few ‘critical recovery staff’ would not succeed; everyone needed to have a stake in the process. Furthermore, to the greatest extent possible, Lehman needed to make BCP part of day-to-day operations. As tools for response and recovery were created, it was important for them to be exercised regularly. Obsolete content and procedures that were only dusted off occasionally for testing would be of little use during a recovery effort. A number of programmes were created to foster this culture of resiliency.

A ‘federated’ organisational BCP model places responsibility on the business owners

A model that concentrates BCP knowledge in a centralised group is inefficient from a planning perspective and could be dangerously ineffective during an actual disaster.

Many organisations ‘seed’ their business lines with dedicated continuity planners that have a matrix reporting relationship with the business manager and the BCP manager. Depending upon how this model is actually implemented, the solution may be adequate. Nonetheless, there are inherent risks:

• line managers in the business now have an excuse to ignore their responsibility for BCP because there is someone who serves as the BCP expert in the business;

• there may be a tendency for the business line BCP professionals to hoard information;

• business line BCP staff are accountable for creating plans, but do not necessarily have the authority to make decisions during an incident.

It is this last point that is perhaps the most important. Both the accountability for planning and the authority to execute plans (or perhaps more importantly, deviate from plans) during an incident must reside with business line managers.

As of the writing of this paper, there are 15 dedicated business continuity employees globally, for a firm of roughly 28,300 people. That works out to one dedicated BCP employee for every 1,887 employees.

Across the firm’s business lines, however, the planning software has well over 200 regular users, and the ‘owners’ of the plans are the chief administrative officers (CAOs) for each business. The CAO is responsible for resource planning and allocation, and works with senior business management on a daily basis. This close working relationship can be easily and effectively leveraged in the creation, maintenance and activation of business-specific continuity plans.

Continuity plans must be revised as part of the normal course of business

Creation and maintenance of business continuity plans is not done twice yearly or every quarter; it takes place pretty much all the time. If plan owners and members look at and update their plans only occasionally, they will soon forget what they entail once they return to their ‘day job’.

One of the more notable ways that Lehman Brothers has increased the frequency of maintenance and incorporated it into daily operations is the mechanism by which it maintains accurate contact information for employees. Every 75 days, when an employee opens a web browser, they are automatically redirected to a page with their contact information and asked to review it. Users cannot navigate to another site until they respond to the request. The importance of maintaining reliable contact information cannot be overestimated when responding to a disaster.
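The 75-day review cycle can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the function names, the `REVIEW_INTERVAL` constant, the `/contact-review` path and the idea of storing a last-confirmed date per employee are hypothetical and are not taken from Lehman's system.

```python
from datetime import date, timedelta
from typing import Optional

# Interval taken from the text: contact details are reviewed every 75 days.
REVIEW_INTERVAL = timedelta(days=75)


def needs_contact_review(last_confirmed: Optional[date], today: date) -> bool:
    """Return True if the employee should be sent to the review page.

    An employee who has never confirmed their details (None) is always
    redirected; otherwise the redirect applies once 75 days have elapsed.
    """
    if last_confirmed is None:
        return True
    return today - last_confirmed >= REVIEW_INTERVAL


def landing_page(requested_url: str, last_confirmed: Optional[date], today: date) -> str:
    """Serve the review form (hypothetical path) until the user has responded."""
    if needs_contact_review(last_confirmed, today):
        return "/contact-review"  # hypothetical internal path
    return requested_url
```

Tying the check to browser start-up, as the text describes, means the review happens in the normal course of work instead of relying on a separate maintenance campaign.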

In addition, the continuity planning tool uses a system of ‘listeners’ to find missing or changed data, and alert plan owners via e-mail that something in their plan needs to be updated. For instance, this happens when an employee who has a dedicated recovery seat leaves the firm.
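A ‘listener’ of this kind might look something like the following Python sketch, which covers only the recovery-seat example given above. The class names, fields and function names are assumptions made for illustration; the paper does not describe how the real listeners are implemented, and the e-mail delivery itself is left out.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass
class RecoverySeat:
    seat_id: str
    employee_id: str  # person the seat is reserved for


@dataclass
class Plan:
    name: str
    owner_email: str
    seats: List[RecoverySeat]


def departed_seat_listener(plan: Plan, active_employees: Set[str]) -> List[str]:
    """Return one alert per recovery seat held by someone who has left."""
    return [
        f"Plan '{plan.name}': seat {seat.seat_id} is assigned to "
        f"{seat.employee_id}, who is no longer an active employee."
        for seat in plan.seats
        if seat.employee_id not in active_employees
    ]


def run_listeners(plans: List[Plan], active_employees: Set[str]) -> Dict[str, List[str]]:
    """Group alerts by plan owner e-mail address; sending the mail is omitted."""
    outbox: Dict[str, List[str]] = {}
    for plan in plans:
        alerts = departed_seat_listener(plan, active_employees)
        if alerts:
            outbox.setdefault(plan.owner_email, []).extend(alerts)
    return outbox
```

In this sketch the set of active employees stands in for the authoritative personnel data, so a plan owner is alerted only when the plan and that data disagree.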

Every employee must be part of the business continuity plan

In some traditional business continuity plans, only ‘critical’ employees are actually documented as part of the recovery plan. All of Lehman’s employees are considered critical, and they all need to be communicated with during an incident. As a result, the firm has created a business rule in its internally-developed continuity planning software that ensures that all employees are assigned to a plan. If an employee moves from one department, and therefore BCP, to another, the system automatically moves them, so they are never without a plan.

Remote access tools can enhance resiliency

While many firms have a mechanism for remotely accessing their data and have incorporated this into their BCPs, Lehman places great emphasis on this capability, and has implemented a full suite of ‘virtual workplace’ technologies, including:

• Citrix server-based access to critical applications;

• remote desktop capability to allow full access to a user’s work PC;

• remote voice capability that allows calls to be received and placed from any phone, while appearing as though they are made from the user’s desk.

Lehman has even piloted remote trading turret (dealer board) applications that allow traders to access their private lines remotely.

A critical element of this virtual workplace is that it does not rely on a virtual private network (VPN) connection. VPN connections typically require specialised software to be pre-loaded on a PC (usually company-supplied). Lehman’s solution only requires an authentication token. Therefore, any internet-connected PC or Mac can be used for remote access, substantially increasing resiliency.

A virtual workplace environment is a wonderful tool in the continuity planner’s arsenal, but people must be comfortable with using it during an incident. In keeping with the firm’s desire to incorporate good business continuity practices into its day-to-day culture, there are a number of additional efforts that have been undertaken to ensure that remote access is a viable part of the plan.

Lehman’s business continuity management (BCM) group has partnered with colleagues in the corporate diversity group, which is responsible for promoting, among other things, flexible work arrangements and the virtual workplace for employees who may wish to work from home all or part of the time. The two groups conducted a virtual workplace awareness fair, demonstrating virtual workplace tools in the cafeterias of major buildings in New York and New Jersey (similar efforts were later repeated in other offices globally). The results of the fair were evident in some of the statistics generated afterwards: hits on the virtual workplace intranet page increased 13-fold during the fair and requests for remote voice capability increased by the same proportion.

Lehman Brothers had generated a great deal of interest in these technologies, but in order to make them a viable part of BCP, the firm had to be sure that people knew how to use them, and used them regularly. To do this, the BCM group implemented a biannual ‘tickler’ that puts a reminder on every employee’s desktop to test their remote access capability. Employees are then asked to answer a single question on whether the test has been successful. The reminder has a due date, and will not go away until the task is completed.

With so many employees working remotely and testing this capability regularly, the firm can depend on a remote workforce to significantly augment dedicated workspace recovery seats.

LEVERAGING INTERNAL TECHNOLOGY

A sophisticated development team has built web-based applications for the firm’s numerous divisions. This extensive intranet architecture has been leveraged to create customised incident response and planning tools that connect in real-time to authoritative, up-to-date sources of data, using the same look and feel familiar to users. To ensure that technology is at the core of the new BCP model, it was decided that the BCM group would report to the chief information officer. This structure has allowed the group to implement some novel BCP solutions.

Internally develop the continuity planning tool

Although there are many capable third-party BCP software packages on the market, most require significant customisation. The decision to build its own tool allowed Lehman to create plans that followed its business model.

Lehman Brothers’ primary planning tool, BCPlans, is web-based, and adheres to all of the firm’s internal standards for how web applications should look and behave. This significantly reduces the learning curve. As plans are maintained by managers and staff in the business line rather than dedicated continuity planners, this is a major consideration.

In addition, the tool connects to production databases for people, systems and assets, making all of the plan information as accurate as possible. When development began five years ago, the ability to update plan data in real time using application programming interfaces was uncommon.

Give everyone their own business continuity plan

As it is the firm’s philosophy that every individual should be part of a recovery effort and a member of a continuity plan, BCPlans was extended to create customised BCPs for each employee by parsing out key elements of the business line’s recovery plan and formatting them as a simple web page that is accessible from anywhere, even wirelessly.

The page provides key contact information and phone numbers, tasks, recovery seat assignments and other critical data that are unique to that person. The information is brief, typically one or two pages long, and is ideal for getting everyone through the first 24–48 hours of an incident. After such a period, it is reasoned that the recovery strategy will change drastically depending on the event.
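The idea of parsing a personal plan out of the business line's plan can be sketched as follows. This is a simplified illustration, not the BCPlans implementation: the `BusinessLinePlan` structure, its fields and the `personal_plan` function are hypothetical, and plain text stands in for the web page.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class BusinessLinePlan:
    """Hypothetical, simplified stand-in for a business line's recovery plan."""
    name: str
    key_contacts: Dict[str, str]  # role -> phone number
    tasks: Dict[str, List[str]] = field(default_factory=dict)  # employee id -> tasks
    recovery_seats: Dict[str, str] = field(default_factory=dict)  # employee id -> seat


def personal_plan(plan: BusinessLinePlan, employee_id: str) -> str:
    """Extract the parts of the plan relevant to one employee as plain text."""
    tasks = plan.tasks.get(employee_id, [])
    seat = plan.recovery_seats.get(employee_id, "no dedicated seat")
    lines = [f"Business continuity plan: {plan.name}", "Key contacts:"]
    lines += [f"  {role}: {phone}" for role, phone in plan.key_contacts.items()]
    lines.append("Your tasks:")
    lines += [f"  - {task}" for task in tasks] if tasks else ["  (none assigned)"]
    lines.append(f"Recovery seat: {seat}")
    return "\n".join(lines)
```

Because each personal plan is derived from the business line's plan instead of being maintained separately, an update to the source plan would reach every employee's page without further effort.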

In keeping with goals to decentralise accountability and integrate BCP into corporate culture, Lehman again leverages technology to ensure that employees understand their individual plan. On a regular basis, a reminder is placed on every employee’s default browser page asking them to read their plan. A confirmation button is placed on the plan page so that people can ‘attest’ to having read and understood it.

CREATING FLEXIBLE PLANS

‘No plan of operations reaches with any certainty beyond the first encounter with the enemy’s main force.’ – Helmuth Karl Bernhard Graf von Moltke (2)

This quote has morphed over time to the more familiar ‘No plan survives first contact with the enemy’, and is a favourite of mine when discussing BCP strategy. Lehman’s response to the events on September 11th showed that a smart, empowered and flexible organisation can be more important than a thoroughly detailed but unworkable plan. This is not to say that one should not plan; on the contrary, von Moltke, being a career military officer, understood the importance of planning and preparation. But what he also knew, and what all good BCP professionals must know, is that every incident is different and one must be prepared to change one’s response to events as they occur.

The importance of flexibility begins with response to the incident itself. Like most firms, Lehman has a number of teams that are ‘activated’ to respond to an incident, including:

• incident management team: BCM, security, facilities and IT (this team is the first to respond to incidents);

• business management team: senior management of the firm, including all administrative officers and divisional BCP contacts;

• crisis communications team: corporate communications, employee relations and investor relations;

• technology response team: self-explanatory.

This is all standard stuff until one considers how this structure is employed. The response process is designed from the start to have the BCM group as the primary incident manager for all incidents (3). BCM is closely aligned with corporate security, and has a matrix reporting relationship to the global head of corporate security. In fact, the New York BCM office is co-located with the corporate security command centre. Security’s role is of course to protect the human and physical assets of the firm; this is done by the 24/7 monitoring of video, sensors and external information sources. BCM’s co-location with this group means that, during the business day, it is informed immediately of incidents that may affect the firm. After hours, the employees who staff the security console are trained to contact the BCM group directly and have access to all of their contact information. If the incident were to affect the security console itself, counterparts in London would take responsibility.

There is no complicated flowchart of when and how to notify BCM of an incident; there are no green, blue, yellow or red alarms. There is just a simple notification. It is then up to BCM to escalate and trigger the incident response process in whatever way is deemed appropriate; again, there is no rigid structure or colour-coded diagram, just measured response. There is of course an incident response document (35 pages at last count), but it is used as a reference rather than being used to dictate how to react.

If escalation to the business is required, teleconference bridges would be used to quickly communicate and decide on a response strategy. Each business is then responsible for communicating this strategy to its employees, who can then employ their personalised plans and remote access tools for a rapid and effective response.

CONCLUSION

To summarise the main points:

• Seek to create a culture of resiliency, where accountability is co-located with authority and BCP components are integrated into day-to-day operations. Make every employee part of a plan, and make the plan accessible to them.

• Leverage the technology infrastructure. Bolster remote access solutions through constant awareness and training. Look for internal technology competencies that can integrate the BCP model with corporate data.

• Be prepared to improvise. Creating a culture of resiliency where employees are able to respond quickly to incidents using familiar tools containing real-time data creates a model that lends itself to the required flexibility.

Lehman Brothers is not the sole owner of this new perspective on business continuity. Indeed, many financial services industry workers are adopting some of the same new strategies and tactics. But many, especially those who have not recently faced a catastrophic event, are still diligently creating the ‘big red BCP binder’, and putting it on the shelf to collect dust. By describing the sea change that occurred at Lehman, it is hoped that readers will look at business continuity from a fresh perspective and glean some ideas for improving their own programmes; and maybe start throwing out their red binders.

REFERENCES

(1) Heraclitus (c. 535–475 BC), in Davenport, G. (trans.) (1976) ‘Herakleitos and Diogenes’, pt. 1, fragment 23, Grey Fox Press, Eugene, OR.

(2) Shafritz, J. M. (1990) ‘Words on War: Military Quotations from Ancient Times to the Present’, Prentice Hall, New York, NY.

(3) In this paper, the term ‘incident’ is used to refer to an event that does or may have the ability to significantly affect business operations. It is not used to describe typical operational incidents involving system failures or routine outages.
