© 2015 IBM Corporation

IBM Resiliency Services: Always there, in an always-on world

Exercising your DR solution: why it's still critical, no matter your solution

Valerie S. Egginton, Associate Partner, Resiliency Services Consulting

Exercising Your Disaster Recovery Solution




What was old, needs to be new again

In the evolving world of technology and business resiliency we cannot allow complacency to take hold. Advances and innovations in technology give greater flexibility and independence in the ability to recover, but organizations still need to take responsibility in the area of exercise and testing.

On-demand technology services of all kinds, including DRaaS, virtualization and cloud in general, still need the discipline of focused recovery exercises to ensure that the solutions in place will meet the needs of the organization, both from an operational perspective and from a regulated data integrity position.

Let’s discuss leading practices in exercise and testing, why it’s still critical to success and some of the ways testing objectives can evolve along with your recovery options.


DR Test/Exercise Program: Mission Statement, Goals and Guiding Principles


A DR test program mission statement should be established, supported by seven goals.

The test strategy should enable the organization to accomplish the following seven goals:

1. Demonstrate the effectiveness, efficiency and readiness of the organization's DR capabilities through a common testing framework;

2. Provide standard, formalized test measurements and reporting mechanisms;

3. Ensure that weaknesses are uncovered during DR testing and that required changes are addressed appropriately;

4. Ensure that test exercises are planned, prepared and executed using a standard set of comprehensive testing methods to validate accuracy of the DR plans, build team awareness and provide educational opportunities;

5. Enable the organization to increase the scope and complexity of test exercises;

6. Provide the organization with a structured and measurable approach to DR testing at all levels (application, infrastructure and data centre);

7. Provide an ability to validate alignment between business requirements and technology capabilities.

Example mission statement:

“The DR Test/Exercise Program mission is to provide the organization with the ability to measure recovery capabilities, identify recovery weaknesses, and enable operational viability through validation, team awareness, skills development and education.”


Guiding principles are established with the key principle that DR testing will not increase the risk to any production application or environment.

1. Remote access to the DR site from the organization network will be required to support the testing required.

2. A series of tests will be required to validate that each server/service/application can be recovered.

3. DR testing should be realistic enough to give us confidence that it will work in a real disaster situation.

An organization should have at least three guiding principles in place relating to DR testing.


Additional DR testing-specific best practice guiding principles provide further clarity.

1. Distributed applications that have a mainframe interface will share the same test environment where possible.

2. The DR Process will define the details of each application’s recovery capability, requirements and dependencies.

3. The DR test strategy will not increase the risk to any production application or production environment.

4. The DR test strategy will be developed based on the concept of application streaming [1].

5. The DR test strategy must account for all production applications and take into account the dynamic nature of all recovery requirements. This supports the notion that all applications become critical over time.

6. A DR test program maturity model will be developed and become an integral component of the organization’s DR Test Strategy and the associated implementation roadmap.

7. The DR test strategy will address disaster recovery testing practices, policies, and capabilities defined as follows:

• Practice - The current method of carrying out a set of activities. “What is currently in use today.”

• Capability - The full ability of a product, process or organization. “What can be performed, whether it is or not.”

• Policy - A plan, course of action, or set of rules and guidelines intended to influence and determine decisions, actions, and responsibilities.

[1] Application streaming is defined as a collection of applications, services and infrastructure required to support a business function.

Additional Sample DR Test Program Guiding Principles


DR Testing Best Practices


DR Test Program Strategy Key Components

Typical best practice strategy components

1. Long- and short-term goals have been set for the organization in the Disaster Recovery Overall High-Level Plan for specified years (actually put this into your strategy document).

2. Identification of DR Test Program stakeholders and ensuring they are continually involved in the strategy implementation (a critical success factor).

3. Established DR test-build approach incorporated into the strategy to support a progressive increase in the scope of DR tests.

4. Defined application selection process incorporated into the strategy to define the selection and prioritization criteria.

5. Recognition that the DR Test Program Strategy employs many testing techniques as required at the various stages of testing maturity.

6. Implementation of a balanced set of measurements.

7. Architectural integration of DR at the enterprise level.

8. Identification and implementation of organizational requirements (structure, processes, resources, skills and governance) to ensure the successful implementation and sustainment of the DR Test Program in the long term.


The organization's DR test practices and capabilities must fully support data centre or enterprise level disaster recovery testing to verify its ability to recover the entire production environment.

Typical best practice capability requirements

1. Environments must be designed to enable testing.

2. Defined roles of application owners must include DR testing acceptance.

3. Critical applications should be tested on an annual basis.

4. The rating or ranking of applications (i.e., Criticality) must be agreed to by both Business Units and IT including formally documented approvals.

5. There must be a comprehensive understanding of the requirements and resources necessary to test or recover the entire production environment.

6. There must be a process in place to identify, record and track required changes to DR infrastructure or procedures that are identified during DR testing.

7. Organizational awareness or consistency must exist across business lines regarding DR test policies, practices and procedures.
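
Requirement 3 above (annual testing of critical applications) can be checked mechanically once an application inventory exists. The following is a minimal sketch in Python, assuming a hypothetical inventory in which each record carries an agreed criticality rating and the date of its last DR test. The field names, the "critical" label, the 365-day interval and the sample data are all illustrative assumptions, not part of the original material.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Optional


@dataclass
class Application:
    name: str
    criticality: str              # hypothetical rating agreed by business units and IT
    last_dr_test: Optional[date]  # None if the application has never been tested


def overdue_for_dr_test(apps: List[Application], today: date,
                        max_age_days: int = 365) -> List[str]:
    """Return names of critical applications not tested within max_age_days."""
    cutoff = today - timedelta(days=max_age_days)
    return [
        app.name
        for app in apps
        if app.criticality == "critical"
        and (app.last_dr_test is None or app.last_dr_test < cutoff)
    ]


# Invented sample inventory, for illustration only.
inventory = [
    Application("payments", "critical", date(2014, 1, 15)),
    Application("intranet", "low", None),
    Application("trading", "critical", date(2015, 3, 1)),
    Application("billing", "critical", None),
]
overdue = overdue_for_dr_test(inventory, today=date(2015, 6, 1))
print(overdue)
```

An application that has never been tested is treated as overdue, which keeps untested critical applications visible rather than silently excluded.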


Industry frameworks such as IBM's Enterprise Business Continuity Framework, COBIT and ITIL support the strategies used by organizations in their DR testing programs.

Typical industry framework considerations

1. DR plans should be tested based on business needs.

2. Direct linkage between DR test strategies and how the organization manages operational risk.

3. DR test programs are integrated within the overall IT Service Continuity Management Program (ITSCM).

4. DR plans should be tested at every major change to the business environment.

5. DR test results should be recorded and gaps or weaknesses formulated into action plans and tracked through to resolution.

6. DR test results should be used to verify and enhance training.

7. DR tests should consider infrastructure testing, single application testing, integrated testing, end-to-end testing and integrated vendor testing.

8. A Business Continuity Steering Committee should be implemented to support the ITSCM, DR and BCP organization programs.

9. Personnel involved in the development of the DR plans and day to day operation of the test target should not be involved in the DR test.

10. A measurement program must measure and report on multiple aspects of the DR test program.
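
Consideration 5 above (recording results and tracking gaps through to resolution) can be illustrated with a small in-memory log. This is a minimal sketch in Python under stated assumptions: the `Finding` and `FindingLog` names, their fields and the sample findings are hypothetical and invented for illustration. A real program would more likely use an existing problem or change management tool.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Finding:
    test_id: str       # hypothetical identifier of the DR test that raised the gap
    description: str
    owner: str
    resolved: bool = False


class FindingLog:
    """Records gaps found during DR tests and tracks them to resolution."""

    def __init__(self) -> None:
        self._findings: List[Finding] = []

    def record(self, finding: Finding) -> None:
        self._findings.append(finding)

    def resolve(self, description: str) -> None:
        # Mark every finding with a matching description as resolved.
        for finding in self._findings:
            if finding.description == description:
                finding.resolved = True

    def open_items(self) -> List[Finding]:
        return [f for f in self._findings if not f.resolved]


# Invented sample findings, for illustration only.
log = FindingLog()
log.record(Finding("DR-2015-01", "DNS entries missing for DR servers", "network team"))
log.record(Finding("DR-2015-01", "Restore runbook out of date", "storage team"))
log.resolve("Restore runbook out of date")
open_descriptions = [f.description for f in log.open_items()]
print(open_descriptions)
```

Carrying the open items into the next pre-test cycle is what closes the loop between one exercise and the next.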


High Level Regulatory Requirements

1. Organizations need to ensure that business continuity plans are effective and that necessary modifications are identified through periodic testing.

2. Organizations should test their business continuity plans, evaluate their effectiveness and update their business continuity management, as appropriate.

3. Testing the ability to recover "critical operations" is essential.

4. Testing, which can take many forms, should be conducted periodically, with the nature, scope and frequency determined by the criticality of the applications/business functions. These tests should identify the need to modify the recovery plan due to changes in the organization's business, responsibilities, systems, software, hardware, personnel, facilities or the external environment.

Research into regulatory requirements further supports the strategies in place at leading organizations and supports recommendations contained within industry frameworks.

Regulatory requirements tend to be specified at such a level as to allow organizational compliance within a range that enables a balance between the cost of mitigation controls and operational risk tolerance.


DR Testing Best Practices: Learning from the Experience of Others


DR Test Programs in leading organizations exhibit a strong and highly visible governance structure, dedicated DR testing resources and a technology strategy that minimizes risk to the production environment.

Typical DR Test Program elements in leading organizations

1. A strong and highly visible DR testing governance structure.

2. Linkage between the DR test program and other processes within the IT Service Management Program to ensure that the DR test program remains a viable part of the services provided by IT.

3. Long-term commitment to increase the DR testing capability over time until goals are reached.

4. DR test goals reflect the organization’s tolerance for risk, regulatory pressures and deployed technology.

5. DR test programs receive significant investment and organizational support including: clear ownership, roles and responsibilities and a dedicated team to execute and improve the program.

6. Testing priorities are based upon business requirements.

7. Common DR test methodology, framework and processes.

8. Test strategies make every effort to minimize risk to production.

9. Day-to-day production operations staff are not involved in DR testing.

10. Test existing DR processes and procedures (i.e., not customized for specific tests nor executed "on-the-fly").

11. Test DR plans to the fullest extent, including third-party links, switching production network links to DR test sites, moving operations to alternate sites, etc.


Learning from Others: Effective DR Test Program Strategies leverage those used by leading organizations and the practices identified in industry-recognized frameworks.

Inputs to the DR Test Program Strategy:

• Identify DR Test Strategies used by other organizations

• Identify Industry Framework DR Testing Recommendations

• Identify Industry DR Testing - Best Practices

• Identify DR Testing Life Cycle - Best Practice

• Identify DR Testing Preparation Timelines - Best Practice

• Identify DR Test Plan Components - Best Practice

• Identify Full Life Cycle Testing Practices

• Identify DR Test Program Governance Scope and Requirements

• Identify Solution-specific Test Strategies

• Identify Test Standards for HA Environments

• Identify Test Standards for Non HA Environments


Learning from Others: Pre and Post Test Activities tend to be very dynamic based on requirements and require significant commitment to planning and preparation.

Typical Pre and Post Test Activities in leading organizations

1. Select test type (tabletop vs live)

2. Pre-Test Cycle - weekly meetings to prepare for exercise

3. Discuss/agree scope, objectives, expected outcomes, issues, constraints, go/no-go decision points, open items from last test, roles, responsibilities, alignment of expectations with available test window, sequence of activities, estimated time frames.

4. Conduct test

5. Review results and define action plan

6. Report results

7. Track recommendations and corrective actions
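
The go/no-go decision point in step 3 above can be expressed as a simple checklist evaluation. The following is a minimal sketch in Python, assuming the rule that a test proceeds only when every pre-test item has been agreed. That rule, the choice of checklist items (a subset drawn from step 3) and the `go_no_go` function name are illustrative assumptions; an organization may well apply different criteria.

```python
from typing import Dict, List, Tuple

# Subset of the pre-test discussion items listed in step 3 (illustrative selection).
PRE_TEST_ITEMS = [
    "scope",
    "objectives",
    "expected outcomes",
    "roles and responsibilities",
    "sequence of activities",
    "estimated time frames",
]


def go_no_go(agreed: Dict[str, bool]) -> Tuple[bool, List[str]]:
    """Return (decision, outstanding items); an item absent from `agreed` counts as not agreed."""
    outstanding = [item for item in PRE_TEST_ITEMS if not agreed.get(item, False)]
    return (not outstanding, outstanding)


# Invented status from a hypothetical weekly preparation meeting.
status = {
    "scope": True,
    "objectives": True,
    "expected outcomes": True,
    "roles and responsibilities": False,
    "sequence of activities": True,
}
decision, outstanding = go_no_go(status)
print("GO" if decision else "NO-GO", outstanding)
```

Returning the outstanding items alongside the decision gives the weekly preparation meetings a concrete agenda.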


Learning from Others: Effective reporting - Detailed exercise overviews and statistical performance summaries facilitate quantifiable measurements of success.

Copies of outputs from various systems provide evidence to support results.

1. Objectives

Tape Management Network Connectivity Desktop Systems Access Mainframe Data Mirroring Job Scheduling and Output Management Infrastructure Validation Enterprise Tools Recovery Validation Systems Security i Series Recovery Open Systems Recovery (Unix, Storage, Wintel)

2. Exercise Assumptions

3. Planned Process Deviations

4. Problem Management

5. Communication Plan

6. Key Telephone Numbers

7. Staff and Shift Schedule

8. Directions and Map to DR site

9. Command Centre Roles & Responsibilities

10. Time Line and Milestones Exercise Mainframe LPARS i series LPARS Distributed Environment

11. Appendices DR Exercise Problem Log Form Triage Process DNS Names and associated IP Addresses of DR servers Access ACLs DR Configuration Setup Network Diagrams

Typical Exercise Overview Table of Contents
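
A statistical performance summary of the kind described above can be produced from per-objective timings. This is a minimal sketch in Python, assuming that each exercise objective has a target duration and a measured duration in minutes; the `ObjectiveResult` type, the "met if actual is within target" rule and all figures are invented for illustration and do not come from a real exercise.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ObjectiveResult:
    objective: str
    target_minutes: int   # assumed planned duration for the objective
    actual_minutes: int   # assumed measured duration during the exercise

    @property
    def met(self) -> bool:
        return self.actual_minutes <= self.target_minutes


def summarize(results: List[ObjectiveResult]) -> Dict[str, float]:
    """Count objectives met and missed and compute the percentage met."""
    met = sum(1 for r in results if r.met)
    total = len(results)
    return {
        "objectives": total,
        "met": met,
        "missed": total - met,
        "percent_met": round(100 * met / total, 1) if total else 0.0,
    }


# Invented sample results, for illustration only.
results = [
    ObjectiveResult("Tape Management", 120, 95),
    ObjectiveResult("Network Connectivity", 60, 75),
    ObjectiveResult("Infrastructure Validation", 240, 240),
    ObjectiveResult("Systems Security", 90, 80),
]
summary = summarize(results)
print(summary)
```

Attaching the system outputs that back each measured duration keeps the summary verifiable.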


For more information

Find out how IBM resiliency consultants can help you enhance your disaster recovery testing. Contact your IBM representative or IBM Business Partner, and download the no-charge whitepaper on resiliency program exercising and testing from our website:

ibm.com/services/continuity
