1
Availability Policy
(slides from Clement Chen and Craig Lewis)
2
Definition
• The degree to which data or systems are accessible and in functioning condition
• Looking at it another way: the degree to which the system is fulfilling its intended function
3
Availability and Reliability
Availability and reliability are not the same thing:
• Availability means that the system is ready for use
• Reliability means that a device or system can perform its job when called upon to do so
There is overlap, but they are not the same thing.
4
Major Causes of Disruption
• Human interference
  – Operator error
  – Virus and hacker attack
  – Theft or sabotage
• Communication failure
• Hardware or system failure
• Natural disasters
• Power failure
• Water damage
• Fire
5
Aspects of Availability
• Data Availability
• Network Availability
• Communication Availability
• System Availability
• Power Availability
• People Availability
• Other Resources Availability
6
Data Availability
• Rule 1: Backup
• Rule 2: Backup
• Rule 3: Backup
7
Backup Methods
Full Backup
  – Backs up every file
  – Takes a lot of storage space
Incremental Backup
  – Backs up only files that have been created or modified since the last backup
  – A complete restoration needs the last full backup plus every incremental since, so the operator may need several tapes
Differential Backup
  – Backs up only files that have been created or modified since the last full backup
  – The operator needs only the full backup and the most recent differential backup to restore the system
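The difference in restore effort between incremental and differential backups can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Backup` record, the day numbering, and the `restore_set` helper are hypothetical names, not part of any backup product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Backup:
    day: int   # hypothetical day number on which the backup ran
    kind: str  # "full", "incremental", or "differential"


def restore_set(history):
    """Return, oldest first, the backups needed to restore to the latest one."""
    ordered = sorted(history, key=lambda b: b.day)
    needed = []
    i = len(ordered) - 1
    # Each incremental holds only the changes since the previous backup,
    # so every trailing incremental is required.
    while i >= 0 and ordered[i].kind == "incremental":
        needed.append(ordered[i])
        i -= 1
    # A differential holds everything since the last full backup,
    # so anything between it and that full backup can be skipped.
    if i >= 0 and ordered[i].kind == "differential":
        needed.append(ordered[i])
        while i >= 0 and ordered[i].kind != "full":
            i -= 1
    if i < 0 or ordered[i].kind != "full":
        raise ValueError("no full backup found to restore from")
    needed.append(ordered[i])
    return list(reversed(needed))


# Full backup on day 0, then four daily backups of one kind.
incremental_week = [Backup(0, "full")] + [Backup(d, "incremental") for d in range(1, 5)]
differential_week = [Backup(0, "full")] + [Backup(d, "differential") for d in range(1, 5)]

print(len(restore_set(incremental_week)))              # all five backups
print([b.day for b in restore_set(differential_week)])  # full plus latest differential
```

The incremental schedule writes less data each day but needs every tape at restore time; the differential schedule writes more each day but restores from two.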
8
Data Retention
Sarbanes-Oxley
• All electronic company information must be retained for at least five years
• Accounting firms that audit publicly traded companies must retain all related documents for 7 years after the audit
HIPAA
• Members of the health care industry must retain patient information for 6 years
SEC 17a-3 and 17a-4
• Brokers/dealers must retain records for 3 to 6 years or more
9
Data Vaulting
• A copy of the data is saved at a remote site, periodically or continuously, via the network
• The remote site may be the organization's own site or a vendor location
• Minimal or no data may be lost in a disaster
• There is typically some delay before the data can actually be used
10
Network Availability
• Prioritize the systems needing network access
• Measure the amount of bandwidth needed to fulfill the purpose of each component
• Calculate the overhead of protective measures
• Decide what (if anything) can drop
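The four steps above can be sketched as a simple bandwidth budget. Everything in this example is an assumption made for illustration: the system names, the bandwidth figures, the 10% overhead for protective measures, and the `plan_bandwidth` helper itself.

```python
def plan_bandwidth(systems, link_mbps, overhead_fraction):
    """Allocate link capacity in priority order; return (kept, dropped) names.

    systems: list of (name, priority, needed_mbps); lower priority number
             means more important.
    overhead_fraction: share of the link consumed by protective measures
                       (an assumed, measured-elsewhere figure).
    """
    usable = link_mbps * (1 - overhead_fraction)
    kept, dropped = [], []
    for name, _priority, needed in sorted(systems, key=lambda s: s[1]):
        if needed <= usable:
            kept.append(name)
            usable -= needed
        else:
            dropped.append(name)
    return kept, dropped


# Hypothetical inventory: (name, priority, bandwidth needed in Mbps)
systems = [
    ("backups", 3, 50),
    ("payments", 1, 40),
    ("email", 2, 30),
]

kept, dropped = plan_bandwidth(systems, link_mbps=100, overhead_fraction=0.10)
print("keep:", kept)
print("drop:", dropped)
```

This greedy pass lets a small low-priority system through even after a larger one has been dropped; a stricter policy would stop at the first system that does not fit.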
11
Service Level Agreement
• Can the ISP deliver?
• Can your equipment handle it?
• Higher bandwidth – for what?
  – More business
  – Faster customer access
  – Faster music downloads
  – More scanning
12
People and Availability
• People are a source of information
• When staff who know how to fix a problem are not there to fix it, availability suffers
  – Positional redundancy – “Worker X can do that, but she’s not here until tomorrow”
  – Shared knowledge – “What if I get hit by a bus?”
  – Limitations on physical access – “It’s a 30 second fix, but it will take me 10 minutes to get there”
  – Limitations placed by policy – “I know how to fix it, but I’m not allowed to go in the server room”
13
Infrastructure Availability
• Availability of the infrastructure can have a direct impact on availability of information
  – Voice communications
  – Power
  – HVAC
  – Physical access
14
Infrastructure Solutions
Voice
• Cellular phones
• WiFi phones
• Walkie-talkies
Power
• Uninterruptible Power Supply (UPS)
• Generators
HVAC
• Portable coolers
• Fans/blowers
Physical Access
• Security guards
• Transportation shuttles
• Backup/alternative to electronic access controls
15
Measuring Availability
What does it mean to be available, and how can it be measured?
Availability means that systems or data are accessible, but it does not guarantee:
  – Performance
  – That typical ways of doing things can still be used
  – Full system capacity
16
MTBF & MTTR
Definitions
• Mean Time Between Failures (MTBF) is the average amount of time between failures, where a failure is defined as a departure from acceptable service for a system. This is a measure of reliability.
• Mean Time to Recover (MTTR) is the average amount of time required to repair or recover a failed system.
• Availability is the ratio of the time a system is actually available to the time it should have been available.
Availability = MTBF / (MTBF + MTTR)
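As a worked example of the ratio MTBF / (MTBF + MTTR), the sketch below computes availability from the two means and also inverts the formula to find the recovery time a target availability allows. The function names and the sample figures (999 hours between failures, 1 hour to recover) are illustrative assumptions.

```python
def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Availability = MTBF / (MTBF + MTTR), as a fraction between 0 and 1."""
    if mtbf_hours <= 0 or mttr_hours < 0:
        raise ValueError("MTBF must be positive and MTTR non-negative")
    return mtbf_hours / (mtbf_hours + mttr_hours)


def max_mttr_hours(mtbf_hours: float, target: float) -> float:
    """Largest MTTR that still meets a target availability (0 < target <= 1).

    Obtained by solving target = MTBF / (MTBF + MTTR) for MTTR.
    """
    if not 0 < target <= 1:
        raise ValueError("target must be in (0, 1]")
    return mtbf_hours * (1 - target) / target


# Hypothetical system: fails on average every 999 hours, takes 1 hour to recover.
a = availability(mtbf_hours=999.0, mttr_hours=1.0)
print(f"availability: {a:.4%}")

# How long may recovery take if the same system must reach 99.99%?
print(f"max MTTR for 99.99%: {max_mttr_hours(999.0, 0.9999) * 60:.1f} minutes")
```

The formula shows that availability can be raised from either side: failing less often (higher MTBF) or recovering faster (lower MTTR).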
17
Availability Values
• 1 week

Threshold    Downtime
99%          1.1 hr
99.9%        6.3 min
99.99%       37.8 sec
99.999%      3.8 sec
99.9999%     0.38 sec
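Downtime allowances like these follow directly from the threshold and the length of the measurement period. The sketch below is a minimal calculation; the function name and the choice of period are assumptions. Reading the slide's figures as 1.1 hr, 6.3 min, 37.8 sec, 3.8 sec, and 0.38 sec, they are consistent with a service week of 105 hours (for example, 15 hours a day for 7 days) rather than a full 168-hour week, though the slide does not say so.

```python
def allowed_downtime_seconds(availability_pct: float, period_hours: float) -> float:
    """Seconds of downtime permitted in a period at a given availability."""
    if not 0 <= availability_pct <= 100:
        raise ValueError("availability_pct must be between 0 and 100")
    return (1 - availability_pct / 100) * period_hours * 3600


THRESHOLDS = (99.0, 99.9, 99.99, 99.999, 99.9999)

# 105 hours is an assumed service week; 168 hours is a full calendar week.
for period_hours in (105, 168):
    print(f"period: {period_hours} hours")
    for pct in THRESHOLDS:
        seconds = allowed_downtime_seconds(pct, period_hours)
        print(f"  {pct}%  ->  {seconds:.2f} sec")
```

Each additional nine cuts the allowance by a factor of ten, which is why the higher thresholds leave no room for manual recovery.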
18
Business Continuity Planning
• Big deal since 9/11
• Every business continuity strategy includes three fundamental components
  – Business Impact Analysis
  – Recovery Strategy
  – Design and development of the disaster recovery process
• A BCP should consider every type of interruption, from a brief power outage up to the worst possible natural disaster or terrorist attack
19
Requirements of a BCP
1. Provide procedures and a listing of resources to assist in the recovery process
2. Provide an immediate, accurate, and measured response to emergency situations
3. Identify vendors that may be needed in the recovery process, and put agreements in place with selected vendors
4. Avoid the confusion experienced during a crisis by documenting, testing, and training on plan procedures
5. Provide clear guidance for declaring a disaster
6. Provide the necessary directions to ensure the timely resumption of critical services
7. Document recovery processes so they can be executed by knowledgeable people
20
BCDR Resources
Survive: The Business Continuity Group
  – http://www.survive.com
Emergency Information Infrastructure Partnership
  – http://www.eiip.org
Disaster Recovery Journal
  – http://www.drj.com
21
Summary
• Lots of parts of availability
• Tradeoffs are essential
• Complexity, complexity, complexity
• Need policy for a roadmap
2
Definition
bull The degree to which data or systems are accessible and in functioning condition
bull Looking at it another way the degree to which the system is fulfilling the intended function
3
Availability and Reliability
Availability and Reliability are not the same thingbull Availability means that the system is ready for
usebull Reliability means that a device or system can
perform its job when called upon to do so
There is overlap but they are not the same thing
4
Major Causes of Disruption
bull Human Interferencendash Operator errorndash Virus and hacker attackndash Theft or sabotage
bull Communication Failurebull Hardware or system failurebull Natural Disastersbull Power Failurebull Water Damagebull Fire
5
Aspects of Availability
bull Data Availability
bull Network Availability
bull Communication Availability
bull System Availability
bull Power Availability
bull People Availability
bull Other Resources Availability
6
Data Availability
bull 1048708 Rule 1 Backup
bull 1048708 Rule 2 Backup
bull 1048708 Rule 3 Backup
7
Backup Methods
Full Backupndash Backup every filendash Takes a lot of storage space
Incremental Backupndash backs up files that have been created or modified only since the
last backupndash backup operator needing several tapes to do a complete restoration
Differential Backupndash backs up files that have been created or modified only since the
last full backupndash backup operator need only the full backup and the one differential
backup to restore thesystem
8
Data RetentionSarbanes Oxleybull All electronic company information must be retained for at
least five yearsbull Accounting firms that audit publicly traded companies
must retain all related documents for 7 years after auditHIPPAbull 1048708 Members of health care industry must retain patient
information for 6 yearsSEC 17a-3 and 17a-4bull Brokersdealers must retain records for 3-6 years and more
9
Data Vaulting
bull Copy of data is saved at a remote site periodically or continuously via network
bull Remote site may be own site or at a vendor location
bull Minimal or no data maybe lost in a disaster
bull There is typically some delay before data can actually be used
10
Network Availability
bull Prioritize the systems needing network access
bull Measure the amount of bandwidth needed to fulfill purpose of each component
bull Calculate overhead of protective measures
bull Decide what (if anything) can drop
11
Service Level Agreement
bull Can the ISP deliver
bull Can your equipment handle it
bull Higher bandwidth ndash for whatndash More businessndash Faster customer accessndash Faster music downloadsndash More scanning
12
People and Availability
bull People are a source of informationbull Staff with knowledge of how to fix a problem not
being there to fix it negatively impacts availabilityndash Positional redundancy ndash ldquoWorker X can do that but
shersquos not here until tomorrowrdquondash Shared knowledge ndash ldquoWhat if I get hit by a busrdquondash Limitations on physical access ndash ldquoItrsquos a 30 second fix
but it will take me 10 minutes to get thererdquondash Limitations placed by policy ndash ldquoI know how to fix it
but Irsquom not allowed to go in the server roomrdquo
13
Infrastructure Availability
bull Availability of the infrastructure can have a direct impact on availability of informationndash Voice communicationsndash Powerndash HVACndash Physical access
14
Infrastructure Solutions
Voicebull Cellular Phones bull WiFi Phonesbull Walkie-talkiesPowerbull Uninterruptible Power Supply (UPS)bull GeneratorsHVACbull Portable coolersbull FansBlowersPhysical Accessbull Security guardsbull Transportation shuttlesbull Backupalternative to electronic access controls
15
Measuring Availability
What does it mean to be available and how can it be measured
Availability means that systems or data are accessible but does not guaranteendash Performancendash Typical ways of doing things can still be usedndash Full system capacity
16
MTBF amp MTTR
Definitionsbull Mean Time Between Failure (MTBF) is the amount of
time between failures where failure is defined as a departure from acceptable service for a system This is a measure of reliability
bull Mean Time to Recover (MTTR) measures the amount of time required to repair or recovery for a failed system
bull Availability is the ratio of the time a system is actually available to the time it should have been available
Availability = MTBF (MTBF + MTTR)
17
Availability Values
bull 1 weekThreshold Downtime
99 11 hr
999 63 min
9999 378 sec
99999 38 sec
999999 038 sec
18
Business Continuity Planning
bull Big deal since 911bull Every Business Continuity strategy includes three
fundamental componentsndash Business Impact Analysisndash Recovery Strategyndash Design and Develop the disaster recovery process
bull BCP should consider every type of interruption from a brief power outage up to the worst possible natural disaster or terrorist attack
19
Requirements of a BCP
1 Provide procedures and listing of resources to assist in the recovery process
2 Provide an immediate accurate and measured response to emergency situations
3 Identify vendors that may be needed in the recovery process and put agreements in place with selected vendors
4 Avoid confusion experienced during a crisis by documenting testing an training plan procedures
5 Clear guidance for declaring a disaster6 Provide the necessary directions to ensure the timely resumption of
critical services7 Document recovery processes so they can be executed by
knowledgeable people
20
BCDR Resources
Survive The Business Continuity Groupndash httpwwwsurvivecom
Emergency Information Infrastructure Partnershipndash httpwwweiiporg
Disaster Recovery Journalndash httpwwwdrjcom
21
Summary
bull Lots of parts of availability
bull Tradeoffs are essential
bull Complexity complexity complexity
bull Need policy for a roadmap
3
Availability and Reliability
Availability and Reliability are not the same thingbull Availability means that the system is ready for
usebull Reliability means that a device or system can
perform its job when called upon to do so
There is overlap but they are not the same thing
4
Major Causes of Disruption
bull Human Interferencendash Operator errorndash Virus and hacker attackndash Theft or sabotage
bull Communication Failurebull Hardware or system failurebull Natural Disastersbull Power Failurebull Water Damagebull Fire
5
Aspects of Availability
bull Data Availability
bull Network Availability
bull Communication Availability
bull System Availability
bull Power Availability
bull People Availability
bull Other Resources Availability
6
Data Availability
bull 1048708 Rule 1 Backup
bull 1048708 Rule 2 Backup
bull 1048708 Rule 3 Backup
7
Backup Methods
Full Backupndash Backup every filendash Takes a lot of storage space
Incremental Backupndash backs up files that have been created or modified only since the
last backupndash backup operator needing several tapes to do a complete restoration
Differential Backupndash backs up files that have been created or modified only since the
last full backupndash backup operator need only the full backup and the one differential
backup to restore thesystem
8
Data RetentionSarbanes Oxleybull All electronic company information must be retained for at
least five yearsbull Accounting firms that audit publicly traded companies
must retain all related documents for 7 years after auditHIPPAbull 1048708 Members of health care industry must retain patient
information for 6 yearsSEC 17a-3 and 17a-4bull Brokersdealers must retain records for 3-6 years and more
9
Data Vaulting
bull Copy of data is saved at a remote site periodically or continuously via network
bull Remote site may be own site or at a vendor location
bull Minimal or no data maybe lost in a disaster
bull There is typically some delay before data can actually be used
10
Network Availability
bull Prioritize the systems needing network access
bull Measure the amount of bandwidth needed to fulfill purpose of each component
bull Calculate overhead of protective measures
bull Decide what (if anything) can drop
11
Service Level Agreement
bull Can the ISP deliver
bull Can your equipment handle it
bull Higher bandwidth ndash for whatndash More businessndash Faster customer accessndash Faster music downloadsndash More scanning
12
People and Availability
bull People are a source of informationbull Staff with knowledge of how to fix a problem not
being there to fix it negatively impacts availabilityndash Positional redundancy ndash ldquoWorker X can do that but
shersquos not here until tomorrowrdquondash Shared knowledge ndash ldquoWhat if I get hit by a busrdquondash Limitations on physical access ndash ldquoItrsquos a 30 second fix
but it will take me 10 minutes to get thererdquondash Limitations placed by policy ndash ldquoI know how to fix it
but Irsquom not allowed to go in the server roomrdquo
13
Infrastructure Availability
bull Availability of the infrastructure can have a direct impact on availability of informationndash Voice communicationsndash Powerndash HVACndash Physical access
14
Infrastructure Solutions
Voicebull Cellular Phones bull WiFi Phonesbull Walkie-talkiesPowerbull Uninterruptible Power Supply (UPS)bull GeneratorsHVACbull Portable coolersbull FansBlowersPhysical Accessbull Security guardsbull Transportation shuttlesbull Backupalternative to electronic access controls
15
Measuring Availability
What does it mean to be available and how can it be measured
Availability means that systems or data are accessible but does not guaranteendash Performancendash Typical ways of doing things can still be usedndash Full system capacity
16
MTBF amp MTTR
Definitionsbull Mean Time Between Failure (MTBF) is the amount of
time between failures where failure is defined as a departure from acceptable service for a system This is a measure of reliability
bull Mean Time to Recover (MTTR) measures the amount of time required to repair or recovery for a failed system
bull Availability is the ratio of the time a system is actually available to the time it should have been available
Availability = MTBF (MTBF + MTTR)
17
Availability Values
bull 1 weekThreshold Downtime
99 11 hr
999 63 min
9999 378 sec
99999 38 sec
999999 038 sec
18
Business Continuity Planning
bull Big deal since 911bull Every Business Continuity strategy includes three
fundamental componentsndash Business Impact Analysisndash Recovery Strategyndash Design and Develop the disaster recovery process
bull BCP should consider every type of interruption from a brief power outage up to the worst possible natural disaster or terrorist attack
19
Requirements of a BCP
1 Provide procedures and listing of resources to assist in the recovery process
2 Provide an immediate accurate and measured response to emergency situations
3 Identify vendors that may be needed in the recovery process and put agreements in place with selected vendors
4 Avoid confusion experienced during a crisis by documenting testing an training plan procedures
5 Clear guidance for declaring a disaster6 Provide the necessary directions to ensure the timely resumption of
critical services7 Document recovery processes so they can be executed by
knowledgeable people
20
BCDR Resources
Survive The Business Continuity Groupndash httpwwwsurvivecom
Emergency Information Infrastructure Partnershipndash httpwwweiiporg
Disaster Recovery Journalndash httpwwwdrjcom
21
Summary
bull Lots of parts of availability
bull Tradeoffs are essential
bull Complexity complexity complexity
bull Need policy for a roadmap
4
Major Causes of Disruption
bull Human Interferencendash Operator errorndash Virus and hacker attackndash Theft or sabotage
bull Communication Failurebull Hardware or system failurebull Natural Disastersbull Power Failurebull Water Damagebull Fire
5
Aspects of Availability
bull Data Availability
bull Network Availability
bull Communication Availability
bull System Availability
bull Power Availability
bull People Availability
bull Other Resources Availability
6
Data Availability
bull 1048708 Rule 1 Backup
bull 1048708 Rule 2 Backup
bull 1048708 Rule 3 Backup
7
Backup Methods
Full Backupndash Backup every filendash Takes a lot of storage space
Incremental Backupndash backs up files that have been created or modified only since the
last backupndash backup operator needing several tapes to do a complete restoration
Differential Backupndash backs up files that have been created or modified only since the
last full backupndash backup operator need only the full backup and the one differential
backup to restore thesystem
8
Data RetentionSarbanes Oxleybull All electronic company information must be retained for at
least five yearsbull Accounting firms that audit publicly traded companies
must retain all related documents for 7 years after auditHIPPAbull 1048708 Members of health care industry must retain patient
information for 6 yearsSEC 17a-3 and 17a-4bull Brokersdealers must retain records for 3-6 years and more
9
Data Vaulting
bull Copy of data is saved at a remote site periodically or continuously via network
bull Remote site may be own site or at a vendor location
bull Minimal or no data maybe lost in a disaster
bull There is typically some delay before data can actually be used
10
Network Availability
bull Prioritize the systems needing network access
bull Measure the amount of bandwidth needed to fulfill purpose of each component
bull Calculate overhead of protective measures
bull Decide what (if anything) can drop
11
Service Level Agreement
bull Can the ISP deliver
bull Can your equipment handle it
bull Higher bandwidth ndash for whatndash More businessndash Faster customer accessndash Faster music downloadsndash More scanning
12
People and Availability
bull People are a source of informationbull Staff with knowledge of how to fix a problem not
being there to fix it negatively impacts availabilityndash Positional redundancy ndash ldquoWorker X can do that but
shersquos not here until tomorrowrdquondash Shared knowledge ndash ldquoWhat if I get hit by a busrdquondash Limitations on physical access ndash ldquoItrsquos a 30 second fix
but it will take me 10 minutes to get thererdquondash Limitations placed by policy ndash ldquoI know how to fix it
but Irsquom not allowed to go in the server roomrdquo
13
Infrastructure Availability
bull Availability of the infrastructure can have a direct impact on availability of informationndash Voice communicationsndash Powerndash HVACndash Physical access
14
Infrastructure Solutions
Voicebull Cellular Phones bull WiFi Phonesbull Walkie-talkiesPowerbull Uninterruptible Power Supply (UPS)bull GeneratorsHVACbull Portable coolersbull FansBlowersPhysical Accessbull Security guardsbull Transportation shuttlesbull Backupalternative to electronic access controls
15
Measuring Availability
What does it mean to be available and how can it be measured
Availability means that systems or data are accessible but does not guaranteendash Performancendash Typical ways of doing things can still be usedndash Full system capacity
16
MTBF amp MTTR
Definitionsbull Mean Time Between Failure (MTBF) is the amount of
time between failures where failure is defined as a departure from acceptable service for a system This is a measure of reliability
bull Mean Time to Recover (MTTR) measures the amount of time required to repair or recovery for a failed system
bull Availability is the ratio of the time a system is actually available to the time it should have been available
Availability = MTBF (MTBF + MTTR)
17
Availability Values
bull 1 weekThreshold Downtime
99 11 hr
999 63 min
9999 378 sec
99999 38 sec
999999 038 sec
18
Business Continuity Planning
bull Big deal since 911bull Every Business Continuity strategy includes three
fundamental componentsndash Business Impact Analysisndash Recovery Strategyndash Design and Develop the disaster recovery process
bull BCP should consider every type of interruption from a brief power outage up to the worst possible natural disaster or terrorist attack
19
Requirements of a BCP
1 Provide procedures and listing of resources to assist in the recovery process
2 Provide an immediate accurate and measured response to emergency situations
3 Identify vendors that may be needed in the recovery process and put agreements in place with selected vendors
4 Avoid confusion experienced during a crisis by documenting testing an training plan procedures
5 Clear guidance for declaring a disaster6 Provide the necessary directions to ensure the timely resumption of
critical services7 Document recovery processes so they can be executed by
knowledgeable people
20
BCDR Resources
Survive The Business Continuity Groupndash httpwwwsurvivecom
Emergency Information Infrastructure Partnershipndash httpwwweiiporg
Disaster Recovery Journalndash httpwwwdrjcom
21
Summary
bull Lots of parts of availability
bull Tradeoffs are essential
bull Complexity complexity complexity
bull Need policy for a roadmap
5
Aspects of Availability
bull Data Availability
bull Network Availability
bull Communication Availability
bull System Availability
bull Power Availability
bull People Availability
bull Other Resources Availability
6
Data Availability
bull 1048708 Rule 1 Backup
bull 1048708 Rule 2 Backup
bull 1048708 Rule 3 Backup
7
Backup Methods
Full Backupndash Backup every filendash Takes a lot of storage space
Incremental Backupndash backs up files that have been created or modified only since the
last backupndash backup operator needing several tapes to do a complete restoration
Differential Backupndash backs up files that have been created or modified only since the
last full backupndash backup operator need only the full backup and the one differential
backup to restore thesystem
8
Data RetentionSarbanes Oxleybull All electronic company information must be retained for at
least five yearsbull Accounting firms that audit publicly traded companies
must retain all related documents for 7 years after auditHIPPAbull 1048708 Members of health care industry must retain patient
information for 6 yearsSEC 17a-3 and 17a-4bull Brokersdealers must retain records for 3-6 years and more
9
Data Vaulting
bull Copy of data is saved at a remote site periodically or continuously via network
bull Remote site may be own site or at a vendor location
bull Minimal or no data maybe lost in a disaster
bull There is typically some delay before data can actually be used
10
Network Availability
bull Prioritize the systems needing network access
bull Measure the amount of bandwidth needed to fulfill purpose of each component
bull Calculate overhead of protective measures
bull Decide what (if anything) can drop
11
Service Level Agreement
bull Can the ISP deliver
bull Can your equipment handle it
bull Higher bandwidth ndash for whatndash More businessndash Faster customer accessndash Faster music downloadsndash More scanning
12
People and Availability
bull People are a source of informationbull Staff with knowledge of how to fix a problem not
being there to fix it negatively impacts availabilityndash Positional redundancy ndash ldquoWorker X can do that but
shersquos not here until tomorrowrdquondash Shared knowledge ndash ldquoWhat if I get hit by a busrdquondash Limitations on physical access ndash ldquoItrsquos a 30 second fix
but it will take me 10 minutes to get thererdquondash Limitations placed by policy ndash ldquoI know how to fix it
but Irsquom not allowed to go in the server roomrdquo
13
Infrastructure Availability
bull Availability of the infrastructure can have a direct impact on availability of informationndash Voice communicationsndash Powerndash HVACndash Physical access
14
Infrastructure Solutions
Voicebull Cellular Phones bull WiFi Phonesbull Walkie-talkiesPowerbull Uninterruptible Power Supply (UPS)bull GeneratorsHVACbull Portable coolersbull FansBlowersPhysical Accessbull Security guardsbull Transportation shuttlesbull Backupalternative to electronic access controls
15
Measuring Availability
What does it mean to be available and how can it be measured
Availability means that systems or data are accessible but does not guaranteendash Performancendash Typical ways of doing things can still be usedndash Full system capacity
16
MTBF amp MTTR
Definitionsbull Mean Time Between Failure (MTBF) is the amount of
time between failures where failure is defined as a departure from acceptable service for a system This is a measure of reliability
bull Mean Time to Recover (MTTR) measures the amount of time required to repair or recovery for a failed system
bull Availability is the ratio of the time a system is actually available to the time it should have been available
Availability = MTBF (MTBF + MTTR)
17
Availability Values
bull 1 weekThreshold Downtime
99 11 hr
999 63 min
9999 378 sec
99999 38 sec
999999 038 sec
18
Business Continuity Planning
bull Big deal since 911bull Every Business Continuity strategy includes three
fundamental componentsndash Business Impact Analysisndash Recovery Strategyndash Design and Develop the disaster recovery process
bull BCP should consider every type of interruption from a brief power outage up to the worst possible natural disaster or terrorist attack
19
Requirements of a BCP
1 Provide procedures and listing of resources to assist in the recovery process
2 Provide an immediate accurate and measured response to emergency situations
3 Identify vendors that may be needed in the recovery process and put agreements in place with selected vendors
4 Avoid confusion experienced during a crisis by documenting testing an training plan procedures
5 Clear guidance for declaring a disaster6 Provide the necessary directions to ensure the timely resumption of
critical services7 Document recovery processes so they can be executed by
knowledgeable people
20
BCDR Resources
Survive The Business Continuity Groupndash httpwwwsurvivecom
Emergency Information Infrastructure Partnershipndash httpwwweiiporg
Disaster Recovery Journalndash httpwwwdrjcom
21
Summary
bull Lots of parts of availability
bull Tradeoffs are essential
bull Complexity complexity complexity
bull Need policy for a roadmap
6
Data Availability
bull 1048708 Rule 1 Backup
bull 1048708 Rule 2 Backup
bull 1048708 Rule 3 Backup
7
Backup Methods
Full Backupndash Backup every filendash Takes a lot of storage space
Incremental Backupndash backs up files that have been created or modified only since the
last backupndash backup operator needing several tapes to do a complete restoration
Differential Backupndash backs up files that have been created or modified only since the
last full backupndash backup operator need only the full backup and the one differential
backup to restore thesystem
8
Data RetentionSarbanes Oxleybull All electronic company information must be retained for at
least five yearsbull Accounting firms that audit publicly traded companies
must retain all related documents for 7 years after auditHIPPAbull 1048708 Members of health care industry must retain patient
information for 6 yearsSEC 17a-3 and 17a-4bull Brokersdealers must retain records for 3-6 years and more
9
Data Vaulting
bull Copy of data is saved at a remote site periodically or continuously via network
bull Remote site may be own site or at a vendor location
bull Minimal or no data maybe lost in a disaster
bull There is typically some delay before data can actually be used
10
Network Availability
bull Prioritize the systems needing network access
bull Measure the amount of bandwidth needed to fulfill purpose of each component
bull Calculate overhead of protective measures
bull Decide what (if anything) can drop
11
Service Level Agreement
bull Can the ISP deliver
bull Can your equipment handle it
bull Higher bandwidth ndash for whatndash More businessndash Faster customer accessndash Faster music downloadsndash More scanning
12
People and Availability
bull People are a source of informationbull Staff with knowledge of how to fix a problem not
being there to fix it negatively impacts availabilityndash Positional redundancy ndash ldquoWorker X can do that but
shersquos not here until tomorrowrdquondash Shared knowledge ndash ldquoWhat if I get hit by a busrdquondash Limitations on physical access ndash ldquoItrsquos a 30 second fix
but it will take me 10 minutes to get thererdquondash Limitations placed by policy ndash ldquoI know how to fix it
but Irsquom not allowed to go in the server roomrdquo
13
Infrastructure Availability
bull Availability of the infrastructure can have a direct impact on availability of informationndash Voice communicationsndash Powerndash HVACndash Physical access
14
Infrastructure Solutions
Voicebull Cellular Phones bull WiFi Phonesbull Walkie-talkiesPowerbull Uninterruptible Power Supply (UPS)bull GeneratorsHVACbull Portable coolersbull FansBlowersPhysical Accessbull Security guardsbull Transportation shuttlesbull Backupalternative to electronic access controls
15
Measuring Availability
What does it mean to be available and how can it be measured
Availability means that systems or data are accessible but does not guaranteendash Performancendash Typical ways of doing things can still be usedndash Full system capacity
16
MTBF amp MTTR
Definitionsbull Mean Time Between Failure (MTBF) is the amount of
time between failures where failure is defined as a departure from acceptable service for a system This is a measure of reliability
bull Mean Time to Recover (MTTR) measures the amount of time required to repair or recovery for a failed system
bull Availability is the ratio of the time a system is actually available to the time it should have been available
Availability = MTBF (MTBF + MTTR)
17
Availability Values
bull 1 weekThreshold Downtime
99 11 hr
999 63 min
9999 378 sec
99999 38 sec
999999 038 sec
18
Business Continuity Planning
• Big deal since 9/11
• Every Business Continuity strategy includes three fundamental components
– Business Impact Analysis
– Recovery Strategy
– Design and Develop the disaster recovery process
• BCP should consider every type of interruption, from a brief power outage up to the worst possible natural disaster or terrorist attack
19
Requirements of a BCP
1. Provide procedures and a listing of resources to assist in the recovery process
2. Provide an immediate, accurate, and measured response to emergency situations
3. Identify vendors that may be needed in the recovery process, and put agreements in place with selected vendors
4. Avoid confusion experienced during a crisis by documenting, testing, and training on plan procedures
5. Provide clear guidance for declaring a disaster
6. Provide the necessary directions to ensure the timely resumption of critical services
7. Document recovery processes so they can be executed by knowledgeable people
20
BC/DR Resources
Survive: The Business Continuity Group
– http://www.survive.com
Emergency Information Infrastructure Partnership
– http://www.eiip.org
Disaster Recovery Journal
– http://www.drj.com
21
Summary
• Lots of parts to availability
• Tradeoffs are essential
• Complexity, complexity, complexity
• Need policy for a roadmap