Faculty of Computing Blekinge Institute of Technology SE-371 79 Karlskrona Sweden Effectiveness of Backup and Disaster Recovery in Cloud Sindhura Yarrapothu A Comparative study on Tape and Cloud based Backup and Disaster Recovery

Faculty of Computing Blekinge Institute of Technology SE-371 79 Karlskrona Sweden

Effectiveness of Backup and Disaster Recovery in Cloud

A Comparative study on Tape and Cloud based Backup and Disaster Recovery

Sindhura Yarrapothu

This thesis is submitted to the Faculty of Computing at Blekinge Institute of Technology in partial fulfillment of the requirements for the degree of Master of Science in Electrical Engineering. The thesis is equivalent to 20 weeks of full time studies.

Contact Information: Author(s): Sindhura Yarrapothu E-mail: [email protected]

External advisor: Sujoy Chatterjee Email: [email protected]

University advisor: Adrian Popescu Department of Communication Systems

Faculty of Computing Blekinge Institute of Technology SE-371 79 Karlskrona, Sweden

Internet : www.bth.se Phone : +46 455 38 50 00 Fax : +46 455 38 50 57

ABSTRACT

Context: Backup and Disaster Recovery (DR) play a vital role in day-to-day IT operations and define major aspects of an enterprise's business continuity plan. There is a continuous need to improve backup and recovery performance with respect to attributes such as backup window size, high availability and security. Enterprises rely on definitive information when moving from traditional methods towards newer technologies, which are an intrinsic part of routine business operations.

Objectives: In this study, we investigate Backup and DR plans at the enterprise level. They are compared in terms of performance metrics such as Recovery Time Objective, Recovery Point Objective, time taken to back up, time taken to recover and Total Cost of Ownership. We also examine how CPU and memory utilization differ between tape-based and cloud-based Backup and DR.

Methods: A literature study was the first step, used to formulate the research questions by understanding present technologies in Backup and DR. This led us to conduct a survey to better understand the challenges faced in industry and to gain more practical exposure. A case study was conducted in an enterprise to capture accurate values. An experiment was then deployed to compare the performance of both scenarios and to analyze which methodology improves Backup and DR performance by overcoming these challenges.

Results: The results of this thesis cover performance-related metrics as well as load in terms of CPU and memory utilization. Survey results were examined to gain a better understanding of current technologies and challenges with Backup and DR in enterprises. During experimentation in the enterprise environment considered, cloud-based backup proved to be better in terms of RPO, RTO, CPU and memory utilization, and Total Cost of Ownership.

Conclusions: Numerous research works have examined how Backup and DR plans can be improved. However, they lack accurate information on how performance varies and on which parameters can be improved by shifting towards contemporary methodologies that address features such as scalability, flexibility and adaptability. This study provides that information.

Keywords: Backup, Disaster Recovery, Performance, and Storage

ACKNOWLEDGEMENTS

My deep gratitude goes to my thesis supervisor, Prof. Adrian Popescu, for his timely guidance and valuable inputs. He enthusiastically believed in and supported our work from the moment he learned of it and helped me greatly in completing this thesis. I sincerely extend my thanks to Prof. Kurt Tutschku for his immense support and valuable efforts during this course. I would like to thank Siva Kumar Surampudi, Hardeep Garewal and Prasad Natu for inspiring us, for providing the permissions required to conduct this thesis and for their constant support throughout our work. I am particularly indebted to Sujoy Chatterjee for his searing feedback, which motivated us at every step of our thesis. He continually boosted our faith in this work and in improving it. I also thank him for letting me acquire noteworthy knowledge through this industrial experience. I would like to thank my colleague, Akash Kaveti, for his constant encouragement in learning and understanding concepts better during this work. He also bore a huge responsibility in bringing this project to a successful end. It is also my duty to record my thankfulness to Gan Sharma, Bhujay Kumar Bhatta, Karthik Sethuraman Krishna, Mohan Kumar, Emmanuel Anthony, Amit Kumar Pandey, Pushpalatha Kuruba, Jagadish Kb, Latheesh Devairakkam, Panna Lal Shaw, Anoop Kodakara and Sanket Srivastav for providing me with the required resources. These critical readers invested hours in reading our drafts and helping us improve them. I am indebted to my family members and friends for being my deepest, enduring support and for standing by my side at all times. I would like to thank all 45 participants in our survey, who took a keen interest in answering our queries and offered their best wishes.

CONTENTS

ABSTRACT
ACKNOWLEDGEMENTS
CONTENTS
LIST OF FIGURES
LIST OF TABLES
ABBREVIATIONS

1 INTRODUCTION
    1.1 AIM AND OBJECTIVES
    1.2 RESEARCH QUESTIONS
    1.3 THESIS OUTLINE

2 BACKGROUND
    2.1 BACKUP METHODS:
        2.1.1 Full Backup
        2.1.2 Incremental Backup
        2.1.3 Differential Backup
        2.1.4 Full Backup + Incremental Backup
        2.1.5 Full Backup + Differential Backup
        2.1.6 Incremental forever Backup
        2.1.7 Synthetic Backup
    2.2 STORAGE SYSTEM TECHNOLOGIES
        2.2.1 Block Level Storage
        2.2.2 File Level Storage
    2.3 PERFORMANCE METRICS
        2.3.1 Recovery Point Objective and Recovery Time Objective
        2.3.2 Age of Backup
        2.3.3 Time taken to backup
        2.3.4 Time taken to recover
        2.3.5 Total Cost of Ownership
    2.4 BACKUP OPERATION
    2.5 RECOVERY OPERATION
    2.6 THE NEED FOR CHANGE IN BACKUP AND DR STRATEGIES:

3 EXPERIMENT METHODOLOGY
    3.1 EXPERIMENTAL ENVIRONMENT
    3.2 EXPERIMENTAL SETUP FOR TAPE-BASED BACKUP AND DR
    3.3 EXPERIMENTAL SETUP FOR CLOUD-BASED BACKUP AND DR

4 RESULTS
    4.1 SURVEY
    4.2 BACKUP WINDOW SIZE
    4.3 AGE OF BACKUP
    4.4 BACKUP OPERATION
        4.4.1 Full Backup Operation
        4.4.2 Incremental Backup
    4.5 RECOVERY OPERATION
        4.5.1 VM Level Recovery
        4.5.2 File Level Recovery
    4.6 CPU UTILIZATIONS
    4.7 MEMORY UTILIZATIONS
    4.8 RPO AND RTO
    4.9 TOTAL COST OF OWNERSHIP

5 ANALYSIS
    5.1 RESEARCH QUESTIONS
    5.2 LIMITATIONS

6 CONCLUSION AND FUTURE WORK
    6.1 FUTURE WORK

REFERENCES

APPENDIX A

APPENDIX B

LIST OF FIGURES

Figure 1, Full back up [16]
Figure 2, Full + Incremental Backup [16]
Figure 3, Full + Differential Backup [16]
Figure 4, Incremental forever Backup [16]
Figure 5, Synthetic Backup [16]
Figure 6, Backup Operation [24]
Figure 7, Recovery Operation [24]
Figure 8, Block diagram for LAN-based Backup and DR [24]
Figure 9, Backup and DR with agents installed [30]
Figure 10, Asigra with agentless Backup and DR [30]
Figure 11, Full back up comparison of tape-based and cloud-based backup and DR
Figure 12, Incremental backup comparison of tape-based and cloud-based backup and DR
Figure 13, Recovery on VMs on tape-based and cloud-based backup and DR
Figure 14, Recovery on files in tape-based and cloud-based backup and DR
Figure 15, Graphical representation of CPU utilization
Figure 16, Graphical representation of memory utilization
Figure 17, RPO and RTO when disaster occurs [32]

LIST OF TABLES

Table 1, System configuration for experiment 1
Table 2, Challenges faced with multiple devices
Table 3, Backup window of tape-based Backup and DR
Table 4, Backup cycle of tape-based Backup and DR
Table 5, RPO and RTO
Table 6, Tape-based Backup and DR TCO

ABBREVIATIONS

AOB     Age Of Backup
CPU     Central Processing Unit
DR      Disaster Recovery
DS      Data System
IT      Information Technology
LAN     Local Area Network
LTO     Linear Tape Open
NAS     Network-attached Storage
OS      Operating System
RPO     Recovery Point Objective
RTO     Recovery Time Objective
SaaS    Software as a Service
SAN     Storage Area Network
TCO     Total Cost of Ownership
VM      Virtual Machine
WAN     Wide Area Network

1 INTRODUCTION

Enterprises of all sizes need to safeguard their data on a regular basis, which makes Backup and DR procedures indispensable. The backup system is the fallback when data is inadvertently lost or corrupted on the original system. It protects data by maintaining a copy of the entire data set, which also helps the enterprise comply with regulatory laws[1]. Backing up data repeatedly pays off in the long run: when a calamity occurs, data is not permanently lost and can be made available again after the disaster. Disaster recovery involves a set of policies and procedures that preserve business continuity in case of a disaster. A disaster may be a natural hazard, or it may result from human error or machine failure[2].

With changing technologies, there is a growing need for robust, low-cost methods that limit the burden on enterprises when restoring lost data from backup copies. With advances in applications and the presence of multiple devices, there is pressure to develop Backup and DR plans with better data integrity, the capability to handle a large number of devices simultaneously and the capability to recover data efficiently[3]. An enterprise without a scheduled backup operation may suffer a great financial impact from the cost of recreating lost data, delays, possible legal action and lower productivity, which may slowly lead to the collapse of the business.[4]

The Backup and DR of data follow various methods according to the demands of the enterprise, such as:

• Full backup, where data is backed up in full and recovered from that single copy.

• Incremental backup, where only data changed or added since the last full or incremental backup is backed up. Recovery requires the last full backup and every incremental backup performed since the date of that full backup[5].

• Differential backup, where all data changed or added since the last full backup is backed up, so each differential backup also contains the changes captured by the previous one. This makes recovery easier, as it requires only the last full backup and the last differential backup copies.

Using cloud storage for backups can be more economical for enterprises than traditional media such as disks and tapes, whose handling and transportation prove to be arduous tasks[6]. However, the move to cloud storage introduces challenges for factors on which business success depends, such as availability[7] and security[8].

In the case of tapes and disks, data is backed up on removable media that are mounted and removed manually to perform backup or recovery. Different parts of the world have various guidelines for erasing data stored on these media[9]. The media are usually kept in an offsite storage location away from the main site, which helps the company protect its data and retrieve it later without difficulty[10]. The offsite storage is chosen at a remote site to protect data in case of a disaster.

In the past, companies stored data in the gigabyte range; this has grown to terabytes and petabytes in recent times[11]. Such large volumes of data must be highly available and must be handled at faster rates. Protecting enterprise data and avoiding data loss while storing it are of immense significance. To cope with this burgeoning data within an enterprise, a potent approach to Backup and DR is essential, using media that are less prone to damage.

Data Backup and DR plans depend on the company's requirements. These plans may vary with the scale of operation and the amount of data to be backed up[5]. We have conducted a study on how existing methods in an enterprise can be improved, based on performance metrics such as:

• Recovery Time Objective (RTO) and Recovery Point Objective (RPO), which define the threshold limits of a system in terms of the time needed to restore an application and the allowable amount of data loss, respectively.

• Age of Backup (AOB).

• Time taken to back up and time taken to recover, which measure how quickly data is backed up and restored.

• Total Cost of Ownership (TCO), which is measured in terms of the cost of infrastructure, operations and maintenance.
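The relationship between the RPO and RTO thresholds and a concrete incident can be illustrated with a short Python sketch. It is not taken from the thesis: the function names and all timestamps are hypothetical, and it assumes that data loss is measured as the time between the last completed backup and the disaster, and downtime as the time between the disaster and the restoration of service.

```python
from datetime import datetime, timedelta


def data_loss_window(last_backup: datetime, disaster: datetime) -> timedelta:
    """Time span of work that is lost: everything since the last backup."""
    return disaster - last_backup


def downtime(disaster: datetime, service_restored: datetime) -> timedelta:
    """Time the application was unavailable after the disaster."""
    return service_restored - disaster


def meets_objectives(last_backup, disaster, service_restored, rpo, rto):
    """Compare one incident against the RPO and RTO thresholds."""
    return {
        "rpo_met": data_loss_window(last_backup, disaster) <= rpo,
        "rto_met": downtime(disaster, service_restored) <= rto,
    }


# Hypothetical incident: nightly backup at 22:00, disaster at 09:30 the
# next morning, service restored at 13:00.
last_backup = datetime(2024, 1, 1, 22, 0)
disaster = datetime(2024, 1, 2, 9, 30)
restored = datetime(2024, 1, 2, 13, 0)

result = meets_objectives(
    last_backup, disaster, restored,
    rpo=timedelta(hours=24), rto=timedelta(hours=4),
)
print(result)  # {'rpo_met': True, 'rto_met': True}
```

In this example 11.5 hours of data are lost and the service is down for 3.5 hours, so both thresholds hold. Tightening the RPO to 8 hours would turn the first check to False.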

1.1 Aim and Objectives

The main aim of this study is to evaluate the performance of a traditional Backup and DR system in an enterprise. Data collected from a mid-tier IT company using tape-based Backup and DR is compared with data from a cloud-based Backup and DR deployed in the same environment. The objectives of this study are to:

• Identify and understand the effects of various challenges in enterprise Backup and DR systems by conducting a literature study and a survey.

• Learn the tape-based and cloud-based Backup and DR procedures used to back up data regularly and recover it after a disaster.

• Record, through experimentation, performance values for files of different sizes in file-level and VM-level backups.

• Analyze performance based on metrics such as CPU utilization, memory utilization, RPO, RTO, AOB, time taken to back up, time taken to recover and Total Cost of Ownership.

• Compare the evaluated metrics of both methodologies.

1.2 Research questions

The research questions for this thesis are as follows:

RQ1. What are the challenges faced by enterprises with multiple devices coming into IT estates and backing them up?

RQ2. What is the CPU utilization and memory utilization for tape-based and cloud-based backup methods?

RQ3. How can one evaluate the performance of tape-based and cloud-based Backup and DR in terms of RPO, RTO, AOB, time taken to back up, time taken to recover and Total Cost of Ownership?

1.3 Thesis outline

Chapter 1 gives an overview of this thesis, with a brief introduction to data Backup and DR in enterprises. It states the aim of the thesis, its objectives and the research questions.

Chapter 2 covers the background of this thesis and describes previous related work. Many research papers, journals and company whitepapers were consulted, and personal interviews were conducted with experts in this field to understand the current course of Backup and DR.

Chapter 3 describes in detail the methodology used in this thesis. First, a literature study was conducted, followed by an online survey to understand the challenges faced in industry. Next, a case study was conducted on a tape-based Backup and DR system in an enterprise under the supervision of highly experienced employees of the IT company. A cloud-based Backup and DR plan was then deployed to measure its performance. Finally, the values obtained from the case study and the experiments were compared.

Chapter 4 presents detailed results of the experimentation conducted on both tape-based and cloud-based Backup and DR plans. It compares CPU and memory utilization values, RTO, RPO, Total Cost of Ownership, time taken to back up and time taken to recover.

Chapter 5 analyzes the results obtained and how they differ from prior research works. It also explains why this method was chosen for the thesis.

Chapter 6 presents the conclusions and the scope for future work. It shows explicitly how the research questions formulated at the beginning are answered through the research methodologies and the results obtained from the experiments.

2 BACKGROUND

In the past, data was backed up with no expectation of immediate availability or recovery in case of a disaster. The only method available was backup to tapes, which were locked away with minimal attention to security. In hospitals and banks, records were kept on paper before tapes came into use, and bundles of paper were locked away; these are currently being digitized and stored online so that they are available for future reference.[12]

However, changing trends imply changing enterprise requirements. With the development of remote offices and the global expansion of organizations, data is required in various locations at the same time with minimal or no downtime. When an organization operates globally and supports different operating systems in different time zones, it is crucial to take the timeframe into account when scheduling backups. Backup data is stored at different offsite locations to prevent a regional disaster from affecting it.

Innumerable research works have been conducted in the field of Backup and DR to help enterprises understand the significance of upgrading to advanced technologies.

The major influences on backup are the explosion in data growth[13], compliance with regulatory laws, undesirable maintenance, downtime windows, and the demand for high availability and a quick data processing rate during backup.[14]

Susan Snedaker et al. [5] set out a checklist for the recovery operation when a disaster strikes an enterprise: assess the problem, escalate it, declare the disaster, activate and implement the plan in the DR phase, implement the Business Continuity phase, and bring the business back to an up-and-running state. The checklist stresses the importance of learning from an event by reviewing it and revising plans if needed. Understanding the significance of every phase must be a high priority, as a minute mistake can lead to a huge loss.

Ann L. Chervenak et al., in their study of backup techniques from different perspectives, discussed the properties that enterprises look for. Their results showed the prominence of the incremental-only scheme when handling enormous amounts of data.[34]

Industries have been globalizing for some years now. Abiding by the laws in all parts of the world plays a major role and, for security reasons, restrains enterprises from shifting towards cloud services.[35]

Cloud-based Backup and DR has the benefit of a lower total cost of ownership, but enterprises have questions about the security it provides, which are being addressed[36][7][8].

2.1 Backup Methods:

The different backup methods, their advantages and disadvantages are described in this section.

2.1.1 Full Backup

In the full backup process, all selected files and folders are backed up, as shown in figure 1. This is generally used as the initial backup in every enterprise, followed by incremental or differential backups. The full backup is repeated after a set of incremental or differential backups so that the full backup copy reflects the changes made since the previous full backup. A few organizations perform only full backups when the backup data requires little storage space.

➢ Advantages

• Easy to manage, as the entire list of files and folders is in one backup set.
• Easy to maintain.

➢ Disadvantages

• Backups can take a very long time, as each file is backed up again every time a full backup is run.
• Consumes massive storage space.
• The exact same files are stored repeatedly, resulting in inefficient use of storage.

Figure 1, Full back up [16]

2.1.2 Incremental Backup

In an incremental backup, all data changed or added since the last full backup or the last incremental backup is backed up. This is generally used after a full backup to save time by backing up only the modifications made each day since the full backup.[17]

➢ Advantages

• Much faster backups
• Efficient use of storage space, as files are not duplicated

➢ Disadvantages

• Restores are slower
• Restores are complicated. The full backup and all consecutive incremental backups are needed to perform a restore.
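How an incremental backup decides what to copy can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the mechanism of any particular backup product: the function name, the file names and the numeric timestamps are hypothetical, and change detection is assumed to rely on a per-file modification time.

```python
def select_incremental(mtimes: dict[str, float], last_backup_time: float) -> list[str]:
    """Return the files modified after the most recent backup.

    The reference point is the last backup of any kind, full or
    incremental, so each run only picks up what the previous run missed.
    """
    return sorted(path for path, mtime in mtimes.items() if mtime > last_backup_time)


# Hypothetical modification times, expressed in days since the full backup.
mtimes = {"report.doc": 1.0, "db.dump": 3.5, "notes.txt": 4.2}

# Incremental run after a backup taken at day 3: two files qualify.
print(select_incremental(mtimes, last_backup_time=3.0))  # ['db.dump', 'notes.txt']

# The next run uses day 4 as its reference, so only one file remains.
print(select_incremental(mtimes, last_backup_time=4.0))  # ['notes.txt']
```

Because the reference point moves forward after every run, no file is copied twice, which is the storage advantage listed above. The cost is that a restore has to replay every run since the full backup.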

2.1.3 Differential Backup

In a differential backup, all data changed or added since the last full backup is backed up. Each differential backup therefore carries forward the contents of the previous day's differential backup, which makes the restoration process faster.[18]

➢ Advantages

• Much faster backups
• Faster restores

➢ Disadvantages

• Not an efficient use of storage space. Files captured in one differential backup are duplicated in every following differential backup.
• Restores are slightly complicated. The full backup and last differential backup copies are needed to perform a restore.
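The storage trade-off between the three methods can be made concrete with a rough Python estimate. The figures below are hypothetical and not measurements from this study. The sketch assumes that the total data size stays constant, that each day changes a distinct set of data, and that a differential backup holds everything changed since the last full backup.

```python
from itertools import accumulate


def storage_full_daily(full_size: float, daily_changes: list[float]) -> float:
    """A full backup on day 1 and on every following day."""
    return full_size * (1 + len(daily_changes))


def storage_incremental(full_size: float, daily_changes: list[float]) -> float:
    """One full backup plus each day's own changes."""
    return full_size + sum(daily_changes)


def storage_differential(full_size: float, daily_changes: list[float]) -> float:
    """One full backup plus, each day, all changes since that full backup."""
    return full_size + sum(accumulate(daily_changes))


# Hypothetical sizes in GB: a 100 GB full backup, then 5, 3 and 8 GB of
# changes on the three following days.
full_size = 100
daily_changes = [5, 3, 8]

print(storage_full_daily(full_size, daily_changes))    # 400
print(storage_incremental(full_size, daily_changes))   # 116
print(storage_differential(full_size, daily_changes))  # 129
```

Under these assumptions the differential scheme sits between the other two: it stores the day-2 changes three times and the day-3 changes twice, but still far less than repeated full backups.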

2.1.4 Full Backup + Incremental Backup In combinational backup process, all files and folders selected are backed up during the

initial full backup and updates are backed up in the subsequent incremental backups as shown in figure 2. Only changes are backed up while performing incremental backups. The full backup is repeated after a set of incremental backups. Last full backup and all incremental backups from the time of last full backup copies are required to restore any data [19].

As shown in figure 2, the backup operation on day 1 runs as usual, and new data becomes available on day 2, which is backed up through an incremental backup. This procedure continues until day 5, on which a disaster occurs. The day 1 full backup and the day 2, day 3 and day 4 incremental backup sets of tapes are required to restore the data lost on day 5.
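The restore rule for this scheme can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `incremental_restore_set` and the dictionary layout (day number mapped to backup type) are assumptions made for the example and are not part of any backup product.

```python
def incremental_restore_set(backups, disaster_day):
    """Return the backups needed to restore data after a disaster.

    `backups` maps a day number to "full" or "incremental" (assumed layout).
    """
    # Only backups taken before the disaster day can be used.
    usable = sorted(day for day in backups if day < disaster_day)
    # Locate the most recent full backup.
    full_days = [day for day in usable if backups[day] == "full"]
    if not full_days:
        raise ValueError("no full backup available before the disaster")
    last_full = full_days[-1]
    # The last full backup plus every incremental backup taken after it.
    return [(day, backups[day]) for day in usable if day >= last_full]


# Scenario from the text: full backup on day 1, incrementals on days 2 to 4,
# disaster on day 5.
schedule = {1: "full", 2: "incremental", 3: "incremental", 4: "incremental"}
restore_set = incremental_restore_set(schedule, disaster_day=5)
print(restore_set)
```

For the five-day scenario described above, the function returns all four backup sets, which is why a restore under this scheme needs every tape set since the last full backup.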

Figure 2, Full + Incremental Backup [16]

2.1.5 Full Backup + Differential Backup

In this combined backup process, all selected files and folders are backed up during the initial full backup, and updates are backed up in the subsequent differential backups as shown in figure 3. The changes backed up in the first differential backup are carried forward into each subsequent differential backup. The full backup is repeated after a set of differential backups to update the data. The last full backup and the last differential backup sets of tapes are required to restore data.

As shown in figure 3, the backup operation on day 1 runs as usual, and new data becomes available on day 2, which is backed up through a differential backup. This procedure continues until day 5, on which a disaster occurs. To restore the data lost on day 5, only the day 1 tapes holding the full backup copy and the day 4 tapes holding the differential backup copy are required.
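A minimal Python sketch of the corresponding restore rule is given below. The function name `differential_restore_set` and the day-to-type dictionary are hypothetical and chosen only for illustration; the point is that at most two backup sets are selected regardless of how many differential backups exist.

```python
def differential_restore_set(backups, disaster_day):
    """Return the backups needed to restore under full + differential.

    `backups` maps a day number to "full" or "differential" (assumed layout).
    """
    # Only backups taken before the disaster day can be used.
    usable = sorted(day for day in backups if day < disaster_day)
    full_days = [day for day in usable if backups[day] == "full"]
    if not full_days:
        raise ValueError("no full backup available before the disaster")
    last_full = full_days[-1]
    # Differential backups taken after the last full backup.
    later_diffs = [
        day for day in usable
        if day > last_full and backups[day] == "differential"
    ]
    restore = [(last_full, "full")]
    if later_diffs:
        # Only the most recent differential is needed, since it already
        # contains every change made since the full backup.
        restore.append((later_diffs[-1], "differential"))
    return restore


# Scenario from the text: full backup on day 1, differentials on days 2 to 4,
# disaster on day 5.
schedule = {1: "full", 2: "differential", 3: "differential", 4: "differential"}
restore_set = differential_restore_set(schedule, disaster_day=5)
print(restore_set)
```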

Figure 3, Full + Differential Backup[16]

2.1.6 Incremental forever Backup

In the incremental forever backup process, all changes that take place after the first full backup copy are backed up as shown in figure 4. An initial full backup is taken once, and sequential incremental backups are then conducted forever [20].

Ø Advantages
• Reduces length of backup window
• No need for scheduling the full backup
• Reduces load on network
• Availability of data
• Transparency in restoration procedure, automated restoration

Ø Disadvantages
• Restores require the full backup and all incremental backup copies, which makes them complex

Figure 4, Incremental forever Backup[16]

2.1.7 Synthetic Backup

In the synthetic backup process, an initial full backup is conducted and then incremental backups are performed consecutively as shown in figure 5. As the name suggests, this backup is not created from the original data. Instead, the full backup is merged with the following incremental backups to form a new synthetic full backup. The incremental backups consist of updates only [21].

Ø Advantages
• Consumes less time to perform backup
• Costs are reduced
• Lower restore time

Ø Disadvantages
• It cannot be the first backup performed on a system
• It cannot be the backup immediately following a full backup

Figure 5, Synthetic Backup[16]
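The merge step behind a synthetic full backup can be illustrated with a small Python sketch. It assumes a simplified model in which a backup is a dictionary from file path to file content and a value of `None` in an incremental marks a deleted file; both conventions, and the name `synthesize_full`, are assumptions for this example rather than the format of any particular product.

```python
def synthesize_full(full, incrementals):
    """Merge a full backup with later incrementals into a synthetic full.

    Backups are modelled as dicts of path -> content (assumed model).
    A content of None in an incremental marks a deleted file (assumed).
    """
    synthetic = dict(full)  # start from a copy of the last full backup
    for incremental in incrementals:  # apply incrementals, oldest first
        for path, content in incremental.items():
            if content is None:
                synthetic.pop(path, None)  # file was deleted
            else:
                synthetic[path] = content  # file was added or changed
    return synthetic


full_backup = {"a.txt": "v1", "b.txt": "v1"}
incremental_backups = [
    {"a.txt": "v2"},                 # day 2: a.txt changed
    {"c.txt": "v1", "b.txt": None},  # day 3: c.txt added, b.txt deleted
]
synthetic_full = synthesize_full(full_backup, incremental_backups)
print(synthetic_full)
```

The new full copy is assembled entirely from existing backup copies, so the production data does not need to be read again.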

2.2 Storage system technologies

The two most important storage system technologies are file level storage and block level storage.

2.2.1 Block Level Storage

Block level storage is the process of creating raw blocks of storage, which can be handled individually. This system is commonly used in a SAN. The storage is presented to the server as a hard disk, and the blocks are controlled by the server-based OS [15].

Ø Advantages of Block level storage systems
• Offers better speed
• Popular with SAN users
• Reliable
• Transport system is efficient

2.2.2 File Level Storage

In file level storage, an entire file is backed up again even if there is only a slight modification to it. This is the most common storage system, found in hard drives and NAS systems, and it is used to share files with users. These systems are configured with common file level protocols such as SMB/CIFS (Windows) and NFS (Linux). This method is cost effective when compared to block level storage. A hybrid model using both levels of storage can also be used, based on the requirements of the enterprise [15].

Ø Advantages of File Level Storage System
• Simple usage
• Comparatively cheaper
• Transparency in storing and accessing files
• Popular with NAS users
• Bulk access to files

2.3 Performance Metrics

2.3.1 Recovery Point Objective and Recovery Time Objective

When a disaster occurs, the recovery point objective (RPO) is the maximum allowable time period prior to the disaster during which data may be lost. The recovery time objective (RTO) is the time period between the occurrence of the disaster and the time when the application, server or business activities are up and running again in their established order [15][22].

A backup window is the time allocated by an enterprise during which the backup can be performed without any heavy load on the network. Let us assume a backup window of 12 hours, starting at 22:30 on day 1 and ending at 10:30 on day 2. A disaster occurs at 16:00 on day 2, a regular working day in the enterprise. According to the enterprise DR plan, the time required for the business to be back online is 6 hours. Hence, the recovery time objective is 6 hours. The data that was changed, added or deleted from 10:30 to 16:00 on day 2 will be lost when the disaster occurs. The acceptable period of data loss that the enterprise can handle without huge damage to its users is 12 hours. Hence, the recovery point objective depends on how fresh the backed-up data is and on the acceptable limit of data loss defined by the enterprise in its DR plan.
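The arithmetic of this example can be checked with a short Python sketch. It assumes that data loss is measured from the end of the backup window (10:30 on day 2); the calendar date used below is an arbitrary placeholder, and the function name `data_loss_window` is hypothetical.

```python
from datetime import datetime, timedelta

# Arbitrary placeholder dates: "day 1" = 1 Jan, "day 2" = 2 Jan.
backup_start = datetime(2024, 1, 1, 22, 30)
backup_end = backup_start + timedelta(hours=12)  # 10:30 on day 2
disaster_time = datetime(2024, 1, 2, 16, 0)

rpo = timedelta(hours=12)  # acceptable data loss from the DR plan
rto = timedelta(hours=6)   # time allowed to get back online


def data_loss_window(last_backup_end, disaster):
    """Time span of changes not covered by the last completed backup."""
    return disaster - last_backup_end


loss = data_loss_window(backup_end, disaster_time)
within_rpo = loss <= rpo
back_online_by = disaster_time + rto

print("data loss window:", loss)
print("within RPO:", within_rpo)
print("must be back online by:", back_online_by)
```

Under these assumptions the loss window is 5 hours 30 minutes, which lies within the 12 hour limit, and the business must be back online by 22:00 on day 2 to meet the RTO.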

2.3.2 Age of Backup

The age of backup refers to the time period for which a backup copy is retained, and it is measured in days or hours [17].

2.3.3 Time taken to backup

The time taken for a backup to be successfully executed and stored.

2.3.4 Time taken to recover

The time taken by the server to recover lost data. This depends on factors such as network speed, size of data, type of network, speed of the media, etc. [23]

2.3.5 Total Cost of Ownership

The total cost of ownership is the financial estimate of the Backup and DR plan in an enterprise. It includes both capital and operational expenses. Capital costs refer to the initial costs of setting up the Backup and DR infrastructure, whereas operational costs refer to maintaining equipment and procurements.
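As a simple illustration of this definition, TCO over a planning period can be written as the one-time capital cost plus the sum of the yearly operational costs. The sketch below uses unitless placeholder figures that are purely hypothetical and are not taken from the experiment; the function name is likewise an assumption.

```python
def total_cost_of_ownership(capital_cost, yearly_operational_costs):
    """TCO = one-time capital cost + sum of operational costs per year."""
    return capital_cost + sum(yearly_operational_costs)


# Hypothetical, unitless placeholder figures for a three-year period.
example_capital_cost = 100
example_yearly_operational_costs = [20, 25, 30]

example_tco = total_cost_of_ownership(
    example_capital_cost, example_yearly_operational_costs
)
print(example_tco)
```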

2.4 Backup Operation

Figure 6 shows the step-by-step procedure for performing a backup, in the numbered order.

Figure 6, Backup Operation[24]

2.5 Recovery Operation

Figure 7 shows the step-by-step procedure for performing a recovery, in the numbered order.

Figure 7, Recovery Operation[24]

2.6 The need for change in Backup and DR strategies

With any backup, it is important to understand the requirements of an enterprise and to design a methodology that meets them. These requirements carry information that is essential for an accurate choice of Backup and DR plan, such as the following [25].

• What recovery time is being promised, and how much data loss can the enterprise tolerate?

• What backup types are necessary, and can they support the applications used by the enterprise?

• Based on what policy are backup media stored at offsite storage? This may cause trouble if incremental backups are performed on a daily basis and the offsite location is far away, as all the media must be shipped for a recovery procedure.

• How expensive is the Backup and DR plan? Does it fulfill the needs of the enterprise?

As technologies change, trends change with them, and the need for a better system is always present. A compromise may be made on one characteristic if it greatly benefits another. It is difficult to say which Backup and DR plan best suits an enterprise without knowing its complete requirements and their importance [26].

As scholars try to improve one or more features every day, they may find new needs emerging as requirements from enterprises. It is therefore always important to understand what the current trends are and how they can improve Backup and DR technologies.

3 EXPERIMENT METHODOLOGY

In this chapter, a detailed description of the experimentation is given. This thesis can help enterprises understand the challenges faced while performing backup and recovery operations, and allow them to decide how changes can be implemented to improve their current systems by enhancing key performance indicators such as RPO, RTO and TCO. The experiments were conducted in a mid-tier IT company to obtain absolute values.

The primary aim was to understand the existing Backup and DR operations in enterprises. To this end, a literature study was conducted, as part of which several research papers were consulted, and personal interviews were conducted with experts in Backup and DR operations using tapes, disks and cloud as media [27].

In the next phase, we prepared a questionnaire keeping in mind the major factors affecting enterprises while performing backup and recovery procedures. This questionnaire was sent out as an online survey to proficient members in the field of Backup and DR globally. Telephone discussions were held with several organizations, such as banks, hospitals, clothing retailers and marketing firms, to understand the challenges they face while backing up their data on a daily basis. A case study in which performance metrics were noted, together with an experiment conducted in the same environment, helps in understanding which factors are better in each of the two Backup and DR methodologies [28][29].

3.1 Experimental environment

The mid-tier IT company has approximately 6000-7000 employees. The data center is located on premises, where the current Backup and DR system uses tape media to back up the data and store it offsite, around 100 km from the main site of operation. The experimental setup consists of four backup servers with 2 GB RAM and 2 cores, 8 GB RAM and 4 cores, 6 GB RAM and 4 cores, and 16 GB RAM and 4 cores respectively. The experiment is conducted on a Windows Server 2008 R2 Standard machine connected to one of the backup servers, with an Intel (R) Xeon (R) processor, 6 GB RAM and two dual core processors. The CPU and memory utilization values are noted for 4 hours 30 minutes while the backup is carried out. The time taken to execute backup and recovery operations is noted. The RPO and RTO are also noted for comparison with the Backup and DR plan in the cloud.

Figure 8 shows the block diagram of the LAN based Backup and DR system using tape media. Individual backup clients are installed on all machines that need to be backed up. All these machines are connected to the media servers holding the enterprise data. These are in turn connected to the backup server and backup media to perform backup and recovery. Symantec Backup Exec 2010 [43] and Asigra backup software are installed on the backup server.

3.2 Experimental setup for Tape-based Backup and DR

Backup and DR on tape media is carried out at daily, weekly, monthly and yearly intervals. In this LAN based Backup and DR, there are four backup servers and four tape libraries, with each tape library having two tape drives. One of these backup servers and one tape library were chosen to conduct the experiment. A tape library is a storage device containing one or more tape drives and slots to hold tape cartridges.

To carry out this operation, a management console is used, where jobs can be created and monitored for both backup and recovery operations. When a job fails to create a backup copy or to restore data, it must be run again manually to complete it successfully. Such failures may have several causes, such as corruption of data or exceeding data limits in a file.

The backup carried out is a full backup on the first day, followed by incremental backups, and the full backup is repeated on day 7. This process is repeated every week. The full backup copy of the fourth (last) week in a month is considered the monthly backup copy. The full backup copy of the twelfth month is considered the yearly backup copy. The tapes currently in use are LTO-4 and LTO-6. Linear Tape-Open (LTO) is a magnetic tape data storage technology. The LTO-4 tape format was released in 2007 and can hold 1.6 TB compressed (800 GB uncompressed). LTO-4 was believed to provide better performance than its predecessors. LTO-6, which succeeded LTO-5, was released in December 2012 with a capacity of 6.25 TB compressed (2.5 TB uncompressed) and compressed data transfer rates of up to 400 MB/s.

The enterprise requires dedicated IT staff to manage its backup and recovery operations without failures. One engineer may be required to make sure that, when a backup operation fails, the affected jobs are run again manually until they succeed. Another engineer may be required to make sure that all recovery requests received from the departments are answered and that the data is recovered successfully. Tapes must be inserted manually into the tape libraries every morning for a backup to take place, so constant manpower and effort are involved in this scenario. The temperature and electrical current consumption of the datacenter where the tape libraries are located must also be monitored.

Tapes are transported to offsite storage on a weekly basis, which requires an engineer to travel with the tapes to the offsite storage, 100 km from the main site of operation. The tapes are hardware encrypted. They require security checks at both the main site and the offsite storage to ensure that they have been stored successfully and can be retrieved in case of a disaster. The tapes are barcoded and stored for easier identification when they are retrieved for recovery. This transportation takes many hours, as the location of the offsite storage is congested with traffic at all times. The enterprise conducts recovery drills once every month as a precautionary measure.

Figure 8, Block diagram for LAN-based Backup and DR[24]

3.3 Experimental setup for Cloud-based Backup and DR

A cloud-based Backup and DR system is set up for comparison with the values obtained from the experiment on the tape solution. The cloud Backup and DR solution chosen is Asigra. It is a cloud-based Backup and DR solution that can be integrated with all cloud storages and requires little implementation on the existing infrastructure.

It is a software platform consisting of two components, namely DS-Client and DS-System. DS-Client is installed on the customer LAN and is used to back up all data on the LAN. It can run on a dedicated or an existing machine, which can be either physical or virtual. DS-System is the location of data storage provided by the cloud provider. It can provide single-tenancy and multi-tenancy options to one or more DS-Clients. It is installed in a secure offsite hosting facility. Asigra is not a service provider itself but rather teams up with service providers.

It can handle private, public or even hybrid cloud environments. Deploying this software alongside an existing backup solution is easy, as it only requires DS-Client to be installed on one of the machines to back up all the data at the customer end. It does not require an agent to be installed on every device [30]. The experiment was conducted on one of the backup servers in the enterprise. With the help of a console on the desktop, backup or recovery options can be chosen and the location where data is to be stored or retrieved from can be specified, allowing the user to access the data.

This solution uses a combination of full and incremental-forever backups while performing the backup operation, which provides bandwidth savings. The incremental-forever backup reduces the backup window size, and only the changed blocks of data are backed up in the incremental copies, so that a minimal amount of data passes across the network.
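The idea of sending only changed blocks can be illustrated with a generic Python sketch. This is not Asigra's implementation, whose internals are not described here; it is an assumed, simplified scheme that splits data into fixed-size blocks, hashes each block with SHA-256 and keeps only the blocks whose hash differs from the previous backup. The block size and function names are illustrative choices.

```python
import hashlib

BLOCK_SIZE = 4  # tiny block size for demonstration only


def block_hashes(data, block_size=BLOCK_SIZE):
    """Hash each fixed-size block of the data."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def changed_blocks(old, new, block_size=BLOCK_SIZE):
    """Return {block index: block bytes} for blocks that are new or changed."""
    old_hashes = block_hashes(old, block_size)
    new_hashes = block_hashes(new, block_size)
    changed = {}
    for index, digest in enumerate(new_hashes):
        # A block is sent if it did not exist before or its hash differs.
        if index >= len(old_hashes) or old_hashes[index] != digest:
            start = index * block_size
            changed[index] = new[start:start + block_size]
    return changed


previous_version = b"AAAABBBBCCCC"
current_version = b"AAAAXXXXCCCCDD"
delta = changed_blocks(previous_version, current_version)
print(delta)
```

In this example only the second block (modified) and the fourth block (newly appended) would cross the network, while the two unchanged blocks are skipped.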

A regular Backup and DR methodology in an enterprise, with agents installed on all devices belonging to the enterprise, is shown in figure 9. It is a typical scenario comprising all laptops, desktops and VMs. Applications such as SharePoint, MS Exchange and MS SQL, used regularly by employees, are connected to the backup server. The backup operation then follows the course shown in figure 6.

Figure 9, Backup and DR with agents installed[30]

In contrast, the Asigra solution does not require machines to be rebooted every time the software is updated, unlike other Backup and DR solutions. It does not require major deployments on an existing architecture, and it saves time. The enterprise network may be represented as shown in figure 10. It is a typical scenario using Asigra Backup and DR, comprising all laptops, desktops and VMs. Applications such as SharePoint, MS Exchange and MS SQL used by employees are connected to the backup server. These are connected to the locally installed DS-Client, which may be installed on a physical or a virtual machine, without requiring agents on each of them. Besides local storage, DS-Client is connected through the WAN to DS-System, located at the service provider's datacenter, where the data is stored.

Figure 10, Asigra with agentless Backup and DR[30]

Table 1 shows the system configuration of the desktop on which the experiment was conducted.

CPU                Intel Xeon X5570
Speed              2.93 GHz, Turbo boost up to 3.3 GHz
Operating system   Windows Server 2008 R2
RAM                6 GB

Table 1, System configuration for experiment 1

In the Asigra solution, laptops, mobiles and other handhelds require a mobile client to be installed on them, which is not charged for and therefore reduces costs. However, these systems need to be connected within the same network while performing backup or restore operations. The company's private cloud was used, with a bandwidth of 1GB pipe.

4 RESULTS

This chapter gives a detailed explanation of the results obtained during the experimentation phase to answer the research questions. The survey results can be found in the appendix. The experimentation results are illustrated graphically for easier understanding.

4.1 Survey

Table 2 describes the dominant challenges faced in enterprises. Backup of remote/branch offices was ranked as the top challenge by 30 respondents, followed by difficulties in handling the growth in data by 23 respondents and the time-consuming nature of backups by 21 respondents. The other survey responses can be found in Appendix A.

Table 2, Challenges faced with multiple devices

4.2 Backup window size

Table 3 describes the maximum backup window size for the experimental setup in the IT company. A full backup is performed every week during the weekend, and incremental backups are taken for the rest of the week. This process is repeated every week.

Table 3, Backup window of tape-based Backup and DR

4.3 Age of Backup

The backup cycle and retention policy are defined as shown in table 4, which can be used to understand the age of backup in the IT company. Retention policies define the time period for which the media holding a backup copy are retained [31]. The backup operation performed on the initial day is a full backup, followed by daily incremental backups until the next full backup is performed. Full backups are repeated on a weekly basis. The full backup copy of the fourth week is considered to be the monthly full backup copy. Similarly, the full backup copy of the last week in the last working month of the year in the enterprise is maintained as the yearly full backup copy. Cloud storage offers features such as scalability and flexibility, allowing enterprises to store bulk volumes of data as and when needed. It only requires login credentials for the user to access data.

Currently, the enterprise requires 6 tapes for an incremental backup on a daily basis. The same enterprise requires a minimum of 25 tapes for a full backup with a data size of approximately 27.1 TB. As the volume of data grows, the number of tapes required may increase. This increase may be met by procuring tapes every year.
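A rough lower bound on the number of tapes for a given data size is the data size divided by the tape capacity, rounded up. The sketch below applies this to the 27.1 TB full backup, assuming the published native (uncompressed) capacities of 0.8 TB for LTO-4 and 2.5 TB for LTO-6. It is an estimate only: the actual count depends on compression and on how the two tape generations are mixed, which is not stated here. The function name is hypothetical.

```python
import math


def tapes_needed(data_tb, tape_capacity_tb):
    """Minimum number of tapes: data size / capacity, rounded up."""
    return math.ceil(data_tb / tape_capacity_tb)


FULL_BACKUP_TB = 27.1   # full backup size from the text
LTO4_NATIVE_TB = 0.8    # assumed native capacity of LTO-4
LTO6_NATIVE_TB = 2.5    # assumed native capacity of LTO-6

lto4_only = tapes_needed(FULL_BACKUP_TB, LTO4_NATIVE_TB)
lto6_only = tapes_needed(FULL_BACKUP_TB, LTO6_NATIVE_TB)
print("LTO-4 only:", lto4_only)
print("LTO-6 only:", lto6_only)
```

With these assumptions, the estimate is 34 tapes using only LTO-4 and 11 tapes using only LTO-6, so the reported minimum of 25 tapes falls between the two bounds.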

The total number of tapes required by the enterprise in a year in the initial stages is approximately 375-400. In later stages of backup, 80-150 tapes must be procured every year, according to the average of the past five years of data in the enterprise, which adds to the operational expenses of the enterprise.

Table 4, Backup cycle of tape-based Backup and DR

4.4 Backup Operation

4.4.1 Full Backup Operation

Each full backup was carried out three times and the average of the three noted values was taken. This experiment was performed over several weeks. For the first value, the tape-based full backup was quick compared to the cloud-based full backup. For the second and third values, the time taken to perform the full backup was about equal for both methods. For the fourth value, the tape-based backup took comparatively longer than the cloud-based method. For the fifth value, the time taken for the tape-based backup was slightly higher than that of the cloud-based method due to the difference in network load, as shown in figure 11. Both VM level and file level backups were conducted, where the fourth value represents the file level backup. An extra 180 minutes was added to the tape-based backup time to account for the time required to manually insert the tapes and to store them with precaution in the local storage after a backup.

Figure 11, Full backup comparison of tape-based and cloud-based backup and DR

4.4.2 Incremental Backup

Each incremental backup was carried out three times and the mean of the three noted values was taken. On all the days of incremental backup, the cloud-based method was exceedingly time-consuming compared to tape-based Backup and DR when the backup run itself is compared. Data sets of varying size, from 1205 MB to 2515 MB, were considered for this experiment. The time taken by the cloud-based backup was in the range of 50-100 minutes, while the tape-based backup took 180-200 minutes. The tape-based figures include the additional 180 minutes required for inserting the tapes and storing them at local storage after the backup.
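The way each plotted value is formed (the mean of three runs, plus the fixed 180 minute tape handling allowance for the tape-based method) can be sketched as follows. The individual run times below are hypothetical placeholders chosen only to fall within the reported ranges; they are not measurements from the experiment, and the function name is an assumption.

```python
from statistics import mean

TAPE_HANDLING_MINUTES = 180  # manual tape handling allowance from the text


def effective_backup_minutes(run_minutes, handling_minutes=0):
    """Mean of the measured runs plus any fixed manual handling time."""
    return mean(run_minutes) + handling_minutes


# Hypothetical placeholder run times in minutes (not measured values).
tape_runs = [10, 14, 12]
cloud_runs = [70, 80, 75]

tape_total = effective_backup_minutes(tape_runs, TAPE_HANDLING_MINUTES)
cloud_total = effective_backup_minutes(cloud_runs)
print("tape-based:", tape_total, "minutes")
print("cloud-based:", cloud_total, "minutes")
```

The sketch shows how a short tape run can still yield a larger total than a longer cloud run once the manual handling time is included.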

In tape-based Backup and DR, the number of hours required for transportation must also be examined, as it plays a vital role in the total number of hours required for backing up the data and storing the tapes at the offsite storage site. The values plotted for both the tape-based and cloud-based Backup and DR plans are shown in figure 12. Typically, the tape-based Backup and DR team receives recovery requests at a rate of 50 per year. The rate of change in data during this experiment was approximately 2-5%.

Figure 12, Incremental backup comparison of tape-based and cloud-based backup and DR

4.5 Recovery Operation

Recovery, in which data is retrieved from the storage media, is the basic purpose of having a backup plan in an enterprise. An experiment was conducted on VMs in the LAN that are protected by both the tape-based and the cloud-based Backup and DR plans. An additional estimated 480 minutes was added to the tape-based recovery time to account for identifying the tapes, retrieving them from offsite, bringing them onsite, preparing them and accessing the required data. This procedure is avoided in cloud-based Backup and DR, which saves time.

4.5.1 VM Level Recovery

The time taken to carry out the recovery operation in tape-based Backup and DR with Symantec Backup Exec 2010 was clearly higher than the time taken to recover data of the same size in the cloud-based Backup and DR plan with Asigra. Three iterations of the recovery operation were performed for accuracy, and the average value was noted and used to plot the graph shown in figure 13. VM level recovery with tape-based backup took 300 minutes more than with cloud-based backup.

Figure 13, Recovery on VMs on tape-based and cloud-based backup and DR

4.5.2 File Level Recovery

The recovery operation was carried out at file level on two files of different sizes, with three iterations each. The average values were noted and plotted in the graph shown in figure 14. The restoration procedure in cloud-based Backup and DR requires only a console to access the data and mark its destination in order to begin restoration. In tape-based Backup and DR, by contrast, the required tapes must be identified using the barcode system and retrieved from offsite storage. Once they are brought to the main site, they are inserted into the tape libraries and accessed with the help of an encryption key, as they are hardware encrypted. Only then does the restoration procedure begin, which makes it complex and time-consuming.

Figure 14, Recovery on files in tape-based and cloud-based backup and DR

4.6 CPU Utilizations

Figure 15 plots the CPU utilization of the backup server in tape-based and cloud-based Backup and DR. The data points are plotted at intervals of three minutes, and the sudden rise and fall in the plotted graph mark the beginning and end of the backup operation. Both CPU and memory utilization are affected by changes in network load. This experiment was performed while taking a full backup during a weekend, which accounts for the smaller loads. From figure 15, we can state that cloud-based Backup and DR had lower CPU utilization than Backup and DR using tape media, although both range between 26% and 30% CPU utilization at most instants of time. The CPU and memory utilization values were obtained from counter values using the PerfMon monitoring tool [42].

Figure 15, Graphical representation of CPU utilization

4.7 Memory Utilizations

Figure 16 plots the memory utilization of the backup server in tape-based and cloud-based Backup and DR. The data points are plotted at intervals of three minutes. From figure 16, we can state that cloud-based Backup and DR had lower memory utilization than tape-based Backup and DR. At most instants of time, the values range between 15% and 20% for cloud-based backup, whereas they are 20% and above for tape-based backup. The read and write operations on tapes during backup and recovery procedures may have an impact on the compute resource utilization values. These values were retrieved using the PerfMon monitoring tool.

Figure 16, Graphical representation of memory utilization

4.8 RPO and RTO

Table 5 defines the RPO and RTO for both tape-based and cloud-based Backup and DR. These values define the threshold limits of the enterprise within which backup and recovery operations can be carried out without disruption. The RTO of the tape-based model lies between 5 hours and 24 hours, depending on the age of the data required for restoration. If the data to be restored is at the main site of operation, it requires less time, but if it is located at the offsite storage, more time is required for media retrieval. The RPO in cloud-based Backup and DR is considerably smaller because a scheduled backup takes place every hour to back up data onto the cloud. Backing up data every hour in tape-based Backup and DR would require a large number of tapes, which is not preferred. Hence, the RPO value is affected.

Type of backup   RTO           RPO
Tape based       5 to 24 hrs   12 hrs
Cloud based      4 hrs         1 hr

Table 5, RPO and RTO

Figure 17 gives a clearer view of RPO and RTO in an enterprise when a disaster occurs. The period from the last full backup to the point at which the disaster occurs can be defined as the recovery point objective, and the period from the occurrence of the disaster until business is back to normal can be defined as the recovery time objective. In every enterprise, these values play a crucial role in selecting a Backup and DR system.

Figure 17, RPO and RTO when disaster occurs[32]

4.9 Total Cost of Ownership

Total cost of ownership covers the costs involved in deploying a new technology in an enterprise. Tape-based Backup and DR requires large initial capital expenses for setting up the datacenter and buying infrastructure such as backup servers, tape libraries and tapes. In cloud-based Backup and DR, capital expenses are minimal due to the absence of high-end infrastructure requirements.

In tape-based Backup and DR, operational costs increase every year through the procurement of tapes, maintenance and outages. In cloud-based Backup and DR, operational costs increase through buying storage every year. The Asigra cloud-based Backup and DR costs approximately 81 INR (1.2$) per GB. These costs are minimal because a cloud provider is involved and very little installation is required. Exact values for monitoring, transportation of tapes and the IT staff involved could not be provided in table 6. These affect TCO highly and must be considered for each enterprise. They can be avoided in cloud-based Backup and DR, except for staff, which is comparatively minimal in the cloud solution.

Table 6, Tape-based Backup and DR TCO

Summary

From these results, it is noticeable that cloud-based Backup and DR proves to be better in the experimental environment. Resource utilization was comparatively high in the tape-based Backup and DR plan due to the processing involved in reading and writing data on the tapes.

In the experiment relating to the backup operation, the backup operations were scheduled. In tape-based Backup and DR, manual effort was a continuous requirement, as tapes had to be inserted into the tape library every morning. In cloud-based Backup and DR, by contrast, files are selected and a destination is set to perform the backup. Tape-based Backup and DR added 180 minutes for manual operations: inserting tapes every day, storing them in an airtight container, sending them to the local site for 7 days of storage and then sending them to offsite storage.

In the experiment relating to the recovery operation, 480 minutes were added to the time taken to recover in tape-based Backup and DR because of the procedures involved, such as transporting, identifying, accessing and manually inserting tapes into tape libraries. This figure is a minimum estimate of the time required to carry out all these operations. In cloud-based Backup and DR, by contrast, files can be accessed and selected, and a destination set, to perform restoration through a management console.
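How the manual handling time enters the comparison can be sketched as below. The 180-minute and 480-minute overheads are the values reported above; the measured durations and the function name are hypothetical placeholders, not results from the experiments.

```python
# Manual handling overheads reported for the tape-based solution (minutes).
TAPE_BACKUP_HANDLING_MIN = 180
TAPE_RESTORE_HANDLING_MIN = 480


def end_to_end_minutes(measured_minutes, handling_minutes=0):
    """Add manual handling time to the time measured by the backup software."""
    return measured_minutes + handling_minutes


# Hypothetical measured durations, used only to show the arithmetic.
tape_backup = end_to_end_minutes(240, TAPE_BACKUP_HANDLING_MIN)
tape_restore = end_to_end_minutes(120, TAPE_RESTORE_HANDLING_MIN)
cloud_restore = end_to_end_minutes(120)  # no tape handling involved

print(tape_backup, tape_restore, cloud_restore)
```

Even with identical measured durations, the handling overhead alone makes the tape-based restore several times longer in this sketch.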

In the Asigra solution, the RPO value is 1 hour due to Continuous Data Protection (CDP), which monitors files continuously and backs up changes to the cloud every hour at no extra charge[33]. The RTO in tape-based Backup and DR may vary from 5 hours to 24 hours, depending on the priority and criticality of the information requested for restoration in the enterprise. The infrastructure costs of tape-based Backup and DR were exorbitant compared with those of cloud-based Backup and DR.


5 ANALYSIS

Dependability always plays a crucial role in enterprises when backup or recovery operations are in question. Data loss is a critical concern for any enterprise, as it can even force the enterprise to close. A good business continuity plan must always include an appropriate Backup and DR solution suited to the enterprise architecture.[19]

During our literature study, we came across many participants who were keen on shifting to new, improved technologies but could not shift immediately. Many enterprises assemble an internal or external panel to periodically review the Backup and DR plan currently in use, and a decision is made based on factors such as budget and the need to improve their technologies.

This led us to select specific performance parameters, obtain their absolute values and compare them. The results drawn from the above experiments are of immense use to the enterprise in reviewing its existing Backup and DR plan. The parameters in question have also not received enough focus in previous research. While different traditional Backup and DR plans have been researched, performance-related parameters have received little attention, a gap that this thesis study addresses.

5.1 Research questions

RQ1. What are the challenges faced by enterprises with multiple devices coming in IT estates and backing them up?

The survey results indicate that the primary challenges faced by enterprises are exponential growth in data and backing up remote/branch office data, according to 30% of the survey participants. Nearly 52% of the participants reported their Backup and DR systems to be expensive, which further exposes the enterprise to prolonged time frames.

An agent must be installed on each device in the enterprise, which makes the procedure tedious every time a software update requires all devices to be rebooted. Unlike such agent-based Backup and DR, Asigra does not require agents on all devices; it is a centralized, agentless Backup and DR system, which makes deployment easier, reduces effort during software updates and lowers downtime for the enterprise, as shown in figure 9 and figure 10.

From the survey, the following analysis can be made. Multiple systems such as laptops, handhelds and desktops are introduced in an enterprise, which may be due to the addition of branch/remote offices. These machines need to be backed up, which increases the data size and may eventually increase the resources utilized. This may have a direct impact on time consumption and on the storage space required. More devices also require licensing and an agent installed on each of them, which makes backup expensive.

RQ2. What is the CPU utilization and memory utilization for tape-based and cloud-based backup methods?

The graphs in figure 15 and figure 16 are plotted from an experiment in which CPU utilization and memory utilization values were captured with the PerfMon monitoring tool on tape-based and cloud-based Backup and DR while a backup operation was taking place.


In both scenarios, cloud-based Backup and DR proved to be the better solution. In terms of CPU utilization, cloud-based was at the lower end of the 26-30% range and tape-based was at the higher end of that range. Cloud-based had 15-20% memory utilization, whereas tape-based had memory utilization of 20% and above. The processing of tapes may contribute to the higher computational resource utilization.
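A comparison of this kind can be reproduced from a series of utilization samples, for example values exported from a monitoring tool. The sketch below is illustrative only: the sample values are hypothetical numbers placed inside the 26-30% CPU range reported above, not the measurements of this study, and the function name is an assumption.

```python
from statistics import mean


def summarize(samples):
    """Return minimum, maximum and mean of utilization samples (percent)."""
    return {
        "min": min(samples),
        "max": max(samples),
        "mean": round(mean(samples), 1),
    }


# Hypothetical CPU utilization samples (percent) during a backup operation.
cloud_cpu = [26.0, 27.5, 26.5, 28.0]
tape_cpu = [28.0, 30.0, 29.0, 29.0]

cloud_summary = summarize(cloud_cpu)
tape_summary = summarize(tape_cpu)
print(cloud_summary, tape_summary)
```

Reporting the minimum and maximum next to the mean keeps the overlap between the two ranges visible, which a single average would hide.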

RQ3. How can one evaluate performance of tape-based and cloud-based Backup and DR in terms of RPO, RTO, AOB, Time taken to backup, Time taken to recover and Total Cost of Ownership?

RPO and RTO are defined in table 5 and figure 17 for both tape-based and cloud-based Backup and DR. The differences in RTO may depend on the age of the data required for restoration. The RPO values differ because tape-based backup requires numerous tapes, whereas data is backed up to the cloud continuously every hour.

The bar graphs in figure 11 and figure 12 show the difference between tape-based and cloud-based Backup and DR plans during full backup and incremental backup operations. The bar graphs in figure 13 and figure 14 show the difference in the recovery procedure at the VM level and file level of backup. The tape solution incurred extra time for media transportation, which could be avoided in the cloud solution.

The total cost of ownership is described in table 6. We conducted a further study to understand which factors may affect this parameter in the respective Backup and DR plans. These results can be seen in Appendix B.

5.2 Limitations

• The values were taken at different times, leading to different network loads at different time intervals.
• This study was expensive and access was limited to minimal features of the software.
• Availability of employees to monitor our work was crucial and led to a slight delay in retrieving the required values.
• Various permissions from higher authorities were required to enter the data center premises and understand procedures directly.
• For security reasons, the name of the company and screenshots could not be included in this study.


6 CONCLUSION AND FUTURE WORK

In 2009, a Microsoft subsidiary lost the data of thousands of users following a service disruption, and the data could never be retrieved[37]. In 2010, Salesforce.com suffered a service disruption of almost one hour and thirty minutes that made its offerings unavailable to its customers, although service was restored soon afterwards. Backup and recovery are time-consuming tasks, but they play a vital role. Traditional methods are still followed for their advantages, but they pose challenges that are leading companies to shift towards tapeless backup [38].

Backing up multiple systems (server, desktop, etc.) is a struggle that large-scale organizations face today[39]. This thesis deals with understanding such challenges and reconstructing existing Backup and DR systems so that enterprises benefit.

We conducted a survey to understand these challenges, obtaining valuable information from participants who showed enthusiasm and provided insights into how technology in their enterprises has evolved over the years.

A case study was conducted with the performance framework in mind, on an enterprise network using a tape-based solution, to obtain absolute values. This gave us an opportunity to understand the terms from an industry perspective.

A cloud-based solution was then implemented on the same setup to compare against the values obtained from the case study. The parameters considered were RTO, RPO, age of backup, time taken to backup, time taken to recover, TCO, and the CPU utilization and memory utilization of the backup server.

It is evident from the results that cloud-based Backup and DR performed better with regard to TCO, RTO and RPO. It provides advantages such as:

• Low total cost of ownership: This system requires minimal capital costs.
• Easier deployment: Traditional backup solutions are more complex, as they need more pieces of software or agents. Cloud-based solutions are generally less complex and have fewer pieces of software to manage. Deployment is easy, as no agent needs to be installed on every existing device in the enterprise.
• Flexible: With login credentials, a user can access the data from any location.
• Easier upgrades: Software updates do not require rebooting of all devices.
• Lower downtime, with limitations as shown in[40].

Recovery drills are very difficult to schedule with tape-based solutions because of the complexity of the logistics. They are easier to perform with cloud-based solutions.

Cloud-based solutions offer built-in features like multi-tenancy, encryption and compression, which are not offered by the tape-based solution.

But there are several disadvantages, which are:

• Security: As backup media do not leave the data center in traditional backup, it is less vulnerable to security threats, although loss of tapes, inability to find the right tapes during restores and tapes falling off a truck are quite common. Cloud backup solutions are sometimes susceptible to security concerns because data travels over networks and is subject to hacking. Depending on the cloud deployment, the information may also be available to the cloud provider, making the security of enterprise data questionable. Vijaykumar Javaraiah[41] has proposed services that can run on a hardware platform to avoid dependency on the CSP, but with a limitation of capacity.
• Storage space: Enterprises need to keep increasing the size of online storage as their data grows.
• Bandwidth utilization: One of the main factors in cloud-based Backup and DR. Bandwidth is required for both data transfer and storage operations, and it affects the recovery timeframe.

Many cloud solutions are not comprehensive; they back up only certain types of data but not all. Traditional backup solutions can usually back up any kind of data (except SaaS applications).
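The effect of bandwidth on the recovery timeframe noted in the list of disadvantages can be estimated with a simple calculation. The sketch below is a rough approximation under stated assumptions: the data size, the link speed and the 80% link efficiency are hypothetical, and compression and deduplication are ignored.

```python
def transfer_hours(data_gb, bandwidth_mbps, efficiency=0.8):
    """Estimate the hours needed to move data_gb over a link of bandwidth_mbps.

    data_gb is in decimal gigabytes and bandwidth_mbps in megabits per second;
    efficiency is the assumed usable fraction of the nominal link speed.
    """
    if bandwidth_mbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("bandwidth must be positive and efficiency in (0, 1]")
    megabits = data_gb * 8 * 1000  # gigabytes -> megabits
    seconds = megabits / (bandwidth_mbps * efficiency)
    return seconds / 3600


# Hypothetical example: restoring 500 GB over a 100 Mbit/s link.
hours = transfer_hours(500, 100)
print(round(hours, 2))
```

Under these assumptions, doubling the available bandwidth halves the estimated transfer time, which is why bandwidth has to be sized against the RTO an enterprise wants to achieve.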

6.1 Future work

Technologies keep advancing, and the improvements that enterprises require keep growing by the day. Such requirements include improved timeliness, better RTO and RPO, faster restoration, high availability, etc.

Implementing an effective Backup and DR plan is of prime importance to enterprises. Few research works are readily available for enterprises to rely upon. Enterprises are hiring research teams to understand how they can improve existing technologies without a high impact on budget.

CPU and memory utilization values may be evaluated while conducting a recovery operation. Different cloud providers may be chosen to achieve a deeper understanding.

Enterprises are always concerned with security when using cloud-based Backup and DR. Research in this direction could make cloud technologies more reliable for industry. A hybrid model combining different media may offer an effective disaster recovery approach, which may be investigated further.


REFERENCES

[1] “Survey Reveals Lack of Understanding by Business Executives of the Value of Disaster Recovery and Business Continuity to Organizational Success.” [Online]. Available: http://www.prnewswire.com/news-releases/survey-reveals-lack-of-understanding-by-business-executives-of-the-value-of-disaster-recovery-and-business-continuity-to-organizational-success-62118217.html. [Accessed: 06-Sep-2015].
[2] C. Brooks, M. Bedernjak, I. Juran, and J. Merryman, Disaster recovery strategies with Tivoli storage management. IBM Corp., 2002.
[3] O. P. Rotaru, “Beyond Traditional Disaster Recovery Goals–Augmenting the Recovery Consistency Characteristics,” in Proceedings of the International Conference on Software Engineering Research and Practice (SERP), 2012, p. 1.
[4] “Disaster Recovery: Best Practices [High Availability],” Cisco. [Online]. Available: http://www.cisco.com/en/US/technologies/collateral/tk869/tk769/white_paper_c11-453495.html. [Accessed: 03-Sep-2015].
[5] A. Chervenak, V. Vellanki, and Z. Kurmas, “Protecting file systems: A survey of backup techniques,” in Joint NASA and IEEE Mass Storage Conference, 1998.
[6] T. Wood, E. Cecchet, K. K. Ramakrishnan, P. Shenoy, J. Van Der Merwe, and A. Venkataramani, “Disaster recovery as a cloud service: Economic benefits & deployment challenges,” in 2nd USENIX workshop on hot topics in cloud computing, 2010, pp. 1–7.
[7] D. Zissis and D. Lekkas, “Addressing cloud computing security issues,” Future Gener. Comput. Syst., vol. 28, no. 3, pp. 583–592, 2012.
[8] N. J. King and V. T. Raja, “Protecting the privacy and security of sensitive customer data in the cloud,” Comput. Law Secur. Rev., vol. 28, no. 3, pp. 308–319, 2012.
[9] G. F. Hughes, T. Coughlin, and D. M. Commins, “Disposal of disk and tape data by secure sanitization,” Secur. Priv. IEEE, vol. 7, no. 4, pp. 29–34, 2009.
[10] Y. Zhang, Z. Li, and D. He, “A Survey on Disaster Backup and Recovery Techniques [J],” Comput. Eng. Sci., vol. 2, p. 037, 2005.
[11] “Gartner Survey Shows Data Growth as the Largest Data Center Infrastructure Challenge.” [Online]. Available: http://www.gartner.com/newsroom/id/1460213. [Accessed: 06-Sep-2015].
[12] R. R. Schulman, “Disaster recovery issues and solutions,” Hitachi Data Syst. White Pap., 2004.
[13] R. Xia, X. Yin, J. Alonso Lopez, F. Machida, and K. S. Trivedi, “Performance and Availability Modeling of IT Systems with Data Backup and Restore,” Dependable Secure Comput. IEEE Trans. On, vol. 11, no. 4, pp. 375–389, 2014.
[14] C. G. Rudolph, “Business continuation planning/disaster recovery: a marketing perspective,” Commun. Mag. IEEE, vol. 28, no. 6, pp. 25–28, 1990.
[15] O. H. Alhazmi and Y. K. Malaiya, “Assessing disaster recovery alternatives: On-site, colocation or cloud,” in Software Reliability Engineering Workshops (ISSREW), 2012 IEEE 23rd International Symposium on, 2012, pp. 19–20.
[16] “Unitrends KB :: When are synthetic backups created?” [Online]. Available: http://support.unitrends.com/ikm/questions.php?questionid=980. [Accessed: 08-Sep-2015].
[17] A. Lenk and S. Tai, “Cloud standby: disaster recovery of distributed systems in the cloud,” in Service-Oriented and Cloud Computing, Springer, 2014, pp. 32–46.
[18] “What is differential backup? - Definition from WhatIs.com,” SearchDataBackup. [Online]. Available: http://searchdatabackup.techtarget.com/definition/differential-backup. [Accessed: 03-Sep-2015].
[19] A. Zrnec and D. Lavbič, “Comparison of Cloud vs. Tape Backup Performance and Costs with Oracle Database,” J. Inf. Organ. Sci., vol. 35, no. 1, pp. 135–142, 2011.


[20] “Incremental forever backup versus traditional backup,” IT Service Management 360.
[21] “What are synthetic backups?,” Techworld. [Online]. Available: http://www.techworld.com/storage/what-are-synthetic-backups-1149/. [Accessed: 03-Sep-2015].
[22] X. Shi, K. Guo, Y. Lu, and X. Chen, “Survey on Data Recovery for Cloud Storage,” in Trustworthy Computing and Services, Springer, 2014, pp. 176–184.
[23] M. Wiboonratr and K. Kosavisutte, “Optimal strategic decision for disaster recovery,” Int. J. Manag. Sci. Eng. Manag., vol. 4, no. 4, pp. 260–269, 2009.
[24] “Enterprise Backup, Recovery, and Archive Products and Solutions | EMC.” [Online]. Available: http://www.emc.com/data-protection/index.htm. [Accessed: 08-Sep-2015].
[25] O. H. Alhazmi and Y. K. Malaiya, “Evaluating disaster recovery plans using the cloud,” in Reliability and Maintainability Symposium (RAMS), 2013 Proceedings-Annual, 2013, pp. 1–6.
[26] T. Hammond, “Research: 68% report cost is biggest data storage pain point,” TechProResearch. [Online]. Available: http://www.techproresearch.com/article/research-68-report-cost-is-biggest-data-storage-pain-point/. [Accessed: 03-Sep-2015].
[27] C. R. Kothari, Research methodology: Methods and techniques. New Age International, 2004.
[28] M. N. Saunders, M. Saunders, P. Lewis, and A. Thornhill, Research methods for business students, 5/e. Pearson Education India, 2011.
[29] R. K. Yin, Case study research: Design and methods. Sage publications, 2013.
[30] “Cloud Backup Software | Asigra.” [Online]. Available: http://www.asigra.com/cloud-backup-software. [Accessed: 06-Sep-2015].
[31] “Data Backup and Recovery: Backup Retention.”
[32] “Application RTO and RPO Explained. Part 1: Definitions - SIS.” [Online]. Available: http://thinksis.com/blog/infrastructure/storage/application-rto-and-rpo-explained-part-1-definitions. [Accessed: 08-Sep-2015].
[33] “Continuous Data Protection | Asigra.” [Online]. Available: http://www.asigra.com/CDP-for-service-providers. [Accessed: 08-Sep-2015].
[34] T. Ries, V. Fusenig, C. Vilbois, and T. Engel, “Verification of data location in cloud networking,” in Utility and Cloud Computing (UCC), 2011 Fourth IEEE International Conference on, 2011, pp. 439–444.
[35] M. A. Khoshkholghi, A. Abdullah, R. Latip, S. Subramaniam, and M. Othman, “Disaster Recovery in Cloud Computing: A Survey,” Comput. Inf. Sci., vol. 7, no. 4, Sep. 2014.
[36] S. K. Sood, “A combined approach to ensure data security in cloud computing,” J. Netw. Comput. Appl., vol. 35, no. 6, pp. 1831–1838, Nov. 2012.
[37] J. Kincaid, “T-Mobile Sidekick Disaster: Danger’s Servers Crashed, And They Don’t Have A Backup,” TechCrunch.
[38] “Salesforce.com Rings in New Year With Massive Service Disruption,” AllThingsD.
[39] “Using Multiple Backup Systems Costs Companies Millions in Downtime and Data Loss.” [Online]. Available: http://www.r1soft.com/blog/using-multiple-backup-systems-costs-companies-millions-in-downtime-and-data-loss. [Accessed: 06-Sep-2015].
[40] “Microsoft Azure had more downtime in 2014 than main cloud rivals,” ComputerWeekly. [Online]. Available: http://www.computerweekly.com/news/2240238379/Microsoft-Azure-had-more-downtime-than-main-cloud-rivals. [Accessed: 06-Sep-2015].


[41] V. Javaraiah, “Backup for cloud and disaster recovery for consumers and SMBs,” in 2011 IEEE 5th International Conference on Advanced Networks and Telecommunication Systems (ANTS), 2011, pp. 1–3.
[42] AnandK@TWC, “Perfmon or Performance Monitor in Windows 10 / 8 / 7,” The Windows Club.
[43] Symantec, “Symantec NetBackup™ Better Backup for a Virtual World,” [Online]. Available: http://www.symantec.com/content/en/us/enterprise/fact_sheets/bnetbackup-ds-21324986.pdf. [Accessed: 18-Mar-2015].


APPENDIX A

The survey conducted had the following responses as shown below.


APPENDIX B

These factors contribute to Total Cost of Ownership in an enterprise. Depending on the requirements of an enterprise, the following factors may be included in or excluded from its Backup and DR plan.

Total Cost of Ownership: constituents of different costs for Tape/Disk Backup

• Capital cost
  o License cost + base/server cost
    § Hardware purchase
    § Software purchase
    § Local and remote data circuits
    § Storage area networking
    § Security and Encryption
    § Compression
    § DR site costs
    § Client side cost
    § Cost of growth
  o Integration
    § Backup infrastructure
    § Backup media
    § CIFS (Common Internet File System) or NFS (Network File System)
  o Migration
    § Migration (lifecycle costs of the storage system), remastering (data lifecycle costs)
• Operational cost
  o Training
    § Backup and disaster recovery labor (DR planning and testing)
  o Insurance
  o IT staff
    § Storage management labor (upgrades, troubleshooting, load balancing, tuning)
    § Monitoring costs
  o Management time
    § Cost of disaster risk, business resumption
    § Recovery time objective and recovery point objective (RTO and RPO)
  o Electricity
    § Power consumption
    § Cooling
  o Floor space
    § Data center floor space
  o Outage costs
    § Cost of scheduled outage
    § Cost of unscheduled outage (machine related)
    § Cost of unscheduled outage (people and process related)
  o Backup and recovery cost
    § Hardware maintenance
    § Software maintenance
    § Cost of performance
  o Cost of procurement
  o Transportation costs
• Risk cost
  o Cost of waste
  o Cost of duplicate data
  o Data loss - Loss of reputation (which is immeasurable)
  o Litigation, e-discovery risk (lawsuits)
  o Reduction of hazardous waste
  o Cost of risk with backup windows
  o Noncompliance risk (archive, data retention) - negative publicity
• Opportunity cost (opportunity cost is the value of the opportunity lost).
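The cost breakdown above forms a tree, so a TCO estimate can be obtained by summing whichever leaf costs an enterprise decides to include. The following minimal sketch shows that roll-up; the category names follow the list above, while every amount and the function name are hypothetical.

```python
def total_cost(node):
    """Sum a cost tree whose branches are dicts and whose leaves are numbers."""
    if isinstance(node, dict):
        return sum(total_cost(child) for child in node.values())
    return node


# Hypothetical amounts in a single currency; an enterprise would include
# only the factors that apply to its own Backup and DR plan.
tco = {
    "capital cost": {
        "hardware purchase": 50000,
        "software purchase": 20000,
    },
    "operational cost": {
        "IT staff": 30000,
        "electricity": {"power consumption": 4000, "cooling": 2000},
    },
    "risk cost": {"cost of duplicate data": 1000},
}

print(total_cost(tco))
print(total_cost(tco["operational cost"]))
```

Keeping the tree structure, rather than a flat list, makes it possible to report subtotals per category, for example capital cost against operational cost.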