
DATA BACKUP/RECOVERY TRAINING

Dehner M. De Leon
Officer – Database Administration
IRRI Social Sciences Division

WHY THERE IS A NEED?

Unexpected computer glitches always happen, whether it is:

• hardware failure,
• files becoming corrupted,
• virus attacks,
• power failures and spikes,
• accidental deletion of an essential file, or
• something else that makes it impossible to open and read a file.

As per IRRI’s BCM-RMQA:

• Yearly risk assessment (2008-2010): possible loss of research data
  o Risk sources – technological failure
  o Physically hazardous environments
  o Lack of training/awareness
  o Disasters/calamities
  o Relocation of IT equipment (2011)

SSD Data Backup/Retrieval Procedure

Objective:

The purpose of this procedure is to safeguard all the important/working files and databases on IRRI SSD/GIS workstations and laptops, in compliance with IRRI’s Business Continuity Management (BCM). In the event of data loss, a quick recovery plan is in place.

SSD Data Backup/Retrieval Procedure
Data to be backed up

Data commonly stored on the secondary partition

WHAT IS A PARTITION?

Disk partitioning is the act of dividing a hard disk drive into multiple logical storage units referred to as partitions, to treat one physical disk drive as if it were multiple disks.

[Diagram: one 500GB drive divided into two 250GB partitions (1 drive / 2 partitions), with the callout "OS? Data? Etc.?"]

• Primary partition (System Drive C:): program files, temp files
• Secondary partition (Data Drive E:): My Docs, pst etc., customized programs

SSD Data Backup/Retrieval Procedure
Data to be backed up

Commonly stored on the secondary partition:

• Research data files: all of your files, including Microsoft Office documents, spreadsheets, presentations, databases, graphics files, audio and video files, software, etc.
• Outlook pst files

SSD Data Backup/Retrieval Procedure
Data that are not necessary to back up

• Files on USB flash “thumb” drives
• Partition Drive C: (operating system)
• NAS or \\Netwin\SSD
• Data stored on the cloud
• Non-official data
• Any source whose data is larger than the backup storage capacity (for example, 520GB > 500GB)
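The capacity rule above (520GB > 500GB) can be checked before a backup starts. The following is a minimal sketch, not part of the SSD procedure itself: the function names are made up for illustration, and it assumes sizes are compared in bytes using Python's standard library.

```python
import os
import shutil

GB = 1000 ** 3  # decimal gigabyte, as printed on drive labels


def folder_size_bytes(root):
    """Total size in bytes of all regular files under root."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                total += os.path.getsize(path)
    return total


def target_free_bytes(path):
    """Free space in bytes on the drive that holds path."""
    return shutil.disk_usage(path).free


def fits_on_target(source_bytes, free_bytes):
    """True when the source data fits in the free space of the target."""
    return source_bytes <= free_bytes


# The example from the slide: 520GB of data cannot go on a 500GB drive.
print(fits_on_target(520 * GB, 500 * GB))  # prints False
```

On a workstation this could be called as fits_on_target(folder_size_bytes(source), target_free_bytes(target)), where source and target are the data partition and the external drive; the actual drive letters depend on the machine.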

What is a Backup?

In information technology, a backup, or the process of backing up, refers to making copies of data so that these additional copies may be used to restore the original after a data loss event.

Synchronizing/Mirroring: in computing, the process of making sure that files in two or more locations are updated through certain rules.

Imaging: a disk image is a single file or storage device containing the complete contents and structure representing a data storage medium or device.

Modes of Backup

[Diagram: One-way Backup and Two-way Backup, each drawn between a Source and a Target.]

One-Way Synchronization (a.k.a. file mirroring / file replication / file backup):

Files are expected to change in one location only. To reconcile the changes, the synchronization process copies files in one direction only. The two locations are not considered equivalent: one is the Source and the other is the Target. Files are pushed from Source to Target (or pulled from Source to Target, but always in one direction only). Source is said to be mirrored to Target.

Two-Way Synchronization (a.k.a. bi-directional synchronization or both-ways synchronization):

Files are expected to change in both locations. This synchronization process copies files in both directions to reconcile changes as needed. The two locations are considered equivalent.
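The difference between the two modes can be sketched in a few lines of Python. This is an illustration only, not how any particular backup product works: the function names are invented, files are compared by modification time, and deletions are deliberately not propagated (a real two-way tool has to remember the previous state to tell a deleted file from a new one).

```python
import os
import shutil


def list_files(root):
    """Map relative path -> modification time for every file under root."""
    found = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            found[os.path.relpath(full, root)] = os.path.getmtime(full)
    return found


def copy_file(src_root, dst_root, rel):
    """Copy one file, creating parent folders and keeping its timestamp."""
    dst = os.path.join(dst_root, rel)
    os.makedirs(os.path.dirname(dst) or ".", exist_ok=True)
    shutil.copy2(os.path.join(src_root, rel), dst)


def one_way_backup(source, target):
    """Push new or newer files from source to target; source is never changed."""
    src, dst = list_files(source), list_files(target)
    for rel, mtime in src.items():
        if rel not in dst or mtime > dst[rel]:
            copy_file(source, target, rel)


def two_way_sync(a, b):
    """Copy whichever side is missing or older, in both directions."""
    one_way_backup(a, b)
    one_way_backup(b, a)
```

With one_way_backup, a file created only on the Target never reaches the Source, and a file deleted at the Source stays on the Target. With two_way_sync, both locations end up holding the newest copy of every file.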

Modes of Backup

One-way Backup (Source, Target)

Pros:
• File replication
• File backup
• Can keep previous data

Cons:
• Consumes more space if unattended
• A new file on the Target will not be reflected in the Source
• A file deleted or renamed at the Source will remain in the Target

Two-way Backup (Source, Target)

Pros:
• Source and Target are up to date

Cons:
• Once a file is deleted at the Source, it is also deleted from the Target

What are the available Backup Systems out there?

Proprietary

• Bundled with your external storage drive: Seagate, Maxtor, Western Digital, etc.

Free Windows Backup Software

• Cobian, Todo, DeltaCopy, Ace, SyncToy, and the new Windows 7 backup feature


2 Main assets of SSD

1. Data
2. People

2 Main risks

1. Loss / quality of data
2. Loss of life (endangerment)

Let’s focus on Data

The SSD Database

• A comprehensive digital database
• An integrated data center for socioeconomic data on rice production systems at the farm level, and for rice demand, supply, and related data at the national and regional levels
• Composed of primary and secondary data collected through the research projects and activities of SSD
• Organized and accessible in a user-friendly fashion

Household Survey Database

It is a rich collection of actual farm- and household-level data on rice production, collected through

• personal farmer interviews,
• farm record keeping, and
• periodic monitoring of farm activities

from various sites in different rice-growing countries of Asia.

Core activities of SSD

• Data Collection
• Data Management
• Data Analysis

SSD Work Process Flow

• Conceptualize
• Pre-testing of the survey questionnaire
• Field survey
• Training of enumerators
• Coding and quality control of the data
• Confirmation
• Data entry/cleaning
• Data analysis
• Incorporate results
• Publications
• Presentations

[Diagram: data processing flow. Steps shown: data entry; data cleaning (key variables); upload to public domain / share data among members; data merge; data cleaning; variable construction; analysis. Tools shown: CSPro, Excel/STATA, STATA (program).]

Current status of the household survey data sets

• The SSD survey data sets are scattered: each researcher keeps his or her own project data set.
• Each data set is kept in a format known only to the researcher.
• There is no focal person to ask about who keeps which data set.
• There is no standard protocol for the repository of data collected by SSD and NARES collaborators.
• There is no standard system for collecting and organizing the data sets.

Where’s the Data??

[Diagram: SSD data sets spread across projects: STRASA, Africa Rice, CSISA, PRSSP, GSR, VDS, Bohol Project, CGISA.]

What steps are involved to build this database?

1. Develop a standard format that defines how all data sets by project must be formatted. How do we do this?

• Involve the majority, if not all, of the SSD researchers who collect, summarize, analyze, and manage the data sets of all completed and ongoing SSD projects.
• A series of meetings to develop a common template for the database.
• A workshop was held to apply this template to at least one data set per participant and to further refine the template so that it applies to the majority of the data sets in SSD.

2. Merge and clean all the data sets so that they adhere to a common format, codes, etc., to establish the master file database.

3. With further consultation, agree on which components of the data sets will be freely open to the public and which will have limited access.

4. Do additional processing of selected variables to create the data set that will be accessible to the general public.
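Steps 1 to 4 can be sketched in Python as follows. Everything named here is hypothetical: the template columns, the project names PROJECT_A and PROJECT_B, their column mappings, and the choice of public columns are placeholders for whatever the SSD template and consultations actually define.

```python
# Hypothetical common template: the columns every data set must end up with.
TEMPLATE = ["project", "household_id", "country", "year", "rice_area_ha"]

# Hypothetical mappings from each project's own column names to the template.
COLUMN_MAPS = {
    "PROJECT_A": {"hh_id": "household_id", "ctry": "country",
                  "yr": "year", "area": "rice_area_ha"},
    "PROJECT_B": {"HHID": "household_id", "Country": "country",
                  "Year": "year", "RiceArea": "rice_area_ha"},
}

# Hypothetical outcome of step 3: columns that are open to the public.
PUBLIC_COLUMNS = ["project", "country", "year", "rice_area_ha"]


def to_template(project, row):
    """Rename one record's columns to the template; missing fields become None."""
    mapping = COLUMN_MAPS[project]
    renamed = {mapping[key]: value for key, value in row.items() if key in mapping}
    renamed["project"] = project
    return {column: renamed.get(column) for column in TEMPLATE}


def build_master(datasets):
    """Step 2: merge all project data sets into one list of template records."""
    master = []
    for project, rows in datasets.items():
        master.extend(to_template(project, row) for row in rows)
    return master


def public_subset(master):
    """Step 4: keep only the columns agreed to be open to the public."""
    return [{column: record[column] for column in PUBLIC_COLUMNS} for record in master]
```

Each project keeps its own column names at the source; only the mapping table changes when a new data set is added, which is one reason for agreeing on the template first.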

Steps involved…

[Diagram: project data sets (STRASA, Africa Rice, CSISA, PRSSP, GSR, VDS, Bohol Project, CGISA) feeding into one unified database.]

Initial accomplishments

Number of data sets: 10
Inclusive period: 1993-2008
Sites: 9 countries
Total number of records: 6,622
Hundreds of variables: inputs and outputs of rice production, demographics, income, land profile, water use, variety planted, etc.

SSD Database

[Diagram: storage layout. Labels shown: Unified Database, IRRINAS, External HDD, HDD RAID, SSD Workstations, Internet.]

• SSD / GIS servers are managed in coordination with ITS.
• SSD NAS (1TB): working files only!

Significance

• Once completed, it will be the first comprehensive digital socioeconomic database on farm-level rice production in the rice-growing areas of Asia (Africa),
• for use by researchers, government and academic institutions, donors, and other interested members of society.
• It is a gold mine of information on what is actually happening in the farmer’s field.

To attest SSD’s RMQA compliance

• Visit www.irri.org under Our Sciences\Social Science & Economics
  o Farm household survey (http://geo.irri.org:8180/households)
  o World rice statistics (http://geo.irri.org:8180/wrs)
• Procedures are in place (Netwin, back-up monitoring)
• SSD RMQA poster
• Awareness (OU head approval, notifications via email and mousepad imprint)

People who work hard to make this happen

• SSD Database Team
• SSD Household Database
• SSD RMQA Team

THANK YOU!!

What will it contain?

1. List of all completed research projects of SSD, with some basic information about each, such as:

• project title
• project sites
• principal researcher
• duration of the project
• major variables collected
• main objective of the research
• project output (reports and papers published)

2. Selected summary tables on basic information about rice production at the farm level

3. Detailed data by individual household observations on the following:

• land use/profile
• household information
• inputs to rice production, e.g. fertilizer, seeds, pesticides, and labor use
• rice production practices: method of crop establishment, variety planted, level of mechanization, etc.
• costs and returns of rice production by individual sample household