1
U.S. Department of the Interior
U.S. Geological Survey
Mission Support Team: Storage Architectures

Presented by Ken Gacke, SAIC*
U.S. Geological Survey, EROS Data Center, Sioux Falls, SD
July 2004

* Work performed under U.S. Geological Survey contract 03CRCN0001
2
Storage Architecture Agenda

- Online/Nearline Storage Architecture
- System Backup Architecture
  - Onsite: short-term system recovery
  - Offsite: disaster recovery
- Archive Storage
  - Long-term data preservation
3
Storage Architecture Agenda

- Online/Nearline Storage Architecture
- System Backup Architecture
  - Onsite: short-term system recovery
  - Offsite: disaster recovery
- Archive Storage
  - Long-term data preservation
4
Storage Architecture

- Online
  - Direct Attached Storage (DAS)
    - Just a Bunch Of Disks (JBOD): intermediate processing
    - Redundant Array of Independent Disks (RAID): database, web/FTP, and product generation
  - Network Attached Storage (NAS): office automation
  - Storage Area Network (SAN): clustered file system for high-performance processing
- Nearline: online disk cache with a high-performance tape backend
5
Storage Architecture

EDC's historical nearline experience:
- EPOCH, AMASS (1987 – 1993): Optical
- AMASS (1992 – 2000): Quantum DLT 2000
- UniTree (1992 – 2001): StorageTek 3480/3490/D-3/9840
- DMF, AMASS, LAM (2000 – Present): StorageTek 9840/9940B
6
Storage Architecture

Multi-Tiered Storage Vision:
- Online
  - Supported configurations:
    - DAS: local processing such as image processing
    - NAS: data sharing such as office automation
    - SAN: production processing
  - Data accessed frequently
- Nearline
  - Integrated within the SAN
  - Scalable for large datasets and infrequently accessed data
  - Multiple copies and/or offsite storage
7
Storage Architecture Decisions

- Optimized by individual program and program manager, not the enterprise.
- Requirements factors:
  - Reliability: data preservation
  - Performance: data access
  - Cost: $/GB, engineering support, O&M
  - Scalability: data growth, multi-mission, etc.
  - Compatibility with the current architecture
- Evaluated and recommended through engineering white papers and weighted decision matrices.
8
High Performance RAID Weighted Matrix

Raw scores (criterion weight and 0–10 rating per candidate):

Selection Criteria  Wt  EMC CX300  EMC CX500  STK D240  STK D220  Ciprico FibreSt  Adaptec SANbloc  NexSan Ataboy
Initial Cost         9          8          5         6         7                4               10             10
Support Cost         9          4          4         6         6                5               10             10
Vendor Support       8          9          9         9         9                8                7              5
EDC Experience       6          7          7         8         8                7                5              6
Performance          8          8          8         9         8                6                7              6
Reliability          9          9          9         9         9                6                7              6
Manageability        8          7          7         9         9                6                7              7
Scalability          6          7          8         8         7                7                7              5
SAN Ready            4          8          8         8         8                6                8              8
Upgradeable          4          9          9         8         8                7                7              5

Weighted scores (weight x raw score):

Selection Criteria    EMC CX300  EMC CX500  STK D240  STK D220  Ciprico FibreSt  Adaptec SANbloc  NexSan Ataboy
Initial Cost                 72         45        54        63               36               90             90
Support Cost                 36         36        54        54               45               90             90
Vendor Support               72         72        72        72               64               56             40
EDC Experience               42         42        48        48               42               30             36
Performance                  64         64        72        64               48               56             48
Reliability                  81         81        81        81               54               63             54
Manageability                56         56        72        72               48               56             56
Scalability                  42         48        48        42               42               42             30
SAN Ready                    32         32        32        32               24               32             32
Upgradeable                  36         36        32        32               28               28             20
Total Weighted Score        533        512       565       560              431              543            496
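The matrix totals are each candidate's raw scores multiplied by the criterion weights and summed. A minimal sketch of that calculation (the data layout and function names are mine; the weights and scores come from the slide):

```python
# Criterion weights from the high-performance RAID matrix.
weights = {
    "Initial Cost": 9, "Support Cost": 9, "Vendor Support": 8,
    "EDC Experience": 6, "Performance": 8, "Reliability": 9,
    "Manageability": 8, "Scalability": 6, "SAN Ready": 4, "Upgradeable": 4,
}

# Raw scores (0-10) per candidate, in the same criterion order as `weights`.
raw_scores = {
    "EMC CX300":       [8, 4, 9, 7, 8, 9, 7, 7, 8, 9],
    "EMC CX500":       [5, 4, 9, 7, 8, 9, 7, 8, 8, 9],
    "STK D240":        [6, 6, 9, 8, 9, 9, 9, 8, 8, 8],
    "STK D220":        [7, 6, 9, 8, 8, 9, 9, 7, 8, 8],
    "Ciprico FibreSt": [4, 5, 8, 7, 6, 6, 6, 7, 6, 7],
    "Adaptec SANbloc": [10, 10, 7, 5, 7, 7, 7, 7, 8, 7],
    "NexSan Ataboy":   [10, 10, 5, 6, 6, 6, 7, 5, 8, 5],
}

def weighted_total(scores):
    """Sum of weight x raw score across all criteria."""
    return sum(w * s for w, s in zip(weights.values(), scores))

# Rank candidates by total weighted score, highest first.
ranking = sorted(raw_scores, key=lambda c: weighted_total(raw_scores[c]), reverse=True)
for c in ranking:
    print(c, weighted_total(raw_scores[c]))
```

The same routine reproduces the bulk-RAID and archive-media matrices by swapping in their weights and scores.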
9
Bulk RAID Weighted Matrix

Raw scores (criterion weight and 0–10 rating per candidate):

Selection Criteria  Wt  Nexsan Ataboy2  CESATA  EMC SATA  STK B220  STK D240
Initial Cost        10              10      10         7         9         6
Support Cost        10              10       9         5         8         6
Vendor Support       2               5       3         9         9         9
EDC Experience       0               6       5         6         7         8
Performance          5               6       6         7         7         9
Reliability          1               6       3         7         7         9
Manageability        5               7       4         7         9         9
Scalability          1               5       5         7         7         8
SAN Ready            1               8       0         8         8         8
Upgradeable          1               5       3         9         9         8

Weighted scores (weight x raw score):

Selection Criteria    Nexsan Ataboy2  CESATA  EMC SATA  STK B220  STK D240
Initial Cost                     100     100        70        90        60
Support Cost                     100      90        50        80        60
Vendor Support                    10       6        18        18        18
EDC Experience                     0       0         0         0         0
Performance                       30      30        35        35        45
Reliability                        6       3         7         7         9
Manageability                     35      20        35        45        45
Scalability                        5       5         7         7         8
SAN Ready                          8       0         8         8         8
Upgradeable                        5       3         9         9         8
Total Weighted Score             299     257       239       299       261
10
CR1 Storage in Terabytes – May 2004

[Pie chart: CR1 storage by type (Nearline, JBOD, RAID) with segment values of 8.4, 53.7, and 62.181 terabytes]
11
CR1 SAN/Nearline Architecture

[Diagram: DMF server connected over 1Gb and 2Gb Fibre to tape drives (8x 9840, 2x 9940B) and to disk cache file systems (/dmf/edc 68GB, /dmf/doqq 547GB, /dmf/guo 50GB, /dmf/pds 223GB, /dmf/pdsc 1100GB); product distribution is reached over Ethernet]
12
Future Seamless/Silo Architecture

[Diagram: DMF and PDS servers sharing a tape library (8x 9840, 3x 9940B); data servers for FTP (lxs37) and Web/Extract backed by TP9300S 3TB and TP9400 storage; CIFS mount and Ethernet connectivity]
13
Storage Architecture Agenda

- Online/Nearline Storage Architecture
- System Backup Architecture
  - Onsite: short-term system recovery
  - Offsite: disaster recovery
- Archive Storage
  - Long-term data preservation
14
System Backup Architecture
- ITS is responsible for generating system backups to maintain system integrity.
- Promote a centralized data backup solution to the projects:
  - Legato is used for automated system backups on the Unix (SGI, Sun, Linux) platforms.
  - ArcServe is used for automated system backups on the Windows platform.
- Fully automated backup solution:
  - Tapes are located within the tape library.
  - Retention period is three months.
15
System Backup Architecture
Unix servers:
- Weekly full backups with daily incrementals:
  - System partitions
  - Local and third-party software packages
  - Databases (DORRAN, Earth Explorer, Inventory, Seamless), using the Legato Oracle Module for very large databases
- Quarterly full backups with daily incrementals:
  - RAID datasets (DRG, Browse, anonymous FTP), excluding image files and large files
  - User data file systems
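The "exclude image files and large files" policy for the quarterly RAID-dataset backups amounts to a filtered file walk. A minimal sketch of that selection logic (the suffix list and size cutoff are hypothetical, not the actual Legato configuration):

```python
import os
import tempfile

EXCLUDE_SUFFIXES = (".tif", ".img", ".jpg")   # hypothetical image extensions
MAX_SIZE_BYTES = 2 * 1024**3                  # hypothetical "large file" cutoff

def incremental_candidates(root, last_backup_mtime):
    """Yield files changed since the last backup that pass the exclusions."""
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            if name.lower().endswith(EXCLUDE_SUFFIXES):
                continue                       # skip image files
            st = os.stat(path)
            if st.st_size > MAX_SIZE_BYTES:
                continue                       # skip large files
            if st.st_mtime > last_backup_mtime:
                yield path                     # changed since last backup

# Tiny demonstration on a throwaway directory.
with tempfile.TemporaryDirectory() as root:
    for name in ("readme.txt", "scene.tif"):
        with open(os.path.join(root, name), "w") as f:
            f.write("data\n")
    selected = [os.path.basename(p) for p in incremental_candidates(root, 0.0)]
    print(selected)
```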
16
System Backup Architecture

- Windows servers: typically full backups with daily incrementals (no exclusions)
- Workstations and PCs: generally no system backups; production workstations within CR1 (International, WBV) are backed up
17
System Backup Resources

- Legato (Unix): Sun E450 (4 CPUs) with a StorageTek L700 library and six SDLT 220 drives
- ArcServe (Windows): Dell 2550 (2 CPUs) with an Overland Storage NEO 4100 library and three LTO-2 drives
18
Legato Monthly Data Backups

[Chart: GB stored per month from 10/99 through 04/04, scale 0–20,000 GB; series: Quarterly, IT, Landsat, WebMap, Total]
19
Offsite Backup Architecture
- ITS is responsible for generating offsite backups for disaster recovery:
  - Mission-essential data is written to media and stored offsite.
  - An LTO-2 tape is generated once per week.
  - Data is written in an open format (tar).
  - Retention period is three months.
- Projects currently using offsite storage: DORRAN, Inventory, EarthExplorer, Digital Archive, LAS source code, and web servers.
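Writing the offsite copies in plain tar keeps them readable without Legato or any vendor tool. A minimal sketch of writing and verifying such an archive with Python's standard tarfile module (the paths and file names are hypothetical):

```python
import os
import tarfile
import tempfile

# Hypothetical stand-in for a weekly offsite job; the real jobs wrote
# mission-essential project data (DORRAN, Inventory, etc.) to LTO-2 tape.
with tempfile.TemporaryDirectory() as workdir:
    src = os.path.join(workdir, "dorran")
    os.makedirs(src)
    with open(os.path.join(src, "export.dat"), "w") as f:
        f.write("sample record\n")

    archive = os.path.join(workdir, "offsite-weekly.tar")
    with tarfile.open(archive, "w") as tar:    # plain uncompressed tar: open format
        tar.add(src, arcname="dorran")

    with tarfile.open(archive, "r") as tar:    # verify the archive is readable
        names = tar.getnames()
    print(names)
```

Because the format is open, the same archive can also be restored with any stock tar utility, which is the point of the disaster-recovery requirement.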
20
Storage Architecture Agenda

- Online/Nearline Storage Architecture
- System Backup Architecture
  - Onsite: short-term system recovery
  - Offsite: disaster recovery
- Archive Storage
  - Long-term data preservation
21
Archive Storage

- Digital Archive Media Trade Study: analyze offline digital archive technologies and recommend the next EDC archive media of choice.
- Criteria in decreasing order of importance:
  - Reliability: a second copy will reduce risk somewhat, but a reliable technology is mandatory; reliability is proven over time.
  - Performance: high capacity saves significant space, and high transfer rates speed up transcription.
  - Cost: the actual drive cost is fairly insignificant, but the media cost is quite important.
22
Archive Media Weighted Matrix (FY04 Revision)

Raw scores (criterion weight and 0–10 rating per candidate; Transfer rate and Vendor analyses carry zero weight in this revision):

Selection Criteria  Wt  STK 9940B  HP LTO2  IBM LTO2  SDLT 600  Sony SAIT  IBM 3592
Design criteria     50       10.0      7.1       7.1       6.5        6.5       7.9
Capacity            10        4.1      4.1       4.1       6.8       10.0       6.0
Media cost/TB       25       10.0      9.8       9.8       9.7        9.3       7.9
Compatibility       15        1.7      8.3       8.3      10.0       10.0       6.7
Transfer rate        0        8.2      8.4       8.9      10.0        9.0       6.9
Drive cost          10        2.4     10.0       6.6       7.7        3.8       1.2
Vendor analyses      0        4.1      5.8       7.3       9.6        7.6      10.0
Scenario cost       20        4.7     10.0       9.2       9.4        7.6       4.1

Weighted scores (weight x raw score):

Selection Criteria    STK 9940B  HP LTO2  IBM LTO2  SDLT 600  Sony SAIT  IBM 3592
Design criteria           500.0    355.0     355.0     325.0      325.0     395.0
Capacity                   41.0     41.0      41.0      68.0      100.0      60.0
Media cost/TB             250.0    245.0     245.0     242.5      232.5     197.5
Compatibility              25.5    124.5     124.5     150.0      150.0     100.5
Transfer rate               0.0      0.0       0.0       0.0        0.0       0.0
Drive cost                 24.0    100.0      66.0      77.0       38.0      12.0
Vendor analyses             0.0      0.0       0.0       0.0        0.0       0.0
Scenario cost              94.0    200.0     184.0     188.0      152.0      82.0
Total Weighted Score      934.5   1065.5    1015.5    1050.5      997.5     847.0
23
System Overview

Quantity of data to be copied (2 copies):

Data Set    Scenes    Data Volume       DCTs / HDTs   9940B
MSS-P       65,128      3.2 terabytes      118 DCTs      36
MSS-A      262,088      9.5 terabytes      277 DCTs     100
TM-A        13,733      3.6 terabytes      108 DCTs      40
TM-R       386,934    102.2 terabytes    2,357 DCTs   1,040
TM-R      ~150,000    ~40.4 terabytes   ~7,500 HDTs     420
Total      877,883    158.9 terabytes   10,758 tapes  1,636

Number of HDTs currently transcribed on TMACS to DCT: 30,500
Quantity of HDTs that can be landfilled after conversion: 38,000+
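The 9940B column can be sanity-checked arithmetically. Assuming the 9940B's 200 GB native cartridge capacity, a lower bound on cartridges is ceiling(volume x 2 copies / 200 GB); the slide's counts run somewhat higher, presumably reflecting per-dataset packing and overhead. A minimal sketch (data structure and function names are mine):

```python
COPIES = 2
TAPE_CAPACITY_GB = 200  # assumed STK 9940B native cartridge capacity

# Data volumes (terabytes) from the System Overview table.
datasets = {
    "MSS-P": 3.2,
    "MSS-A": 9.5,
    "TM-A": 3.6,
    "TM-R (DCT)": 102.2,
    "TM-R (HDT)": 40.4,
}

def min_tapes(volume_tb):
    """Lower bound on 9940B cartridges for COPIES copies of a dataset."""
    gb = round(volume_tb * 1000)  # whole gigabytes, avoiding float ceiling error
    return (gb * COPIES + TAPE_CAPACITY_GB - 1) // TAPE_CAPACITY_GB  # ceiling division

for name, tb in datasets.items():
    print(f"{name}: at least {min_tapes(tb)} cartridges")
```

For example, TM-R's 102.2 TB gives a lower bound of 1,022 cartridges against the table's 1,040.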
24
Big Changes
Your Order is in What Box?!
25
Impact

- 38,000 HDTs (1 copy): 13 semis
- Fewer than 1,800 STK 9940B cartridges (2 copies): half the cargo space of an SUV
26
Impact

- 38,000 HDTs (1 copy): 13 semis
- Fewer than 1,800 STK 9940B cartridges (2 copies): one-third the space of a STK silo