SAN Disk Metrics
Measured on Sun Ultra & HP PA-RISC Servers, StorageWorks MAs & EVAs, using iozone V3.152
Current Situation
UNIX External Storage has migrated to SAN
Oracle Data File Sizes: 1 to 36 GB (R&D)
Oracle Servers are predominantly Sun “Entry Level”
HPQ StorageWorks: 24 MAs, 2 EVAs
2Q03 SAN LUN restructuring using RAID 5 only
Oracle DBAs continue to request RAID 1+0
Roadmap for future is needed
Purpose Of Filesystem Benchmarks
Find Best Performance
  Storage, Server, HW options, OS, and Filesystem
Find Best Price/Performance
  Restrain Costs
Replace “Opinions” with Factual Analysis
Continue Abbott UNIX Benchmarks
  Filesystems, Disks, and SAN
  Benchmarking began in 1999
Goals
Measure Current Capabilities
Find Bottlenecks
Find Best Price/Performance
Set Cost Expectations For Customers
  Provide a Menu of Configurations
Find Simplest Configuration
Satisfy Oracle DBA Expectations
Harmonize Abbott Oracle Filesystem Configuration
Create a Road Map for Data Storage
Preconceptions
UNIX SysAdmins
  RAID 1+0 does not vastly outperform RAID 5
  Distribute Busy Filesystems among LUNs
  At least 3 LUNs should be used for Oracle
Oracle DBAs
  RAID 1+0 is Required for Production
  I Paid For It, So I Should Get It
  Filesystem Expansion On Demand
Oracle Server Resource Needs in 3D
[Diagram: three axes, CPU, Memory, and I/O. Web serving: small, integrated system. Database/CRM/ERP: storage.]
Sun Servers for Oracle Databases
Sun UltraSPARC UPA Bus Entry Level Servers
  Ultra 2, 2x300 MHz UltraSPARC-II, SBus, 2 GB
  220R, 2x450 MHz UltraSPARC-II, PCI, 2 GB
  420R, 4x450 MHz UltraSPARC-II, PCI, 4 GB
Enterprise Class Sun UPA Bus Servers
  E3500, 4x400 MHz UltraSPARC-II, UPA, SBus, 8 GB
Sun UltraSPARC Fireplane (Safari) Entry Level Servers
  280R, 2x750 MHz UltraSPARC-III, Fireplane, PCI, 8 GB
  480R, 4x900 MHz UltraSPARC-III, Fireplane, PCI, 32 GB
  V880, 8x900 MHz UltraSPARC-III, Fireplane, PCI, 64 GB
Other UNIX
  HP L1000, 2x450 PA-RISC, Astro, PCI, 1024 MB
Oracle UNIX Filesystems
Cooperative Standard between UNIX and R&D DBAs
8 Filesystems in 3 LUNs
  /exp/array.1/oracle/<instance>    binaries & config
  /exp/array.2-6/oradb/<instance>   data, index, temp, etc…
  /exp/array.7/oraarch/<instance>   archive logs
  /exp/array.8/oraback/<instance>   export, backup (RMAN)
Basic LUN Usage
  Lun1: array.1-3
  Lun2: array.4-6
  Lun3: array.7-8 (Initially on “far” Storage Node)
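The layout above can be written down as a small lookup, which makes the filesystem-to-LUN split easy to check. This is only an illustrative sketch: the instance name "demo" and the function name are hypothetical, and the paths simply follow the pattern shown on the slide.

```python
# Directory name per array number, as listed on the slide:
# array.1 binaries, array.2-6 data, array.7 archive logs, array.8 backups.
PURPOSE_BY_ARRAY = {1: "oracle", 2: "oradb", 3: "oradb", 4: "oradb",
                    5: "oradb", 6: "oradb", 7: "oraarch", 8: "oraback"}

# Basic LUN usage: Lun1 = array.1-3, Lun2 = array.4-6, Lun3 = array.7-8.
LUN_BY_ARRAY = {1: 1, 2: 1, 3: 1, 4: 2, 5: 2, 6: 2, 7: 3, 8: 3}


def oracle_filesystems(instance):
    """Return (path, lun) pairs for the 8 filesystems of one instance."""
    return [
        (f"/exp/array.{n}/{PURPOSE_BY_ARRAY[n]}/{instance}", LUN_BY_ARRAY[n])
        for n in range(1, 9)
    ]


layout = oracle_filesystems("demo")  # "demo" is a hypothetical instance name
for path, lun in layout:
    print(f"Lun{lun}: {path}")
```

Listing the pairs this way shows that the data filesystems (array.2-6) straddle Lun1 and Lun2, while archive logs and backups sit alone on Lun3.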
StorageWorks SAN Storage Nodes
StorageWorks: DEC -> Compaq -> HPQ
  A traditional DEC Shop
Initial SAN equipment vendor
  Brocade Switches resold under StorageWorks label
Only vendor with complete UNIX coverage (2000)
  Sun, HP, SGI, Tru64 UNIX, Linux
  EMC, Hitachi, etc… could not match UNIX coverage
Enterprise Modular Array (MA) – “Stone Soup” SAN
  Buy the controller, then 2 to 6 disk shelves, then disks
  2-3 disk shelf configs led to problem RAIDsets, which were finally reconfigured in 2Q2003
Enterprise Virtual Array (EVA) – Next Generation
[Images: MA 8000, EVA]
2Q03 LUN Restructuring – 2nd Gen SAN
“Far” LUNs pulled back to “near” Data Center
6 disk, 6 shelf MA RAID 5 RAIDsets
LUNs are partitioned from RAIDsets
LUNs are sized as multiples of disk size
Multiple LUNs from different RAIDsets
Busy filesystems are distributed among LUNs
Server and Storage Node SAN Fabric Connections mated to common switch
Results – Generalizations
Read Performance - Server Performance Baseline
  Basic Measure of System Bus, Memory/Cache, & HBA
  Good evaluation of dissimilar server I/O potential
Random Write - Largest Variations in Performance
  Filesystem & Storage Node Selection are the Dominant Variables
Memory & Cache – Important
  Processor Cache, System I/O Buffers, Virtual Memory
  All boost different data stream size performance
More Hardware, OS, & Fsys selections
  To be evaluated
IOZONE Benchmark Utility
File Operations
  Sequential Write & Re-write
  Sequential Read & Re-read
  Random Read & Random Write
  Others are available: record rewrite, read backwards, read strided, fread/fwrite, pread/pwrite, aio_read/aio_write
File & Record Sizes
  Ranges or individual sizes may be specified
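As a rough illustration of how such a sweep might be requested, the sketch below builds an iozone command line without running it. It is not the command used for these measurements: the flags are the ones documented for iozone 3.x (`-a`, `-i`, `-n`, `-g`, `-y`, `-q`, `-f`) and should be checked against the installed version, the test file path is hypothetical, and the size ranges are taken from the chart axes in this deck.

```python
import shlex


def iozone_command(test_file, min_file_kb, max_file_kb, min_rec_kb, max_rec_kb):
    """Build an iozone auto-mode argument list; nothing is executed here."""
    return [
        "iozone", "-a",                 # automatic mode: sweep file and record sizes
        "-i", "0",                      # test 0: sequential write & re-write
        "-i", "1",                      # test 1: sequential read & re-read
        "-i", "2",                      # test 2: random read & random write
        "-n", str(min_file_kb),         # minimum file size (KB) for auto mode
        "-g", str(max_file_kb),         # maximum file size (KB) for auto mode
        "-y", str(min_rec_kb),          # minimum record size (KB)
        "-q", str(max_rec_kb),          # maximum record size (KB)
        "-f", test_file,                # temporary file on the filesystem under test
    ]


# Hypothetical test file; sizes match the chart axes (1 MB to 16 GB, 4 KB to 1 MB).
cmd = iozone_command("/exp/array.2/oradb/demo/iozone.tmp", 1024, 16777216, 4, 1024)
print(shlex.join(cmd))
```

Keeping the command as a list makes it straightforward to hand to `subprocess.run` later, once the flags have been verified on the target host.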
IOZONE – Output: UFS Seq Read
Reader Report (KB/sec; rows are file size in KB, columns are record size in KB)
 File KB       4       8      16      32      64     128     256     512    1024
    1024  406496  520835  540941  607379  620229  637632  627066  641585  650969
    2048  406348  520844  578532  606278  616880  631900  614101  626302  645650
    4096  394035  499083  512131  611343  579181  624295  629465  635531  633573
    8192  359598  518713  551389  592636  594050  618494  629180  632393  632004
   16384  367206  501806  535775  571031  565921  611207  587746  586439  545514
   32768  360016  497118  535215  518998  584620  597759  577501  567647  581044
   65536  348275  483674  502195  529404  544065  548551  569081  545047  574136
  131072  354139  473798  509211  530035  555874  550961  555821  549711  564418
  262144  345184  464995  512975  539870  540698  551561  550247  554913  551340
  524288  338531  463357  502684  521700  535715  547722  539675  550952  547278
 1048576  330355  446950  488376  503234  522847  532205  535885  539378  531323
 2097152  162903  188737  200783  209500  213811  216011  217075  217480  217499
 4194304  171974  199961  212548  221275  226669  229542  230549  230721  230533
 8388608  173820  201640  214809  223530  228429  230563  231865  232803  232673
16777216   94754  138745  145705  147569  151129  151455  150974  151356  152321
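One way to read the Reader Report is to look for the file size at which throughput falls off most sharply. The sketch below does that for the 1024 KB record-size column, using the figures from the report; the function name is hypothetical and the approach is just one simple way to locate the step.

```python
# 1024 KB record-size column of the Reader Report: (file size KB, KB/sec).
SEQ_READ_1024K = [
    (1024, 650969), (2048, 645650), (4096, 633573), (8192, 632004),
    (16384, 545514), (32768, 581044), (65536, 574136), (131072, 564418),
    (262144, 551340), (524288, 547278), (1048576, 531323), (2097152, 217499),
    (4194304, 230533), (8388608, 232673), (16777216, 152321),
]


def steepest_drop(series):
    """Return (size_before, size_after, ratio) for the largest relative fall."""
    pairs = zip(series, series[1:])
    # Smallest after/before throughput ratio marks the steepest drop.
    (a_size, a_rate), (b_size, b_rate) = min(
        pairs, key=lambda p: p[1][1] / p[0][1]
    )
    return a_size, b_size, b_rate / a_rate


before, after, ratio = steepest_drop(SEQ_READ_1024K)
print(f"Largest drop: {before} KB -> {after} KB, throughput falls to {ratio:.0%}")
```

For this column the step lies between the 1 GB and 2 GB file sizes, with a second, smaller fall at 16 GB.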
IOZONE – UFS Sequential Read
[Chart: Sequential Read, UFS, EVA RAID 5, SunFire 480R, 16384 MB, QLA2310F. Axes: File Size (KB), 1024 to 16777216; Record Size (KB), 4 to 1024; throughput in KB/sec, 0 to 800000.]
IOZONE – UFS Random Read
[Chart: Random Read, UFS, EVA RAID 5, SunFire 480R, 16384 MB, QLA2310F. Axes: File Size (KB), 1024 to 16777216; Record Size (KB), 4 to 1024; throughput in KB/sec, 0 to 800000.]
IOZONE – UFS Sequential Write
[Chart: Sequential Write, UFS, EVA RAID 5, SunFire 480R, 16384 MB, QLA2310F. Axes: File Size (KB), 1024 to 16777216; Record Size (KB), 4 to 1024; throughput in KB/sec, 0 to 800000.]
IOZONE – UFS Random Write
[Chart: Random Write, UFS, EVA RAID 5, SunFire 480R, 16384 MB, QLA2310F. Axes: File Size (KB), 1024 to 16777216; Record Size (KB), 4 to 1024; throughput in KB/sec, 0 to 800000.]
Results – Server Memory
Cache
  Influences small data stream performance
Memory - I/O buffers and virtual memory
  Influences larger data stream performance
Large Data Streams need Large Memory
  Past this limit => Synchronous performance
Results – Server I/O Potential
System Bus
  Sun: UPA replaced by Fireplane (SunFire)
Peripheral Bus: PCI vs. SBus
  SBus (Older Sun only)
    Peak Bandwidth (25 MHz/64-bit) ~200 MB/sec
    Actual Throughput ~50-60 MB/sec (~25+%)
  PCI (Peripheral Component Interconnect)
    Peak Bandwidth (66 MHz/64-bit) ~530 MB/sec
    Actual Throughput ~440 MB/sec (~80+%)
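The peak figures follow from clock rate times bus width, and the percentages from dividing measured throughput by that peak. A minimal sketch of the arithmetic, assuming one transfer per clock cycle and using the slide's nominal 66 MHz for PCI (function names are illustrative):

```python
def peak_mb_per_sec(clock_mhz, width_bits):
    """Theoretical peak, assuming one width_bits transfer per clock cycle."""
    return clock_mhz * width_bits / 8


def efficiency(actual_mb_per_sec, clock_mhz, width_bits):
    """Fraction of the theoretical peak actually delivered."""
    return actual_mb_per_sec / peak_mb_per_sec(clock_mhz, width_bits)


sbus_peak = peak_mb_per_sec(25, 64)
pci_peak = peak_mb_per_sec(66, 64)

print(f"SBus peak {sbus_peak:.0f} MB/sec, "
      f"efficiency {efficiency(50, 25, 64):.0%} to {efficiency(60, 25, 64):.0%}")
print(f"PCI peak {pci_peak:.0f} MB/sec, efficiency {efficiency(440, 66, 64):.0%}")
```

With the nominal 66 MHz clock the PCI peak comes out at 528 MB/sec, which the slide rounds to ~530 MB/sec.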
Server – Sun, UPA, SBus
[Chart: Sequential Read, UFS+logging, EMA RAID 5, Sun Ultra 2, 2048 MB, SBus QLA2200FS. Axes: File Size (KB), 1024 to 4194304; Record Size (KB), 4 to 1024; throughput in KB/sec, 0 to 800000.]
Server – Sun Enterprise, Gigaplane/UPA, SBus
[Chart: Sequential Read, UFS+logging, EMA RAID 5, Sun E3500, 6400 MB, SBus QLA2200FS. Axes: File Size (KB), 1024 to 4194304; Record Size (KB), 4 to 1024; throughput in KB/sec, 0 to 800000.]
Server – Sun, UPA, PCI
[Chart: Sequential Read, UFS+logging, EMA RAID 5. Axes: File Size (KB), 1024 to 8388608; Record Size (KB), 4 to 1024; throughput in KB/sec, 0 to 800000.]
Server – HP, Astro Chipset, PCI
[Chart: Sequential Read, HP VxFS, EMA RAID 5, HP L1000, 1024 MB, QLA2200F. Axes: File Size (KB), 1024 to 8388608; Record Size (KB), 4 to 1024; throughput in KB/sec, 0 to 800000.]
Server – Sun, Fireplane, PCI
[Chart: Sequential Read, UFS, EVA RAID 5. Axes: File Size (KB), 1024 to 16777216; Record Size (KB), 4 to 1024; throughput in KB/sec, 0 to 800000.]
Results – MA vs. EVA
MA RAID 1+0 & RAID 5 vs. EVA RAID 5
Sequential Write
  EVA RAID 5 is 30-40% faster than MA RAID 1+0
  EVA RAID 5 is up to 2x faster than MA RAID 5
Random Write
  EVA RAID 5 is 10-20% slower than MA RAID 1+0
  EVA RAID 5 is up to 4x faster than MA RAID 5
Servers were SunFire 480Rs, using UFS+logging.
EVA: 12 x 72 GB FCAL Disk RAID 5 partitioned LUN
MA: 6 x 36 GB SCSI Disk RAIDset
[Diagram labels: RAID 0, RAID 1, RAID 3, RAID 5, RAID 1+0, RAID 0+1]
Results – MA RAIDsets
Best: 3 mirror, 6 shelf RAID 1+0
  3 mirror RAID 1+0 on 2 shelves yields only 80% of the 6 shelf version
  2 disk mirror (2 shelves) yields 50%
Results – MA RAIDsets
Best: 3 mirror, 6 shelf RAID 1+0
6 disk, 6 shelf RAID 5:
  Sequential Write: 75-80%
  Random Write: 25-50% (2 to 4 times slower)
3 disk, 3 shelf RAID 5:
  Sequential Write: 40-60%
  Random Write: 25-60%
  Can outperform 6 disk RAID 5 on random write
Results – LUNs from Partitions
3 Simultaneous Writers
  Partitions of same RAIDset
Write performance (Sequential or Random)
  Less than 50% of no-contention performance
No control test performed:
  3 servers write to 3 different RAIDsets of same Storage Node
Where is the Bottleneck?
  RAIDset, SCSI channels, or Controllers?
Results – Fabric Locality
In production, “far” LUNs underperform
  In “sar” disk monitoring data, “far” LUN filesystems are 4 to 10 times slower.
  Fabric-based service disruptions are drawn into the server when any LUNs are not local.
This round of testing did not show wide variations in performance whether the server was connected to its Storage Node’s SAN Switch or was 3 to 4 hops away.
Results – UFS Options
Logging
  The journaling UFS Filesystem
  Advised on large filesystems to avoid long running “fsck”.
  Under Solaris 8, logging introduces a 10% write performance penalty.
  Solaris 9 advertises that its logging algorithm is much more efficient.
Forcedirectio
  No useful testing without an Oracle workload
Results – UFS Tuning
bufhwm:
  Default 2% of memory, Max 20% of memory
  Extends I/O Buffer effect
  Improves write performance on moderately large files
ufs:ufs_LW & ufs:ufs_HW
  Solaris 7 & 8: 256K & 384K bytes
  Solaris 9: 8M & 16M bytes
  More data is held in system buffer before being flushed.
  fsflush() effect on “sar” data: large service times
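To put the 2% default and 20% maximum for bufhwm into concrete numbers, the sketch below computes both for the memory sizes that appear in the chart captions. It assumes the value is expressed in KB, which is how the Solaris tunable is documented; the function name is illustrative.

```python
def bufhwm_kb(memory_mb, percent):
    """Buffer high-water mark in KB for a given percentage of physical memory."""
    return memory_mb * 1024 * percent // 100


for memory_mb in (2048, 8192, 16384):  # memory sizes from the chart captions
    default_kb = bufhwm_kb(memory_mb, 2)    # default: 2% of memory
    maximum_kb = bufhwm_kb(memory_mb, 20)   # maximum: 20% of memory
    print(f"{memory_mb} MB: default {default_kb} KB, max {maximum_kb} KB")
```

On an 8192 MB server, for example, this is the difference between roughly 164 MB and 1.6 GB of buffer space.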
Results – VERITAS VxFS
Outstanding Write Performance
  VxFS only on MA 6-disk RAID 5
UFS on MA 6-disk RAID 5
  Sequential Write: VxFS is 15 times faster
  Random Write: VxFS is 40 times faster
UFS on MA 6-disk RAID 1+0
  Sequential Write: VxFS is 10 times faster
  Random Write: VxFS is 10 times faster
UFS on EVA 12-disk RAID 5
  Sequential Write: VxFS is 7 times faster
  Random Write: VxFS is 12 times faster
Results – Random Write
Hardware-only Storage Node Performance
  MA 1+0 = EVA RAID 5
  EVA RAID 5 pro-rata cost similar to MA RAID 5
  RAID 1+0 is Not Cost Effective
Improved Filesystem is Your Choice
  Order-of-Magnitude Better Performance
  Less expensive
Server Memory
  Memory Still Is Important for Large Data Streams
Random Write: UFS, MA, RAID 5
[Chart: UFS Random Write, EMA RAID 5, SunFire 480R, 8192 MB, QLA2310F. Axes: File Size (KB), 1024 to 524288; Record Size (KB), 4 to 4096; throughput in KB/sec, 0 to 800000.]
Random Write: UFS, MA, RAID 1+0
[Chart: UFS Random Write, EMA RAID 0+1, SunFire 480R, 8192 MB, QLA2310F. Axes: File Size (KB), 1024 to 524288; Record Size (KB), 4 to 4096; throughput in KB/sec, 0 to 800000.]
Random Write: UFS, EVA, RAID 5
[Chart: UFS Random Write, EVA RAID 5, SunFire 480R, 16384 MB, QLA2310F. Axes: File Size (KB), 1024 to 524288; Record Size (KB), 4 to 4096; throughput in KB/sec, 0 to 800000.]
Random Write: VxFS, MA, RAID 5
[Chart: VxFS Random Write, EMA RAID 5, SunFire 480R, 8192 MB, QLA2310F. Axes: File Size (KB), 1024 to 524288; Record Size (KB), 4 to 4096; throughput in KB/sec, 0 to 800000.]
Closer Look: VxFS vs. UFS
Graphical Comparison:
  Sun Servers provided with RAID 5 LUNs
  UFS EMA, UFS EVA, VxFS EMA, VxFS EVA
File Operations
  Sequential Read
  Random Read
  Sequential Write
  Random Write
Sequential Read
[Chart: Sequential Read, VxFS, EVA RAID 5, SunFire 480R, 8192 MB, QLA2310F (file sizes 1024 to 16777216 KB, record sizes 4 to 1024 KB)]
[Chart: VxFS Sequential Read, EMA RAID 5, SunFire 480R, 8192 MB, QLA2310F (file sizes 1024 to 524288 KB, record sizes 4 to 4096 KB)]
[Chart: Sequential Read, UFS, EVA RAID 5, SunFire 480R, 16384 MB, QLA2310F (file sizes 1024 to 16777216 KB, record sizes 4 to 1024 KB)]
[Chart: Sequential Read, UFS+logging, EMA RAID 5, Sun 220R, 2048 MB, QLA2310F (file sizes 1024 to 8388608 KB, record sizes 4 to 1024 KB)]
Axes on each chart: File Size (KB), Record Size (KB), throughput in KB/sec (0 to 800000).
Random Read
[Chart: Random Read, VxFS, EVA RAID 5, SunFire 480R, 8192 MB, QLA2310F (file sizes 1024 to 16777216 KB, record sizes 4 to 1024 KB)]
[Chart: VxFS Random Read, EMA RAID 5, SunFire 480R, 8192 MB, QLA2310F (file sizes 1024 to 524288 KB, record sizes 4 to 4096 KB)]
[Chart: Random Read, UFS, EVA RAID 5, SunFire 480R, 16384 MB, QLA2310F (file sizes 1024 to 16777216 KB, record sizes 4 to 1024 KB)]
[Chart: Random Read, UFS+logging, EMA RAID 5, Sun 220R, 2048 MB, QLA2310F (file sizes 1024 to 8388608 KB, record sizes 4 to 1024 KB)]
Axes on each chart: File Size (KB), Record Size (KB), throughput in KB/sec (0 to 800000).
Sequential Write
[Chart: Sequential Write, VxFS, EVA RAID 5, SunFire 480R, 8192 MB, QLA2310F (file sizes 1024 to 16777216 KB, record sizes 4 to 1024 KB)]
[Chart: VxFS Sequential Write, EMA RAID 5, SunFire 480R, 8192 MB, QLA2310F (file sizes 1024 to 524288 KB, record sizes 4 to 4096 KB)]
[Chart: Sequential Write, UFS, EVA RAID 5, SunFire 480R, 16384 MB, QLA2310F (file sizes 1024 to 16777216 KB, record sizes 4 to 1024 KB)]
[Chart: Sequential Write, UFS+logging, EMA RAID 5, Sun 220R, 2048 MB, QLA2310F (file sizes 1024 to 8388608 KB, record sizes 4 to 1024 KB)]
Axes on each chart: File Size (KB), Record Size (KB), throughput in KB/sec (0 to 800000).
Random Write
[Chart: Random Write, VxFS, EVA RAID 5, SunFire 480R, 8192 MB, QLA2310F (file sizes 1024 to 16777216 KB, record sizes 4 to 1024 KB)]
[Chart: VxFS Random Write, EMA RAID 5, SunFire 480R, 8192 MB, QLA2310F (file sizes 1024 to 524288 KB, record sizes 4 to 4096 KB)]
[Chart: Random Write, UFS, EVA RAID 5, SunFire 480R, 16384 MB, QLA2310F (file sizes 1024 to 16777216 KB, record sizes 4 to 1024 KB)]
[Chart: Random Write, UFS+logging, EMA RAID 5, Sun 220R, 2048 MB, QLA2310F (file sizes 1024 to 8388608 KB, record sizes 4 to 1024 KB)]
Axes on each chart: File Size (KB), Record Size (KB), throughput in KB/sec (0 to 800000).
Results – VERITAS VxFS
Biggest Performance gains
  Everything else is of secondary importance
Memory Overhead for VxFS
  Dominates Sequential Write of small files
  Needs further investigation
VxFS & EVA RAID 1+0 not measured
  Don’t mention what you don’t want to sell
Implications – VERITAS VxFS
Where is the Bottleneck?
Changes at Storage Node
  Modest Increases in Performance
Changes within Server
  Dramatically Increase Performance
The Bottleneck is in the Server, not the SAN
  The relative cost is just good fortune
  Changing the filesystem is much less expensive
Results – Bottom Line
Bottleneck Identified
  It’s the Server, not Storage
VERITAS VxFS
  Use it on UNIX Servers
RAID 1+0 is Not Cost Effective
  VxFS is much cheaper – Tier 1 servers
Server Memory
  Memory is cheaper than Mirrored Disk
Operating System I/O Buffers
  Configure as large as possible
Price & Performance
Cost Of Computing
  Hardware
  Software
  One time costs
  Ongoing costs
How Much Does VxFS Cost?
How Much Do RAID 1+0 / 5 Cost?
Abbott-like Price/Performance
10 Servers need 216 GB each (3 x 72 GB)
  5 x 280R, 3 x 480R, 2 x V880
  2160 GB Required
MA w/84 36 GB disks costs about $100K
  RAID 1+0: 1400 GB usable, 2 MAs needed
  RAID 5: 2340 GB usable, 1 MA needed
EVA w/168 72 GB disks costs about $500K
  RAID 5: 9360 GB usable, 1 EVA needed
Abbott-like Price/Performance
Best Hardware-only MA cost
  RAID 1+0, 2 MAs required (5/3 rounded up to 2)
  $200K ($170K), or $20K ($17K) per server
Best Hardware-only EVA cost
  RAID 5, 1 EVA required
  $500K ($120K), or $50K ($12K) per server
Best Software/Hardware price/performance
  MA RAID 5, 1 needed at $100K ($10K per server)
  VERITAS Foundation Suite, total cost $29K + $8K
  Average Server cost $14K
  Write Performance 7-12x better than EVA hardware-only
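The whole-unit costs on this slide can be reproduced with a few lines of arithmetic. The sketch below uses only the capacities and prices quoted in the deck; it deliberately leaves out the pro-rata figures shown in parentheses, and the function and variable names are illustrative.

```python
import math

SERVERS = 10
GB_PER_SERVER = 216  # 3 x 72 GB per server
REQUIRED_GB = SERVERS * GB_PER_SERVER  # 2160 GB


def hardware_cost(usable_gb_per_node, node_cost):
    """Return (nodes needed, total cost, cost per server) for whole storage nodes."""
    nodes = math.ceil(REQUIRED_GB / usable_gb_per_node)
    total = nodes * node_cost
    return nodes, total, total / SERVERS


ma_raid10 = hardware_cost(1400, 100_000)   # MA, RAID 1+0
eva_raid5 = hardware_cost(9360, 500_000)   # EVA, RAID 5
ma_raid5 = hardware_cost(2340, 100_000)    # MA, RAID 5

# VERITAS Foundation Suite total from the slide: $29K + $8K.
veritas_total = 29_000 + 8_000
ma_raid5_vxfs_per_server = (ma_raid5[1] + veritas_total) / SERVERS

print("MA RAID 1+0:", ma_raid10)
print("EVA RAID 5:", eva_raid5)
print("MA RAID 5 + VxFS per server:", ma_raid5_vxfs_per_server)
```

The MA RAID 5 plus VERITAS option works out to $13,700 per server, which the slide rounds to $14K.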
Abbott-like Best Possible Performance
EVA RAID 1+0 & VxVM/VxFS
  EVA w/168 36 GB 15K RPM disks ($450K?)
  RAID 1+0 yields 2400 GB ($41K for 216 GB)
  VERITAS Foundation Suite (480R: $3500 + $977)
  $32K for a 480R
Not Justified – Use RAID 5 & 72+ GB drives
  This EVA holds only 12 requests for 216 GB
Data Center, Administrative, Maintenance, Infrastructure, and Backup costs are not included
Abbott-like Best Available Config
MA RAID 5 with VxVM/VxFS
  24 existing MAs, but only 2 EVAs
Read performance
  No significant variance: storage node & filesystem
Write performance
  Choose the better filesystem
  More effective and less costly than attempting a hardware “fix” thru RAID 1+0 or a new EVA
VxFS is not required on all servers
  Cost constrained projects live with reduced performance
Performance is still excellent
  216 GB request costs $14,500 for a 480R
Simplest Configuration
All RAIDsets: Same RAIDset Configuration
  Performance, Predictability, and Stability
  Architectural, Administrative, and Operational
RAID 5 (soon to be no RAID 1+0?)
Default Chunksize
Disk count
  MA: 6 disk, 6 shelves
  EVA: 12 disks - too many?
Allocation Units are common (N x disk size)
  “I need 15 GB and I won’t pay more”: peals of laughter?
  Later re-allocation is eased
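Allocating in units of N x disk size means every request is rounded up to a whole number of disks. A minimal sketch of that rounding, assuming the 36 GB and 72 GB disk sizes mentioned elsewhere in the deck (the function name is hypothetical):

```python
import math


def allocation_gb(request_gb, disk_gb):
    """Round a storage request up to a whole number of disk-sized units."""
    units = max(1, math.ceil(request_gb / disk_gb))
    return units * disk_gb


# The 15 GB request from the slide, against 36 GB disks, becomes one 36 GB unit.
print(allocation_gb(15, 36))
# A 216 GB request against 72 GB disks is exactly 3 units.
print(allocation_gb(216, 72))
```

The gap between what is asked for and what is allocated is the reason a 15 GB request cannot be priced as 15 GB.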
Opinion vs. Fact
RAID 1+0 is a requirement
  Hardware-only “fixes” are inadequate, expensive, and perform relatively poorly
  Not all applications need premium performance
    Or will willingly pay for it
RAID 5 is just as good as RAID 1+0
  Only if supplemented with an improved filesystem
  Read is equal and Seq Write is 80%
  Not on Random Write: 25% to 50% of 1+0
  Does the application justify a filesystem upgrade?
    Will the client pay for it?
Yet To Be Tested: Wish List
Oracle Workload
Other Solaris Servers
  Larger Sun Servers: E4800, E10K, E15K
    Multiple/Max I/O Channels – Is Scaling Linear?
  New Sun Entry Level J-Bus Servers: V240, V440
  Fujitsu Servers: Much Faster System Bus
    New Sun/Fujitsu Alliance
Other UNIX Servers: IBM, Alpha, Intel Linux, etc…
Other HBAs (Emulex, JNI?)
EVA RAID 1+0
Raw Filesystems
iSCSI
Roadmap
RAID 5 configs on all SAN Storage Nodes
Client may supplement with VxFS
UFS remains on system drives
  No mirrors for system drives
  Contingency root filesystem on 2nd internal disk
Use 32K Oracle db_block_size (8K default)
Metrics Data
/da/adm/rcsupport/sys/admin/metrics
  bonnie
    Y2000 & Y2001 data
  iozone
    bin
    output
      Date Stamped Directories
    scripts
References
Configuration and Capacity Planning for Solaris Servers, Brian L. Wong, Prentice Hall, 1997
Solaris System Performance Management, SA-400, Sun Educational Services
The Sun Fireplane System Interconnect, Alan Charlesworth, http://www.sc2001.org/papers/pap.pap150.pdf
References
Iozone Source & Documentation, Author: William Norcott (wnorcott@us.oracle.com), http://www.iozone.org/
Questions