Turning The Knobs: Practical AIX Tuning
Susan Schreitmueller
pSeries, Advanced Technical Support
Disclaimer
• The suggestions contained in this presentation are general suggestions formulated by the author, not recommendations from IBM.
• These recommendations should be carefully examined for your environment and tested rigorously before implementing in production.
• All environments differ, and requirements vary with application and system nuances. Always use YOUR best judgment.
Unless you:
• Put fewer cars on the road
• Widen the road
• Reroute the cars
You just move the bottleneck to a different location!
Unless resources are actually ADDED to the system, much of tuning is moving a bottleneck from one place to another and balancing the trade-off of which application or request gets the available resource. It is important, however, to identify the cause rather than just the bottleneck in order to effectively manage a constraint.
We must concentrate on the entire picture. From point A to point C there may be many, many factors, and understanding the throughput from A to C is crucial. Careful striping, disk layout, and file access tuning are defeated when the real cause of the bottleneck is the slow transfer speed of a low-cost, low-throughput adapter.
[Diagram: throughput path from point A to point C, with a slow adapter at point B. Even with disk technology, disk layout, logical track grouping (LTG), LVM mapping, RAID levels, disk access, VMM tuned for optimal memory allocation, and async I/O and application access all tuned for high throughput, the bottleneck continues to be the slow adapter. Examine throughput from Point A to Point C.]
Performance - pieces of the puzzle
[Diagram: system tuning as interlocking pieces - Network (no, nfso), Virtual Memory Management (vmo, PSALLOC), CPU and # of procs (schedo), Workload Manager, Async I/O, and LVM tuning and filesystem layout (ioo).]
Critical Resources: Four Bottleneck Causes
• CPU - number of processes, process priorities, WLM managed. Tools: tprof, pprof, sar -P -u -q, ps aux, ps, topas
• Memory - real memory, paging, memory leaks, WLM managed. Tools: svmon, vmtune/vmo, ipcs, PSALLOC
• Disk I/O - disk balancing, types of disks, LVM policies, WLM managed. Tools: lvmstat, iostat, LVM map, wlmmon, wlmstat, ioo
• Network - NFS used to load applications, network type, network traffic. Tools: netpmon, no, lsattr, nfso, entstat, netstat
[Flowchart: start with the CPU. If CPU% is high, suspect a CPU constraint (tools: topas, vmstat, sar -q | -u | -P, tprof, pprof, wlmstat). If not, check memory: high paging indicates a possible memory constraint (tools: vmstat, ps gvc, vmtune, svmon, topas, wlmstat, wlmmon). If not, check disk: if the disks are not balanced, balance them; if they are balanced and I/O is still slow, suspect a disk/SCSI constraint (tools: sar -d, topas, iostat, filemon, lvmstat; examine disk layout, transfer rates, and sequential/parallel access). For networking: netstat -v -s -m, netpmon, netperf, lsattr -El ent1.]
Networking
Monitor: iptrace, ipfilter, ipreport; netpmon; netstat, nfsstat (entstat)
Tune: no, ISNO; nfso; adapter attributes (chdev on the ent# devices)
Memory
Monitor: svmon, vmstat, sar; wlmstat, wlmmon; tprof
Tune: vmo; paging controls
CPU/Kernel General
Monitor: topas, vmstat, sar; wlmstat, wlmmon; xmperf/PTX; tprof, pprof; nmon (download); CURT, SPLAT
Tune: schedo, system parms
Applications
Monitor: profiling with tprof, pprof, Xprof; fdpr; CURT, SPLAT
Tune: database calls; file calls; good programming
Disk I/O
Monitor: filemon, fileplace; LVM mapping, lvmstat, iostat; wlmstat, wlmmon
Tune: ioo; AIO max/min servers; adapter spread, file layout
Network Options
Network options are set with the no command.
• Prior to AIX 5.2, these options should be placed where they will be re-executed at boot, e.g., in an /etc/rc.tune file or /etc/rc.local.
• AIX 5.2 supports permanent and reboot value retention in /etc/tunables (/etc/tunables/nextboot and /etc/tunables/lastboot).
• Interface Specific Network Options (ISNO) allow some options to be configured differently for the following network interfaces: 10/100/1000 BaseT, 10/100 BaseT, ATM, Gigabit Ethernet.
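A minimal sketch of setting one option both ways (tcp_sendspace and its value are used purely as an illustration):

   # pre-AIX 5.2: takes effect now, lost at reboot unless re-run from /etc/rc.tune or /etc/rc.local
   no -o tcp_sendspace=262144
   # AIX 5.2: -p applies the value now and records it in /etc/tunables/nextboot
   no -p -o tcp_sendspace=262144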
Network Tuning - Choose the 'knobs' to turn
[Sample no -a output: the full list of network tunables and their current values, including thewall, sb_max, somaxconn, sockthresh, tcp_sendspace, tcp_recvspace, udp_sendspace, udp_recvspace, rfc1323, tcp_mssdflt, tcp_pmtu_discover, udp_pmtu_discover, ipforwarding, tcp_ephemeral_low/high, udp_ephemeral_low/high, use_isno, nbc_* network buffer cache settings, and many more.]
NFS should be tuned also - sample nfso -a output:
portcheck = 0
udpchecksum = 1
nfs_socketsize = 600000
nfs_tcp_socketsize = 600000
nfs_setattr_error = 0
nfs_gather_threshold = 4096
nfs_repeat_messages = 0
nfs_udp_duplicate_cache_size = 5000
nfs_tcp_duplicate_cache_size = 5000
nfs_server_base_priority = 0
nfs_dynamic_retrans = 1
nfs_iopace_pages = 0
nfs_max_connections = 0
nfs_max_threads = 3891
nfs_use_reserved_ports = 0
nfs_device_specific_bufs = 1
nfs_server_clread = 1
nfs_rfc1323 = 1
nfs_max_write_size = 32768
nfs_max_read_size = 32768
nfs_allow_all_signals = 0
Tuning the I/O Subsystem
• Disk layout: check with LVM mapping
• Disk technology: check LTG selection and lsattr -El hdisk#
• Check file access (sequential or parallel) and concurrent access patterns
• Review async I/O needs and tuning for the application
• Use filemon, iostat, and sar -d to monitor (see the sketch below)
Layers to consider: physical layer, LVM layout, application access, async I/O | parallel or sequential.
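A minimal monitoring sketch for the tools named above (output file name and intervals are arbitrary):

   filemon -o /tmp/fmon.out -O lv,pv   # start trace-based collection for logical and physical volumes
   sleep 60                            # let the workload run
   trcstop                             # stop the trace; filemon writes its report to /tmp/fmon.out
   iostat 5 3                          # three 5-second interval samples
   sar -d 5 3                          # per-disk activity over the same window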
Virtual Memory Management
A (very) short tutorial!
[Diagram: real memory is divided into 4 KB frames; virtual address space is divided into 256 MB segments. Working segments are backed by paging space; persistent segments are backed by the JFS file system; client segments (client pages) are backed by NFS and JFS2.]
[Diagram: the Page Frame Table (PFT). Each entry records a frame's real address, segment type (Working, Persistent, or Client), and modified/referenced bits. When the free list runs low, page stealing flushes modified pages to their backing store (working pages to paging space, persistent pages to the file system, client pages to NFS/JFS2) and returns the frames to the free list, yielding the new PFT.]
vmo - A beginning look
• Let's look at two examples of controlling memory usage:
  - maxfree/minfree - control the levels at which the page replacement algorithm begins and stops stealing pages
  - maxperm/minperm - control which types of pages (file pages or computational pages) are stolen first
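Before turning either knob, it helps to capture the current settings (AIX 5.2 syntax):

   vmo -a | egrep 'minfree|maxfree'     # page replacement thresholds
   vmo -a | egrep 'minperm|maxperm'     # file vs. computational page controls
   vmstat -v | grep numperm             # current percentage of memory holding file pages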
minfree - when the number of frames on the free list falls to this value, the Page Replacement Algorithm wakes up and begins stealing pages.
maxfree - VMM page stealing continues until the maxfree value is reached.
Defaults: minfree = 120, maxfree = 120 + 8 = 128.
On a large memory system or SMP, the defaults of 120 and 128 are a very small fraction of the real memory available. If memory demand continues after the minfree value is reached, processes may be suspended or killed. Once the number of free pages is equal to or greater than maxfree, the algorithm stops freeing pages, and with thresholds this small there will be insufficient free pages, relative to total system memory, to satisfy demand.
Controlling Memory Selections
[Diagram: memory on a 0-100% scale split between computational pages and file pages, with the minperm and maxperm thresholds (around 50% in the illustration) and numperm marking the current percentage of real memory occupied by file pages relative to maxperm.]
vmo & ioo vs. vmtune at AIX V5.2
• With the AIX V5.2 release, vmtune was replaced by the tuning commands vmo and ioo, and schedtune was replaced by schedo. All of these commands (along with no and nfso) support retaining tuning parameters in the /etc/tunables files.
• Although vmtune and schedtune can still be run, the appropriate vmo, ioo, or schedo command should be used. vmtune and schedtune remain available for backward compatibility but have limited functionality.
vmtune - maxfree & minfree
• Set minfree = 120 * # of CPUs * # of memory pools (some recommend a starting value of (120 + 4) * # of memory pools)
• Set maxfree = minfree + maxpgahead * # of CPUs
The number of memory pools can be determined with:
• vmtune -a (pre-5.2)
• vmstat -v (AIX 5.2)
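A worked sketch of the formulas above under assumed values (4 CPUs, 2 memory pools, maxpgahead = 8 - all illustrative):

   # minfree = 120 * 4 * 2 = 960; maxfree = 960 + (8 * 4) = 992
   vmo -o minfree=960 -o maxfree=992            # AIX 5.2
   /usr/samples/kernel/vmtune -f 960 -F 992     # pre-5.2 equivalent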
Using nmon & nmon_analyzer
nmon and nmon_analyzer are free tools from IBM that are useful for displaying and analyzing AIX performance.
The nmon tool is similar to topas, which displays real-time AIX performance statistics, but nmon presents more information and can also capture data for later analysis and presentation.
The nmon_analyzer tool analyzes the captured performance data and can create a spreadsheet showing graphs of performance trends.
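A typical capture run (interval and snapshot count are arbitrary here):

   nmon -f -s 60 -c 120    # -f: record to a .nmon file; 60-second intervals, 120 snapshots (2 hours)
   # load the resulting hostname_date_time.nmon file into the nmon_analyzer spreadsheet to graph it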
NMON Analyzer performs analyses of the nmon data to produce the following:
• Calculation of weighted averages for hot-spot analysis
• Distribution of CPU utilization by processor over the collection interval - useful in identifying single-threaded processes
• Additional sections for ESS vpaths showing device busy, read transfer size, and write transfer size by time of day
• Total system data rate by time of day, adjusted to exclude double-counting of EMC hdiskpower devices - useful in identifying I/O subsystem and SAN bottlenecks
• Separate sheets for EMC hdiskpower and FAStT dac devices
• Analysis of memory utilization to show the split between computational and non-computational pages
• Total data rates for each network adapter by time of day
• Summary data for the TOP section showing average CPU and memory utilization for each command
Examining numperm/minperm/maxperm with nmon
Additional information in nmon_analyzer
Initial Tuning - vmtune or vmo
If the load on the system is relatively unknown, the values below can be considered a starting point:
• minperm = 15
• maxperm = 60
• numfsbufs = 186
• minfree = (120 + 4) * # of memory pools (the number of memory pools is found with the vmstat -v command)
• maxfree = minfree + (maxpgahead (or j2maxpgahead) * # of memory pools)
• hd_pbuf_cnt = (# of disks attached to the server (physical or LUNs) + 4) * 120
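A sketch of applying these starting values with AIX 5.2 commands, assuming 2 memory pools and maxpgahead = 8 (both illustrative):

   vmo -o minperm%=15 -o maxperm%=60
   ioo -o numfsbufs=186                  # takes effect only for filesystems mounted afterwards
   vmo -o minfree=248 -o maxfree=264     # (120+4)*2 = 248; 248 + 8*2 = 264
   # hd_pbuf_cnt: vmtune -B on pre-5.2 systems; on 5.2, check ioo -a for the pbuf tunable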
I/O Tuning Parameters
• numfsbufs (vmtune -b) specifies the number of file system buffer structures. This value is critical, as VMM will put a process on the wait list if there are insufficient free buffer structures.
• Run vmtune -a (pre-5.2) or vmstat -v (5.2 and later) and monitor fsbufwaitcnt, which is incremented each time an I/O operation has to wait for file system buffer structures.
• A general technique is to double the numfsbufs value (up to a maximum of 512) until fsbufwaitcnt no longer increases. Because the setting only affects filesystems mounted after it is changed, it should be re-executed at boot prior to any mount all command.
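A monitoring-and-doubling sketch on AIX 5.2 (the starting value of 186 comes from the previous slide; the doubled value is illustrative):

   vmstat -v | grep 'filesystem I/Os blocked with no fsbuf'   # sample this under peak load
   ioo -o numfsbufs=372                                       # double it; remount the filesystems to pick it up
   # repeat the check: if the blocked count still climbs, double again (staying at or below 512)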
I/O Tuning Parameters (cont.)
• hd_pbuf_cnt (vmtune -B) determines the number of pbufs assigned to LVM. pbufs are pinned memory buffers used to hold pending I/O requests.
• Again, examine vmtune -a and review the pbuf wait counter (hd_pendqblked; the psbufwaitcnt counter tracks paging-space buffers, not LVM pbufs). If it is increasing, multiply the current hd_pbuf_cnt by 2 until the counter stops incrementing.
• Because hd_pbuf_cnt can only be reduced via a reboot (this is pinned memory), be frugal when increasing this value.
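A check-then-double sketch (the counter name differs by release; the new value is illustrative):

   /usr/samples/kernel/vmtune -a | grep hd_pendqblked           # pre-5.2 pbuf wait counter
   # AIX 5.2: vmstat -v | grep 'pending disk I/Os blocked with no pbuf'
   /usr/samples/kernel/vmtune -B 1024                           # pinned memory - increase frugally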
I/O Tuning
• Over 35% I/O wait should be investigated.
• Oracle databases like async I/O; DB2 and Sybase do not care. A good place to start would be AIO parms of MINSERVERS = 80, MAXSERVERS = 200, MAXREQUESTS = 8192 (see the sketch below).
• Recent-technology disks will support higher LTG numbers.
• lvmstat (must be enabled prior to usage) provides detailed information on I/O contention.
• filemon is an excellent I/O tool (trace-based - ensure you turn it off).
• Adjust numfsbufs and hd_pbuf_cnt to reduce the wait counts reported by vmtune -a or vmstat -v.
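A sketch of those starting AIO parms plus lvmstat enablement (the volume group name is a placeholder; the aio0 attribute corresponding to MAXREQUESTS is maxreqs):

   chdev -l aio0 -a minservers=80 -a maxservers=200 -a maxreqs=8192
   lvmstat -e -v datavg     # enable lvmstat collection for volume group datavg
   lvmstat -v datavg 5 3    # then take three 5-second samples of per-LV activity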
VMSTAT AIX 5
# vmstat -I -t 1 10
kthr     memory            page                      faults       cpu         time
----- ------------ ------------------------ ------------ ----------- --------
 r  b  p   avm   fre  fi fo pi po fr sr  in  sy  cs us sy id wa hr mi se
 0  0  0 35169 98866   0  0  0  0  0 16 118 231  30  0  1 99  0 12:41:52
 0  1  0 35169 98863   0  0  0  0  0  0 222 100  27  0  0 99  0 12:41:53
 1  1  0 35169 98863   5  0  0  0  0  0 229  88  38  2  0 91  7 12:41:54
 1  1  0 35169 98863   6  0  0  0  0  0 218  58  26  4  5 91  0 12:41:55
 1  1  0 35169 98863   7  0  0  0  0  0 227  58  30  6  0 94  0 12:41:56
 0  1  0 35169 98863   4  0  0  0  0  0 236  72  34  0  0 99  0 12:41:57
 0  1  0 35169 98863   0  9  0  0  0  0 223  72  34  0  0 99  0 12:41:58
 0  1  2 35169 98863  20  7  0  0  0  0 221  60  28  1  0 89 10 12:41:59
 0  1  2 35169 98863  18  4  0  0  0  0 213  58  30  1  5 84 10 12:42:00
 0  1  0 35169 98863   0  0  0  0  0  0 221  72  34  0  0 99  0 12:42:01
VMSTAT AIX 5
/@test1 $ vmstat hdisk0 hdisk1 1 10
kthr     memory            page                       faults        cpu      disk xfer
----- ------------ ------------------------ ------------ ----------- -----------
 r  b   avm    fre  re pi po fr sr cy  in   sy   cs us sy id wa  1  2  3  4
 1  1 51459 110720   0  0  0  0  0  0 208 2484 1177 26 10 64  0  0  0
 3  0 51465 110714   0  0  0  0  0  0 303 5371 1609 26 11 64  0  0  0
 1  0 51465 110714   0  0  0  0  0  0 300 5502 1725 27  8 65  0  0  0
 3  0 51465 110714   0  0  0  0  0  0 305 5273 1613 27  9 64  0  0  0
 1  0 51466 110713   0  0  0  0  0  0 310 5330 1654 21 15 65  0  0  0
 1  1 51467 110712   0  0  0  0  0  0 308 5341 1643 28  7 65  0 10  0
 1  1 51467 110712   0  0  0  0  0  0 313 5392 1665 28 10 62  0  5  4
 1  1 51467 110712   0  0  0  0  0  0 308 5421 1677 23 13 64  0  0  8
 2  0 51467 110712   0  0  0  0  0  0 308 5271 1635 27  8 66  0  0  0
 1  0 51467 110712   0  0  0  0  0  0 302 5432 1697 29  9 62  0  0  0

The disk xfer columns show the number of transfers per second to the specified physical volumes during the sample interval. One to four physical volume names can be specified, and transfer statistics are given for each specified drive in the order specified. The count represents requests to the physical device; it does not imply an amount of data read or written, since several logical requests can be combined into one physical request.
A look at the changes at AIX 5.2 and beyond
AIX 5.2 Performance Tools - What's Changed?
Performance management and debugging tools:
• Template-based AIX performance tuning via a stanza-based file, /etc/tunables
  - Supports no, nfso, schedo (schedtune), and vmo (vmtune)
  - Supports persistent values for no and nfso across reboot
  - File can be exported and imported to multiple servers
• Consolidated access to performance tuning values in SMIT and Web-based System Manager
• Perf toolbox and iostat support for ESS vpaths
• Xprofiler (GUI-based profiling tool) included in AIX base
• Performance tools support for LPAR, large pages, and memory affinity
• New thread analysis tools: CURT and SPLAT
• tprof enhancements: support for emulation and alignment interrupts, improved threads support, multiple process profiling
• kdb enhancements for crash, lldb functions
Command consolidation: vmtune becomes ioo & vmo; schedtune becomes schedo; no and nfso keep their names.
• Command consistency
• Options for display or change
• Ability to control changes now, at next boot, or both
• Ability to return to defaults, check consistency, save or propagate
• Commands supported from SMIT or WSM
Old vs. New
Command            AIX 5.1     AIX 5.2
VMM Tuning         vmtune      vmo & ioo
I/O Tuning         vmtune      ioo
Scheduler Tuning   schedtune   schedo
Network Tuning     no          no (new syntax), no -a
NFS Tuning         nfso        nfso (new syntax), nfso -a
AIX 5.2/5.3 Tuning: /etc/tunables
• tuncheck - checks ranges and dependencies, runs bosboot if required
• tunsave - saves current values to a file (optionally nextboot)
• tunrestore - restores from a file (now or at reboot)
• tundefault - restores the default values
Benefits: promotes reusability; flags are now consistent; automatic saving of parameters; callable from SMIT/WSM.
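A save-validate-propagate sketch (the stanza file name is arbitrary):

   tunsave -f /etc/tunables/mytuning      # snapshot the current tunables to a stanza file
   tuncheck -f /etc/tunables/mytuning     # verify ranges and dependencies
   tunrestore -f /etc/tunables/mytuning   # apply now; copy the file to other servers and tunrestore there to propagate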
Common flags for vmo, ioo, schedo, no, and nfso include -a (show all values), -o tunable[=value] (show or set one tunable), -d and -D (reset one or all tunables to defaults), -p (apply now and persist across reboot), -r (apply at the next reboot), and -L (list tunables with ranges and dependencies).
General Database Goodness
• Async I/O
• Buffering vs. AIX filesystems
• Controlling logs
  - Size to avoid constant switches
  - Watch placement on volumes
• Use the correct tuning parameters
• Set system parameters
  - Number of processes per user
lparstat - monitoring mode
Metrics displayed:
• CPU utilization (%user, %sys, %idle, %wait)
• Percentage of time spent in the hypervisor (%hypv) and number of hcalls (hcalls) [both optional]
• Additional shared-mode-only metrics:
  - Physical Processor Consumed (physc)
  - Percentage of Entitlement Consumed (%entc)
  - Logical CPU Utilization (%lbusy)
  - Available Pool Processors (app)
  - Number of virtual context switches (vcsw), i.e., virtual processor hardware preemptions
  - Number of phantom interrupts (phint), i.e., interrupts received for other partitions

Example:
# lparstat 5 10

System configuration: type=Shared mode=Capped smt=On lcpu=2 mem=2048 psize=1.0 ent=0.50

%user %sys %wait %idle physc %entc lbusy  app  vcsw phint
----- ---- ----- ----- ----- ----- ------ ---  ---- -----
  4.8  1.2   0.0  94.0  0.04   7.0    1.7 0.9  1378     0
 21.8  1.8   0.0  76.4  0.13  26.3   13.3 0.8  1580     0
 31.2  2.2   0.0  66.5  0.19  37.0   16.1 0.8  1461     0
 84.9  5.4   0.0   9.7  0.50  99.4   48.7 0.5  1472     0
 85.1  5.4   0.0   9.5  0.50  99.5   48.1 0.5  1477     0
 77.1  4.9   0.0  18.0  0.45  90.2   44.5 0.4  1546     0
  2.9  6.2   0.0  90.9  0.06  11.4    2.2 0.9  1425     1
  4.8 13.6   0.0  81.6  0.11  22.6   10.0 0.8  1810     0
  4.4 12.3   0.0  83.3  0.10  20.5   10.5 0.9  1773     1
lparstat has 3 modes: information mode (-i) shows static configuration information; detailed hypervisor mode (-H) gives a breakdown of hypervisor time by hcall type; monitoring mode is the default.

topas - main screen update
• New CPU-section metrics on physical processing resources consumed:
  - Physc: amount consumed, in fractional numbers of processors
  - %Entc: amount consumed, as a percentage of entitlement
[Sample topas screen for host specweb8 showing the new "Physc = 0.01" and "%Entc = 1.2" fields under the CPU bars, alongside the usual EVENTS/QUEUES, FILE/TTY, Network, Disk, process, PAGING, MEMORY, NFS, and PAGING SPACE sections.]
• New metrics are added automatically when running in shared mode.
• CPU utilization metrics are automatically calculated using new PURR-based data and formulas when running in SMT or shared mode.
Filesystem buffers: insufficient buffers will degrade I/O performance. The default AIX setting for these buffers is typically too low for database servers. JFS and JFS2 are tuned separately: JFS uses vmtune -b, while JFS2 uses vmtune -Z.
Be careful not to set the value too high if running a 32-bit kernel with a large number of filesystems (50+). The buffer setting is per filesystem, and you can run out of kernel memory if it is set too high. (This does not apply to the 64-bit kernel, which supports larger kernel memory sizes.)
Tune the filesystem buffers when the system is under peak load. Run the following command multiple times:
AIX 5.1: /usr/samples/kernel/vmtune -a | grep fsbufwaitcnt
AIX 5.2: vmstat -v | grep "filesystem I/Os blocked with no fsbuf"
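A sketch of watching the counter while raising the JFS2 buffer count (the ioo tunable name corresponds to vmtune -Z at 5.2; the value is illustrative):

   # JFS counter: 'filesystem I/Os blocked with no fsbuf'
   # JFS2 counter: 'external pager filesystem I/Os blocked with no fsbuf'
   vmstat -v | grep 'blocked with no fsbuf'       # run repeatedly under peak load
   ioo -o j2_nBufferPerPagerDevice=1024           # per filesystem, so raise cautiously on 32-bit kernels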
Disk layout: the most important I/O tuning step is to spread all of the data over all of the physical disk drives.* If you have a SAN, work closely with the SAN administrator to understand the logical-to-physical disk layout; in a SAN, two or more hdisks may reside on the same physical disk. (*The one exception is when you back up to disk: be sure the backup disks are on a separate storage system to avoid having a single point of failure.)
Queue depth for fibre channel adapter: this setting depends on the storage vendor. For IBM Shark storage, I set this around 100. If using non-IBM storage, check with your vendor for their queue depth recommendation; high queue depth settings have been known to cause data corruption on some non-IBM storage. If unsure, use the default value. (See the sketch below.)
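A hedged sketch for inspecting and staging a queue depth change (the hdisk number and value are placeholders; confirm the value with your storage vendor first):

   lsattr -El hdisk4 -a queue_depth        # current value
   chdev -l hdisk4 -a queue_depth=100 -P   # -P records the change in the ODM; it takes effect at the next reboot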
Asynch I/O: improves write performance for JFS and JFS2 file systems; it does not apply to raw partitions. AIO is implemented differently in AIX 5.1 and 5.2: in 5.1 the min/max AIO settings are for the entire system, while in 5.2 the AIO settings are per CPU.
In AIX 5.1, I set "max server" to 1000; on a 5.2 system, divide 1000 by the number of CPUs. I tend to over-configure AIO, as changing it requires a reboot. Over-configuring "max server" doesn't use any extra resources, as AIO servers are only created when needed; "max server" just sets the maximum, not the actual number used. If you plan to use DLPAR and dynamically add CPUs, contact Supportline to discuss the implications.
Change / Show Characteristics of Operating System
Type or select values in entry fields.
Press Enter AFTER making all desired changes.
[Entry Fields]
Maximum number of PROCESSES allowed per user [10000] ***
Maximum number of pages in block I/O BUFFER CACHE [20]
Maximum Kbytes of real memory allowed for MBUFS [0]
Automatically REBOOT system after a crash false
Continuously maintain DISK I/O history true ***
HIGH water mark for pending write I/Os per file [0]
LOW water mark for pending write I/Os per file [0]
Amount of usable physical memory in Kbytes 524288
State of system keylock at boot time normal
Enable full CORE dump false
Use pre-430 style CORE dump false
CPU Guard enable ***
ARG/ENV list size in 4K byte blocks [6]
Default max processes per user (128) too low
Continuously maintain disk I/O - sar and iostat to record disk
CPU Guard - CPU Deallocation
Change / Show Characteristics of an Ethernet Adapter
Type or select values in entry fields.
Press Enter AFTER making all desired changes.
[Entry Fields]
Ethernet Adapter ent0
Description IBM 10/100 Mbps Ethern
Status Available
Location 10-60
TRANSMIT queue size [8192]
HARDWARE RECEIVE queue size [256]
RECEIVE buffer pool size [384]
Media Speed 10/100, full-duplex ****
Inter-Packet Gap [96]
Enable ALTERNATE ETHERNET address no
ALTERNATE ETHERNET address [0x000000000000]
Enable Link Polling no
Time interval for Link Polling [500]
Apply change to DATABASE only no
Avoid use of autonegotiate…
General Oracle Tuning Tips
• Enable async I/O (Oracle: init.ora, through the SMIT menu).
• Read tips on the number of AIO servers enabled (maxservers).
• Setting minservers higher will ensure that these servers run at a favored priority and with round-robin policy; otherwise, servers created after minservers (up to maxservers) run with whatever priority and scheduling policy the process that issued the AIO (e.g., Oracle) has.
• Set tcp.nodelay = true in the protocol.ora file - very important if using large MTUs (e.g., SP systems). (See the fragment below.)
• You may want to run fixed priority for all Oracle processes and increase the timeslice with schedtune.
• dbwriter and logwriter may be run at more favored priorities.
• Size Oracle log files to avoid frequent log switches.
• Use raw LVs for your db files rather than filesystem files if the database size is much larger than the real memory size. If you have to use filesystem files, then use a lot of files - you are asking for trouble if you put all of your tables into a single file.
• Separate your Oracle log files from all other files (i.e., do not mix Oracle sequential I/O with any random I/O).
• timed_statistics can be turned off for better performance.
• Check that any non-essential traces are off: e.g., SQL trace, SQL*Net trace, or Oracle Otrace.
• Increase the Oracle spincount parameter on faster hardware.
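A minimal protocol.ora fragment for the tcp.nodelay tip above (the path is the usual Oracle Net location, stated here as an assumption; check your Oracle home):

   # $ORACLE_HOME/network/admin/protocol.ora
   tcp.nodelay = true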
pSeries/AIX SAP hints
• Configuration of the aio0 device
  - Symptom of a problem is slow application I/O but fast I/O at the AIX PV level
  - Rule of thumb for starters (see the sketch below):
    • maxservers = 125% of the datafile (container) count
    • minservers = 1/2 maxservers
• Filesystem buffers - numfsbufs
  - Symptom is consistent increases in:
    • vmstat -v "filesystem I/Os blocked with no fsbuf" (AIX 5.2)
    • vmtune -a "fsbufwaitcnt" (AIX 4.3.3)
  - Increase using ioo on AIX 5.2 (or vmtune with 5.1)
• JFS i-node serialization with change-intensive workloads
  - Symptom is slow I/O at the application, fast at the filemon PV level
    • Perfpmr traces to confirm
  - The JFS2 cio option removes i-node serialization
  - With JFS, limit the size of datafiles (containers) on change-intensive tablespaces to 2 GB.
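A worked sketch of the aio0 rule of thumb, assuming 200 datafile containers (a purely illustrative count):

   # maxservers = 125% of 200 = 250; minservers = 250 / 2 = 125
   chdev -l aio0 -a maxservers=250 -a minservers=125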
General Backup Goodness
• Large TCP/IP buffer sizes
• Watch concurrent access (disk thrashing)
• Remember TCP/IP parameters are inherited from the parent process
Redbook References
• Managing AIX Server Farms, SG24-6606-00
• AIX 5L Differences Guide, SG24-5765-02
• AIX Version 4.3 to 5L Migration Guide, SG24-6924-00
• AIX 5L Performance Tools Handbook, SG24-6039-00
• Database Performance Tuning on AIX, SG24-5511
• Understanding IBM eServer pSeries Performance and Sizing, SG24-4810
Other References
• Performance Management Guide: http://publib16.boulder.ibm.com/pseries/en_US/infocenter/base/aix52.htm
• AIX 5L Performance Tools Handbook, SG24-6039-00
• Direct I/O: http://www-106.ibm.com/developerworks/eserver/articles/DirectIO.html
• Concurrent I/O: http://www-1.ibm.com/servers/aix/whitepapers/db_perf_aix.pdf
SUMMARY
• Each environment will have different challenges.
• Rules of thumb are just that: suggestions.
• If you don't know what your performance was BEFORE you made the change, you won't know what effect you had on performance.
• Carefully define ALL the boundaries that you must operate under. Best-case throughput is always controlled by the slowest common denominator!