
    Copyright IBM Corporation, 2013

    Optimizing SAP Sybase Adaptive Server Enterprise (ASE) for IBM AIX

    An IBM and SAP Sybase white paper on optimizing SAP Sybase ASE 15 for IBM AIX 5L, AIX 6.1, and AIX 7.1 on IBM POWER5, POWER6, and POWER7 processor-based servers

    Peter H. Barnett (IBM), Stefan Karlsson (SAP Sybase),

    Mark Kusma (SAP Sybase), and Dave Putz (SAP Sybase)

    August 2013


    Table of contents

    Abstract
    Introduction
    Considering virtual-memory issues
        ASE memory usage
        ASE 15.7 update: kernel resource memory
        AIX virtual memory
        Memory-sizing guidelines
        AIX memory-configuration guidelines
            Setting the file-system buffer-cache configuration
            Locking (pinning) ASE memory in physical memory
            The ipcs command
            The vmstat command
        Using large pages with ASE
        Reclaiming swap space
    Processor considerations
        General sizing guidelines
            Idling engines (ASE 15.7 process mode, ASE 15.0.3, ASE 15.5)
            ASE engine configuration for AIX and runnable process search count (ASE 15.7 process mode, ASE 15.0.3, ASE 15.5)
            Spinlock contention
        ASE 15.7 and the threaded kernel
        ASE 15.7 threaded kernel: New sp_sysmon metrics
            Engine utilization
            Average runnable tasks
            Processor yields by engine
            Thread utilization at the OS level
            Context switches at the OS level
        Simultaneous multithreading (POWER5, POWER6, and POWER7 with AIX 5.3, AIX 6.1, and AIX 7.1)
        SMT and online engines
        Shared-processor LPARs
            Recommendations for ASE in a shared-processor LPAR environment
        Processor and memory affinity in POWER7 and POWER7+ processors
        The performance penalty in over-allocating engines (applies to all releases)
        The AIXTHREAD_SCOPE environment variable in AIX
    Storage alternatives
        General storage recommendations for AIX
        Asynchronous I/O methods
            Tuning asynchronous I/O in ASE
            Tuning asynchronous I/O in AIX 5L only
            Tuning asynchronous I/O in AIX 6.1 and AIX 7.1
        Raw devices
        File-system devices
        Buffered file system I/O and dsync
        Quick I/O, direct I/O and concurrent I/O: File-system access without OS buffering
        Considerations for temporary databases
        Special considerations for the 2 KB page-size ASE
        Virtual I/O in AIX 5L to AIX 7.1 with POWER5, POWER6, and POWER7
    Examining network performance
        Virtual Ethernet I/O buffers
    Ulimits
    Summary
    Appendix A: Tuning and performance testing of SAP Sybase ASE 15 on AIX 6.1 and POWER7
        Description of the workloads: BatchStressTest, MixedStressTest, StaggeredStressTest, and TPC-C
        Hardware, operating system, and ASE configuration
        BatchStressTest
        MixedStressTest
        StaggeredStressTest
        TPC-C benchmark
        Conclusions
    Appendix B: Comparing ASE 15.7 threaded mode with ASE 15.5 and ASE 15.7 process mode on AIX 7.1 and POWER7+
        BatchStressTest
        MixedStressTest
        TPC-C
        General observations
    Appendix C: AIX 15.7 prerequisite I/O completion ports
    Appendix D: Resources
    Acknowledgements
    Trademarks and special notices


    Abstract

    This is a paper from IBM and SAP Sybase to assist database administrators (DBAs) and UNIX system administrators working together to achieve the best possible performance from SAP Sybase Adaptive Server Enterprise (ASE) on IBM Power Systems servers running in the IBM AIX operating environment.

    The focus of this white paper is on SAP Sybase ASE 15.0.2 through SAP Sybase ASE 15.7, running under AIX 5.3, AIX 6.1, and AIX 7.1 on the IBM POWER6, IBM POWER7, and IBM POWER7+ processor-based servers. The recommendations of the original version of this white paper, published in March 2006, applied to IBM AIX 5L logical partitions (LPARs) with dedicated processors, memory, and devices. The 2009 revision of the paper also addressed all relevant aspects of IBM PowerVM including shared-processor LPARs (referred to as IBM Micro-Partitioning) and virtual I/O, features which are only available in the POWER5, POWER6, and POWER7 processor-based servers with AIX 5.3 or later releases. The 2012 revision of the white paper extended coverage to POWER7, enhancements to virtual I/O, and AIX 7.1 where relevant to ASE. The current revision extends coverage to ASE 15.7 Electronic Software Distribution (ESD) 4.2. Recent field and lab experience with POWER7 and POWER7+ processor-based servers is applied to new or revised best practice recommendations. The recommendations apply to ASE Cluster Edition for AIX except where noted.

    Introduction

    This paper is particularly relevant to those businesses that are implementing SAP Sybase ASE instances

    in one of the following environments:

    • Implementing new SAP Sybase ASE instances on IBM Power Systems with IBM AIX 5.3, AIX 6.1, and AIX 7.1

    • Upgrading from earlier IBM Power Systems and AIX installations

    • Migrating to an IBM Power System server running AIX 5.3, AIX 6.1, or AIX 7.1 from another version of the UNIX operating system

    Note: In this paper, the operating system is referred to as AIX unless a feature or issue is specific to a particular version of AIX 5.3, AIX 6.1, or AIX 7.1. The intermediate release levels, called technology levels, are abbreviated as TL where referenced. References to the IBM POWER7 architecture apply to the IBM POWER7+ processor-based server line unless otherwise noted. ASE 15.0.1 to ASE 15.7 are referred to as ASE unless a feature or issue is specific to a certain level. References to ASE 15 process mode apply to ASE 15.7 process mode as well as to ASE 15.0.3 and ASE 15.5. References to ASE 12.5 and AIX 5.2 and AIX 5.3, which are out of support, are retained for the sake of clients who might have extended maintenance arrangements for special circumstances.

    In new implementations, upgrades, or migrations, DBAs and system administrators can encounter features of the AIX operating system and the IBM Power Systems architecture that are unfamiliar, including its management of memory and processor utilization, network I/O, and disk I/O operations. Administrators might also have questions about how best to implement SAP Sybase ASE in their environment. The high throughput and unique processor-dispatching mechanism of the IBM Power Systems architecture can also require revisions of older configurations.

    The scope of this white paper is the interface between SAP Sybase ASE and the AIX operating system.

    SAP Sybase ASE manuals address general ASE tuning, such as systems tuning, physical-database


    design, and query tuning. AIX and IBM Power Systems manuals and white papers address details about

    the server, the operating system, storage, and virtualization that are generic to database servers and are

    beyond the scope of this paper. To the extent that ASE system tuning is pertinent to the optimization of

    performance in an AIX or IBM Power environment, it will be addressed. Likewise, to the extent that

    virtual memory and I/O management, processor options, and virtualization features of Power Systems are

    relevant to optimizing ASE on AIX/Power, they will be addressed as well.

    The focus of this white paper is also on the aspect of database administration in which the DBA and UNIX

    system administrators interact most closely. Ideally, DBAs who read this paper will be competent in SAP

    Sybase ASE; UNIX system administrators will be competent in IBM Power Systems and AIX, and both will

    be reasonably acquainted with the other's specialty.

    This white paper discusses some commands that require UNIX root privileges. Other commands require

    SAP Sybase ASE sa_role authority. Close communication and cooperation between the system

    administrator and the DBA is essential in every aspect of tuning, including the following tasks:

    • Sizing and allocation of shared memory in the ASE engine

    • Configuration of virtual memory in the operating system

    • The relation of processor resources to online ASE engines

    • Storage considerations, such as raw logical volumes for ASE devices and asynchronous I/O configuration

    This paper contains sections on virtual memory, processing, and storage. Following these three major

    sections, there is a short section on networking, a reminder on ulimits, and a summary. All sections are

    updated to include changes pertinent to ASE 15.7.

    The Appendix A: Tuning and performance testing of SAP Sybase ASE 15 on AIX 6.1 and POWER7

    section, which describes a laboratory exercise in tuning and testing SAP Sybase ASE 15 on

    POWER7 processor-based servers, is retained as pertaining to ASE 15.7 process mode and earlier

    releases of ASE. The Appendix B: Comparing ASE 15.7 threaded mode with ASE 15.5 and ASE 15.7

    process mode on AIX 7.1 and POWER7+ section reports comparative performance of ASE 15.5, ASE

    15.7 threaded mode and ASE 15.7 process mode on the same workloads as were used in Appendix A.

    The Appendix C: AIX 15.7 prerequisite I/O completion ports section contains details on activating and

    patching the input/output completion port (IOCP) subsystem which is used by ASE 15.7. All references (in

    the Appendix D: Resources section) have been updated.

    The recommendations that are included in this white paper have been gathered and compiled from a

    number of resources, including:

    • Experience in the field (pertaining to development, testing, and production systems) with ASE in the AIX environment

    • Laboratory experiments conducted by specialists at SAP Sybase and IBM

    • Close cooperation and discussions between AIX and ASE performance and development teams

    System and database administration is close to being an exact science, but performance tuning is an art.

    The recommendations in this white paper are a starting point. Administrators can apply these techniques

    in consultation with users, business units, and application developers. Every enterprise has its unique


    characteristics, its mixture of batch and online transaction processing (OLTP), its daily ebb and flow of

    demand, and its evolution and growth.

    Likewise, both SAP Sybase and AIX specialists have changed their performance recommendations over

    time. This might continue to be the case as SAP Sybase ASE, the AIX operating environment, and the

    IBM Power Systems platform evolve in parallel. A formal partnership exists between IBM and SAP

    Sybase. IBM and SAP Sybase have committed to continuing a dialogue to keep their recommendations

    current and relevant.

    The authors want to close the introduction with an observation and a strong recommendation. In many

    organizations, a significant disconnect exists between database and system administrators. This

    disconnect adversely affects system performance and availability. Therefore, it is important to establish

    close cooperation and direct communication between these units and roles to build stronger business

    benefits from the effective use of the system.

    LPARs and dynamic logical partitioning (DLPAR)

    Several IBM POWER4 processor-based UNIX servers and all IBM POWER5, IBM POWER6, POWER7, and POWER7+ processor-based UNIX servers are LPAR-capable. That is, they support LPARs, which are multiple OS images of varying sizes and capacities that exist on a single physical server frame.

    Even if there is only one image on an applicable hardware platform, it is still an LPAR. All of the tuning

    recommendations contained in this white paper apply to the AIX operating system images in the form of

    LPARs.

    LPAR-capable POWER4, POWER5, POWER6, and POWER7 processor-based systems support the

    dynamic reallocation of memory and processors, referred to as DLPAR. The system administrator can add

    or remove processors and memory in an LPAR without rebooting, within the minimum and maximum

    memory and processor values specified in the LPAR profile. The same applies to virtual resources that are

    discussed in the following sections.

    A note on optimization level

    While the focus of this white paper is on ASE and AIX system tuning, the authors wish to call attention to the ASE configuration parameter optimization level. This parameter is perhaps not as familiar as optimization goal or compatibility mode but needs to be taken into account when migrating from an earlier to a current version of ASE. Whereas optimization goal optimizes for an OLTP, mixed, or batch type workload, and compatibility mode allows the use of an ASE 12.5 level of optimizer, optimization level allows the database administrator and application administrator to control the level of optimizer changes within the ASE 15 family of servers. The default is the ASE 15.0.3 ESD1 optimizer; other choices are ASE 15.0.3 ESD2 and 3, and ase_current, which is the optimizer supported by the current installation, for example, ASE 15.7 ESD4. Thus, in migrating from ASE 15.0.3 to ASE 15.7, the administrator might choose to implement the default value until the optimizer enhancements of ASE 15.7 can be thoroughly tested with the applications. Full documentation of optimization level is available at: http://www.sybase.com/detail?id=1080354


    Considering virtual-memory issues

    Incorrect sizing or OS configuration can severely degrade performance. This section explains how ASE

    uses and AIX manages virtual memory and provides detailed configuration advice.

    All host applications consume memory because of the code, static variables, and more. Certain

    applications have greater memory requirements than others. Typically, an administrator configures a

    relational database management system (RDBMS) to use the majority of the host's memory for data

    caches, but other applications can use significant amounts, as well. For example, memory must be

    available for:

    • Operating system

    • File-system buffer cache

    • A GUI-based environment

    • Batch operations, such as loading or unloading data (bulk copy)

    • Maintenance operations, such as database backup and restore

    • Other SAP Sybase applications, such as SAP Sybase Open Client and SAP Sybase Open Server

    • Third-party applications and utilities that run on the same LPAR against the data server

    If virtual memory approaches over-allocation (being overcommitted), then the operating system looks for

    memory pages to write to disk, free memory, and make it available for other purposes. This process,

    typically referred to as paging, is a concern only when some memory area approaches its limit or the

    search for victim pages requires significant resources. The main performance hit stems from the fact that

    the application expects to find some piece of data in memory, yet the data has been paged to disk.

    Retrieving paged data can result in throughput that is an order of magnitude slower. Although AIX and its

    applications continue to run in a degraded fashion while some paging space remains, paged-out code

    paths might result in application timeouts with unpredictable results.

    ASE memory usage

    SAP Sybase ASE uses memory for many purposes, including:

    • Executable binaries

    • Configuration (static and dynamic configurable parameters)

    • Resources such as number of open objects

    • User tasks (number of user connections)

    • Data caches

    • Procedure cache

    • In-memory databases (ASE 15.5 and higher versions)

    • Kernel resource memory (ASE 15.7)

    During startup, the ASE engine allocates shared memory to meet all these needs, up to the limit defined by the max memory value. If max memory is set to less than the total of configured resources, then the engine does not start. If it is set higher than what the specified items need, the ASE instance requests only the required amount of memory.


    The classic ASE scheme for allocating shared memory is to request the total max memory amount at startup. Users can revert to the classic scheme by setting the allocate max shared memory parameter to 1. This does not limit the administrator's ability to reconfigure the ASE engine dynamically, unless the dynamic allocation on demand value has changed from its default setting. You can increase the amount of memory that the ASE engine uses by increasing the max memory value during run time. Optionally, the ASE engine also allows administrators to lock its memory into physical memory (refer to the Locking (pinning) ASE memory in physical memory section), but if ASE memory is pinned, memory added by increasing max memory will not be pinned until the next restart.
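    The startup behavior described above can be summarized in a small model. This is only an illustrative sketch, not ASE code: the function name and its arguments are hypothetical, all sizes are assumed to be in 2 KB pages, and the "total of configured resources" is represented by a single number.

```python
def shared_memory_requested_at_startup(max_memory_pages,
                                       configured_total_pages,
                                       allocate_max_shared_memory=False):
    """Toy model of how much shared memory ASE requests at startup.

    All names here are hypothetical; sizes are in 2 KB pages.
    """
    if max_memory_pages < configured_total_pages:
        # max memory is below the total of configured resources:
        # per the text, the engine does not start.
        raise ValueError("max memory is smaller than the configured total")
    if allocate_max_shared_memory:
        # Classic scheme (allocate max shared memory = 1):
        # request the whole max memory amount up front.
        return max_memory_pages
    # Default scheme: request only what the configured resources need.
    return configured_total_pages


# Example: max memory is larger than the configured total.
print(shared_memory_requested_at_startup(2_000_000, 1_594_877))
print(shared_memory_requested_at_startup(2_000_000, 1_594_877,
                                         allocate_max_shared_memory=True))
```

    The model deliberately ignores pinning and the lazy attachment of memory by AIX, both of which are discussed separately in this paper.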

    SAP Sybase ASE maintains two other measures of memory besides max memory, namely total logical memory and total physical memory. You can observe them in the output of the sp_configure memory procedure (refer to Listing 1).

    Parameter Name           Default   Memory Used   Config Value   Run Value   Unit               Type
    total logical memory       63488       3189754        1594877     1594803   memory pages(2k)   read-only
    total physical memory          0       3189756              0     1594878   memory pages(2k)   read-only

    Listing 1: SAP Sybase ASE maintains two additional measures of memory (besides max memory)

    The total logical memory value represents the amount of memory required by all objects that are allocated in the configuration file or through the sp_configure procedure; for example, users, caches, open databases, and open objects.

    The total physical memory value represents the amount of memory that is actually in use at a given time. This fluctuates according to the number of users online and the number of open objects, among other factors. This value corresponds closely to the value of active virtual memory in the AIX vmstat report that is attributable to ASE (that is, minus the kernel and other running processes that use memory). The difference between the memory that is set aside for ASE by the OS and the amount actually allocated can create misunderstandings regarding the sizing of the system and the amount of free memory the system actually has. Moreover, AIX uses a lazy allocation policy such that even if allocate max shared memory is enabled, AIX does not attach the memory until ASE needs to access it. Refer also to the Locking (pinning) ASE memory in physical memory section.
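    When comparing the figures in Listing 1 with operating-system reports, it helps to convert the 2 KB page counts into kilobytes or gigabytes. The following sketch does that conversion; the helper names are hypothetical, and the observation that the "Memory Used" column is expressed in KB is an inference from the listing (each value there is twice a page count), not a statement from the ASE documentation.

```python
PAGE_SIZE_KB = 2  # Listing 1 reports memory in 2 KB pages


def pages_to_kb(pages):
    """Convert a count of 2 KB pages to kilobytes."""
    return pages * PAGE_SIZE_KB


def pages_to_gb(pages):
    """Convert a count of 2 KB pages to gigabytes (1 GB = 1024 * 1024 KB)."""
    return pages_to_kb(pages) / (1024 * 1024)


# Values taken from Listing 1.
total_logical_config_pages = 1594877
total_physical_run_pages = 1594878

print(pages_to_kb(total_logical_config_pages))  # 3189754
print(pages_to_kb(total_physical_run_pages))    # 3189756
print(round(pages_to_gb(total_physical_run_pages), 2), "GB in use")
```

    In this listing both measures work out to roughly 3 GB, so the configured and in-use amounts are nearly identical.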

    ASE 15.7 update: kernel resource memory

    As described in the ASE 15.7 System Administration Guide, kernel resource memory is required to bring online a given number of engines (both in process and threaded mode) and is determined mainly by two factors, max online engines and number of user connections. For configurations of eight engines or fewer, it is recommended to add one 2 KB page of kernel resource memory for every two user connections above 100; for nine engines or more, increase kernel resource memory by one page for each user connection. The default value is 4096. The symptom of insufficient kernel resource memory is not being able to bring online the number of engines specified by number of engines at startup (process mode) or in the syb_default_pool (threaded mode). Increasing kernel resource memory can allow the additional engines


    to be brought on line. ASE 15.7 threaded mode will be described in detail in the Processor

    considerations section.
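    The kernel resource memory guideline above can be expressed as a short calculation. This is a sketch under stated assumptions: the function name is hypothetical, the increments are added on top of the default of 4096 pages, and for nine or more engines the rule is read literally as one page per user connection (the text does not say whether the "above 100" threshold also applies in that case).

```python
DEFAULT_KERNEL_RESOURCE_PAGES = 4096  # default stated in the text (2 KB pages)


def suggested_kernel_resource_pages(max_online_engines, user_connections,
                                    base_pages=DEFAULT_KERNEL_RESOURCE_PAGES):
    """Estimate kernel resource memory in 2 KB pages (illustrative only)."""
    if max_online_engines <= 8:
        # One extra page for every two user connections above 100.
        extra_pages = max(0, user_connections - 100) // 2
    else:
        # Assumption: one extra page for each user connection.
        extra_pages = user_connections
    return base_pages + extra_pages


print(suggested_kernel_resource_pages(8, 500))   # 4096 + 200 = 4296
print(suggested_kernel_resource_pages(16, 500))  # 4096 + 500 = 4596
```

    Treat the result as a starting value only; the practical test remains whether all configured engines can be brought online.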

    AIX virtual memory

    For a dedicated ASE host, the two largest consumers of memory are likely the ASE engine and the file-

    system buffer cache. When memory is close to being over-allocated (overcommitted), the AIX operating

    system pages memory to disk. This is also true for any other operating system in which ASE runs. File-

    buffer pages and pages that belong to running processes, such as ASE instances, can be paged to disk.

    AIX support refers to file-buffer paging as page replacement or page stealing, as distinct from the paging

    of an application's working set. Paging in the latter sense causes far more serious performance issues.

    The size of the file-system buffer cache is set as a percentage of physical memory (RAM) using the minperm% and maxperm% parameters. The journaled file system (JFS) and the client file systems are managed within this cache. Common examples of client file systems are the enhanced journaled file system (JFS2) and the Network File System (NFS). The administrator can set the client file system buffer cache maximum size (through the maxclient% parameter) to a percentage of RAM. In practice, maxclient% is usually set to the same value as maxperm%.

    An intuitive approach to tuning the file-system buffer cache size (familiar from earlier versions of IBM AIX 5L) is to limit the cache size by setting the maxperm% parameter to a low figure, such as the memory left over after allocating for ASE, applications and the kernel, and then enforcing that figure as a hard limit by setting the strict_maxperm parameter to 1. The maxperm% parameter defaults to a soft limit. Enforcing a hard limit has two drawbacks:

    • When workloads change, the administrator must change the configuration.

    • When memory is overcommitted, the system does not behave gracefully.

    The following sizing and configuration guidelines represent the current best practices for ensuring

    performance and reliability, without demanding regular reconfiguration. These guidelines reflect the design

    of AIX virtual memory, which is to fill almost all memory left over from applications (up to all but minfree)

    with file cache, and to replace this file cache on demand. They are implemented as defaults in AIX 6.1 and

    AIX 7.1 and must not be altered without the advice of AIX support. Clients still using AIX 5.3 are advised to

    adhere to the same values as the defaults for AIX 6.1 and AIX 7.1.

    Memory-sizing guidelines

    An administrator usually sizes a dedicated ASE host so that it is large enough to accommodate one or

    more ASE instances. The general guideline is to dedicate as much memory as possible to the ASE engine

    and also to save space for other applications. The latter must be emphasized, as proper sizing is

    imperative in the AIX environment. The recommendations in this section strive to provide as robust a solution as possible. Here is a simplified example of the sizing process:

    • The size of the AIX kernel is proportional to the total memory of the LPAR. The kernel grows between reboot operations owing to JFS2 buffer allocation and inode caching. This is also discussed in the Locking (pinning) ASE memory in physical memory section.


    If a host has 16 GB of RAM, the kernel will be around 1.5 GB in size. Then 14.5 GB or less should

    be considered available to other uses, including file-system buffer cache and ASE.

    Assuming that there are no other major applications and no windowing environment, the next item
    is the file-system buffer cache. In this example, it is possible to assume a total of 3 GB for the JFS
    and JFS2 file-system buffer caches, leaving 11.5 GB or less of memory that is available to
    applications, including ASE.

    Note: As explained in the following section, this does not imply that you need to limit
    file-system buffer-cache size to a maximum of 3 GB. Instead, you can determine this maximum
    by examining empirical data from monitoring. Current AIX best practices, detailed next, allow
    almost all memory that is not occupied by program and kernel memory to be occupied by file
    cache and to replace file cache on demand.

    The next consideration involves batch jobs and maintenance tasks. In this case, batch jobs do not

    run concurrently with maintenance tasks. The greater of these might consume 1 GB of memory for

    the duration of the operation, which leaves 10.5 GB available for ASE.

    Reserving 1 GB of memory (for headroom and future increases in batch jobs) suggests

    configuring ASE for 9.5 GB of memory.

    Note: In this example, if the 9.5 GB available to the ASE engine is insufficient, you can add more physical memory to the host.
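
    The sizing example above is plain subtraction, and it can help to keep it in a small script so that the
    figures can be revisited as monitoring data comes in. The following is a minimal sketch, not a sizing tool:
    the function name ase_memory_budget is hypothetical, and the inputs are the estimates used in the
    example (all in GB).

```python
def ase_memory_budget(total_gb, kernel_gb, file_cache_gb, batch_gb, headroom_gb):
    """Return the memory (in GB) left for ASE once the other consumers are subtracted."""
    return total_gb - kernel_gb - file_cache_gb - batch_gb - headroom_gb


# Figures from the example: 16 GB host, about 1.5 GB kernel, 3 GB file cache,
# 1 GB for batch or maintenance work (whichever is greater), 1 GB headroom.
budget_gb = ase_memory_budget(16.0, 1.5, 3.0, 1.0, 1.0)
print(f"Configure ASE for about {budget_gb:.1f} GB")
```

    The kernel and file-cache figures are estimates, so the result is a starting point to be checked against
    what vmstat and svmon report on the running system.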

    As a rule of thumb, start sizing ASE on an LPAR dedicated to ASE with no other significant
    applications by allocating at most 70% of physical memory on the LPAR to ASE max memory. If
    there is more than one instance on an LPAR, apply the 70% limit to the sum of the max memory
    settings of all instances. Local applications that consume significant memory should be factored
    into the 70%. Over-allocating memory to ASE (or any large application) can result both in bad
    paging of computational pages and in file-cache thrashing, where normal page replacement is so
    rapid, or so many pages have to be scanned to find page-eligible ones, that it consumes a
    substantial amount of the processor and slows down overall processing.
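
    The 70% rule of thumb can be checked mechanically when several instances share an LPAR. The sketch
    below assumes that each instance's max memory is given in ASE's 2 KB units; the function name and
    the two instance sizes are hypothetical and serve only as illustration.

```python
PAGE_2K = 2048      # ASE expresses max memory in 2 KB units
GIB = 1024 ** 3


def check_rule_of_thumb(physical_bytes, max_memory_2k_pages, fraction=0.70):
    """Return (within_limit, total_bytes, limit_bytes) for all instances on one LPAR."""
    total_bytes = sum(max_memory_2k_pages) * PAGE_2K
    limit_bytes = fraction * physical_bytes
    return total_bytes <= limit_bytes, total_bytes, limit_bytes


# Hypothetical LPAR with 16 GB of memory and two ASE instances.
ok, total, limit = check_rule_of_thumb(16 * GIB, [3_000_000, 2_500_000])
print(f"sum of max memory = {total / GIB:.2f} GB, "
      f"limit = {limit / GIB:.2f} GB, within limit: {ok}")
```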

    AIX memory-configuration guidelines

    A versatile operating system such as AIX, which runs applications of many different characteristics,

    requires that the administrator configures it specifically for its intended use. In the case of virtual

    memory, there are some areas of greater interest, where you need to ensure system performance and

    provide a robust configuration. Administrators can view and set parameters that are related to the AIX

    Virtual Memory Manager (VMM) through the AIX vmo command, which requires root privileges.

    However, you can see several VMM parameters by using the vmstat -v command, which by default

    does not require root privileges.

    To view all parameters and their respective settings, use the vmo -a command (vmo -Fa in AIX 6.1

    and AIX 7.1 to see restricted parameters).

    To set the maxperm% parameter to 90%, use the vmo -p -o maxperm%=90 command. AIX versions 6.1 and 7.1 come with virtual memory defaults that usually make these changes unnecessary.

    Do not change AIX 6.1 and AIX 7.1 restricted tunables without the advice of AIX support.

    The following recommendations largely correspond to AIX 6.1 and AIX 7.1 defaults and are included to

    provide the reader with a basic understanding of AIX virtual memory management.

    maxperm%=maxclient%=

    [If possible, set these values so that maxclient% is always greater than numclient%. The
    current recommendation for AIX 5L and the default for AIX 6.1 and AIX 7.1 is 90%. To
    determine the current numclient% value, use the AIX vmstat -v command.]

    minperm%=

    [Set this value so that minperm% is always less than numperm%. The current
    recommendation for AIX 5L and the default for AIX 6.1 and AIX 7.1 is 3%, but you can set it
    as low as 1%. To find the current numperm% value, use the vmstat -v command.]

    strict_maxperm = 0

    strict_maxclient = 1

    lru_file_repage = 0

    lru_poll_interval = 10

    page_steal_method = 1

    Notes:

    In contrast to many other UNIX operating systems, the AIX environment does not require you to
    configure limits for semaphores or shared-memory size (for example, compare this to the
    kernel.shmmax tunable [configuration parameter] on the Linux operating system and
    /etc/system in Solaris).

    To find the amount of physical memory, use the AIX lparstat -i command (/usr/bin/lparstat,
    illustrated in Listing 5 under the Processor considerations section). The vmstat -v command
    also displays physical memory, but in 4 KB pages, and not in megabytes (MB).

    File cache size is reported by vmstat -v, which can be run by any user. The key values,
    numperm% and numclient%, represent, respectively, total file cache and file cache consisting of
    client pages, such as JFS2 (64-bit native AIX file system), NFS, Common Internet File System
    (CIFS) and Veritas File System. These values are usually the same in contemporary systems
    owing to the obsolescence of the legacy 32-bit JFS.

    The recommended values apply to all AIX operating environments and to all applications running
    on AIX. They are set the same whether ASE is using buffered file system, direct I/O, or raw
    devices.

    Setting the file-system buffer-cache configuration

    The minperm% and maxperm% parameters, which represent a percentage of physical memory,
    specify the size of the AIX file buffers. Within this cache, the AIX operating system also manages file
    pages for client file systems (for example, JFS2 and NFS). Administrators set the maximum size of
    this subset using maxclient%, again representing a percentage of physical memory (footnote 1).

    Numperm% should be monitored continually, especially on a new system, until it reaches a steady
    state based on routine usage. If the numperm% parameter drops to less than the minperm% value,
    then the AIX operating system will swap out computational pages (for example, pages belonging to
    the ASE processes) instead of file cache. Therefore, you must set minperm% low (3% or less) to
    prevent this. Note that if numperm% is as low as 3% on a regular basis, then memory might be
    overallocated and file cache might have been effectively squeezed out of the system. Even if ASE
    uses raw data devices, its bulk copies, backups, and other maintenance operations need file cache,
    and if they have difficulty obtaining it, they will slow down and consume extra processor (footnote 2).
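
    The relationships described above (numperm% staying above minperm%, numclient% staying below
    maxclient%) lend themselves to a simple periodic check. The following sketch parses text in the
    "value label" layout that vmstat -v prints. The sample text is illustrative rather than captured from a
    real system, and the function names are hypothetical.

```python
# Illustrative excerpt in the layout of vmstat -v output (values are made up).
SAMPLE_VMSTAT_V = """\
              4194304 memory pages
                  3.0 minperm percentage
                 90.0 maxperm percentage
                 12.3 numperm percentage
                 12.3 numclient percentage
                 90.0 maxclient percentage
"""


def parse_vmstat_v(text):
    """Map each 'value label' line to label -> float."""
    stats = {}
    for line in text.splitlines():
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue
        try:
            stats[parts[1].strip()] = float(parts[0])
        except ValueError:
            continue  # skip lines that do not start with a number
    return stats


def file_cache_warnings(stats):
    """Return warnings for the two conditions the text says to avoid."""
    warnings = []
    if stats["numperm percentage"] <= stats["minperm percentage"]:
        warnings.append("numperm% is at or below minperm%: computational pages may be paged out")
    if stats["numclient percentage"] >= stats["maxclient percentage"]:
        warnings.append("numclient% has reached maxclient%")
    return warnings


stats = parse_vmstat_v(SAMPLE_VMSTAT_V)
physical_mb = stats["memory pages"] * 4 / 1024  # vmstat -v counts 4 KB pages
print(f"physical memory: {physical_mb:.0f} MB, warnings: {file_cache_warnings(stats)}")
```

    On a live system the text would come from running vmstat -v, for example through the subprocess
    module, at a regular interval until numperm% settles.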

    By setting the maxperm% parameter to a high value, the file-system buffer cache can grow, but not at
    the expense of computational pages, because lru_file_repage is set to 0 and page_steal_method is
    set to 1.

    The AIX operating system can page file-system buffer pages and computational pages to disk. When
    memory requirements exceed the amount that is currently available, a certain amount of paging
    occurs. At this time, the least recently used (LRU) algorithm comes into play (you are probably already
    familiar with LRU because of its use in ASE data caches). A good example of using the LRU algorithm
    involves burst loads where a large number of file buffers are currently in use. Then, the AIX
    page-stealing daemon (lrud) looks at the memory pages to determine if a particular page has been
    referenced recently; if not, the lrud process sends the page to disk. This happens irrespective of
    whether the page belongs to a file or a process. The recommended lru_file_repage = 0 and
    page_steal_method = 1 settings prevent the lrud process from paging computational pages except
    as a last resort. Setting lru_poll_interval to 10 can improve the responsiveness of lrud while it is
    running. The value of 10 is the default in AIX 5.3, 6.1, and 7.1.

    1 JFS2 is the 64-bit AIX journaled file system, and is almost universally used instead of JFS, the 32-bit implementation.
    However, AIX virtual memory management still classifies JFS2 file systems as client file systems.

    2 One can reduce the burden on file cache by mounting file systems used for dumps, backups and load with the release
    behind options: release behind read for file systems typically read sequentially, release behind write for file systems
    typically written to sequentially, and release behind read write for file systems both read and written to sequentially.
    An ASE dump area which is swept by a tape backup utility would be a good use case for release behind read write.

    Note: The lru_file_repage parameter is tunable in AIX 5L 5.2.05 and later. For AIX 5L 5.2, this
    tunable parameter is available as a fix through the authorized program analysis report (APAR)
    IY62224. The value of 0 causes lrud to ignore the age flag on the page when considering it to be
    paged out. This helps preserve computational pages. In AIX 7.1, lru_file_repage=0 is hard wired and
    no longer visible to be modified. The page_steal_method tunable, introduced in AIX 5.3, TL6, also
    helps avoid paging out computational pages by maintaining separate lists of computational and file
    cache pages and favoring the latter for paging.

    Here is a summary of the recommended virtual-memory configuration values for AIX 5.3, AIX 6.1, and

    AIX 7.1:

    Parameter           Recommended value     AIX 6.1 and       AIX 6.1 and       AIX 5.3
                        for AIX 5.3, AIX 6.1  AIX 7.1 default   AIX 7.1           default
                        and AIX 7.1                             restricted
    minperm%            3                     3                 No                20
    maxperm%            90                    90                Yes               80
    maxclient%          90                    90                Yes               80
    strict_maxclient    1                     1                 Yes               1
    strict_maxperm      0                     0                 Yes               0
    lru_file_repage     0                     0/na*             Yes               1
    lru_poll_interval   10                    10                Yes               10
    minfree             960                   960               No                960
    maxfree             1088                  1088              No                1088
    page_steal_method   1                     1                 Yes               0
    Memory affinity     1                     1                 Yes               1
    v_pinshm            0                     0                 No                0
    lgpg_regions        0                     0                 No                0
    lgpg_size           0                     0                 No                0
    maxpin%             80                    80/90*            No                80
    esid_allocator      1 (AIX 6.1 and 7.1)   0/1               No                n/a

    Table 1: Virtual memory configuration for AIX
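
    A table of recommended values is easiest to apply when the comparison with the running system is
    automated. The sketch below compares text in the "name = value" layout printed by vmo -a with a subset
    of the recommended values from Table 1. The sample input uses the AIX 5.3 defaults from the table, and
    the function name differences is hypothetical.

```python
# Subset of the recommended values from Table 1.
RECOMMENDED = {
    "minperm%": 3, "maxperm%": 90, "maxclient%": 90,
    "strict_maxclient": 1, "strict_maxperm": 0, "lru_file_repage": 0,
    "lru_poll_interval": 10, "page_steal_method": 1, "v_pinshm": 0,
}

# Illustrative input in the layout of vmo -a output, using the AIX 5.3 defaults.
SAMPLE_VMO_A = """\
        lru_file_repage = 1
      lru_poll_interval = 10
             maxclient% = 80
               maxperm% = 80
               minperm% = 20
      page_steal_method = 0
       strict_maxclient = 1
         strict_maxperm = 0
               v_pinshm = 0
"""


def differences(vmo_output, recommended=RECOMMENDED):
    """Return {name: (current, recommended)} for tunables that differ from Table 1."""
    current = {}
    for line in vmo_output.splitlines():
        name, sep, value = line.partition("=")
        if sep and value.strip().lstrip("-").isdigit():
            current[name.strip()] = int(value)
    # Tunables absent from the output (for example lru_file_repage on AIX 7.1) are skipped.
    return {name: (current[name], wanted)
            for name, wanted in recommended.items()
            if name in current and current[name] != wanted}


for name, (have, want) in differences(SAMPLE_VMO_A).items():
    print(f"{name}: current {have}, recommended {want}")
```

    The script only reports differences. On AIX 6.1 and AIX 7.1 most of these tunables are restricted and
    should not be changed without the advice of AIX support.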

    * In AIX 7.1, the maxpin% (maximum allowable pinned memory) default is 90%. In AIX 7.1, the lru_file_repage parameter is
    implemented by default and is neither listed nor modifiable.

    Large memory segment aliasing

    Large segment aliasing allows each memory segment lookaside buffer to address up to 1 TB of
    memory, reducing segment lookaside buffer faults and improving memory access. This is enabled by
    default on AIX 7.1, and is enabled using vmo -p -o esid_allocator=1 in AIX 6.1.

    Locking (pinning) ASE memory in physical memory

    Administrators can configure the ASE engine to lock (pin) its memory into physical memory. This
    requires both operating system configuration and ASE configuration. Pinning ASE shared memory
    used to be a common practice to avoid paging, but AIX performance support generally recommends
    against pinning (if memory is overcommitted, pinning makes matters worse). If paging is caused by
    incorrect virtual memory configuration, then that can be corrected without pinning memory. SAP
    Sybase and IBM only recommend pinning in order to use 16 MB large pages with ASE 15.0.3 and
    higher versions. If you decide to employ pinning, you must configure the ASE instance to allocate all
    configured and shared memory during startup. The following steps demonstrate the configuration of
    AIX and ASE for locking ASE memory:

    1. To enable pinning from the OS level, use the vmo -p -o v_pinshm=1 command.

    2. To configure ASE to lock shared memory, use the ASE sp_configure 'lock shared memory', 1
    command and the sp_configure 'allocate max shared memory', 1 command, or modify the values
    in the ASE .cfg file. It is not necessary to restart AIX, but ASE must be restarted.

    Notes:

    By default, AIX allows for the pinning of 80% of physical memory in AIX 6.1 and 90% in AIX 7.1.

    To change this, use vmo -p -o maxpin%=. In general, this value must not be increased,

    or else, it might result in starvation of processes using file-system cache.

    Locking shared memory can cause issues for other applications if you have not taken into account

    the memory requirements of the kernel, other applications, and the file cache.

    You can dynamically configure the ASE memory to be larger or smaller by using the

    sp_configure max memory, command, and then use this memory for data caches

    or other purposes. The newly allocated memory is not locked into physical memory. Thus, to add

    memory to ASE and have it pinned, ASE needs to be shut down and restarted.

    The IBM AIX performance-development team generally recommends against pinning shared

    memory because the operating system can do a better job by allocating memory dynamically. AIX

    attaches mapped memory to ASE when ASE needs it and not before. AIX will also not attach the

    difference between ASE physical memory and max memory. In both cases, this memory is

    available to other processes and to file cache, whereas, if it was pinned, it might be unused by

    ASE and yet unavailable to other processes. As already noted, the only case in which AIX

    support recommends pinning memory is when using the large (16 MB) memory page for

    performance enhancement, which must be pinned. Refer to the Using large pages with ASE

    section for details.

    In AIX, kernel-pinned memory grows between OS restarts principally because it caches inodes

    and file system lookahead buffers. Inode caching can occupy up to 12% of pinned physical

    memory in AIX 5.3 and higher. It is not unusual to see 18% of physical memory that is pinned by

    the kernel. This kernel behavior must be taken into account in the relative sizing of ASE memory

    and file cache, whether shared memory is pinned or not, but it is especially important in the case

    of pinning.

    If the limit of pinned memory is exhausted between ASE shared memory and the kernel, the

    system stops. You can restrict the growth of the AIX kernel through certain tunable variables, but

    there can be performance tradeoffs. If kernel growth is a problem, you should place an AIX

    support call.
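
    Because pinned ASE memory and kernel-pinned memory draw on the same maxpin% limit, it is worth
    estimating the remaining headroom before enabling pinning. The following is a rough sketch under stated
    assumptions: the kernel is taken to pin 18% of physical memory (the upper figure mentioned above), the
    limit is the 80% default, and the function name is hypothetical.

```python
def pin_headroom_gb(physical_gb, ase_pinned_gb, kernel_pinned_pct=18, maxpin_pct=80):
    """Return GB of pinnable memory left after ASE and the kernel.

    A negative result means the combination would exceed the maxpin% limit.
    """
    limit_gb = physical_gb * maxpin_pct / 100
    kernel_gb = physical_gb * kernel_pinned_pct / 100
    return limit_gb - kernel_gb - ase_pinned_gb


# 16 GB host with ASE max memory of 9.5 GB, as in the earlier sizing example.
print(f"headroom: {pin_headroom_gb(16, 9.5):.2f} GB")
```

    A small or negative figure suggests reducing ASE max memory rather than raising maxpin%.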

    The best way to understand ASE engine memory requirements is to check the size of its shared

    memory. Compare the values reported by the ASE sp_configure memory command (maximum,

    logical, and physical memory) with the value reported by the ipcs -b -m UNIX command.

    The ipcs command

    Refer to the example use of the AIX ipcs investigator command (shown in Listing 2):

    # ipcs -b -m

    IPC status from /dev/mem as of Wed Dec 28 12:32:27 est 2005

    T ID KEY MODE OWNER GROUP SEGSZ

    Shared Memory:

    m 8 0xd8009059 --rw------- sybase sybase 2147483648

    m 9 0xd800905a --rw------- sybase sybase 877723648

    #

    Listing 2: Example use of the AIX ipcs investigator command

    The combined byte counts in the SEGSZ column (right-most column in Listing 2) might be much less
    than the ASE configuration max memory parameter. This indicates that ASE is configured not to
    allocate all memory at startup, but to allocate memory on demand. This is the default setting.
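
    Adding up the SEGSZ column by hand is error-prone when there are many segments. The sketch below
    sums the segment sizes owned by one user from text in the layout of Listing 2 and converts the total to
    2 KB pages for comparison with max memory. The function name is hypothetical, and the sample rows are
    the two segments shown in Listing 2.

```python
SAMPLE_IPCS = """\
T ID KEY MODE OWNER GROUP SEGSZ
Shared Memory:
m 8 0xd8009059 --rw------- sybase sybase 2147483648
m 9 0xd800905a --rw------- sybase sybase 877723648
"""


def shared_memory_bytes(ipcs_output, owner="sybase"):
    """Sum SEGSZ for shared-memory rows (T column is 'm') that belong to owner."""
    total = 0
    for line in ipcs_output.splitlines():
        fields = line.split()
        # Segment rows have seven fields: T, ID, KEY, MODE, OWNER, GROUP, SEGSZ
        if len(fields) == 7 and fields[0] == "m" and fields[4] == owner:
            total += int(fields[6])
    return total


total = shared_memory_bytes(SAMPLE_IPCS)
print(f"{total} bytes = {total / 1024 ** 2:.1f} MB = {total // 2048} pages of 2 KB")
```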

    The AIX VMM has a rather lazy allocation algorithm for shared memory. This means that VMM
    reserves all the memory that an application requests (as seen through ipcs -b -m); however, VMM
    does not actually allocate the memory until the application refers to it (active virtual memory as
    measured by vmstat, sar and svmon). Even if the sp_configure parameter allocate max shared
    memory is set to 1 (true), AIX does not allocate a shared-memory segment until it is referenced. The
    memory that ASE uses, as perceived by the OS (refer to The vmstat command section), increases
    over time and usage until it equals the size of the ASE logical memory. The actual size of allocated
    ASE shared memory seen by the system administrator may differ from what the DBA expects. This
    is one of the many reasons for close cooperation between DBAs and system administrators.

    Applications that terminate abnormally, including ASE itself, can leave shared-memory segments

    stranded. It is important that you explicitly remove these stranded-memory segments by using the AIX

    ipcrm command.

    The vmstat command

    Administrators can monitor virtual memory usage with the vmstat -I (upper case i) command. The
    output of this command contains relevant information in the avm, pi, po, fr and sr columns (refer to
    Listing 3).

    The avm column shows the amount of virtual memory in use (in 4 KB memory pages).

    The pi column shows the number of cache misses that are paged in.

    The po column shows the number of pages that are paged out.

    The fr column shows the number of pages freed.

    The sr (scan rate) column shows the number of pages that were scanned to find eligible
    pages to page out.

    kthr memory page faults cpu

    ----- ----------- ------------------------ ------------ -----------

    r b avm fre re pi po fr sr cy in sy cs us sy id wa

    2 0 2466459 12842265 0 0 0 0 0 0 2561 683764 13014 8 3 89 0

    0 0 2466459 12842205 0 0 0 0 0 0 3896 875010 18645 11 4 84 0

    1 0 2466462 12842010 0 0 0 0 0 0 10548 1170276 37027 26 9 65 0

    9 0 2466472 12841649 0 0 0 0 0 0 22147 1120355 66841 50 17 33 0

    Listing 3: Using the vmstat command to monitor virtual memory usage

    Note: In contrast to the other UNIX implementations, the AIX file-system I/O operations are
    separate from the paging operations. You can see this by looking at the fi and fo columns from
    the vmstat -I command output (shown in Listing 3). This helps in determining whether memory is
    actually running low or not. Also, in contrast to other UNIX implementations, AIX does not clear
    the available memory of idle file pages until the space is needed for another process (for example,
    a Java heap). This can create the impression (for administrators new to AIX) that AIX is out of
    memory all the time.
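
    The columns described above can be summarized by a short script instead of being read by eye. The
    sketch below takes text with the column header and data rows of Listing 3, converts the peak avm value
    from 4 KB pages to GB, and reports whether any interval showed page-in or page-out activity. The
    function name is hypothetical. Columns are located by position because the label sy appears twice in
    the header.

```python
SAMPLE_VMSTAT = """\
r b avm fre re pi po fr sr cy in sy cs us sy id wa
2 0 2466459 12842265 0 0 0 0 0 0 2561 683764 13014 8 3 89 0
0 0 2466459 12842205 0 0 0 0 0 0 3896 875010 18645 11 4 84 0
1 0 2466462 12842010 0 0 0 0 0 0 10548 1170276 37027 26 9 65 0
9 0 2466472 12841649 0 0 0 0 0 0 22147 1120355 66841 50 17 33 0
"""


def summarize(vmstat_output):
    """Return (peak active virtual memory in GB, True if any paging was seen)."""
    lines = [line.split() for line in vmstat_output.splitlines() if line.strip()]
    header, rows = lines[0], lines[1:]
    col = {name: header.index(name) for name in ("avm", "pi", "po")}
    peak_avm_pages = max(int(row[col["avm"]]) for row in rows)
    paging = any(int(row[col["pi"]]) > 0 or int(row[col["po"]]) > 0 for row in rows)
    return peak_avm_pages * 4096 / 1024 ** 3, paging


peak_gb, paging = summarize(SAMPLE_VMSTAT)
print(f"peak avm: {peak_gb:.2f} GB, paging observed: {paging}")
```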

    Using large pages with ASE

    Large pages (16 MB) can make a noticeable improvement in performance for large-memory configurations
    by decreasing the number of translation-lookaside buffer (TLB) misses. Field and lab experience shows
    that many ASE workloads achieve up to 15% performance improvement with 16 MB large pages. SAP
    Sybase ASE supports 16 MB large pages starting with version 15.0.3. All supported versions of AIX
    implement 16 MB pages.
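
    Sizing the large-page pool comes down to rounding ASE max memory up to a whole number of 16 MB
    pages and passing that count to the lgpg_regions tunable together with lgpg_size. The following is a
    minimal sketch, assuming that max memory is expressed in 2 KB units; the function name is hypothetical.

```python
PAGE_2K = 2048                  # ASE expresses max memory in 2 KB units
LGPG_SIZE = 16 * 1024 * 1024    # 16 MB large page = 16777216 bytes


def large_pages_needed(max_memory_2k_pages):
    """Round max memory up to a whole number of 16 MB large pages."""
    size_bytes = max_memory_2k_pages * PAGE_2K
    return -(-size_bytes // LGPG_SIZE)  # ceiling division on integers


regions = large_pages_needed(699999)
print(f"vmo -p -o lgpg_regions={regions} -o lgpg_size={LGPG_SIZE}")
```

    Integer ceiling division avoids the floating-point rounding surprises that can appear when a size falls
    exactly on a 16 MB boundary.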

    Perform the following steps to use large pages:

    1. Ensure that ASE memory is locked according to the procedures given earlier.

    2. Use the lgpg_regions and lgpg_size tunable parameters (in the vmo command) to specify how
    many large pages to pin. Be sure to round up to the next 16 MB page greater than the amount of

    max memory. For example, if max memory is set to 699999 (2 KB pages, which is about 1367 MB or 85.4 large pages), you need to pin 86 large pages.

    The two parameters must be specified in the same command line, in this example, vmo -p -o lgpg_regions=86 -o lgpg_size=16777216. The change is dynamic and a reboot is not

    required. However, the vmo command warns you to run bosboot to make sure that the values
    are correctly set the next time you reboot. Two cautions apply:

    a. AIX allocates 16 MB large pages in 256 MB frames. If the amount allocated in
    lgpg_regions is not an even multiple of 256 MB, then the remainder is allocated in 4K

    # svmon -G
                   size       inuse        free         pin     virtual   mmode
    memory       983040      951504       31536      820477      936839     Ded
    pg space     737280      110327

                   work        pers        clnt       other
    pin          736973           0           0       34352
    in use       875453           0       26899

    PageSize   PoolSize       inuse        pgsp         pin     virtual
    s    4 KB         -      149600       55831      103773      171463
    m   64 KB         -       18375        3406       13050       19164
    L   16 MB       124         112           0         124         112

    Listing 4: svmon -G command output (last line shows 112 16 MB large pages in use out of a pool of 124 such pages)

    Reclaiming swap space

    AIX versions 5.3, 6.1, and 7.1 periodically reclaim swap space for reuse after the pages it contains have
    been paged back into physical memory. This is true even when long-running processes, such as ASE, are
    still running. In AIX 5.3, AIX 6.1, and AIX 7.1, if the initial paging space is adequately sized (usually 8 GB
    to 16 GB for large memory systems), you do not have to worry about running out of swap space unless
    memory is actually overcommitted. Several tunable parameters control the reclamation of swap space, but
    the defaults are usually satisfactory. It is a better practice to use a monitoring tool, commercial or
    home-grown, to alert the system administrator to the use of swap space, than to create a huge swap space.

    In AIX 5L V5.2, when paging occurs, occupied swap space is not released until the owner process
    terminates, even if the pages are paged back. This creates a small risk that long-running processes might
    run out of swap space. For the sake of performance, the best defense against this is to avoid paging
    altogether by using the virtual-memory tuning recommendations that have just been offered. Another
    option is to recalculate the memory needed for the ASE instance (max memory) and for the operating
    system and for related batch and maintenance processing. You can then pin this recalculated ASE max
    memory. Consider this as a last resort for AIX 5.2, which does not change the general advice against
    pinning*.

    * AIX 5.2 is out of support with the exception of the AIX 5.2 workload partition (WPAR) supported under AIX 7.1. These
    remarks about AIX 5.2 paging are retained for the assistance of administrators who might be running legacy environments
    not yet migrated to supported versions of AIX.

    Processor considerations

    The observations and recommendations in this section apply to the ASE 15.7 threaded kernel and process
    mode except where noted. Additional topics have been added to this section to address the new features
    of the threaded kernel, configuration parameters, and monitoring specific to the threaded kernel.

    Variable Capacity Weight : -

    Minimum Capacity : 1.00

    Maximum Capacity : 8.00

    Capacity Increment : 1.00

    Maximum Physical CPUs in system : 16

    Active Physical CPUs in system : 16

    ..

    lparstat -i partial listing illustrating a shared-processor POWER6 LPAR with two SMT threads and three virtual processors and an entitled capacity of 1

    Node Name : atssg54

    Partition Name : vios2_atssg54_VM1_Sybase1

    Partition Number : 5

    Type : Shared-SMT

    Mode : Uncapped

    Entitled Capacity : 1.00

    Partition Group-ID : 32773

    Shared Pool ID : 0

    Online Virtual CPUs : 3

    Maximum Virtual CPUs : 8

    Minimum Virtual CPUs : 1

    ............................................................

    Listing 5: Using the lparstat -i command to obtain processor information in dedicated and shared environments
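
    The "name : value" layout of lparstat -i output is straightforward to parse when the processor
    configuration needs to be recorded or compared across LPARs. The sketch below reads a few lines taken
    from the shared-processor example in Listing 5 and derives the engines = virtual processors starting
    point for a shared-processor LPAR. The function name is hypothetical.

```python
SAMPLE_LPARSTAT_I = """\
Type                 : Shared-SMT
Mode                 : Uncapped
Entitled Capacity    : 1.00
Online Virtual CPUs  : 3
Maximum Virtual CPUs : 8
Minimum Virtual CPUs : 1
"""


def parse_lparstat_i(text):
    """Map each 'name : value' line to name -> value (both as stripped strings)."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


info = parse_lparstat_i(SAMPLE_LPARSTAT_I)
starting_engines = int(info["Online Virtual CPUs"])
print(f"{info['Type']}, entitled capacity {info['Entitled Capacity']}, "
      f"starting point: {starting_engines} engines")
```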

    General sizing guidelines

    There is a temptation, in particular when consolidating systems or migrating from slower processors to

    faster (for example, POWER6 and POWER7) processor-based environments, to keep the previously

    assigned processor configuration and engine allocation. This often leads to idle ASE engines and wasted

    processors. In earlier generation reduced instruction set computer (RISC) processors, using more active

    ASE engines than the number of physical processors proved detrimental to performance, throughput, and

    response times. Since the introduction of POWER6 processor-based systems in 2008, allocating more

    engines than cores (or virtual processors) has become commonplace. One active ASE engine per physical

    processor, or core, stands as a sizing guideline for POWER5 processor-based systems and as a starting

    point for sizing POWER6, POWER7 and POWER7+ processor-based systems. The corresponding

    starting point for sizing ASE in a shared processor environment, engines = virtual processors, is

    explained in the section on shared-processor LPARs. Refer to the SMT and online engines section for

    further discussion of the ratio of engines to processors.

    Some environments, by contrast, benefit from the reverse: running fewer ASE engines than there are

    processors. Other tasks (applications and OS processes) require processor capacity. This includes native

    threads, for example, SAP Sybase Real-Time Data Service (RTDS) and the asynchronous I/O servers for

    asynchronous I/O processes on the file systems. The latter is often forgotten, which leads to poor

    performance in high-workload situations. For this reason, administrators must take asynchronous I/O

    servers and other processes into account when allocating processor resources. It is up to you to test and

    decide what is most beneficial to your organization.

    An ASE that uses raw devices makes little use of the asynchronous I/O subsystem. So, if there are no
    other substantial applications running on the LPAR, engines=cpus (engines=virtual processors) is a safe
    setting. When migrating a relatively constant workload from slower processors to faster processors, the
    best practice is to reduce the number of online engines proportionally to the faster clock speed while
    taking account of the requirements of user connections and physical I/O. Then, add engines and
    processors while monitoring peak and sustained utilization to ensure that ample headroom is available.

    Given the very high clock speeds of the POWER6, POWER7 and POWER7+ processors, ASE engine
    allocation presents two special challenges: avoiding high rates of idling and avoiding spinlock contention.
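
    The proportional reduction described for migrations can be written down as a one-line calculation. The
    sketch below is only a first estimate: the function name and the clock speeds in the example are
    hypothetical, and the result still has to be adjusted for user connections, physical I/O, and observed
    utilization.

```python
def starting_engines(old_engines, old_clock_mhz, new_clock_mhz):
    """Scale the engine count by the clock-speed ratio, rounding up, with a minimum of 1."""
    scaled = -(-old_engines * old_clock_mhz // new_clock_mhz)  # integer ceiling division
    return max(1, scaled)


# Hypothetical migration: 8 engines at 1900 MHz moving to 3800 MHz processors.
print(starting_engines(8, 1900, 3800))
```

    Rounding up errs on the side of headroom. Engines can then be added while peak and sustained
    utilization are monitored.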

    Idling engines (ASE 15.7 process mode, ASE 15.0.3, ASE 15.5)

    ASE engines are peers; that is, each is responsible for its own network and disk I/O processes. Through

    ASE 15.7 in process mode, an engine that has started an I/O method must also manage it (from polling to

    I/O completion and delivering the I/O event to the sleeping task). As an ASE engine starts idling (for

    example, when there are no tasks on local or global run queues that the engine can run) it does not yield

    the processor immediately. Instead, the ASE engine tries to stay on the processor, as this reduces the

    additional processor load of frequently rescheduling the engine. Therefore, instead of yielding, the engine

    looks for work to do by continuously checking for network I/O, completed disk I/O processes, and other

    tasks to run from the run queues. On the AIX operating system, the completion of disk I/O methods does

    not require a system call, but polling the network does.

    For a busy ASE engine, user time typically far outweighs system time, but an idling engine reverses this
    ratio; the data server process consumes a significant amount of system time and a marginal amount of
    user time. The AIX trace report tprof reports a very high rate of select() calls when engines idle. Use the
    ASE sp_sysmon system procedure together with vmstat and TOPAS_NMON to troubleshoot the
    throughput issue. In an extreme case, you might witness 100% processor utilization at the OS level (with a
    high percentage of system processor usage) and only 25% average processor busy at the ASE level. If
    the system is improperly sized, then the system processor time consumed in polling might actually
    degrade the overall performance, even on a dedicated ASE server. This is because the polling engines
    can prevent engines with work to do from being dispatched as soon as they are runnable. SAP Sybase
    ASE has a configuration parameter, runnable process search count (RPSC), which controls the number
    of times an idle engine polls before yielding the processor. Reducing runnable process search count is

    intended to reduce the amount of kernel processor consumed by idling engines. The use of this

    parameter in conjunction with engine count is discussed in the following section.
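
    The symptom described above (high OS-level processor use dominated by system time, combined with
    low engine busy figures from sp_sysmon) can be turned into a simple screening test. The sketch below is
    a heuristic for illustration only: the function name and the 30-point gap threshold are assumptions, not
    values taken from ASE or AIX documentation.

```python
def looks_like_idle_polling(os_user_pct, os_sys_pct, ase_busy_pct, min_gap_pct=30):
    """Flag samples where system time dominates and the OS looks far busier than ASE reports."""
    os_busy_pct = os_user_pct + os_sys_pct
    return os_sys_pct > os_user_pct and (os_busy_pct - ase_busy_pct) >= min_gap_pct


# Extreme case from the text: the OS shows 100% busy, mostly system time; ASE reports 25% busy.
print(looks_like_idle_polling(os_user_pct=35, os_sys_pct=65, ase_busy_pct=25))
# A genuinely busy server: user time dominates and ASE agrees with the OS.
print(looks_like_idle_polling(os_user_pct=80, os_sys_pct=15, ase_busy_pct=90))
```

    A positive result is a prompt to review the number of online engines and the runnable process search
    count setting, not a diagnosis in itself.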

    After ASE has started all its configured engines (that is, the number specified by number of engines at
    startup), administrators can bring additional ASE engines online dynamically. The ASE configurable
    parameter number of engines at startup defines the number of engines that are booted during startup. If
    an administrator configures additional engines through the ASE max online engines parameter, these
    new ASE engines can be brought online by using the ASE sp_engine 'online' command.

    Note 1: It is preferable to bring ASE engines online when the system is idle or nearly idle.

    Note 2: ASE supports taking engines offline through the ASE procedure sp_engine 'offline' [, ].
    However, this does not always succeed because the designated engine might hold
    resources that cannot be released. You can use the can_offline argument to sp_engine to find out
    whether it is possible to take an engine offline.

    ASE engine configuration for AIX and runnable process search count (ASE 15.7 process mode, ASE 15.0.3, ASE 15.5)

    Earlier in this discussion of processor best practices for SAP Sybase ASE, idling engines were
    discussed and runnable process search count was mentioned. To decrease the amount of
    processor time that idling ASE engines consume, it is sometimes suggested that the DBA set the
    runnable process search count configuration parameter to a lower number than the default of 2000. An
    engine that cannot find a task to run in local or global queues will only loop that lower number of times
    looking for work, instead of the default 2000 times. A commonly suggested value is 3, which was the
    default value in an earlier version of ASE. Notice that the value of runnable process search count=0
    has the reverse effect, preventing the engine from ever yielding.

    There are several reasons for concern about decreasing the runnable process search count

    (RPSC) parameter:

    Reducing RPSC can delay the completion of pending network I/O processes. You can directly

    observe this through vmstatand other OS monitors as a significant increase in I/O waittime that

    is attendant to reduced RPSC. I/O wait is defined as the percentage of time the processor is idle

    while one or more I/O tasks are pending. Reducing RPSC causes the ASE engines to spend more

    time sleeping, rather than polling for I/O requests.

    No value other than the default works consistently. Some sites use 1000, others use 100, while

    others use 3. It is a matter of trial and error to find an RPSC that improves throughput.

    The net effect of reducing RPSC, at best, is less than that obtained by reducing the number of

    engines online for the given volume of work. For example, an engine with at least one pending

    asynchronous I/O will repeatedly check for completion of the I/O, even after RPSC is exhausted.

    Field experience with ASE and AIX 5L and AIX 6.1 on the POWER6 processor-based architecture

    generally shows degradation of performance when RPSC is reduced from 2000.

    Note: For ASE 15 Cluster Edition, the default RPSC is 3. It is recommended to increase this

    configuration value. Cluster Edition does not use a native thread for cluster interprocess

    communication (IPC) on the AIX platform. If the engine is sleeping, it would delay cluster IPC

    messages and result in decreased performance. This is especially true when the second instance

    is passive. If no work is going on, a lower RPSC will have the engine sleeping more often and

    messages from the primary instance will be delayed.

    On the other hand, there are two convincing cases for testing the reduction of RPSC to reduce idle

    polling and benefit standalone ASE performance:

    The case of an I/O bound workload

    The case of a workload with extreme fluctuations of activity, as between 20% and 100% processor

    busy

    Field and lab results suggest that reducing RPSC is more effective on the POWER7 processor-based

    architecture than it was on POWER6, but the same results confirm the best practice of achieving an

    optimal engine count with RPSC left at the default of 2000 before attempting to reduce RPSC. Put

    another way, reducing RPSC does not compensate for the degraded performance of an excessive

    number of online engines.

    Spinlock contention

    Developers and administrators who work with database management systems (DBMSs) are well

    aware of the need to manage concurrency. The DBMS itself manages concurrency by enforcing

    isolation levels and protecting data, typically through locking. The same risks and needs are present

    for hardware and operating systems. The potential issues are managed through constructs such as

    semaphores and spinlocks. The remedies and potential issues are collectively termed mutual

    exclusion.

    The semaphore is used for long-term locks, and spinlocks are used exclusively for short-term locks. Tasks

    waiting for a semaphore typically sleep, whereas tasks that are waiting for a spinlock behave very

    differently. As the term spinlock indicates, a task that is waiting for a spinlock hardly sleeps,

    but continuously polls whether the spinlock is available to grab. Hence, spinlock contention

    manifests itself through increased processor usage without increased throughput.
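
    The difference between a sleeping wait and a spinning wait can be sketched in a few lines. The

    following is a minimal illustration, not how ASE implements its spinlocks: the SpinLock class and

    the run() helper are hypothetical, and the spin counter is only approximate because it is updated

    without synchronization.

```python
import threading


class SpinLock:
    """Busy-waiting lock: a waiter keeps polling instead of sleeping."""

    def __init__(self):
        self._lock = threading.Lock()
        self.spins = 0  # failed polls (approximate under contention)

    def acquire(self):
        # Non-blocking attempt in a loop: the waiter stays on the processor.
        while not self._lock.acquire(blocking=False):
            self.spins += 1

    def release(self):
        self._lock.release()


def run(acquire, release, workers=4, iterations=1000):
    """Increment a shared counter under the given lock from several threads."""
    counter = [0]

    def work():
        for _ in range(iterations):
            acquire()
            counter[0] += 1
            release()

    threads = [threading.Thread(target=work) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter[0]


spin = SpinLock()
sleeping = threading.Lock()          # a blocked waiter sleeps until the lock is free
spin_total = run(spin.acquire, spin.release)
sleep_total = run(sleeping.acquire, sleeping.release)
print(spin_total, sleep_total, "failed polls while spinning:", spin.spins)
```

    Both variants protect the counter equally well. The difference is that every failed poll in the

    spinning variant consumes processor time without advancing the work, which is the symptom

    described above.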

    The high clock speed of the POWER6, POWER7, and POWER7+ processors might aggravate

    spinlock contention. This further underscores the need to diagnose and tune for spinlock contention.

    There are hundreds of spinlocks in ASE, and only object manager and data cache spinlocks are

    reported in default sp_sysmon. The sp_sysmon dumpcounters output is notoriously hard to

    interpret, and SAP Sybase support maintains a spinmon utility, which organizes the dumpcounters

    output by ranking the busiest spinlocks by their rate of contention. In addition, ASE 15.7 ESD2 and

    above provide SAP Sybase ASE monitoring and diagnostic agent spinlock monitoring. The most

    effective first measure in reducing spinlock contention is, if possible, to reduce engine count.

    Additional measures include partitioning data caches and binding hot object descriptors in the

    metadata cache. You can find the general guidelines for ASE spinlock tuning in the ASE Performance

    and Tuning Guides.
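
    Ranking spinlocks by rate of contention amounts to dividing the number of times a spinlock had to

    be waited for by the number of times it was requested. The sketch below shows that arithmetic only.

    The counter values, the choice of grabs and waits as inputs, and the rank_by_contention() helper are

    hypothetical, and the sketch does not reproduce the spinmon utility or its output format.

```python
def rank_by_contention(counters):
    """counters maps spinlock name -> (grabs, waits).

    Returns a list of (name, contention_percent), busiest first.
    """
    ranked = []
    for name, (grabs, waits) in counters.items():
        pct = 100.0 * waits / grabs if grabs else 0.0
        ranked.append((name, round(pct, 1)))
    return sorted(ranked, key=lambda item: item[1], reverse=True)


# Hypothetical sample counters, for illustration only.
sample = {
    "object manager": (400_000, 8_000),
    "default data cache": (1_000_000, 120_000),
    "procedure cache": (250_000, 500),
}

ranking = rank_by_contention(sample)
for name, pct in ranking:
    print(f"{name:20s} {pct:5.1f} %")
```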

    ASE 15.7 and the threaded kernel

    ASE 15.7 ESD4 (current release at the time of publishing this paper) has two requirements on AIX which

    do not apply to earlier releases. First, ASE 15.7 ESD4 requires AIX 6.1 Technology Level 7 or above.

    Second, ASE 15.7 requires the activation of the I/O completion port (IOCP) subsystem. IOCP is explained in

    general in the present section, and the precise installation requirements are provided in Appendix C.

    The ASE 15.7 threaded kernel is extensively documented by SAP Sybase* and for the purposes of this

    white paper we will focus on those features which are pertinent to implementation on AIX and Power

    Systems.

    The threaded kernel is enabled by default in ASE 15.7, but the traditional process kernel in which each

    engine is a separate process and does its own disk and network I/O polling is still available. The kernel

    mode is a configuration parameter which can be changed through sp_configure kernel mode, 0,

    [threaded, process] to enable one or the other mode after an ASE restart. The running kernel mode is

    displayed by sp_configure kernel mode without arguments, and by select @@kernelmode. Changing

    kernel mode requires a restart of ASE.

    The threaded kernel implements the traditional ASE engine architecture as a single OS process which

    manages three or more thread pools. The threads are native threads as distinct from Portable Operating

    System Interface (POSIX) threads. The running data server presents itself at the OS level as a single

    process, rather than a separate process for each engine.

    A particular advantage of the threaded kernel is in having less context to restore when a thread is

    dispatched, as compared with a process. Thus, there should be less latency with dispatching an engine on

    a different core or virtual processor.

    The threaded kernel implements another level of asynchronous I/O: engines will initiate network sends,

    Component Integration Services (CIS) requests and disk I/O requests, then hand off the handling of polling

    for the completion of those requests to dedicated network, disk, and CIS threads. This releases the engine

    to handle other runnable tasks. The engine in the threaded kernel is no longer responsible for completing

    an I/O it has dispatched, nor is it responsible for completing a network request that was assigned to it.

    When an I/O completes, the waiting task can be run by the next available engine, not just its initiator.

    The threaded kernel implements the I/O completion port API to optimize asynchronous I/O. An

    input/output completion port object is created and associated with a number of sockets or file handles.

    When I/O services are requested on the object, completion is indicated by a message queued to the I/O

    completion port. A process requesting I/O services is not notified of completion of the I/O services, but

    instead checks the I/O completion port's message queue to determine the status of its I/O requests. The

    I/O completion port manages multiple threads and their concurrency.* Details on setting up IOCP are

    provided in Appendix C.
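
    The completion-port pattern described above can be modeled conceptually with an ordinary queue.

    The sketch below is not the AIX IOCP API. The completion_port queue and the do_io() function are

    stand-ins invented for the illustration, and the "I/O" is simulated in memory.

```python
import queue
import threading

# Stands in for the message queue of an I/O completion port.
completion_port = queue.Queue()


def do_io(request_id, payload):
    """Simulated I/O service: does the work, then posts a completion message."""
    result = payload.upper()                  # pretend this is the data that was read
    completion_port.put((request_id, result))


# The requester starts several I/Os and is not notified of each completion.
requests = {1: "page a", 2: "page b", 3: "page c"}
workers = [threading.Thread(target=do_io, args=item) for item in requests.items()]
for w in workers:
    w.start()

# Instead, it checks the completion queue to learn which requests have finished.
completed = {}
while len(completed) < len(requests):
    request_id, result = completion_port.get(timeout=5)
    completed[request_id] = result

for w in workers:
    w.join()

print(sorted(completed.items()))
```

    The point of the pattern is that completions may arrive in any order and can be collected by

    whichever thread checks the queue, rather than by the thread that issued the request.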

    The ASE 15.7 kernel supports four types of thread pools:

    syb_default_pool, which contains general purpose engines

    syb_system_pool, which contains the system, ctlib (CIS), disk, and network controller threads

    syb_blocking_pool, which is dedicated to running long blocking calls

    User defined pools, which contain engines normally used in conjunction with execution classes,

    and which are typically bound to an execution class

    * Introduction to the ASE 15.7 thread kernel can be found in the ASE 15.7 New Features Guide at:

    http://adaptiveserver.de/wp-content/uploads/2012/01/asenewfeat.pdf

    and in the following two presentations by Peter Thawley:

    ASE 15.7 technical overview -- Peter Thawley

    https://websmp104.sap-ag.de/~sapidp/011000358700001121852012E/assets/sybase_ase_techovervw.pdf

    ASE 15.7 Technical look inside the "hybrid kernel" -- Peter Thawley

    http://www.sybase.com/files/Product_Overviews/ASE-15.7-New-Threaded-Kernel.pdf

    * The definition of IOCP is courtesy of Wikipedia.

    Execution classes are implemented through a legacy ASE feature, the Logical Process Manager, which

    allows users, groups, or named applications to be bound to one or more engines and assigned priorities.

    In earlier ASE releases, engines bound to execution classes were able to run tasks other than those

    assigned by the execution class. In the current implementation, these engine threads are entirely

    dedicated to the execution class to which they are bound. This new implementation promises to make

    execution classes more useful.

    Engine pools are managed through three new commands, each having several options: create, alter, and

    drop thread pool. The number of disk or network controller threads is controlled by the sp_configure or

    configuration file parameters number of disk tasks (default 1) and number of network tasks (default 1).

    The number of engines that can be brought online in syb_default_pool plus any user-defined pools is

    limited, as in earlier ASE, by the configuration parameter max online engines. The parameter number of engines

    at startup is ignored by the threaded kernel, which looks at the number of engines assigned to the thread

    pools instead. As noted earlier in the ASE 15.7 update: kernel resource memory section, underallocation

    of kernel resource memory might unintentionally limit the number of engines that can be brought online.
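
    The limit described above is a simple sum across pools. The following sketch checks a planned pool

    layout against max online engines. The pool sizes and the user pool name batch_pool are made up for

    the example, and the helper is not part of any ASE tool.

```python
def check_engine_budget(max_online_engines, pools):
    """pools maps thread pool name -> number of engine threads assigned.

    Returns (total engines requested, whether the total fits the limit).
    """
    total = sum(pools.values())
    return total, total <= max_online_engines


# Hypothetical layout: the default pool plus one user-defined pool.
pools = {"syb_default_pool": 6, "batch_pool": 2}

total, fits = check_engine_budget(8, pools)
print(total, fits)

# Growing the user pool by one engine would exceed the same limit.
grown = dict(pools, batch_pool=3)
print(check_engine_budget(8, grown))
```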

    The ASE 15.7 kernel has five configurable features:

    The facility to create user thread pools and bind them to execution classes

    The engine thread count, in syb_default_pool or in a user defined pool

    The network and disk controller thread count

    The idle timeout, a microsecond counter that determines when an engine yields the processor,

    defaulting to 100 microseconds

    The allocation of processor at the OS level, dedicated or virtualized, with respect to the thread

    count

    Owing to the complete redesign of I/O handling in the threaded kernel, familiar configuration parameters in

    earlier ASE releases (runnable process search count and I/O polling process count) do not apply to

    the threaded kernel. This is because the polling process that an idling engine went through before

    voluntarily yielding the processor (governed by runnable process search count) has been replaced by

    the default 100 microsecond idle loop before yielding. In ASE 15.7, the polling is performed by the disk

    and network controller threads, instead of the engine. Ctlib I/O, used by replication agents and CIS proxy

    and remote procedure calls, is handled by its own thread in syb_system_pool.

    The threaded kernel changes the way in which ASE dispatches I/O requests, the way it handles completed

    I/O requests, and the way it implements the engine's searching for work and idle cycles.

    In the process kernel, the engine itself submits network and disk read and write calls, and the engine polls

    for completed network and disk I/Os.

    In the threaded kernel, ASE uses the IOCP ReadFile() and WriteFile() calls for network operations. It calls

    CreateIoCompletionPort() for the disk I/O when running in threaded mode, but still uses aio_read() and

    aio_write() to issue the async I/O. Polling is done by calling the IOCP routine,

    GetMultipleCompletionStatus().

    In the process kernel, an engine with no runnable task checks the global queue for tasks it can take over,

    then checks other engines' queues for eligible tasks to take over, and polls for completed network I/O and disk

    I/O. If no work is found in the number of these cycles specified by runnable process search count (default

    2000), the engine issues a blocking network I/O that causes it to be descheduled. Eligible tasks for a process engine

    are restricted due to the requirement that the initiating engine complete its own network and disk I/Os.

    In the threaded kernel, this restriction no longer exists, so any engine can take over any runnable task.

    When an I/O completion is received from IOCP by the network or disk controller thread, the task sleeping

    on that I/O is set to runnable, and the next available engine dispatches it. The threaded engine without a

    runnable task still checks the global and engine queues for eligible tasks, but it does not loop looking for

    network and disk I/O completions. Instead, the threaded engine loops looking for runnable tasks for the

    number of microseconds specified by the idle timeout parameter before issuing

    pthread_cond_timedwait(), which yields the processor.
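
    The idle behavior of a threaded engine can be sketched with a condition variable. The model below is

    an analogy only: Python's Condition.wait_for() plays the role of pthread_cond_timedwait(), and the

    function name, the result labels, and the sleep interval are assumptions made for the example.

```python
import threading
import time


def idle_engine(run_queue, cond, idle_timeout_us=100, sleep_s=0.05):
    """Spin briefly looking for a runnable task, then sleep on a condition.

    Returns 'found_while_spinning', 'woken' (sleep ended by a notify),
    or 'timed_out' (slept for the whole interval).
    """
    deadline = time.perf_counter() + idle_timeout_us / 1_000_000
    while True:
        if run_queue:                         # a runnable task is available
            return "found_while_spinning"
        if time.perf_counter() >= deadline:   # idle timeout used up
            break
    with cond:
        # Give up the processor until notified or until the timeout expires.
        notified = cond.wait_for(lambda: bool(run_queue), timeout=sleep_s)
    return "woken" if notified else "timed_out"


cond = threading.Condition()

r_busy = idle_engine(["task"], cond)          # work is already queued
r_idle = idle_engine([], cond)                # nothing ever arrives

late_queue = []


def make_runnable():
    """Simulates a controller thread marking a task runnable."""
    with cond:
        late_queue.append("task")
        cond.notify()


threading.Timer(0.2, make_runnable).start()
r_late = idle_engine(late_queue, cond, sleep_s=2.0)

print(r_busy, r_idle, r_late)
```

    Unlike the search count in the process kernel, the spin phase here is bounded by elapsed time rather

    than by a number of iterations, so its processor cost does not depend on how fast each iteration runs.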

    The changes to thread implementation, I/O handling, and idling in the threaded kernel have significant

    implications for performance. Handing off network and disk I/O to dedicated threads has the potential of

    freeing the engine to run any task that is not waiting on I/O, a potential benefit to computationally-intensive

    workload. The elimination of the engine's polling loop, implemented in AIX by the system call select(), has

    the potential to reduce a significant source of additional processor load when engines are mostly idle. But

    because of the brevity of idle timeout, the frequency of yields might be much higher in the threaded than in

    the process kernel. This frequency of yields and reschedules is compensated by the lighter context that

    has to be restored or migrated when an engine thread is rescheduled. Finally, it might be anticipated that

    an I/O bound workload realizes less benefit from handing off I/O if it lacks tasks to run that are not also

    waiting on I/O.

    The main new tuning tasks presented by the threaded kernel are:

    To determine whether engine count and processor allocation play the same way as they did in

    earlier ASE: how do you tell whether you have too few or too many engines?

    To determine when more than the default one of each of network and disk controller threads are

    needed

    To determine when idle timeout needs to be adjusted and by how much

    To understand the new metrics in sp_sysmon that are used to make the above determinations

    ASE 15.7 threaded kernel: New sp_sysmon metrics

    Engine utilization

    This report differs from the process-mode sp_sysmon Engine Busy Utilization report in two ways. CPU busy in

    the process version is broken out as user busy and system busy in the threaded version. And, in the

    threaded version, I/O busy, instead of reporting for each engine its own idle time with an I/O pending,

    reports an engine's idle time with an I/O pending for any engine. This is related to the fact that any

    engine can reschedule a task blocking on an I/O when that I/O completes.

    The first example is a processor-intensive workload:

    Engine Utilization (Tick %) User Busy System Busy I/O Busy Idle

    ThreadPool : syb_default_pool

    Engine 0 90.3 % 4.8 % 4.7 % 0.2 %

    Engine 2 90.8 % 3.9 % 5.2 % 0.1 %

    Engine 3 89.4 % 5.6 % 4.9 % 0.1 %

    Engine 4 91.2 % 4.6 % 4.2 % 0.0 %

    Engine 5 88.4 % 6.7 % 4.9 % 0.0 %

    ------------------------- ------------ ------------ ---------- ----------

    Pool summary Total 450.2 % 25.5 % 23.9 % 0.3 %

    Average 90.0 % 5.1 % 4.8 % 0.1 %

    The next example is during a dbcc textalloc maintenance job with very low engine utilization and a

    high value for idle time with some I/O pending. This does not necessarily reflect an I/O bottleneck:

    Engine Utilization (Tick %) User Busy System Busy I/O Busy Idle

    ------------------------- ------------ ------------ ---------- ----------

    ThreadPool : syb_default_pool

    Engine 0 3.5 % 0.1 % 93.9 % 2.5 %

    Engine 1 2.6 % 0.2 % 94.7 % 2.5 %

    Engine 2 2.7 % 0.1 % 94.7 % 2.4 %

    Engine 3 3.6 % 0.0 % 93.9 % 2.6 %

    Engine 4 2.2 % 0.1 % 95.2 % 2.5 %

    Engine 5 4.3 % 0.1 % 93.2 % 2.4 %

    Engine 6 3.2 % 0.1 % 94.2 % 2.5 %

    Engine 7 2.7 % 0.1 % 94.6 % 2.6 %

    Engine 8 4.0 % 0.1 % 93.6 % 2.3 %

    ------------------------- ------------ ------------ ---------- ----------

    Pool summary Total 28.8 % 0.8 % 848.1 % 22.3 %

    Average 3.2 % 0.1 % 94.2 % 2.5 %
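
    The pool summary lines can be cross-checked from the per-engine rows. The sketch below recomputes

    the totals and averages for the second report above. The rows are copied from that report, while

    the pool_summary() helper is written for this illustration. Recomputed totals can differ from the

    printed ones in the last digit, presumably because sp_sysmon sums unrounded values.

```python
# engine: (user busy, system busy, I/O busy, idle), from the dbcc textalloc example
rows = {
    0: (3.5, 0.1, 93.9, 2.5),
    1: (2.6, 0.2, 94.7, 2.5),
    2: (2.7, 0.1, 94.7, 2.4),
    3: (3.6, 0.0, 93.9, 2.6),
    4: (2.2, 0.1, 95.2, 2.5),
    5: (4.3, 0.1, 93.2, 2.4),
    6: (3.2, 0.1, 94.2, 2.5),
    7: (2.7, 0.1, 94.6, 2.6),
    8: (4.0, 0.1, 93.6, 2.3),
}


def pool_summary(rows):
    """Return (totals, averages) per column, rounded to one decimal."""
    n = len(rows)
    totals = [round(sum(col), 1) for col in zip(*rows.values())]
    averages = [round(t / n, 1) for t in totals]
    return totals, averages


totals, averages = pool_summary(rows)
print("Total  ", totals)
print("Average", averages)

# Each engine's four columns account for all of its time.
assert all(abs(sum(r) - 100.0) < 0.11 for r in rows.values())
```

    The averages confirm the reading given above: the engines are idle with I/O pending about 94% of

    the time and are busy only about 3% of the time.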

    Full Sleeps 29.4 0.0 3524 14.6 %

    Interrupted Sleeps 10.1 0.0 1206 5.0 %

    Engine 3

    Full Sleeps 30.7 0.0 3682 15.3 %

    Interrupted Sleeps 10.4 0.0 1249 5.2 %

    Engine 4

    Full Sleeps 29.1 0.0 3494 14.5 %

    Interrupted Sleeps 10.8 0.0 1295 5.4 %

    Engine 5

    Full Sleeps 30.2 0.0 3626 15.0 %

    Interrupted Sleeps 10.5 0.0 1255 5.2 %

    ------------------------- ------------ ------------ ----------

    Pool Summary 201.2 0.0 24141

    In the second, processor-intensive example, the engines never get through a full sleep, in this case

    indicating the need for more engines. In other cases, a high rate of interrupted sleeps might indicate a

    need to increase engine idle timeout.

    CPU Yields by Engine per sec per xact count % of total

    ------------------------- ------------ ------------ ---------- ----------

    ThreadPool : syb_default_pool

    Engine 0

    Full Sleeps 0.0 0.0 0 0.0 %

    Interrupted Sleeps 31.4 0.0 3765 17.8 %

    Engine 1

    Full Sleeps 0.0 0.0 0 0.0 %

    Interrupted Sleeps 29.9 0.0 3591 17.0 %

    Engine 2

    Full Sleeps 0.0 0.0 0 0.0 %

    Interrupted Sleeps 30.2 0.0 3629 17.1 %

    Engine 3

    Full Sleeps 0.0 0.0 0 0.0 %

    Interrupted Sleeps 31.9 0.0 3825 18.1 %

    Engine 4

    Full Sleeps 0.0 0.0 0 0.0 %

    Interrupted Sleeps 29.7 0.0 3564 16.8 %

    Engine 5

    Full Sleeps 0.0 0.0 0 0.0 %

    Interrupted Sleeps 29.6 0.0 3554 16.8 %

    ------------------------- ------------ ------------ ----------

    Pool Summary 176.5 0.0 21175
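
    A convenient way to compare the two reports is the share of an engine's yields that were full

    sleeps. The sketch below computes that share from the per-second rates of Engine 3 in each report.

    The full_sleep_share() helper is written for this illustration and is not an sp_sysmon metric.

```python
def full_sleep_share(full_per_sec, interrupted_per_sec):
    """Percentage of CPU yields that ran to a full sleep."""
    total = full_per_sec + interrupted_per_sec
    if total == 0:
        return 0.0
    return round(100.0 * full_per_sec / total, 1)


# Engine 3, first report: 30.7 full and 10.4 interrupted sleeps per second.
first = full_sleep_share(30.7, 10.4)

# Engine 3, processor-intensive report: 0.0 full and 31.9 interrupted.
second = full_sleep_share(0.0, 31.9)

print(first, second)
```

    In the first report roughly three quarters of the yields are full sleeps, whereas in the

    processor-intensive report none are.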

    Thread utilization at the OS level

    This new report gives the percentage of real time that a thread spends on a processor, and estimates

    the number of processor units consumed.

    Thread Utilization (OS %) User Busy System Busy Idle

    ------------------------- ------------ ------------ ----------

    ThreadPool : syb_blocking_pool : no activity during sample

    ThreadPool : syb_default_pool

    Thread 6 (Engine 0) 89.2 % 1.3 % 9.5 %

    Thread 7 (Engine 1) 90.3 % 1.1 % 8.6 %

    Thread 8 (Engine 2) 83.5 % 1.2 % 15.3 %

    Thread 9 (Engine 3) 72.0 % 1.1 % 26.9 %

    Thread 10 (Engine 4) 88.9 % 1.3 % 9.8 %

    Thread 11 (Engine 5) 90.3 % 1.4 % 8.4 %

    Thread 12 (Engine 6) 80.5 % 1.0 % 18.5 %

    Thread 13 (Engine 7) 79.4 % 1.2 % 19.4 %

    Thread 14 (Engine 8) 89.5 % 1.3 % 9.2 %

    Thread 15 (Engine 9) 62.7 % 0.8 % 36.5 %

    Thread 16 (Engine 10) 90.2 % 1.4 % 8.4 %

    Thread 17 (Engine 11) 89.3 % 1.2 % 9.5 %

    . . .

    Thread 27 (Engine 21) 81.3 % 0.9 % 17.8 %

    Thread 28 (Engine 22) 89.2 % 1.4 % 9.4 %

    Thread 29 (Engine 23) 87.2 % 1.4 % 11.5 %

    Thread 30 (Engine 24) 82.8 % 1.2 % 16.0 %

    Thread 31 (Engine 25) 90.4 % 1.4 % 8.2 %

    Thread 32 (Engine 26) 79.8 % 1.2 % 19.0 %

    Thread 33 (Engine 27) 89.1 % 1.4 % 9.5 %

    ------------------------- ------------ ------------ ----------

    Pool Summary Total 2323.3 % 33.7 % 443.0 %

    Average 83.0 % 1.2 % 15.8 %

    ThreadPool : syb_system_pool

    Thread 1 (Signal Handler) 0.1 % 0.1 % 99.8 %

    Thread 34 (NetController) 1.2 % 1.5 % 97.3 %

    Thread 35 (DiskController) 7.9 % 15.2 % 76.9 %

    Thread 36 (Network Listener 0.0 % 0.1 % 99.9 %

    ------------------------- ------------ ------------ ----------

    Pool Summary Total 9.2 % 16.8 % 374.0 %

    Average 2.3 % 4.2 % 93.5 %

    Adaptive Server threads are consuming 23.8 CPU units.

    Throughput (committed xacts per CPU unit) : 1179.4

    In this example of a large system running a processor-intensive workload, all three thread pools (the

    engine pool syb_default_pool, syb_system_pool, and syb_blocking_pool) are reported. In this

    illustration, the engines are spending 84% of their time on the processor, while the disk controller

    thread is moderately active. The system threads contribute a little to the overall processor

    consumption. This ASE server is using about 84% of its 28 dedicated processors.
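
    One plausible reading of the CPU units figure is that it is the sum of user and system busy

    percentages over all threads, divided by 100. The sketch below applies that reading to the pool

    summary totals in the report. The formula is an assumption, since this paper does not state how

    sp_sysmon derives the value, and the cpu_units() helper is written for this illustration.

```python
# Pool Summary Total lines from the report (percent of one processor).
default_pool = {"user": 2323.3, "system": 33.7}   # syb_default_pool, 28 engines
system_pool = {"user": 9.2, "system": 16.8}       # syb_system_pool


def cpu_units(*pools):
    """Sum busy percentages across pools and express them in processors."""
    busy_pct = sum(p["user"] + p["system"] for p in pools)
    return busy_pct / 100.0


units = cpu_units(default_pool, system_pool)

# Average busy share of the 28 engine threads alone.
engine_share = (default_pool["user"] + default_pool["system"]) / 28

print(round(units, 1), round(engine_share, 1))
```

    Under this assumption the two pools add up to about 23.8 processors, which agrees with the value

    printed by sp_sysmon, and the engine threads alone average about 84% busy.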

    Context switches at the OS level

    Context switches at the OS level are another new metric provided by ASE 15.7 sp_sysmon. These

    context switches are distinct from ASE task context switches, which reflect whether a task is runnable

    or blocked on I/O, a lock or some other ASE resource. Rather, this metric is another way of looking at

    processor yields, already reported in terms of full compared to interrupted sleeps. What is measured

    here is the rate at which the ASE engine yielded the processor voluntarily by issuing

    pthread_cond_timedwait(), compared with the rate at which it used up its OS time slice or was pre-empted by the OS

    scheduler.

    Context Switches at OS per sec per xact count % of total

    ------------------------- ------------ ------------ ---------- ----------

    ThreadPool : syb_blocking_pool

    Voluntary 0.0 0.0 0 0.0 %

    Non-Voluntary 0.0 0.0 0 0.0 %

    ThreadPool : syb_default_pool

    Voluntary 79.6 0.0 9547 1.1 %

    Non-Voluntary 83.4 0.0 10010 1.2 %

    ThreadPool : syb_system_pool

    Voluntary 6713.6 0.9 805636 93.2 %

    Non-Voluntary 325.2 0.0 39025 4.5 %

    ------------------------- ------------ ------------ ---------- ----------

    Total Context Switches 7201.8 0.9 864218 100.0 %
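
    The percentage column in this report can be reproduced from the count column. The sketch below

    uses the counts from the report above. The dictionary layout and variable names are chosen for the

    illustration.

```python
# (thread pool, kind) -> context switch count, copied from the report
counts = {
    ("syb_blocking_pool", "Voluntary"): 0,
    ("syb_blocking_pool", "Non-Voluntary"): 0,
    ("syb_default_pool", "Voluntary"): 9547,
    ("syb_default_pool", "Non-Voluntary"): 10010,
    ("syb_system_pool", "Voluntary"): 805636,
    ("syb_system_pool", "Non-Voluntary"): 39025,
}

total = sum(counts.values())

# Share of each line in the total, as printed in the "% of total" column.
share = {key: round(100.0 * n / total, 1) for key, n in counts.items()}

# Overall share of voluntary switches across all pools.
voluntary = sum(n for (_pool, kind), n in counts.items() if kind == "Voluntary")
voluntary_share = round(100.0 * voluntary / total, 1)

for (pool, kind), pct in share.items():
    print(f"{pool:18s} {kind:14s} {pct:5.1f} %")
print("voluntary overall:", voluntary_share, "%")
```

    In this sample almost all context switches are voluntary yields by the threads in syb_system_pool,

    while the engine threads in syb_default_pool account for only about 2% of the total.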

    Simultaneous multithreading (POWER5, POWER6, and POWER7 with

    AIX 5.3, AIX 6.1, and AIX 7.1)

    Simultaneous multithreading (SMT) is a POWER5, POWER6, and POWER7 processor-based technology

    that goes beyond traditional symmetric multiprocessing (SMP) and concurrent multiprocessing as

    implemented in other hardware platforms, in allowing two or four threads to run the same type of operation

    (for example, integer or floating point) at the same time. Its implementation, through duplicate registers,

    added control logic and queues, presents two or four logical processors to the operating system for each

    physical or virtual processor that is allocated to the LPAR. Although the IBM SMT implementation does not

    double the performance per core, it might increase throughput by 30% (POWER5 and IBM POWER5+) to

    50% (POWER6 and POWER7). Most RDBMS workloads enjoy the benefit of SMT, but certain highly

    uniform and repetitive processes do not. The AIX environment enables SMT by default: in POWER5 and

    POWER6, this is dual-threaded mode, and in POWER7, quad-threaded mode. However, the administrator

    can change the SMT type (on/off or, in POWER7 mode, from four to two threads1) without the need for a

    reboot by using the smtctl command or the AIX System Management Interface Tool (SMIT) smitty s