RAC Frequently Asked Questions

Grid Computing / RAC

General RAC

Is it supported to install Oracle Clusterware and RAC as different users? Why does netca always create the listener which listens to public IP and not VIP only? Ct wants to use rconfig to convert a single instance to RAC but ct is using raw devices in RAC. Does rconfig support RAW? Can we designate the place of archive logs on both ASM disk and regular file system, when we use SE RAC? WARNING: No cluster interconnect has been specified. I get this error starting my RAC database, what do I do? I have changed my spfile with alter system set <parameter_name> =.... scope=spfile. The spfile is on ASM storage and the database will not start. What is Cache Fusion and how does this affect applications? What Application Design considerations should I be aware of when moving to Oracle RAC? Do we have to have Oracle Database on all nodes? Is it difficult to transition from Single Instance to RAC? What are the dependencies between OCFS and ASM in Oracle Database 10g? What software is necessary for RAC? Does it have a separate installation CD to order? What kind of HW components do you recommend for the interconnect? Is rcp and/or rsh required for normal RAC operation? Can my customer use Veritas Agents to manage their RAC database on Unix with SFRAC installed? Are jumbo frames supported for the RAC interconnect? Can I use iSCSI storage with my RAC cluster? Why does the NOAC attribute need to be set on NFS mounted RAC Binaries? We are using Transparent Data Encryption (TDE). We create a wallet on node 1 and copy to nodes 2 & 3. Open the wallet and we are able to select encrypted data on all three nodes. Now, we want to REKEY the MASTER KEY. What do we have to do? How do I check for network problems on my interconnect? Are block devices supported for OCR, Voting Disks, ASM devices? Are there any issues for the interconnect when sharing the same switch as the public network by using VLAN to separate the network? The Veritas installation document on page 219 asks for setting LD_LIBRARY_PATH_64. Should I remove this? Are Sun Logical Domains (ldoms) supported with RAC? How can a NAS storage vendor certify their storage solution for RAC? Where can I find a list of supported solutions to ensure NIC availability (for the interconnect) per platform? Can I run RAC 10g with RAC 11g? What is Standard Edition RAC? My customer has an XA Application with a RAC Database, can I do Load Balancing across the RAC instances? Is there a need to renice LMS processes in Oracle RAC 10g Release 2?


I had a 3 node RAC. One of the nodes had to be completely rebuilt as a result of a problem. As there are no backups, what is the proper procedure to remove the 3rd node from the cluster so it can be added back in? What combinations of Oracle Clusterware, RAC and ASM versions can I use? Is relink required for CRS_HOME after OS upgrade? Does Oracle Clusterware or Real Application Clusters support heterogeneous platforms? Is Infiniband supported for the RAC interconnect? Can I run more than one clustered database on a single RAC cluster? Can I run 9i RAC and RAC 10g in the same cluster? I could not get the user equivalence check to work on my Solaris 10 server when trying to install 10.2.0.1 Oracle Clusterware. The install ran fine without issue. Does changing uid or gid of the Oracle User affect Oracle Clusterware? How many NICs do I need to implement RAC? Can we output the backupset onto regular file system directly (not onto flash recovery area) using RMAN command, when we use SE RAC? Should the SCSI-3 reservation bit be set for our Oracle Clusterware only installation? A client is a new RAC user and is using it in conjunction with BEA WebLogic. Can they use Connection Load Balancing and Services? What about FCF, FAN, RCLB? Why is validateUserEquiv failing during install (or cluvfy run)? What are the restrictions on the SID with a RAC database? Is it limited to 5 characters? What storage is supported with Standard Edition RAC? What would you recommend to a customer, Oracle Clusterware or Vendor Clusterware (i.e. MC Service Guard, HACMP, Sun Cluster, Veritas etc.) with Oracle Database 10g Real Application Clusters? Can I use RAC in a distributed transaction processing environment? Is it a good idea to add anti-virus software to my RAC cluster? When configuring the NIC cards and switch for a GigE Interconnect should it be set to FULL or Half duplex in RAC?

RAC Assistance

How do I use DBCA in silent mode to set up RAC and ASM?

High Availability

How does OCR mirror work? What happens if my OCR is lost/corrupt? Is Oracle Application Server integrated with FAN and FCF? Why do we have a Virtual IP (VIP) in Oracle RAC 10g? Why does it just return a dead connection when its primary node fails? If I use Services with Oracle Database 10g, do I still need to set up Load Balancing? In Solaris 10, do we need Sun Clusterware to provide redundancy for the interconnect and multiple switches? I am receiving an ORA-29740 error. What should I do? Can RMAN backup Real Application Cluster databases? What should I do to make my RAC deployment highly available? I am using shared services with the following set in init.ora: dispatchers=(protocol=TCP)(listener=listeners_nl01)(con=500)(serv=oltp). I stopped my service with srvctl stop service but it is still registered with the listener and accepting connections. Is this expected? How do I configure FCF with BPEL so I can use RAC 10g in the backend? The client gets this error message in Production in the ons.log file every minute or so: 06/11/10 10:11:14 [2] Connection 0,129.86.186.58,6200 SSL handshake failed 06/11/10 10:11:14 [2] Handshake for 0,129.86.186.58,6200: nz error = 29049 interval = 0 (180 max) Is it possible to use SRVCTL start database with a user account other than oracle (that is, other than the owner of the oracle software)? After executing DBMS_SERVICE.START_SERVICE, the service resource remains in OFFLINE status when confirming it with crs_stat. Is that expected behavior? What are my options for load balancing with RAC? Why do I get an uneven number of connections on my instances? With three primary load balancing options (client-side connect-time LB, server-side connect-time LB, and the runtime connection load balancing), is it fair to say Runtime Connection Load Balancing is the only option to leverage FAN up/down events? How can a customer mask the change in their clustered database configuration from their client or application? (i.e. so I do not have to change the connection string when I add a node to the RAC database) What is Server-side Transparent Application Failover (TAF) and how do I use it? What is CLB_GOAL and how should I set it? Can our 10g VIP fail over from NIC to NIC as well as from node to node? What does the Virtual IP service do? I understand it is for failover but do we need a separate network card? Can we use the existing private/public cards? What would happen if we used the public IP? What do the VIP resources do once they detect a node has failed/gone down? Are the VIPs automatically acquired, and published, or is manual intervention required? Are VIPs mandatory?

High Availability -- FAN/FCF

Why am I seeing the following warnings in my listener.log for my RAC 10g environment? WARNING: Subscription for node down event still pending Will FAN work with SQLPlus? Do I need to install the ONS on all my mid-tier servers in order to enable JDBC Fast Connection Failover (FCF)? Will FAN/FCF work with the default database service? Can I use the 10.2 JDBC driver with 10.1 database for FCF? What clients provide integration with FAN through FCF? Can I use TAF and FAN/FCF? How do the datasource properties initialLimit, minLimit, and maxLimit affect Fast Connection Failover processing with JDBC? Will FAN/OCI work with Instant Client? What type of callbacks are supported with OCI when using FAN/FCF? Does FCF for OCI react to FAN HA UP events? Can I use FAN/OCI with Pro*C? Do I have to link my OCI application with a thread library? Why?

Scalability

I am seeing the wait events 'ges remote message', 'gcs remote message', and/or 'gcs for action'. What should I do about these? What are the changes in memory requirements from moving from single instance to RAC?

Will adding a new instance to my Oracle RAC database (new node to the cluster) allow me to scale the workload? What do I do if I see GC CR BLOCK LOST in my top 5 Timed Events in my AWR Report? How do I change my Veritas SF RAC installation to use UDP instead of LLT? A customer is currently using RAC in a 2 node environment. How should one review the ability to scale out to 4, 6, 8 or even more nodes? What should the requirements of a scale out test be? What is the Load Balancing Advisory? How do I enable the load balancing advisory? What are my options for setting the Load Balancing Advisory GOAL on a Service? How can I validate the scalability of my shared storage? (Tightly related to RAC / Application scalability) How many nodes are supported in a RAC Database? How do I measure the bandwidth utilization of my NIC or my interconnect? Does Database blocksize or tablespace blocksize affect how the data is passed across the interconnect? What is Runtime Connection Load Balancing?

Manageability

I found in 10.2 that the EM "Convert to Cluster Database" wizard would always fall over on the last step where it runs emca and needs to log into the new cluster database as dbsnmp to create the cluster database targets etc. I changed the password for the dbsnmp account to be dbsnmp (same as username) and it worked OK. Is this a known issue? What storage option should I use for RAC 10g on Linux? ASM / OCFS / Raw Devices / Block Devices / Ext3? How do I stop the GSD? What is the purpose of the gsd service in Oracle 9i RAC? How should I deal with space management? Do I need to set free lists and free list groups? I was installing RAC and my Oracle files did not get copied to the remote node(s). What went wrong? If I am using Vendor Clusterware such as Veritas, IBM, Sun or HP, do I still need Oracle Clusterware to run Oracle RAC 10g? Srvctl cannot start instance, I get the following error PRKP-1001 CRS-0215, however sqlplus can start it on both nodes? What is the problem? When I look at ALL_SERVICES view in my database I see services I did not create, what are they for? Does RAC work with NTP (Network Time Protocol)? I have 2 clusters named "crs" (the default), how do I get Grid Control to recognize them as targets? If using plsql native code, the plsql_native_library_dir needs to be defined. In a RAC environment, must the directory be in the shared storage? How do I determine whether or not a OneOff patch is "rolling upgradeable"? What is the Cluster Verification Utility (cluvfy)? What versions of the database can I use the cluster verification utility (cluvfy) with? What are the implications of using srvctl disable for an instance in my RAC cluster? I want to have it available to start if I need it but at this time do not want to run this extra instance for this database.

Platform Specific

How many nodes can be had in an HP-UX/Solaris/AIX/Windows/Linux cluster? How do I check RAC certification? Is crossover cable supported as an interconnect with RAC on any platform ? What is Oracle's position with respect to supporting RAC on Polyserve CFS? Is it possible to run RAC on logical partitions (i.e. LPARs) or virtual separate servers. Can the Oracle Database Configuration Assistant (DBCA) be used to create a database with Veritas DBE / AC 3.5? Is RAC on VMWare supported? Is Veritas Storage Foundation supported with RAC?

Platform Specific -- Linux

Is 3rd Party Clusterware supported on Linux such as Veritas or Redhat? Can you have multiple RAC $ORACLE_HOME's on Linux? Is there a CFS Available for Linux? After installing patchset 9013 and patch_2313680 on Linux, the startup was very slow. Is the hangcheck timer still needed with Oracle RAC 10g and 11g? Customer did not load the hangcheck-timer before installing RAC. Can the customer just load the hangcheck-timer? How do I configure raw devices in order to install 10g Clusterware on RHEL5 or OEL5? How to reorder or rename logical network interface (NIC) names in Linux? Are Red Hat GFS and GULM certified for DLM? Oracle Clusterware fails to start after a reboot due to permissions on raw devices reverting to default values. How to fix? How do I configure my RAC Cluster to use the RDS Infiniband? Can RAC 10g and 9i RAC be installed and run on the same physical Linux cluster? My customer is about to install 10202 clusterware on new Linux machines. He is getting a "No ORACM running" error when running rootpre.sh, which then exits. Should he worry about this message? Is OCFS2 certified with RAC 10g? A customer installed 10g R2 on Linux RH4 Update 2, 2.6.9-22.ELsmp #1 SMP x86_64 GNU/Linux, and got the error Error in invoking target 'all_no_orcl'. Customer ignored the error and the install succeeded without any other errors and oracle apparently worked fine. What should they do? Because of compatibility with their storage array (EMC DMX with Powerpath 4.5) they must use update 2. Oracle install guide states that RH4 64 bits update 1 "or higher" should be used for 10g R2. How to configure bonding on Suse SLES8. How to configure bonding on Suse SLES9.

Platform Specific -- Solaris

Can I configure IPMP in Active/Active to increase bandwidth of my interconnect? Does Oracle Support RAC with Solaris 10 Containers (aka Zones)? Does Sun Solaris have a multipathing solution?

Platform Specific -- HP-UX

Is HMP supported with Oracle RAC 10g on all HP platforms ?

Can I configure HP's Autoport aggregation for NIC Bonding after the install? (i.e. not present beforehand)

Platform Specific -- Windows

Does the Oracle Cluster File System (OCFS) support network access through NFS or Windows Network Shares? The OracleCRService does not start with my windows RAC implementation, what do I do? When running Oracle RAC on Windows 2003, what is the recommended OS level? How do I verify that Host Bus Adapter Node Local Caching has been disabled for the disks I will be using in my RAC cluster? My customer has a failsafe cluster installed, what are the benefits of moving their system to RAC? When I try to login to the +ASM2 on node2 with asmcmd (after setting ORACLE_HOME and ORACLE_SID correctly) I get: ORA-01031: insufficient privileges (DBD ERROR: OCI SessionBegin). When I try to login to +ASM2 using sqlplus (connect / as sysdba) I get the same ORA-01031: insufficient privileges. When I try to login to +ASM2 using sqlplus (connect sys/passwd as sysdba) I get connected successfully. Can I run my 9i RAC and RAC 10g on the same Windows cluster? My customer wants to understand what type of disk caching they can use with their Windows RAC Cluster, the install guide tells them to disable disk caching?

Platform Specific -- IBM AIX

Is HACMP needed for RAC on AIX 5.2 using GPFS file system? Do I need HACMP/GPFS to store my OCR/Voting file on a shared device. Is VIO supported with RAC on IBM AIX?

Platform Specific -- IBM-z/OS (Mainframe)

Can I run Oracle RAC 10g on my IBM Mainframe Sysplex environment (z/OS)?

Other Applications & RAC

Can I use Oracle Clusterware for failover of the SAP Enqueue and VIP services when running SAP in a RAC environment? Are Oracle Applications certified with RAC?

Diagnosability

How do I gather all relevant Oracle and OS log/trace files in a RAC cluster to provide to Support? What are the cdmp directories in the background_dump_dest used for?

EBusiness Suite with RAC

Is the Oracle E-Business Suite (Oracle Applications) certified against RAC? Can I use TAF with e-Business in a RAC environment? How to configure concurrent manager in a RAC environment? Should functional partitioning be used with Oracle Applications? Which e-Business version is preferable?

Can I use Automatic Undo Management with Oracle Applications? Is Server Side Load Balancing supported/recommended/proven technology in Oracle EBusiness Suite?

Clustered File Systems

Can I use OCFS with SE RAC? What is the optimal migration path to be used while migrating the E-Business suite to RAC? What are the maximum number of nodes under OCFS on Linux ? What files can I put on Linux OCFS? Where can I find documentation on OCFS ? What are the Best Practices for using a clustered file system with Oracle RAC? Can I use it for OCR, Voting Disk, Binaries as well as database files? What is the maximum number of nodes I can have in my cluster if I am using OCFS2? Is Red Hat GFS(Global File System) is certified by Oracle for use with Real Application Clusters? Is Sun QFS supported with RAC? What about Sun GFS? Is Linux OCFS2 (OCFS version 2) supported with RAC?

Oracle Clusterware

Can I run a 10.1.0.x database with Oracle Clusterware 10.2? In the course of failure testing in an extended RAC environment we find entries in the cssd logfile which indicate actions like 'diskShortTimeout set to (value)' and 'diskLongTimeout set to (value)'. Can anyone please explain the meaning of these two timeouts in addition to disktimeout? Customer is hitting bug 4462367 with an error message saying low open file descriptor, how do I work around this until the fix is released with the Oracle Clusterware Bundle for 10.2.0.3 or 10.2.0.4? Can I change the public hostname in my Oracle Database 10g Cluster using Oracle Clusterware? What do I do, I have a corrupt OCR and no valid backup? Is it supported to rerun root.sh from the Oracle Clusterware installation? My customer has noticed tons of log files generated under $CRS_HOME/log//client, is there any automated way we can set up through Oracle Clusterware to prevent/minimize/remove those aggressively generated files? Can I set up failover of the VIP to another card in the same machine or what do I do if I have different network interfaces on different nodes in my cluster (i.e. eth0 on node1,2 and eth1 on node 3,4)? Is it possible to use ASM for the OCR and voting disk? Is it supported to allow 3rd Party Clusterware to manage Oracle resources (instances, listeners, etc) and turn off Oracle Clusterware management of these? What is the High Availability API? How to move the OCR location? I made a mistake when I created the VIP during the install of Oracle Clusterware, can I change the VIP? Can I configure a firewall (iptables) on the cluster interconnect? Does the hostname have to match the public name or can it be anything else?

I have a 2-node RAC running. I notice that it is always node2 that is evicted when I test private network failure scenario by disconnecting the private network cable. Doesn't matter whether it is node1's or node2's private network cable that is disconnected, it is always the node2 that is evicted. What happens in a 3-node RAC cluster if node1's cable is disconnected? How do I use multiple network interfaces to provide High Availability and/or Load Balancing for my interconnect with Oracle Clusterware? Can the Network Interface Card (NIC) device names be different on the nodes in a cluster, for both public and private? Can I use Oracle Clusterware to provide cold failover of my single instance Oracle Databases? What are the licensing rules for Oracle Clusterware? Can I run it without RAC? How do I protect the OCR and Voting in case of media failure? What happens if I lose my voting disk(s)? During Oracle Clusterware installation, I am asked to define a private node name, and then on the next screen asked to define which interfaces should be used as private and public interfaces. What information is required to answer these questions? Can I change the name of my cluster after I have created it when I am using Oracle Clusterware? Which processes access the OCR? Why does Oracle Clusterware use an additional 'heartbeat' via the voting disk, when other cluster software products do not? Why does Oracle still use the voting disks when other cluster software is present? How do I identify the voting file location? How much I/O activity should the voting disk have? What is the voting disk used for? Does Oracle Clusterware have to be the same or higher release than all instances running on the cluster? Can I use Oracle Clusterware to monitor my EM Agent? When ct runs the command 'onsctl start', ct receives the message "Unable to open libhasgen10.so". Any idea why the message "unable to open libhasgen10.so"? What are the IP requirements for the private interconnect?
How to Restore a Lost Voting Disk used by Oracle Clusterware 10g How can I register the listener with Oracle Clusterware in RAC 10g Release 2? How is the voting disk used by Oracle Clusterware? Does Oracle Clusterware support application vips? Why is the home for Oracle Clusterware not recommended to be subdirectory of the Oracle base directory? How do I put my application under the control of Oracle Clusterware to achieve higher availability? With Oracle Clusterware 10g, how do you backup the OCR? Is it a requirement to have the public interface linked to ETH0 or does it only need to be on a ETH lower than the private interface?: - public on ETH1 - private on ETH2 How do I restore OCR from a backup? On Windows, can I use ocopy? What should the permissions be set to for the voting disk and ocr when doing a RAC Install?

Stretched/Extended RAC -- No Sub Category

Can I use ASM to mirror Oracle data in an extended RAC environment? How should voting disks be implemented in an extended cluster environment? Can I use standard NFS for the third site voting disk? Can a customer use SE RAC to implement an "Extended RAC Cluster" ? What are the network requirements for an extended RAC cluster? Can I use ASM as mechanism to mirror the data in an Extended RAC cluster?

Cluster Verification Utility (CVU) -- No Sub Category

Can I check if the storage is shared among the nodes? What are the default values for the command line arguments? How do I check the Oracle Clusterware stack and other sub-components of it? Is there a way to verify that the Oracle Clusterware is working properly before proceeding with RAC install? At what point cluvfy is usable? Can I use cluvfy before installing Oracle Clusterware? What is CVU? What are its objectives and features? What is a stage? What is a component? What is nodelist? Do I have to be root to use CVU? What about discovery? Does CVU discover installed components? How do I report a(or tons of) bug? What are the requirements for CVU? How do I install 'cvuqdisk' package? How do I know about cluvfy commands? The usage text of cluvfy does not show individual commands. Do I have to type the nodelist every time for the CVU commands? Is there any shortcut? How do I get detail output of a check? How do I check network or node connectivity related issues? How do I check whether OCFS is properly configured? How do I check user accounts and administrative permissions related issues? How do I check minimal system requirements on the nodes? Is there a way to compare nodes? Why the peer comparison with -refnode says passed when the group or user does not exist? How do I turn on tracing? Where can I find the CVU trace files? Why cluvfy reports "unknown" on a particular node? What are the known issues with this release? When I run 10.2 CLUVFY on a system where RAC 10g Release 1 is running I get following output: Package existence check failed for "SUNWscucm:3.1". Package existence check failed for "SUNWudlmr:3.1". Package existence check failed for "SUNWudlm:3.1". Package existence check failed for "ORCLudlm:Dev_Release_06/11/04,_64bit_3.3.4.8_reentrant". Package existence check failed for "SUNWscr:3.1". Package existence check failed for "SUNWscu:3.1".

Checking this Solaris system I don't see those packages installed. Can I continue my install? What is 'cvuqdisk' rpm? Why should I install this rpm?

Answers

I have changed my spfile with alter system set <parameter_name> =.... scope=spfile. The spfile is on ASM storage and the database will not start.

How to recover: in $ORACLE_HOME/dbs, create a pfile from the spfile:

    . oraenv
    sqlplus "/ as sysdba"
    startup nomount
    create pfile='recoversp' from spfile
    /
    shutdown immediate
    quit

Now edit the newly created pfile to change the parameter to something sensible. Then start the database with the pfile (use whatever name you chose in step one) and recreate the spfile:

    sqlplus "/ as sysdba"
    startup pfile='recoversp'
    create spfile='+DATA/GASM/spfileGASM.ora' from pfile='recoversp'
    /
    shutdown immediate
    startup
    quit

N.B. The name of the spfile is in your original init.ora, so adjust the create spfile command to suit.
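The "edit the newly created pfile" step can also be scripted. The following is only a minimal sketch in Python: the helper name set_pfile_parameter is hypothetical, and it assumes the pfile holds one sid.parameter=value entry per line (for example *.sga_target=...). The parameter and values shown are made-up illustrations, not a recommendation.

```python
import re


def set_pfile_parameter(pfile_text, name, value, sid="*"):
    """Return pfile text with `sid.name` set to `value`.

    An existing `sid.name=...` line is replaced; otherwise a new line is
    appended. Assumes one parameter per line (hypothetical helper).
    """
    pattern = re.compile(
        r"^[ \t]*" + re.escape(sid) + r"\." + re.escape(name) + r"[ \t]*=.*$",
        re.IGNORECASE | re.MULTILINE,
    )
    new_line = f"{sid}.{name}={value}"
    if pattern.search(pfile_text):
        # Use a function so the replacement is taken literally.
        return pattern.sub(lambda _match: new_line, pfile_text)
    return pfile_text.rstrip("\n") + "\n" + new_line + "\n"


# Illustrative only: a pfile whose sga_target was set to a value that
# prevents startup, corrected to a smaller (assumed) value.
broken = "*.db_name='GASM'\n*.sga_target=999999G\n"
fixed = set_pfile_parameter(broken, "sga_target", "2G")
print(fixed)
```

Editing the file by hand works just as well; the point is only that the pfile is plain text, unlike the spfile.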

Is it supported to install Oracle Clusterware and RAC as different users?

Yes, Oracle Clusterware and RAC can be installed as different users. The Oracle Clusterware user and the RAC user must both have OSDBA as their primary group, and they should both have OINSTALL as a secondary group.

WARNING: No cluster interconnect has been specified. I get this error starting my RAC database, what do I do?

It simply means that the cluster_interconnects parameter is not set and nothing was set in the OCR, so the database picks the private interconnect at random, hence the warning. You can either set the cluster_interconnects parameter in the init.ora to the private interconnect IP, or use oifcfg getif and setif (type oifcfg with no arguments for a help message):

    $ oifcfg getif
    eth0 138.2.236.0 global public
    eth2 138.2.238.0 global cluster_interconnect

What does your output look like? Note that if the hardware is not identical you will have to provide each node with its own correct value; if the hardware is identical you can use the -global switch.
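To check the oifcfg getif output on several clusters, the listing can be parsed mechanically. This is a rough sketch in Python, assuming each line has the four whitespace-separated fields shown in the sample above (interface, subnet, scope, type); the function names are hypothetical.

```python
def parse_oifcfg_getif(output):
    """Parse `oifcfg getif` text into a list of dicts.

    Assumes four whitespace-separated fields per line:
    interface, subnet, scope, interface type. Other lines are skipped.
    """
    entries = []
    for line in output.splitlines():
        fields = line.split()
        if len(fields) != 4:
            continue
        iface, subnet, scope, if_type = fields
        entries.append(
            {"interface": iface, "subnet": subnet, "scope": scope, "type": if_type}
        )
    return entries


def has_cluster_interconnect(entries):
    """True if at least one interface is registered as cluster_interconnect."""
    return any(entry["type"] == "cluster_interconnect" for entry in entries)


# Sample taken from the answer above.
SAMPLE = """\
eth0 138.2.236.0 global public
eth2 138.2.238.0 global cluster_interconnect
"""

entries = parse_oifcfg_getif(SAMPLE)
print(has_cluster_interconnect(entries))
```

If the check comes back false, nothing is registered in the OCR as the private interconnect, which is the situation the warning describes.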

Ct wants to use rconfig to convert a single instance to RAC but ct is using raw devices in RAC. Does rconfig support RAW?

No. rconfig supports ASM and shared file systems only.

Can we designate the place of archive logs on both ASM disk and regular file system, when we use SE RAC?

Yes. Customers may want to create a standby database for their SE RAC database, so placing the archive logs additionally outside ASM is OK.

Why does netca always create the listener which listens to public IP and not VIP only?

This is for backward compatibility with existing clients: consider a pre-10g to 10g server upgrade. If the upgraded listener listened only on the VIP, clients that were not upgraded would no longer be able to reach it.

Do we have to have Oracle Database on all nodes?

Each node of a cluster that is being used for a clustered database will typically have the database and RAC software loaded on it, but not actual datafiles (these need to be available via shared disk). For example, if you wish to run RAC on 2 nodes of a 4-node cluster, you would need to install the clusterware on all nodes, RAC on 2 nodes and it would only need to be licensed on the two nodes running the RAC database. Note that using a clustered file system, or NAS storage can provide a configuration that does not necessarily require the Oracle binaries to be installed on all nodes.

What kind of HW components do you recommend for the interconnect?

The general recommendation for the interconnect is to provide the highest bandwidth interconnect, together with the lowest latency protocol that is available for a given platform. In practice, Gigabit Ethernet with UDP has proven sufficient in every case it has been implemented, and tends to be the lowest common denominator across platforms.

What Application Design considerations should I be aware of when moving to Oracle RAC?

The general principle is that no fundamentally different design and coding practices are required for RAC; however, application flaws in execution or design have a higher impact in RAC. Performance and scalability in RAC are more sensitive to bad plans or bad schema design. Serializing contention makes applications less scalable. Standard SQL and schema tuning solves more than 80% of performance problems.

Some of the scalability pitfalls to look for are:
* Serializing contention on a small set of data/index blocks
  --> monotonically increasing key
  --> frequent updates of small cached tables
  --> segment without automatic segment space management (ASSM) or Free List Group (FLG)
* Full table scans
  --> Optimization for full scans in 11g can save CPU and latency
* Frequent invalidation and parsing of cursors
  --> Requires data dictionary lookups and synchronizations
* Concurrent DDL (e.g. truncate/drop)

Look for:
* Indexes with right-growing characteristics
  --> Use reverse key indexes
  --> Eliminate indexes which are not needed
* Frequent updates and reads of small tables
  --> small = fits into a single buffer cache
  --> Use sparse blocks (PCTFREE 99) to reduce serialization
* SQL which scans large amounts of data
  --> Perhaps more efficient when parallelized
  --> Direct reads do not need to be globally synchronized (hence less CPU for global cache)
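Why a reverse key index helps with a right-growing index can be shown with a small model. The sketch below is an illustration in Python, not Oracle's actual storage format: it assumes keys are fixed-width 4-byte big-endian integers (Oracle's NUMBER encoding differs) and uses the leading byte of each key as a stand-in for the part of the index an insert lands in.

```python
def key_bytes(n, width=4):
    """Encode an integer key as fixed-width big-endian bytes (simplified model)."""
    return n.to_bytes(width, "big")


def reverse_key(n, width=4):
    """Reverse the bytes of the key, as a reverse key index does per column."""
    return key_bytes(n, width)[::-1]


def distinct_leading_bytes(keys):
    """Count distinct leading bytes, a crude proxy for how widely inserts spread."""
    return len({key[0] for key in keys})


# Sixteen consecutive sequence-generated ids (a monotonically increasing key).
ids = range(1000, 1016)
normal = [key_bytes(i) for i in ids]
reversed_keys = [reverse_key(i) for i in ids]

print(distinct_leading_bytes(normal))         # all keys share one leading byte
print(distinct_leading_bytes(reversed_keys))  # each key gets a different one
```

In the normal encoding every new key sorts next to the previous one, so all instances insert into the same right-hand leaf block. After byte reversal consecutive ids sort far apart, which is what spreads the contention. The trade-off is that range scans on the key can no longer use the index order.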

What is Cache Fusion and how does this affect applications?

Cache Fusion is a parallel database architecture for exploiting clustered computers to achieve scalability of all types of applications. It is a shared cache architecture that uses the high speed, low latency interconnects available on clustered systems to maintain database cache coherency. Database blocks are shipped across the interconnect to the node where access to the data is needed. This is accomplished transparently to the application and users of the system. As Cache Fusion uses at most a 3 point protocol, it scales easily to clusters with a large number of nodes. Additional information can be found at: Understanding 9i Real Application Clusters Cache Fusion
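The "at most a 3 point protocol" claim can be illustrated with a toy model. This Python sketch is a simplification and not Oracle's implementation: it assumes each block has one master node that tracks it, plus at most one node currently holding it in cache, and the message labels (request, forward, ship block, grant) are made up for the illustration.

```python
def cache_fusion_messages(requester, master, holder=None):
    """Toy model: interconnect messages needed to satisfy one block request.

    requester: node that needs the block
    master:    node that tracks the block's state
    holder:    node whose cache currently holds the block, or None
    """
    messages = []
    if requester != master:
        messages.append((requester, master, "request"))
    if holder is not None and holder != requester:
        if holder != master:
            messages.append((master, holder, "forward"))
        messages.append((holder, requester, "ship block"))
    elif requester != master:
        # No other cache has the block: the master grants access.
        messages.append((master, requester, "grant"))
    return messages


def worst_case_messages(node_count):
    """Largest message count over every requester/master/holder combination."""
    nodes = range(node_count)
    return max(
        len(cache_fusion_messages(r, m, h))
        for r in nodes
        for m in nodes
        for h in list(nodes) + [None]
    )


for size in (2, 3, 16):
    print(size, worst_case_messages(size))
```

In this model the worst case involves three nodes and three messages, and it stays at three whether the cluster has 3 nodes or 16, because only the requester, the master and the holder ever take part.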

Is it difficult to transition from Single Instance to RAC?
If the cluster and the cluster software are not present, these components must be installed and configured. The RAC option must be added using the Oracle Universal Installer, which requires that the existing DB instance be shut down. No changes are necessary to the user data within the database. However, a shortage of freelists and freelist groups can cause contention on the header blocks of tables and indexes as multiple instances vie for the same block. This may cause a performance problem and require data partitioning, although the need for such changes should be rare. Recommendation: use automatic segment space management (ASSM), which replaces freelists and freelist groups and handles this automatically. The database requires one redo thread and one undo tablespace for each instance, which are easily added with SQL commands or with Enterprise Manager tools. Datafiles will need to be moved to either a clustered file system (CFS) or raw devices so that all nodes can access them. Also, the MAXINSTANCES parameter in the control file must be greater than or equal to the number of instances you will start in the cluster.

For more detailed information, please see Migrating from single-instance to RAC in the Oracle Documentation.

With Oracle Database 10g Release 2, the $ORACLE_HOME/bin/rconfig tool can be used to convert a single instance database to RAC. The tool takes an XML input file and converts the single instance database described in it. You can run the tool in "verify only" mode before performing the actual conversion. This is documented in the RAC admin book, and a sample XML file can be found at $ORACLE_HOME/assistants/rconfig/sampleXMLs/ConvertToRAC.xml. The tool only supports databases using a clustered file system or ASM; you cannot use it with raw devices. Grid Control 10g Release 2 provides an easy-to-use wizard to perform this function. Note: you may hit bug 4456047 (shutdown immediate hangs) as you convert the database. The bug is updated with a workaround, and the workaround is release noted as well.

What are the dependencies between OCFS and ASM in Oracle Database 10g?
In an Oracle RAC 10g environment, there is no dependency between Automatic Storage Management (ASM) and Oracle Cluster File System (OCFS). OCFS is not required if you are using ASM for database files. You can use OCFS on Windows (Version 2 on Linux) for files that ASM does not handle: binaries (shared Oracle home), trace files, etc. Alternatively, you could place these files on local file systems, even though that is less convenient given the multiple locations. If you do not want to use ASM for your database files, you can still use OCFS for database files in Oracle Database 10g. Please refer to ASM and OCFS Positioning.

Is rcp and/or rsh required for normal RAC operation?
"rcp" and "rsh" are not required for normal RAC operation. However, "rsh" and "rcp" should be enabled for RAC and patchset installation. In future releases, ssh will be used for these operations.

What software is necessary for RAC? Does it have a separate installation CD to order?
Real Application Clusters is an option of Oracle Database and therefore part of the Oracle Database CD. With Oracle 9i, RAC is part of Oracle9i Enterprise Edition. If you install 9i EE onto a cluster, and the Oracle Universal Installer (OUI) recognizes the cluster, you will be provided the option of installing RAC. Most UNIX platforms require an OSD installation for the necessary clusterware. For Intel platforms (Linux and Windows), Oracle provides the OSD software within the Oracle9i Enterprise Edition release.

With Oracle Database 10g, RAC is an option of EE and available as part of SE. Oracle provides Oracle Clusterware on its own CD included in the database CD pack. Please check the certification matrix (Note 184875.1) or with the appropriate platform vendor for more information. @ Sent by Karin Brandauer

What is Standard Edition RAC?
With Oracle Database 10g, a customer who has purchased Standard Edition is allowed to use the RAC option within the limitations of Standard Edition (SE). For licensing restrictions, read the Oracle Database 10g License Doc. At a high level, this means that you can have a maximum of 4 CPUs in the cluster and you must use ASM for all database files. NOTE: 3rd party clusterware and clustered file systems (other than ASM) are not supported. This includes OCFS and OCFS2.

Can I use iSCSI storage with my RAC cluster?
For iSCSI, Oracle has made the statement that, as a block protocol, this technology does not require validation for single instance database. There are many early adopter customers of iSCSI running Oracle9i and Oracle Database 10g. As for RAC, Oracle has chosen to validate the iSCSI technology (not each vendor's targets) for the 10g platforms; this has been completed for Linux and Windows. For Windows we have tested up to 4 nodes. Any Windows iSCSI products that are supported by the host and storage device are supported by Oracle. We don't support NAS devices for Windows; however, some NAS devices (e.g. NetApp) can also present themselves as iSCSI devices. If this is the case, then a customer can use this iSCSI device with Windows as long as the iSCSI device vendor supports Windows as an initiator OS. No vendor-specific information will be posted on Certify.

What would you recommend to a customer, Oracle Clusterware or Vendor Clusterware (e.g. MC Service Guard, HACMP, Sun Cluster, Veritas etc.), with Oracle Database 10g Real Application Clusters?
You will be installing and using Oracle Clusterware whether or not you use the Vendor Clusterware. The question you need to ask is whether the Vendor Clusterware gives you something that Oracle Clusterware does not. Is the RAC database on the same server as the application server? Are there any other processes on the same server as the database that you require Vendor Clusterware to fail over to another server in the cluster if the server they are running on fails? If this is the case, you may want the vendor clusterware. If not, why spend the extra money when Oracle Clusterware supplies everything you need for the clustered database, included with your RAC license? Note: With Oracle Database 10g Release 2, Oracle Clusterware can be used to manage application processes in the cluster (start, stop, check, relocate).

When configuring the NIC cards and switch for a GigE Interconnect should it be set to FULL or Half duplex in RAC?
You must use Full Duplex for all network communication, whether or not you run RAC. Half Duplex means you can only send OR receive at any one time, not both.

Is it a good idea to add anti-virus software to my RAC cluster?
Customers who choose to run anti-virus (AV) software on their database servers should be aware that AV software reduces disk IO bandwidth slightly, because most AV software checks disk reads and writes. Also, as the AV software runs, it uses CPU cycles that would otherwise be available to other server processes (e.g. your database instance). As such, databases will perform faster without AV software. Because some AV software is known to lock files while it scans them, it is a good idea to exclude the Oracle datafiles, control files and log files from regular AV scans.

Can I use RAC in a distributed transaction processing environment?
YES. The best practice is to have all tightly coupled branches of a distributed transaction running on a RAC database run on the same instance. Between transactions and between services, transactions can be load balanced across all of the database instances. You can use services to manage DTP environments. By defining the DTP property of a service, the service is guaranteed to run on one instance at a time in a RAC database. All global distributed transactions performed through the DTP service are ensured to have their tightly coupled branches running on a single RAC instance.

How can a NAS storage vendor certify their storage solution for RAC?
As of January 2007 the OSCP has been discontinued. Please refer to OTN for details on the RAC Technologies Matrix (storage being part of it). Old answer text:

They should obtain an OCE test kit and complete the required RAC tests. They can submit the request for an OCE kit to [email protected].

The list of certified NAS vendors/solutions is posted on OTN under the OSCP program

Can I run 9i RAC and RAC 10g in the same cluster?
YES. However, Oracle Clusterware (CRS) will not support a 9i RAC database, so you will have to leave the current configuration in place. You can install Oracle Clusterware and RAC 10g into the same cluster. On Windows and Linux, you must run the 9i Cluster Manager for the 9i Database and Oracle Clusterware for the 10g Database. When you install Oracle Clusterware, your 9i srvconfig file will be converted to the OCR. Both 9i RAC and 10g will use the OCR. Do not restart the 9i gsd after you have installed Oracle Clusterware. Remember to check Certify for details of what vendor clusterware can be run with Oracle Clusterware. For example, on Solaris your 9i RAC will be using Sun Cluster. You can install Oracle Clusterware and RAC 10g in the same cluster that is running Sun Cluster and 9i RAC.

Is Infiniband supported for the RAC interconnect?
Today IP over IB is supported, and RDS on Linux is supported from 10.2.0.3 onward. Qlogic (formerly SilverStorm) is the supported RDS vendor. Watch Certify for updates. As other platforms adopt RDS, we will expand support. There are no plans to support uDAPL or ITAPI protocols.

What combinations of Oracle Clusterware, RAC and ASM versions can I use?
See Note 337737.1 for a detailed support matrix. Basically, the Clusterware version must be at least the highest release of ASM or RAC. ASM must be at least 10.1.0.3 to work with a 10.2 database.

What storage is supported with Standard Edition RAC?
As per the licensing documentation, you must use ASM for all database files with SE RAC. There is no support for CFS or NFS. From the Oracle Database 10g Release 2 Licensing Doc:

Oracle Standard Edition and Real Application Clusters (RAC): When used with Oracle Real Application Clusters in a clustered server environment, Oracle Database Standard Edition requires the use of Oracle Clusterware. Third-party clusterware management solutions are not supported. In addition, Automatic Storage Management (ASM) must be used to manage all database-related files, including datafiles, online logs, archive logs, control file, spfiles, and the flash recovery area. Third-party volume managers and file systems are not supported for this purpose.

My customer has an XA Application with a RAC Database, can I do Load Balancing across the RAC instances?
No, not with traditional Oracle Net Services Load Balancing. We have written a document that explains the best practices for 9i, 10g Release 1 and 10g Release 2. With the 10g Services, life gets easier. To understand services, read the RAC Admin and Deployment Guide for 10g Release 2 Chapter 6.

Should the SCSI-3 reservation bit be set for our Oracle Clusterware only installation?
If you are using only Oracle Clusterware (no Veritas CM), then you don't need to have SCSI-3 PGR enabled, since Oracle Clusterware does not require it for IO fencing. If the reservation is set, you will get inconsistent results, so ask your storage vendor to disable the reservation. Veritas RAC requires that the storage array support SCSI-3 PGR, since this is how Veritas handles IO fencing. SCSI-3 PGR is set at the array level; for example, at the EMC hypervolume level.

What are the restrictions on the SID with a RAC database? Is it limited to 5 characters?
The SID prefix in 10g Release 1 and prior versions was restricted to five characters by the install/config tools so that an ORACLE_SID of up to 5+3=8 characters could be supported in a RAC environment. The SID prefix limit is relaxed to 8 characters in 10g Release 2; see bug 4024251 for more information.

Does Oracle Clusterware or Real Application Clusters support heterogeneous platforms?
Oracle Clusterware and Real Application Clusters do not support heterogeneous platforms in the same cluster. Enterprise Manager Grid Control supports heterogeneous platforms. We do support machines of different speeds and sizes in the same cluster. All nodes must run the same operating system (i.e. they must be binary compatible). In an active data-sharing environment, like RAC, we do not support machines having different chip architectures.

Why is validateUserEquiv failing during install (or cluvfy run)?
SSH must be set up as per the pre-installation tasks. File permissions must also be set as described below for features such as public key authentication to work. If your permissions are not correct, public key authentication will fail and fall back to password authentication with no helpful message as to why. The following server configuration files and/or directories must be owned by the account owner or by root, and GROUP and WORLD WRITE permission must be disabled:
* $HOME
* $HOME/.rhosts
* $HOME/.shosts
* $HOME/.ssh
* $HOME/.ssh/authorized_keys
* $HOME/.ssh/authorized_keys2 (OpenSSH specific, for the ssh2 protocol)

SSH (from OUI) will also fail if you have not connected to each machine in your cluster as per the note in the installation guide:

The first time you use SSH to connect to a node from a particular system, you may see a message similar to the following:
  The authenticity of host 'node1 (140.87.152.153)' can't be established.
  RSA key fingerprint is 7z:ez:e7:f6:f4:f2:4f:8f:9z:79:85:62:20:90:92:z9.
  Are you sure you want to continue connecting (yes/no)?
Enter yes at the prompt to continue. You should not see this message again when you connect from this system to that node.

Answering yes to this question causes an entry to be added to the known_hosts file in the .ssh directory, which is why subsequent connection requests do not ask again. This is known to work on Solaris and Linux but may work on other platforms as well.
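The permission rule above can be checked with a short script. The following is a minimal sketch, not an Oracle-supplied tool: the function names are hypothetical, the file names assume the usual OpenSSH spellings (authorized_keys, authorized_keys2), and it only tests the group/world write bits, not file ownership. It reads the permission string printed by `ls -ld`, so it should behave the same on Solaris and Linux.

```shell
#!/bin/sh
# Hypothetical helper: succeeds (exit 0) when the path is writable by
# group or world. In the `ls -ld` permission string, character 6 is the
# group write bit and character 9 is the world write bit.
is_group_or_world_writable() {
    perms=$(ls -ld "$1" 2>/dev/null | cut -c6,9)
    case "$perms" in
        *w*) return 0 ;;
        *)   return 1 ;;
    esac
}

# Hypothetical helper: checks the files and directories listed in the
# answer under the given home directory. Paths that do not exist are skipped.
# Returns non-zero if any existing path is group or world writable.
check_ssh_permissions() {
    home_dir=$1
    rc=0
    for target in "$home_dir" "$home_dir/.rhosts" "$home_dir/.shosts" \
                  "$home_dir/.ssh" "$home_dir/.ssh/authorized_keys" \
                  "$home_dir/.ssh/authorized_keys2"; do
        [ -e "$target" ] || continue
        if is_group_or_world_writable "$target"; then
            echo "FAIL $target"
            rc=1
        else
            echo "OK   $target"
        fi
    done
    return "$rc"
}

# Example call (run as the Oracle software owner on each node):
#   check_ssh_permissions "$HOME"
```

Any path reported as FAIL can be corrected with `chmod go-w` on that path before rerunning cluvfy.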

I had a 3 node RAC. One of the nodes had to be completely rebuilt as a result of a problem. As there are no backups, what is the proper procedure to remove the 3rd node from the cluster so it can be added back in?
Follow the documentation for removing a node, but skip all the steps in the node-removal doc that need to be run on the node being removed, such as steps 4, 6 and 7 (see Chapter 10 of the RAC Admin and Deployment Guide). Make sure that you use srvctl to remove any database instances that were configured on the failed node, and the listener resources as well; otherwise rootdeletenode.sh will have trouble removing the nodeapps. Just running rootdeletenode.sh isn't really enough, because you need to update the installer inventory as well; otherwise you won't be able to add the node back using addNode.sh. And if you don't remove the instances and listeners, you will also have problems adding the node and instance back again. Probably a better alternative to the generic documentation (bug 5929611 filed) for removing a node is Note 269320.1.

A client is a new RAC user and is using it in conjunction with BEA WebLogic. Can they use Connection Load Balancing and Services? What about FCF, FAN, RCLB?
The key item here is whether or not they are using XA. If they are using XA (Tuxedo for example), then they should use the DTP service with 10g Release 2. Have the customer review the Best Practices for using XA with RAC on OTN. If it is not XA, then services and Net Service Connection Load Balancing should work fine. They can tune aspects of the recovery such as instance recovery time. Using BEA, they do not get the advanced features such as Fast Connection Failover (FCF) and Runtime Connection Load Balancing (RCLB). To understand services, FCF and RCLB, read the RAC Admin and Deployment Guide for 10g Release 2 Chapter 6.

Is relink required for CRS_HOME after OS upgrade?
Oracle Clusterware binaries cannot be relinked on shiphomes. So no, there is no need to relink Oracle Clusterware binaries.

How many NICs do I need to implement RAC?
At minimum you need 2: external (public) and interconnect (private). When storage for RAC is provided by Ethernet based networks (e.g. NAS/NFS or iSCSI), you will need a third interface for I/O, so a minimum of 3. Anything less will cause performance and stability problems under load. From an HA perspective, you want these to be redundant, thus needing a total of 6.
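The arithmetic in this answer can be written down as a small function. This is only an illustrative sketch of the counting rule stated above; the function name min_nics and its yes/no arguments are made up for the example.

```shell
#!/bin/sh
# Hypothetical helper: minimum NIC count per node from the rules above.
#   $1: "yes" if storage is Ethernet based (NAS/NFS or iSCSI), else "no"
#   $2: "yes" if every network should be redundant (HA), else "no"
min_nics() {
    count=2                        # public + private interconnect
    if [ "$1" = "yes" ]; then
        count=$((count + 1))       # dedicated network for storage I/O
    fi
    if [ "$2" = "yes" ]; then
        count=$((count * 2))       # a second NIC for each network
    fi
    echo "$count"
}

# Example calls:
#   min_nics no no     # public and interconnect only
#   min_nics yes yes   # Ethernet storage, all networks redundant
```

With Ethernet based storage and full redundancy the function reproduces the total of 6 given in the answer.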

Can we output the backupset onto regular file system directly (not onto flash recovery area) using RMAN command, when we use SE RAC?
Yes. Customers might want to back up their database to offline storage, so this is also supported.

Does changing uid or gid of the Oracle User affect Oracle Clusterware?
There are a lot of files in the Oracle Clusterware home and outside of it that are chgrp'ed to the appropriate groups for security and appropriate access. The filesystem records the uid and gid (not the user or group name), so if you exchange the names, the files are now owned by the wrong user or group.

I could not get the user equivalence check to work on my Solaris 10 server when trying to install 10.2.0.1 Oracle Clusterware. The install ran fine without issue.
Cluvfy tries to find ssh on Solaris at /usr/local/bin. The workaround is to create a softlink in /usr/local/bin that points to /usr/bin/ssh. To resolve issues with cluvfy, it often helps to turn on tracing: run cluvfy with the -verbose attribute and the SRVM_TRACE environment variable set to TRUE:
  $ script run.log
  $ export SRVM_TRACE=TRUE
  $ cluvfy -blah    ### Add the -verbose
  $ exit

Can my customer use Veritas Agents to manage their RAC database on Unix with SFRAC installed?
For details on the support of SFRAC and Veritas Agents with RAC 10g, please see Oracle's Policy for Supporting RAC 10g with Symantec SFRAC on Unix and the Using Oracle Clusterware with Vendor Clusterware FAQ.

Can I run more than one clustered database on a single RAC cluster?
You can run multiple databases in a RAC cluster, either one instance per node (with different databases having different subsets of nodes in a cluster), or multiple instances per node (all databases running across all nodes), or some combination in between. Running multiple instances per node does cause memory and resource fragmentation, but this is no different from running multiple instances on a single node in a single instance environment, which is quite common. It does provide the flexibility of being able to share CPU on the node, but the Oracle Resource Manager will not currently limit resources between multiple instances on one node. You will need to use an OS level resource manager to do this.

Where can I find a list of supported solutions to ensure NIC availability (for the interconnect) per platform?
IBM AIX - available solutions:
* Etherchannel (OS based)
* HACMP based network failover solution
More information: Metalink Note 296856.1

HP HP/UX - available solutions:
* APA - Auto Port Aggregation (OS based)
* MC/Serviceguard based network failover solution
* Combination of both solutions
More information: Metalink Note 296874.1 and Auto Port Aggregation (APA) Support Guide

Sun Solaris - available solutions:
* Sun Trunking (OS based)
* Sun IPMP (OS based)
* Sun Cluster based network failover solution (IPMP based)
More information: Metalink Note 283107.1 - IPMP in general. When IPMP is used for the interconnect: Metalink Note 368464.1
Related RAC FAQ entries: In Solaris 10, do we need Sun Clusterware to provide redundancy for the interconnect and multiple switches?

Linux - available solutions:
* Bonding
More information: Metalink Note 298891.1
Related RAC FAQ entries: How do I use multiple network interfaces to provide High Availability and/or Load Balancing for my interconnect with Oracle Clusterware?

Windows - available solutions:
* Teaming
On Windows, teaming solutions to ensure NIC availability are usually part of the network card driver. Thus, they depend on the network card used. Please contact the respective hardware vendor for more information.

Is there a need to renice LMS processes in Oracle RAC 10g Release 2?
LMS processes should be running in RT by default since 10.2, so there's NO need to renice them, or otherwise mess with them. Check with ps -efl:
  0 S spommere 31191 1 0 75 0 - 270857 - 10:01 ? 00:00:00 ora_lmon_appsu01
  0 S spommere 31193 1 5 75 0 - 271403 - 10:01 ? 00:00:07 ora_lmd0_appsu01
  0 S spommere 31195 1 0 58 - - 271396 - 10:01 ? 00:00:00 ora_lms0_appsu01
  0 S spommere 31199 1 0 58 - - 271396 - 10:01 ? 00:00:00 ora_lms1_appsu01
Look at the 7th column: 75 or 76 means Time Share, 58 means Real Time. You can also use chrt to check.
LMS (Real Time):
  $ chrt -p 31199
  pid 31199's current scheduling policy: SCHED_RR
  pid 31199's current scheduling priority: 1
LMD (Time Share):
  $ chrt -p 31193
  pid 31193's current scheduling policy: SCHED_OTHER
  pid 31193's current scheduling priority: 0
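The 7th-column check can be automated with a small filter. This is a rough sketch that assumes the Linux `ps -efl` column layout and the priority values quoted in the answer (58 for Real Time, 75 or 76 for Time Share); the function name classify_lm_sched is hypothetical, and other platforms or kernels may report different numbers.

```shell
#!/bin/sh
# Hypothetical helper: reads `ps -efl` output on stdin and prints, for each
# ora_lm* background process, its name, the 7th column (PRI) and the
# scheduling class implied by the values given in the answer above.
classify_lm_sched() {
    awk '$NF ~ /^ora_lm/ {
        if ($7 == 58) cls = "real-time"
        else if ($7 == 75 || $7 == 76) cls = "time-share"
        else cls = "unknown"
        print $NF, $7, cls
    }'
}

# Example call:
#   ps -efl | classify_lm_sched
```

If an LMS process shows up as time-share, `chrt -p` on its pid gives the definitive scheduling policy.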

How do I check for network problems on my interconnect?
1. Confirm that full duplex is set correctly for all interconnect links on all interfaces on both ends. Do not rely on auto negotiation.
2. ifconfig -a will give you an indication of collisions/errors/overruns and dropped packets.
3. netstat -s will give you a listing of receive packet discards, fragmentation and reassembly errors for IP and UDP.
4. Set the UDP buffers correctly.
5. Check your cabling.
Note: If you are seeing issues with RAC, RAC uses UDP as the protocol. Oracle Clusterware uses TCP/IP.

Are there any issues for the interconnect when sharing the same switch as the public network by using VLAN to separate the network?
RAC and Clusterware deployment best practices recommend that the interconnect be deployed on a stand-alone, physically separate, dedicated switch. Many customers have consolidated these stand-alone switches into larger managed switches. A consequence of this consolidation is a merging of IP networks on a single shared switch, segmented by VLANs. There are caveats associated with such deployments. RAC cache fusion exercises the IP network more rigorously than non-RAC Oracle databases. The latency and bandwidth requirements, as well as the availability requirements, of the RAC/Clusterware interconnect IP network are more in line with high performance computing. Deploying the RAC/Clusterware interconnect on a shared switch with a segmented VLAN may expose the interconnect links to congestion and instability in the larger IP network topology. If deploying the interconnect on a VLAN, there should be a 1:1 mapping of VLAN to non-routable subnet, and the VLAN should not span multiple VLANs (tagged) or multiple switches. Deployment concerns in this environment include Spanning Tree loops when the larger IP network topology changes, asymmetric routing that may cause packet flooding, and lack of fine grained monitoring of the VLAN/port.

Can I run RAC 10g with RAC 11g?
Yes. Oracle Clusterware should always run at the highest level. With Oracle Clusterware 11g, you can run both RAC 10g and RAC 11g databases. If you are using ASM for storage, you can use either Oracle Database 10g ASM or Oracle Database 11g ASM; however, to get the 11g features, you must be running Oracle Database 11g ASM. It is recommended to use Oracle Database 11g ASM. You can run 9i RAC in the cluster as well. 9i RAC requires the clusterware that is certified with 9i RAC to be running in addition to Oracle Clusterware 11g.

Are jumbo frames supported for the RAC interconnect?
Yes. For details see the Metalink note Cluster Interconnect and Jumbo Frames.

Are Sun Logical Domains (ldoms) supported with RAC?
Currently Sun Logical Domains (ldoms) are not supported with Oracle Database (both single instance and RAC). Check Certify for the latest information; if certification is completed, it will be listed there.

The Veritas installation document on page 219 asks for setting LD_LIBRARY_PATH_64. Should I remove this?
Yes. You do not need to set LD_LIBRARY_PATH for Oracle.

Are block devices supported for OCR, Voting Disks, ASM devices?
Block devices are only supported on Linux. On Unix platforms, direct I/O semantics are not applicable (or rather, not implemented) for block devices. Note: On Linux, raw devices are being deprecated, so you should move to using block devices. Note that the Oracle Database 10g OUI does not support block devices; however, Oracle Clusterware and ASM do.

We are using Transparent Data Encryption (TDE). We create a wallet on node 1 and copy to nodes 2 & 3. Open the wallet and we are able to select encrypted data on all three nodes. Now, we want to REKEY the MASTER KEY. What do we have to do?
After a re-key on node one:
1. Run 'alter system set wallet close' on all other nodes.
2. Copy the wallet with the new master key to all other nodes.
3. Run 'alter system set wallet open identified by "password"' on all other nodes to load the (obfuscated) master key into each node's SGA.

Why does the NOAC attribute need to be set on NFS mounted RAC Binaries?
The noac attribute is required because the installer determines sharedness by creating a file and checking for that file's existence on the remote node. If the noac attribute is not enabled, this test will incorrectly fail, which will confuse the installer and opatch. There are some other minor issues as well; an spfile in the default $ORACLE_HOME/dbs will definitely be affected.

How do I use DBCA in silent mode to set up RAC and ASM?
If you already have an ASM instance/diskgroup, the following creates a RAC database on that diskgroup:

  su oracle -c "$ORACLE_HOME/bin/dbca -silent -createDatabase \
    -templateName General_Purpose.dbc -gdbName $SID -sid $SID \
    -sysPassword $PASSWORD -systemPassword $PASSWORD \
    -sysmanPassword $PASSWORD -dbsnmpPassword $PASSWORD \
    -emConfiguration LOCAL -storageType ASM -diskGroupName $ASMGROUPNAME \
    -datafileJarLocation $ORACLE_HOME/assistants/dbca/templates \
    -nodeinfo $NODE1,$NODE2 -characterset WE8ISO8859P1 \
    -obfuscatedPasswords false -sampleSchema false \
    -oratabLocation /etc/oratab"

The following creates an ASM instance and one diskgroup:

  su oracle -c "$ORA_ASM_HOME/bin/dbca -silent -configureASM \
    -gdbName NO -sid NO -emConfiguration NONE \
    -diskList $ASM_DISKS -diskGroupName $ASMGROUPNAME \
    -datafileJarLocation $ORACLE_HOME/assistants/dbca/templates \
    -nodeinfo $NODE1,$NODE2 -obfuscatedPasswords false \
    -oratabLocation /etc/oratab -asmSysPassword $PASSWORD \
    -redundancy $ASMREDUNDANCY"

where ASM_DISKS='/dev/sda1,/dev/sdb1' and ASMREDUNDANCY='NORMAL'.

How does OCR mirror work? What happens if my OCR is lost/corrupt?
OCR is the Oracle Cluster Registry; it holds all the cluster related information such as instances and services. The OCR file format is binary, and starting with 10.2 it is possible to mirror it. The location of the file(s) is recorded in /etc/oracle/ocr.loc in the ocrconfig_loc and ocrmirrorconfig_loc variables. Obviously, if you only have one copy of the OCR and it is lost or corrupt, then you must restore a recent backup; see the ocrconfig utility for details, specifically the -showbackup and -restore flags. Until a valid backup is restored, Oracle Clusterware will not start up due to the corrupt/missing OCR file.

The interesting discussion is what happens if you have the OCR mirrored and one of the copies becomes corrupt. You would expect that everything will continue to work seamlessly. Well, almost. The real answer depends on when the corruption takes place. If the corruption happens while the Oracle Clusterware stack is up and running, then the corruption will be tolerated and Oracle Clusterware will continue to function without interruption, despite the corrupt copy. The DBA is advised to repair the hardware/software problem that prevents OCR from accessing the device as soon as possible; alternatively, the DBA can replace the failed device with another healthy device using the ocrconfig utility with the -replace flag. If, however, the corruption happens while the Oracle Clusterware stack is down, then it will not be possible to start it up until the failed device comes online again or some administrative action using the ocrconfig utility with the -overwrite flag is taken. When the Clusterware attempts to start you will see messages similar to:
  total id sets (1), 1st set (1669906634,1958222370), 2nd set (0,0) my votes (1), total votes (2)
  2006-07-12 10:53:54.301: [OCRRAW][1210108256]proprioini:disk 0 (/dev/raw/raw1) doesn't have enough votes (1,2)
  2006-07-12 10:53:54.301: [OCRRAW][1210108256]proprseterror: Error in accessing physical storage [26]

This is because the software can't determine which OCR copy is the valid one. In the above example one of the OCR mirrors was lost while Oracle Clusterware was down. There are 3 ways to fix this failure:
a) Fix whatever problem (hardware/software?) prevents OCR from accessing the device.
b) Issue "ocrconfig -overwrite" on any one of the nodes in the cluster. This command will override the vote check built into OCR when it starts up. Basically, if the OCR device is configured with a mirror, OCR assigns each device one vote. The rule is to have more than 50% of the total votes (quorum) in order to safely make sure the available devices contain the latest data. In 2-way mirroring, the total vote count is 2, so it requires 2 votes to achieve quorum. In the example above there are not enough votes to start, because only one device with one vote is available. (In the earlier case, where the device failed while OCR was running, OCR assigns both votes to the surviving device, which is why the surviving device, now holding two votes, can start after the cluster has been down.) See the warning below.
c) This method is not recommended to be performed by customers. It is possible to manually modify ocr.loc to delete the failed device and restart the cluster. OCR won't do the vote check if the mirror is not configured. See the warning below.
EXTREME CAUTION should be exercised if choosing option b or c above, since data loss can occur if the wrong file is manipulated. Please contact Oracle Support for assistance before proceeding. Bug 5055145 was the basis for this FAQ; also thanks to Ken Lee for his valuable feedback.
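The quorum rule described in option b can be illustrated with a tiny check. This is only a sketch of the arithmetic ("more than 50% of the total votes"), not how Oracle Clusterware implements it; the function name ocr_has_quorum is hypothetical, and the two arguments correspond to the "my votes" and "total votes" values in the log message shown earlier.

```shell
#!/bin/sh
# Hypothetical helper: succeeds (exit 0) when the reachable OCR devices
# hold more than 50% of the total votes.
#   $1: votes held by the devices that can be accessed ("my votes")
#   $2: total votes across all configured OCR devices ("total votes")
ocr_has_quorum() {
    [ $(( $1 * 2 )) -gt "$2" ]
}

# Example: the failure case from the log, my votes (1), total votes (2).
#   if ocr_has_quorum 1 2; then echo "can start"; else echo "not enough votes"; fi
```

With 2-way mirroring, one vote out of two fails the check, which matches the "doesn't have enough votes (1,2)" message; a surviving device that was assigned both votes while the stack was running passes it.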

Why do we have a Virtual IP (VIP) in Oracle RAC 10g? Why does it just return a dead connection when its primary node fails?
The goal is application availability. When a node fails, the VIP associated with it is automatically failed over to some other node. When this occurs, the following things happen:
(1) The VIP detects the public network failure, which generates a FAN event.
(2) The new node re-arps the world, indicating a new MAC address for the IP.
(3) Connected clients subscribing to FAN immediately receive an ORA-3113 error or equivalent. Those not subscribing to FAN will eventually time out.
(4) New connection requests rapidly traverse the tnsnames.ora address list, skipping over the dead nodes instead of having to wait on TCP/IP timeouts.
Without using VIPs or FAN, clients connected to a node that died will often wait for a TCP timeout period (which can be up to 10 min) before getting an error. As a result, you don't really have a good HA solution without using VIPs and FAN. The easiest way to use FAN is to use an integrated client with Fast Connection Failover (FCF) such as JDBC, OCI, or ODP.NET.

If I use Services with Oracle Database 10g, do I still need to set up Load Balancing?
Yes. Services allow granular definition of workload, and the DBA can dynamically define which instances provide the service. Connection Load Balancing (provided by Oracle Net Services) still needs to be set up so that user connections are balanced across all instances providing a service. With Oracle RAC 10g Release 2 or higher, set CLB_GOAL on the service to define the type of load balancing you want: SHORT for short lived connections (e.g. a connection pool) or LONG (the default) for applications that have connections active for long periods (e.g. an Oracle Forms application).

Is Oracle Application Server integrated with FAN and FCF?
Yes. For detailed information on the integration with the various releases of Application Server 10g, see http://www.oracle.com/technology/tech/java/newsletter/articles/oc4j_data_sources/oc4j_ds.htm

In Solaris 10, do we need Sun Clusterware to provide redundancy for the interconnect and multiple switches?
Link Aggregation (GLDv3) is bundled in the OS as of Solaris 10. IPMP is available for Solaris 10 and Solaris 9. Neither requires Sun Cluster to be installed. For the interconnect and switch redundancy, as a best practice, avoid VLAN trunking across the switches. For ease of configuration (e.g. fewer IP address requirements), use IPMP with link mode failure detection in a primary/standby configuration. This will give you a single failover IP, which you will define in the cluster_interconnects init.ora parameter. Remove any interfaces for the interconnect from the OCR using `oifcfg delif`. AND TEST THIS RIGOROUSLY. For now, as Link Aggregation (GLDv3) cannot span multiple switches from a single host, you will need to configure the switch redundancy and the host NICs with IPMP. When configuring IPMP for the interconnect with multiple switches available, configure IPMP as active/standby and *not* active/active. This is to avoid potential latencies in switch failure detection/failover, which may impact the availability of the RDBMS. Note that IPMP spreads/load balances outbound packets on the bonded interfaces, but inbound packets are received on a single interface. In an active/active configuration this makes send/receive problems difficult to diagnose. Both Link Aggregation (GLDv3) and IPMP are core OS packages (SUNWcsu and SUNWcsr respectively) and do not require Sun Clusterware.

Can RMAN back up Real Application Clusters databases?
Absolutely. RMAN can be configured to connect to all nodes within the cluster to parallelize the backup of the database files and archive logs. If files need to be restored, using SET AUTOLOCATE ON alerts RMAN to search for backed-up files and archive logs on all nodes. See "RAC with RMAN" in the Oracle Documentation.

I am receiving an ORA-29740 error. What should I do?
This error can occur when problems are detected on the cluster:
Error: ORA-29740 (ORA-29740)
Text: evicted by member %s, group incarnation %s
--------------------------------------------------------------------------
Cause: This member was evicted from the group by another member of the cluster database for one of several reasons, which may include a communications error in the cluster, failure to issue a heartbeat to the control file, etc.
Action: Check the trace files of other active instances in the cluster group for indications of errors that caused a reconfiguration.
For more information on troubleshooting this error, see the following Metalink note: Troubleshooting ORA-29740 in a RAC Environment

What does the Virtual IP service do? I understand it is for failover but do we need a separate network card? Can we use the existing private/public cards? What would happen if we used the public IP?
The 10g Virtual IP Address (VIP) exists on every RAC node for public network communication. All client communication should use the VIPs in their TNS connection descriptions. The TNS ADDRESS_LIST entry should direct clients to VIPs rather than using hostnames. During normal runtime, the behaviour is the same as with hostnames. However, when the node goes down or is shut down, the VIP is hosted elsewhere on the cluster and does not accept connection requests. This results in a silent TCP/IP error, and the client fails immediately to the next TNS address. If the network interface fails within the node, the VIP can be configured to use alternate interfaces in the same node. The VIP must use the public interface cards. There is no requirement to purchase additional public interface cards (unless you want to take advantage of within-node card failover).

What do the VIP resources do once they detect a node has failed/gone down? Are the VIPs automatically acquired, and published, or is manual intervention required? Are VIPs mandatory?
When a node fails, the VIP associated with the failed node is automatically failed over to one of the other nodes in the cluster. When this occurs, two things happen:
1. The new node re-arps the world, indicating a new MAC address for this IP address. For directly connected clients, this usually causes them to see errors on their connections to the old address.
2. Subsequent packets sent to the VIP go to the new node, which will send error RST packets back to the clients. This results in the clients getting errors immediately.
In the case of existing SQL connections, errors will typically be in the form of ORA-3113 errors, while a new connection using an address list will select the next entry in the list. Without using VIPs, clients connected to a node that died will often wait for a TCP/IP timeout period before getting an error. This can be as long as 10 minutes or more. As a result, you don't really have a good HA solution without using VIPs.

What are my options for load balancing with RAC? Why do I get an uneven number of connections on my instances?
All the types of load balancing available currently (9i-10g) occur at connect time. This means that it is very important how one balances connections and what these connections do on a long term basis. Since establishing connections can be very expensive for your application, it is good programming practice to connect once and stay connected. This means one needs to be careful as to what option one uses. Oracle Net Services provides load balancing or you can use external methods such as hardware based or clusterware solutions.

The following options exist prior to Oracle RAC 10g Release 2 (for 10g Release 2 see Load Balancing Advisory):
Random: Either client side load balancing or hardware based methods will randomize the connections to the instances. On the negative side, this method is unaware of the load on the instances or even whether they are up, meaning it might cause waits on TCP/IP timeouts.
Load Based: Server side load balancing (by the listener) redirects connections by default depending on the RunQ length of each of the instances. This is great for short-lived connections but terrible for persistent connections or login storms. Do not use this method for connections from connection pools or application servers.
Session Based: Server side load balancing can also be used to balance the number of connections to each instance. Session count balancing is the method used when you set the listener parameter prefer_least_loaded_node_listener-name=off. Note that listener-name is the actual name of the listener, which is different on each node in your cluster and by default is listener_nodename. Session based load balancing takes into account the number of sessions connected to each node and then distributes the connections to balance the number of sessions across the different nodes.
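To make the session based option concrete, the following sketch models a listener that always sends the next connection to the instance with the fewest sessions. It is an illustration of the idea only, not the listener's actual algorithm; the function names, instance names, and session counts are hypothetical.

```python
def pick_least_sessions(session_counts):
    """Return the instance name with the fewest sessions.
    Ties are broken by instance name so the result is deterministic."""
    return min(sorted(session_counts), key=lambda name: session_counts[name])


def connect_many(session_counts, new_connections):
    """Simulate routing a burst of new connections, one at a time,
    and return the resulting session counts (input is not modified)."""
    counts = dict(session_counts)
    for _ in range(new_connections):
        counts[pick_least_sessions(counts)] += 1
    return counts


# Hypothetical starting point: an uneven cluster.
start = {"rac1": 40, "rac2": 10, "rac3": 25}
after = connect_many(start, 30)
print(after)
```

In this model the less loaded instances absorb all new connections until the counts even out, which is the behaviour you want for long-lived sessions and the reason run-queue-based redirection suits only short-lived ones.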

Can our 10g VIP fail over from NIC to NIC as well as from node to node?
Yes. The 10g VIP implementation is capable of failing over within a node from NIC to NIC, and back again once the failed NIC is online, as well as failing over between nodes. The NIC to NIC failover is fully redundant if redundant switches are installed.

What is CLB_GOAL and how should I set it?
CLB_GOAL is the connection load balancing goal for a service. There are 2 options, CLB_GOAL_SHORT and CLB_GOAL_LONG (default). Long is for applications that have long-lived connections. This is typical for connection pools and SQL*Forms sessions. Short is for applications that have short-lived connections. The GOAL for a service can be set with EM or DBMS_SERVICE. Note: You must still configure load balancing with Oracle Net Services.

What is Server-side Transparent Application Failover (TAF) and how do I use it?
Oracle Database 10g Release 2 introduces server-side TAF when using services. After you create a service, you can use the dbms_service.modify_service PL/SQL procedure to define the TAF policy for the service. Only the basic method is supported. Note that this is different from the TAF policy (traditional client TAF) that is supported by srvctl and the EM Services page. If your service has a server-side TAF policy defined, then you do not have to encode TAF in the client connection string. If the instance where a client is connected fails, then the connection will be failed over to another instance in the cluster that is supporting the service. All restrictions of TAF still apply. NOTE: both the client and server must be 10.2, and aq_ha_notifications must be set to true for the service.
Sample code to modify the service:
    execute dbms_service.modify_service (service_name => 'gl.us.oracle.com', aq_ha_notifications => true, failover_method => dbms_service.failover_method_basic, failover_type => dbms_service.failover_type_select, failover_retries => 180, failover_delay => 5, clb_goal => dbms_service.clb_goal_long);

With three primary load balancing options (client-side connect-time LB, server-side connect-time LB, and runtime connection load balancing), is it fair to say Runtime Connection Load Balancing is the only option to leverage FAN up/down events?
No. The listener is a subscriber to all FAN events (both from the load balancing advisory and the HA events). Therefore server side connection load balancing leverages FAN HA events as well as load balancing advisory events. With the Oracle JDBC driver 10g Release 2, if you enable Fast Connection Failover, you also enable Runtime Connection Load Balancing (one knob for both).

How can a customer mask the change in their clustered database configuration from their client or application? (i.e. so I do not have to change the connection string when I add a node to the RAC database)
The combination of Server Side load balancing and Services allows you to easily mask cluster database configuration changes. As long as all instances register with all listeners (use the LOCAL_LISTENER and REMOTE_LISTENER parameters), server side load balancing will allow clients to connect to the service on currently available instances at connect time. The load balancing advisory (setting a goal on the service) will give advice as to how many connections to send to each instance currently providing a service. When a service is enabled on an instance, as long as the instance registers with the listeners, the clients can start getting connections to the service and the load balancing advisory will include that instance in its advice.

After executing DBMS_SERVICE.START_SERVICE, the service resource remains in OFFLINE status when confirming it with crs_stat. Is that expected behavior?
YES, this is expected behaviour. Unfortunately, DBMS_SERVICE.START_SERVICE does not update the clusterware. You should use srvctl start service -d dbname; then you should see it come online.

Is it possible to use SRVCTL start database with a user account other than oracle (that is, other than the owner of the oracle software)?
YES. When you create a RAC database as a user other than the home/software owner (oracle), the database creation assistant sets the correct permissions/ACLs on the CRS resources that control the database, instances, etc. This assumes that you have set up group membership for this user in the dba group of the home (find it using oracle_home/bin/osdbagrp) and in the CRS home owner's primary group (usually oinstall), and that there is group write permission on the oracle_home.

The client gets this error message in Production in the ons.log file every minute or so:
06/11/10 10:11:14 [2] Connection 0,129.86.186.58,6200 SSL handshake failed
06/11/10 10:11:14 [2] Handshake for 0,129.86.186.58,6200: nz error = 29049 interval = 0 (180 max)
These annoying messages in ons.log are telling you that you have a configuration mismatch for ONS somewhere in the farm. RAC has its own ONS server, for which SSL is disabled by default. You must either enable SSL for RAC ONS or disable it for OID ONS (OPMN). You need to create a wallet for each RAC ONS server, or copy one of the wallets from OPMN on the OID instances. In ons.conf you need to specify the wallet file and password:
walletfile=
walletpassword=
ONS only uses SSL between servers, so ONS clients will not be affected. You specify the wallet password when you create the wallet. If you copy a wallet from an OPMN instance, then use the same password configured in opmn.xml. If there is no wallet password configured in opmn.xml, then you don't need to specify a wallet password in ons.conf either.

How do I configure FCF with BPEL so I can use RAC 10g in the backend?
Note:372456.1 describes the procedure to set up BPEL with an Oracle RAC 10g Release 1 database. If you are using SSL, ensure the SSL enable attribute of ONS in the opmn.xml file has the same value, either true or false, for all OPMN servers in the Farm. To troubleshoot OPMN at the application server level, look at appendix A in the Oracle Process Manager and Notification Server Administrator's Guide.

I am using shared services with the following set in init.ora:
SQL> show parameters dispatchers=(protocol=TCP)(listener=listeners_nl01)(con=500)(serv=oltp)
I stopped my service with srvctl stop service but it is still registered with the listener and accepting connections. Is this expected?
YES. This is by design of dispatchers, which are part of Oracle Net Services. If you specify the service attribute of the dispatchers init.ora parameter, the service specified cannot be managed by the DBA.

What should I do to make my RAC deployment highly available?
Customers often deploy Oracle Real Application Clusters (RAC) to provide a highly available infrastructure for their mission critical applications. Oracle RAC removes the server as a single point of failure. Load balancing your workload across many servers, along with fast recovery from failures, means that the loss of any one server should have little or no impact on the end user of the application. The level of impact to the end user depends on how well the application has been written to mask failure. If an outage occurs on a RAC instance, the ideal situation would be for the failover time plus the transaction response time to be less than the maximum acceptable response time. Oracle RAC has many features that customers can take advantage of to mask failures from the end user; however, it requires more work than just installing Oracle RAC. To the application user, the availability metric that means the most is the response time for their transaction. This is the end-to-end response time, which means all layers must be available and performing to a defined standard for the agreed times. If you are deploying Oracle RAC and require high availability, you must make the entire infrastructure of the application highly available. This requires detailed planning to ensure there are no single points of failure throughout the infrastructure. Oracle Clusterware is constantly monitoring any process that is under its control, which includes all the Oracle software such as the Oracle instance, listener, etc. Oracle Clusterware has been programmed to recover from failures that occur in the Oracle processes. In order to do its monitoring and recovery, various system activities happen on a regular basis, such as user authentication, sudo, and hostname resolution. In order for the cluster to be highly available, it must be able to perform these activities at all times.
For example, if you choose to use the Lightweight Directory Access Protocol (LDAP) for authentication, then you must make the LDAP server highly available as well as the network connecting the users, application, database and LDAP server. If the database is up but the users cannot connect to the database because the LDAP server is not accessible, then the entire system is down in the eyes of your users. When using external authentication such as LDAP or NIS (Network Information Service), a public network failure will cause failures within the cluster. Oracle recommends that the hostname, VIP, and interconnect are defined in the /etc/hosts file on all nodes in the cluster.

During the testing of the RAC implementation, you should include a destructive testing phase. This is a systematic set of tests of your configuration to ensure that 1) you know what to expect if the failure occurs and how to recover from it and 2) the system behaves as expected during the failure. This is a good time to review operating procedures and document recovery procedures. Destructive testing should include tests such as node failure, instance failure, public network failure, interconnect failures, storage failure, storage network failure, voting disk failure, loss of an OCR, and loss of ASM. Using features of Oracle Real Application Clusters and Oracle Clients including Fast Application Notification (FAN), Fast Connection Failover (FCF), Oracle Net Service Connection Load Balancing, and the Load Balancing Advisory, applications can mask most failures and provide a very highly available application. For details on implementing best practices, see the MAA document Client Failover Best Practices for Highly Available Oracle Databases and the Oracle RAC Administration and Deployment Guide.

Why am I seeing the following warnings in my listener.log for my RAC 10g environment?
WARNING: Subscription for node down event still pending
This message indicates that the listener was not able to subscribe to the ONS events which it uses to do the connection load balancing. This is most likely due to starting the listener using lsnrctl from the database home. When you start the listener using lsnrctl, make sure you have set the environment variable ORACLE_CONFIG_HOME = {Oracle Clusterware HOME}; also set it in racgwrap in the $ORACLE_HOME/bin for the database.

Will FAN work with SQL*Plus?
Yes. With Oracle RAC 11g, you can specify the -F (FAILOVER) option. This enables SQL*Plus to interact with the OCI failover mode in a Real Application Clusters (RAC) environment. In this mode a service or instance failure is transparently handled, with transaction status messages if applicable.

What clients provide integration with FAN through FCF?
With Oracle Database 10g Release 1, JDBC clients (both thick and thin driver) are integrated with FAN by providing FCF. With Oracle Database 10g Release 2, we have added ODP.NET and OCI. Other applications can integrate with FAN by using the API to subscribe to the FAN events. Note: If you are using a 3rd party application server, then you can only use FCF if you use the Oracle driver and, except for OCI, its connection pool. If you are using the connection pool of the 3rd party application server, then you do not get FCF. Your customer can subscribe directly to FAN events; however, that is a development project for the customer. See the white paper Workload Management with Oracle RAC 10g on OTN.

Can I use TAF and FAN/FCF?
With Oracle Database 10g Release 1, NO. With Oracle Database 10g Release 2, the answer is YES for OCI and ODP.NET; it is recommended. For JDBC, you should not use TAF and FCF together, even with the Thick JDBC driver.

How do the datasource properties initialLimit, minLimit, and maxLimit affect Fast Connection Failover processing with JDBC?
The initialLimit property on the Implicit Connection Cache is effective only when the cache is first created. For example, if initialLimit is set to 10, you will have 10 connections pre-created and available when the connection cache is first created. Do not confuse minLimit with initialLimit. The current behavior is that after a DOWN event, once the affected connections are cleaned up, it is possible for the number of connections in the cache to be lower than minLimit.

An UP event is processed both when (a) a new instance joins and when (b) an instance comes UP after having gone down. This has no relevance to initialLimit, or even minLimit. When an UP event comes into the JDBC Implicit Connection Cache, it creates some new connections. Assuming you have your listener load balancing set up properly, those connections should go to the instance that was just started. When your application gets a connection from the pool, it is given an idle connection. If you are running 10.2 and have the load balancing advisory turned on for the service, the session is allocated based on the defined goal to provide the best service level. maxLimit, when set, defines the upper boundary for the connection cache. By default, maxLimit is unbounded; your database sets the limit.

Do I need to install the ONS on all my mid-tier servers in order to enable JDBC Fast Connection Failover (FCF)?
With 10g Release 1, the middle tier must have ONS running (started by the same user as the application). ONS is not included on the Client CD; however, it is part of the Oracle Database 10g CD. With 10g Release 2 or later, you do not need to install ONS on the middle tier. The JDBC driver allows the use of remote ONS (i.e. it uses the ONS running in the RAC cluster). Just use the datasource parameter:
ods.setONSConfiguration("nodes=racnode1:4200,racnode2:4200");
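The value passed to setONSConfiguration above is a "nodes=" prefix followed by comma-separated host:port pairs. When that string is generated from a deployment inventory rather than typed by hand, a small helper avoids stray characters in the list. The following is a minimal sketch; the helper name `ons_nodes_config` is hypothetical, and the host names and port are simply the ones from the example above.

```python
def ons_nodes_config(nodes):
    """Build an ONS node list of the form 'nodes=host1:port1,host2:port2'
    from an iterable of (host, port) pairs."""
    parts = []
    for host, port in nodes:
        host = host.strip()
        port = int(port)
        if not host:
            raise ValueError("empty host name in ONS node list")
        if not 0 < port < 65536:
            raise ValueError(f"invalid port for {host}: {port}")
        parts.append(f"{host}:{port}")
    if not parts:
        raise ValueError("at least one ONS node is required")
    return "nodes=" + ",".join(parts)


config = ons_nodes_config([("racnode1", 4200), ("racnode2", 4200)])
print(config)
```

The resulting string is what would be placed inside the quotes of the datasource call shown above.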

Will FAN/OCI work with Instant Client?
Yes, FAN/OCI will work with Instant Client. Both client and server must be Oracle Database 10g Release 2.

What type of callbacks are supported with OCI when using FAN/FCF?
There are two separate callbacks supported. The HA Events (FAN) callback is called when an event occurs. When a down event occurs, for example, you can clean up a custom connection pool, i.e. purge stale connections. When the failover occurs, the TAF callback is invoked. At failover time you can customize the newly created database session. Both FAN and TAF are client-side callbacks. FAN also has a separate server-side callout that should not be confused with the OCI client callback.

Does FCF for OCI react to FAN HA UP events?
OCI does not perform any implicit actions on an UP event; however, if an HA event callback is present, it is invoked. You can take any required action at that time.

Can I use FAN/OCI with Pro*C?
Since Pro*C (sqllib) is built on top of OCI, it should support HA events. You need to precompile the application with the option EVENTS=TRUE and make sure you link the application with a thread library. The database connection must use a Service that has been enabled for AQ events. Use dbms_service.modify_service to enable the service for events (aq_ha_notifications => true) or use the EM Cluster Database Services page.

Do I have to link my OCI application with a thread library? Why?
YES, you must link the application to a threads library. This is required because the AQ notifications occur asynchronously, over an implicitly spawned thread.

Can I use the 10.2 JDBC driver with a 10.1 database for FCF?
Yes, with the patch for Bug 5657975 for 10.2.0.3, the 10.2 JDBC driver will work with a 10.1 database. The fix will be part of the 10.2.0.4 patchset. If you do not have the patch, then for FCF use the 10.2 JDBC driver with a 10.2 database; if the database is 10.1, use the 10.1 JDBC driver.

Will FAN/FCF work with the default database service?
No. If you want the advanced features of RAC provided by FAN and FCF, then create a cluster managed service for your application. Use the Clustered Managed Services Page in Enterprise Manager DBControl to do this.

I am seeing the wait events 'ges remote message', 'gcs remote message', and/or 'gcs for action'. What should I do about these?
These are idle wait events and can be safely ignored. The 'ges remote message' event might show up in a 9.0.1 Statspack report as one of the top wait events. To keep this wait event from showing up, you can add it to the PERFSTAT.STATS$IDLE_EVENT table so that it is not listed in Statspack reports.

What are the changes in memory requirements from moving from single instance to RAC?
If you are keeping the workload requirements per instance the same, then about 10% more buffer cache and 15% more shared pool is needed. The additional memory requirement is due to data structures for coherency management. The values are heuristic and are mostly upper bounds. Actual resource usage can be monitored by querying the current and maximum columns for the gcs resource/locks and ges resource/locks entries in V$RESOURCE_LIMIT. In general, however, please take into consideration that memory requirements per instance are reduced when the same user population is distributed over multiple nodes. In this case, assuming the same user population, with
N = number of nodes
M = buffer cache for a single system
the buffer cache per instance is
(M / N) + ((M / N) * 0.10) [ + extra memory to compensate for failed-over users ]
Thus, for example, with M = 2G, N = 2, and no extra memory for failed-over users:
= (2G / 2) + ((2G / 2) * 0.10)
= approximately 1G + 100M
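The sizing rule above is easy to turn into a small calculator for trying other values of M and N. This is a sketch of the heuristic as stated in the answer, working in megabytes; the function and parameter names are hypothetical, and the 10% overhead is the upper-bound figure quoted above rather than a measured value.

```python
def buffer_cache_per_instance_mb(single_system_mb, nodes,
                                 overhead=0.10, failover_extra_mb=0.0):
    """Estimate the buffer cache per RAC instance in MB.

    Implements (M / N) + ((M / N) * overhead) + extra, where M is the
    single-system buffer cache and N is the number of nodes."""
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    share = single_system_mb / nodes
    return share + share * overhead + failover_extra_mb


# The example from the text: M = 2G (2048 MB), N = 2, no failover allowance.
estimate = buffer_cache_per_instance_mb(2048, 2)
print(round(estimate, 1))  # roughly 1 GB plus about 100 MB
```

Passing a non-zero `failover_extra_mb` covers the bracketed term in the formula, i.e. headroom for users who fail over from another node.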

What is the Load Balancing Advisory?
To assist in the balancing of application workload across designated resources, Oracle Database 10g Release 2 provides the Load Balancing Advisory. This Advisory monitors the current workload activity across the cluster and, for each instance where a service is active, provides a percentage value of how much of the total workload should be sent to this instance, as well as a service quality flag. The feedback is provided as an entry in the Automatic Workload Repository and a FAN event is published. The easiest way for an application to take advantage of the load balancing advisory is to enable Runtime Connection Load Balancing with an integrated client.

What is Runtime Connection Load Balancing?
Runtime connection load balancing enables the connection pool to route incoming work requests to the available database connection that will provide it with the best service. This will provide the best service times globally, and routing responds fast to changing conditions in the system. Oracle has implemented runtime connection load balancing with ODP.NET and JDBC connection pools. Runtime Connection Load Balancing is tightly integrated with the automatic workload balancing features introduced with Oracle Database 10g, i.e. Services, Automatic Workload Repository, and the new Load Balancing Advisory.
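One way to picture how a pool could use the advisory's per-instance percentages is weighted selection: each work request goes to an instance with probability proportional to the advised share. The sketch below is a simplified model under that assumption, not the algorithm used by the JDBC or ODP.NET pools; the function name, instance names, and percentages are hypothetical.

```python
import random


def pick_instance(advice_percent, rng):
    """Choose an instance name with probability proportional to the
    percentage of work the advisory says it should receive."""
    names = sorted(advice_percent)
    weights = [advice_percent[name] for name in names]
    if sum(weights) <= 0:
        raise ValueError("advisory percentages must sum to a positive value")
    return rng.choices(names, weights=weights, k=1)[0]


# Hypothetical advisory feedback for one service.
advice = {"rac1": 60, "rac2": 30, "rac3": 10}

rng = random.Random(0)  # seeded so repeated runs route identically
tally = {name: 0 for name in advice}
for _ in range(1000):
    tally[pick_instance(advice, rng)] += 1
print(tally)
```

Because the weights are re-read on every request, a new advisory event that shifts the percentages changes the routing immediately, which is the "responds fast to changing conditions" property described above.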

How do I enable the load balancing advisory?
The load balancing advisory requires the use of services and Oracle Net connection load balancing. To enable it on the server, set a goal (service_time or throughput) and set CLB_GOAL=SHORT on your service. On the client, you must be using the connection pool. For JDBC, enable the datasource parameter FastConnectionFailoverEnabled. For ODP.NET, enable the datasource parameter Load Balancing=true.

What are my options for setting the Load Balancing Advisory GOAL on a Service?
The load balancing advisory is enabled by setting the GOAL on your service, either through the PL/SQL DBMS_SERVICE package or the EM DBControl Clustered Database Services page. There are 3 options for GOAL:
None: Default setting; turns off the advisory.
THROUGHPUT: Work requests are directed based on throughput. This should be used when the work in a service completes at homogeneous rates. An example is a trading system where work requests are of similar lengths.
SERVICE_TIME: Work requests are directed based on response time. This should be used when the work in a service completes at various rates. An example is an internet shopping system where work requests are of various lengths.