Performance problem with Oracle*Net Failover when TCP Network down (no IP address) [ID 249213.1]
Modified 27-AUG-2010 Type PROBLEM Status PUBLISHED
Applies to:
Oracle Net Services - Version: 8.1.7.4.0 to 11.1.0.7.0 - Release: 8.1.7 to 11.1
Information in this document applies to any platform.
Symptoms
The Network Interface Card (NIC) or Server box is powered off, or the Network cable is
disconnected from the Server.
Transparent Application Failover (TAF) and/or Connect-time Failover, which used to take a
few seconds, now takes minutes or even longer, and may time out with an error.
Changes
The Network Card (NIC) no longer "exists" on the TCP/IP Network, and cannot be 'ping'ed.
Cause
The TCP protocol is timing out at the OS / Network layer.
Oracle is not the only software that experiences this slow failover; other tools are affected as well.
This can be shown with a third party tool such as FTP or Telnet. Try to make a connection to the failed IP address with one of these non-Oracle tools. The result will most likely match what Oracle is experiencing: a slow response to the failure.
Unfortunately, software failover and load balancing only work if the underlying TCP layer is intact (this applies to any software, not just Oracle). If the underlying hardware has failed, the software mechanisms fail with it.
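The non-Oracle connection test described above can also be scripted. The following is a minimal sketch only, in Python; the function name time_connect and its default timeout are illustrative assumptions and are not part of any Oracle tool. It makes a single TCP connection attempt and reports the outcome and how long it took.

```python
import socket
import time


def time_connect(host, port, timeout=30.0):
    """Attempt one TCP connect and return (outcome, elapsed_seconds)."""
    start = time.monotonic()
    try:
        # create_connection performs the normal TCP three-way handshake.
        with socket.create_connection((host, port), timeout=timeout):
            outcome = "connected"
    except ConnectionRefusedError:
        # The host is alive but nothing listens on the port (RESET received).
        outcome = "refused"
    except socket.timeout:
        # No answer at all within the timeout: typical of a dead NIC or host.
        outcome = "timeout"
    except OSError as exc:
        outcome = f"error: {exc}"
    return outcome, time.monotonic() - start
```

For example, time_connect("192.0.2.10", 1521) could be run against the failed address (the address and port shown here are placeholders). A quick "refused" indicates the host is still reachable, whereas a "timeout" after a long wait points to the TCP-level delay described in this note.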
Normal scenario
~~~~~~~~~~~~
Normally, when a failure occurs, but the NIC is still alive, we can explain it as follows:
As with the three-way connection model, the Client talks to a host service advertised on a certain port number.
Unfortunately, the service is not up and therefore not listening on that port.
The Client initiates the normal TCP/IP three-way handshake, but it gets back a RESET packet from the host and stops trying to connect (few seconds).
At least the host is available to communicate this information.
Abnormal scenario
~~~~~~~~~~~~~
However, when the NIC dies, or is not available to the TCP/IP network, here are those client connection request steps:
The Client talks to a host service on a host that does not exist, i.e. there is no system operational on the IP address the client is trying to connect to.
Therefore there is no possibility that something will even respond to that IP address.
Again as per the connection model, the client initiates a TCP/IP three-way handshake, but there is no response.
The client waits a specified amount of time (usually OS configurable), for example 200ms.
It sends the SYN packet again, but still gets no response.
So it waits 400ms and tries again.
Still no response, so it waits 800ms and tries again.
Again, no response, so it waits 1600ms and tries again.
After another wait of 3200ms, it tries once more.
By now 6.2 seconds have passed.
It then keeps retrying every 3200ms until a hard limit is reached and it stops.
On Oracle Solaris this interval is tcp_ip_abort_cinterval and defaults to 3 minutes (180000ms).
You would have to check with the OS vendor what this is on any particular platform (see OS settings help following).
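The retry arithmetic above can be checked with a short sketch. It models the schedule exactly as this note describes it (a 200ms initial wait that doubles up to a 3200ms ceiling, stopping at the abort interval). Real TCP stacks use their own initial values, and the function name and defaults below are illustrative assumptions only.

```python
def syn_retry_waits(initial_ms=200, cap_ms=3200, abort_ms=180_000):
    """Return the waits (in ms) between SYN attempts until the abort interval."""
    waits = []
    elapsed = 0
    wait = initial_ms
    # Keep retrying while the next wait still fits inside the abort interval.
    while elapsed + wait <= abort_ms:
        waits.append(wait)
        elapsed += wait
        wait = min(wait * 2, cap_ms)  # double the wait, up to the ceiling
    return waits


waits = syn_retry_waits()
print(waits[:5], sum(waits[:5]))  # the first five waits and their total in ms
print(len(waits), sum(waits))     # all waits under the 180000ms default
```

Under this model the first five waits add up to 6200ms, and with the 180000ms default the client keeps retrying for roughly three minutes. Calling syn_retry_waits(abort_ms=10_000) shows the effect of a much shorter abort interval: the client gives up after about 9.4 seconds.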
Solution
1. 10g/11g Timeout parameters (with limitations):
With the release of 10g, Oracle now has the capability of timing out within a desired period, instead of waiting for the TCP timeout to occur.
The following settings can be used in the sqlnet.ora file on the client or server:
sqlnet.inbound_connect_timeout (server)
sqlnet.send_timeout (client and/or server)
sqlnet.recv_timeout (client and/or server)
However, please note that these timeouts do not take effect at Connect-time failover, but rather during TAF operations.
In other words, these 10g settings will NOT correct any shortcomings at the TCP layer.
Oracle is STILL reliant on that layer and will still be susceptible to any abnormalities there.
The timeout values will only work when the TCP/IP address is alive but other software failures occur.
This is important to note.
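As an illustration only, the three parameters listed above might appear in a sqlnet.ora file as shown below. The values (in seconds) are arbitrary placeholders, not recommendations, and must be chosen and tested for each installation.

```
# sqlnet.ora fragment (illustrative values, in seconds)

# Server side
SQLNET.INBOUND_CONNECT_TIMEOUT = 60

# Client and/or server side
SQLNET.SEND_TIMEOUT = 30
SQLNET.RECV_TIMEOUT = 30
```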
Note on TIMEOUT parameters:
There are some isolated reports (Bug 6894171 for example) where the TIMEOUT parameters were not working as expected and only worked correctly with the 10.2.0.4 patchset. This was not reproducible, but please be aware that any use of these TIMEOUT parameters needs to be fully tested on any installation prior to inclusion into production (as should always be done).
2. So for all Oracle client versions 11g and below, the only recommended solution would be to utilize some type of Hardware Failover mechanism or HA (High Availability) or Cluster Services, such as one that uses a Virtual IP address (VIP) to reference more than one physical IP address.
There may be Hardware Load Balancing systems or Clusters Services that can do this.
The OS vendor would have to be contacted for such systems.
Please note, though, that this recommendation applies to a single database and listener.
3. For RAC or OPS systems, due to the fact that they have more than one listener, the HA setup is far more complex and needs to be dealt with by a RAC or OPS expert and references.
Oracle 10g does have VIP functionality, but this will not be explained in this note.
(See references following ...)
What happens (in a nutshell) is that the virtual IP address represents one or other actual IP addresses.
If one address "dies", then the OS specific HA mechanism detects its failure, and switches to the next available IP address on the system setup.
Oracle (or other software using the TCP layer) is unaware of this switch, as it only knows the Virtual IP address, so the program continues on, unbroken.
If a VIP is not available, there are other options, but these involve changes to the actual TCP settings at the OS level and may have adverse effects on any other components using the same TCP/IP layer.
Speak with your System Admin to determine which TCP parameters need to be set at the OS level
to allow the CLIENT to fail over more quickly in the case of a Node Hardware / NIC failure.
Just for informational purposes, details on some client platforms (Linux, Solaris, AIX, and
Microsoft Windows) are given below, but if your clients are on other platforms or you need more detailed information on them, then you must contact those vendors and request similar settings or recommendations from them.
TCP SETTINGS - LINUX
net.ipv4.tcp_keepalive_time
net.ipv4.tcp_keepalive_intvl
net.ipv4.tcp_keepalive_probes
net.ipv4.tcp_retries (Linux kernels expose this setting as net.ipv4.tcp_retries1 and net.ipv4.tcp_retries2)
net.ipv4.tcp_syn_retries
A real life example follows:
Say the Oracle Client is on Linux, and the Primary Server node (any platform running 9i RAC)
is "powered off". This may result in the time to get an error at the client end being unacceptably slow.
There are reports of this being up to 4 minutes for the client to detect that Node1 was down and to reconnect to the other node which is still up. It has also been seen that a TELNET session attempt to the failed node also took about 4 minutes to generate an error.
After changing the following Linux parameters:
net.ipv4.tcp_keepalive_time 3000
net.ipv4.tcp_retries 5
net.ipv4.tcp_syn_retries 1
the timeout period was reduced to about 20 seconds.
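To see why the SYN retry setting has such a large effect, the connect timeout can be estimated from it. The sketch below is Python and rests on stated assumptions: an initial retransmission timeout of 1 second that doubles after each attempt (as on recent Linux kernels; older kernels used a larger initial value, so absolute numbers will differ from those reported above). The helper names read_sysctl and syn_connect_timeout are hypothetical.

```python
from pathlib import Path


def read_sysctl(name, root="/proc/sys"):
    """Read an integer sysctl such as 'net.ipv4.tcp_syn_retries' from procfs."""
    path = Path(root, *name.split("."))
    return int(path.read_text().split()[0])


def syn_connect_timeout(syn_retries, initial_rto=1.0, max_rto=120.0):
    """Estimate the seconds until connect() fails when no SYN is answered."""
    total = 0.0
    rto = initial_rto
    for _ in range(syn_retries + 1):  # the first SYN plus each retry
        total += rto
        rto = min(rto * 2, max_rto)   # exponential backoff with a ceiling
    return total
```

On a Linux client, syn_connect_timeout(read_sysctl("net.ipv4.tcp_syn_retries")) gives a rough figure for how long a connection attempt to a dead address will block. Under the assumptions above, one retry gives about 3 seconds and six retries about 127 seconds.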
TCP SETTINGS - Oracle SOLARIS
The following are CLIENT settings suggested by Sun (for test purposes) that would make NIC failures on the Server quicker to detect by the client system:
tcp_ip_abort_cinterval = 10000 (default is 180000)
tcp_keepalive_interval = 240000 (default is 7200000)
tcp_ip_abort_interval = 60000 (default is 480000)
Note: These are OS settings and affect ALL applications, tools and components running on the System.
TCP SETTINGS - AIX
The following are CLIENT settings suggested by IBM (AIX) and for test purposes, that would make NIC failures on the Server quicker to detect by the client system.
AIX TCP_KEEPIDLE
Description: This determines the length of inactivity before keepalive messages are sent, and therefore how long an idle connection stays in an active/ESTABLISHED state.
Default value: 14400 (in half seconds, which is 2 hours)
TCP_KEEPINTVL
Description: Specifies how often these keepalive probe messages are sent.
The connection is considered broken after 8 unanswered probes.
Default value: 150 (in half seconds which is 75 seconds)
TCP_KEEPINIT
Description: Specifies the initial timeout value for TCP connections.
Default value: 150 (in half seconds which is 75 seconds)
To check the current settings, type "no -a".
To change a setting, as root, issue "no -o <parameter>=<value>"
For example:
no -o tcp_keepidle=1200
no -o tcp_keepintvl=40
These would set the values to 10 minutes and 20 seconds respectively.
To make the change permanent, the above two commands need to be added to the end of /etc/rc.tcpip
This is a ROOT function and so should only be done by the System Admin.
Note: Once again, these are OS settings and affect ALL applications, tools and components running on the
AIX System.
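Because the AIX values above are expressed in half seconds, a small conversion sketch can help avoid mistakes. This is Python, with hypothetical helper names, and it assumes a simplified model in which a dead peer is detected once the idle period and all 8 probe intervals have elapsed.

```python
def half_seconds_to_seconds(value):
    """Convert an AIX 'no' tunable expressed in half seconds to seconds."""
    return value / 2


def keepalive_detection_seconds(keepidle, keepintvl, probes=8):
    """Idle time plus all unanswered probes; inputs are in half seconds."""
    idle = half_seconds_to_seconds(keepidle)
    interval = half_seconds_to_seconds(keepintvl)
    return idle + probes * interval


# Defaults quoted in this note: tcp_keepidle=14400, tcp_keepintvl=150
print(keepalive_detection_seconds(14400, 150))
# Example values from this note: tcp_keepidle=1200, tcp_keepintvl=40
print(keepalive_detection_seconds(1200, 40))
```

Under this model the defaults give 7800 seconds (130 minutes) before a dead peer is detected, while the example values reduce that to 760 seconds (under 13 minutes).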
TCP SETTINGS - Microsoft Windows
The following are client settings suggested by Microsoft (all warnings noted in the above sections apply here too):
Microsoft TCP parameters are changed in the registry, under:
HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\Tcpip\Parameters
(TcpTimedWaitDelay, and KeepAlive)
To do this:
* Backup the registry ...
* Open the registry key HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters
* Add KeepAliveTime : REG_DWORD to the registry (Note: The correct case is required for KeepAliveTime)
* Set the value, in milliseconds, to a reasonable timeout, e.g. 120000 = 2 minutes
* Save this
* Reboot
For additional information, please see the following articles on the Microsoft support web site
(support.microsoft.com): Q120642, Q102974, and Q170359.
IMPORTANT NOTES!!!
- Please remember that any changes at the OS level are suggestions only, and need to be confirmed and actioned by the person/s responsible for that layer of the system.
- The OS TCP/IP parameters given in this note are suggestions ONLY and are for informational purposes.
They should ONLY be adjusted in conjunction with, and if agreed to by, the Network or System Administrator(s), owing to the impact they will have on other 3rd party applications using the same TCP Network.
- Testing should be gradual, and all components in the entire system should be confirmed to work correctly before customized settings are kept in Production.
As mentioned, Oracle's recommendation to correct these Hardware failures would be through the use
of a Hardware HA system. The Hardware vendor should be able to provide you with options.
References
NOTE:259301.1 - CRS and 10g/11.1 Real Application Clusters
NOTE:438060.1 - Reducing Long Connect Time When One of Listener Machine Is Down
BUG:6894171 - (MAL-)FUNCTIONING OF SQLNET.SEND_TIMEOUT AND SQLNET.RECV_TIMEOUT