MQSeries Auto Channel Recovery
William Hao
Communications Middleware Worldwide Technical Operations
July 11, 2002
Contents
• Overview of Worldspan’s Current MQ Connectivity
• Summary of MQ Channel Issues
• Solutions to MQ Channel Issues
• Conclusion/Benefits
Overview of Worldspan’s Current MQ Connectivity
[Diagram labels: TPF Loosely-Coupled Complex; UNIX MQ Hub; WIN Servers; Remote MQ Connections; OS/390; UNIX; OTHERS; UNISYS]
Summary of MQ Channel Issues
• Automated Channel Retry in TPF (PJ28758) not yet available at the time.
[Diagram labels: TPF; chl restart]
Summary of MQ Channel Issues (cont.)
• Message sequence numbering between sender and receiver channel pair gets out of sync.
Sequence numbers are generated at the sending end of the channel and are incremented before being used, which means that the current seq num is that of the last message sent. These are filed for the last message transferred in a batch and are used during channel start-up to ensure that both ends agree on which messages have been transferred successfully.
[Diagram: two MQ Servers, SENDER and RECEIVER, showing mismatched sequence numbers (Msg seq = 00123 and Msg seq = 00113)]
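The out-of-sync condition can be illustrated with a small model. This is only a sketch, not MQSeries code: the `ChannelEnd` class and both functions are hypothetical names, and assigning 123 to the sender and 113 to the receiver is an assumption (the diagram gives the two values, but the extracted text does not make clear which end holds which).

```python
from dataclasses import dataclass


@dataclass
class ChannelEnd:
    """One end of a sender/receiver channel pair (illustrative model only)."""
    name: str
    last_seq: int = 0  # sequence number of the last message transferred


def send_batch(sender: ChannelEnd, receiver: ChannelEnd, count: int) -> None:
    """Transfer `count` messages; the number is incremented before use."""
    for _ in range(count):
        sender.last_seq += 1                 # increment first, then use
        receiver.last_seq = sender.last_seq  # receiver records what it got


def can_start(sender: ChannelEnd, receiver: ChannelEnd) -> bool:
    """Start-up check: both ends must agree on the last message transferred."""
    return sender.last_seq == receiver.last_seq


sender = ChannelEnd("SENDER")
receiver = ChannelEnd("RECEIVER")
send_batch(sender, receiver, 113)
in_sync = can_start(sender, receiver)

# Assumed failure: the sender's filed number no longer matches the receiver's.
sender.last_seq = 123
out_of_sync = not can_start(sender, receiver)
```

Once the two ends disagree, the start-up check fails until one side is reset, which is what the RESET steps later in the deck address.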
Summary of MQ Channel Issues (cont.)
• Sender channels go into INDOUBT status.
In MQ, messages are always transferred individually; however, these are committed or backed out as a batch. When MQ commits a batch, it syncpoints a logical unit of work (LUW). If this syncpoint procedure is interrupted, an indoubt chl condition may occur.
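A rough way to picture the batch behavior is sketched below. It is a simplified model under stated assumptions: `transfer_batch` and its parameters are hypothetical, the status string `RUNNING` is used only as a label for the normal case, and the real syncpoint exchange between the two ends is reduced to a single boolean.

```python
def transfer_batch(xmitq, batch_size, confirmation_arrives):
    """Send one batch and return (sender_status, remaining_xmitq).

    Messages are transferred one by one, but the outcome is decided
    for the batch as a whole.
    """
    batch = xmitq[:batch_size]
    for _msg in batch:
        pass  # placeholder for the individual message transfer
    if not confirmation_arrives:
        # Syncpoint interrupted: the sender cannot tell whether the partner
        # committed the batch, so the messages stay on the queue unresolved.
        return "INDOUBT", list(xmitq)
    # Batch confirmed: commit the unit of work and drop the sent messages.
    return "RUNNING", list(xmitq[batch_size:])


ok_status, ok_rest = transfer_batch([1, 2, 3, 4, 5], 3, confirmation_arrives=True)
bad_status, bad_rest = transfer_batch([1, 2, 3, 4, 5], 3, confirmation_arrives=False)
```

In the interrupted case nothing is removed from the queue, which is why the sender later has to be told whether to commit or back out the batch.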
Summary of MQ Channel Issues (cont.)
• TPF rcvr chl shows READY but partner sdr chl in UNIX cannot establish channel connection.
• UNIX rcvr chl shows RUNNING but partner sdr chl in TPF cannot establish channel connection.
[Diagram: two MQ Servers; labels: start chl; ready / running]
Automated Channel Recovery Function in TPF
Cycle to NORM activates a time-initiated auto-chl recovery function which has the following features:
• First time around, START all sdr chls.
• CRETs to itself every minute.
• Check status of all sdr chls and perform necessary action.
• Can be activated or deactivated via functional entry.
Automated Channel Reset for TPF
Is sender chl status neither READY nor INDOUBT?
YES: RESET and START the sender channel
Automated Channel Resolve for TPF
The sdr chl goes into INDOUBT status if it is in doubt with the partner rcvr chl about which msgs have been sent and received. In this situation, the sdr chl has to be told whether to COMMIT or BACKOUT these msgs. Although this condition rarely occurs, it requires manual intervention to resynchronize the channels via functional entry.
Automated Channel Resolve for TPF
Is sender chl status INDOUBT?
YES: RESOLVE, RESET and START the sender channel
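The two TPF flowcharts, together with the "first time around" rule, can be summarized as one decision function. This is a minimal sketch in Python for readability only; the actual TPF function is not shown in the slides, the function names are hypothetical, and status values other than READY and INDOUBT (such as `STOPPED` in the usage lines) are assumed for illustration.

```python
def tpf_recovery_actions(status: str, first_cycle: bool) -> list[str]:
    """Return the ordered actions for one sender channel in one cycle."""
    if first_cycle:
        return ["START"]                      # first time around: START all
    if status == "INDOUBT":
        return ["RESOLVE", "RESET", "START"]  # resolve flowchart
    if status != "READY":
        return ["RESET", "START"]             # reset flowchart
    return []                                 # READY: nothing to do


def run_cycle(statuses: dict[str, str], first_cycle: bool) -> dict[str, list[str]]:
    """Apply the decision to every sender channel, as the per-minute check does."""
    return {chl: tpf_recovery_actions(st, first_cycle)
            for chl, st in statuses.items()}


# Hypothetical channel names and statuses.
plan = run_cycle({"CHL.A": "READY", "CHL.B": "INDOUBT", "CHL.C": "STOPPED"},
                 first_cycle=False)
```

Checking INDOUBT before the "not READY" test matters, because an indoubt channel must be resolved before it is reset.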
Automated Channel Retry for UNIX
MQSeries for UNIX v5.2 has a built-in channel retry mechanism, which may be used in conjunction with the following channel attributes:
• SHORTRTY – Short retry is the max nbr times sdr chl will try to allocate a session to its partner (set at 60).
• SHORTTMR – Short retry timer is the interval in sec that the sdr chl waits before retrying to establish a chl connection during the short retry mode (set at 60 sec).
• LONGRTY – Long retry kicks in after SHORTRTY expires (set at 999999999).
Automated Channel Retry for UNIX (cont.)
• LONGTMR – Long retry timer is set at 1200 sec (20 min).
• HBINT – Heartbeat interval is the interval in sec at which the sending MCA sends heartbeat flows to unblock the receiving MCA so that it can disconnect the channel.
• DISCINT – Disconnect interval is the time out value in sec for the sdr chl to disconnect when the xmitq becomes empty.
Note: Setting these channel attributes works only when the Queue Manager of the partner channel supports them.
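To see what the retry settings above mean in elapsed time, the following sketch works through the arithmetic. It assumes that each attempt is preceded by one full timer wait and that the attempts themselves take no time; the function name is hypothetical, and only the four values come from the slides.

```python
SHORTRTY, SHORTTMR = 60, 60            # short retries, seconds between them
LONGRTY, LONGTMR = 999_999_999, 1200   # long retries, seconds between them


def seconds_until_attempt(n: int) -> int:
    """Elapsed wait, in seconds, before the n-th retry attempt (n >= 1)."""
    if n < 1 or n > SHORTRTY + LONGRTY:
        raise ValueError("attempt number out of range")
    short_tries = min(n, SHORTRTY)     # attempts made in short retry mode
    long_tries = max(0, n - SHORTRTY)  # attempts made in long retry mode
    return short_tries * SHORTTMR + long_tries * LONGTMR


short_phase_seconds = seconds_until_attempt(SHORTRTY)      # 60 * 60
first_long_retry = seconds_until_attempt(SHORTRTY + 1)     # plus one LONGTMR
```

Under these assumptions the short retry mode covers about one hour (60 attempts at 60 sec), after which the channel keeps retrying every 20 minutes.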
Automated Channel Recovery Function for UNIX
The CRON table contains a script file which has the following features:
• Activated once every minute.
• Check status of all sdr chls.
Automated Channel Resolve for UNIX
Is sender chl status INDOUBT?
YES: RESOLVE and RESET the sender channel
Automated Channel Reset for UNIX
Is sender channel in RETRYING mode?
YES: RESET the sender channel
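The UNIX flowcharts map naturally onto MQSC commands that a cron-driven script could pass to `runmqsc`. The sketch below only builds the command text; it is not the script from the presentation. The function name and channel name are hypothetical, how the script obtains the channel status is not shown in the slides, and the choice between `ACTION(COMMIT)` and `ACTION(BACKOUT)` is not stated there either, so it is left as a required argument. Resetting to `SEQNUM(1)` is likewise an assumption.

```python
def mqsc_commands(channel: str, status: str, resolve_action: str) -> list[str]:
    """Build the MQSC commands for one sender channel, per the flowcharts."""
    if resolve_action not in ("COMMIT", "BACKOUT"):
        raise ValueError("resolve_action must be COMMIT or BACKOUT")
    reset = f"RESET CHANNEL({channel}) SEQNUM(1)"   # assumed reset value
    if status == "INDOUBT":
        # Resolve flowchart: RESOLVE, then RESET.
        return [f"RESOLVE CHANNEL({channel}) ACTION({resolve_action})", reset]
    if status == "RETRYING":
        # Reset flowchart: RESET only; the built-in retry restarts the channel.
        return [reset]
    return []


# Hypothetical channel name; BACKOUT chosen only for the example.
indoubt_cmds = mqsc_commands("UNIX.TO.TPF", "INDOUBT", "BACKOUT")
retry_cmds = mqsc_commands("UNIX.TO.TPF", "RETRYING", "BACKOUT")
```

Unlike the TPF flow, no START is issued here, since the flowcharts show only RESOLVE and RESET for UNIX.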
Using TCP KeepAlive
• TCP KeepAlive knows nothing about MQSeries channels. It works on the TCP socket level.
• It sends a KeepAlive msg to the socket partner.
• If it detects that the partner is no longer available, it will disconnect the socket.
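Because KeepAlive is a socket-level option, an application turns it on per socket rather than per channel. The following is a generic illustration using Python's standard `socket` module, not the MQSeries or TPF implementation; the probe interval itself is normally a system-level setting and is not configured here.

```python
import socket


def enable_keepalive(sock: socket.socket) -> None:
    """Ask the TCP stack to probe the partner on an otherwise idle socket."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)


sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
enable_keepalive(sock)
# Read the option back to confirm that it is set.
keepalive_on = sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0
sock.close()
```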
Using TCP KeepAlive (cont.)
• Alleviates the problem where the rcvr chl shows READY or RUNNING but the partner sdr chl is retrying to establish a new connection.
[Diagram: two MQ Servers; labels: start chl; chl started]
Using TCP KeepAlive (cont.)
• For TPF native stack, PJ28289 (PUT16 APAR) enables the KeepAlive option of the socket used by MQ rcvr chls.
• For TPF native stack, the socket sweeper checks if a socket has the KeepAlive option and sends a KeepAlive msg. Currently, the socket sweeper activates every 2 minutes.
• For UNIX, the KeepAlive interval is currently set to 1 minute.
Conclusion
• Automated channel restart mechanism for TPF
• Automated RESET mechanism for TPF
• Automated RESOLVE mechanism for TPF
• Automated RESET mechanism for UNIX
• Automated RESOLVE mechanism for UNIX
• Automated channel resolution between TPF and UNIX