CPUG 2011 Chur Switzerland (c) Valeri Loukine 2011
ClusterXL
Under the hood
Wednesday, September 14, 2011
About the author
Valeri Loukine
• CCMA 0019
• Ex-Check Point
• Senior Security Consultant - Dimension Data
• Email: [email protected]
• Blog: http://checkpoint-master-architect.blogspot.com/
Agenda
• Understanding the Cluster Elements
• CCP
• State synchronization
• Pnote
• Check Point solution: ClusterXL (HA, Load Sharing)
• Advanced features and problematic scenarios
• 3rd party clusters
• Some Troubleshooting
CCP
• Cluster Control Protocol (CCP) runs over UDP port 8116.
• CCP runs on all cluster interfaces (in ClusterXL).
• Note: when VLANs are used, CCP runs only on the lowest VLAN ID (not true for VSX).
CCP is in charge of
• Health status reports
• Cluster member probing
• State change commands
• Querying for cluster membership
• State table synchronization
CCP modes
• Multicast or Broadcast
• To change: cphaconf set_ccp <broadcast|multicast>
• The setting is stored in $FWDIR/boot/ha_boot.conf
• The MAC address used for multicast is determined by a special algorithm.
Checking CCP state
# cphaprob -a if
Required interfaces: 3
Required secured interfaces: 1
eth0 UP non sync(non secured), multicast
eth1 UP non sync(non secured), multicast
eth2 UP sync(secured), multicast
Virtual cluster interfaces: 2
eth0 192.168.10.1
eth1 10.1.1.1
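The interface report above is easy to check by script. The following is a minimal Python sketch, assuming the output has one interface per line in the layout shown on this slide; the regular expression, the "DOWN" link state and the function name parse_interfaces are illustrative assumptions, not part of any Check Point tool.

```python
import re

# Sample text copied from the slide, one interface per line.
SAMPLE = """\
Required interfaces: 3
Required secured interfaces: 1
eth0 UP non sync(non secured), multicast
eth1 UP non sync(non secured), multicast
eth2 UP sync(secured), multicast
Virtual cluster interfaces: 2
eth0 192.168.10.1
eth1 10.1.1.1
"""

# Groups: interface name, link state, sync role, CCP mode.
# "DOWN" is an assumed spelling for a failed link; the slide only shows "UP".
IF_LINE = re.compile(
    r"^(\S+)\s+(UP|DOWN)\s+(non sync|sync)\((?:non )?secured\),\s*(\w+)$"
)


def parse_interfaces(text):
    """Return one dict per interface status line found in the text."""
    result = []
    for line in text.splitlines():
        match = IF_LINE.match(line.strip())
        if match:
            name, link, role, mode = match.groups()
            result.append({
                "name": name,
                "up": link == "UP",
                "sync": role == "sync",
                "ccp_mode": mode,
            })
    return result


interfaces = parse_interfaces(SAMPLE)
# Interfaces that are down or not using the expected CCP mode.
problems = [i["name"] for i in interfaces
            if not i["up"] or i["ccp_mode"] != "multicast"]
print(interfaces)
print("problems:", problems)
```

Lines that do not look like an interface status line (the counters and the virtual addresses) are simply skipped.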
State Sync
• Used to exchange kernel table information between cluster members
• Composed of two phases: Full Sync and Delta Sync
Full Sync
• Happens upon boot
• Handled by fwd over TCP port 256
• Does not have to be on the Sync interface
Delta Sync
• Done over CCP (UDP 8116)
• Updates changes in kernel tables incrementally
• Happens with every operation done to a synchronized kernel table
How it works
• When a cluster member starts, it requests a Full Sync before becoming a Standby member
• Full Sync replicates all existing kernel tables and existing connections’ information
• Upon Full Sync (FS) completion, the cluster member changes its state to Standby
• From now on, Delta Sync occurs
• Only changes are synced
• Some services can be excluded from sync (configurable)
Tuning Sync
Sync summary
• Global - supports all kernel table operations
• Transparent - does not require direct awareness of its existence
• Serves both ClusterXL and third parties without significant changes
• User-mode application information is not synced! (Security Servers, etc.)
• Costs some performance and bandwidth
• Can be tuned: the administrator can choose services that are not synced
fw ctl pstat - sync
Sync:
Version: new
Status: Able to Send/Receive sync packets
Sync packets sent:
total : 209693, retransmitted : 166, retrans reqs : 129, acks : 54
Sync packets received:
total : 134755, were queued : 221, dropped by net : 101
retrans reqs : 29, received 26 acks
retrans reqs for illegal seq : 0
dropped updates as a result of sync overload: 0
Callback statistics: handled 11 cb, average delay : 1, max delay : 1
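To turn the sync counters above into something comparable over time, a small script can extract them and compute ratios. This is a minimal Python sketch, assuming the "name : value" layout shown on this slide; the function name parse_sync_counters and the two ratios are illustrative choices, and no particular threshold is implied.

```python
import re

# Counter lines copied from the slide.
SAMPLE = """\
Sync packets sent:
 total : 209693, retransmitted : 166, retrans reqs : 129, acks : 54
Sync packets received:
 total : 134755, were queued : 221, dropped by net : 101
 retrans reqs : 29, received 26 acks
"""


def parse_sync_counters(text):
    """Collect 'name : value' pairs, keyed by (section, name)."""
    counters = {}
    section = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("Sync packets sent"):
            section = "sent"
        elif line.startswith("Sync packets received"):
            section = "received"
        elif section:
            # Only 'name : number' pairs are captured; free-form text such as
            # 'received 26 acks' has no colon and is ignored.
            for name, value in re.findall(r"([a-z ]+?)\s*:\s*(\d+)", line):
                counters[(section, name.strip())] = int(value)
    return counters


counters = parse_sync_counters(SAMPLE)
retrans_ratio = counters[("sent", "retransmitted")] / counters[("sent", "total")]
drop_ratio = counters[("received", "dropped by net")] / counters[("received", "total")]
print(f"retransmitted / sent: {retrans_ratio:.5f}")
print(f"dropped by net / received: {drop_ratio:.5f}")
```

Sampling these ratios at intervals shows whether the sync network is getting worse, which a single snapshot cannot tell.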
Under the hood
Pnote
• A critical device is also known as a Problem Notification (pnote)
• If a critical device stops functioning, this is defined as a Failure
• fwd and cphad are predefined
• also checked: policy (filter) , sync and interfaces
• To check: cphaprob list
• Can be used to cause a failover by adding a new faulty device
cphaprob list
Built-in Devices:
Device Name: Interface Active Check
Current state: OK
Registered Devices:
Device Name: cphad
Registration number: 2
Timeout: 2 sec
Current state: OK
Time since last report: 0 sec
Device Name: fwd
Registration number: 3
Timeout: 2 sec
Current state: OK
Time since last report: 0.8 sec
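The device list above can be checked for failed pnotes with a few lines of script. This is a minimal Python sketch, assuming one "Field: value" pair per line as in the listing; the helper names parse_devices and failed_devices are hypothetical.

```python
# Device list copied from the slide, one field per line.
SAMPLE = """\
Built-in Devices:
Device Name: Interface Active Check
Current state: OK
Registered Devices:
Device Name: cphad
Registration number: 2
Timeout: 2 sec
Current state: OK
Time since last report: 0 sec
Device Name: fwd
Registration number: 3
Timeout: 2 sec
Current state: OK
Time since last report: 0.8 sec
"""


def parse_devices(text):
    """Group 'Field: value' lines into one dict per 'Device Name' entry."""
    devices = []
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue
        key, value = key.strip(), value.strip()
        if key == "Device Name":
            devices.append({"name": value})
        elif devices and value:
            # Section headers such as 'Registered Devices:' have no value
            # and are skipped.
            devices[-1][key] = value
    return devices


def failed_devices(devices):
    """Names of devices whose reported state is anything other than OK."""
    return [d["name"] for d in devices if d.get("Current state") != "OK"]


devices = parse_devices(SAMPLE)
print([d["name"] for d in devices])
print("failed:", failed_devices(devices))
```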
Register new device
• cphaprob -d <device> -t <timeout(sec)> -s <ok|init|problem> [-p] register
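When the register command is driven from a script, for example to trigger a test failover by registering a device in the "problem" state, it helps to build the argument list in one place and validate the state first. This is a minimal Python sketch that only assembles the argument list from the syntax shown on this slide; the device name demo_pnote and the helper register_command are hypothetical, and nothing is executed.

```python
VALID_STATES = ("ok", "init", "problem")


def register_command(device, timeout_sec, state, with_p_flag=False):
    """Build the argv list for the register syntax shown on the slide.

    with_p_flag adds the optional -p switch; its meaning is not described
    on the slide, so it is passed through without interpretation.
    """
    if state not in VALID_STATES:
        raise ValueError(f"state must be one of {VALID_STATES}, got {state!r}")
    argv = ["cphaprob", "-d", device, "-t", str(timeout_sec), "-s", state]
    if with_p_flag:
        argv.append("-p")
    argv.append("register")
    return argv


# Hypothetical device name, registered as faulty to force a failover.
cmd = register_command("demo_pnote", 0, "problem")
print(" ".join(cmd))
```

On a cluster member the resulting list could be handed to subprocess.run; building it as a list avoids shell quoting problems with unusual device names.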
Clusters basic requirements
• OS must be the same.
• FW-1 version must be the same.
• Installed products must be the same.
NOTE: Check Point recommends using the same hardware on all cluster members.
ClusterXL basics
ClusterXL
• Check Point clustering product; members communicate over CCP (UDP 8116)
• Same for both HA and LS solutions
• Supports Solaris, SPLAT and Linux, not IPSO
• 4 modes of operation
• HA – Legacy and New
• LS – Multicast and unicast!
HA new mode
• Active - Standby roles
• CCP runs on multicast by default
• The Active member answers ARP who-has requests for the VIP with its physical MAC address
• State sync runs between the members
• If Active fails, Standby takes over and becomes Active
• By default no secondary failover
• CCP can be switched to broadcast (flooding the VIP segment)
# cphaprob stat
Cluster Mode: New High Availability (Active Up)
Number     Unique Address  Assigned Load  State
1 (local)  172.18.100.5    100%           active
2          172.18.100.6    0%             standby
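The member table above has the same shape in HA and Load Sharing modes, so one parser can serve monitoring scripts for both. This is a minimal Python sketch, assuming the column layout shown on this slide; the regular expression and the name parse_stat are illustrative assumptions.

```python
import re

# Output copied from the slide.
SAMPLE = """\
Cluster Mode: New High Availability (Active Up)
Number Unique Address Assigned Load State
1 (local) 172.18.100.5 100% active
2 172.18.100.6 0% standby
"""

# Groups: member number, optional "(local)" marker, IPv4 address, load, state.
ROW = re.compile(
    r"^(\d+)\s+(\(local\)\s+)?(\d+\.\d+\.\d+\.\d+)\s+(\d+)%\s+(.+)$"
)


def parse_stat(text):
    """Return (cluster_mode, members) from cphaprob stat style output."""
    mode, members = None, []
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("Cluster Mode:"):
            mode = line.split(":", 1)[1].strip()
            continue
        match = ROW.match(line)
        if match:
            members.append({
                "id": int(match.group(1)),
                "local": match.group(2) is not None,
                "address": match.group(3),
                "load": int(match.group(4)),
                "state": match.group(5).strip(),
            })
    return mode, members


mode, members = parse_stat(SAMPLE)
local = next(m for m in members if m["local"])
print(mode)
print("local member:", local["address"], local["state"])
```

Because the state column is captured as free text, a value such as "active (pivot)" from Load Sharing Unicast output is kept as is.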
HA legacy mode
• Linux only
• Both members are configured to have same IP addresses and SAME MAC addresses on clustered interfaces
• Managed through private interfaces
LS multicast
LS multicast mode
• Both members process traffic
• ARP who-has requests for the VIP are answered with a virtual multicast MAC shared among members
• All members receive the packet
• Random decision to process
# cphaprob stat
Cluster Mode: Load Sharing (Multicast)
Number     Unique Address  Assigned Load  State
1          192.10.0.1      33%            active
2          192.10.0.2      33%            active
3 (local)  192.10.0.3      33%            active
LS pivot (unicast)
LS pivot mode
• The Pivot always answers ARP who-has requests with its physical MAC
• It always receives the packets, but forwards some of them to other cluster members
• Forwarding is done on receiving network, original source MAC is replaced
• Load is not equally shared
# cphaprob stat
Cluster Mode: Load Sharing (Unicast)
Number     Unique Address  Assigned Load  State
1 (local)  10.10.10.57     30%            active (pivot)
2          10.10.10.61     70%            active
Advanced parameters
• Asymmetric Routing
• Session from standby (Forwarding)
• Block new Conns
• Different subnet
• Magic MAC
• Disconnected interfaces
Asymmetric Routing
• C2S packet goes through one cluster member
• S2C packet goes through another
• What’s the problem?
• Race conditions (syn/syn-ack/ack)
• Features without sync (Security Servers)
• NATed and encrypted connections
• Data connections
• Resolution:
• Flush and Ack mechanism – hold a packet that made a change in the kernel table until the change is synced successfully
• Sticky Decision Function
Decision Function
SDF - when?
• FTP - The data connections are passed through the same cluster member as the control connection
• NATed connections, including Static NAT and Hide NAT
• VPN, including encrypted connections generated from SecuRemote/SecureClient or from another VPN gateway.
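The cases listed above share one problem: the return packet does not carry the same address pair as the original one (for example after Hide NAT), so a decision based only on packet addresses can pick a different member. The following toy Python model illustrates that idea and nothing more. It is not Check Point's decision function; the hash, the class StickyDecision and all addresses are assumptions made up for the illustration.

```python
import hashlib


def default_decision(ip_a, ip_b, member_count):
    """Toy decision function: direction-independent hash of two addresses."""
    key = "|".join(sorted((ip_a, ip_b))).encode()
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:4], "big") % member_count


class StickyDecision:
    """Remembers which member owns a connection, including its NATed form."""

    def __init__(self, member_count):
        self.member_count = member_count
        self.owners = {}  # frozenset of two addresses -> member index

    def open_connection(self, client_ip, server_ip, nat_ip=None):
        member = default_decision(client_ip, server_ip, self.member_count)
        self.owners[frozenset((client_ip, server_ip))] = member
        if nat_ip is not None:
            # Replies arrive addressed to the NAT address, so that pair
            # is pinned to the same member as well.
            self.owners[frozenset((nat_ip, server_ip))] = member
        return member

    def decide(self, src_ip, dst_ip):
        key = frozenset((src_ip, dst_ip))
        if key in self.owners:
            return self.owners[key]
        return default_decision(src_ip, dst_ip, self.member_count)


sdf = StickyDecision(member_count=3)
owner = sdf.open_connection("10.0.0.5", "198.51.100.7", nat_ip="203.0.113.1")
# The server's reply is addressed to the NAT address, not to the client.
reply_member = sdf.decide("198.51.100.7", "203.0.113.1")
print("owner:", owner, "reply handled by:", reply_member)
```

Without the remembered entry, the reply would be hashed on a different address pair and could land on another member, which is the asymmetric routing case described earlier.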
SDF - limitations
• Some connection types are not recognized by SDF; the default DF is used instead
• SDF does not work with SecureXL; acceleration will be stopped
• Does not work for VPN routing
Session from Standby
• If a session starts from the Standby member:
• To the server, traffic goes directly from the Standby
• From the server, traffic goes to the Active member, which then forwards it to the Standby
Block new conns
• If sync is at risk, new connections should not be processed.
• Error message:
“FW-1: State synchronization is in risk. Please examine your synchronization network to avoid further problems!”
• fw_sync_block_new_conns
• Enable load detection - set to 0
• Disable load detection - set to -1
• The default is -1 on FW-1 and 0 on VSX
Different Subnet
• When VIP is not on the same subnet as physical member IP addresses
• Automatic ARP is not supported; local.arp is required
• May need some additional static routes
Magic MAC
• Used by CCP on Layer 2
• Belongs to all members on all interfaces
• Forward MAC is used to forward packets
• fwha_mac_magic 0xfe
• fwha_mac_forward_magic 0xfd
Disconnected Interfaces
• Interfaces that do not run CCP
• Sync Interface must NOT be disconnected
• In 3rd party all interfaces except for the sync interface
• Will not be monitored
• $FWDIR/conf/discntd.if, reboot
• May define in topology as private
• No need to list them in 3rd party
3rd party clusters
• There used to be many vendors
• Now Crossbeam and IPSO, what else?
• ClusterXL is used only for Sync
Cluster member state
• Active
• Active Attention
• Down
• Ready
• Standby
• Initializing
Active
• Everything is good
• Passing traffic
Active Attention
• Something is wrong in the cluster
• I am passing traffic
Down
• One of the critical devices is down
• Not passing traffic
Ready
• This member is upgraded while an older-version member is still Active
• Not passing traffic
Standby
• Everything is good
• Not passing traffic
Initializing
• Cluster member is booting up
• ClusterXL product is already running
• VPN-1 Pro is not yet ready
• Full Sync is not completed
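The state slides above can be condensed into a small lookup for monitoring scripts. This is a minimal Python sketch that encodes only what the slides state; the names MemberState and passes_traffic are hypothetical, and Initializing is left undetermined because the slides do not say whether such a member passes traffic.

```python
from enum import Enum
from typing import Optional


class MemberState(Enum):
    ACTIVE = "Active"
    ACTIVE_ATTENTION = "Active Attention"
    DOWN = "Down"
    READY = "Ready"
    STANDBY = "Standby"
    INITIALIZING = "Initializing"


# Traffic handling exactly as stated on the slides.
_PASSES_TRAFFIC = {
    MemberState.ACTIVE: True,
    MemberState.ACTIVE_ATTENTION: True,
    MemberState.DOWN: False,
    MemberState.READY: False,
    MemberState.STANDBY: False,
}


def passes_traffic(state: MemberState) -> Optional[bool]:
    """True/False where the slides say so, None where they do not."""
    return _PASSES_TRAFFIC.get(state)


for state in MemberState:
    print(f"{state.value}: {passes_traffic(state)}")
```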
Troubleshooting tools
CLI
• cphaprob list
• cphaprob -a if
• cphaprob state
• fw ctl pstat (check sync data)
• fw ctl debug -m cluster <flags>
fw ctl debug flags
• conf – Configuration related kdebug messages
• if - Interface tracking and validation
• stat - Cluster module state change
• select - Packet selection including DF
• ccp – Cluster control packet handling
• pnote - Pnote device
• mac – mac address sync
• forward – forwarding layer debug
• df – decision function
• drop – drops caused by SDF
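When debug sessions are scripted, a typo in a flag name is easy to miss. This is a minimal Python sketch that validates flag names against the list from these slides and assembles the argument list; it assumes the usual "fw ctl debug -m <module> + <flags>" form, the helper cluster_debug_command is hypothetical, and nothing is executed.

```python
# Flag names exactly as listed on the slides.
CLUSTER_FLAGS = {
    "conf", "if", "stat", "select", "ccp", "pnote",
    "mac", "forward", "df", "drop",
}


def cluster_debug_command(flags):
    """Build the argv list that enables the given cluster debug flags."""
    flags = list(flags)
    unknown = sorted(set(flags) - CLUSTER_FLAGS)
    if unknown:
        raise ValueError(f"unknown cluster debug flags: {unknown}")
    # The "+" (add flags) form is an assumption about the CLI syntax.
    return ["fw", "ctl", "debug", "-m", "cluster", "+"] + flags


cmd = cluster_debug_command(["ccp", "pnote"])
print(" ".join(cmd))
```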
Other tips
• Snoop (capture UDP port 8116 traffic to see CCP)
• fw monitor (forwarded packets may cause confusion)
Questions And Answers
Thank You For Your Time!