1. Adjusting Carbon Topology to Match High Availability Scenario Requirements
Afkham Azeez, Director of Architecture, WSO2 Inc
2. About Me
- PMC member, Apache Axis; Committer, Synapse & Web Services
- Member, Apache Software Foundation
- Co-author, Axis2 Web Services
- Director of Architecture, WSO2 Inc
- Blog: http://blog.afkham.org
- Twitter: afkham_azeez
3. Agenda
- A brief look at the WSO2 platform
- Carbon clustering for availability
- Cost of availability & related topologies
4. WSO2 Offerings
- WSO2 Carbon
  - Full platform of servers for deployment on-premise, in private or public cloud
  - Products share a consistent architecture and core platform services (e.g. logging, management, security, identity, caching) through OSGi and the Carbon Core
  - Includes ESB, AppServer, Data Services, Governance, Identity, Business Process, and more
- WSO2 Stratos
  - Platform-as-a-Service (PaaS) Foundation
  - Supports running servers as elastic, metered, billed, multi-tenant with self-service
  - Including all Carbon Servers, PHP, Jetty, and a growing list through a standard Cartridge model
- WSO2 StratosLive http://stratoslive.wso2.com
  - WSO2's Public PaaS
  - An instance of Stratos running in the cloud with all Carbon Servers available
5. Consistent Architecture
- Carbon: A consistent set of class-leading enterprise servers
  - The same products run either on-premise or in the cloud, single-tenant or multi-tenant
  - Utilize the same Carbon core runtime for a seamless experience
- Stratos: A cloud platform for enterprise, hybrid and public deployment
  - Extends the deployment to support full self-service, elastic scaling, metering and billing
  - Supports Carbon and native server runtimes, including Java and non-Java servers such as Jetty and PHP
  - Re-uses the same core Carbon architecture to offer core PaaS services including: Identity, Logging, File, Relational Storage, Column Storage, Code Deployment, etc.
- Both projects share a common set of OSGi modules and a core runtime architecture
6. WSO2 SOA Platform
7. WSO2 Carbon
8. Availability
The degree to which a system, subsystem, or equipment is in a specified operable and committable state at the start of a mission, when the mission is called for at an unknown, i.e., a random, time.
Simply put, availability is the proportion of time a system is in a functioning condition.
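As a rough illustration of the "proportion of time" definition, the sketch below computes availability as uptime / (uptime + downtime) and turns a target availability into a yearly downtime budget. The class and method names are hypothetical and a 365-day year is assumed; nothing here is a WSO2 API.

```java
/** Hypothetical helper illustrating availability as a proportion of time. */
final class AvailabilityCalc {
    // Assumption: a 365-day year.
    static final double SECONDS_PER_YEAR = 365.0 * 24 * 60 * 60;

    /** Availability = uptime / (uptime + downtime), a value in [0, 1]. */
    static double availability(double uptimeSeconds, double downtimeSeconds) {
        double total = uptimeSeconds + downtimeSeconds;
        if (total <= 0) {
            throw new IllegalArgumentException("observation period must be positive");
        }
        return uptimeSeconds / total;
    }

    /** Seconds of downtime per year that a target availability still permits. */
    static double allowedDowntimePerYear(double targetAvailability) {
        if (targetAvailability < 0 || targetAvailability > 1) {
            throw new IllegalArgumentException("availability must be in [0, 1]");
        }
        return (1.0 - targetAvailability) * SECONDS_PER_YEAR;
    }

    public static void main(String[] args) {
        System.out.printf("availability: %.5f%n", availability(86_000, 400));
        System.out.printf("downtime budget at 99.9%%: %.0f s/year%n",
                allowedDowntimePerYear(0.999));
    }
}
```

Expressing a target as a downtime budget tends to make the requirement easier to discuss than a bare percentage.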
9. Availability
10. Availability
11. High Availability (HA)
- A system that is designed for continuous operation in the event of a failure of one or more components. However, the system may display some degradation of service, but will continue to perform correctly.
- The proportion of time during which the service is accessible with reasonable response times should be close to 100%.
- All single points of failure should be eliminated.
12. HA, CO & CA
- Continuous Operation (CO): the ability to avoid planned outages. Hardware and software maintenance is carried out while applications remain available to users.
- Continuous Availability (CA): combines the characteristics of HA and CO to keep the applications running without any noticeable downtime.
- Hot update / graceful round-robin restart
13. High Availability Techniques
- Redundancy
  - Time, e.g. retransmit
  - Data, e.g. parity bits
  - Processing, e.g. redundant nodes
- Diversity, e.g. hybrid deployments: do the same thing using different implementations
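To make the data redundancy item concrete, here is a minimal sketch of an even-parity bit: one extra bit is stored alongside a byte so that a single flipped bit can be detected, though not corrected. The class name and methods are made up for illustration.

```java
/** Hypothetical example of data redundancy using a single even-parity bit. */
final class ParityExample {
    /** Returns 1 if the byte has an odd number of set bits, else 0 (even parity). */
    static int parityBit(int dataByte) {
        return Integer.bitCount(dataByte & 0xFF) % 2;
    }

    /** True if the received byte still matches the parity bit sent with it. */
    static boolean isConsistent(int receivedByte, int receivedParity) {
        return parityBit(receivedByte) == receivedParity;
    }

    public static void main(String[] args) {
        int data = 0b1011;                 // three bits set
        int parity = parityBit(data);      // redundant bit transmitted with the data
        int corrupted = data ^ 0b0100;     // simulate a single-bit error in transit
        System.out.println("intact copy consistent:    " + isConsistent(data, parity));
        System.out.println("corrupted copy consistent: " + isConsistent(corrupted, parity));
    }
}
```

A single parity bit only detects an odd number of flipped bits; combining it with time redundancy (retransmitting on a mismatch) is one way to recover.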
14. How to decide required availability?
- Average throughput (TPS)
- Max throughput (TPS)
- Monetary value of a transaction
- Average loss & max loss per second of downtime
- Decide on how much to invest on availability based on cost vs. benefit tradeoff
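The cost vs. benefit reasoning above can be sketched as simple arithmetic. The model below is an assumption made for illustration: loss per second is throughput times transaction value, and yearly loss is that figure times the expected seconds of downtime. All names and numbers are hypothetical.

```java
/** Hypothetical cost model for deciding how much to invest in availability. */
final class DowntimeCost {
    static final double SECONDS_PER_YEAR = 365.0 * 24 * 60 * 60; // assumes a 365-day year

    /** Loss per second of downtime = transactions per second x value of one transaction. */
    static double lossPerSecond(double tps, double valuePerTransaction) {
        return tps * valuePerTransaction;
    }

    /** Expected yearly loss at a given availability, using average throughput. */
    static double expectedAnnualLoss(double availability, double avgTps, double valuePerTransaction) {
        double downtimeSeconds = (1.0 - availability) * SECONDS_PER_YEAR;
        return downtimeSeconds * lossPerSecond(avgTps, valuePerTransaction);
    }

    /** True if the loss avoided by the better topology exceeds its extra yearly cost. */
    static boolean worthUpgrading(double currentAvailability, double targetAvailability,
                                  double avgTps, double valuePerTransaction,
                                  double extraAnnualCost) {
        double avoidedLoss = expectedAnnualLoss(currentAvailability, avgTps, valuePerTransaction)
                - expectedAnnualLoss(targetAvailability, avgTps, valuePerTransaction);
        return avoidedLoss > extraAnnualCost;
    }

    public static void main(String[] args) {
        // Made-up inputs: 100 TPS on average, 500 TPS at peak, 0.5 currency units per transaction.
        System.out.println("average loss/s: " + lossPerSecond(100, 0.5));
        System.out.println("max loss/s:     " + lossPerSecond(500, 0.5));
        System.out.println("upgrade 99% -> 99.9% worthwhile: "
                + worthUpgrading(0.99, 0.999, 100, 0.5, 1_000_000));
    }
}
```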
15. Patching Production Deployments
Patch Distribution Coordinator:
  1. Check patch list
  2. Pull new patch
  3. Push patch (to each node)
16. Patching Production Deployments
Patch Distribution Coordinator, round-robin:
  4. Maintenance mode
  5. Graceful restart
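The patching steps can be sketched as a small in-memory simulation: the patch is first staged on every node, then nodes are taken into maintenance mode and restarted one at a time so that the rest keep serving. The `Node` type and method names are hypothetical stand-ins, not the actual Patch Distribution Coordinator.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical simulation of push-then-round-robin-restart patching. */
final class RollingPatchSketch {
    /** Stand-in for a cluster node; not a WSO2 class. */
    static final class Node {
        final String name;
        String patchLevel = "none";
        String stagedPatch;        // patch copied to the node but not yet active
        boolean inService = true;  // whether the load balancer routes to this node
        Node(String name) { this.name = name; }
    }

    /** Step 3: copy the patch to every node; nodes keep serving meanwhile. */
    static void pushPatch(List<Node> nodes, String patch) {
        for (Node n : nodes) {
            n.stagedPatch = patch;
        }
    }

    /** Steps 4-5, one node at a time. Returns the most nodes ever out of service at once. */
    static int roundRobinRestart(List<Node> nodes) {
        int maxOut = 0;
        for (Node n : nodes) {
            n.inService = false;                       // 4. maintenance mode
            maxOut = Math.max(maxOut, countOutOfService(nodes));
            if (n.stagedPatch != null) {
                n.patchLevel = n.stagedPatch;          // 5. restart picks up the staged patch
            }
            n.inService = true;                        // rejoin before touching the next node
        }
        return maxOut;
    }

    static int countOutOfService(List<Node> nodes) {
        int out = 0;
        for (Node n : nodes) {
            if (!n.inService) out++;
        }
        return out;
    }

    public static void main(String[] args) {
        List<Node> nodes = Arrays.asList(new Node("n1"), new Node("n2"), new Node("n3"));
        pushPatch(nodes, "patch-001");
        System.out.println("max nodes out of service: " + roundRobinRestart(nodes));
    }
}
```

Restarting one node at a time keeps capacity loss bounded, which is what lets patching happen without a planned outage.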
17. Clustering
- Clustering for scalability
- Clustering for availability
18. Clustering for scalability
19. Clustering for availability
- Group Communication Channel/State replication
21. Static membership
- Predefined members
- Other (non-predefined) nodes cannot join
[Diagram: static group with members M1 to M4 and an outside node N]
22. Dynamic membership
- No predefined members
- Nodes can join & leave
[Diagram: dynamic group with members M1 to M4; node N sends Join]
23. Hybrid membership
- Some predefined (well-known) members, and some dynamic members
- Nodes can join & leave
- Membership revolves around the static members
[Diagram: hybrid group with static members M1 to M4 and dynamic members M5 to M7; node N sends Join (IP, Port)]
24. Multicast based membership management
[Diagram: node N sends Join (IP, Port) to a group of members M1 to M4]
25. Well-known Address (WKA) based membership management
[Diagram: hybrid group with static well-known members WK1 to WK4 and dynamic members M5 to M7, showing Join (IP, Port) and Notify messages for a joining node N]
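The join flow for WKA based membership can be sketched as an in-memory simulation: a newcomer needs at least one reachable well-known member, which admits it and notifies the members that are up. The classes below are hypothetical and do not reflect the actual Carbon or Axis2 clustering implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical simulation of well-known address (WKA) based membership. */
final class WkaMembershipSketch {
    static final class Member {
        final String address;                      // "ip:port"
        boolean up = true;
        final Set<String> view = new TreeSet<>();  // addresses this member knows about
        Member(String address) {
            this.address = address;
            view.add(address);
        }
    }

    private final List<Member> wellKnown = new ArrayList<>();
    private final List<Member> members = new ArrayList<>();

    /** Registers a static, well-known member. */
    void addWellKnown(Member m) {
        wellKnown.add(m);
        admit(m);
    }

    /** A newcomer joins through the first reachable well-known member, if any. */
    boolean join(Member newcomer) {
        for (Member wk : wellKnown) {
            if (wk.up) {
                admit(newcomer);
                return true;
            }
        }
        return false; // all well-known members are down: the join cannot proceed
    }

    /** Adds the member, sends it the current member list and notifies members that are up. */
    private void admit(Member m) {
        members.add(m);
        for (Member existing : members) {
            m.view.add(existing.address);
            if (existing.up) {
                existing.view.add(m.address);
            }
        }
    }

    public static void main(String[] args) {
        WkaMembershipSketch group = new WkaMembershipSketch();
        Member wk1 = new Member("10.0.0.1:4000");
        group.addWellKnown(wk1);
        Member n = new Member("10.0.0.9:4000");
        System.out.println("joined: " + group.join(n) + ", WK1 view: " + wk1.view);
    }
}
```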
26. Multicast vs. WKA
Multicast:
- All nodes should be in the same subnet
- All nodes should be in the same multicast domain
- Multicasting should not be blocked
- No fixed IP addresses or hosts required
- Failure of any member does not affect membership discovery
- Does not work on IaaSs such as Amazon EC2
WKA:
- Nodes can be in different networks
- No multicasting requirement
- At least one well-known IP address or host required
- New members can join with some WKA nodes down, but not if all WKA nodes are down
- IaaS-friendly
- Requires keepalived, elastic IPs or some other mechanism for remapping IP addresses of WK members in cases of failure
27. Multicast vs. WKA: how to decide?
Multicast:
- Cluster is going to be set up in a network where multicasting is allowed
WKA:
- Cloud based deployment
- Members are distributed across datacenters & regions
- Multicasting blocked
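The decision criteria can be written down as a small rule. The three boolean inputs are hypothetical descriptions of the target environment, and the rule is only a sketch of the guidance above rather than a configuration API.

```java
/** Hypothetical rule of thumb for picking a membership scheme. */
final class MembershipSchemeChooser {
    enum Scheme { MULTICAST, WKA }

    /**
     * WKA is chosen if any WKA criterion applies; multicast remains only for
     * networks where multicasting is allowed and none of the others hold.
     */
    static Scheme choose(boolean multicastAllowed,
                         boolean cloudDeployment,
                         boolean spansDatacentersOrRegions) {
        if (!multicastAllowed || cloudDeployment || spansDatacentersOrRegions) {
            return Scheme.WKA;
        }
        return Scheme.MULTICAST;
    }

    public static void main(String[] args) {
        System.out.println("on-premise LAN with multicast: " + choose(true, false, false));
        System.out.println("cloud deployment:              " + choose(true, true, false));
    }
}
```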
29. State Replication
JSR-107/JCache: a standard Java Caching API for use by developers and a standard SPI ("Service Provider Interface") for use by implementers.
    import javax.cache.*;

    // Look up a named cache, store a value and read it back
    CacheManager cacheMgr = Caching.getCacheManager();
    Cache<String, Integer> cache = cacheMgr.getCache(cacheName);
    cache.put(key, sampleValue);
    Integer i = cache.get(key);
30. State Replication
CarbonContext based API:
    // Cache obtained from the current CarbonContext
    Cache cache = CarbonContext.getCurrentContext().getCache();
    cache.put(key, sampleValue);
    Integer i = (Integer) cache.get(key);
Axis2 Contexts: using the Axis2 clustering StateManager (axis2.xml)
31. Elastic Load Balancer 2.0
- New sysadmin-friendly configuration language
- High performance PassThrough transport
- Tenant-aware load balancing
- Ability to dedicate clusters for tenants (private jet mode)
- Improved auto-scaler
- Separate IaaS-aware Cloud controller takes care of spawning new instances on different IaaSs
32. Tenant-aware LB
33. Private Jet mode
Analogy:
- Economy class: no SLA management, only elasticity
- Business class: elasticity plus SLA guarantees
- Private Jet: guaranteed isolated VMs or machines for a specific tenant; still elastically scaled
34. Private Jet Mode
35. Topologies
- Single node
- Multi-node with LB
- Multi-node with elasticity using ELB
- Management & worker node separated
- Multi-zone or multi-datacenter deployment
- Multi-region
39. Active cluster, multiple LB
[Diagram: three active nodes behind multiple load balancers with keepalived; availability and cost scale from LOWEST to HIGHEST]
40. Management & Worker Node Separation
- Proper separation of concerns: management nodes specialize in management of the setup while worker nodes specialize in serving requests to deployment artifacts
- Only management nodes are authorized to add new artifacts into the system or make configuration changes
- Worker nodes can only deploy artifacts & read configuration
- Lower memory footprint in the worker nodes because the management console related OSGi bundles are not loaded
- Improved security: management nodes can be behind the internal firewall & be exposed to clients running within the organization only, while worker nodes can be exposed to external clients
- Isolation of failures
44. Multi-zone or multi-datacenter Deployment
[Diagram: Cloud Controller with Zone 1 and Zone 2 in Region X; availability and cost scale from LOWEST to HIGHEST]
45. Multi-region deployment
[Diagram: Zone 1 and Zone 2 in Region X, Zone 1 and Zone 2 in Region Y; availability and cost scale from LOWEST to HIGHEST]
46. Multi-IaaS Deployment
[Diagram: Cloud Controller]
47. Multiple IaaS (hybrid) Deployment
[Diagram: Zone 1 and Zone 2 in each of a private cloud (data center), Amazon EC2 and Rackspace Cloud; availability and cost scale from LOWEST to HIGHEST]
48. Cost of Availability
- Single Node
- Primary-Secondary, single LB
- Primary-Secondary, with multiple LBs
- Multi-node active cluster, single zone
- Multi-zone
- Multi-region
- Multi-IaaS
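To see why adding redundant nodes and load balancers raises availability, the standard series/parallel formulas can be applied. The sketch assumes component failures are independent, which real deployments only approximate; the class name and the figures used are hypothetical.

```java
/** Hypothetical calculator for the availability of simple topologies. */
final class TopologyAvailability {
    /** Redundant (parallel) nodes: the service is down only if every node is down. */
    static double parallel(double nodeAvailability, int nodes) {
        if (nodes < 1) {
            throw new IllegalArgumentException("need at least one node");
        }
        return 1.0 - Math.pow(1.0 - nodeAvailability, nodes);
    }

    /** Components in series: every one of them must be up for the service to be up. */
    static double series(double... availabilities) {
        double result = 1.0;
        for (double a : availabilities) {
            result *= a;
        }
        return result;
    }

    public static void main(String[] args) {
        double node = 0.99; // made-up per-node availability
        double lb = 0.99;   // made-up per-LB availability
        System.out.printf("single node:              %.6f%n", node);
        System.out.printf("single LB + 2 nodes:      %.6f%n", series(lb, parallel(node, 2)));
        System.out.printf("2 LBs + 2 nodes:          %.6f%n",
                series(parallel(lb, 2), parallel(node, 2)));
    }
}
```

Under these assumptions a single load balancer caps the whole deployment at its own availability, which is the motivation for the multiple-LB topologies.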
49. HA for the Load Balancer
- Load balancer cluster
- Keepalived
- Elastic IP address
- Round Robin DNS
50. Monitoring Servers
- Monit: automatically provides alerts & restarts processes when monitored items (e.g. latency) cross certain thresholds
- New Relic
- Nagios
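The threshold-and-restart idea can be illustrated with a small watchdog. This is not the configuration or API of Monit, New Relic or Nagios; it is a hypothetical Java sketch in which a restart action fires after a configurable number of consecutive latency samples exceed a limit.

```java
/** Hypothetical watchdog: restart after N consecutive latency breaches. */
final class LatencyWatchdog {
    private final double thresholdMillis;
    private final int maxConsecutiveBreaches;
    private final Runnable restartAction;
    private int consecutiveBreaches = 0;

    LatencyWatchdog(double thresholdMillis, int maxConsecutiveBreaches, Runnable restartAction) {
        if (maxConsecutiveBreaches < 1) {
            throw new IllegalArgumentException("maxConsecutiveBreaches must be at least 1");
        }
        this.thresholdMillis = thresholdMillis;
        this.maxConsecutiveBreaches = maxConsecutiveBreaches;
        this.restartAction = restartAction;
    }

    /** Feeds one latency sample; returns true if this sample triggered a restart. */
    boolean observe(double latencyMillis) {
        if (latencyMillis <= thresholdMillis) {
            consecutiveBreaches = 0;   // a healthy sample resets the streak
            return false;
        }
        consecutiveBreaches++;
        if (consecutiveBreaches < maxConsecutiveBreaches) {
            return false;
        }
        consecutiveBreaches = 0;
        restartAction.run();           // e.g. alert an operator and restart the process
        return true;
    }

    public static void main(String[] args) {
        LatencyWatchdog watchdog = new LatencyWatchdog(200.0, 3,
                () -> System.out.println("restarting monitored process"));
        double[] samples = {120, 250, 300, 260};
        for (double s : samples) {
            watchdog.observe(s);
        }
    }
}
```

Requiring several consecutive breaches is a design choice that avoids restarting a server on a single slow request.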
51. References
Information on tenant-aware load balancing
http://sanjeewamalalgoda.blogspot.com/2012/03/tenant-aware-load-balancer-is-upcoming.html
http://sanjeewamalalgoda.blogspot.com/2012/05/tenant-aware-load-balancer.html
Scaling Stratos
http://srinathsview.blogspot.com/2012/06/scaling-wso2-stratos.html
http://blog.afkham.org/2011/09/how-to-setup-wso2-elastic-load-balancer.html
http://blog.afkham.org/2011/09/wso2-load-balancer-how-it-works.html
52. Auto-scaler service deployment
http://nirmalfdo.blogspot.com/2012/07/autoscaler-service-deployment.html
Auto-scaler service
http://nirmalfdo.blogspot.com/2012/07/wso2-autoscaler-service-part-i.html
Automatic failover for WSO2 ELB
http://gonesimple.org/2012/09/24/automatic-fail-over-for-wso2-elb/