Federation of Kubernetes Clusters ("Übernetes")
Kubecon 2015
Quinton Hoole <[email protected]>
Staff Software Engineer - Google
quinton_hoole@github
Google has beeeg data centers... ...but you know that already.
Images by Connie Zhou
But we also have rather a lot of them...
Treating these differently can have benefits...
[Diagram: users interact with a single Kubernetes cluster via UI, CLI, and API; the control plane servers manage many containers within one Cluster / Data Center / Availability Zone]
All you really care about? The UI, the API, and your containers.

[Diagram: users interact with Übernetes via UI, CLI, and API; the Übernetes control plane federates multiple underlying Kubernetes clusters, in the cloud and on premise (Federation)]
Why is this interesting?
Reason 1: High Availability
• Cloud providers have outages, yes, but...
• Has one of your application software upgrades ever gone terribly wrong?
• How about infrastructure upgrades (auth systems? quota? data store?)
• How about a fat-fingered config change?
• There are several interesting variants:
  • Multiple availability zones?
  • Multiple cloud providers?
[Diagram: a cross-cluster load balancer routes your paying customer's traffic across Cluster 1, Cluster 2, and Cluster 3]
Reason 2: Application Migration
• Migrating applications between clusters is tedious and error-prone if done manually.
• Much like software upgrades, you *can* script them, but (K)ubernetes just does it quicker/safer/better.
  • Now with rollback too!
• On-premise ↔ Cloud
• Amazon ↔ Google :-)
• ...
[Diagram: Ubernetes UI migrating an application from an on-premise cluster to an in-cloud cluster (Migrate: On Premise → Cloud), or on to a different cloud provider]
Reason 3: Policy Enforcement
• Some data must be stored and processed within specified political jurisdictions, by law.
• Some software/data must be on premise and air-gapped, by company policy.
• Some business units get to use the expensive gear, some don't.
• Auditing is also a big deal, so funnelling all operations through a central control point makes this easier.
[Diagram: Ubernetes UI spanning a U.S. cloud cluster, an E.U. cloud cluster, and an on-premise cluster]
Reason 4: Vendor Lock-in Avoidance
• Make it easy to migrate applications between cloud providers.
• Run the same app on multiple cloud providers and choose the best one for your:
  • workload characteristics
  • budget
  • performance requirements
  • availability requirements
[Diagram: Ubernetes UI spanning Kubernetes on GCE, Kubernetes on AWS, and Kubernetes on-premise]
Reason 5: Capacity Overflow
• Make intelligent placement decisions based on:
  • Utilization
  • Cost
  • Performance

[Diagram: a user asks Ubernetes to "run my stuff"; workloads are placed across the on-premise cluster, the preferred cloud provider, and another cloud provider]
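The placement idea above can be sketched as a simple weighted score over per-cluster utilization, cost, and performance. This is purely illustrative: the cluster names, metrics, and weights are made up, not part of any real Ubernetes API.

```python
# Hypothetical sketch: rank federated clusters for overflow placement.
# All names, metrics, and weights here are illustrative assumptions.

def placement_score(cluster, w_util=0.5, w_cost=0.3, w_perf=0.2):
    """Lower is better: prefer idle, cheap, fast clusters."""
    return (w_util * cluster["utilization"]   # fraction of capacity in use
            + w_cost * cluster["cost"]        # normalized cost per unit of work
            + w_perf * cluster["latency"])    # normalized latency to users

def choose_cluster(clusters):
    # Skip clusters that are already full, then take the best score.
    candidates = [c for c in clusters if c["utilization"] < 1.0]
    return min(candidates, key=placement_score)["name"]

clusters = [
    {"name": "on-premise",      "utilization": 0.95, "cost": 0.2, "latency": 0.1},
    {"name": "preferred-cloud", "utilization": 0.40, "cost": 0.5, "latency": 0.3},
    {"name": "other-cloud",     "utilization": 0.30, "cost": 0.9, "latency": 0.6},
]
```

With these numbers the nearly full on-premise cluster overflows to the preferred cloud provider, matching the diagram's flow.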
"OK, I'm sold. Where's the catch?"
Federation comes with some challenges...

[Diagram: federated clusters span Provider 1 (Zone A, Zone B, Zone D) and Provider 2 (Zone C)]
● Different bandwidth charges, latency, throughput and reliability
● Different service discovery (but DNS!)
● Consolidated monitoring & alerting
Cross-cluster load balancing
• Geographically aware DNS gets clients to the "closest" healthy cluster.
• Standard Kubernetes service load balancing within each cluster.
• New L7 LB's available soon.
  • Can be extended to divert traffic away from "healthy-but-saturated" clusters.
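The routing decision described above can be sketched in a few lines: pick the nearest cluster that is both healthy and below a saturation threshold. The cluster names, load numbers, and "distance" table are stand-ins for real geographic latency data, not a real implementation.

```python
# Illustrative sketch of the geo-aware DNS decision: route the client to
# the nearest cluster that is healthy and not saturated. All data is made up.

def pick_cluster(client_region, clusters, max_load=0.9):
    # Divert traffic away from "healthy-but-saturated" clusters too.
    healthy = [c for c in clusters
               if c["healthy"] and c["load"] < max_load]
    # "distance" is a pre-computed stand-in for geographic latency.
    return min(healthy, key=lambda c: c["distance"][client_region])["name"]

clusters = [
    {"name": "us-east", "healthy": True,  "load": 0.95,   # saturated
     "distance": {"us": 1, "eu": 3}},
    {"name": "us-west", "healthy": True,  "load": 0.50,
     "distance": {"us": 2, "eu": 4}},
    {"name": "eu-west", "healthy": False, "load": 0.10,   # unhealthy
     "distance": {"us": 3, "eu": 1}},
]
```

Here a US client skips the saturated us-east cluster and the unhealthy eu-west cluster, landing on us-west.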
Cross-cluster service discovery
• DNS + Kubernetes cluster-local service discovery.
• Can default to cluster-local with failover to remote clusters.
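The local-first-with-failover behavior can be sketched as an ordered lookup: try the local cluster's DNS record, then any remote cluster that advertises the service. The record names roughly follow the Kubernetes service DNS convention, but the zone suffix and the in-memory record table are hypothetical.

```python
# Sketch of "cluster-local first, fail over to remote" service resolution.
# The records dict stands in for real DNS lookups; names are assumptions.

def lookup(service, local_cluster, clusters, records):
    # Prefer the local cluster's endpoint, then try remote clusters in order.
    for cluster in [local_cluster] + [c for c in clusters if c != local_cluster]:
        name = f"{service}.default.svc.{cluster}.example.com"  # hypothetical zone
        if name in records:
            return records[name]
    return None  # service not found anywhere in the federation

records = {
    # "db" only runs in cluster-2 in this example.
    "db.default.svc.cluster-2.example.com": "10.1.0.7",
}
```

A client in cluster-1 finds no local "db" endpoint and transparently fails over to cluster-2's.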
Location affinity
• Strictly coupled pods/applications• High bandwidth requirements• Low latency requirements• High fidelity requirements• Cannot easily span clusters
• Loosely coupled• Opposite of above• Relatively easily distributed across
clusters• Preferentially coupled
• Strongly coupled but can be migrated piecemeal.
Cross-cluster monitoring and auditing...
• "Cluster per tab" might suffice for small numbers of clusters
• Some monitoring solutions provide stronger integration and global summarization
Cluster Federation - The Implementation...
API Compatible with Kubernetes
• Less new stuff to learn.
• Can learn incrementally, as you need new functionality.
• Analogous argument applies to existing automation systems (PaaS etc.).
  • These can be ported to Ubernetes relatively easily.
• All Kubernetes entities are "federatable".
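The compatibility claim means a client written against the Kubernetes API should work against Ubernetes by changing only the endpoint. The sketch below fakes the HTTP layer; the path shape follows the Kubernetes v1 REST API, but both host names and the responses are hypothetical.

```python
# Sketch: identical client code against Kubernetes or Ubernetes;
# only the base URL differs. The fake_responses table stands in for HTTP.

def list_services(get, base_url):
    # Same v1-style path either way -- the API surface is shared.
    return get(f"{base_url}/api/v1/namespaces/default/services")

fake_responses = {
    "https://kubernetes.example.com/api/v1/namespaces/default/services":
        ["frontend"],
    "https://ubernetes.example.com/api/v1/namespaces/default/services":
        ["frontend", "frontend-eu"],   # federated view spans clusters
}
get = fake_responses.__getitem__  # stand-in for an HTTP GET
```

The only visible difference is that the federated endpoint returns a view spanning every member cluster.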
[Diagram: a client asks Ubernetes (or a Kubernetes cluster directly) to "run my stuff"; the applications run in the underlying clusters]
State and control resides in underlying clusters (for the most part)
• Better scalability
  • Kubernetes scales with number of nodes per cluster (<10,000)
  • Ubernetes scales with number of clusters (~100)
• Better fault isolation
  • Kubernetes clusters fail independently of Ubernetes
[Diagram: the Ubernetes API in front of multiple Kubernetes clusters, each with its own API, replication controllers, etc., and state]
Similar Control Loops to Kubernetes
• Drive current state -> desired state
  • But per-cluster state, not per-node, per-pod etc.
• Observed state is the truth
• Recurring pattern in the system
• Examples:
  • ReplicationController
  • Service

observe → diff → act
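The observe → diff → act pattern can be sketched as one reconciliation pass over per-cluster replica counts (rather than per-pod state). The cluster names and counts are illustrative, not taken from a real controller.

```python
# Minimal sketch of the observe -> diff -> act reconciliation loop,
# working on per-cluster replica counts. Data here is made up.

def reconcile(desired, observed):
    """Return per-cluster scaling actions that drive observed -> desired."""
    actions = []
    for cluster, want in desired.items():
        have = observed.get(cluster, 0)           # observe: actual state is truth
        if have != want:                          # diff
            actions.append((cluster, want - have))  # act: signed scale delta
    return actions

desired = {"cluster-1": 5, "cluster-2": 5}
observed = {"cluster-1": 5, "cluster-2": 3}
```

A real controller would run this loop continuously, re-observing after every pass; here one pass yields a single "scale cluster-2 up by 2" action.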
Modularity
Loose coupling is a goal everywhere:
• simpler
• composable
• extensible
• Code-level plugins where possible
• Multi-process where possible
• Isolate risk by interchangeable parts
• Examples:
  • MigrationController
  • Scheduler
Federation status & plans
• Federation Lite (single cluster, multiple zones)
  • In alpha Q4 2015
  • Productionized ~Q1 2016
• Federation Proper (multiple clusters, federated)
  • Alpha Q1 2016
• Google Container Engine (GKE)
  • Hosted Federation too
  • GKE Federation Lite ~Q1-Q2 2016
• PaaSes and Distros
  • RedHat OpenShift, CoreOS Tectonic, RedHat Atomic...
  • ...watch this space...
I want more!
• Requirements doc (comments welcome): tinyurl.com/ubernetesv2
• Special interest group: groups.google.com/forum/kubernetes-sig-federation
• [email protected]
• quinton_hoole@github