tcp cloud deployment experience with OpenContrail 1.05 on OpenStack Havana
Contrail deployment experience
Jakub Pavlik
About me
Jakub Pavlík
• Cloud Platform Engineer
• 3 years in Cloud
• 2 years in OpenStack
tcp cloud a.s.
• #1 deployer of OpenStack solutions in Czech Republic
• 2 years of experience with OpenStack
• Two core products
  • TCP Private Cloud – on premises, hosted in an arbitrary place
  • TCP Virtual Private Cloud (TCP VPC) – hosted in TCP, pay per use, tenant = customer (Contrail based)
TCP VPC Contrail/OpenStack Facts
• Contrail 1.05 with Havana on CentOS 6.4
• High Availability OpenStack architecture based on Corosync/Pacemaker, HAProxy and MySQL Galera
• 3x OpenStack Controllers, 2x Contrail Controllers – all virtual machines except one OpenStack Controller (because of Fibre Channel)
• Not deployed by Fabric, but by SaltStack
• Own solution for monitoring and metering/billing
Contrail/OpenStack Deployment
Not Fabric – Why?
• Fabric is great for a showcase or standalone deployment
• Fabric is not suitable for deploying OpenStack, because each deployment is different (cinder backend, glance backend)
• Fabric does not deploy HA OpenStack for now
• OpenStack config files need to be extended for other items
• Each company prefers a different configuration management tool (puppet, salt, chef)
• Fabric HA deployment of Contrail is not HA
• Next releases are better
Recommendations
• There should be a guide for integrating Contrail with an existing OpenStack, and maybe steps for manual configuration without Fabric
Deployment issues with 1.05
Fabric HA is not HA
• Other config nodes refer to the IP address of the first node (e.g. discovery service)
• There is no VIP address in HAProxy
Bug in nova.conf
• service_down_time = 100000
Bad configuration for nova live migration
• nova.conf and libvirtd.conf must include parameters for live migration
No options for cinder and glance backend
• Config optimization and options inside of Fabric are missing
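The nova.conf item above can be checked mechanically after a deployment. The following is a minimal sketch, assuming the option sits in the [DEFAULT] section; the ceiling of 600 seconds and the function name are hypothetical and chosen only for illustration, not taken from the slides.

```python
import configparser

# Hypothetical ceiling, for illustration only; choose a value that fits your environment.
MAX_SERVICE_DOWN_TIME = 600


def check_service_down_time(conf_text, ceiling=MAX_SERVICE_DOWN_TIME):
    """Return (value, ok) for service_down_time in the [DEFAULT] section."""
    # interpolation=None so that '%' characters in other options are left alone
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(conf_text)
    raw = parser.defaults().get("service_down_time")
    if raw is None:
        # Option not set in the file; nothing to flag here
        return None, True
    value = int(raw)
    return value, value <= ceiling


# Sample text reproducing the value reported on the slide
sample = """
[DEFAULT]
service_down_time = 100000
"""

value, ok = check_service_down_time(sample)
print(value, ok)
```

In practice the text would be read from the deployed nova.conf, and a check like this could run as part of whatever configuration management is in use.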
OpenStack modules – TCP VPC
OpenStack HA
[Diagram: TCP VPC. Labels recovered from the figure: OpenStack Controller (MySQL, RabbitMQ), Galera, HAProxy, VIP, Bond Interface, Pacemaker/Corosync, Contrail Database (ZooKeeper, Cassandra), Contrail Config with Analytics & WebUI, Contrail Control]
OpenStack HA Architecture
Contrail screens from environment
HA issues
Restoring BGP peering after failure of one of the control nodes
• Sometimes a control node rejoining after a failure desynchronizes BGP peering, and all floating IPs are down for 20-60 sec.
• Solved by deploying three config nodes and two control nodes
Configuration files
• Problems can be solved by better configuration.
• Native HA in Contrail release 1.10
General issues
Cassandra problem
• Flow records grow very quickly. Set in /etc/contrail/vizd_param: ANALYTICS_DATA_TTL=8
MTU configuration at MX and QFabric
• MTU size for North-South communication is set to 1550 on QFabric and 1554 on MX (in the end we had to increase the MTU to 9000)
Bug in instance snapshot
• qemu-img convert has bad parameters on CentOS
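For the Cassandra item above, the current TTL can be read back from the parameter file to confirm the setting took effect. This is a minimal sketch, assuming the file uses plain KEY=VALUE lines as the slide suggests; the helper name read_params is hypothetical, and the unit of the TTL is not stated in the slides.

```python
def read_params(text):
    """Parse simple KEY=VALUE lines, skipping blanks and '#' comments."""
    params = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        params[key.strip()] = value.strip()
    return params


# Sample content mirroring the setting shown on the slide
sample = "ANALYTICS_DATA_TTL=8\n"

ttl = int(read_params(sample)["ANALYTICS_DATA_TTL"])
print(ttl)
```

On a real node the text would come from /etc/contrail/vizd_param instead of the inline sample.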
General issues
Nova floating IP attaching
• nova floating-ip-associate does not work – causes problems with existing PaaS tools
Details for almost all issues are on the user mailing list
• All issues were discussed on the list by me and most of them are successfully solved.
Pre-production 1.10 release
• Migration from CentOS to Ubuntu
• Full HA – 3 nodes together with OpenStack
Thank you for your attention!
Questions?