SCALING HUMANS: BIGPANDA'S FABULOUS CHATOPS ADVENTURE BY ERIK ZAADI | DEC 13, 2016

Scaling Humans - BigPanda's Fabulous ChatOps Adventure - Erik Zaadi, BigPanda - DevOpsDays Tel Aviv 2016



TEAM LEADER AND COMMUNITY GUY ERIKZAADI

• Continuous Delivery
• Deploy by ssh to deploy server (AWS)
• Ran Ansible on our playbooks repository

A TALE OF DEPLOYMENT EVOLUTION

FLASHBACK TO EARLY 2014

8 deployers
5 Microservices
15 Servers

• Synchronization: lack of visibility
• Unreliable deploys
• Disconnections
• Debugging
• Ansible was not common knowledge

WE SIMPLY WANTED TO DEPLOY

PROBLEMS:

Bash wrapper for Ansible

NEXT STEP: LET'S MAKE IT EASY

ENTER BASH - DEPLOY.SH

DEPLOY.SH IS

COMING

Solved:
• No Ansible knowledge needed
Problems:
• Synchronization
• Unreliable deploys

DEPLOY.SH

• Wrapped SSH login with TMUX
• Used local users with symlinked playbooks repository
• Enabled mosh

NEXT STEP: LET'S STAY CONNECTED

TMUX TO THE RESCUE

TMUX IS

COMING

Solved:
• Unreliable deploys
• Disconnections (session)
• Debugging
Problems:
• Synchronization

TMUX AND LOCAL USERS

• HipChat bot using Hubot
• Written in CoffeeScript
• Called Beanbot
• Limited to staging servers
• Used Jenkins to run Ansible
• Could query past deploys

NEXT STEP: CHATOPS TO THE RESCUE!

HELLO THERE HUBOT!

Problems:
• Synchronization
• Lack of HipChat usage
• Almost no visibility
• Didn’t solve the production pain
• Only 3 contributors
• People really hated CoffeeScript

BEANBOT

AND WE CONTINUED GROWING

THIS WAS MID 2015

15 deployers
15 Microservices
40 Servers

Slack channel: #deployment-locks
We announced locks manually in the channel. It quickly became unusable.

Stupid Humans.

NEXT STEP: LET'S SOLVE IT IN CHAT

SO, CHAT-NOT-OPS?

• Hackathon rewrite from scratch
• Robin (Hubot)
• Written in JavaScript
• Using microservices standards
• Tests
• Deploys

NEXT STEP: CHATOPS TO THE RESCUE AGAIN!

FIX THE STUPID HUMANS

• Managed deployment locks
• Deployed to production (!)
• Prevented Human Errors
• Similar syntax to bash script
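The lock management above could look roughly like the sketch below. It is an illustration only, not Robin's actual code: the `LockManager` class, its method names, and the rule that only the lock owner may deploy or unlock are assumptions made for the example.

```javascript
// Hypothetical in-memory deployment lock manager (not Robin's real code).
class LockManager {
  constructor() {
    // service name -> { owner, since }
    this.locks = new Map();
  }

  // Try to take the lock; report who holds it if it is already taken.
  lock(service, user, now = new Date()) {
    const held = this.locks.get(service);
    if (held && held.owner !== user) {
      return { ok: false, message: `${service} is locked by ${held.owner}` };
    }
    if (!held) this.locks.set(service, { owner: user, since: now });
    return { ok: true, message: `${service} locked by ${user}` };
  }

  // Assumption: only the lock owner may release the lock.
  unlock(service, user) {
    const held = this.locks.get(service);
    if (!held) return { ok: false, message: `${service} is not locked` };
    if (held.owner !== user) {
      return { ok: false, message: `${service} is locked by ${held.owner}` };
    }
    this.locks.delete(service);
    return { ok: true, message: `${service} unlocked` };
  }

  // A deploy is allowed when nobody else holds the lock.
  canDeploy(service, user) {
    const held = this.locks.get(service);
    return !held || held.owner === user;
  }
}
```

Keeping the lock state in the bot rather than in a chat channel means the deploy command itself can check `canDeploy` and refuse, instead of relying on people reading the channel first.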

ZOMG TIS’S DA THINGZ

THE HYPE TRAIN TOOK OFF

• Managed using GitHub issues
• 10 contributors (~1/2 of R&D)
• JavaScript vs CoffeeScript
• Slack vs HipChat
• Hubot Scripts
• Use anything with an API

INSTANT POPULARITY GAINED

WHY ROBIN WORKED

• Integrated part of our dev cycle
30 Deployers
30 Microservices
160 Servers
~100 daily deploys and locks

TODAY

WE CAN’T LIVE WITHOUT ROBIN

EXAMPLES - KEEP IT SIMPLE AND SAVE HUMAN STEPS

EXAMPLES - MINIMIZE HUMAN ERRORS AND ADD CONFIRMATION
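One way to add a confirmation step is to park risky actions until the user explicitly agrees. The sketch below is a guess at the general shape, not the bot's real behavior: the `ConfirmationGate` class, the "yes" keyword, and the rule that only production needs confirmation are all assumptions.

```javascript
// Hypothetical confirmation gate: production deploys need an explicit "yes".
class ConfirmationGate {
  constructor() {
    this.pending = new Map(); // user -> action awaiting confirmation
  }

  // Non-production actions run immediately; production is parked.
  request(user, action) {
    if (action.env !== "production") return { run: action, reply: null };
    this.pending.set(user, action);
    return {
      run: null,
      reply: `Deploy ${action.service} to production? Reply "yes" to confirm.`,
    };
  }

  // Any reply other than "yes" cancels the pending action.
  answer(user, text) {
    const action = this.pending.get(user);
    if (!action) return { run: null, reply: null };
    this.pending.delete(user);
    if (text.trim().toLowerCase() === "yes") {
      return { run: action, reply: "Confirmed." };
    }
    return { run: null, reply: "Cancelled." };
  }
}
```

The pending action is keyed by user, so one person cannot confirm a deploy that somebody else requested.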

EXAMPLES - MINIMIZE HUMAN ERRORS

EXAMPLES - PROVIDE USEFUL FEEDBACK

EXAMPLES - BE QUERY-ABLE
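Being query-able mostly means recording what the bot did so people can ask about it later. A minimal sketch, assuming an in-memory list (a real bot would persist this somewhere); the `DeployHistory` class and its fields are hypothetical.

```javascript
// Hypothetical in-memory deploy history (a real bot would persist this).
class DeployHistory {
  constructor() {
    this.entries = [];
  }

  // Append one deploy record.
  record(service, env, user, at = new Date()) {
    this.entries.push({ service, env, user, at });
  }

  // Most recent deploys first, optionally filtered by service.
  last(count, service) {
    return this.entries
      .filter((entry) => !service || entry.service === service)
      .reverse()
      .slice(0, count);
  }
}
```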

EXAMPLES - ENSURE NAMESPACING OF CHATOPS COMMANDS
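Namespacing can be as simple as requiring every command to start with the name of the area it belongs to, so that `deploy status` and `lock status` never collide. The router below is a hypothetical sketch; the `CommandRouter` class and the "<namespace> <verb> <args>" shape are assumptions, not the bot's actual syntax.

```javascript
// Hypothetical namespaced command router: "<namespace> <verb> <args...>".
class CommandRouter {
  constructor() {
    this.handlers = new Map(); // "namespace verb" -> handler(args)
  }

  register(namespace, verb, handler) {
    this.handlers.set(`${namespace} ${verb}`, handler);
  }

  // Split the message on whitespace and look up the namespaced handler.
  dispatch(text) {
    const [namespace = "", verb = "", ...args] = text.trim().split(/\s+/);
    const handler = this.handlers.get(`${namespace} ${verb}`);
    if (!handler) return `Unknown command: ${text.trim()}`;
    return handler(args);
  }
}
```

With this shape, two teams can both own a `status` verb without stepping on each other, as long as their namespaces differ.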

EXAMPLES - UNIT TEST ALL THE THINGS! CONVERSATION REGEX
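Conversation regexes are easy to test when the pattern and the parsing live in a plain function that needs no chat connection. The command syntax, `DEPLOY_PATTERN`, and `parseDeploy` below are made up for illustration; only the idea of table-driven checks on the regex is the point.

```javascript
// Hypothetical command pattern: "deploy <service> to <env>".
const DEPLOY_PATTERN = /^deploy\s+([a-z0-9-]+)\s+to\s+(staging|production)$/i;

// Returns the parsed command, or null when the message does not match.
function parseDeploy(text) {
  const match = DEPLOY_PATTERN.exec(text.trim());
  if (!match) return null;
  return { service: match[1].toLowerCase(), env: match[2].toLowerCase() };
}

// Table of inputs and expected results, checked without a chat connection.
const cases = [
  ["deploy billing to staging", { service: "billing", env: "staging" }],
  ["  Deploy Billing to PRODUCTION ", { service: "billing", env: "production" }],
  ["deploy billing", null],
  ["deploy billing to moon", null],
];

// Collect every case whose parsed result differs from the expectation.
const failures = cases.filter(
  ([input, expected]) =>
    JSON.stringify(parseDeploy(input)) !== JSON.stringify(expected)
);

if (failures.length > 0) {
  throw new Error(`Regex cases failed: ${failures.map(([input]) => input).join("; ")}`);
}
```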


EXAMPLES - SOLVE NON VISIBLE OPERATIONS TASKS

EXAMPLES - DON’T SPAM PUBLIC CHANNELS TOO MUCH

EXAMPLES - HELP WITH DAY TO DAY TASKS

EXAMPLES - MAKE THE USAGE OF THE BOT FUN

• Ensure deploy can work without ChatOps as well
• Reuse existing deploy process and syntax
• Run the bot both locally and on a staging env
• Make contributing to the bot fun and easy
• Company bot skeletons (allowing quick POCs)
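The first two lessons suggest keeping one deploy core that both the chat bot and a terminal can call with the same syntax. A rough sketch, assuming a `./deploy.sh <service> <env>` argument order that the slides do not actually specify; all function names here are hypothetical.

```javascript
// Hypothetical shared core: both entry points produce the same argv.
// Assumption: deploy.sh takes "<service> <env>"; the real syntax may differ.
function buildDeployArgs(service, env) {
  if (typeof service !== "string" || !/^[a-z0-9-]+$/.test(service)) {
    throw new Error(`Bad service name: ${service}`);
  }
  if (!["staging", "production"].includes(env)) {
    throw new Error(`Bad environment: ${env}`);
  }
  return ["./deploy.sh", service, env];
}

// Chat entry point: "deploy <service> <env>" typed in a channel.
function fromChat(message) {
  const [verb, service, env] = message.trim().split(/\s+/);
  if (verb !== "deploy") throw new Error(`Not a deploy command: ${message}`);
  return buildDeployArgs(service, env);
}

// Terminal entry point: arguments as passed on the command line.
function fromCli(argv) {
  const [service, env] = argv;
  return buildDeployArgs(service, env);
}
```

Because the bot only translates a chat message into the same arguments a person would type, the deploy still works from a terminal when the bot is down.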

THINGS WE LEARNED THE HARD WAY

YOUR BOT WILL BE DOWN

• SuperPanda - Groceries wish list
• Gordon - Sales and Support
• Bellboy - Opens our office door
• Splinter - Code linting rewarder

DIVIDED THEY CONQUER

MOAARR BOTS!!!1

EOF