How To Develop Innovative, Scalable Systems?
Chat System
How hard can it be?
• Authentication
• Slowmode
• Moderator
• Admins
• Subscribers
• Timeout
• Ban
• Limits per user
• Notify messages
• Imagelog
• Unlimited Rooms
Not that easy!
• IP-Ban
• Raffle
• Voting
• Whisper
• Blacklist
• Block user
• Posting Images
• DDoS Protection
• Limits per channel
• Chatlog
• System messages
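Even a single item from these lists, such as slowmode, needs per-user state. A minimal sketch, assuming slowmode means "one message per user per interval"; the class name `Slowmode` and its method `tryPost` are hypothetical and not taken from the hitbox code base:

```javascript
// Hypothetical slowmode check: each user may post at most once per interval.
class Slowmode {
  constructor(intervalMs) {
    this.intervalMs = intervalMs;
    this.lastPost = new Map(); // userId -> timestamp of last accepted message
  }

  // Returns true and records the post if the user is allowed to send now.
  tryPost(userId, now = Date.now()) {
    const last = this.lastPost.get(userId);
    if (last !== undefined && now - last < this.intervalMs) {
      return false; // still inside the slowmode window
    }
    this.lastPost.set(userId, now);
    return true;
  }
}

const demoSlowmode = new Slowmode(5000); // 5 s between messages (assumed value)
console.log(demoSlowmode.tryPost('alice', 0));    // first message is accepted
console.log(demoSlowmode.tryPost('alice', 2000)); // too early, rejected
console.log(demoSlowmode.tryPost('alice', 5000)); // window has passed, accepted
```

In a multi-server setup the map would have to live in shared storage rather than in process memory.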
Realtime
No IRC!
WebSocket
Web Server
UI
Let’s try this
PHP, MySQL
Auto scaling
Less than 2k
Back to the drawing board!
WebSocket Load balancing
Permission and security model (Admin, Mods, ...)
Frontend Server
Backend Server
UI
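A permission model with admins and mods can be sketched as a ranked role table. This is an illustration only: the role names beyond admin and moderator, the rank order, and the action names are assumptions, not the model hitbox actually used.

```javascript
// Hypothetical role hierarchy: a higher rank includes all lower permissions.
const ROLE_RANK = new Map([
  ['user', 0],
  ['subscriber', 1],
  ['moderator', 2],
  ['admin', 3],
]);

// Hypothetical mapping from chat action to the minimum role it requires.
const REQUIRED_ROLE = new Map([
  ['post', 'user'],
  ['timeout', 'moderator'],
  ['ban', 'moderator'],
  ['ipBan', 'admin'],
]);

// Returns true if `role` is allowed to perform `action`.
// Unknown roles and unknown actions are always denied.
function can(role, action) {
  const required = REQUIRED_ROLE.get(action);
  if (required === undefined || !ROLE_RANK.has(role)) {
    return false;
  }
  return ROLE_RANK.get(role) >= ROLE_RANK.get(required);
}

console.log(can('moderator', 'timeout'));
console.log(can('subscriber', 'ban'));
```

Denying by default keeps a typo in an action name from silently granting access.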
Ok, so let’s try this!
Frontend Server: Node.js, auto scaling
Backend Server: Node.js, auto scaling
Data storage: Redis Cluster
hitbox REST API: PHP, Nginx
Average roundtrip / message: < 300ms
Frontend Server
• Small, cheap machines
• Handle the connections, no logic
• When one breaks, it breaks only for a few users
• Automatic failover to another chat frontend server
• Socket.io for handling WebSockets
• Carrier for sending messages between frontend and backend
• Up- and downscaling possible as needed
Backend Server
• Small, cheap machines
• Handles all the logic
• Stateless, can be restarted or upgraded at any time
• Easily extended with new features
• Up- and downscaling possible as needed
• Load balancing via round robin
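Because the backend servers are stateless, any of them can take the next message, which is what makes round robin sufficient. A minimal sketch; the class `RoundRobin` and the host names are hypothetical:

```javascript
// Hypothetical round-robin picker over a fixed list of backend servers.
class RoundRobin {
  constructor(servers) {
    if (servers.length === 0) {
      throw new Error('at least one backend server is required');
    }
    this.servers = [...servers];
    this.nextIndex = 0;
  }

  // Returns the next server in turn, wrapping around at the end of the list.
  pick() {
    const server = this.servers[this.nextIndex];
    this.nextIndex = (this.nextIndex + 1) % this.servers.length;
    return server;
  }
}

const backends = new RoundRobin(['backend-1', 'backend-2', 'backend-3']);
for (let i = 0; i < 4; i += 1) {
  console.log(backends.pick()); // the fourth pick wraps back to backend-1
}
```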
Redis
• Fast
• I mean, REALLY fast!
• You can cluster it
• Easy to back up
No single point of failure
Websockets...
WebSocket Load balancing
Permission and security model (Admin, Mods, ...)
Frontend Server
Backend Server
UI
Ok, let’s fix Websockets
Frontend Server: Node.js, auto scaling
Fallback Server: Node.js, long polling fallback
Backend Server: Node.js, auto scaling
Data storage: Redis Cluster
hitbox REST API: PHP, Nginx
Script Kiddies
Validate everything!
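Validating everything means treating each incoming payload as hostile until checked. A minimal sketch of such a check for a chat message; the field names `channel` and `text`, the channel pattern, and the 300-character limit are assumptions for illustration:

```javascript
const MAX_TEXT_LENGTH = 300; // assumed limit, not from the original system

// Checks an untrusted payload and returns either a cleaned message or an error.
function validateMessage(raw) {
  if (raw === null || typeof raw !== 'object') {
    return { ok: false, error: 'payload must be an object' };
  }
  const { channel, text } = raw;
  if (typeof channel !== 'string' || !/^[a-z0-9_]{1,32}$/i.test(channel)) {
    return { ok: false, error: 'invalid channel' };
  }
  if (typeof text !== 'string') {
    return { ok: false, error: 'text must be a string' };
  }
  const trimmed = text.trim();
  if (trimmed.length === 0 || trimmed.length > MAX_TEXT_LENGTH) {
    return { ok: false, error: 'text length out of range' };
  }
  // Only the whitelisted fields are passed on; anything else is dropped.
  return { ok: true, message: { channel, text: trimmed } };
}

console.log(validateMessage({ channel: 'lobby', text: ' hello ', admin: true }));
console.log(validateMessage({ channel: '../etc', text: 'hi' }));
```

Rebuilding the message from known fields, rather than forwarding the raw object, keeps injected properties from reaching the backend.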
Load Balancing
• Frontend servers report their CPU load every 10 seconds
• The X least-loaded frontend servers are sent to the UI
• UI selects a frontend server randomly from these X (five in this case)
• If the UI gets disconnected, it removes that server from its list
• UI tries another frontend server
• If no servers are left, the UI gets X new frontend servers from the API
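The selection scheme described in these bullets can be sketched in two parts: the API side picks the least-loaded servers, and the UI side works through its candidate list. The names `leastLoaded`, `ServerPool`, and `fetchServers` are hypothetical, and the API call is modelled as a synchronous function to keep the example short.

```javascript
// API side: return the hosts of the `count` servers with the lowest CPU load.
function leastLoaded(reports, count) {
  return [...reports]
    .sort((a, b) => a.cpu - b.cpu)
    .slice(0, count)
    .map((report) => report.host);
}

// UI side: holds the candidate list and refills it when it runs empty.
class ServerPool {
  constructor(fetchServers, random = Math.random) {
    this.fetchServers = fetchServers; // stands in for the real API request
    this.random = random;
    this.candidates = [];
  }

  // Picks a random candidate, asking the API for a new list if none are left.
  next() {
    if (this.candidates.length === 0) {
      this.candidates = [...this.fetchServers()];
    }
    if (this.candidates.length === 0) {
      return null; // the API had nothing to offer
    }
    const index = Math.floor(this.random() * this.candidates.length);
    return this.candidates[index];
  }

  // Called when the connection to `host` drops.
  remove(host) {
    this.candidates = this.candidates.filter((candidate) => candidate !== host);
  }
}

const loadReports = [
  { host: 'fe-1', cpu: 0.82 },
  { host: 'fe-2', cpu: 0.15 },
  { host: 'fe-3', cpu: 0.4 },
];
const demoPool = new ServerPool(() => leastLoaded(loadReports, 2));
const chosen = demoPool.next();
console.log('connecting to', chosen);
demoPool.remove(chosen); // simulate a disconnect
console.log('retrying with', demoPool.next());
```

Picking randomly among several lightly loaded servers, instead of always taking the single best one, avoids sending every new client to the same machine between two load reports.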
2000 messages/second
Async.js for the win!
„Self“ DDoS
Cache early, Cache often!
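One common way to avoid hammering your own API is a small time-to-live cache in front of it. The slides do not say what was cached, so this is a generic sketch; `TtlCache`, the key format, and the 10-second TTL are assumptions:

```javascript
// Hypothetical in-memory cache: values are reloaded once they are older than ttlMs.
class TtlCache {
  constructor(ttlMs) {
    this.ttlMs = ttlMs;
    this.entries = new Map(); // key -> { value, expires }
  }

  // Returns the cached value, or calls `load(key)` and caches its result.
  get(key, load, now = Date.now()) {
    const hit = this.entries.get(key);
    if (hit !== undefined && now < hit.expires) {
      return hit.value;
    }
    const value = load(key);
    this.entries.set(key, { value, expires: now + this.ttlMs });
    return value;
  }
}

let apiCalls = 0;
const lookupRole = (key) => {
  apiCalls += 1; // stands in for an expensive REST API request
  return `role-of-${key}`;
};

const roleCache = new TtlCache(10000);
roleCache.get('alice', lookupRole, 0);
roleCache.get('alice', lookupRole, 5000); // served from the cache
console.log('API calls made:', apiCalls);
```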
Stupid Software Design
Sometimes Realtime is bad!
Monitor Everything!
Etsy's StatsD
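StatsD accepts plain-text metrics over UDP in the form `name:value|type`, optionally followed by a sample rate. A minimal sketch using only Node's built-in `dgram` module; the metric names, the helper names `formatMetric` and `sendMetric`, and the local host address are assumptions, and 8125 is the StatsD default port:

```javascript
const dgram = require('dgram');

// Builds one StatsD line, e.g. a counter ("c"), gauge ("g"), or timer ("ms").
function formatMetric(name, value, type, sampleRate) {
  let line = `${name}:${value}|${type}`;
  if (sampleRate !== undefined && sampleRate < 1) {
    line += `|@${sampleRate}`;
  }
  return line;
}

// Fire-and-forget send; UDP means a dead StatsD daemon never blocks the chat.
function sendMetric(line, host = '127.0.0.1', port = 8125) {
  const socket = dgram.createSocket('udp4');
  socket.send(Buffer.from(line), port, host, () => socket.close());
}

// Hypothetical metric names for a chat backend.
console.log(formatMetric('chat.messages', 1, 'c'));
console.log(formatMetric('chat.roundtrip', 280, 'ms'));
console.log(formatMetric('chat.messages', 1, 'c', 0.1));
// sendMetric(formatMetric('chat.messages', 1, 'c')); // needs a StatsD daemon to be useful
```

A long-running server would normally reuse one socket instead of opening one per metric.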
So, is it working?
Thank you!
[email protected]
We are hiring!
jobs.hitbox.tv