High Performance Systems in Go - GopherCon 2014

DESCRIPTION

A short presentation on techniques used to improve the performance of the NATS messaging system as it was rewritten in Go.

High Performance Systems in Go

Derek Collison April 24, 2014 GopherCon

About

• Architected/Built TIBCO Rendezvous and EMS Messaging Systems

• Designed and Built CloudFoundry at VMware

• Co-founded AJAX APIs group at Google

• Distributed Systems

• Founder of Apcera, Inc. in San Francisco, CA

• @derekcollison

• derek@apcera.com

Derek Collison

Why Go?

• Simple Compiled Language

• Good Standard Library

• Concurrency

• Synchronous Programming Model

• Garbage Collection

• STACKS!

Why Go?

• Not C/C++

• Not Java (or any JVM based language)

• Not Ruby/Python/Node.js

What about High Performance?

NATS

NATS Messaging 101

• Subject-Based

• Publish-Subscribe

• Distributed Queueing

• TCP/IP Overlay

• Clustered Servers

• Multiple Clients (Go, Node.js, Java, Ruby)

NATS

• Originally written to support CloudFoundry

• In use by CloudFoundry, Baidu, Apcera and others

• Written first in Ruby -> 150k msgs/sec

• Rewritten at Apcera in Go (Client and Server)

• First pass -> 500k msgs/sec

• Current Performance -> 5-6m msgs/sec

Tuning NATS (gnatsd)

or how to get from 500k to 6m

Target Areas

• Shuffling Data

• Protocol Parsing

• Subject/Routing

Target Areas

• Shuffling Data

• Protocol Parsing!

• Subject/Routing

Protocol Parsing

• NATS is a text based protocol

• PUB foo.bar 2\r\nok\r\n

• SUB foo.> 2\r\n

• Ruby version based on RegEx

• First Go version was port of RegEx

• Current is zero allocation byte parser
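To illustrate what a zero allocation byte parser can look like, here is a minimal sketch of a hand-written state machine for the `PUB foo.bar 2\r\nok\r\n` frame shown above. It is not the gnatsd parser: the names `parsePub` and `pubMsg` and the state constants are hypothetical, and only this one frame shape is handled. The point is that the subject and payload are returned as sub-slices of the input buffer, so nothing is copied or converted to a string.

```go
package main

import "fmt"

// pubMsg holds slices that alias the caller's buffer; nothing is copied.
// (Hypothetical type for this sketch.)
type pubMsg struct {
	subject []byte
	payload []byte
}

// Parser states for the single command handled in this sketch.
const (
	opStart = iota
	opSubject
	opSize
	opSizeLF
)

// parsePub parses one "PUB <subject> <size>\r\n<payload>\r\n" frame from buf.
// It returns the message and the number of bytes consumed, or ok=false if the
// input is malformed or incomplete.
func parsePub(buf []byte) (m pubMsg, n int, ok bool) {
	state := opStart
	start, size, digits := 0, 0, 0
	for i := 0; i < len(buf); i++ {
		b := buf[i]
		switch state {
		case opStart:
			// Expect the literal "PUB " prefix.
			if len(buf)-i < 4 || b != 'P' || buf[i+1] != 'U' || buf[i+2] != 'B' || buf[i+3] != ' ' {
				return m, 0, false
			}
			i += 3
			start = i + 1
			state = opSubject
		case opSubject:
			if b == ' ' {
				if i == start {
					return m, 0, false // empty subject
				}
				m.subject = buf[start:i]
				state = opSize
			}
		case opSize:
			switch {
			case b >= '0' && b <= '9':
				// ASCII to int without strconv or a string conversion.
				size = size*10 + int(b-'0')
				digits++
				if size > len(buf) {
					return m, 0, false // payload cannot fit in buf
				}
			case b == '\r' && digits > 0:
				state = opSizeLF
			default:
				return m, 0, false
			}
		case opSizeLF:
			if b != '\n' {
				return m, 0, false
			}
			end := i + 1 + size
			if end+2 > len(buf) || buf[end] != '\r' || buf[end+1] != '\n' {
				return m, 0, false
			}
			m.payload = buf[i+1 : end]
			return m, end + 2, true
		}
	}
	return m, 0, false
}

func main() {
	m, n, ok := parsePub([]byte("PUB foo.bar 2\r\nok\r\n"))
	fmt.Printf("ok=%v consumed=%d subject=%q payload=%q\n", ok, n, m.subject, m.payload)
}
```

Because the returned slices alias the read buffer, a caller following this approach would need to finish with (or copy) them before the buffer is reused.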

Some Tidbits

• Early on, defer was costly

• Text based protocol needs conversion from ASCII to int

• This was also slow due to allocations in strconv.ParseInt

defer
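The comparison on this slide can be approximated with a small benchmark like the one below. This is a sketch, not the code from the talk: the `counter` type and its two methods are hypothetical, and the numbers depend heavily on the Go version (later toolchains reduced the cost of `defer` considerably, so the gap may be small today).

```go
package main

import (
	"fmt"
	"sync"
	"testing"
)

// counter is a hypothetical type used only to compare the two unlock styles.
type counter struct {
	mu sync.Mutex
	n  int
}

// incDefer releases the lock with defer.
func (c *counter) incDefer() {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.n++
}

// incDirect releases the lock with an explicit call.
func (c *counter) incDirect() {
	c.mu.Lock()
	c.n++
	c.mu.Unlock()
}

func main() {
	var c counter
	withDefer := testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			c.incDefer()
		}
	})
	direct := testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			c.incDirect()
		}
	})
	fmt.Println("defer: ", withDefer)
	fmt.Println("direct:", direct)
}
```

Both methods do the same work, so any measured difference is the overhead of `defer` itself on the toolchain in use.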

defer Results

Go 1.3 looks promising

parseSize
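A minimal sketch of such a function follows, assuming sizes are non-negative decimal ASCII digits. It works directly on the `[]byte` from the protocol buffer, so no string is created, whereas `strconv.ParseInt` takes a `string` argument. The 9-digit cap is an assumption added here to rule out integer overflow; it is not taken from the talk.

```go
package main

import (
	"fmt"
	"strconv"
)

// parseSize converts ASCII decimal digits to an int without allocating.
// It returns -1 for empty input, non-digit bytes, or more than 9 digits
// (the cap is an assumption of this sketch to avoid overflow).
func parseSize(d []byte) int {
	if len(d) == 0 || len(d) > 9 {
		return -1
	}
	n := 0
	for _, c := range d {
		if c < '0' || c > '9' {
			return -1
		}
		n = n*10 + int(c-'0')
	}
	return n
}

func main() {
	raw := []byte("1024")

	// Hand-rolled version: operates on the bytes in place.
	fmt.Println(parseSize(raw))

	// Standard library version: needs a string, so the []byte is converted first.
	v, err := strconv.ParseInt(string(raw), 10, 64)
	fmt.Println(v, err)
}
```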

parseSize vs strconv.ParseInt

Target Areas

• Shuffling Data

• Protocol Parsing

• Subject/Routing

Subject Router

• Matches subjects to subscribers

• Utilizes a trie of nodes and hashmaps

• Has a frontend dynamic eviction cache

• Uses []byte as keys (Go’s builtin does not)
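The trie idea can be sketched as follows: each subject is split on `.` into tokens, and each trie level maps a token to a child node. This is an illustration only. For brevity it uses Go's built-in map with string keys, which is exactly what the real router avoids, and it omits the frontend cache. The names `router`, `node`, `subscribe`, and `match` are hypothetical. It handles the `>` wildcard from the `SUB foo.>` example, plus the single-token `*` wildcard that NATS also supports but the slides do not show.

```go
package main

import (
	"fmt"
	"strings"
)

// node is one level of the trie, keyed by subject token.
type node struct {
	children map[string]*node
	subs     []string // subscribers whose pattern ends at this node
}

func newNode() *node { return &node{children: make(map[string]*node)} }

// router maps subject patterns to subscriber names (hypothetical type).
type router struct{ root *node }

func newRouter() *router { return &router{root: newNode()} }

// subscribe registers sub under pattern, e.g. "foo.bar", "foo.*" or "foo.>".
func (r *router) subscribe(pattern, sub string) {
	n := r.root
	for _, tok := range strings.Split(pattern, ".") {
		child, ok := n.children[tok]
		if !ok {
			child = newNode()
			n.children[tok] = child
		}
		n = child
	}
	n.subs = append(n.subs, sub)
}

// match returns every subscriber whose pattern matches the literal subject.
func (r *router) match(subject string) []string {
	var out []string
	collect(r.root, strings.Split(subject, "."), &out)
	return out
}

func collect(n *node, toks []string, out *[]string) {
	if len(toks) == 0 {
		*out = append(*out, n.subs...)
		return
	}
	// ">" matches one or more remaining tokens.
	if c, ok := n.children[">"]; ok {
		*out = append(*out, c.subs...)
	}
	// "*" matches exactly one token.
	if c, ok := n.children["*"]; ok {
		collect(c, toks[1:], out)
	}
	// Literal token match.
	if c, ok := n.children[toks[0]]; ok {
		collect(c, toks[1:], out)
	}
}

func main() {
	r := newRouter()
	r.subscribe("foo.>", "a")
	r.subscribe("foo.bar", "b")
	r.subscribe("foo.*", "c")
	fmt.Println(r.match("foo.bar"))
	fmt.Println(r.match("foo.bar.baz"))
}
```

Lookup cost is proportional to the number of tokens in the subject rather than the number of subscriptions, which is the main reason to use a trie here.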

Subject Router

• Tried to avoid []byte -> string conversions

• Go’s builtin hashmap was slow pre 1.0

• Built using hashing algorithms on []byte

• Built on hashmaps with []byte keys
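As a rough illustration of a hashmap keyed directly by `[]byte`, the sketch below hashes the bytes and compares keys with `bytes.Equal`, so a lookup never converts the key to a string. It uses FNV-1a purely as a simple stand-in hash, not as the algorithm the talk settled on, and the `byteMap` type with its fixed bucket count and no resizing is a hypothetical simplification.

```go
package main

import (
	"bytes"
	"fmt"
)

const (
	fnvOffset32 = 2166136261
	fnvPrime32  = 16777619
	numBuckets  = 64 // power of two so the hash can be masked; fixed for this sketch
)

// fnv1a computes the 32-bit FNV-1a hash of b (stand-in hash function).
func fnv1a(b []byte) uint32 {
	h := uint32(fnvOffset32)
	for _, c := range b {
		h ^= uint32(c)
		h *= fnvPrime32
	}
	return h
}

// entry is one key/value pair in a bucket's chain.
type entry struct {
	key  []byte
	val  int
	next *entry
}

// byteMap is a minimal chained hashmap with []byte keys (hypothetical type).
type byteMap struct {
	buckets [numBuckets]*entry
}

// set stores val under key. The key is copied once on first insert because
// the caller's buffer may be reused.
func (m *byteMap) set(key []byte, val int) {
	i := fnv1a(key) & (numBuckets - 1)
	for e := m.buckets[i]; e != nil; e = e.next {
		if bytes.Equal(e.key, key) {
			e.val = val
			return
		}
	}
	k := append([]byte(nil), key...)
	m.buckets[i] = &entry{key: k, val: val, next: m.buckets[i]}
}

// get looks up key without allocating or converting it to a string.
func (m *byteMap) get(key []byte) (int, bool) {
	for e := m.buckets[fnv1a(key)&(numBuckets-1)]; e != nil; e = e.next {
		if bytes.Equal(e.key, key) {
			return e.val, true
		}
	}
	return 0, false
}

func main() {
	var m byteMap
	m.set([]byte("foo.bar"), 1)
	v, ok := m.get([]byte("foo.bar"))
	fmt.Println(v, ok)
}
```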

Hashing Algorithms

Jesteress

HashMap Comparisons

Some Lessons Learned

• Use go tool pprof (linux)

• Avoid short lived objects on the heap

• Use the stack or make long lived objects

• Benchmark standard library builtins (strconv)

• Benchmark builtins (defer, hashmap)

• Don’t use channels in performance critical path
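The advice about short-lived heap objects can be checked with `testing.AllocsPerRun`. The sketch below builds the `PUB foo.bar 2\r\n` control line two ways: appending into a reused scratch buffer, and formatting a fresh value with `fmt.Sprintf`. The function names `appendPub` and `sprintfPub` are hypothetical, and the exact allocation counts depend on the Go version, so none are stated here.

```go
package main

import (
	"fmt"
	"strconv"
	"testing"
)

// appendPub appends "PUB <subject> <size>\r\n" to dst and returns the
// extended slice, so the caller can reuse one long-lived buffer.
func appendPub(dst, subject []byte, size int) []byte {
	dst = append(dst, "PUB "...)
	dst = append(dst, subject...)
	dst = append(dst, ' ')
	dst = strconv.AppendInt(dst, int64(size), 10)
	return append(dst, '\r', '\n')
}

// sprintfPub builds the same line with fmt.Sprintf, creating a new
// short-lived value on every call.
func sprintfPub(subject []byte, size int) []byte {
	return []byte(fmt.Sprintf("PUB %s %d\r\n", subject, size))
}

func main() {
	subject := []byte("foo.bar")
	scratch := make([]byte, 0, 64) // long-lived buffer, reused on every call

	reused := testing.AllocsPerRun(1000, func() {
		scratch = appendPub(scratch[:0], subject, 2)
	})
	fresh := testing.AllocsPerRun(1000, func() {
		_ = sprintfPub(subject, 2)
	})
	fmt.Printf("reused buffer: %.0f allocs/op\n", reused)
	fmt.Printf("fmt.Sprintf:   %.0f allocs/op\n", fresh)
}
```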

Big Lesson Learned?

Go is a good choice for performance-based systems

Go is getting better faster than the others

Thanks

Resources

• https://github.com/apcera/gnatsd

• https://github.com/apcera/nats

• https://github.com/derekcollison/nats
