Writing concurrent programs robustly with Go and zeromq.
Writing Concurrent Programs robustly and productively with Go and zeromq
18 November 2014
Loh Siu Yin
Technology Consultant, Beyond Broadcast LLP
Traditional v. Concurrent
sequential (executed one after the other)
concurrent (executed at the same time)
concurrency -- reflects the way we interact with the real world
Which languages can be used to write concurrent programs?
Concurrent languages
C with pthreads lib
Java with concurrent lib
Scala with actors lib
Scala with Akka lib
Go
Any difference between Go and the other languages?
Concurrent programs are hard to write
Why Go for concurrent programs?
Go does not need an external concurrency library
Avoids locks, semaphores, critical sections...
Instead uses goroutines and channels
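A minimal sketch of the goroutines-and-channels style, added here for illustration. The function name sumSquares is hypothetical and not part of the talk; the point is only that results travel over a channel, so no lock or shared counter is needed.

```go
package main

import "fmt"

// sumSquares starts one goroutine per input value and collects the
// results over a channel, so no lock or shared variable is needed.
func sumSquares(nums []int) int {
    results := make(chan int)
    for _, n := range nums {
        go func(n int) {
            results <- n * n // each goroutine sends exactly one result
        }(n)
    }
    total := 0
    for range nums {
        total += <-results // receive one result per goroutine
    }
    return total
}

func main() {
    fmt.Println(sumSquares([]int{1, 2, 3, 4})) // 1 + 4 + 9 + 16
}
```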
Software I use to write concurrent programs:
Go -- Go has concurrency baked into the language.
zeromq -- zeromq is a networking library with multitasking functions.
For more info:
golang.org (http://golang.org)
zeromq.org (http://zeromq.org)
Go Hello World program
Why capital P in Println?
Println is not a class (as in Java).
It is a regular function that is exported from (visible outside) the fmt package.
The "fmt" package provides formatting functions like Println and Printf.

package main

import "fmt"

func main() {
    fmt.Println("Hello Go!")
}
Package naming in Go
If you import "a/b/c/d" instead of "fmt", and package "a/b/c/d" has exported Println, then that package's Println is called as d.Println and not a.b.c.d.Println.
If you import "abcd", what is the package qualifier name?
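As a small illustration using only the standard library, the sketch below imports "path/filepath". The import path has two elements, but the qualifier used in code is only the last one, filepath. The helper name lastElement is hypothetical.

```go
package main

import (
    "fmt"
    "path/filepath" // import path has two elements; the qualifier is "filepath"
)

// lastElement calls Base through the qualifier filepath, not path.filepath.
func lastElement(p string) string {
    return filepath.Base(p)
}

func main() {
    fmt.Println(lastElement("a/b/c/d"))
}
```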
Go Concurrent Hello World
package main

import (
    "fmt"
    "time"
)

func a() {
    for {
        fmt.Print(".")
        time.Sleep(time.Second)
    }
}

func b() {
    for {
        fmt.Print("+")
        time.Sleep(2 * time.Second)
    }
}

func main() {
    go b() // Change my order
    go a()
    //time.Sleep(4 * time.Second) // uncomment me!
}
goroutine lifecycle
goroutines are garbage collected when they end. This means the Go runtime reclaims the resources used by a goroutine when it ends.
goroutines are killed when func main() ends, because the program exits and all of its resources are released.
goroutines can be run from functions other than main. They are not killed if that function exits. They are killed only when main() exits.
The GOMAXPROCS variable limits the number of operating system threads that can execute user-level Go code simultaneously. [default = 1 when this talk was given; since Go 1.5 the default is the number of available CPU cores]
At 1 there is no parallel execution,
increase to 2 or higher for parallel execution if you have 2 or more cores.
golang.org/pkg/runtime (http://golang.org/pkg/runtime/)
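Because goroutines die when main() returns, main has to wait for them. The sketch below shows one common way to do that with sync.WaitGroup, and reads the current GOMAXPROCS setting. The helper name runWorkers is hypothetical; this is an illustration, not code from the talk.

```go
package main

import (
    "fmt"
    "runtime"
    "sync"
)

// runWorkers starts n goroutines, blocks until all of them have finished,
// and returns how many reported completion.
func runWorkers(n int) int {
    var wg sync.WaitGroup
    finished := make(chan int, n) // buffered so workers never block on send
    for i := 0; i < n; i++ {
        wg.Add(1)
        go func(id int) {
            defer wg.Done()
            finished <- id
        }(i)
    }
    wg.Wait() // without this, the caller could return before the workers run
    close(finished)
    count := 0
    for range finished {
        count++
    }
    return count
}

func main() {
    // GOMAXPROCS(0) reports the current setting without changing it.
    fmt.Println("GOMAXPROCS:", runtime.GOMAXPROCS(0))
    fmt.Println("workers finished:", runWorkers(3))
}
```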
Go channels
Go channels provide a type-safe means of communication between:
the main function and a goroutine, or
two goroutines
What is:
a goroutine?
the main function?
func x()
func x() returns a channel of integers that can only be read from.
Internally it runs a goroutine that emits an integer every 500ms.
Demo type safety. Use MyInt for i.
type MyInt int

func x() <-chan int {
    ch := make(chan int)
    go func(ch chan<- int) {
        // var i MyInt // Make i a MyInt
        var i int
        for i = 0; ; i++ {
            ch <- i // Send int into ch
            time.Sleep(500 * time.Millisecond)
        }
    }(ch)
    return ch
}
func y()
func y() returns a channel of integers that can only be written to.
All it does is run a goroutine to print out the integers it receives.

func y() chan<- int {
    ch := make(chan int)
    go func(ch <-chan int) {
        for {
            i := <-ch
            fmt.Print(i, " ")
        }
    }(ch)
    return ch
}
Go channels 1
1 of 2 ...
package main

import (
    "fmt"
    "time"
)

type MyInt int

func x() <-chan int {
    ch := make(chan int)
    go func(ch chan<- int) {
        // var i MyInt // Make i a MyInt
        var i int
        for i = 0; ; i++ {
            ch <- i // Send int into ch
            time.Sleep(500 * time.Millisecond)
        }
    }(ch)
    return ch
}
Go channels 2
2 of 2
func y() chan<- int {
    ch := make(chan int)
    go func(ch <-chan int) {
        for {
            i := <-ch
            fmt.Print(i, " ")
        }
    }(ch)
    return ch
}

func main() {
    xch := x() // emit int every 500 ms
    ych := y() // print the int
    for {
        select {
        case n := <-xch:
            ych <- n // send it to ych for display
        case <-time.After(501 * time.Millisecond): // Change me
            fmt.Print("x")
        }
    }
}
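The select with time.After used in main() can be isolated into a small helper. The following is a sketch under that assumption; the name recvOrTimeout is hypothetical. It returns the received value, or reports that the timeout fired first.

```go
package main

import (
    "fmt"
    "time"
)

// recvOrTimeout waits for a value on ch and gives up after d.
// The boolean result is false when the timeout fired first.
func recvOrTimeout(ch <-chan int, d time.Duration) (int, bool) {
    select {
    case n := <-ch:
        return n, true
    case <-time.After(d):
        return 0, false
    }
}

func main() {
    ready := make(chan int, 1)
    ready <- 42
    fmt.Println(recvOrTimeout(ready, time.Second)) // a value is already queued

    empty := make(chan int)
    fmt.Println(recvOrTimeout(empty, 10*time.Millisecond)) // nothing ever arrives
}
```

Changing the timeout in the slide's main() from 501 ms to less than 500 ms makes the timeout case win between integers, which is what the "Change me" comment invites.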
Synchronizing goroutines
n gets an integer from xch and pushes it to ych to be displayed.
What if the source of data, xch, is on a different machine?
Go can't help here. There is no longer a network channel package.
Rob Pike (one of the Go authors) said that he didn't quite know what he was doing...
zeromq Networking Patterns
Pub/Sub: Many programs can pub to a network endpoint. Many other programs can sub from that endpoint. All subscribers get messages from multiple publishers.
Req/Rep: Many clients can req services from a server endpoint, which reps with replies to the client.
Push/Pull: Many programs can push to a network endpoint. Many other programs can pull from that endpoint. Messages are round-robin routed to an available puller.
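As an in-process analogy only (this is not zeromq and uses no zeromq API), the Push/Pull idea that each message is handled by exactly one puller can be sketched with a single Go channel shared by several goroutines. The name distribute is hypothetical. Note that Go channels hand each message to whichever receiver is ready, so strict round-robin order is not guaranteed here.

```go
package main

import (
    "fmt"
    "sync"
)

// distribute sends msgs over one channel to n pullers and returns how many
// messages each puller handled. Every message goes to exactly one puller.
func distribute(msgs []string, n int) []int {
    work := make(chan string)
    counts := make([]int, n)
    var wg sync.WaitGroup
    for i := 0; i < n; i++ {
        wg.Add(1)
        go func(id int) {
            defer wg.Done()
            for range work {
                counts[id]++ // each puller writes only its own slot
            }
        }(i)
    }
    for _, m := range msgs {
        work <- m
    }
    close(work) // lets the pullers' range loops end
    wg.Wait()
    return counts
}

func main() {
    counts := distribute([]string{"a", "b", "c", "d", "e", "f"}, 3)
    fmt.Println("messages handled per puller:", counts)
}
```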
Using zeromq in a Go program
Import a package that implements zeromq
zeromq Pusher
package main

import (
    "fmt"

    zmq "github.com/pebbe/zmq2"
)

func main() {
    fmt.Println("Starting pusher.")

    ctx, _ := zmq.NewContext(1)
    defer ctx.Term()

    push, _ := ctx.NewSocket(zmq.PUSH)
    defer push.Close()
    push.Connect("ipc://pushpull.ipc")
    // push.Connect("tcp://12.34.56.78:5555")

    for i := 0; i < 3; i++ {
        msg := fmt.Sprintf("Hello zeromq %d", i)
        push.Send(msg, 0)
        fmt.Println(msg)
    }
    // Watch for Program Exit
}
zeromq Puller
This puller may be a data-mover moving gigabytes of data around. It has to be rock-solid, with the program running as a daemon (service) and never shut down.
Go fits this description perfectly! Why?
In addition, Go has a built-in defer statement which helps to avoid resource leaks, such as sockets that are never closed.
func main() {
    fmt.Println("Starting puller")

    ctx, _ := zmq.NewContext(1)
    defer ctx.Term()

    pull, _ := ctx.NewSocket(zmq.PULL)
    defer pull.Close()
    pull.Bind("ipc://pushpull.ipc")

    for {
        msg, _ := pull.Recv(0)
        time.Sleep(2 * time.Second) // Doing time consuming work
        fmt.Println(msg)            // work all done
    }
}
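To show how defer orders cleanup, here is a small sketch that does not need zeromq. The resource type is a hypothetical stand-in for a context or socket. Deferred calls run when the surrounding function returns, last deferred first, so the socket is closed before the context, mirroring the puller above.

```go
package main

import "fmt"

// resource is a hypothetical stand-in for a socket or context.
type resource struct {
    name   string
    events *[]string
}

// Close records that the resource was released.
func (r resource) Close() {
    *r.events = append(*r.events, "close "+r.name)
}

// useResources opens two resources and relies on defer to release them.
// It returns the order in which work and cleanup happened.
func useResources() []string {
    var events []string
    func() {
        ctx := resource{"ctx", &events}
        defer ctx.Close()
        sock := resource{"sock", &events}
        defer sock.Close()
        events = append(events, "work")
    }()
    return events
}

func main() {
    fmt.Println(useResources())
}
```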
msg, _ := pull.Recv(0)
functions in Go can return multiple values.
the _ above stands in for what is usually an err variable, e.g. msg, err := pull.Recv(0)
a nil err value means no error
if you write msg := pull.Recv(0) [note: no _ or err var], the compiler will fail the compile with an error message (not a warning).
typing _ forces the programmer to think about error handling

msg, err := pull.Recv(0)
if err != nil {
    fmt.Println("zmq pull:", err)
}
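The same pattern appears throughout the standard library. As a sketch that runs without zeromq, the hypothetical helper parsePort below wraps strconv.Atoi, which also returns a value and an error, and passes the error back to the caller instead of discarding it with _.

```go
package main

import (
    "fmt"
    "strconv"
)

// parsePort converts s to a port number and passes any error back to the caller.
func parsePort(s string) (int, error) {
    port, err := strconv.Atoi(s)
    if err != nil {
        return 0, fmt.Errorf("parse port %q: %v", s, err)
    }
    return port, nil
}

func main() {
    if port, err := parsePort("5555"); err == nil {
        fmt.Println("port:", port)
    }
    if _, err := parsePort("abc"); err != nil {
        fmt.Println("error:", err)
    }
}
```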
zeromq Controller/Pusher in ruby
The pusher may be a controller that is in active development -- requiring frequent code updates and restarts. With zeromq, we can decouple the stable long-running process from the unstable code being developed. What is the advantage of this?

#!/usr/bin/env ruby
require 'ffi-rzmq'

puts "Starting ruby pusher"
ctx = ZMQ::Context.new(1)
push = ctx.socket(ZMQ::PUSH)
push.connect("ipc://pushpull.ipc")
# push.connect("tcp://12.34.56.78:5555")

(0..2).each do |i|
  msg = "Hello %d" % i
  push.send_string(msg)
  puts(msg)
end

push.close()
ctx.terminate()
Putting it all together
email_mover (puller) has two slow tasks: email and move big data.
email_mover
func main() {
    fmt.Println("Starting email_mover")
    z := zmqRecv()
    e := emailer()
    m := mover()
    for {
        s := <-z
        e <- s
        m <- s
        // report when done to 0.1ms resolution
        fmt.Println(time.Now().Format("05.0000"), "done:", s)
    }
}
zmqRecv goroutine
func zmqRecv() <-chan string {
    ch := make(chan string)

    go func(ch chan<- string) {
        ctx, _ := zmq.NewContext(1)
        defer ctx.Term()

        pull, _ := ctx.NewSocket(zmq.PULL)
        defer pull.Close()
        pull.Bind("ipc://pushpull.ipc")
        for {
            msg, _ := pull.Recv(0)
            ch <- msg
        }
    }(ch)
    return ch
}
emailer goroutine
func emailer() chan<- string {
    ch := make(chan string, 100) // buffered chan
    go func(ch <-chan string) {
        for {
            s := <-ch
            time.Sleep(1 * time.Second)
            fmt.Println("email:", s)
        }
    }(ch)
    return ch
}
mover goroutine
func mover() chan<- string {
    ch := make(chan string, 100) // buffered chan
    go func(ch <-chan string) { // name the parameter so the goroutine uses its receive-only view
        for {
            s := <-ch
            time.Sleep(3 * time.Second)
            fmt.Println("move:", s)
        }
    }(ch)
    return ch
}
email_mover main()
Why do we need buffered channels in emailer() and mover() and not for zmqRecv()?
email_mover is a stable service in production written in Go.
This service should never stop, leak memory or lose data.
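One way to explore the buffered-channel question is to try non-blocking sends. In the sketch below (the helper trySend is hypothetical), a send on an unbuffered channel fails when no receiver is waiting, while a buffered channel accepts sends until its buffer is full. This suggests why main() can hand work to the slow emailer and mover without stalling, as long as their 100-slot buffers have room.

```go
package main

import "fmt"

// trySend attempts a non-blocking send and reports whether it succeeded.
func trySend(ch chan<- string, s string) bool {
    select {
    case ch <- s:
        return true
    default:
        return false // no receiver ready and no buffer space left
    }
}

func main() {
    unbuffered := make(chan string)
    buffered := make(chan string, 2)

    fmt.Println(trySend(unbuffered, "a")) // false: nobody is receiving
    fmt.Println(trySend(buffered, "a"))   // true: stored in the buffer
    fmt.Println(trySend(buffered, "b"))   // true: buffer is now full
    fmt.Println(trySend(buffered, "c"))   // false: no room left
}
```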
Go Pusher (not changed)
ruby Pusher (not changed)
email_mover maintenance
email_mover sends emails and moves files. These are well understood, stable functions. This service should never be shut down.
However, hardware needs maintenance. How do we swap in a new email_mover without losing data?
zeromq to the rescue!
"ØMQ ensures atomic delivery of messages; peers shall receive either all message parts of a message or none at all."
Maintenance Procedure:
Disconnect the network cable from the first email_mover host. zeromq messages will now begin to queue up at the pushers. Because zeromq message delivery is atomic, no data is lost.
Connect the network cable to the new email_mover host. zeromq messages begin to flow again.
Job done!
Why is data not lost?
Receive
The half-received message / data packet received by the old host was deemed as not received by zeromq and is discarded.
Send
The half-sent message / data packet that was interrupted when the connection was broken was detected by zeromq as not delivered and will be sent again, this time to the new host.
Thank you
Loh Siu Yin
Technology Consultant, Beyond Broadcast LLP
siuyin@beyondbroadcast.com (mailto:siuyin@beyondbroadcast.com)