Crunching data with go: Tips, tricks, use-cases
Sergii Khomenko, Data Scientist, STYLIGHT
sergii.khomenko@stylight.com, @lc0d3r
MUNICH GOPHERS - APR 24 2014, MUNICH


DESCRIPTION

Talk for the first meetup of the Munich Golang User Group. It describes use-cases from real Go development: fetching data from an SQL database, connecting to Google services such as Google Analytics and Google BigQuery, and other aspects of building a geolocation application.


Page 1: Crunching data with go: Tips, tricks, use-cases

Crunching data with go: Tips, tricks, use-cases

Sergii Khomenko, Data Scientist, STYLIGHT

sergii.khomenko@stylight.com, @lc0d3r

MUNICH GOPHERS - APR 24 2014, MUNICH

Page 2: Crunching data with go: Tips, tricks, use-cases

Agenda

WHAT IT'S ABOUT

• Relational databases
• Google Analytics and BigQuery
• Geolocation
• Useful things from Go-world

Page 3: Crunching data with go: Tips, tricks, use-cases

Relational databases

Page 4: Crunching data with go: Tips, tricks, use-cases

• github.com/jmoiron/sqlx

    type Clickout struct {
        Id, Count int
        Ip string
        Type int
        Commision, Eu_commission float32
    }

Page 5: Crunching data with go: Tips, tricks, use-cases

    db, err := sqlx.Connect(config.Database.Driver,
        fmt.Sprintf("%s:%s@%s(%s)/%s?parseTime=true",
            config.Database.Username, config.Database.Password,
            config.Database.Protocol, config.Database.Server,
            config.Database.Database))

    fmt.Printf("Connect to %s:(%s)... \n", config.Database.Protocol, config.Database.Server)
    if err != nil {
        // log.Fatalf exits the process, so no return is needed after it.
        log.Fatalf("Can not connect to the mysql server - %s", err)
    }
    defer db.Close()
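The format string on this slide follows the go-sql-driver/mysql data source name layout, user:password@protocol(address)/dbname. A minimal sketch of moving it into a helper so it can be checked in isolation, assuming a hypothetical DatabaseConfig type that mirrors the config.Database fields used above (the type name and the sample values are not from the talk):

```go
package main

import "fmt"

// DatabaseConfig is a hypothetical stand-in for the config.Database
// value used on the slide; only the field names come from the talk.
type DatabaseConfig struct {
	Driver, Username, Password, Protocol, Server, Database string
}

// DSN builds a data source name of the form
// user:password@protocol(address)/dbname?parseTime=true
func (c DatabaseConfig) DSN() string {
	return fmt.Sprintf("%s:%s@%s(%s)/%s?parseTime=true",
		c.Username, c.Password, c.Protocol, c.Server, c.Database)
}

func main() {
	cfg := DatabaseConfig{
		Driver:   "mysql",
		Username: "reader",
		Password: "secret",
		Protocol: "tcp",
		Server:   "127.0.0.1:3306",
		Database: "stats",
	}
	// The result would be passed as the second argument of sqlx.Connect.
	fmt.Println(cfg.DSN())
}
```

With such a helper the connection call shortens to sqlx.Connect(cfg.Driver, cfg.DSN()).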

Page 6: Crunching data with go: Tips, tricks, use-cases

    dbParams := paramStruct{
        "start": arguments["<from>"].(string) + " 00:00:00",
        "end": arguments["<to>"].(string) + " 23:59:59",
    }
    geoParams := paramStruct{}

    siteStr, _ := arguments["--site"].(string)
    if siteInt, err2 := strconv.Atoi(siteStr); err2 == nil {
        dbParams["site"] = siteInt
    }

    query := getClickoutsQuery(dbParams)
    rows, err := db.Queryx(query)
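The body of getClickoutsQuery is not shown in the talk. One way such a builder could look is sketched below: it returns the SQL text with "?" placeholders plus the matching argument list, so the values never get concatenated into the query. The function name buildClickoutsQuery, the table name and the column names are all assumptions made for illustration.

```go
package main

import "fmt"

// buildClickoutsQuery returns a query with "?" placeholders and the
// arguments to bind to them. Table and column names are hypothetical.
// site is optional: pass nil to query all sites.
func buildClickoutsQuery(from, to string, site *int) (string, []interface{}) {
	query := "SELECT id, ip, type, count FROM clickouts" +
		" WHERE created_at BETWEEN ? AND ?"
	// Expand the dates to cover the full first and last day.
	args := []interface{}{from + " 00:00:00", to + " 23:59:59"}
	if site != nil {
		query += " AND site_id = ?"
		args = append(args, *site)
	}
	return query, args
}

func main() {
	site := 3
	query, args := buildClickoutsQuery("2014-04-01", "2014-04-06", &site)
	fmt.Println(query)
	fmt.Println(args...)
}
```

Because sqlx's Queryx accepts a query followed by variadic arguments, the result could be used as db.Queryx(query, args...).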

Page 7: Crunching data with go: Tips, tricks, use-cases

    if err == nil {
        for rows.Next() {
            click := Clickout{}

            err2 := rows.StructScan(&click)
            if err2 == nil {
                task <- click
            } else {
                fmt.Println(err2)
            }
        }
        close(task)
    } else {
        log.Fatalf("SQL Error - %s", err)
    }

Page 8: Crunching data with go: Tips, tricks, use-cases

Geolocation

WHERE MY IPS ARE FROM

Page 9: Crunching data with go: Tips, tricks, use-cases

    task := make(chan Clickout)
    result := make(chan IpResult)
    done = make(chan interface{})

    go processChannel(task, result)
    go aggregateResults(result, &results)

    if err == nil {
        for rows.Next() {
            click := Clickout{}

            err2 := rows.StructScan(&click)
            if err2 == nil {
                task <- click
            } else {
                fmt.Println(err2)
            }
        }
        close(task)
    } else {
        log.Fatalf("SQL Error - %s", err)
    }

Page 10: Crunching data with go: Tips, tricks, use-cases

    func processChannel(tc chan Clickout, rc chan IpResult) {
        for click := range tc {
            if subnet, err := findNetwork(click.Ip); err == nil {
                rc <- IpResult{click, subnet}
            } else {
                rc <- IpResult{click, new(IpSubnet)}
            }
        }
        close(rc)
    }
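findNetwork itself does not appear on the slides. The startInt field used in the aggregation step suggests that subnets are stored as integer ranges, so one plausible implementation is a binary search over sorted, non-overlapping IPv4 ranges. The sketch below works under that assumption; the table contents, the endInt and country fields, and the helper names are invented for illustration, and the function takes the table as an extra parameter to stay self-contained.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
	"net"
	"sort"
)

// IpSubnet is a hypothetical inclusive IPv4 range stored as integers.
type IpSubnet struct {
	startInt, endInt uint32
	country          string
}

// ipToInt converts a dotted IPv4 string to its 32-bit big-endian value.
func ipToInt(s string) (uint32, error) {
	ip := net.ParseIP(s).To4()
	if ip == nil {
		return 0, errors.New("not an IPv4 address: " + s)
	}
	return binary.BigEndian.Uint32(ip), nil
}

// mustInt is a convenience for building the example table.
func mustInt(s string) uint32 {
	n, err := ipToInt(s)
	if err != nil {
		panic(err)
	}
	return n
}

// exampleTable returns made-up ranges, sorted by start and non-overlapping.
func exampleTable() []IpSubnet {
	return []IpSubnet{
		{mustInt("10.0.0.0"), mustInt("10.255.255.255"), "private"},
		{mustInt("81.2.69.0"), mustInt("81.2.69.255"), "GB"},
	}
}

// findNetwork locates the range containing s using binary search.
func findNetwork(table []IpSubnet, s string) (*IpSubnet, error) {
	n, err := ipToInt(s)
	if err != nil {
		return nil, err
	}
	// First range whose end is >= n; it matches only if it starts <= n.
	i := sort.Search(len(table), func(i int) bool { return table[i].endInt >= n })
	if i < len(table) && table[i].startInt <= n {
		return &table[i], nil
	}
	return nil, errors.New("no subnet for " + s)
}

func main() {
	table := exampleTable()
	for _, ip := range []string{"81.2.69.142", "8.8.8.8"} {
		if subnet, err := findNetwork(table, ip); err == nil {
			fmt.Println(ip, "->", subnet.country)
		} else {
			fmt.Println(ip, "->", err)
		}
	}
}
```

Each lookup costs O(log n) comparisons, which matters when every clickout row triggers one.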

Page 11: Crunching data with go: Tips, tricks, use-cases

    func aggregateResults(rc chan IpResult, rs *map[string]*AggrResults) {
        results := *rs
        found, notFound := 0, 0

        for result := range rc {
            if result.Subnet.startInt == 0 {
                notFound += result.click.Count
                log.Printf("Can not find ip %s\n", result.click.Ip)
            } else {
                found += result.click.Count
                log.Printf("%s is {%s - %s} \n", result.click.Ip,
                    result.Subnet.startIp, result.Subnet.endIp)

                AddResult(&results, result)
            }
        }
        fmt.Printf("%f (%d) IPs in GeoIP db and %f (%d) not found out of %d\n",
            float32(found)/float32(found+notFound), found,
            float32(notFound)/float32(found+notFound), notFound, found+notFound)

        close(done)
    }
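The slides show the two stages and the done channel, but not the point where the main goroutine waits for the aggregation to finish. A self-contained sketch of the complete wiring follows, with the database rows replaced by a slice and the GeoIP lookup replaced by a map of known IPs. Both substitutions, the run function, and the simplified types are assumptions for illustration only.

```go
package main

import "fmt"

// Simplified stand-ins for the types used on the slides.
type Clickout struct {
	Ip    string
	Count int
}

type IpResult struct {
	click Clickout
	found bool
}

// processChannel looks up every clickout and forwards the result.
// It closes rc once tc is drained so the next stage can terminate.
func processChannel(tc <-chan Clickout, rc chan<- IpResult, known map[string]bool) {
	for click := range tc {
		rc <- IpResult{click, known[click.Ip]}
	}
	close(rc)
}

// aggregateResults sums the counts and signals completion via done.
func aggregateResults(rc <-chan IpResult, found, notFound *int, done chan<- struct{}) {
	for r := range rc {
		if r.found {
			*found += r.click.Count
		} else {
			*notFound += r.click.Count
		}
	}
	close(done)
}

// run feeds the pipeline and blocks until the aggregation has finished.
func run(clicks []Clickout, known map[string]bool) (int, int) {
	task := make(chan Clickout)
	result := make(chan IpResult)
	done := make(chan struct{})
	var found, notFound int

	go processChannel(task, result, known)
	go aggregateResults(result, &found, &notFound, done)

	for _, c := range clicks {
		task <- c
	}
	close(task) // no more input: lets processChannel finish
	<-done      // wait for aggregateResults before reading the totals
	return found, notFound
}

func main() {
	clicks := []Clickout{{"81.2.69.142", 3}, {"10.0.0.1", 2}, {"203.0.113.9", 2}}
	known := map[string]bool{"81.2.69.142": true, "10.0.0.1": true}
	found, notFound := run(clicks, known)
	fmt.Printf("found=%d notFound=%d\n", found, notFound)
}
```

Closing each channel from its sending side is what lets the range loops downstream end; without the final receive on done, the totals could be read before the aggregation goroutine has written them.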

Page 12: Crunching data with go: Tips, tricks, use-cases

    package main

    import (
        "fmt"
        "runtime"
    )

    func main() {
        fmt.Printf("GOMAXPROCS is %d %d %d\n", runtime.GOMAXPROCS(0), runtime.NumCPU(), runtime.NumGoroutine())

        runtime.GOMAXPROCS(runtime.NumCPU())
        fmt.Printf("GOMAXPROCS is %d %d %d\n", runtime.GOMAXPROCS(0), runtime.NumCPU(), runtime.NumGoroutine())
    }
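Raising GOMAXPROCS only pays off when several goroutines have work to do at the same time. Since Go 1.5 the runtime defaults GOMAXPROCS to the number of available CPUs, so the explicit call is mainly relevant for the older releases current at the time of the talk. A sketch of fanning a stage out over one worker per CPU is shown below. It uses integers and a squaring function as placeholder work; fanOut and square are hypothetical names, not code from the talk.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// fanOut distributes inputs over the given number of workers and
// returns the sum of work(v) over all inputs.
func fanOut(inputs []int, workers int, work func(int) int) int {
	tasks := make(chan int)
	results := make(chan int)

	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for v := range tasks {
				results <- work(v)
			}
		}()
	}

	// Feed the workers, then signal that no more tasks will arrive.
	go func() {
		for _, v := range inputs {
			tasks <- v
		}
		close(tasks)
	}()

	// With several senders, results may only be closed after all
	// workers have returned, hence the WaitGroup.
	go func() {
		wg.Wait()
		close(results)
	}()

	sum := 0
	for r := range results {
		sum += r
	}
	return sum
}

func square(v int) int { return v * v }

func main() {
	total := fanOut([]int{1, 2, 3, 4}, runtime.NumCPU(), square)
	fmt.Println("sum of squares:", total)
}
```

The same shape would apply to the lookup stage from the geolocation slides: several processChannel-style workers reading from one task channel, with the result channel closed only after all of them are done.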

Page 13: Crunching data with go: Tips, tricks, use-cases

    db, err := geoip2.Open("data/GeoLite2-City.mmdb")
    if err != nil {
        panic(err)
    }

    ip := net.ParseIP("81.2.69.142")
    record, err := db.City(ip)
    if err != nil {
        panic(err)
    }

    fmt.Printf("Portuguese (BR) city name: %v\n", record.City.Names["pt-BR"])
    fmt.Printf("English subdivision name: %v\n", record.Subdivisions[0].Names["en"])
    fmt.Printf("Russian country name: %v\n", record.Country.Names["ru"])
    fmt.Printf("ISO country code: %v\n", record.Country.IsoCode)
    fmt.Printf("Time zone: %v\n", record.Location.TimeZone)
    fmt.Printf("Coordinates: %v, %v\n", record.Location.Latitude, record.Location.Longitude)

    db.Close()

Page 14: Crunching data with go: Tips, tricks, use-cases

Google Analytics and BigQuery

Page 15: Crunching data with go: Tips, tricks, use-cases

    var config = &oauth.Config{
        ClientId: "client-id-here.apps.googleusercontent.com",
        ClientSecret: "client-secret-here",
        Scope: "https://www.googleapis.com/auth/analytics.readonly",
        AuthURL: "https://accounts.google.com/o/oauth2/auth",
        TokenURL: "https://accounts.google.com/o/oauth2/token",
    }

Page 16: Crunching data with go: Tips, tricks, use-cases

    oauthHttpClient := getOAuthClient(config)
    analyticsService, err := analytics.New(oauthHttpClient)
    if err != nil {
        log.Fatal("Failed to create GA service")
    }

    dataService := analytics.NewDataGaService(analyticsService)
    dataGaGetCall := dataService.Get(gaId, start, end, metrics)

Page 17: Crunching data with go: Tips, tricks, use-cases

    data, err := dataGaGetCall.Do()
    if err != nil {
        log.Fatal("Failed fetch data from GA")
    }

    return data.Rows

Page 18: Crunching data with go: Tips, tricks, use-cases

    func main() {
        gaOptions := map[string]string{
            "dimensions": "ga:region,ga:city",
            "sort": "-ga:visits",
            "limit": "10",
        }
        rows := fetchGAData(config, "ga:11781168", "2014-04-06", "2014-04-06",
            "ga:visits", gaOptions)

        for i, row := range rows {
            fmt.Printf("row=%d %v\n", i, row)
        }
    }
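How fetchGAData applies the gaOptions map is not shown on the slides. As an illustration of what the call amounts to, the sketch below builds the equivalent request URL for the Core Reporting API v3 with the standard library only. It assumes that the "limit" key, which is the talk's own name, corresponds to the API's max-results parameter and that the other keys are passed through unchanged; gaQueryURL is a hypothetical helper, not part of the generated client.

```go
package main

import (
	"fmt"
	"net/url"
)

// gaQueryURL assembles a Core Reporting API v3 data request URL.
// Required parameters are positional; optional ones come from options.
func gaQueryURL(gaId, start, end, metrics string, options map[string]string) string {
	q := url.Values{}
	q.Set("ids", gaId)
	q.Set("start-date", start)
	q.Set("end-date", end)
	q.Set("metrics", metrics)
	for key, value := range options {
		// Assumption: the slide's "limit" option maps to max-results.
		if key == "limit" {
			key = "max-results"
		}
		q.Set(key, value)
	}
	// Encode sorts parameters by key and percent-escapes the values.
	return "https://www.googleapis.com/analytics/v3/data/ga?" + q.Encode()
}

func main() {
	gaOptions := map[string]string{
		"dimensions": "ga:region,ga:city",
		"sort":       "-ga:visits",
		"limit":      "10",
	}
	fmt.Println(gaQueryURL("ga:11781168", "2014-04-06", "2014-04-06",
		"ga:visits", gaOptions))
}
```

An authenticated request would still need the OAuth client from the earlier slide; the sketch only covers how the arguments end up as query parameters.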

Page 19: Crunching data with go: Tips, tricks, use-cases
Page 20: Crunching data with go: Tips, tricks, use-cases

    config := &oauth.Config{
        ClientId: "client-id-here.apps.googleusercontent.com",
        ClientSecret: "client-secret-here",
        Scope: bigquery.BigqueryScope,
        AuthURL: "https://accounts.google.com/o/oauth2/auth",
        TokenURL: "https://accounts.google.com/o/oauth2/token",
    }

    transport := &oauth.Transport{
        Token: token,
        Config: config,
    }
    client := transport.Client()

Page 21: Crunching data with go: Tips, tricks, use-cases

    service, err := bigquery.New(client)
    if err != nil {
        panic(err)
    }

    datasetList, err := service.Datasets.List("testing-project").Do()
    if err != nil {
        panic(err)
    }

    for _, d := range datasetList.Datasets {
        fmt.Println(d.FriendlyName)
    }

Page 22: Crunching data with go: Tips, tricks, use-cases

Useful and interesting Gophers

Page 23: Crunching data with go: Tips, tricks, use-cases

Interesting Gophers

• Golang machine learning lib: https://github.com/xlvector/hector
  • Logistic Regression
  • Factorized Machine
  • CART, Random Forest, Random Decision Tree, Gradient Boosting Decision Tree
  • Neural Network

Page 24: Crunching data with go: Tips, tricks, use-cases

Interesting Gophers

• Library for numeric operations: https://github.com/gonum - still fairly young, but they are working to bring some useful packages
  • matrix - Scientific math package for the Go language.
  • graph - Discrete math structures and functions

Page 25: Crunching data with go: Tips, tricks, use-cases

Reference list

• Why are ‘Cool Kids’ at Github Moving to GO Language? - http://www.homolog.us/blogs/blog/2014/01/16/golang/

• How suitable Go will be for scientific computing? - https://groups.google.com/forum/#!topic/golang-nuts/_VoZfniBTZE

Page 26: Crunching data with go: Tips, tricks, use-cases

Thank you!


Page 27: Crunching data with go: Tips, tricks, use-cases

MUNICH GOPHERS - APR 24 2014, MUNICH

Sergii Khomenko, Data Scientist

STYLIGHT GmbH

sergii.khomenko@stylight.com

@lc0d3r

STYLIGHT.COM

Page 28: Crunching data with go: Tips, tricks, use-cases

DAHO.AM - Developer Conference 06-06-14

SAVE THE DATE