Xephon-K: A lightweight TSDB with multiple backends
Pinglei Guo https://github.com/xephonhq/xephon-k
Agenda
● Overview
● Time Series Data Revisited
● Time Series Database state of the art
● Xephon-K Design
● Xephon-K Implementation
● Evaluation
● Lessons learned
● Related & Future work
● Conclusion
Overview
● Written in Golang (1,700 loc including bench and test)
● Use Cassandra as main backend
● Simple data model
● It is working
Time Series Data Revisited
NOT just data with timestamp
‘What happened, happened and couldn’t have happened another way’
- The Matrix
Time Series Data Revisited
Name    Saving  Update time
Rabbit  $100    2017/03/20:12:59:33
Tiger   $250    2017/03/20:12:59:33
Single record, update in place, tell current state

Name    Daily Transaction  Date
Rabbit  +$100,000          2017/03/19
Rabbit  -$99,900           2017/03/20
Tiger   +$125              2017/03/19
Tiger   +$125              2017/03/20
A series of events, immutable, tell the history
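The contrast between the two tables can be sketched in Go: the history is an append-only list of events, and the current state is derived by folding over it. The names `Transaction` and `Balances` are hypothetical and only serve this illustration.

```go
package main

import "fmt"

// Transaction is a hypothetical immutable event: who, how much, and when.
type Transaction struct {
	Name   string
	Amount int    // signed change in dollars
	Date   string // e.g. "2017/03/19"
}

// Balances folds the full event history into the current state per name.
func Balances(history []Transaction) map[string]int {
	state := make(map[string]int)
	for _, tx := range history {
		state[tx.Name] += tx.Amount
	}
	return state
}

func main() {
	// The daily transactions from the slide; events are only ever appended.
	history := []Transaction{
		{"Rabbit", 100000, "2017/03/19"},
		{"Rabbit", -99900, "2017/03/20"},
		{"Tiger", 125, "2017/03/19"},
		{"Tiger", 125, "2017/03/20"},
	}
	fmt.Println(Balances(history)) // map[Rabbit:100 Tiger:250]
}
```

The derived balances match the "Saving" column, but the reverse does not hold: the single-record table cannot reproduce the history.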
Time Series Database state of the art
Xephon-K Cassandra Yes Golang at15 N/A 1
Full list on: https://github.com/xephonhq/awesome-time-series-database
Xephon-K Implementation
● Naive schema and Cassandra data model
● Internal representation
● In Memory storage
● API
Xephon-K Implementation - Naive schema
metric_name  metric_timestamp        value
cpu          2017/03/17:13:24:00:20  10.2
cpu          2017/03/17:13:25:00:00  3.3
cpu          2017/03/17:13:26:00:00  5.6
mem          2017/03/17:13:24:00:20  80.3
mem          2017/03/17:13:25:00:00  60.2
mem          2017/03/17:13:26:00:00  90.3
cqlsh> SELECT * FROM metrics
Xephon-K Implementation - Naive schema
name  metric_timestamp        val
cpu   2017/03/17:13:24:00:20  10.2
cpu   2017/03/17:13:25:00:00  3.3
cpu   2017/03/17:13:26:00:00  5.6
mem   2017/03/17:13:24:00:20  80.3
mem   2017/03/17:13:25:00:00  60.2
mem   2017/03/17:13:26:00:00  90.3
The table is an abstraction of the underlying map
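One way to picture the underlying map is a map of maps keyed first by series name and then by timestamp. The Go sketch below is only a mental model, not how Cassandra is implemented; it assumes `name` acts as the partition key and `metric_timestamp` as the clustering key, which the slides do not state. The type `Table` and its methods are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// Table models the naive schema as a map of maps:
// partition key (metric name) -> clustering key (timestamp) -> value.
// Which column plays which role is an assumption, not taken from the slides.
type Table map[string]map[int64]float64

// Insert writes one cell, creating the partition on first use.
func (t Table) Insert(name string, ts int64, val float64) {
	if t[name] == nil {
		t[name] = make(map[int64]float64)
	}
	t[name][ts] = val
}

// Range returns the timestamps of one partition within [start, end] in
// ascending order. Go maps are unordered, so the keys are sorted on read.
func (t Table) Range(name string, start, end int64) []int64 {
	var keys []int64
	for ts := range t[name] {
		if ts >= start && ts <= end {
			keys = append(keys, ts)
		}
	}
	sort.Slice(keys, func(i, j int) bool { return keys[i] < keys[j] })
	return keys
}

func main() {
	t := Table{}
	t.Insert("cpu", 3, 5.6)
	t.Insert("cpu", 1, 10.2)
	t.Insert("cpu", 2, 3.3)
	t.Insert("mem", 1, 80.3)
	for _, ts := range t.Range("cpu", 1, 2) {
		fmt.Println("cpu", ts, t["cpu"][ts])
	}
}
```

Seen this way, a query for one metric over a time range is a lookup in the outer map followed by a scan of the inner one.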
Xephon-K Implementation - Internal representation
type IntPoint struct {
	T int64
	V int
}
type DoublePoint struct {
	T int64
	V float64
}
type IntSeries struct {
	Name   string
	Tags   map[string]string
	Points []IntPoint
}
type DoubleSeries struct {
	Name   string
	Tags   map[string]string
	Points []DoublePoint
}
Xephon-K Implementation - In Memory storage
type Data map[SeriesID]*IntSeriesStore
type IntSeriesStore struct {
	mu     sync.RWMutex
	series common.IntSeries
	length int
}
type Index []IndexRow
type IndexRow struct {
	key      string
	value    string
	seriesID SeriesID
}
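A minimal sketch of how such a store might be used, written in Go with the types redeclared locally so it compiles on its own. How `SeriesID` is computed is not shown in the slides; the `ID` function below (name plus sorted tags) is an assumption, and the `Store` wrapper, `Write`, and `Len` are hypothetical names.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
	"sync"
)

type SeriesID string

type IntPoint struct {
	T int64
	V int
}

type IntSeries struct {
	Name   string
	Tags   map[string]string
	Points []IntPoint
}

// ID builds a deterministic identifier from the name and the sorted tags.
// Assumption: the slides do not say how SeriesID is derived.
func ID(name string, tags map[string]string) SeriesID {
	keys := make([]string, 0, len(tags))
	for k := range tags {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	var b strings.Builder
	b.WriteString(name)
	for _, k := range keys {
		b.WriteString("," + k + "=" + tags[k])
	}
	return SeriesID(b.String())
}

type IntSeriesStore struct {
	mu     sync.RWMutex
	series IntSeries
	length int
}

// Store is a hypothetical wrapper that guards the top-level map with a lock.
type Store struct {
	mu   sync.RWMutex
	data map[SeriesID]*IntSeriesStore
}

func NewStore() *Store { return &Store{data: make(map[SeriesID]*IntSeriesStore)} }

// Write appends points to the matching series, creating it when missing.
func (s *Store) Write(in IntSeries) {
	id := ID(in.Name, in.Tags)
	s.mu.Lock()
	ss, ok := s.data[id]
	if !ok {
		ss = &IntSeriesStore{series: IntSeries{Name: in.Name, Tags: in.Tags}}
		s.data[id] = ss
	}
	s.mu.Unlock()

	ss.mu.Lock()
	ss.series.Points = append(ss.series.Points, in.Points...)
	ss.length = len(ss.series.Points)
	ss.mu.Unlock()
}

// Len returns how many points a series holds, or 0 if the series is unknown.
func (s *Store) Len(name string, tags map[string]string) int {
	s.mu.RLock()
	ss, ok := s.data[ID(name, tags)]
	s.mu.RUnlock()
	if !ok {
		return 0
	}
	ss.mu.RLock()
	defer ss.mu.RUnlock()
	return ss.length
}

func main() {
	s := NewStore()
	tags := map[string]string{"host": "server1", "data_center": "DC1"}
	var wg sync.WaitGroup
	for w := 0; w < 10; w++ {
		wg.Add(1)
		go func(w int) {
			defer wg.Done()
			s.Write(IntSeries{Name: "cpu", Tags: tags, Points: []IntPoint{{T: int64(w), V: 1}}})
		}(w)
	}
	wg.Wait()
	fmt.Println(ID("cpu", tags), s.Len("cpu", tags))
}
```

Holding one lock per series keeps writers to different series from blocking each other; only series creation takes the map-wide lock.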
Xephon-K Implementation - API Write
[
  {
    "name": "archive_file_tracked",
    "tags": { "host": "server1", "data_center": "DC1" },
    "points": [
      [1359788400000, 123], [1359788300000, 13], [1359788410000, 23]
    ]
  }
]
http://localhost:2333/write
Points as arrays:  { "points": [ [1359788400000, 123], [1359788300000, 13] ] }
Points as objects: { "points": [ {"t": 1359788400000, "v": 123}, {"t": 1359788300000, "v": 13} ] }
Use arrays instead of objects; all numeric values are numbers in JSON
Evaluation Environment Setup
● i7-6700 CPU @ 3.40GHz, 32 GB RAM, HDD, Ubuntu 16.10 (kernel 4.8.0-39)
● Docker 1.13 without resource limits on container
● InfluxDB 1.2
● KairosDB 1.12 + Cassandra 2.2
● Xephon-K (Go 1.7.4) + Cassandra 3.10
● Write to one series with one tag `cpi{agent:xephon-bench}` with fixed value
● Batch size 100 points, client timeout 30 seconds
● No QPS limit, No retry, No backoff
Evaluation - Throughput
Database  Total Requests
XKM       12327
XKC       7931
KairosDB  15561
InfluxDB  118
5 seconds, 10 workers
● InfluxDB performance is extremely poor (my bad?)
● KairosDB outperformed Xephon-K (K is from KairosDB …)
● Prometheus can’t be benchmarked (no HTTP write API)
Evaluation Analysis
Q: Why is InfluxDB so slow?
A: Good question, I am still figuring it out (see #15). You can’t blame Docker; running it locally gives the same result.
Q: Why is KairosDB faster? Java > Golang?
● Lock
● Buffer (batch size)
Q: That’s it?
A: Bingo! But https://github.com/xephonhq/xephon-k/tree/master/doc/bench has a bunch of results I haven’t dealt with.
Q: The chart looks good, what are you using?
A: echarts3 http://echarts.baidu.com/ (One JavaScript a day keeps Microsoft Excel away)
Lessons learned
● Write ugly code and make things work
● Hardware improves productivity: double the monitors, double the LOC/hr
● Source code is your best friend; don’t blindly believe what people say in docs, blogs, conferences, papers, Twitter, or Stack Overflow
Related work
Xephon-B: A TSDB benchmark tool and benchmark result sharing platform
● https://github.com/xephonhq/xephon-b
● Is a never-finished course project with @zchen
Reika: A DSL for TSDB
● https://github.com/xephonhq/tsdb-proxy-java/tree/master/ql
● Is also a course project (number two)
Xephon-K: I am course project three QvQ
Future work
● Refactor (every day I am blaming the code of yesterday)
● Storage without Cassandra (yeah, this is course project four)
● Dashboard
● Benchmark driven development using Xephon-B
Conclusion
● Time series data is a series of immutable data points; it tells the history
● CQL is an illusion created for RDBMS people
● Cassandra is a map of maps that contains maps
● http://echarts.baidu.com/ is a good charting library
● Ugly code works, perfect is the enemy of deadline (well, video games to be honest)
● Xephon-K is awesome
● What people say in their presentation may not be true, use the source, Luke