DNS-OARC 29 C-DNS in traffic capture
Using the C-DNS file format in packet capture
Jim Hague [email protected] sinodun.com
@SinodunCom
Agenda
• C-DNS project background & C-DNS capture format
• Open source collector implementation & deployment
• Presentation from 2017 ICANN DNS Symposium
• Present latest work (code name wombat):
  • Database cluster & schema to consume C-DNS
  • Provides ad-hoc queries + Grafana visualisation
C-DNS Project Background
• ICANN DNS Engineering team is responsible for the Root Server operated by ICANN
• DNS-STATS org created in 2014: A covering entity for the implementation of open source DNS statistics collection and presentation software.
• Sinodun contracts for the DNS Engineering team on DNS-STATS work, e.g. Hedgehog, a presentation tool for DNS traffic stats (DSC XML)
C-DNS Project Background
• ICANN managed root server (IMRS) operates 220+ anycast instances, 9.5 billion queries/day
• Historically used a combination of DSC XML + PCAP files
• Needed a DNS customised scalable file format for efficient data storage: C-DNS (Compacted DNS)
Majority are hosted, in many different types of networks, by many different organisations.
Some constrained, AND all on a 1RU server
C-DNS Project Goal: Target most limited use case
• Data collection on same hardware as nameserver
• Constrained instance resources:
  • CPU, bandwidth (Focus on serving not capturing DNS!)
  • Collected data locally stored on same hardware
  • Upload will use the same interface as DNS traffic
Design solution: Combine queries & responses and abstract common data
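The design idea can be pictured with a minimal sketch: each query is merged with its response into a single record, and values that repeat across records (client addresses, query names) are stored once in a table and referenced by index. The message keys, the `intern` and `build_block` helpers and the record layout are hypothetical illustrations, not the C-DNS schema.

```python
from typing import Dict, List, Tuple


def intern(table: List[str], index: Dict[str, int], value: str) -> int:
    """Return the table index for value, appending it on first sight."""
    if value not in index:
        index[value] = len(table)
        table.append(value)
    return index[value]


def build_block(messages: List[dict]) -> dict:
    """Pair queries with responses and replace repeated strings by indexes.

    Each message is a dict with hypothetical keys: 'qr' ('query' or
    'response'), 'client', 'id', 'qname' and, for responses, 'rcode'.
    Responses with no outstanding query are ignored in this sketch.
    """
    addresses: List[str] = []
    names: List[str] = []
    addr_idx: Dict[str, int] = {}
    name_idx: Dict[str, int] = {}
    pending: Dict[Tuple[str, int, str], dict] = {}
    items: List[dict] = []

    for msg in messages:
        key = (msg["client"], msg["id"], msg["qname"])
        if msg["qr"] == "query":
            item = {
                "client": intern(addresses, addr_idx, msg["client"]),
                "qname": intern(names, name_idx, msg["qname"]),
                "id": msg["id"],
                "rcode": None,  # filled in when the response arrives
            }
            pending[key] = item
            items.append(item)
        elif key in pending:
            # Response matches an outstanding query: merge into one record.
            pending.pop(key)["rcode"] = msg["rcode"]

    return {"addresses": addresses, "names": names, "items": items}


messages = [
    {"qr": "query", "client": "192.0.2.1", "id": 1, "qname": "example.com."},
    {"qr": "response", "client": "192.0.2.1", "id": 1,
     "qname": "example.com.", "rcode": 0},
    {"qr": "query", "client": "192.0.2.1", "id": 2, "qname": "example.com."},
    {"qr": "response", "client": "192.0.2.1", "id": 2,
     "qname": "example.com.", "rcode": 0},
]
block = build_block(messages)
print(block)
```

Four captured messages collapse into two records here, and the address and name are each stored once, which is where the space saving would come from.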
Storage format? CBOR
• What is it?
  • A serialisation format comparable to JSON but with binary representation
• Why use it?
  • IETF standard (RFC7049)
  • Simple format, simple to implement (16 languages)
  • CDDL draft - CBOR data definition language
  • Converts to JSON nicely
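To give a feel for "comparable to JSON but binary", here is a minimal sketch of a CBOR encoder covering only non-negative integers, text strings, arrays and maps as described in RFC 7049. The `encode` and `_head` functions and the sample record are illustrative assumptions, not part of any C-DNS tool; a real deployment would use an existing CBOR library.

```python
import json
import struct


def _head(major: int, value: int) -> bytes:
    """Encode a CBOR initial byte (major type) plus its argument."""
    if value < 24:
        return bytes([(major << 5) | value])
    if value < 2 ** 8:
        return bytes([(major << 5) | 24, value])
    if value < 2 ** 16:
        return bytes([(major << 5) | 25]) + struct.pack(">H", value)
    if value < 2 ** 32:
        return bytes([(major << 5) | 26]) + struct.pack(">I", value)
    return bytes([(major << 5) | 27]) + struct.pack(">Q", value)


def encode(obj) -> bytes:
    """Encode non-negative ints, str, list and dict; nothing else."""
    if isinstance(obj, bool):
        raise TypeError("bool is not handled in this sketch")
    if isinstance(obj, int) and obj >= 0:
        return _head(0, obj)                      # major type 0: unsigned int
    if isinstance(obj, str):
        data = obj.encode("utf-8")
        return _head(3, len(data)) + data         # major type 3: text string
    if isinstance(obj, list):
        return _head(4, len(obj)) + b"".join(encode(x) for x in obj)
    if isinstance(obj, dict):
        body = b"".join(encode(k) + encode(v) for k, v in obj.items())
        return _head(5, len(obj)) + body          # major type 5: map
    raise TypeError(f"unsupported type: {type(obj).__name__}")


# Hypothetical record, used only to compare encoded sizes.
record = {"qtype": 28, "qclass": 1, "counts": [1, 500, 70000]}
cbor_bytes = encode(record)
json_bytes = json.dumps(record, separators=(",", ":")).encode("utf-8")
print(len(cbor_bytes), len(json_bytes))
```

Small integers take one byte and strings carry a length prefix instead of quotes and escapes, so the binary form of this record comes out shorter than even the most compact JSON rendering.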
C-DNS format overview
Results: File size

Format                           PCAP    C-DNS
File size (Mb)                   660     75
Compressed with 'xz -9' (Mb)     49      18
User time for compression (s)    161     39

COMPRESSED SIZE: C-DNS is 30-40% size of PCAP
COMPRESSION CPU: C-DNS uses ~25% of PCAP
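As a quick check of the headline percentages, the ratios can be recomputed from the figures on this slide. The dictionary names are arbitrary; only the six numbers come from the slide.

```python
# Figures copied from the slide: sizes in Mb, CPU time in seconds.
pcap = {"raw": 660, "xz": 49, "cpu_s": 161}
cdns = {"raw": 75, "xz": 18, "cpu_s": 39}

raw_ratio = cdns["raw"] / pcap["raw"]      # uncompressed size ratio
xz_ratio = cdns["xz"] / pcap["xz"]         # compressed size ratio
cpu_ratio = cdns["cpu_s"] / pcap["cpu_s"]  # compression CPU time ratio

print(f"uncompressed size: {raw_ratio:.0%} of PCAP")
print(f"compressed size:   {xz_ratio:.0%} of PCAP")
print(f"compression CPU:   {cpu_ratio:.0%} of PCAP")
```

For this sample the compressed C-DNS file is about 37% of the compressed PCAP and compression takes about 24% of the CPU time, in line with the 30-40% and ~25% summaries. Before compression the C-DNS file is about 11% of the PCAP size.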
DNSOP - Draft Status
• Oct 2016: Submitted first draft
• Dec 2016: Draft adopted by WG
• Jul 2018: WG Last Call
• Oct 2018: Submitted to IESG for Publication
• draft-ietf-dnsop-dns-capture-format
Open Source release
• https://github.com/dns-stats/compactor
• compactor - capture traffic from network interface or convert PCAP, optional compression with xz or gzip
• inspector - convert C-DNS to PCAP
• work in progress: converter for C-DNS to templated text (TSV) output for DB import
Deployment Status
• compactor deployed in ICANN operated Root Servers for the last 2 years collecting C-DNS
• inspector used by OCTO to perform (lossy) reconstruction of PCAPs for manual traffic analysis
Deployment
• compactor runs on 1 CPU on the DNS server, listening on service network interface
• Writes xz compressed files to local storage
• Output file rotated every 5 minutes
• Files are periodically uploaded to central collection server
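The rotation and compression steps can be sketched as follows. This is not the compactor's code: the `.cdns.xz` naming scheme and the helper names are assumptions made for illustration, and only the five-minute period and the use of xz come from the slide.

```python
import lzma
from datetime import datetime, timezone

ROTATION_SECONDS = 300  # five-minute rotation, as stated on the slide


def window_start(ts: float, period: int = ROTATION_SECONDS) -> int:
    """Round a Unix timestamp down to the start of its rotation window."""
    return int(ts) - int(ts) % period


def output_name(ts: float) -> str:
    """Hypothetical naming scheme: one output file per rotation window."""
    start = datetime.fromtimestamp(window_start(ts), tz=timezone.utc)
    return start.strftime("%Y%m%d-%H%M%S") + ".cdns.xz"


def compress(payload: bytes) -> bytes:
    """xz-compress a finished chunk of capture data in memory."""
    return lzma.compress(payload, format=lzma.FORMAT_XZ)


# Two timestamps in the same window share a file; the next window does not.
print(output_name(0), output_name(299), output_name(300))
```

Keying the file name on the window start means every record captured within the same five minutes lands in the same file, and a finished file can be queued for upload as soon as the window closes.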
[Diagram: deployment data flow. On each DNS server the DNS-STATS Compactor writes C-DNS files, which are copied to central C-DNS storage. For OCTO processing, the DNS-STATS Inspector converts C-DNS into PCAP storage. For wombat, C-DNS is imported into ClickHouse, with Grafana on top.]
ClickHouse?
• Open source time series SQL column database
• We evaluated Elasticsearch, Spark/Cassandra (SMACK) and ClickHouse
• ClickHouse was the clear winner on performance (rate of import) and ease of use
• Grafana plugin and SQL for ad-hoc queries
• Used by Cloudflare for DNS analytics and by NIC Chile Research Labs DNSZeppelin project
ClickHouse aggregation
• All query/response raw data is stored in the main table
• ClickHouse does on-insert aggregation to additional aggregated tables for performance (details in appendix)
• Aggregation is simple SQL MATERIALIZED VIEW with specialised storage engine
• Currently doing per-second aggregation on selected quantities
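The effect of per-second aggregation can be illustrated outside the database with a small sketch: raw rows are folded into one entry per (second, node), with counts summed. It only mimics the idea of summing on insert. The row keys `time`, `node_id` and `qtype` are hypothetical, while `QueryCount` and `QueryTypeMap` echo column names from the appendix.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple

Key = Tuple[int, int]  # (timestamp truncated to the second, node id)


def aggregate(rows: Iterable[dict]) -> Dict[Key, dict]:
    """Sum per-row counts into one entry per (second, node)."""
    totals: Dict[Key, dict] = defaultdict(
        lambda: {"QueryCount": 0, "QueryTypeMap": Counter()}
    )
    for row in rows:
        key = (int(row["time"]), row["node_id"])
        entry = totals[key]
        entry["QueryCount"] += 1
        entry["QueryTypeMap"][row["qtype"]] += 1
    return dict(totals)


rows = [
    {"time": 100.2, "node_id": 1, "qtype": 28},
    {"time": 100.9, "node_id": 1, "qtype": 28},
    {"time": 100.5, "node_id": 1, "qtype": 1},
    {"time": 101.0, "node_id": 1, "qtype": 28},
]
summary = aggregate(rows)

# Counting AAAA (type 28) queries now touches two entries, not four rows.
aaaa_total = sum(e["QueryTypeMap"][28] for e in summary.values())
print(summary, aaaa_total)
```

A query over the aggregated data reads one entry per second per node, however many raw rows that second contained, which is why the aggregated tables are so much cheaper to scan.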
ClickHouse details
• 4 server cluster
• Adding ~10 billion records per day, ~120kqps
• Disc usage 1Tb per ~40 billion records
• Sample query speed: count all AAAA queries in a week
              Elapsed   Rows processed
Raw           30.5s     65.3 billion
Aggregated    0.3s      0.11 billion

100x speed up
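The figures on this slide can be cross-checked with some back-of-envelope arithmetic. The variable names are arbitrary, and the calculation assumes the daily record count is spread evenly over the day, which real traffic will not be.

```python
SECONDS_PER_DAY = 24 * 60 * 60

records_per_day = 10e9            # "~10 billion records per day"
records_per_tb = 40e9             # "1Tb per ~40 billion records"
raw_seconds, agg_seconds = 30.5, 0.3

avg_rate = records_per_day / SECONDS_PER_DAY   # records per second
tb_per_day = records_per_day / records_per_tb  # storage growth per day
speedup = raw_seconds / agg_seconds            # raw vs aggregated query

print(f"average insert rate: {avg_rate / 1e3:.0f}k records/s")
print(f"storage growth: {tb_per_day:.2f} Tb/day")
print(f"speed-up: {speedup:.0f}x")
```

The average rate comes out near 116k records per second, consistent with the quoted ~120kqps, and storage grows by about a quarter of a terabyte per day. The ratio of the two query times is roughly 100.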
Total queries per region
Grafana queries are SQL, not rocket science!
Flexible Dashboards
Summary
• C-DNS is used in the wild on a Root server instance
• ClickHouse is the DB of choice for wombat - supports the required data import rates and fast query times
• Aggregation is key!
• Open source of wombat is planned
Thank you!
Any questions?
Existing Hedgehog graphing
Graphing with raw data
Example ad-hoc SQL query

production :) select QueryName, count() from QueryResponse where Date = '2018-09-23' and QueryClass=3 and NodeID in (select toUInt16(node_id) from wombat.node_text where server_name = 'IMRS') group by QueryName order by QueryName;

SELECT
    QueryName,
    count()
FROM QueryResponse
WHERE (Date = '2018-09-23') AND (QueryClass = 3) AND (NodeID IN
(
    SELECT toUInt16(node_id)
    FROM wombat.node_text
    WHERE server_name = 'IMRS'
))
GROUP BY QueryName
ORDER BY QueryName ASC

┌─QueryName───────┬─count()─┐
│                 │      11 │
│ HOSTNAME.BIND   │     220 │
│ ID.SERVER       │       3 │
│ VERSION.BIND    │      18 │
│ bInd            │       1 │
│ binD            │       1 │
│ hostname.bind   │ 4939338 │
│ id.server       │  653675 │
│ vERSIOn.BInD    │       1 │
│ vERsioN.binD    │       1 │
│ vErsioN.BiNd    │       1 │
│ version         │       2 │
│ version.bind    │   27238 │
│ version.maradns │       2 │
│ version.mydns   │       2 │
│ version.pdns    │       2 │
│ version.server  │   27181 │
└─────────────────┴─────────┘

17 rows in set. Elapsed: 14.029 sec. Processed 10.42 billion rows, 20.99 GB (742.69 million rows/s., 1.50 GB/s.)
Aggregated query performance

production :) select sum(QueryTypeMap.Count) from QueriesPerSecond array join QueryTypeMap where Date between '2018-09-20' and '2018-09-25' and QueryTypeMap.QueryType=28

SELECT sum(QueryTypeMap.Count)
FROM QueriesPerSecond
ARRAY JOIN QueryTypeMap
WHERE ((Date >= '2018-09-20') AND (Date <= '2018-09-25')) AND (QueryTypeMap.QueryType = 28)

┌─sum(QueryTypeMap.Count)─┐
│             12445460581 │
└─────────────────────────┘

1 rows in set. Elapsed: 0.313 sec. Processed 105.82 million rows, 6.30 GB (338.46 million rows/s., 20.16 GB/s.)

Raw query performance

production :) select count() from QueryResponse where QueryType=28 and QueryResponseHasQuery and Date between '2018-09-20' and '2018-09-25';

SELECT count()
FROM QueryResponse
WHERE (QueryType = 28) AND QueryResponseHasQuery AND ((Date >= '2018-09-20') AND (Date <= '2018-09-25'))

┌─────count()─┐
│ 12445460581 │
└─────────────┘

1 rows in set. Elapsed: 30.530 sec. Processed 65.33 billion rows, 168.06 GB (2.14 billion rows/s., 5.50 GB/s.)
ClickHouse aggregation example

CREATE MATERIALIZED VIEW wombat.QueriesPerSecondShard
(
    Date Date,
    DateTime DateTime,
    NodeID UInt16,
    QueryCount UInt32,
    QueryOpcodeMap Nested
    (
        QueryOpcode UInt8,
        Count UInt32
    ),
    QueryTypeMap Nested
    (
        QueryType UInt16,
        Count UInt32
    ),
    QueryClassMap Nested
    (
        QueryClass UInt16,
        Count UInt32
    ),
    QueryDOCount UInt32,
    QueryRecursionDesiredCount UInt32,
    QueryCheckingDisabledCount UInt32,
    TransportIPv6Count UInt32,
    TransportTCPMap Nested
    (
        TransportTCP UInt8,
        Count UInt32
    )
)
ENGINE = SummingMergeTree(Date, (Date, DateTime, NodeID), 8192)
AS SELECT
    Date,
    DateTime,
    NodeID,
    CAST(1 as UInt32) AS QueryCount,
    [QueryOpcode] AS `QueryOpcodeMap.QueryOpcode`,
    [CAST(1 AS UInt32)] AS `QueryOpcodeMap.Count`,
    [QueryType] AS `QueryTypeMap.QueryType`,
    [CAST(1 AS UInt32)] AS `QueryTypeMap.Count`,
    [QueryClass] AS `QueryClassMap.QueryClass`,
    [CAST(1 AS UInt32)] AS `QueryClassMap.Count`,
    CAST(QueryDO AS UInt32) AS QueryDOCount,
    CAST(QueryRecursionDesired AS UInt32) AS QueryRecursionDesiredCount,
    CAST(QueryCheckingDisabled AS UInt32) AS QueryCheckingDisabledCount,
    CAST(TransportIPv6 AS UInt32) AS TransportIPv6Count,
    [TransportTCP] AS `TransportTCPMap.TransportTCP`,
    [CAST(1 AS UInt32)] AS `TransportTCPMap.Count`
FROM wombat.QueryResponseShard
WHERE QueryResponseHasQuery;