DATA PLANE PROGRAMMING
MODULE 3 – IN NETWORK MONITORING, CACHING AND CONTROL (DVAD43)
HHK3.KAU.SE/DPP
• In-Network Caching
  – E.g. NetCache
  – Parse packets for application-specific information
    • E.g. keys for key/value caching
    • Could also parse sensor data, e.g. for the MQTT protocol?
  – Store the data in a register or look up a value
  – Provide the data to other control or data plane devices
SO FAR…
• In-Network Caching
• Example P4 snippet
  – Assume we have two register arrays holding information per app (lasttime, lastvalue), and that we parse app.value from the packet headers

if (app.isValid()) {
    // Both arrays are indexed per app; v1model names the timestamp ingress_global_timestamp.
    lasttime.write(appID, stdmeta.ingress_global_timestamp);
    lastvalue.write(appID, app.value);
}
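The register updates above can be checked outside the switch with a small behavioural model. The Python sketch below is not P4; `RegisterArray`, `NUM_APPS`, and the dictionary-based packet layout are assumptions made for illustration, and both arrays are indexed by application ID.

```python
class RegisterArray:
    """Fixed-size array of integer cells, loosely modelling a P4 register."""

    def __init__(self, size, width_bits=32):
        self.cells = [0] * size
        self.mask = (1 << width_bits) - 1

    def write(self, index, value):
        # Values wrap at the cell width, as they would in a fixed-width register.
        self.cells[index] = value & self.mask

    def read(self, index):
        return self.cells[index]


NUM_APPS = 16  # hypothetical number of tracked applications
lasttime = RegisterArray(NUM_APPS, width_bits=48)
lastvalue = RegisterArray(NUM_APPS, width_bits=32)


def process_packet(pkt, now):
    """Record the latest value and timestamp if an app header was parsed."""
    app = pkt.get("app")
    if app is None:  # corresponds to the app header being invalid
        return False
    lasttime.write(app["id"], now)
    lastvalue.write(app["id"], app["value"])
    return True


process_packet({"app": {"id": 3, "value": 21}}, now=1000)
process_packet({"payload": b"no app header"}, now=1001)
print(lastvalue.read(3), lasttime.read(3))
```

Other devices can then read `lastvalue` and `lasttime` for a given application ID without touching the original sender.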
• In addition to recording data, trigger actions upon (processed) data
• Example use-cases
  – Detect events in streaming data and report them using e.g. Digests
    • Networking-related events, e.g. queuing latency
      – In-band telemetry queue latency → filter on spikes in latency, record local history and EWMA queue latency information
      – Trigger actions upon conditions, e.g. if (VoIP flow latency) > threshold, send to a high-priority queue or path
IN-NETWORK CONTROL
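A condition-triggered action of this kind can be sketched as follows. This is a Python model rather than P4, and the threshold, queue identifiers, and `Packet` fields are hypothetical values chosen only for illustration.

```python
from dataclasses import dataclass

LATENCY_THRESHOLD_US = 500  # hypothetical latency threshold in microseconds
HIGH_PRIORITY_QUEUE = 7     # hypothetical queue identifiers
DEFAULT_QUEUE = 0


@dataclass
class Packet:
    flow_class: str
    queue_latency_us: int
    queue_id: int = DEFAULT_QUEUE


def apply_control(pkt, digests):
    """Move slow VoIP packets to the high-priority queue and report the event."""
    if pkt.flow_class == "voip" and pkt.queue_latency_us > LATENCY_THRESHOLD_US:
        pkt.queue_id = HIGH_PRIORITY_QUEUE
        # Stand-in for a digest message sent to the control plane.
        digests.append((pkt.flow_class, pkt.queue_latency_us))
    return pkt


digests = []
slow_voip = apply_control(Packet("voip", 800), digests)
fast_voip = apply_control(Packet("voip", 100), digests)
slow_web = apply_control(Packet("web", 900), digests)
print(slow_voip.queue_id, fast_voip.queue_id, slow_web.queue_id, digests)
```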
• INT reports need to be parsed and processed by end-hosts, at a stream processor → INT monitors
• Performance and scalability concerns when processing millions of telemetry reports
IN-NETWORK CONTROL – INT TELEMETRY
[Figure: INT pipeline. The INT Source adds the INT instruction bitmask, INT Transit nodes add INT telemetry items, and the INT Sink creates telemetry reports. The INT Monitor logs, analyzes, replays, stream-processes and visualizes the reports.]
• Observations
  – Telemetry information does not change much in the normal case
    • Under low load or peak load, packet delays are similar
    • Within a flow, consecutive packets most likely follow the same path
  – Some flows care about latency
    • Interested when things go wrong
  – Filter out unimportant INT information in the switch
  – INT complex event detection in the P4 data plane
INT COMPLEX EVENT PROCESSING
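[Figure: INT Sink with a complex event detection framework in P4; detected events are sent to the INT Monitor, which logs, analyzes, replays, stream-processes and visualizes them.]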
• Example filters
  – Threshold-based filters for per-hop metrics
  – Threshold-based filters for E2E metrics
  – EWMA for E2E metrics
EVENT BASED IN-NETWORK PREPROCESSING USING FILTERS
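An EWMA filter for an end-to-end metric might look like the sketch below. It is written in Python for readability; the shift-based update (a smoothing factor of 1/2^shift) and the `margin` parameter are assumptions chosen because shifts map well to data-plane arithmetic, not details taken from the slides.

```python
class EwmaFilter:
    """Flags samples that deviate from the running EWMA by more than a margin."""

    def __init__(self, shift=3, margin=50):
        self.shift = shift    # smoothing factor alpha = 1 / 2**shift
        self.margin = margin  # allowed deviation before an event is raised
        self.ewma = None

    def update(self, sample):
        if self.ewma is None:
            # The first sample only initialises the average.
            self.ewma = sample
            return False
        event = abs(sample - self.ewma) > self.margin
        # Integer-only update: ewma += (sample - ewma) / 2**shift
        self.ewma += (sample - self.ewma) >> self.shift
        return event


latency_filter = EwmaFilter(shift=3, margin=50)
events = [latency_filter.update(s) for s in [100, 104, 98, 300, 102]]
print(events, latency_filter.ewma)
```

Only the spike to 300 raises an event; the average absorbs part of the spike, so a sustained shift in latency stops triggering after a few packets.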
• Control plane configures algorithm and parameters per flow
  – Implemented through metadata register arrays per flow to keep statistics
  – Loop unrolling; if an event is detected, clone the packet and create a telemetry report
• Fast detection action: uses simple but fast event detection algorithms
  – per-flow: event if the INT metadata value changes by more than a configurable threshold
  – per-hop: same as per-flow, but on a per-hop basis
  – mov-avg: same as per-flow, but detects events using the moving average
  – noop: all packets trigger events
• Complex detection action (see FastReact later)
  – E.g. ((hop-latency > 10) and (egress_port == 2)) → how to express in P4?
  – Implemented through register arrays that encode the expression in CNF
TOWARDS COMPLEX IN-NETWORK EVENT PROCESSING
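The fast detection actions can be modelled as below. This Python sketch assumes that the stored reference value is updated only when an event fires and that the per-flow metric is the sum of the hop latencies; both are assumptions for illustration, as are the function and field names.

```python
def make_state(num_flows, max_hops):
    """Per-flow statistics, standing in for metadata register arrays."""
    return {
        "per_flow": [0] * num_flows,
        "per_hop": [[0] * max_hops for _ in range(num_flows)],
    }


def detect(state, algo, flow_id, hop_latencies, threshold):
    """Return True if the packet's INT metadata should trigger a report."""
    if algo == "noop":
        return True  # every packet triggers an event
    if algo == "per-flow":
        total = sum(hop_latencies)
        if abs(total - state["per_flow"][flow_id]) > threshold:
            state["per_flow"][flow_id] = total
            return True
        return False
    if algo == "per-hop":
        event = False
        # In P4 this loop would be unrolled up to the maximum hop count.
        for hop, latency in enumerate(hop_latencies):
            if abs(latency - state["per_hop"][flow_id][hop]) > threshold:
                state["per_hop"][flow_id][hop] = latency
                event = True
        return event
    raise ValueError(f"unknown algorithm: {algo}")


state = make_state(num_flows=4, max_hops=3)
flow_events = [
    detect(state, "per-flow", 1, [10, 10, 10], threshold=5),
    detect(state, "per-flow", 1, [11, 10, 10], threshold=5),
    detect(state, "per-flow", 1, [20, 10, 10], threshold=5),
]
hop_events = [
    detect(state, "per-hop", 2, [10, 10, 10], threshold=5),
    detect(state, "per-hop", 2, [10, 10, 12], threshold=5),
    detect(state, "per-hop", 2, [10, 30, 10], threshold=5),
]
print(flow_events, hop_events)
```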
P4CEP INT PIPELINE
• P4 primitives
  – Programmable parsing
  – RW packet metadata
  – Comparison/arithmetic operators
  – RW registers
• Store and update latency values in P4 registers
• Fast packet processing options: kernel? DPDK? AF_XDP?
• AF_XDP keeps a ring buffer between user space and the device, which is continuously polled for packets
• AF_XDP INT Monitor parses INT report messages
  – Creates Kafka publisher messages and pushes them to a Kafka topic for ML
  – Alternatively, sends data to a time-series database (e.g. ELK stack)
INT MONITOR
[Figure: INT reports arrive at the INT Monitor; the data can also be sent to the ELK Stack.]
• Testbed setup using the OSNT traffic generator
  – Replays a pcap file containing INT headers
  – Created using Facebook traces (Cache, Hadoop, Web), emulating the duration and inter-burst time of each microburst → queue occupancies for a single switch
EVALUATION
NETWORK EVENT REPORT CAPACITY (THRESHOLD)
Programmable Event Detection for In-Band Network Telemetry, Jonathan Vestin, Andreas Kassler, Deval Bhamare, Karl-Johan Grinnemo, Jan-Olof Andersson, Gergely Pongracz, in IEEE CloudNet 2019, 4-6 Nov 2019, Coimbra, Portugal.
• Not only store, but also act upon stored values
• Example use-cases
  – Detect events in streaming data and report them using e.g. Digests
    • Networking-related events, e.g. queuing latency
      – In-band telemetry queue latency → filter on spikes in latency, record local history and EWMA queue latency information
      – Trigger actions upon conditions (e.g. if the latency of a VoIP flow is above a threshold, send it to a different queue)
    • Application-related events, e.g. key/value stores
      – Detect hot keys
      – Trigger actions upon conditions (e.g. send a digest), but could also be more complicated
• How about IoT and cyber-physical systems?
IN-NETWORK CONTROL
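Hot-key detection for a key/value store can be sketched with a counter array indexed by a hash of the key. The Python model below uses `zlib.crc32` as a stand-in for a switch hash unit; the array size and `HOT_THRESHOLD` are hypothetical, and hash collisions are deliberately ignored.

```python
import zlib

NUM_COUNTERS = 1024  # hypothetical size of the counter register array
HOT_THRESHOLD = 3    # hypothetical number of reads that makes a key "hot"
counters = [0] * NUM_COUNTERS


def on_read(key, digests):
    """Count a read request and emit one digest when the key becomes hot."""
    index = zlib.crc32(key) % NUM_COUNTERS
    counters[index] += 1
    if counters[index] == HOT_THRESHOLD:
        # Stand-in for a digest sent to the control plane, emitted exactly once.
        digests.append(key)


hot_keys = []
for _ in range(5):
    on_read(b"user:42", hot_keys)
on_read(b"user:7", hot_keys)
print(hot_keys)
```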
• Increasing need for more customized products
• Flexible production lines are needed
  • Cost-effective personalized production
  • Fast reconfiguration
  • Agile behavior
• Softwarization has already started, but it is a slow and painful process
SMART PRODUCTION SCENARIO
• Industrial applications have strict requirements
  • Availability, security and timeliness
• Most devices are designed for long-term operation (>10 years)
  • High cost of acquiring devices
  • Protocol updates are generally not possible
  • Replacement with smarter alternatives is unrealistic
• Industrial protocols are designed for closed industrial networks
  • Assuming low latency, almost zero loss, reliable (wired) links
  • Ensuring the integration of various field devices
WHY IS SOFTWARIZATION DIFFICULT?
EXAMPLE: REMOTE CONTROLLED INVERTED PENDULUM (RWTH)
[Figure: control loop via the Internet and a switch; latency too high for control]
• Cloud-based control of complex production processes
  – Many dependencies among processes
  – Latencies in cloud-based control are too high
[Figure: edge-cloud-based control via a switch; latency OK]
• Complexity and latency reduction
  – Deduction of simple control laws
  – E.g. inverted pendulum, trigger fire alarm if temp > T, process control, …
  – Pushed towards the process under control (towards in-network processing): control reflex
• Cloud-based industrial controllers (e.g., SoftPLCs)
  • PRO: Software-based alternatives to hardware solutions
  • CON: Larger latency, e.g. slow reaction to emergency situations
  • Sensors may generate large amounts of data to be transmitted (esp. imaging)
• IDEA: move time-critical computations closer to the field devices
• Example: In-network event detection with FastReact
TOWARDS INCREASED FLEXIBILITY
Moving event detection closer to the field devices
• Local decision making instead of centralized control
  – Early reaction reduces the time required for processing
  – Reduces network data rate
  – Fewer devices that can fail
• FastReact
  – Implemented in the P4 data plane programming language
  – Rules reconfigurable at runtime using CNF
  – Triggers local actions based on locally stored data
IN-NETWORK EVENT DETECTION WITH FASTREACT
* J. Vestin, A. Kassler, S. Laki, G. Pongrácz: Towards In-Network Event Detection and Filtering for Publish/Subscribe Communication using Programmable Data Planes, in IEEE Transactions on Network and Service Management (IEEE TNSM), Volume: 18, Issue: 1, Page(s): 415 - 428, March 2021
• Traditionally done with Big-Data frameworks
  – E.g. Apache Flink
  – Requires many servers for scalability
• FastReact: outsource it to the P4 data plane
  – Packets pass through the switch anyhow
  – Additional overhead: application-specific parsing, storage, processing
COMPLEX EVENT PROCESSING
• Main idea
  – Complex event processing inside a programmable switch
  – Specify the complex event in a CEP language, implement a specific compiler which generates P4 code
  – Use case: fire alarm
CEP IN THE DATA PLANE
• Challenges
  – Logic processing and data storage updates require synchronization
  – Locking the whole register array is expensive (makes it difficult to process events in parallel)
  – Can implement external functions that provide a fine-grained spin-lock
    • E.g. on the Netronome target in Micro-C
CEP IN THE DATA PLANE
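The benefit of fine-grained locking can be illustrated with one lock per register cell instead of one lock for the whole array. The Python sketch below uses `threading.Lock`, which is a blocking mutex rather than a spin-lock, so it only models the granularity and not the Micro-C implementation; `LockedRegisterArray` is a hypothetical name.

```python
import threading


class LockedRegisterArray:
    """Register array with one lock per cell instead of one global lock."""

    def __init__(self, size):
        self.cells = [0] * size
        self.locks = [threading.Lock() for _ in range(size)]

    def update(self, index, fn):
        # Only updates to the same cell are serialised; others run in parallel.
        with self.locks[index]:
            self.cells[index] = fn(self.cells[index])
            return self.cells[index]


regs = LockedRegisterArray(8)


def worker(index, count):
    for _ in range(count):
        regs.update(index, lambda value: value + 1)


# Four workers: two contend on cell 0 and two on cell 1.
threads = [threading.Thread(target=worker, args=(i % 2, 10000)) for i in range(4)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
print(regs.cells[:2])
```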
• Challenge
  – Pub/Sub requires an application broker or event bus (many servers)
    • E.g. MQTT, OPC UA, cloud systems
EXAMPLE: PUBSUB MESSAGE FILTER
[Figure: publishers send Publish(topic1) to a broker that holds the topics; the broker delivers event(topic1) to every subscriber that issued subscribe(topic1).]
• Challenge
  – Pub/Sub requires an application broker or event bus (many servers)
    • Topic/interest matching and message replication are expensive
PUBSUB MESSAGE FILTER
• In-Network Pub/Sub
  – Packet subscriptions
  – Subscribers tell the switch which subscriptions they are interested in
    • E.g. MSFT.stockprice > 80 US$ and MSFT.stockprice < 200 US$
    • The switch only forwards to interested subscribers
  – High-level language with a special compiler to create the P4 pipeline
PUBSUB MESSAGE FILTER
[Figure: a P4 switch replaces the broker; publishers send Publish(topic3), and the switch forwards event(topic3) only to subscribers that issued subscribe(topic3).]
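Packet subscriptions of this kind can be modelled as a table of predicates per subscriber port. The Python sketch below is an assumption-laden illustration: `SubscriptionTable`, the port numbers, and the dictionary message format are hypothetical, and a real system would compile the predicates into P4 match-action tables.

```python
import operator

# Comparison operators a subscription predicate may use.
OPS = {">": operator.gt, "<": operator.lt, "==": operator.eq}


class SubscriptionTable:
    """Maps subscriber ports to conjunctions of (field, op, constant)."""

    def __init__(self):
        self.subscriptions = {}

    def subscribe(self, port, predicates):
        self.subscriptions[port] = predicates

    def match_ports(self, message):
        """Return the ports whose predicates all hold for this message."""
        ports = []
        for port, predicates in self.subscriptions.items():
            if all(
                field in message and OPS[op](message[field], constant)
                for field, op, constant in predicates
            ):
                ports.append(port)
        return sorted(ports)


table = SubscriptionTable()
table.subscribe(1, [("MSFT.stockprice", ">", 80), ("MSFT.stockprice", "<", 200)])
table.subscribe(2, [("MSFT.stockprice", ">", 150)])
print(table.match_ports({"MSFT.stockprice": 180}))
```

A message is replicated only to the matching ports, so subscribers never see updates outside their stated range.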
• FastReact is highly flexible
  – Reconfigurable from the controller without recompilation
  – Logic tables link sensor values together → Conjunctive Normal Form (CNF)
  – Updating the switch processing logic takes 1…20 ms depending on complexity
FASTREACT
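A rule in conjunctive normal form is an AND over clauses, each of which is an OR over simple comparisons. The Python sketch below shows how such a rule could be stored as data and swapped at runtime; `CnfRule`, the sensor names, and the choice that unset sensors read as zero are assumptions for illustration, not the FastReact table layout.

```python
import operator

OPS = {">": operator.gt, "<": operator.lt, "==": operator.eq}


class CnfRule:
    """AND of clauses; each clause is an OR of (sensor, op, constant) literals."""

    def __init__(self, clauses, action):
        self.clauses = clauses
        self.action = action

    def evaluate(self, sensor_values):
        # Assumption: a sensor that has not reported yet reads as 0.
        return all(
            any(OPS[op](sensor_values.get(sensor, 0), const) for sensor, op, const in clause)
            for clause in self.clauses
        )


# (hop_latency > 10) AND (egress_port == 2)
rule = CnfRule([[("hop_latency", ">", 10)], [("egress_port", "==", 2)]], "report")
first = rule.evaluate({"hop_latency": 15, "egress_port": 2})
second = rule.evaluate({"hop_latency": 15, "egress_port": 3})

# Reconfigure at runtime: (temp > 60 OR smoke == 1) AND (armed == 1)
rule.clauses = [[("temp", ">", 60), ("smoke", "==", 1)], [("armed", "==", 1)]]
third = rule.evaluate({"temp": 20, "smoke": 1, "armed": 1})
fourth = rule.evaluate({"temp": 70, "armed": 0})
print(first, second, third, fourth)
```

Because the rule is plain data, changing the logic only rewrites table entries, which is consistent with reconfiguration without recompiling the pipeline.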
• Challenges
  – Pipeline complexity
  – Timing issues due to hardware constraints of the target
  – Synchronization issues due to atomic locks
  – Time-based window processing is difficult because loop support is missing
    • Unroll loops
• Opportunities
  – Reduced latency and jitter
  – Can save many servers → reduced number of cores and CO2
  – Bring your own use-cases
CHALLENGES AND OPPORTUNITIES
• Data Plane Programming
  – Learned P4 → DVAD41
  – Applied it to
    • Data center load balancing → DVAD42
    • In-network monitoring, caching, control → DVAD43
  – Quizzes, exercises, assignments
COURSE RECAP
• For open learners
  – Participate in all quizzes as indicated
  – Send a screenshot as proof that you attempted each quiz
  – Once completed, send an email to [email protected]
• For the credit-bearing course
  – Submit the course assignments on Canvas (graded); official course certificate through the Swedish system
COURSE CERTIFICATE
THE END