
11TH JLESC WORKSHOP

SEPTEMBER 8TH-10TH, 2020 | EARTH, SOLAR SYSTEM, MILKY WAY, LANIAKEA

E2Clab: Exploring the Computing Continuum through Repeatable, Replicable and Reproducible Edge-to-Cloud Experiments

Daniel Rosendo*, Pedro Silva†, Matthieu Simonin*, Alexandru Costan*, Gabriel Antoniu*

*University of Rennes, Inria, CNRS, IRISA - Rennes, France
{daniel.rosendo, matthieu.simonin, alexandru.costan, gabriel.antoniu}@inria.fr

†Hasso-Plattner Institut, University of Potsdam - Berlin, Germany, [email protected]

HPC-BigData INRIA Project LAB (IPL)

The Computing Continuum


Complex Application Workflows

Continuous dataflow from IoT Edge devices to the HPC/Cloud

Deploying real-life applications; running large-scale experiments

Edge Intelligence & Pre-Processing

Short-term Stream Analytics

Long-term Big Data Analytics

Research Challenges & Opportunities

CLOUD

FOG

EDGE

distributed

centralized

highly distributed

Research Challenges

Evaluation & Validation

Simple testbed setups

Technical Challenges

Research Challenges & Opportunities

Representative setup to understand performance

Repeatable, Replicable & Reproducible experiments

source: https://www.nature.com/news/1.19970

“+70% failed to reproduce another scientist's experiments”

“+50% failed to reproduce their own experiments”

Install & configure services

Configure network

Map application parts with underlying infrastructure

Define the execution workflow

Establish procedures for reproducibility

Scalability issues

Many abstractions

Hard to reproduce



What Would Be an Ideal Solution?

Evaluation & Validation

Large-scale testbed setups

Technical Challenges

Research Challenges & Opportunities

Representative setup to understand performance

Repeatable, Replicable & Reproducible experiments

source: https://www.nature.com/news/1.19970

“+70% failed to reproduce another scientist's experiments”

“+50% failed to reproduce their own experiments”

Install & configure services

Configure network

Map application parts with underlying infrastructure

Define the execution workflow

Establish procedures for reproducibility

Easy to scale

Representative setup

Leverage reproducible experiments

Abstract complexities

Focus on high-level aspects

Contribution: E2Clab Framework

E2Clab

EnOSlib

lyr_svc_conf network_conf workflow_conf

LYR & SVC Manager

Network Manager

Workflow Manager

Experiment Manager

Define Experimental Environment

Real-life Application Workflows

Testbed Environments

Performance metrics

Configuration parameters

Environment & Workflow

Representative setup to understand performance

Access to experiment results

Access to experiment artifacts

Well-defined experiment methodology

Repeatable, replicable & reproducible experiments

Layer 1: Services

Layer 2: Services

Layer N: Services

Public repository

Metrics Monitoring Visualization

Provide access to artifacts

Define workflow

Provide access to

results

Define layers & services

Define network

Define Experimental Environment

Public repository

experiment configs.

Software, algorithms

Dataset

Contribution: E2Clab Framework

Supports repeatability, replicability & reproducibility

Reproducible Experiments

Application parts & physical testbed

Mapping

Experiment variation and transparent scaling

Variation & Scaling

Edge-to-Cloud communication constraints

Network Emulation

Deployment, Execution & Monitoring

Experiment Management Methodology

How Can You Use E2Clab?


video stream

Gateways Ingestion Systems

Processing Frameworks

Data Producers

per camera: detect & count person

all cameras: detect & count person

all cameras: count person

Edge-to-Cloud Computing Continuum Workflow

EDGE FOG CLOUD

Cloud-centric processing

Hybrid processing

Research question

● What is the impact of both processing approaches on resource consumption?

Cloud-centric vs Hybrid processing

Use Case: A Smart Surveillance System

Experimental Design

● Kafka Cluster (Kafka, Zookeeper): 1 node
● Flink Cluster (Job Manager, Task Manager): 1 node
● 16 Gateways: 16 nodes
● 640 Cameras: 16 nodes
● Sink and Metrics collector: 1 node
● Also labeled in the figure: Edgent, Mosquitto, 40 Producers

Define Experimental Environment

● lyr_svc_conf.yaml: Layers (Edge + Fog + Cloud) and Services (Flink, Kafka, gateways, etc.)
● net_conf.yaml: Network constraints (delay, loss, rate)
● wf_conf.yaml: Execution logic (prepare, launch, and finalize services; interconnections, etc.)

Great! I could express the scenario in a descriptive manner! Besides, these files are easy to comprehend and to adapt to other scenarios.
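
As a rough illustration of the three descriptions above (layers and services, network constraints, execution workflow), here is a minimal Python sketch that models them as plain data and checks that they refer to the same layers. It is not E2Clab's configuration schema: the key names, the numeric values, the placement of Flink and Kafka in the Cloud layer, and the `validate` helper are all assumptions made for this example.

```python
# Hypothetical stand-ins for lyr_svc_conf.yaml, net_conf.yaml and wf_conf.yaml.
# Every key name and number below is an assumption, not E2Clab's real schema.

PHASES = ["prepare", "launch", "finalize"]

# Layers and the services they host (placement is assumed for illustration).
layers_services = {
    "cloud": ["Flink", "Kafka"],
    "fog": ["Gateway"],
    "edge": ["Producer"],
}

# Network constraints between layers: delay, loss, and rate (placeholder values).
network = [
    {"src": "edge", "dst": "fog", "delay_ms": 10, "loss_pct": 1.0, "rate_mbit": 100},
    {"src": "fog", "dst": "cloud", "delay_ms": 50, "loss_pct": 0.5, "rate_mbit": 1000},
]

# Execution logic: each layer's services are prepared, launched, and finalized.
workflow = [
    {"layer": "cloud", "phases": PHASES},
    {"layer": "fog", "phases": PHASES},
    {"layer": "edge", "phases": PHASES},
]


def validate(layers, links, steps):
    """Return a list of inconsistencies; an empty list means the sketch is coherent."""
    problems = []
    for link in links:
        for end in ("src", "dst"):
            if link[end] not in layers:
                problems.append(f"unknown layer in network link: {link[end]}")
    for step in steps:
        if step["layer"] not in layers:
            problems.append(f"unknown layer in workflow: {step['layer']}")
        if list(step["phases"]) != PHASES:
            problems.append(f"unexpected phases for {step['layer']}")
    return problems


print(validate(layers_services, network, workflow))  # prints []
```

Keeping the three concerns in separate structures mirrors the separation into three files: the network or the workflow can be changed without touching the layer and service definitions.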

What is the impact of hybrid processing on resource consumption?

Cloud-centric vs Hybrid processing

Evaluation

Transparent scaling: small, medium, large

Network emulation: 1 profile

Resource monitoring: Flink and Kafka memory

● Especially for large workloads, aggregating and processing these data volumes in the Fog can translate into reduced resource costs.

● Kafka and Flink consume less memory in the Hybrid approach. Flink uses 6 GB for Hybrid processing against 15.5 GB for Cloud-centric processing (since all the image processing is done in the Cloud).
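
To put the two Flink memory figures quoted above in proportion, the short Python sketch below computes the absolute and relative difference. Only the 6 GB and 15.5 GB values come from the slide; the function and variable names are illustrative.

```python
def relative_reduction(baseline, improved):
    """Fraction by which `improved` is lower than `baseline`."""
    return (baseline - improved) / baseline


# Flink memory figures reported on the slide, in GB.
cloud_centric_gb = 15.5
hybrid_gb = 6.0

saved_gb = cloud_centric_gb - hybrid_gb
ratio = relative_reduction(cloud_centric_gb, hybrid_gb)
print(f"{saved_gb:.1f} GB saved, {ratio:.0%} less")  # prints "9.5 GB saved, 61% less"
```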

Summary

E2Clab helped me understand the performance of my application and supported the reproducibility of my experiments on the Edge-to-Cloud Computing Continuum!

● Configure a Flink Cluster and a Kafka Cluster
● Deploy 16 gateways + 640 cameras and interconnect them in round-robin
● Define network constraints between the Edge, Fog, and Cloud
● Manage, validate, and back up experiments
● Analyze scenarios: end-to-end (Edge+Fog+Cloud) and per layer (Fog gateways)
● Access to experiment artifacts and results
  ○ https://gitlab.inria.fr/Kerdata/Kerdata-Codes/e2clab-examples

Reproducible Research
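
The round-robin interconnection of 640 cameras and 16 gateways mentioned in the summary can be pictured with the minimal Python sketch below. It only illustrates the assignment policy; the `round_robin` helper and the integer indices are assumptions for this example, not how E2Clab wires services internally.

```python
from collections import Counter


def round_robin(num_cameras, num_gateways):
    """Map each camera index to a gateway index, cycling through the gateways."""
    return {cam: cam % num_gateways for cam in range(num_cameras)}


assignment = round_robin(640, 16)
load = Counter(assignment.values())

# Every gateway ends up with the same number of cameras.
print(min(load.values()), max(load.values()))  # prints "40 40"
```

With these numbers each gateway serves 40 cameras, which matches the "40 Producers" label in the experimental design.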

A Real-life Use Case

Pl@ntNet: recognizing the world's flora

ZENITH team: Patrick Valduriez, Alexis Joly

● 15M downloads
● 1.7M accounts
● 500K users per day
● 700 API users
● 30K species (390K known)

Geographical distribution of the images used for learning the recognition model of the world flora.

Pl@ntNet sees exponential growth in new user acquisition every spring (peaks in May-June)

How can E2Clab help Pl@ntNet?

Research questions
● Where are the main bottlenecks of the Pl@ntNet architecture in terms of throughput (images/sec) and resource consumption?
● How many threads are worth allocating to tasks in the identification engine?
● What is the impact of the network on the end-to-end latency?

Understanding Performance of Pl@ntNet

We would like to anticipate how the infrastructure should evolve to get through the next spring peak (May-June 2021) without problems, and to know what should be done in the following years (2022, 2023).

Collaboration Opportunities


Enable built-in support for other large-scale experimental testbeds (besides Grid’5000) such as Chameleon and more.

Leverage E2Clab to build a benchmark suite for applications running on the Computing Continuum.

Make E2Clab evolve towards an emulation framework for applications in the Edge-to-Cloud Computing Continuum, able to emulate common behaviors such as resource-constrained devices, failures (network, services, etc.), and mobility.

Deploy more real-life use cases using E2Clab to perform large-scale experiments. Do you have any use cases to share?

Help Us to Improve E2Clab

● Provide your feedback! [email protected]
  ○ Use cases
  ○ New features
● E2Clab Documentation
  ○ https://kerdata.gitlabpages.inria.fr/Kerdata-Codes/e2clab/

Thank you!