Simulation of O2 offline processing – 02/2015 Faculty of Electrical Engineering, Mechanical Engineering and Naval Architecture Eugen Mudnić

What has been done:

• Created Omnet++ DE framework for simulation of massive data processing
  – Implemented network flow model
  – Implemented simulation of simple global file system
  – Implemented simulation of job generation
  – Implemented simulation of (primitive):
    • Processing node
    • Storage node
  – Started tests/debugging of simulation framework

Omnet++ (4.6)

– A lot of C++11 code
– More manageable code than previous C++ versions
– Requires good C++ programming skills

Implemented network flow model - topology

– Bandwidth-sharing links, discrete data flow changes
– Included some dynamics for smaller files (to be refined if necessary)
– Model is defined programmatically from standard Omnet++ module/channel topology description (NED language with optional visualization)
– Network simulation consumes most of simulation time
– Test case:
  • 3 x 300 EPN (10 Gbps)
  • 1 EPN = 8 slots
  • 3 x SE (400 Gbps)
  • Non-blocking switches
  • Simulation time ?
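
The bandwidth-sharing idea above can be illustrated with a small standalone sketch. It assumes max-min fair sharing computed by progressive filling, which is one common choice for flow-level network models; the slides do not say which sharing rule the framework actually uses. The `Flow` type and the `maxMinFairRates` function are hypothetical names introduced only for this example. In a model with discrete data flow changes, such a function would be re-run only when a flow starts or finishes.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical type for this sketch: a flow is the list of links it crosses.
struct Flow {
    std::vector<std::size_t> links;  // indices into the capacity vector, no duplicates
};

// Max-min fair rates by progressive filling (an assumed sharing rule).
// capacity[l] is the bandwidth of link l; the result has one rate per flow,
// in the same unit as the capacities.
std::vector<double> maxMinFairRates(const std::vector<double>& capacity,
                                    const std::vector<Flow>& flows) {
    const double eps = 1e-9;
    const double inf = std::numeric_limits<double>::infinity();
    std::vector<double> rate(flows.size(), 0.0);
    std::vector<double> remaining = capacity;
    std::vector<bool> frozen(flows.size(), false);
    std::size_t active = flows.size();

    while (active > 0) {
        // Count the flows that are still growing on every link.
        std::vector<std::size_t> count(capacity.size(), 0);
        for (std::size_t f = 0; f < flows.size(); ++f) {
            if (frozen[f]) continue;
            assert(!flows[f].links.empty());
            for (std::size_t l : flows[f].links) {
                assert(l < capacity.size());
                ++count[l];
            }
        }
        // The bottleneck link offers the smallest equal share.
        double share = inf;
        for (std::size_t l = 0; l < capacity.size(); ++l)
            if (count[l] > 0)
                share = std::min(share, remaining[l] / static_cast<double>(count[l]));
        if (share == inf) break;  // safety net: no growing flow uses any link

        // Grow every unfrozen flow by that share.
        for (std::size_t f = 0; f < flows.size(); ++f) {
            if (frozen[f]) continue;
            rate[f] += share;
            for (std::size_t l : flows[f].links) remaining[l] -= share;
        }
        // Freeze flows that now cross a saturated link.
        for (std::size_t f = 0; f < flows.size(); ++f) {
            if (frozen[f]) continue;
            for (std::size_t l : flows[f].links) {
                if (remaining[l] <= eps) { frozen[f] = true; --active; break; }
            }
        }
    }
    return rate;
}
```

As an illustration, with one 10 Gbps EPN link (index 0) and one 400 Gbps SE link (index 1), two flows crossing both links and a third crossing only the SE link get 5, 5 and 390 Gbps respectively.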

System configuration / job workload / ….

Processing node

• Groups of processing nodes (A, B, C) with common parameters
• Multiple execution slots per node
• Capabilities (could be matched with job requirements)
• Slots[0..n] <- bandwidth -> BUS <- bandwidth -> network
• Job execution (at this moment):
  – Load input files (remote -> local storage/memory)
  – Execute (exec. time based on kHEPSpec of the machine)
  – Save output (-> remote storage)
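
The three job execution steps above can be sketched as a simple duration estimate. This is only an illustration under stated assumptions: the transfer path is uncontended and limited by the slower of the slot and network bandwidths, sizes are in decimal gigabytes, and the job's CPU work is expressed in kHEPSpec-seconds. The names `NodeParams`, `Job`, `transferSeconds` and `jobSeconds` are hypothetical and do not come from the framework.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical node description: Slots <- bandwidth -> BUS <- bandwidth -> network.
struct NodeParams {
    double slotBandwidthGbps;     // slot <-> BUS
    double networkBandwidthGbps;  // BUS <-> network
    double kHepSpec;              // computing power available to one job (assumed unit)
};

// Hypothetical job description.
struct Job {
    double inputGB;              // data loaded from remote storage
    double outputGB;             // data saved to remote storage
    double workKHepSpecSeconds;  // CPU work, in kHEPSpec * seconds (assumed unit)
};

// Transfer time over an uncontended path: the slower hop limits the rate.
double transferSeconds(double gigabytes, const NodeParams& node) {
    const double gbps = std::min(node.slotBandwidthGbps, node.networkBandwidthGbps);
    assert(gbps > 0.0);
    return gigabytes * 8.0 / gbps;  // GB -> Gb, then divide by Gb/s
}

// Load input, execute, save output, run one after another.
double jobSeconds(const Job& job, const NodeParams& node) {
    assert(node.kHepSpec > 0.0);
    const double load = transferSeconds(job.inputGB, node);
    const double exec = job.workKHepSpecSeconds / node.kHepSpec;
    const double save = transferSeconds(job.outputGB, node);
    return load + exec + save;
}
```

In the simulation itself the transfer times would come from the network flow model rather than from a fixed bandwidth, so this estimate is only a lower bound for a job running alone.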

Storage node

• Groups of storage nodes (A, B, C) with common parameters
• One storage unit <- bandwidth -> BUS <- bandwidth -> network
  – More detailed model required
• Global file system / storage node content:
  – Storage state: preserved in database for successive simulations

Global file system

• What is stored where – minimal description
• Where a job can find its required input files
  – Some files have a fixed position
  – Others have a probability of existing on some SE
• File types
• Storage elements
• File instances
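
The "what is stored where" description above can be sketched as a small lookup structure. This is a minimal illustration, not the framework's schema: the `FileInstance` and `GlobalFileSystem` names, the field layout, and the use of a per-instance probability (with 1.0 meaning a fixed position) are all assumptions made for this example.

```cpp
#include <cassert>
#include <map>
#include <random>
#include <string>
#include <vector>

// Hypothetical record: one file type present on one storage element (SE).
struct FileInstance {
    std::string fileType;
    std::string storageElement;
    double probability;  // 1.0 = fixed position, otherwise chance that the file exists there
};

// Hypothetical minimal catalogue of file instances, indexed by file type.
class GlobalFileSystem {
public:
    void addInstance(const FileInstance& inst) {
        assert(inst.probability >= 0.0 && inst.probability <= 1.0);
        byType_[inst.fileType].push_back(inst);
    }

    // Returns the storage elements on which a job finds the file in this draw.
    std::vector<std::string> locate(const std::string& fileType,
                                    std::mt19937& rng) const {
        std::vector<std::string> found;
        auto it = byType_.find(fileType);
        if (it == byType_.end()) return found;  // unknown file type
        std::uniform_real_distribution<double> u(0.0, 1.0);
        for (const FileInstance& inst : it->second) {
            // Fixed-position files are always present; others are sampled.
            if (inst.probability >= 1.0 || u(rng) < inst.probability)
                found.push_back(inst.storageElement);
        }
        return found;
    }

private:
    std::map<std::string, std::vector<FileInstance>> byType_;
};
```

A job generator could call `locate` for each required input file and pick one of the returned storage elements as the transfer source; passing the random generator explicitly keeps successive simulations reproducible for a given seed.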

Simulation running

• 20000 jobs -> 900 processing nodes
  – Input 60000 files / output 20000 files, ~4 PB data
• EPN_A uses SE_A for data input
• Real time ~4h - simulation time ~4h
• Simulation time depends heavily on data transport parallelism
• At this moment not optimized

Current work - further steps

• Settled Omnet framework for massive job processing simulation
• Current work: improving performance, debugging
• Further steps: customizing to O2 data processing scenarios
  – Implement O2 job workload management system
  – Define O2-like network/EPN/storage topology
  – Define data distribution on storage elements (what is where)
  – More detailed storage and processing node model