
The new MONARC Simulation Framework

Iosif Legrand
California Institute of Technology
June 2003


The GOALS of the Simulation Framework

The aim of this work is to continue and improve the development of the MONARC simulation framework.

To perform realistic simulation and modelling of large scale distributed computing systems, customised for specific HEP applications.

To offer a dynamic and flexible simulation environment to be used as a design tool for large distributed systems

To provide a design framework to evaluate the performance of a range of possible computer systems, as measured by their ability to provide the physicists with the requested data in the required time, and to optimise the cost.


A Global View for Modelling

[Diagram: a layered view of the framework. The Simulation Engine is the base layer; Basic Components (CPU, DB, LAN/WAN links, Jobs, Scheduler) are built on it; Specific Components (Catalog, MetaData, Analysis jobs, Distributed Scheduler) are derived from the basic ones; complete Computing Models are assembled on top. MONITORING of REAL systems and testbeds feeds the models.]


Design Considerations

This simulation framework is not intended to be a detailed simulator for basic components such as operating systems, database servers or routers.

Instead, based on realistic mathematical models and parameters measured on testbed systems for all the basic components, it aims to correctly describe the performance and limitations of large distributed systems with complex interactions.


Simulation Engine



Design Considerations of the Simulation Engine

A process oriented approach for discrete event simulation is well suited to describe concurrent running programs.

“Active objects” (having an execution thread, a program counter, stack...) provide an easy way to map the structure of a set of distributed running programs into the simulation environment.

The simulation engine supports an "interrupt" scheme. This allows effective and correct simulation of concurrent processes with very different time scales, using a DES approach with a continuous process flow between events.
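As an illustration of the idea (a minimal sketch with invented names, not the actual MONARC classes), an "active object" can be modelled as a thread whose simulated hold can be interrupted so that its remaining work is re-evaluated when conditions change; wall-clock waiting is used here only as a stand-in for the engine's simulated time:

    // Hypothetical sketch of an "active object": a thread that can hold for a
    // period of (simulated) time and can be interrupted by the engine or by
    // another object when its processing conditions change.
    abstract class ActiveObject extends Thread {
        private final Object lock = new Object();
        private boolean interruptedHold = false;

        // Hold for 'dt' milliseconds; returns true if the hold completed,
        // false if it was interrupted and the caller must recompute its state.
        protected boolean simHold(long dt) throws InterruptedException {
            synchronized (lock) {
                long deadline = System.currentTimeMillis() + dt;
                while (!interruptedHold) {
                    long remaining = deadline - System.currentTimeMillis();
                    if (remaining <= 0) return true;     // hold ran to completion
                    lock.wait(remaining);
                }
                interruptedHold = false;                 // consume the interrupt
                return false;
            }
        }

        // Called from outside to force this object to re-evaluate its remaining work.
        public void interruptHold() {
            synchronized (lock) {
                interruptedHold = true;
                lock.notifyAll();
            }
        }
    }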


Tests of the Engine

Processing a total of 100 000 simple jobs on 1, 10, 100, 1000, 2000, 4000 and 10 000 CPUs, using the same number of parallel threads.

more tests: http://monalisa.cacr.caltech.edu/MONARC/

[Plot: total execution time [s] versus number of threads (10 to 100 000) for three machines: 2×2.4 GHz Linux, 2×450 MHz Solaris, 2×3 GHz Windows.]




Basic Components

These basic components are capable of simulating the core functionality of general distributed computing systems. They are built on top of the simulation engine and make efficient use of the interrupt functionality implemented for the active objects.

These components should be considered the basic classes from which specific components can be derived and constructed.


Basic Components

Computing Nodes
Network Links and Routers, IO protocols
Data Containers
Servers:
    Data Base Servers
    File Servers (FTP, NFS …)
Jobs:
    Processing Jobs
    FTP Jobs
Scripts & Graph execution schemes
Basic Scheduler
Activities (a time sequence of jobs)


Multitasking Processing Model

Concurrently running tasks share resources (CPU, memory, I/O).

"Interrupt"-driven scheme: for each new task, or when a task finishes, an interrupt is generated and all "processing times" are recomputed.

It provides:

Handling of concurrent jobs with different priorities.

An efficient mechanism to simulate multitask processing.

An easy way to apply different load balancing schemes.
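A minimal sketch of this recomputation (hypothetical class and field names; it assumes an equal CPU share per running job, whereas the framework also handles priorities): whenever a job starts or finishes, the work done since the previous event is accounted for under the old share, and the remaining processing times implicitly change with the new share.

    import java.util.ArrayList;
    import java.util.List;

    class CpuNode {
        static class RunningJob {
            double remainingWork;   // remaining work, in CPU-seconds
            double lastUpdate;      // simulation time of the last recomputation
            RunningJob(double work, double now) { remainingWork = work; lastUpdate = now; }
        }

        private final double cpuPower;                       // CPU-seconds delivered per simulated second
        private final List<RunningJob> jobs = new ArrayList<>();

        CpuNode(double cpuPower) { this.cpuPower = cpuPower; }

        // The "interrupt": account for the work done since the last event under
        // the old (equal) share; after add/remove the new share applies.
        private void recompute(double now) {
            if (jobs.isEmpty()) return;
            double share = cpuPower / jobs.size();
            for (RunningJob j : jobs) {
                j.remainingWork -= share * (now - j.lastUpdate);
                j.lastUpdate = now;
            }
        }

        void addJob(RunningJob j, double now)    { recompute(now); jobs.add(j); }
        void removeJob(RunningJob j, double now) { recompute(now); jobs.remove(j); }

        // New estimated completion time of a job under the current share.
        double estimatedFinish(RunningJob j, double now) {
            return now + j.remainingWork * jobs.size() / cpuPower;
        }
    }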


LAN/WAN Simulation Model

[Diagram: several LANs, each a group of nodes connected by links, attached through routers to the Internet connections between them.]

"Interrupt"-driven simulation: for each new message an interrupt is created, and for all the active transfers the speed and the estimated time to complete the transfer are recalculated.

Continuous flow between events! An efficient and realistic way to simulate concurrent transfers having different sizes / protocols.
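The same interrupt scheme can be sketched for a network link (again with invented names, and assuming the available bandwidth is shared equally among concurrent transfers, which real protocols only approximate): after each interrupt the engine needs to know which transfer will complete next and when, so that it can schedule its next event.

    import java.util.ArrayList;
    import java.util.List;

    class NetworkLink {
        static class Transfer {
            double remainingMB;     // data still to be moved
            double lastUpdate;      // simulation time of the last recomputation
            Transfer(double sizeMB, double now) { remainingMB = sizeMB; lastUpdate = now; }
        }

        private final double bandwidthMBs;                     // total link bandwidth, MB/s
        private final List<Transfer> active = new ArrayList<>();

        NetworkLink(double bandwidthMBs) { this.bandwidthMBs = bandwidthMBs; }

        // On each new message (interrupt): account for the data moved so far.
        void onInterrupt(double now) {
            if (active.isEmpty()) return;
            double perTransfer = bandwidthMBs / active.size();
            for (Transfer t : active) {
                t.remainingMB -= perTransfer * (now - t.lastUpdate);
                t.lastUpdate = now;
            }
        }

        void start(Transfer t, double now)  { onInterrupt(now); active.add(t); }
        void finish(Transfer t, double now) { onInterrupt(now); active.remove(t); }

        // Simulation time at which the next active transfer will complete under
        // the current share -- the value the engine uses to schedule its next event.
        double nextCompletionTime(double now) {
            double earliest = Double.POSITIVE_INFINITY;
            if (active.isEmpty()) return earliest;
            double perTransfer = bandwidthMBs / active.size();
            for (Transfer t : active) {
                earliest = Math.min(earliest, now + t.remainingMB / perTransfer);
            }
            return earliest;
        }
    }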


Output of the simulation

[Diagram: components inside the simulation engine (nodes, databases, routers, users) send their results through Output Listener Filters to clients such as log files, Excel and graphics displays.]

Any component in the system can generate generic result objects. Any client can subscribe with a filter and will receive only the results it is interested in. The structure is very similar to MonALISA; the output of the simulation framework will soon be integrated into MonALISA.
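A hedged sketch of such a publish/subscribe output scheme (illustrative names only; the framework and MonALISA define their own result and listener types): any component publishes generic result objects, and each client registers a listener together with a filter selecting the results it cares about.

    import java.util.List;
    import java.util.concurrent.CopyOnWriteArrayList;
    import java.util.function.Predicate;

    // A generic result object produced by any simulated component.
    class SimResult {
        final String source;   // e.g. "cpu_farm_A" (hypothetical name)
        final String name;     // e.g. "cpu_load"
        final double time;     // simulation time of the measurement
        final double value;
        SimResult(String source, String name, double time, double value) {
            this.source = source; this.name = name; this.time = time; this.value = value;
        }
    }

    interface OutputListener { void onResult(SimResult r); }

    class OutputManager {
        private static final class Subscription {
            final OutputListener listener;
            final Predicate<SimResult> filter;
            Subscription(OutputListener l, Predicate<SimResult> f) { listener = l; filter = f; }
        }

        private final List<Subscription> subs = new CopyOnWriteArrayList<>();

        // A client subscribes with a filter and only receives matching results.
        void subscribe(OutputListener listener, Predicate<SimResult> filter) {
            subs.add(new Subscription(listener, filter));
        }

        // Any component calls publish(); every matching listener (log file,
        // Excel export, graphics, ...) receives the result.
        void publish(SimResult r) {
            for (Subscription s : subs) {
                if (s.filter.test(r)) s.listener.onResult(r);
            }
        }
    }

For example, a client interested only in the CPU load of one farm would subscribe with a filter such as r -> r.source.equals("cpu_farm_A") && r.name.equals("cpu_load").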




Specific Components

These components should be derived from the basic components and must implement their specific characteristics and the way they operate.

Major parts:
Data Model
Data Flow Diagrams for Production and especially for Analysis Jobs
Scheduling / pre-allocation policies
Data Replication Strategies


Data Model

[Diagram: a generic Data Container (Size, Event Type, Event Range, Access Count, Instance) can be held by different kinds of servers: an FTP server, a DB server, an NFS server, a custom data server, a plain or network file, or a file in a database. A META DATA Catalog and a Replication Catalog keep track of the container instances and their Export / Import.]


Data Model (2)

[Diagram: a Data Processing JOB issues a Data Request; the META DATA Catalog and Replication Catalog resolve it to the available Data Container instances, one of which is selected from the options and turned into a List of IO Transactions for the job.]
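A minimal sketch (all names invented) of how such a lookup might be expressed, combining the container attributes from the previous slide with the request resolution shown here: the job asks the replication catalog for containers matching an event type and range, and one of the returned instances is then selected and turned into IO transactions.

    import java.util.ArrayList;
    import java.util.List;

    // Generic data container, with the attributes listed on the Data Model slide.
    class DataContainer {
        String eventType;             // e.g. "TAG"
        long firstEvent, lastEvent;   // event range held by this container
        double sizeMB;
        int accessCount;              // incremented on every access
        String location;              // which server / site holds this instance
    }

    class ReplicationCatalog {
        private final List<DataContainer> containers = new ArrayList<>();

        void register(DataContainer c) { containers.add(c); }

        // Resolve a data request: all container instances that overlap the
        // requested event range for the given event type.
        List<DataContainer> find(String eventType, long from, long to) {
            List<DataContainer> matches = new ArrayList<>();
            for (DataContainer c : containers) {
                if (c.eventType.equals(eventType) && c.firstEvent <= to && c.lastEvent >= from) {
                    matches.add(c);
                }
            }
            return matches;
        }
    }

From the returned list, the "select from the options" step (for example preferring a local instance) picks one container per range, and the processing job builds its list of IO transactions from these choices.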


Data Flow Diagrams for JOBS

[Diagram: an example job decomposed into steps Processing 1 – Processing 4, each with its own Input and Output data collections; one of the branches is marked 10x.]

Input and output are collections of data, described by type and range. A process is described by name.

A fine-granularity decomposition into processes which can be executed independently, together with the way they communicate, can be very useful for optimisation and parallel execution!
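One way to sketch such a graph execution scheme in code (hypothetical structures and data names, not the framework's own job description) is as a list of named processing steps, each declaring the data collections it reads and writes; steps whose inputs do not depend on each other can then be run in parallel.

    import java.util.Arrays;
    import java.util.List;

    // One step of the job's data flow graph: a named process with its input and
    // output data collections (each collection described by a type and range).
    class ProcessingStep {
        final String name;
        final List<String> inputs;    // names of input data collections
        final List<String> outputs;   // names of output data collections
        final int repeat;             // e.g. 10 for a step marked "10x"

        ProcessingStep(String name, List<String> inputs, List<String> outputs, int repeat) {
            this.name = name; this.inputs = inputs; this.outputs = outputs; this.repeat = repeat;
        }
    }

    class JobGraphExample {
        public static void main(String[] args) {
            // A small graph in the spirit of the diagram: Processing 2 and 3 both
            // consume the output of Processing 1 and could run in parallel.
            List<ProcessingStep> graph = Arrays.asList(
                new ProcessingStep("Processing 1", Arrays.asList("raw-input"), Arrays.asList("out-1"), 1),
                new ProcessingStep("Processing 2", Arrays.asList("out-1"), Arrays.asList("out-2"), 1),
                new ProcessingStep("Processing 3", Arrays.asList("out-1"), Arrays.asList("out-3"), 10),
                new ProcessingStep("Processing 4", Arrays.asList("out-2", "out-3"), Arrays.asList("final"), 1));
            for (ProcessingStep s : graph) {
                System.out.println(s.name + ": " + s.inputs + " -> " + s.outputs + " (x" + s.repeat + ")");
            }
        }
    }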


Job Scheduling – Centralized Scheme

[Diagram: Site A and Site B each have a CPU farm with a local Job Scheduler; a GLOBAL Job Scheduler, implemented as a dynamically loadable module, assigns jobs across the sites.]


Job Scheduling – Distributed Scheme (market model)

[Diagram: each site has a CPU farm and its own Job Scheduler; a scheduler sends a Request to the other sites, receives a COST estimate from each, and makes the DECISION on where to run the job.]
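A hedged sketch of the decision step in such a market model (invented interface; a real cost function would account for CPU load, data location, network cost and so on): the local scheduler asks every site for the cost of running a job and dispatches it to the cheapest one.

    import java.util.List;

    // Hypothetical interface: each site can quote a cost for running a job
    // (e.g. based on its current load and on where the input data sit).
    interface SiteScheduler {
        String siteName();
        double costEstimate(String jobDescription);
        void submit(String jobDescription);
    }

    class MarketScheduler {
        private final List<SiteScheduler> sites;

        MarketScheduler(List<SiteScheduler> sites) { this.sites = sites; }

        // Request -> COST -> DECISION: pick the site with the lowest quoted cost.
        void dispatch(String jobDescription) {
            SiteScheduler best = null;
            double bestCost = Double.POSITIVE_INFINITY;
            for (SiteScheduler s : sites) {
                double cost = s.costEstimate(jobDescription);
                if (cost < bestCost) {
                    bestCost = cost;
                    best = s;
                }
            }
            if (best != null) best.submit(jobDescription);
        }
    }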


Computing Models



Activities: Arrival Patterns

A flexible mechanism to define the stochastic process of how users perform data processing tasks.

Dynamic loading of “Activity” tasks, which are threaded objects and are controlled by the simulation scheduling mechanism

Physics Activities inject "Jobs": each "Activity" thread generates data processing jobs, for example:

    for (int k = 0; k < jobs_per_group; k++) {
        Job job = new Job(this, Job.ANALYSIS, "TAG", 1, events_to_process);
        farm.addJob(job);   // submit the job
        sim_hold(1000);     // wait 1000 s
    }

[Diagram: Activity objects inject Jobs into a Regional Centre Farm.]

These dynamic objects are used to model the users' behaviour.
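As a sketch of how such an "Activity" might be written as a threaded object (the Job constructor and farm.addJob() calls follow the loop above; the stub Job, Farm and hold definitions below only stand in for the framework's own classes and engine-provided hold):

    // Stand-ins for the framework's own classes, with just enough of the
    // interface used by the submission loop above.
    class Job {
        static final int ANALYSIS = 1;
        Job(Object owner, int type, String dataType, int firstEvent, int eventsToProcess) { }
    }

    class Farm {
        void addJob(Job job) { /* hand the job to the farm's scheduler */ }
    }

    // The "Activity": a threaded object, dynamically loaded and controlled by the
    // simulation scheduling mechanism, whose run() injects jobs into a farm.
    class AnalysisActivity extends Thread {
        private final Farm farm;
        private final int jobsPerGroup;
        private final int eventsToProcess;

        AnalysisActivity(Farm farm, int jobsPerGroup, int eventsToProcess) {
            this.farm = farm;
            this.jobsPerGroup = jobsPerGroup;
            this.eventsToProcess = eventsToProcess;
        }

        @Override
        public void run() {
            for (int k = 0; k < jobsPerGroup; k++) {
                Job job = new Job(this, Job.ANALYSIS, "TAG", 1, eventsToProcess);
                farm.addJob(job);   // submit the job
                simHold(1000);      // wait 1000 s of simulated time between submissions
            }
        }

        // Placeholder for the engine-provided hold (sim_hold in the slide's loop).
        private void simHold(long seconds) { }
    }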


Regional Centre Model

[Diagram: a Regional Centre is modelled as a complex composite object containing servers and other components; a simplified topology connects the centres A, B, C, D and E.]


Monitoring



Real Need for Flexible Monitoring Systems

It is important to measure and monitor the key applications in a well-defined test environment and to extract the parameters we need for modelling.

Monitor the farms used today, try to understand how they work, and simulate such systems.

This requires a flexible monitoring system able to dynamically add new parameters and provide access to historical data.

Interfacing monitoring tools to get the parameters we need in simulations in a nearly automatic way.

MonALISA was designed and developed based on the experience with these simulation problems.


Input for the Data Models

We need information on all the possible data types, their expected size and distribution.

Which mechanisms for data access will be used for activities like production and analysis:

Flat files and FTP-like transfer to the local disk
Network file system
Data base access (batch queries with independent threads)
ROOT-like file system client / server
Web Services

To simulate access to "hot spot" data in the system, we need a range of probabilities for such activities.


Input for how jobs are executed

How is the parallel decomposition of a job done?
    Scheduler using a Job description language
    Master / slaves model (parallel ROOT)

Centralized or distributed job scheduler?
What types of policies should we consider for inter-site job scheduling?

Which data should be replicated?
Which are the "predefined data replication" policies?
Should we consider dynamic replication / caching for (selected) data which are used more frequently?


Status

The engine was tested (performance and quality) on several platforms and is working well.

We developed all the basic components (CPU, servers, DB, routers, network links, jobs, IO jobs) and are now testing / debugging them.

A quite flexible output scheme for the simulation is now included.

Examples made with specific components for production and analysis are being tested.

A quite general model for the data catalog and data replication is under development; it will soon be integrated.


Still to be done…

Continue the testing of basic components and network servers, and start modelling real farms, Web Services, peer-to-peer systems…

Improve the documentation.

Improve the graphical output, interface with MonALISA, and create a service to extract simulation parameters from real systems.

Gather information from the current computing systems and possible future architectures, and start building the Specific Components & Computing Models scenarios.

Include Risk Analysis into the system.

Development / evaluation of different scheduling and replication strategies.


Summary

Modelling and understanding current systems, their performance and limitations, is essential for the design of large-scale distributed processing systems. This will require continuous iterations between modelling and monitoring.

Simulation and modelling tools must provide the functionality to help in designing complex systems and to evaluate different strategies and algorithms for the decision-making units and the data flow management.

http://monalisa.cacr.caltech.edu/MONARC/