52
March 13, 2006 PROOF Tutorial 1 Distributed Data Analysis with PROOF Fons Rademakers Bring the KB to the PB not the PB to the KB

March 13, 2006PROOF Tutorial1 Distributed Data Analysis with PROOF Fons Rademakers Bring the KB to the PB not the PB to the KB

Embed Size (px)

DESCRIPTION

March 13, 2006PROOF Tutorial3 PROOF Original Design Goals Interactive parallel analysis on local cluster Transparency Same selectors, same chain Draw(), etc. on PROOF as in local session Scalability Quite good and well understood up to 1000 nodes (most extreme case) Extensive monitoring capabilities MLM (Multi-Level-Master) improves scalability on wide area clusters Adaptability Partly achieved, system handles varying load on cluster nodes MLM allows much better latencies on wide area clusters No support yet for coming and going of worker nodes

Citation preview

Page 1: March 13, 2006PROOF Tutorial1 Distributed Data Analysis with PROOF Fons Rademakers Bring the KB to the PB not the PB to the KB

March 13, 2006 PROOF Tutorial 1

Distributed Data Analysis with PROOF

Fons Rademakers

Bring the KB to the PB not the PB to the KB

Page 2: March 13, 2006PROOF Tutorial1 Distributed Data Analysis with PROOF Fons Rademakers Bring the KB to the PB not the PB to the KB

March 13, 2006 PROOF Tutorial 2

Outline What is PROOF, basic principles How is it implemented Setting up PROOF on a cluster PROOF on the Grid Running a PROOF session New exciting hardware Conclusions Demo

Page 3: March 13, 2006PROOF Tutorial1 Distributed Data Analysis with PROOF Fons Rademakers Bring the KB to the PB not the PB to the KB

March 13, 2006 PROOF Tutorial 3

PROOF Original Design Goals

Interactive parallel analysis on local cluster Transparency

Same selectors, same chain Draw(), etc. on PROOF as in local session

Scalability Quite good and well understood up to 1000 nodes (most

extreme case) Extensive monitoring capabilities MLM (Multi-Level-Master) improves scalability on wide area

clusters Adaptability

Partly achieved, system handles varying load on cluster nodes

MLM allows much better latencies on wide area clusters No support yet for coming and going of worker nodes

Page 4: March 13, 2006PROOF Tutorial1 Distributed Data Analysis with PROOF Fons Rademakers Bring the KB to the PB not the PB to the KB

March 13, 2006 PROOF Tutorial 4

PROOF Parallel Execution

root

Remote PROOF Cluster

proof

proof

proof

TNetFile

TFile

Local PC

$ root

ana.Cstdout/obj

node1

node2

node3

node4

$ rootroot [0] tree->Process(“ana.C”)

$ rootroot [0] tree->Process(“ana.C”)root [1] gROOT->Proof(“remote”)

$ rootroot [0] tree->Process(“ana.C”)root [1] gROOT->Proof(“remote”)root [2] chain->Process(“ana.C”)

ana.C

proof

proof = slave server

proof

proof = master server

#proof.confslave node1slave node2slave node3slave node4

*.root

*.root

*.root

*.root

TFile

TFile

Page 5: March 13, 2006PROOF Tutorial1 Distributed Data Analysis with PROOF Fons Rademakers Bring the KB to the PB not the PB to the KB

March 13, 2006 PROOF Tutorial 5

PROOF Error Handling Handling death of PROOF servers

Death of master Fatal, need to reconnect

Death of slave Master can resubmit packets of death slave

to other slaves Handling of ctrl-c

OOB message is send to master, and forwarded to slaves, causing soft/hard interrupt

Page 6: March 13, 2006PROOF Tutorial1 Distributed Data Analysis with PROOF Fons Rademakers Bring the KB to the PB not the PB to the KB

March 13, 2006 PROOF Tutorial 6

Authentication, Authorization

Use (x)rootd authentication plugins Certificates (login and user name)

Single experiment wide login User name used for sandbox

Authorization to sandbox and shared global space Not to other user’s sandboxes under

same account

Page 7: March 13, 2006PROOF Tutorial1 Distributed Data Analysis with PROOF Fons Rademakers Bring the KB to the PB not the PB to the KB

March 13, 2006 PROOF Tutorial 7

PROOF New Features Support for “interactive batch” mode

Allow submission of long running queries Allow client/master disconnect and

reconnect Powerful, friendly and complete GUI Work in grid environments

Startup of agents via Grid job scheduler Agents calling out to master (firewalls, NAT) Dynamic master-worker setup

Page 8: March 13, 2006PROOF Tutorial1 Distributed Data Analysis with PROOF Fons Rademakers Bring the KB to the PB not the PB to the KB

March 13, 2006 PROOF Tutorial 8

PROOF Scalability

32 nodes: dual Itanium II 1 GHz CPU’s,2 GB RAM, 2x75 GB 15K SCSI disk,1 Fast Eth, 1 GB Eth nic (not used)

Each node has one copy of the data set(4 files, total of 277 MB), 32 nodes: 8.8 Gbyte in 128 files, 9 million events

8.8GB, 128 files1 node: 325 s

32 nodes in parallel: 12 s

Page 9: March 13, 2006PROOF Tutorial1 Distributed Data Analysis with PROOF Fons Rademakers Bring the KB to the PB not the PB to the KB

March 13, 2006 PROOF Tutorial 9

Architecture and Implementation

Page 10: March 13, 2006PROOF Tutorial1 Distributed Data Analysis with PROOF Fons Rademakers Bring the KB to the PB not the PB to the KB

March 13, 2006 PROOF Tutorial 10

TSelector – The User Code Basic ROOT TSelector

// Abbreviated versionclass TSelector : public TObject {Protected: TList *fInput; TList *fOutput;public void Init(TTree*); void Begin(TTree*); void SlaveBegin(TTree *); Bool_t Process(int entry); void SlaveTerminate(); void Terminate();};

Page 11: March 13, 2006PROOF Tutorial1 Distributed Data Analysis with PROOF Fons Rademakers Bring the KB to the PB not the PB to the KB

March 13, 2006 PROOF Tutorial 11

INTERACTIVE ANALYSIS – SELECTORS (1)

Create a selector: TFile *fESD =

TFile::Open(“AliESDs.root”); TTree *tESD = (TTree *)fESD-

>Get(“esdTree”); tESD->MakeSelector(); Modify the selector accordingly.

Page 12: March 13, 2006PROOF Tutorial1 Distributed Data Analysis with PROOF Fons Rademakers Bring the KB to the PB not the PB to the KB

March 13, 2006 PROOF Tutorial 12

TChain – The Data Set A TChain is a collection of TTrees

root[0] TChain *c = new TChain(“esd”);root[1] c->Add(“root://rcrs4001/a.root”);…root[10] c->Print(“a”);root[11] c->Process(“mySelector.C”, nentries, first);

Returned by DB or File Catalog query etc. Use logical filenames (“lfn:…”)

Page 13: March 13, 2006PROOF Tutorial1 Distributed Data Analysis with PROOF Fons Rademakers Bring the KB to the PB not the PB to the KB

March 13, 2006 PROOF Tutorial 13

Running a PROOF Session

Page 14: March 13, 2006PROOF Tutorial1 Distributed Data Analysis with PROOF Fons Rademakers Bring the KB to the PB not the PB to the KB

March 13, 2006 PROOF Tutorial 14

Running PROOFTGrid *alien = TGrid::Connect(“alien”);

TGridResult *res; res = alien->Query(“lfn:///alice/simulation/2001-04/V0.6*.root“);

TChain *chain = new TChain("AOD");chain->Add(res);

gROOT->Proof(“master”);chain->Process(“myselector.C”);

// plot/save objects produced in myselector.C. . .

Page 15: March 13, 2006PROOF Tutorial1 Distributed Data Analysis with PROOF Fons Rademakers Bring the KB to the PB not the PB to the KB

March 13, 2006 PROOF Tutorial 15

New PROOF GUI

Page 16: March 13, 2006PROOF Tutorial1 Distributed Data Analysis with PROOF Fons Rademakers Bring the KB to the PB not the PB to the KB

March 13, 2006 PROOF Tutorial 16

New PROOF GUI

Page 17: March 13, 2006PROOF Tutorial1 Distributed Data Analysis with PROOF Fons Rademakers Bring the KB to the PB not the PB to the KB

March 13, 2006 PROOF Tutorial 17

New PROOF GUI

Page 18: March 13, 2006PROOF Tutorial1 Distributed Data Analysis with PROOF Fons Rademakers Bring the KB to the PB not the PB to the KB

March 13, 2006 PROOF Tutorial 18

New PROOF GUI

Page 19: March 13, 2006PROOF Tutorial1 Distributed Data Analysis with PROOF Fons Rademakers Bring the KB to the PB not the PB to the KB

March 13, 2006 PROOF Tutorial 19

Conclusions

Page 20: March 13, 2006PROOF Tutorial1 Distributed Data Analysis with PROOF Fons Rademakers Bring the KB to the PB not the PB to the KB

March 13, 2006 PROOF Tutorial 20

People Working On PROOF Maarten Ballintijn Bertrand Bellenot Gerri Ganis Jan Iwaszkiewics Guenter Kickinger Andreas Peters Fons Rademakers

Page 21: March 13, 2006PROOF Tutorial1 Distributed Data Analysis with PROOF Fons Rademakers Bring the KB to the PB not the PB to the KB

March 13, 2006 PROOF Tutorial 21

Conclusions The PROOF system on local clusters

provides efficient parallel performance on up to O(1000) nodes

Combined with Grid middleware it becomes a powerful environment for “interactive” parallel analysis of globally distributed data

Page 22: March 13, 2006PROOF Tutorial1 Distributed Data Analysis with PROOF Fons Rademakers Bring the KB to the PB not the PB to the KB

March 13, 2006 PROOF Tutorial 22

PROOF FOR ALICE CAF = CERN Analysis Facility Cluster running PROOF The CAF will be used for prompt

processing of data pp: Analysis PbPb: Calibration & Alignment, pilot

analysis

Page 23: March 13, 2006PROOF Tutorial1 Distributed Data Analysis with PROOF Fons Rademakers Bring the KB to the PB not the PB to the KB

March 13, 2006 PROOF Tutorial 23

CAF test system 40 machines (each 2 CPUs, 200 GB

disk) Disks organized as xrootd pool 8

TB Goal: up to 500 CPUs Next months: Intensive testing

Usability Performance

Page 24: March 13, 2006PROOF Tutorial1 Distributed Data Analysis with PROOF Fons Rademakers Bring the KB to the PB not the PB to the KB

CAF Schema

ExperimentDisk Buffer

Tier-1 data export

Tape storage

CAF computing cluster

Proof node

localdisk

Proof node

localdisk

Proof node

localdisk

Proof node

localdisk

Proof node

localdisk

Proof node

localdisk

...

Sub set (moderated)

Page 25: March 13, 2006PROOF Tutorial1 Distributed Data Analysis with PROOF Fons Rademakers Bring the KB to the PB not the PB to the KB

March 13, 2006 PROOF Tutorial 25

BACKUP

BACKUP

Page 26: March 13, 2006PROOF Tutorial1 Distributed Data Analysis with PROOF Fons Rademakers Bring the KB to the PB not the PB to the KB

March 13, 2006 PROOF Tutorial 26

ROOT in a Nutshell An efficient data storage and access system

designed to support structured data sets in very large distributed data bases (PetaBytes).

A query system to extract information from these distributed data sets.

The query system is able to use transparently parallel systems on the GRID (PROOF).

A scientific visualization system with 2-D and 3-D graphics.

An advanced Graphical User Interface. A C++ interpreter allowing calls to user defined

classes. An Open Source Project (LGPL).

Page 27: March 13, 2006PROOF Tutorial1 Distributed Data Analysis with PROOF Fons Rademakers Bring the KB to the PB not the PB to the KB

March 13, 2006 PROOF Tutorial 27

ROOT - An Open Source Project

The project is developed as a collaboration between:

Full time developers: 10 people full time at CERN 2 developers at FermiLab 1 key developer in Japan 2 key developer at MIT

Many contributors spending a substantial fraction of their time in specific areas (> 50).

Key developers in large experiments using ROOT as a framework.

Several thousand users giving feedback and a very long list of small contributions, comments and bug fixes.

Page 28: March 13, 2006PROOF Tutorial1 Distributed Data Analysis with PROOF Fons Rademakers Bring the KB to the PB not the PB to the KB

March 13, 2006 PROOF Tutorial 28

The ROOT Web Pages

http://root.cern.ch General Information and News

Download source and binaries

HowTo’s & tutorials

User Guide & Reference Guide

Roottalk mailing list & Forum

Page 29: March 13, 2006PROOF Tutorial1 Distributed Data Analysis with PROOF Fons Rademakers Bring the KB to the PB not the PB to the KB

March 13, 2006 PROOF Tutorial 29

Data Access Strategies Each slave gets assigned, as much

as possible, packets representing data in local files

If no (more) local data, get remote data via (x)rootd (needs good LAN, like GB eth)

In case of SAN/NAS just use round robin strategy

Page 30: March 13, 2006PROOF Tutorial1 Distributed Data Analysis with PROOF Fons Rademakers Bring the KB to the PB not the PB to the KB

March 13, 2006 PROOF Tutorial 30

Workflow For Tree Analysis –Pull Architecture

Initialization

Process

Process

Process

Process

Wait for nextcommand

Slave 1 Process(“ana.C”)

Pack

et g

ener

ator

Initialization

Process

Process

Process

Process

Wait for nextcommand

Slave NMasterGetNextPacket()

GetNextPacket()

GetNextPacket()

GetNextPacket()

GetNextPacket()

GetNextPacket()

GetNextPacket()

GetNextPacket()

SendObject(histo)SendObject(histo)Add

histogramsDisplay

histograms

0,100

200,100

340,100

490,100

100,100

300,40

440,50

590,60

Process(“ana.C”)

Page 31: March 13, 2006PROOF Tutorial1 Distributed Data Analysis with PROOF Fons Rademakers Bring the KB to the PB not the PB to the KB

March 13, 2006 PROOF Tutorial 31

Interactive/Batch queries

GUI

Commandsscripts Batch

stateful

statefulor stateless

stateless

Page 32: March 13, 2006PROOF Tutorial1 Distributed Data Analysis with PROOF Fons Rademakers Bring the KB to the PB not the PB to the KB

March 13, 2006 PROOF Tutorial 32

AQ1: 1s query produces a local histogramAQ2: a 10mn query submitted to PROOF1AQ3->AQ7: short queriesAQ8: a 10h query submitted to PROOF2BQ1: browse results of AQ2BQ2: browse temporary results of AQ8BQ3->BQ6: submit 4 10mn queries to PROOF1

CQ1: Browse results of AQ8, BQ3->BQ6

Monday at 10h15

ROOT sessionon my laptop

Monday at 16h25

ROOT sessionon my laptop

Wednesday at 8h40

Carrot sessionon any web

browser

Analysis Session Example

Page 33: March 13, 2006PROOF Tutorial1 Distributed Data Analysis with PROOF Fons Rademakers Bring the KB to the PB not the PB to the KB

March 13, 2006 PROOF Tutorial 33

Sandbox – The Cache Minimize the number of file transfers

One cache per file space Locking to guarantee consistency File identity and integrity ensured

using MD5 digest Time stamps

Transparent via TProof::Sendfile()

Page 34: March 13, 2006PROOF Tutorial1 Distributed Data Analysis with PROOF Fons Rademakers Bring the KB to the PB not the PB to the KB

March 13, 2006 PROOF Tutorial 34

Sandbox – Package Manager

Provide a collection of files in the sandbox

Binary or source packages PAR files: PROOF ARchive. Like Java jar

Tar file, ROOT-INF directory BUILD.sh SETUP.C, per slave setting

API to manage and activate packages

Page 35: March 13, 2006PROOF Tutorial1 Distributed Data Analysis with PROOF Fons Rademakers Bring the KB to the PB not the PB to the KB

March 13, 2006 PROOF Tutorial 35

Merge API Collect output lists in master server Objects are identified by name Combine partial results Member function: Merge(TCollection *)

Executed via CINT, no inheritance required

Standard implementation for histograms and (in memory) trees

Otherwise return the individual objects

Page 36: March 13, 2006PROOF Tutorial1 Distributed Data Analysis with PROOF Fons Rademakers Bring the KB to the PB not the PB to the KB

March 13, 2006 PROOF Tutorial 36

TPacketizer – The Work Distributor

The packetizer is the heart of the system It runs on the master and hands out

work to the workers Different packetizers allow for different

data access policies All data on disk, allow network access All data on disk, no network access Data on mass storage, go file-by-file Data on Grid, distribute per Storage Element

Page 37: March 13, 2006PROOF Tutorial1 Distributed Data Analysis with PROOF Fons Rademakers Bring the KB to the PB not the PB to the KB

March 13, 2006 PROOF Tutorial 37

Interactive Analysis with PROOF on a Large Cluster

PROOFPROOF

USER SESSIONUSER SESSION

PROOF PROOF SLAVE SLAVE SERVERSSERVERS

PROOF MASTERPROOF MASTER SERVERSERVER

PROOF PROOF SLAVE SLAVE SERVERSSERVERS

PROOF PROOF SLAVE SLAVE SERVERSSERVERS

Guaranteed site access throughPROOF Sub-Masters calling outto Master (agent technology)

PROOF SUB-PROOF SUB-MASTERMASTER SERVER SERVER

PROOFPROOF

PROOFPROOF

PROOFPROOF

Grid/Root Authentication

Grid Access Control Service

TGrid UI/Queue UI

Proofd Startup

Grid Service Interfaces

Grid File/Metadata CatalogueClient retrieves listof logical files (LFN + MSN)

Slave servers access data via xrootd from local disk pools

Page 38: March 13, 2006PROOF Tutorial1 Distributed Data Analysis with PROOF Fons Rademakers Bring the KB to the PB not the PB to the KB

March 13, 2006 PROOF Tutorial 38

Exciting New Hardware

Page 39: March 13, 2006PROOF Tutorial1 Distributed Data Analysis with PROOF Fons Rademakers Bring the KB to the PB not the PB to the KB

March 13, 2006 PROOF Tutorial 39

Multi-Core CPU’s Multi-Core CPU’s are the next step to

stay on Moore’s performance curve AMD currently price/performance leader

Dual-core Athlon64 X2 Dual-core Opteron Quad-core Opteron later this year

Intel going same way following AMD, now shipping dual-core Pentium-M laptop CPU’s

Page 40: March 13, 2006PROOF Tutorial1 Distributed Data Analysis with PROOF Fons Rademakers Bring the KB to the PB not the PB to the KB

March 13, 2006 PROOF Tutorial 40

Multi-Core CPU’s: They Deliver

We recently acquired a dual-core Athlon AMD64 X2 4800+

One of the fastest machine ROOT has been compiled on (8m with parallel compile), and for sure the cheapest machine ever to deliver that performance

Two CPU’s for the price and power consumption of one, forget about traditional SMP machines But not about SMP multi-core machines ;-)

Page 41: March 13, 2006PROOF Tutorial1 Distributed Data Analysis with PROOF Fons Rademakers Bring the KB to the PB not the PB to the KB

March 13, 2006 PROOF Tutorial 41

TB Disks Current maximum is 500 GB 3.5”

disks End of next year 1 TB 3.5” is within

reach This means we will pass the

breakeven point where it will be cheaper to have all the data on disk in stead of on tape

Page 42: March 13, 2006PROOF Tutorial1 Distributed Data Analysis with PROOF Fons Rademakers Bring the KB to the PB not the PB to the KB

March 13, 2006 PROOF Tutorial 42

Sandbox – The Environment

Each slave runs in its own sandbox Identical, but independent

Multiple file spaces in a PROOF setup Shared via NFS, AFS, shared nothing

File transfers are minimized Cache Packages

Page 43: March 13, 2006PROOF Tutorial1 Distributed Data Analysis with PROOF Fons Rademakers Bring the KB to the PB not the PB to the KB

March 13, 2006 PROOF Tutorial 43

Setting Up PROOF

Page 44: March 13, 2006PROOF Tutorial1 Distributed Data Analysis with PROOF Fons Rademakers Bring the KB to the PB not the PB to the KB

March 13, 2006 PROOF Tutorial 44

Setting Up PROOF Install ROOT system Add proofd and rootd to /etc/services

The rootd (1094) and proofd (1093) port numbers have been officially assigned by IANA

Start the rootd and proofd daemons (put in /etc/rc.d for startup at boot time) Use scripts provided in $ROOTSYS/etc

Setup proof.conf file describing cluster Setup authentication files (globally, users

can override)

Page 45: March 13, 2006PROOF Tutorial1 Distributed Data Analysis with PROOF Fons Rademakers Bring the KB to the PB not the PB to the KB

March 13, 2006 PROOF Tutorial 45

PROOF Configuration File# PROOF config file. It has a very simple format:## node <hostname> [image=<imagename>]# slave <hostname> [perf=<perfindex>]# [image=<imagename>] [port=<portnumber>]# [srp | krb5]# user <username> on <hostname>

node csc02 image=nfs

slave csc03 image=nfsslave csc04 image=nfsslave csc05 image=nfsslave csc06 image=nfsslave csc07 image=nfsslave csc08 image=nfsslave csc09 image=nfsslave csc10 image=nfs

Page 46: March 13, 2006PROOF Tutorial1 Distributed Data Analysis with PROOF Fons Rademakers Bring the KB to the PB not the PB to the KB

March 13, 2006 PROOF Tutorial 46

PROOF on the Grid

Page 47: March 13, 2006PROOF Tutorial1 Distributed Data Analysis with PROOF Fons Rademakers Bring the KB to the PB not the PB to the KB

March 13, 2006 PROOF Tutorial 47

PROOF Grid Interfacing Grid file catalog

Data set creation Meta data, #events, time, run, etc.

Event catalog PROOF agent creation

Agents call out (no-incoming connection) Config file generation / zero config Grid aware packetizer Scheduled execution Limiting processing to specific part of the

data set

Page 48: March 13, 2006PROOF Tutorial1 Distributed Data Analysis with PROOF Fons Rademakers Bring the KB to the PB not the PB to the KB

March 13, 2006 PROOF Tutorial 48

TGrid – Abstract Grid Interface

class TGrid : public TObject {public: virtual Int_t AddFile(const char *lfn, const char *pfn) = 0; virtual Int_t DeleteFile(const char *lfn) = 0; virtual TGridResult *GetPhysicalFileNames(const char *lfn) = 0; virtual Int_t AddAttribute(const char *lfn, const char *attrname, const char *attrval) = 0; virtual Int_t DeleteAttribute(const char *lfn, const char *attrname) = 0; virtual TGridResult *GetAttributes(const char *lfn) = 0; virtual void Close(Option_t *option="") = 0; virtual TGridResult *Query(const char *query) = 0;

static TGrid *Connect(const char *grid, const char *uid = 0, const char *pw = 0);

ClassDef(TGrid,0) // ABC defining interface to GRID services};

Page 49: March 13, 2006PROOF Tutorial1 Distributed Data Analysis with PROOF Fons Rademakers Bring the KB to the PB not the PB to the KB

March 13, 2006 PROOF Tutorial 49

TGrid TAlien

TAlienFileTFile

xrootd

GRID Service Authentication- Job-Interface- Catalogue-Interface- File Listing/Location/Query

TFile Plugin Implementation for the “alien://” protocol Uses TAlien methods to translate “alien://” => “root://” URL (or others)

TXNetFile

TGrid::Connect(“alien://..)

TFile::Open(“alien://..)

UI to GRIDServices + Files

Abstract GRIDBase Classes

GRID PluginClasses

Alien APIService

AliEn Grid Interface

Page 50: March 13, 2006PROOF Tutorial1 Distributed Data Analysis with PROOF Fons Rademakers Bring the KB to the PB not the PB to the KB

March 13, 2006 PROOF Tutorial 50

Page 51: March 13, 2006PROOF Tutorial1 Distributed Data Analysis with PROOF Fons Rademakers Bring the KB to the PB not the PB to the KB

March 13, 2006 PROOF Tutorial 51

Page 52: March 13, 2006PROOF Tutorial1 Distributed Data Analysis with PROOF Fons Rademakers Bring the KB to the PB not the PB to the KB

March 13, 2006 PROOF Tutorial 52

Conclusions The arrival of cheap and low power

multi-core CPUs and very large disks is changing the landscape

Many new PROOF developments are scheduled for the coming months that allow us to achieve the new exciting goals that will hugely enhance the data analysis experience of very large data sets