March 13, 2006 PROOF Tutorial 1
Distributed Data Analysis with PROOF
Fons Rademakers
Bring the KB to the PB not the PB to the KB
Outline
  What is PROOF, basic principles
  How is it implemented
  Setting up PROOF on a cluster
  PROOF on the Grid
  Running a PROOF session
  New exciting hardware
  Conclusions
  Demo
PROOF Original Design Goals
  Interactive parallel analysis on local cluster
  Transparency
    Same selectors, same chain Draw(), etc. on PROOF as in local session
  Scalability
    Quite good and well understood up to 1000 nodes (most extreme case)
    Extensive monitoring capabilities
    MLM (Multi-Level-Master) improves scalability on wide area clusters
  Adaptability
    Partly achieved, system handles varying load on cluster nodes
    MLM allows much better latencies on wide area clusters
    No support yet for coming and going of worker nodes
PROOF Parallel Execution
[Figure: a ROOT session on the local PC talks to a remote PROOF cluster. The script ana.C is sent to the cluster and stdout/objects come back. In the cluster, one proof process is the master server and the others are slave servers on node1 to node4; each node holds *.root files. File access labels in the figure: TFile and TNetFile.]

Session shown in the figure, built up step by step:
  $ root
  root [0] tree->Process("ana.C")
  root [1] gROOT->Proof("remote")
  root [2] chain->Process("ana.C")

Cluster description shown in the figure:
  #proof.conf
  slave node1
  slave node2
  slave node3
  slave node4
PROOF Error Handling
  Handling death of PROOF servers
    Death of master
      Fatal, need to reconnect
    Death of slave
      Master can resubmit packets of dead slave to other slaves
  Handling of ctrl-c
    OOB message is sent to master, and forwarded to slaves, causing soft/hard interrupt
Authentication, Authorization
  Use (x)rootd authentication plugins
  Certificates (login and user name)
  Single experiment wide login
    User name used for sandbox
  Authorization to sandbox and shared global space
    Not to other user’s sandboxes under same account
PROOF New Features
  Support for “interactive batch” mode
    Allow submission of long running queries
    Allow client/master disconnect and reconnect
  Powerful, friendly and complete GUI
  Work in grid environments
    Startup of agents via Grid job scheduler
    Agents calling out to master (firewalls, NAT)
    Dynamic master-worker setup
PROOF Scalability
  32 nodes: dual Itanium II 1 GHz CPUs, 2 GB RAM, 2x75 GB 15K SCSI disk, 1 Fast Ethernet, 1 Gigabit Ethernet NIC (not used)
  Each node has one copy of the data set (4 files, total of 277 MB); 32 nodes: 8.8 GB in 128 files, 9 million events
  8.8 GB, 128 files:
    1 node: 325 s
    32 nodes in parallel: 12 s
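As a rough reading of these two timings (a back-of-the-envelope calculation added here, not a figure from the slide), the speedup S and the parallel efficiency E on N = 32 nodes would be about:

```latex
S = \frac{T_{1}}{T_{32}} = \frac{325\,\mathrm{s}}{12\,\mathrm{s}} \approx 27.1,
\qquad
E = \frac{S}{N} \approx \frac{27.1}{32} \approx 0.85
```

Under the same assumptions the aggregate rate is roughly 8.8 GB / 12 s ≈ 0.73 GB/s, against about 27 MB/s on a single node.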
Architecture and Implementation
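TSelector – The User Code
Basic ROOT TSelector
// Abbreviated version
class TSelector : public TObject {
protected:
   TList *fInput;                 // objects sent from the client to the slaves
   TList *fOutput;                // objects returned from the slaves to the client
public:
   void   Init(TTree *);          // called when a new tree is attached
   void   Begin(TTree *);         // called once on the client before processing
   void   SlaveBegin(TTree *);    // called once on each slave before processing
   Bool_t Process(int entry);     // called for every entry
   void   SlaveTerminate();       // called once on each slave after processing
   void   Terminate();            // called once on the client at the end
};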
INTERACTIVE ANALYSIS – SELECTORS (1)
  Create a selector:
    TFile *fESD = TFile::Open("AliESDs.root");
    TTree *tESD = (TTree *)fESD->Get("esdTree");
    tESD->MakeSelector();
  Modify the selector accordingly.
TChain – The Data Set
  A TChain is a collection of TTrees
    root [0] TChain *c = new TChain("esd");
    root [1] c->Add("root://rcrs4001/a.root");
    …
    root [10] c->Print("a");
    root [11] c->Process("mySelector.C", nentries, first);
  Returned by DB or File Catalog query etc.
  Use logical filenames (“lfn:…”)
Running a PROOF Session
Running PROOF
  TGrid *alien = TGrid::Connect("alien");
  TGridResult *res;
  res = alien->Query("lfn:///alice/simulation/2001-04/V0.6*.root");
  TChain *chain = new TChain("AOD");
  chain->Add(res);
  gROOT->Proof("master");
  chain->Process("myselector.C");
  // plot/save objects produced in myselector.C
  . . .
New PROOF GUI
Conclusions
People Working On PROOF
  Maarten Ballintijn
  Bertrand Bellenot
  Gerri Ganis
  Jan Iwaszkiewics
  Guenter Kickinger
  Andreas Peters
  Fons Rademakers
Conclusions
  The PROOF system on local clusters provides efficient parallel performance on up to O(1000) nodes
  Combined with Grid middleware it becomes a powerful environment for “interactive” parallel analysis of globally distributed data
PROOF FOR ALICE
  CAF = CERN Analysis Facility
    Cluster running PROOF
  The CAF will be used for prompt processing of data
    pp: Analysis
    PbPb: Calibration & Alignment, pilot analysis
CAF test system
  40 machines (each 2 CPUs, 200 GB disk)
  Disks organized as xrootd pool: 8 TB
  Goal: up to 500 CPUs
  Next months: Intensive testing
    Usability
    Performance
CAF Schema
[Figure: CAF schema. Labels: Experiment disk buffer; Tier-1 data export; Tape storage; CAF computing cluster (a row of PROOF nodes, each with a local disk); Sub set (moderated).]
BACKUP
ROOT in a Nutshell
  An efficient data storage and access system designed to support structured data sets in very large distributed data bases (PetaBytes).
  A query system to extract information from these distributed data sets.
  The query system is able to use transparently parallel systems on the GRID (PROOF).
  A scientific visualization system with 2-D and 3-D graphics.
  An advanced Graphical User Interface.
  A C++ interpreter allowing calls to user defined classes.
  An Open Source Project (LGPL).
ROOT - An Open Source Project
  The project is developed as a collaboration between:
    Full time developers:
      10 people full time at CERN
      2 developers at Fermilab
      1 key developer in Japan
      2 key developers at MIT
    Many contributors spending a substantial fraction of their time in specific areas (> 50).
    Key developers in large experiments using ROOT as a framework.
    Several thousand users giving feedback and a very long list of small contributions, comments and bug fixes.
The ROOT Web Pages
http://root.cern.ch
General Information and News
Download source and binaries
HowTo’s & tutorials
User Guide & Reference Guide
Roottalk mailing list & Forum
Data Access Strategies
  Each slave gets assigned, as much as possible, packets representing data in local files
  If no (more) local data, get remote data via (x)rootd (needs good LAN, like Gigabit Ethernet)
  In case of SAN/NAS just use round robin strategy
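The local-first policy above can be sketched in a few lines of standalone C++. This is only an illustration under stated assumptions: FilePacket and nextPacketFor are hypothetical names, not PROOF classes, and the SAN/NAS round-robin case is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>

// Hypothetical description of one unit of work: a file and the host storing it.
struct FilePacket {
    std::string host;  // node that holds the file on its local disk
    std::string file;  // file name (illustrative only)
};

// Give the requesting slave a packet for a file on its own disk when one is
// left; otherwise fall back to the oldest pending packet, which the slave
// would then have to read remotely. Returns false when no work remains.
bool nextPacketFor(const std::string& slaveHost,
                   std::deque<FilePacket>& pending,
                   FilePacket& out) {
    assert(!slaveHost.empty());
    for (std::size_t i = 0; i < pending.size(); ++i) {
        if (pending[i].host == slaveHost) {  // local data first
            out = pending[i];
            pending.erase(pending.begin() + static_cast<std::ptrdiff_t>(i));
            return true;
        }
    }
    if (pending.empty()) return false;
    out = pending.front();                   // remote data as a fallback
    pending.pop_front();
    return true;
}
```

Called repeatedly for an idle slave, the function first drains the packets stored on that slave's host and only then hands out packets that live on other nodes.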
Workflow For Tree Analysis – Pull Architecture
[Figure: sequence diagram with three columns: Slave 1, Master, Slave N. The master receives Process("ana.C") and passes it to the slaves. After initialization each slave repeatedly calls GetNextPacket() on the master's packet generator and processes the packet it gets back. Packets are shown as (first entry, number of entries): 0,100; 200,100; 340,100; 490,100 on the Slave 1 side and 100,100; 300,40; 440,50; 590,60 on the Slave N side. Each slave then returns its result with SendObject(histo) and waits for the next command; the histograms are added and displayed.]
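The pull scheme in this figure can be sketched as a small standalone C++ class. Everything here is an assumption made for illustration: Packet, PacketGenerator and alternateTwoSlaves are hypothetical names, and the packet size is simply passed in by the caller, whereas the real packetizer decides sizes on the master.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical packet: a contiguous range of tree entries.
struct Packet {
    long first;  // first entry of the range
    long count;  // number of entries; 0 means no work is left
};

// Minimal stand-in for the master's packet generator.
class PacketGenerator {
public:
    explicit PacketGenerator(long totalEntries) : fNext(0), fTotal(totalEntries) {
        assert(totalEntries >= 0);
    }

    // Called whenever a slave is idle (the "pull"). maxEntries is the packet
    // size chosen for that slave, so a slower slave can get smaller packets.
    Packet GetNextPacket(long maxEntries) {
        assert(maxEntries > 0);
        long count = std::min(maxEntries, fTotal - fNext);
        Packet p = {fNext, count};
        fNext += count;
        return p;
    }

private:
    long fNext;   // first entry not yet handed out
    long fTotal;  // total number of entries in the data set
};

// Alternate requests from two slaves: the first always gets 100 entries,
// the second gets the sizes listed in slowSizes. Returns the packets in the
// order in which they were handed out.
std::vector<Packet> alternateTwoSlaves(PacketGenerator& gen,
                                       const std::vector<long>& slowSizes) {
    std::vector<Packet> handedOut;
    for (std::size_t i = 0; i < slowSizes.size(); ++i) {
        handedOut.push_back(gen.GetNextPacket(100));           // first slave
        handedOut.push_back(gen.GetNextPacket(slowSizes[i]));  // second slave
    }
    return handedOut;
}
```

With 650 entries and second-slave sizes 100, 40, 50 and 60, the packets come out in the same (first, count) order as the pairs in the figure, and a further GetNextPacket() call returns a packet with count 0.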
Interactive/Batch queries
  GUI: stateful
  Commands, scripts: stateful or stateless
  Batch: stateless
Analysis Session Example
  Monday at 10h15, ROOT session on my laptop
    AQ1: 1 s query produces a local histogram
    AQ2: a 10 min query submitted to PROOF1
    AQ3->AQ7: short queries
    AQ8: a 10 h query submitted to PROOF2
  Monday at 16h25, ROOT session on my laptop
    BQ1: browse results of AQ2
    BQ2: browse temporary results of AQ8
    BQ3->BQ6: submit 4 10 min queries to PROOF1
  Wednesday at 8h40, Carrot session on any web browser
    CQ1: Browse results of AQ8, BQ3->BQ6
Sandbox – The Cache
  Minimize the number of file transfers
  One cache per file space
  Locking to guarantee consistency
  File identity and integrity ensured using
    MD5 digest
    Time stamps
  Transparent via TProof::SendFile()
Sandbox – Package Manager
  Provide a collection of files in the sandbox
  Binary or source packages
  PAR files: PROOF ARchive. Like Java jar
    Tar file, ROOT-INF directory
    BUILD.sh
    SETUP.C, per slave setting
  API to manage and activate packages
Merge API
  Collect output lists in master server
  Objects are identified by name
  Combine partial results
  Member function: Merge(TCollection *)
    Executed via CINT, no inheritance required
  Standard implementation for histograms and (in memory) trees
  Otherwise return the individual objects
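The idea of combining partial results by object name can be sketched without ROOT. This is an illustration under stated assumptions: Histo, OutputList, mergeHisto and mergeOutputs are hypothetical names, a histogram is reduced to a plain vector of bin contents, and the fallback of returning unmergeable objects individually is not modelled.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a histogram: just a vector of bin contents.
typedef std::vector<double> Histo;
// One slave's output list, keyed by object name.
typedef std::map<std::string, Histo> OutputList;

// Add part into total bin by bin; both must have the same binning.
void mergeHisto(Histo& total, const Histo& part) {
    assert(total.size() == part.size());
    for (std::size_t i = 0; i < total.size(); ++i) total[i] += part[i];
}

// Combine the per-slave output lists on the master: objects with the same
// name are merged, names seen for the first time are copied unchanged.
OutputList mergeOutputs(const std::vector<OutputList>& perSlave) {
    OutputList merged;
    for (const OutputList& list : perSlave) {
        for (const auto& entry : list) {
            OutputList::iterator found = merged.find(entry.first);
            if (found == merged.end()) merged[entry.first] = entry.second;
            else mergeHisto(found->second, entry.second);
        }
    }
    return merged;
}
```

Two slaves that both filled a histogram named "pt" end up with a single "pt" entry whose bins are the sums, while a histogram produced by only one slave is passed through as it is.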
TPacketizer – The Work Distributor
  The packetizer is the heart of the system
  It runs on the master and hands out work to the workers
  Different packetizers allow for different data access policies
    All data on disk, allow network access
    All data on disk, no network access
    Data on mass storage, go file-by-file
    Data on Grid, distribute per Storage Element
Interactive Analysis with PROOF on a Large Cluster
[Figure: a PROOF user session connects to the PROOF master server, which works with PROOF sub-master servers and their groups of PROOF slave servers. Guaranteed site access through PROOF sub-masters calling out to the master (agent technology). Grid service interfaces shown: Grid/ROOT authentication, Grid access control service, TGrid UI/queue UI, proofd startup, and the Grid file/metadata catalogue, from which the client retrieves the list of logical files (LFN + MSN). Slave servers access data via xrootd from local disk pools.]
Exciting New Hardware
Multi-Core CPUs
  Multi-core CPUs are the next step to stay on Moore’s performance curve
  AMD currently price/performance leader
    Dual-core Athlon64 X2
    Dual-core Opteron
    Quad-core Opteron later this year
  Intel going same way following AMD, now shipping dual-core Pentium-M laptop CPUs
Multi-Core CPUs: They Deliver
  We recently acquired a dual-core Athlon AMD64 X2 4800+
  One of the fastest machines ROOT has been compiled on (8 min with parallel compile), and for sure the cheapest machine ever to deliver that performance
  Two CPUs for the price and power consumption of one, forget about traditional SMP machines
    But not about SMP multi-core machines ;-)
TB Disks
  Current maximum is 500 GB 3.5” disks
  End of next year 1 TB 3.5” is within reach
  This means we will pass the breakeven point where it will be cheaper to have all the data on disk instead of on tape
Sandbox – The Environment
  Each slave runs in its own sandbox
    Identical, but independent
  Multiple file spaces in a PROOF setup
    Shared via NFS, AFS, shared nothing
  File transfers are minimized
    Cache
    Packages
Setting Up PROOF
Setting Up PROOF
  Install ROOT system
  Add proofd and rootd to /etc/services
    The rootd (1094) and proofd (1093) port numbers have been officially assigned by IANA
  Start the rootd and proofd daemons (put in /etc/rc.d for startup at boot time)
    Use scripts provided in $ROOTSYS/etc
  Set up proof.conf file describing cluster
  Set up authentication files (globally, users can override)
PROOF Configuration File
# PROOF config file. It has a very simple format:
#
# node <hostname> [image=<imagename>]
# slave <hostname> [perf=<perfindex>]
#       [image=<imagename>] [port=<portnumber>]
#       [srp | krb5]
# user <username> on <hostname>

node csc02 image=nfs

slave csc03 image=nfs
slave csc04 image=nfs
slave csc05 image=nfs
slave csc06 image=nfs
slave csc07 image=nfs
slave csc08 image=nfs
slave csc09 image=nfs
slave csc10 image=nfs
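To make the line format concrete, here is a minimal standalone C++ sketch that reads one "node" or "slave" line into a small struct. It is not the parser PROOF itself uses: ConfEntry and parseConfLine are hypothetical names, and the "user <username> on <hostname>" form is deliberately not handled.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>

// Hypothetical in-memory form of one "node" or "slave" line.
struct ConfEntry {
    std::string kind;                            // "node" or "slave"
    std::string host;                            // hostname
    std::map<std::string, std::string> options;  // e.g. image -> nfs
};

// Parse one line of the file. Returns false for blank lines, comments and
// any line that is not a "node" or "slave" entry.
bool parseConfLine(const std::string& line, ConfEntry& out) {
    std::istringstream in(line);
    std::string kind, host;
    if (!(in >> kind) || kind[0] == '#') return false;
    if (kind != "node" && kind != "slave") return false;
    if (!(in >> host)) return false;

    ConfEntry entry;
    entry.kind = kind;
    entry.host = host;
    std::string token;
    while (in >> token) {
        std::string::size_type eq = token.find('=');
        if (eq == std::string::npos) {
            entry.options[token] = "";  // bare keyword such as srp or krb5
        } else {
            entry.options[token.substr(0, eq)] = token.substr(eq + 1);
        }
    }
    assert(!entry.host.empty());
    out = entry;
    return true;
}
```

Fed the line "slave csc03 image=nfs", the function fills in kind "slave", host "csc03" and the option image = nfs; comment lines starting with # are skipped.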
PROOF on the Grid
PROOF Grid Interfacing
  Grid file catalog
    Data set creation
    Meta data, #events, time, run, etc.
  Event catalog
  PROOF agent creation
    Agents call out (no-incoming connection)
  Config file generation / zero config
  Grid aware packetizer
  Scheduled execution
  Limiting processing to specific part of the data set
TGrid – Abstract Grid Interface
class TGrid : public TObject {
public:
   virtual Int_t        AddFile(const char *lfn, const char *pfn) = 0;
   virtual Int_t        DeleteFile(const char *lfn) = 0;
   virtual TGridResult *GetPhysicalFileNames(const char *lfn) = 0;
   virtual Int_t        AddAttribute(const char *lfn, const char *attrname, const char *attrval) = 0;
   virtual Int_t        DeleteAttribute(const char *lfn, const char *attrname) = 0;
   virtual TGridResult *GetAttributes(const char *lfn) = 0;
   virtual void         Close(Option_t *option="") = 0;
   virtual TGridResult *Query(const char *query) = 0;

   static TGrid *Connect(const char *grid, const char *uid = 0, const char *pw = 0);

   ClassDef(TGrid,0)  // ABC defining interface to GRID services
};
AliEn Grid Interface
[Figure: layered class diagram. Labels:
  UI to GRID services + files: TGrid::Connect("alien://..), TFile::Open("alien://..)
  Abstract GRID base classes: TGrid, TFile
  GRID plugin classes: TAlien, TAlienFile
  Alien API service; xrootd; TXNetFile
  GRID service: authentication, job interface, catalogue interface, file listing/location/query
  TFile plugin implementation for the "alien://" protocol: uses TAlien methods to translate "alien://" => "root://" URL (or others)]
Conclusions
  The arrival of cheap and low power multi-core CPUs and very large disks is changing the landscape
  Many new PROOF developments are scheduled for the coming months that allow us to achieve the new exciting goals that will hugely enhance the data analysis experience of very large data sets