Distributed Data Analysis and Tools
CHEP, 21-27 March 2009, Prague
P. Mato / CERN
Distributed Data Analysis is a very wide subject and I don't like catalogue-like talks
Narrowing the scope of the presentation to the perspective of the physicists, discussing issues that affect them directly
My presentation will be LHC-centric, which is very relevant for the current phase we are in now. -- Sorry
Thanks to all the people who have helped me to prepare this presentation
Foreword
The full data processing chain from reconstructed event data up to producing the final plots for publication
Data analysis is an iterative process (a minimal sketch follows this slide)
◦ Reduce data samples to more interesting subsets (selection)
◦ Compute higher-level information, redo some reconstruction, etc.
◦ Calculate statistical entities
Algorithm development is essential in analysis
◦ The ingenuity is materialized in code
Data Analysis
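Purely as an illustration of this iterative pattern (not from the original slides; the event records and cuts are hypothetical), a minimal Python sketch:

    def select(events, cut):
        # reduction step: keep only the interesting subset
        return [e for e in events if cut(e)]

    def summarize(values):
        # compute a statistical entity from the reduced sample
        n = len(values)
        return n, (sum(values) / n if n else 0.0)

    # hypothetical events with a single 'pt' attribute
    events = [{"pt": 5.0}, {"pt": 25.0}, {"pt": 40.0}]
    subset = select(events, lambda e: e["pt"] > 20.0)    # selection
    print(summarize([e["pt"] for e in subset]))          # (2, 32.5)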
The large amount of data to be analyzed and the computing requirements rule out the idea of non-distributed data analysis
The scale of 'distribution' goes from a local cluster to a computing centre or to the whole grid(s)
Distributed analysis complicates the life of the physicists
◦ In addition to the analysis code he/she has to worry about many other technical issues
Some Obvious Facts
LHC Analysis Data Flow
Data is generated at the experiment, processed and distributed worldwide (T1, T2, T3)
The analysis will process, reduce, transform and select parts of the data iteratively until they fit in a single computer
How is this realized?
All elements are there and still valid
◦ Less organized activity (chaotic)
◦ Input data defined by asking questions
◦ Data scattered all over the world
◦ Own algorithms
◦ Data provenance
◦ Software version management
◦ Resource estimation
◦ Interactivity
Advocating for a sophisticated WMS
◦ Common to all VOs
◦ Plugins to VO-specific tools/services
HEPCAL-II† Dreams
[Diagram: a common Workload Management System connecting the Dataset Query, User Algorithms, User Output and Other Services]
† Common use cases for a HEP Common Application Layer for Analysis, LCG-2003
“If there is no special middleware support [for analysis], the job may not benefit from being run in the grid environment, and analysis may even take a step backward from pre-grid days”
Need for a Common Layer
The implementation has evolved into a number of VO-specific "middleware" systems built on a small set of basic services
◦ E.g. DIRAC, PanDA, AliEn, Glide-In
Development of "user-friendly" and "intelligent" interfaces to hide the complexity
◦ E.g. CRAB, Ganga
Not optimal for small VOs that cannot afford to develop specific services/interfaces
◦ Or individuals with special needs
HEPCAL-II Reality
[Layered diagram: (VO-specific) front-end interface → VO-specific WMS, DSC → grid middleware basic services → computing & storage resources]
Specialization of the VOs' frameworks and data models for data analysis to process ESD/AOD
◦ CMS Physics Analysis Toolkit (PAT), ATLAS Analysis Framework, LHCb DaVinci/LoKi/Bender, ALICE Analysis Framework
◦ In some cases selecting a subset of framework libraries
◦ Collaboration-approved analysis algorithms and tools
Other [scripting] languages have a role here
◦ Python is getting very popular in addition to CINT macros
◦ Ideal for prototyping new ideas (see the sketch after this slide)
The user typically develops his/her own algorithm(s) based on these frameworks, but may also want to replace parts of the official release
Analysis Software
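As an illustration of such Python prototyping (not from the slides; assumes ROOT with PyROOT enabled, and the file, tree and branch names are hypothetical):

    import ROOT  # requires a ROOT installation with PyROOT enabled

    f = ROOT.TFile.Open("aod.root")          # hypothetical input file
    tree = f.Get("Events")                   # hypothetical tree name
    h = ROOT.TH1F("h_pt", "muon pT;pT [GeV];entries", 100, 0.0, 100.0)
    for event in tree:                       # PyROOT event loop
        for pt in event.muon_pt:             # hypothetical vector branch
            h.Fill(pt)
    h.Draw()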
Front-End Tools
[Screenshots: the Ganga, ALICE and CRAB front-end tools]
Both Ganga and ALICE provide an interactive shell to configure and automate analysis jobs (Python, CINT)
◦ In addition, Ganga provides a GUI
CRAB has a thin client; most of the work (automation, recovery, monitoring, etc.) is done on a server
◦ In the other cases this functionality is delegated to the VO-specific WMS
Ganga offers a convenient overview of all user jobs (job repository), enabling automation
Both CRAB and Ganga are able to pack local user libraries and environment automatically, making use of the configuration tool's knowledge (see the sketch after this slide)
◦ For ALICE the user provides .par files with the sources
Major Differences
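From the user's side, the packing amounts to shipping local code along with the job; a minimal sketch in Ganga's Python interface (attribute names follow Ganga's GPI, but treat the fragment as illustrative, not a verbatim recipe):

    # Inside a Ganga session (GPI); a sketch, not a verbatim recipe.
    j = Job()
    j.application = Executable(exe='./myAnalysis.sh')
    j.inputsandbox = ['myAnalysis.sh', 'myUserLibs.tar.gz']  # local user code shipped with the job
    j.backend = Local()
    j.submit()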
1. Algorithm development and testing starts locally and small
◦ Single computer → small cluster
2. Grows to a large data and computation task
◦ Large cluster → the Grid
3. Final analysis is again more local and small
◦ Small cluster → single computer
Ideally the analysis activity should be a continuum in terms of tools, software frameworks, models, etc.
◦ LHC experiments are starting to offer this to their physicists
◦ Ganga is a good example: from inside the same session you can run a large data job and do the final analysis with the results
Analysis Activity
The user specifies what data to run the analysis on using VO-specific dataset catalogs
◦ Specification is based on a query
◦ The front-end interfaces provide functionality to facilitate the catalog queries
Each experiment has developed event-tag mechanisms for sparse input data selection
Data is scattered over the world
◦ The computing model and policies of the experiment dictate the placement of data
◦ Read-only data with several replicas
◦ Portions of the data are copied to local clusters (CAF, T3, etc.) for local access
Input Data
Small output data files such as histogram files are returned to the client session (using the sandbox)
◦ Usually limited to a few MB
Large output files are typically put in Storage Elements (e.g. Castor) and registered in the grid file catalogue (e.g. LFC), and can be used as input for other Grid jobs (iterative process)
Tools such as CRAB and Ganga (ATLAS) provide strong links with the VO's Distributed Data Management/Transfer systems (e.g. DQ2, PhEDEx) to place output where the user wants it
Output Data
The goal is to make it easy for physicists: distributed analysis should be as simple as doing it locally
◦ Which is already complicated enough!!
◦ Hiding the technical details is a must
In Ganga, changing the back-end from LSF to DIRAC requires changing one parameter (see the sketch after this slide)
In ALICE, changing from PROOF to AliEn requires changing one name and providing an AliEn plugin configuration
In CRAB, changing from local batch to gLite requires a single parameter change in the configuration file
Submission Transparency
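A sketch of what that one-parameter change looks like in a Ganga session (backend and application names follow Ganga's GPI of the time; illustrative only):

    # Same job definition; only the backend changes.
    j = Job(application=Executable(exe='./runAnalysis.sh'))
    j.backend = LSF()      # run on the local LSF batch system
    j.submit()

    j2 = j.copy()
    j2.backend = Dirac()   # same job, now submitted to the Grid via DIRAC
    j2.submit()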
[Diagram (Andrei Gheata): on the client, MyAnalysis.C calls AM->StartAnalysis("proof"); the AnalysisManager (AM) sends the input chain to the PROOF master, which distributes the work to the workers; each worker runs the registered tasks (task1 … taskN) through AliAnalysisSelector/TSelector (SlaveBegin(), Process(), SlaveTerminate()); the per-worker outputs (O1, O2, … On) are merged and returned to the client's output list in Terminate().]
PROOF Transparency example
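To make the transparency concrete, a PyROOT sketch of running the same TSelector locally and through PROOF (tree, file and selector names are hypothetical):

    import ROOT

    chain = ROOT.TChain("esdTree")          # hypothetical tree name
    chain.Add("data/*.root")                # hypothetical input files

    # Local run: process the chain directly with a TSelector
    chain.Process("MySelector.C+")

    # PROOF run: open a session and route the same chain through it
    ROOT.TProof.Open("lite://")             # PROOF-Lite here; a cluster URL would go in its place
    chain.SetProof()                        # attach the chain to the PROOF session
    chain.Process("MySelector.C+")          # same selector, now run in parallel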
A large variety of front-ends and back-ends
It is great, but it may add confusion and complicate user support
ATLAS Physicist Choices
Distributed analysis relies on the software installed in the remote nodes (e.g. local cluster, Grid)
◦ The experiment's officially released software is taken care of by the VOs
◦ Installation procedures for big VOs are well oiled
◦ Problem for small VOs / individuals
Physicists' add-ons and private analysis algorithms need to be sent along with the job
◦ Every user tool provides some level of support for this
◦ An exact match of the OS version/compiler (platform) is required when sending binaries
The latter imposes strong constraints on the platform uniformity of the different facilities
◦ Local interactive service → local facility → Grid
Managing the Software
CernVM is a Virtual Appliance that provides a complete, portable and easy-to-configure user environment for developing and running analysis locally and on the Grid, independently of the physical software and hardware platform
It comes with a read-only file system (CVMFS) optimized for software distribution
◦ Only a small fraction of the software is actually used (~10%)
◦ Very aggressive local caching, web proxy caches (squids)
◦ Operational in off-line mode
On-demand Install with CernVM
[Diagram: CernVM instances mounting CVMFS, fetching software over LAN/WAN via HTTPS]
The CernVM platform is starting to be used by physicists to develop/test/debug data analysis
◦ With a laptop you carry the complete development environment and the Grid UI with you
◦ Managing all phases of analysis from the same 'window'
Ideally the same environment should be used to execute their jobs in the Grid
◦ Validation with large datasets
◦ Decoupling application software from system software and hardware
Can the existing 'Grid' be adapted to CernVM?
Virtualization Role
Job splitting (parallelization) is essential to be able to analyze large data samples in a limited time
◦ Very long-running jobs are less reliable
Tools such as PROOF split the analysis job dynamically at the sub-file level (packets), offering [quasi-]interactivity to the user
All the other Grid submission tools provide parallelization by splitting the list of input files (a minimal sketch follows this slide)
◦ Sub-jobs are constrained by input data location
The more difficult part is merging the results
◦ Standard automation for the most common cases
◦ User intervention for the more complicated ones
Job Splitting
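A minimal sketch of that file-level splitting, the common denominator of the Grid submission tools (purely illustrative):

    def split_by_files(input_files, files_per_subjob):
        # Partition the input file list into sub-job chunks.
        return [input_files[i:i + files_per_subjob]
                for i in range(0, len(input_files), files_per_subjob)]

    files = ["aod_%03d.root" % i for i in range(10)]   # hypothetical inputs
    subjobs = split_by_files(files, 3)                 # 4 sub-jobs: 3+3+3+1 files
    # each chunk becomes one sub-job; merging the outputs comes afterwards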
The majority of today's computing resources are based on multi-core architectures
◦ Exploiting these multi-core architectures (MT, MP) can optimize the use of resources (memory, I/O)
◦ See V. Innocente's presentation
Submitting a single job per node that utilizes all available cores can be advantageous (see the sketch after this slide)
◦ Efficient in resources, mainly by increasing the fraction of shared memory
◦ Scales down the number of jobs that the WMS needs to handle
Using Multi-Core Architectures
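A sketch of the 'one job per node, all cores' idea using Python's multiprocessing module (illustrative; a real framework would also exploit the memory shared between the forked workers):

    import multiprocessing

    def process_file(filename):
        # placeholder for the real per-file event loop
        return ("done", filename)

    if __name__ == "__main__":
        files = ["aod_%03d.root" % i for i in range(32)]          # hypothetical inputs
        pool = multiprocessing.Pool(multiprocessing.cpu_count())  # one worker per core
        results = pool.map(process_file, files)
        pool.close()
        pool.join()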
Grouping data analyses is a way to optimize when going over a large part of, or the full, dataset
◦ Requires the support of the framework (a model)
◦ …and some discipline
Examples (a sketch of the idea follows this slide):
◦ ALICE is using the AliAnalysisManager framework to optimize the CPU/IO ratio (85% savings reported)
◦ LHCb is grouping pre-selections in their stripping jobs
Analysis Trains
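The core of the train idea is that many tasks share one pass over the data; a minimal Python sketch, loosely modeled on the AliAnalysisManager pattern (all names are illustrative):

    class AnalysisTrain:
        # Run many registered tasks over a single pass of the data.
        def __init__(self):
            self.tasks = []

        def add_task(self, task):
            self.tasks.append(task)

        def run(self, events):
            for event in events:         # the data is read only once...
                for task in self.tasks:  # ...but every "wagon" sees every event
                    task.process(event)

    # train = AnalysisTrain()
    # train.add_task(PtSpectrumTask()); train.add_task(FlowTask())
    # train.run(event_source)   # amortizes the I/O cost across all tasks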
At the time of HEPCAL-II resource estimation was an important issue
◦ How much CPU time would this analysis take, what will be the output data size, etc.
In practice physicists can estimate resources pretty well, since test analyses are performed with small data samples before submitting large jobs (the arithmetic is sketched after this slide)
◦ Proper reporting of the 'cost' of each job in standardized units could facilitate this estimation
◦ In the old times of CERNVM a job summary with the CPU time in 'CERN units' was printed for each job
Resource Estimation
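The extrapolation itself is simple, which is why it works well in practice; a sketch with made-up numbers:

    # Extrapolate from a small test run to the full sample (made-up numbers).
    test_events = 10000           # events in the local test job
    test_cpu_seconds = 250.0      # measured CPU time of the test
    test_output_mb = 12.0         # measured output size of the test

    full_events = 50000000        # events in the full dataset
    scale = float(full_events) / test_events

    print("CPU estimate: %.0f hours" % (test_cpu_seconds * scale / 3600))  # ~347 hours
    print("Output estimate: %.1f GB" % (test_output_mb * scale / 1024))    # ~58.6 GB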
Job failures are very common (e.g. ~45% of the CMS analysis jobs do not terminate successfully)
◦ The reasons are very diverse (data access, stalled jobs, uploading data, application failures, …)
Proper reporting of job failures is essential for diagnosing and handling them efficiently
◦ Detailed monitoring, log files, etc.
Handling failures may imply providing corrections in configurations or code, re-submission, managing site blacklists, etc.
◦ Automated corrective actions can be handled by servers (e.g. CRAB)
◦ Scripting support is available to users (e.g. Ganga, as below)
Handling Job Failures
[1]: jobs.select(status='failed').resubmit()
[2]: jobs.select(name='testjob').kill()
[3]: newjobs = jobs.select(status='new')
[4]: newjobs.select(name='urgent').submit()
Monitoring is essential for the users and also for the administrators
Physicists may use web-based interfaces to find out information about their jobs
◦ Each WMS has developed very complete monitoring tools
◦ The details available are really impressive (e.g. Panda Monitor)
Often the connection with the submission tools is poor
◦ Not well integrated
Monitoring
If the front-end submission tool understands the analysis application [framework] it can become extremely helpful to the users
E.g. the Ganga application component can
◦ Set up the correct environment, collect user-shareable libraries, analyze configuration files and follow dependencies, determine inputs and outputs and register them automatically, etc.
The technical solution to achieve this is to implement 'plugins' for each type of application (sketched after this slide)
Application Awareness
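A hypothetical sketch of the shape such an application plugin interface can take (this is not Ganga's actual class hierarchy, just an illustration of the plugin idea):

    class ApplicationPlugin:
        # Hypothetical interface a front-end could define per application type.
        def prepare(self, job):
            # set up the environment, collect user libraries, follow dependencies
            raise NotImplementedError

        def register_io(self, job):
            # determine inputs and outputs and register them automatically
            raise NotImplementedError

    class DaVinciPlugin(ApplicationPlugin):
        # One concrete plugin per application type (DaVinci, Athena, ...).
        def prepare(self, job):
            pass  # e.g. parse the user's option files and pack the needed libraries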
Fundamentally the way analysis is being done has not changed very much
◦ The initial dream that the Grid would dramatically change the paradigm has not happened
◦ Parts of the analysis with large data jobs will be done in batch and parts will be done more locally and interactively
Each collaboration has developed tools to cope with the large data and computational requirements and to simplify the life of physicists
◦ It turned out that the models/architectures of these tools are very similar, but they are not held in common
◦ The number of users of these tools is increasing rapidly
Summary