Acat 2000 18 October Rene Brun 1
Future of Analysis Environments: Personal views
Rene Brun, CERN
Data: Type of data? Any type? PAW-like ntuple?
Analysis: Restricted to histogramming & visualisation?
Packages: Structure? What is modularity? Abstract interfaces? Languages? Parallelism?
No restrictions
Coherent framework of cooperating systems (I/O + UI, object bus)
Type of Data in the past
- Event data managed by data structure (bank) managers (zebra, bos, ...)
  - a bank is like an object
- Final physics data in ntuple format (paw)
  - an ntuple is like a table in a RDBMS
- Run/File catalog with ad hoc tools (fatmen)
- Calibrations, geometry, etc. with ad hoc tools (hepdb)
Type of data: trends-1
- Put everything in an object database like Objectivity
  - Choice of the RD45 project
  - Many experiments initially followed this line
  - Abandoned by most experiments recently
  - Interesting experience with BaBar
  - Solution not suited for PAW-like analysis
Type of data: trends-2
- Put write-once data in an object store like ROOT in Streamer mode
- Use a RDBMS for:
  - Run/Event catalogs
  - Geometry, calibrations
  - e.g. with the ROOT <-> Oracle interface
    http://www.phenix.bnl.gov/WWW/publish/onuchin/rooObjy/
  - or with the ROOT <-> Objectivity interface
    http://www.phenix.bnl.gov/WWW/publish/onuchin/RDBC/
- Use ROOT split/no-split mode for physics analysis
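As a rough illustration of why a split layout suits physics analysis, the sketch below stores the same events either as whole objects (no-split) or with one column per data member (split), and counts the bytes an analysis has to touch to sum a single member. This is plain C++, not ROOT code; `Event`, `NoSplitStore`, `SplitStore`, and the two `SumPx...` helpers are hypothetical names invented for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical event type; the members are illustrative only.
struct Event {
    float px;
    float py;
    float pz;
    float energy;
};

// No-split storage: each entry is a whole object.
using NoSplitStore = std::vector<Event>;

// Split storage: one column per data member.
struct SplitStore {
    std::vector<float> px, py, pz, energy;

    void Fill(const Event& e) {
        px.push_back(e.px);
        py.push_back(e.py);
        pz.push_back(e.pz);
        energy.push_back(e.energy);
    }

    std::size_t Size() const {
        assert(px.size() == energy.size());  // columns are filled together
        return px.size();
    }
};

// Sum of px read from whole objects; bytesRead counts the data touched.
double SumPxNoSplit(const NoSplitStore& store, std::size_t& bytesRead) {
    double sum = 0.0;
    bytesRead = 0;
    for (const Event& e : store) {
        bytesRead += sizeof(Event);  // the full object has to be read
        sum += e.px;
    }
    return sum;
}

// Sum of px read from the px column only.
double SumPxSplit(const SplitStore& store, std::size_t& bytesRead) {
    double sum = 0.0;
    bytesRead = 0;
    for (float v : store.px) {
        bytesRead += sizeof(float);  // only one column is read
        sum += v;
    }
    return sum;
}
```

Both functions return the same sum; the difference is only in how much data is read, which is the point of choosing a split layout for analysis-style access.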
Framework basic requirements
- Dynamic linking AND unlinking of user shared libs
- User can define new classes interactively
- Interpreted code can call compiled code (this is the normal operation mode)
- Compiled code can call interpreted code (interesting feature for GUIs & event displays)
- Scripts can be dynamically compiled/linked (script compiler: Root > .x file.C++)
Fundamental features of an Object-Oriented Framework
[Diagram labels]
Procedural World: Functions; Data; DDL; KUIP CDF
OO World: Data + Functions; RTTI; Persistency services; User Interface
Languages shown: ROOT C++; C++, Java
Automatic Code generation
- Hand-written code: algorithms
- Automatically generated code (40 per cent in ROOT): meta information
  - used by I/O, GUI, inspectors, browsers, interpreter, html, etc.
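To make the role of meta information concrete, here is a minimal sketch of a class dictionary and a generic inspector that relies on it. It is standard C++ with no ROOT dependency, and the dictionary is written by hand here, whereas the slide's point is that such code would be generated automatically. `Track`, `MemberInfo`, `ClassInfo`, `TrackDictionary`, and `Inspect` are hypothetical names chosen for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hand-written part: the user's class.
struct Track {
    double pt;
    double eta;
    int charge;
};

// Meta information: one record per data member.
enum class MemberType { kDouble, kInt };

struct MemberInfo {
    std::string name;
    MemberType type;
    std::size_t offset;  // byte offset of the member inside the object
};

struct ClassInfo {
    std::string name;
    std::vector<MemberInfo> members;
};

// The part a generator would emit; written by hand in this sketch.
ClassInfo TrackDictionary() {
    return ClassInfo{"Track",
                     {{"pt", MemberType::kDouble, offsetof(Track, pt)},
                      {"eta", MemberType::kDouble, offsetof(Track, eta)},
                      {"charge", MemberType::kInt, offsetof(Track, charge)}}};
}

// A generic inspector: it knows nothing about Track, only the dictionary.
std::string Inspect(const void* object, const ClassInfo& info) {
    assert(object != nullptr);
    std::ostringstream out;
    out << info.name << " {";
    const char* base = static_cast<const char*>(object);
    for (const MemberInfo& m : info.members) {
        out << " " << m.name << "=";
        if (m.type == MemberType::kDouble) {
            out << *reinterpret_cast<const double*>(base + m.offset);
        } else {
            out << *reinterpret_cast<const int*>(base + m.offset);
        }
    }
    out << " }";
    return out.str();
}
```

The same dictionary could drive I/O or a browser in the same way: each service walks the member list instead of being written once per class.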
Java - ROOT interface(s)
- Read ROOT files from a Java program
  - see Tony Johnson
  - will be simpler with new ROOT 2.26 supporting automatic schema evolution
- Call ROOT classes from a Java program
  - work by Subir Sarkar (hand-coded JNI interface)
  - could use JACO (see Tony Johnson)
  - or better use a variant of rootcint (rootjava)
- Generate ROOT-Java data classes
  - TTree::MakeJava like TTree::MakeClass
Java - ROOT interface(s)
    import root.*;

    TROOT troot = new TROOT("simple", "Simple Java to root interface");
    TApplication app = new TApplication("ROOT Application");
    System.out.println("TApplication .....");

    TBenchmark bench = new TBenchmark();
    bench.Start("Hsum");
    TRandom random = new TRandom();

    TH1F total = new TH1F("total", "total distribution", 100, -4.0F, 4.0F);
    TH1F main = new TH1F("main", "Main contributor", 100, -4.0F, 4.0F);
    TH1F s1 = new TH1F("s1", "first signal", 100, -4.0F, 4.0F);
    TH1F s2 = new TH1F("s2", "second signal", 100, -4.0F, 4.0F);

    // this makes sure that the sum of squares of weights will be stored
    total.Sumw2();
    total.SetMarkerStyle(21);
    total.SetMarkerSize(0.7F);
    main.SetFillColor(16);
    s1.SetFillColor(42);
    s2.SetFillColor(46);

    TCanvas canvas = new TCanvas("c1", "The HSUM example", 200, 10, 600, 400);
    canvas.SetGrid();
and so on.
Java - ROOT interface(s)
- It is important to cooperate to facilitate the Java/C++ integration
- Could be interesting for applications where performance is not an issue (event display)
- However, I do not believe in a solution where the bulk of data is stored as C++ objects and analyzed with a Java-based system.
  - It must be fun, but very inefficient
  - What do you gain?
Languages for data analysis
Data analysis requires an efficient access to objects (both data and functions).
It requires a powerful programming language:
- in interpreted mode
- in compiled mode
Transition from interpreted mode to compiled mode must be smooth and transparent.
A scripting language is not the solution
- Python is not a solution
GUI
Commands
Interpreted scripts
Compiled scripts
A role for commercial components ?
- Data bases: Oracle very likely, others NO
- Graphics/UI: NO, but YES for interfaces to commercial systems
- Special algorithms like fitting: strong doubts
- I strongly believe in the advantages of Open Source systems
  - Large news/discussion groups
Our current work
- Continuous consolidation of the system
- Automatic schema evolution
- Common GUI between Unix and Windows
- Upgrade UI to new style GUI
- Tree query processor reimplemented using the new TSelector facility
- PROOF (Parallel ROOT Facility) (see next)
- Interface with other systems, e.g. G3, G4
- Support thousands of users
The OODBMS dreams
[Diagram labels: Selection parameters; Local: CPU; Remote: OODB federation with DB1, DB2, DB3, DB4, DB5, DB6]
ROOT/PROOF and GRIDs
[Diagram labels: Selection parameters; Procedure (Proc.C); PROOF; Local: CPU; Remote: five CPUs, each with its own Proc.C; DB1, DB2, DB3, DB4, DB5, DB6; TagDB; RDB]
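The idea suggested by the PROOF picture (the same procedure, Proc.C, runs on several CPUs, each on its own part of the data, and the partial results are combined) can be sketched with standard C++ threads. This is not PROOF code and uses no ROOT API; `Histogram`, `Procedure`, and `RunParallel` are hypothetical names, the "databases" are simply in-memory vectors, and the histogram binning is fixed arbitrarily for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <future>
#include <vector>

// Minimal fixed-range histogram; a stand-in for a real histogram class.
struct Histogram {
    double lo, hi;
    std::vector<long> bins;

    Histogram(std::size_t nbins, double lo_, double hi_)
        : lo(lo_), hi(hi_), bins(nbins, 0) {}

    void Fill(double x) {
        if (x < lo || x >= hi) return;  // ignore underflow and overflow
        std::size_t i = static_cast<std::size_t>((x - lo) / (hi - lo) * bins.size());
        if (i >= bins.size()) i = bins.size() - 1;  // guard against rounding at the upper edge
        ++bins[i];
    }

    void Add(const Histogram& other) {
        assert(other.bins.size() == bins.size());
        for (std::size_t i = 0; i < bins.size(); ++i) bins[i] += other.bins[i];
    }
};

// The "procedure": the same selection code runs on every worker.
Histogram Procedure(const std::vector<double>& data, double cut) {
    Histogram h(4, 0.0, 4.0);
    for (double x : data) {
        if (x > cut) h.Fill(x);  // selection parameter applied per entry
    }
    return h;
}

// Run the procedure on each data set in its own thread, then merge.
Histogram RunParallel(const std::vector<std::vector<double>>& datasets, double cut) {
    std::vector<std::future<Histogram>> partial;
    for (const auto& data : datasets) {
        partial.push_back(std::async(std::launch::async,
                                     [&data, cut] { return Procedure(data, cut); }));
    }
    Histogram total(4, 0.0, 4.0);
    for (auto& f : partial) total.Add(f.get());  // only small results travel back
    return total;
}
```

Only the selection parameter goes out and only the filled histograms come back, which is what makes the scheme attractive when the data sets are large and remote.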
What is a modular system ?
Modularity is a nice word. Everybody claims to be modular.
- a system with many small and independent modules?
  - where is the object bus?
  - what is the cost of assembling all the pieces in a real application?
- a hierarchical system with easily replaceable components?
  - but with many internal dependencies
What is a modular system ?
- a system with well defined interfaces?
  - where is the object bus?
  - passing data by reference or value?
  - Collections/Folders?
- a system easy to understand (user view)?
  - end users like monolithic systems doing everything
- a system easy to maintain (developer view)?
- a system that can easily be integrated into other systems?
- a theoretical system and no implementation?
Modularity is difficult to achieve in a growing system.
Modularity and Dependencies in ROOT
By dependency, we mean binary dependency, when one module (shared library) forces the loading of another library. In the past this was a weak point of the system. For example, if you wanted to produce some histograms in a batch program, you were required to link your application with all ROOT graphics libraries up to X11, as with PAW.
This problem was rightly pointed out by many users as something to be fixed. We did this. In the current system only a small set of base libraries is needed when creating e.g. histograms in batch mode. Besides the decoupling of the graphics system, many more abstract layers were introduced to decouple other parts of the system: the histogram from its painter, the tree storage system from its query mechanism (treeplayer), fitting from minuit, etc. Following this reorganization, none of the lower level libraries depends any more on higher level libraries. Besides modularity, these changes also improved overall system performance.
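The kind of decoupling described here, where a histogram does not depend on its painter, can be sketched with an abstract interface in the base layer and a painter that is registered at run time. This is a simplified illustration in standard C++, not the actual ROOT design; `VirtualPainter`, `Hist`, `SetPainterFactory`, and `TextPainter` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <string>
#include <utility>
#include <vector>

class Hist;

// Abstract layer: lives in the base library, no graphics dependency.
class VirtualPainter {
public:
    virtual ~VirtualPainter() = default;
    virtual std::string Paint(const Hist& h) = 0;
};

using PainterFactory = std::function<std::unique_ptr<VirtualPainter>()>;

// Base library: filling works without any painter being present.
class Hist {
public:
    explicit Hist(std::size_t nbins) : fBins(nbins, 0) {}

    void Fill(std::size_t bin) {
        if (bin < fBins.size()) ++fBins[bin];
    }
    const std::vector<int>& Bins() const { return fBins; }

    // A higher-level library registers how to create a painter.
    static void SetPainterFactory(PainterFactory f) { Factory() = std::move(f); }

    // Returns an empty string in batch mode, when no painter is registered.
    std::string Draw() const {
        if (!Factory()) return "";
        return Factory()()->Paint(*this);
    }

private:
    static PainterFactory& Factory() {
        static PainterFactory factory;
        return factory;
    }
    std::vector<int> fBins;
};

// Higher-level library: depends on Hist, never the other way round.
class TextPainter : public VirtualPainter {
public:
    std::string Paint(const Hist& h) override {
        std::string out;
        for (int n : h.Bins()) {
            assert(n >= 0);  // bin contents are counts
            out += std::string(static_cast<std::size_t>(n), '*') + "\n";
        }
        return out;
    }
};
```

A batch program would use only `Hist`; an interactive one would additionally call `Hist::SetPainterFactory` with a factory for `TextPainter` (or any other painter) before drawing.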
[Figure from ALICE, 13/3/2000, Software Panel, Computing Review. Axis label: Relative importance. Annotation: typically 5 years between alpha release and mature product]
ROOT Quality assurance
A growing user base
Summary
- We are implementing a powerful system designed for large scale data analysis with parallel architectures in a GRID context.
- The ROOT system is a framework providing a coherent object bus in DAQs, simulation, reconstruction and analysis phases.
- We have learnt a lot in the past 5 years, also following our 10 years of experience with PAW.
- Developing the system and at the same time supporting a rapidly growing user base is a demanding but also rewarding job.