SPIE 2006 – 6274-06
Application development using the
ALMA Common Software
G.Chiozzia, A.Caproniae, R.Ciramie,P.Di Marcantonioe, D.W.Fugated, S.Harringtonb, B.Jerama, M.Peskoc, M.Sekoranjac,
H.Sommera, V.Wanga, K.Zagarc aESO bNRAO cCosylab dU.Calgary eINAF-AOT
Contents
• What is ALMA? Status?
• What is ACS?
• Where are we?
• Main features
• Platforms
• Looking back: considerations and lessons learned
  – Free software
  – Asynchronous communication
  – Multithreading and distributed applications
  – Components and containers
  – …more in the paper
• Projects and community
• Conclusion
The ALMA project
• 50x12m antennas
• 4x12m antennas optimized for total-power (Japan)
• Compact array of 12x7m antennas for higher sensitivity to broad, low-surface brightness features (Japan)
• Antennas can be moved to ~185 different pads.
• Maximum baselines from 150m to 18km, resolutions from 1” to <0.01” at 850 µm.
Sensitivity and Angular Resolution
ALMA Sites in Chile
• Operation Support Facility (OSF)
• Antenna Operations Site (AOS)
• Santiago Central Office (SCO)
Chile, Chajnantor plateau - 5000m
ALMA AOS Technical Building, Apr 2006
ALMA Operation Site Facility (2900m – Atacama desert)
ALMA Operation Site Facility today
Antenna construction timeline
• First antenna delivered on site in Q1 2007.
• Further antennas coming from Q4 2007, with a new antenna every 2 months.
• First science: end of 2009.
• Full completion: 2012.
ALMA Computing
• Large but extremely distributed team
• 40 full-time equivalents (FTE) for the whole end-to-end (E2E) software
  => Total development effort to 2011: ~280 FTE-years
• Staff in 14 institutions across Europe, North America and Japan
What is ACS?
ACS is a software infrastructure for the development of distributed systems based on the Component/Container paradigm.
• End-to-end: from data reduction to control applications
• Common application framework and design patterns, not just libraries
• Well-tested software that avoids duplication
• Makes upgrades and maintenance reasonable
• Common development environment and tools
Where are we?
• Developed for ALMA and used by several other projects.
• ACS is based on a kernel of software contributed by Cosylab and developed for the ANKA Synchrotron. Our collaboration started in Trieste at ICALEPCS 1999.
• ½ of the allocated development effort has been spent until now.
• Total allocated: ~30 man-years, plus additional external contributions (~10).
• 10th release
• Extensively used in the field:
  – ALMA Test Interferometer and labs
  – ALMA software integrations
  – Other projects
Main Features
• ACS provides the basic services needed for object-oriented distributed computing. Among these:
  – Transparent remote object invocation
  – Object deployment and location based on a container/component model
  – Distributed error and alarm handling
  – Distributed logging
  – Distributed events
• The ACS framework is based on CORBA and built on top of free CORBA implementations.
Supported Platforms
• Operating system: Linux (Red Hat Enterprise and other flavours)
• Real-time: RTAI and VxWorks
• Languages: C++, Java, Python
• CORBA middleware: TAO (& ACE) (C++), JacORB (Java), omniORB (Python), CORBA services.
• Embedded ACS Container: PC104, Debian, 300 MHz Geode, 256 MB RAM, 256 MB flash (Cosylab microIOC)
LGPL and free software
• The strategy to provide common features to our users is:
  – Use open-source tools as much as possible, instead of implementing things ourselves.
    • Do not reinvent the wheel
    • Reuse experience of other projects
    • Do not pay for licenses
    • Support from the user community
  – Identify the best way to perform a task among the possibilities
  – Wrap with convenience and unifying APIs
• ACS is distributed under the LGPL license
• Free software also has drawbacks:
  – Fast lifecycle, with support only for the newest versions
  – Free/commercial support
  – Documentation not as good as for commercial products
Asynchronous communication
• ACS provides 4 ways to communicate between Components:
  – Synchronous method calls
  – Asynchronous method calls (callbacks)
  – Notification channel (publisher-subscriber)
  – Bulk data (high-performance streaming, 60 Mbytes/s)
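The difference between the first two styles can be sketched without any framework. This is plain Python, not the ACS API; `Antenna`, `get_position` and `get_position_async` are hypothetical names chosen for illustration.

```python
import threading


class Antenna:
    """Hypothetical component; not an ACS class."""

    def get_position(self):
        # Synchronous call: the caller blocks until the value is returned.
        return (45.0, 120.0)

    def get_position_async(self, callback):
        # Asynchronous call: return at once, deliver the result via callback.
        worker = threading.Thread(target=lambda: callback(self.get_position()))
        worker.start()
        return worker


antenna = Antenna()
sync_result = antenna.get_position()

received = []
done = threading.Event()


def on_position(value):
    # Runs in the worker thread once the result is available.
    received.append(value)
    done.set()


antenna.get_position_async(on_position)
done.wait(timeout=5)
```

In the asynchronous case the caller is free to do other work until `on_position` fires.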
Notification channel usage
• Notification Channel has been used much more than expected to:
  – Synchronize subsystems: synchronization events
  – Publish data to multiple subscribers, not known a priori
• Preferred to callbacks because easier to implement
• Very easy to use: evolution of ACS implementation classes, coding conventions and tools
• Drawbacks:
  – More difficult to keep track of dependencies
  – Circular dependencies make a system more fragile
• Stick to unidirectional dependencies! Callbacks can help reverse dependencies.
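A minimal in-process sketch of the publisher-subscriber idea, assuming nothing about the real Notification Channel API. `NotificationChannel`, `subscribe`, `publish` and the `ScanStarted` event name are all hypothetical.

```python
from collections import defaultdict


class NotificationChannel:
    """Hypothetical in-process stand-in for a notification channel."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, event_type, handler):
        self._subscribers[event_type].append(handler)

    def publish(self, event_type, payload):
        # The publisher does not know who, if anyone, is listening.
        for handler in list(self._subscribers[event_type]):
            handler(payload)


channel = NotificationChannel()
scheduler_log, archive_log = [], []
channel.subscribe("ScanStarted", scheduler_log.append)
channel.subscribe("ScanStarted", archive_log.append)
channel.publish("ScanStarted", {"scan": 1})
```

Both subscribers depend on the channel and the event type only, never on the publisher, which keeps dependencies unidirectional.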
Multithreading and distributed applications: the problems
• Component-based systems like ACS are intrinsically:
  – Highly distributed
  – Asynchronous and multithreaded
• Methods can be called at any time, in parallel to the same or other methods.
• Developers need to be well aware of concurrency issues
• Debugging is more problematic than with single threaded applications
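As a generic illustration of the concurrency issue (not ACS code), the sketch below has several client threads calling the same method at once. The hypothetical `Counter` class protects its read-modify-write with a lock.

```python
import threading


class Counter:
    """Hypothetical component state shared between concurrent callers."""

    def __init__(self):
        self._lock = threading.Lock()
        self.value = 0

    def increment(self):
        # Without the lock, the read-modify-write below could interleave.
        with self._lock:
            self.value += 1


counter = Counter()


def client():
    for _ in range(1000):
        counter.increment()


threads = [threading.Thread(target=client) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```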
Multithreading and distributed applications: the solutions
ACS addresses these problems by means of:
• Threading management classes
• Centralized logging
• Distributed error handling
• Dynamic loading/unloading of Components
• Component simulation
• Tools for automatic regression tests involving distributed Components
Components and Containers
• In recent releases we have improved the decoupling of Components and Containers:
  – Container services
  – Dynamic components
  – Tasks
• Configuration of the runtime system is being re-discussed. We need better tools to keep the configurations of the various ALMA deployments aligned.
• We have started concentrating on the deployment of Containers: requirements from the pipeline and offline subsystems.
ACS installations and projects
The ACS community
• HPT Hexapod Telescope (Germany → Chile)
• Sardinian Radio Telescope (Italy)
• OAN 30m (Spain)
• ALMA (Chile)
• ANKA (Germany)
• APEX (Chile)
Conclusion
• Core concepts are now very stable.
• Most packages are implemented, but not 100%.
• Most of the work now comes from change requests and feedback, not from long-term planning items.
• ACS is significantly improving in stability, usability and global consistency.
• Analysis of the applications developed and of the feedback from the developers tells us what is good, what is bad and how to improve. More in the paper!
Having a user base in addition to our main project has provided important feedback and cross-fertilization of concepts and ideas, and has contributed to software quality.
Questions (& Answers)
http://www.hq.eso.org/projects/alma/
http://www.eso.org/projects/alma/develop/acs
SPIE Papers on ALMA/ACS
• 6267-02 The Atacama Large Millimeter Array: overview and status
• 6274-06 Application development using the ALMA common software
• 6274-07 Integrating the CERN laser alarm system with the ALMA common software
• 6274-45 Time synchronization within the ALMA software infrastructure
• 6274-54 Bulk data transfer distributer: a high performance multicast model in ALMA ACS
• SC-644 Course: An Introduction to Scalable Frameworks for Observatory Software Infrastructure
• 6267-47 Phase correction studies for ALMA
• 6271-14 System engineering in the ALMA project
Reserve slides
Observing in different bands
• 10 Frequency bands coincident with atmospheric windows have been defined.
• Bands 3, 6, 7 and 9 will be available from the start.
• Bands 4, 8 and 10 will be built by Japan.
• Some band 5 receivers built with EU funding.
Highlights of the Last Two Years
New in ACS for:
• Developers, to use in their code: libraries, convenience classes and utilities that improve code quality
• Testers, integrators and administrators: features that work transparently to application code (i.e. in principle transparent to Component developers)
Component Container Evolution/Cleanup
• Container Services
  – Full separation between Container and Container Services
  – Cleaner interfaces
  – Easier to replace the Container implementation
  – The most important services are now provided by the ContainerServices…
• Component life cycle
  – Plain instantiation of Components not sufficient
  – Standard lifecycle state machine introduced for the Container to manage Components
[Diagram: a container hosting components (Comp). Lifecycle interface: init(), cleanUp(). Functional interface: observe(). Container service interface: getComponent(“CompB”); Logger getLogger();]
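The three interfaces in the diagram can be sketched in a few lines. This is a simplified Python analogue under stated assumptions, not the ACS implementation: the class layout, `activate` and `deactivate` are hypothetical, and only the method names `init`, `cleanUp` (here `clean_up`), `observe`, `getComponent` and `getLogger` come from the slide.

```python
import logging


class ContainerServices:
    """Hypothetical services object handed to each component."""

    def __init__(self, container, name):
        self._container = container
        self._name = name

    def get_component(self, name):
        return self._container.components[name]

    def get_logger(self):
        return logging.getLogger(self._name)


class Component:
    """Base class carrying the lifecycle interface only."""

    def init(self, services):
        self.services = services

    def clean_up(self):
        self.services = None


class Observer(Component):
    def observe(self):
        # Functional interface: what the component actually does.
        return "observing"


class Container:
    def __init__(self):
        self.components = {}

    def activate(self, name, cls):
        # The container, not the client, drives the lifecycle.
        component = cls()
        component.init(ContainerServices(self, name))
        self.components[name] = component
        return component

    def deactivate(self, name):
        self.components.pop(name).clean_up()


container = Container()
comp_b = container.activate("CompB", Observer)
comp_a = container.activate("CompA", Observer)
found = comp_a.services.get_component("CompB")
```

Because components only see `ContainerServices`, the container implementation behind it can be replaced without touching component code.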
Master Component
• ALMA subsystems interact with the Executive.
• The Executive treats them all in the same way.
• Lifecycle for subsystems, not only components
• Fits smoothly into the ACS concept:
  – each subsystem needs a Master Component
  – it is a component with a specific interface
  – ACS defines the underlying state machine
• Implementation:
  – a generator (using openArchitectureWare) maps UML to state machines
  – the generator creates convenience base classes
  – the state machine has been refined in a couple of design iterations
• The introduction of the Master Component has been very effective.
• Cost of the prototype generator not higher than the cost of developing the Master Component in code
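A table-driven state machine gives a rough idea of what such a Master Component enforces. The states and events below are invented for illustration; the slides do not list the actual ACS state machine, and the real one is generated from UML rather than hand-written.

```python
class MasterComponent:
    """Hypothetical subsystem state machine; states and events are illustrative."""

    TRANSITIONS = {
        ("SHUTDOWN", "init"): "INITIALIZED",
        ("INITIALIZED", "start"): "OPERATIONAL",
        ("OPERATIONAL", "stop"): "INITIALIZED",
        ("INITIALIZED", "shutdown"): "SHUTDOWN",
    }

    def __init__(self):
        self.state = "SHUTDOWN"

    def handle(self, event):
        # Only transitions listed in the table are legal.
        key = (self.state, event)
        if key not in self.TRANSITIONS:
            raise ValueError(f"event {event!r} not allowed in state {self.state}")
        self.state = self.TRANSITIONS[key]
        return self.state


master = MasterComponent()
master.handle("init")
master.handle("start")
```

With every subsystem exposing the same events, a supervisor can drive all of them through one code path.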
Event Handling and Notification Channel
• Events are widely used in ALMA for synchronization and asynchronous, *-to-* communication.
• Decoupling of Consumers and Suppliers
• Very easy interface:
  – Supplier classes
  – Consumer classes
• Contract based on IDL data structures.
• Strong naming conventions and checking tools
• Administrative interface
• Quality of service
[Figure: High-level Usage of ALMA Events. Elements shown: ALMA Operator (Java); CAN bus devices (C++); ALMA prototype antenna at the ATF; Pipeline processes (Python); Observation Scheduler (Java); Telescope Calibration (C++); Astronomer Consuming ALMA Events as They Occur; Secured Host for CORBA Notification and Naming Service; CORBA Notification Service Running On an Unsecured Server.]
Threading Support
• Many Components have a multi-threaded structure
• Management of threads was a source of problems
• Developed easy-to-use threading classes:
  – Override a run() method
  – Use the thread manager
• Based on ACE Threads in C++, concurrent library in Java
[Class diagram: a Component has a ThreadManager (create<ThreadClass>(), add(), destroy(), suspendAll(), resumeAll(), terminateAll()) that manages 0..n Thread objects (suspend(), resume(), terminate(), onStart(), onStop()). ControlLoopThread (onStart(), runLoop(), onStop()) and OneShotThread (onStart(), run(), onStop()) specialize Thread.]
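The pattern in the diagram can be approximated with the Python standard library. This is an analogue for illustration, not the ACS threading classes: names are adapted to Python style, the loop period is an assumed parameter, and `Poller` is a hypothetical user thread.

```python
import threading
import time


class ControlLoopThread(threading.Thread):
    """Calls run_loop() repeatedly until terminate() is requested."""

    def __init__(self, period_s=0.01):
        super().__init__(daemon=True)
        self._period_s = period_s
        self._stop_requested = threading.Event()

    def on_start(self):
        pass

    def run_loop(self):
        raise NotImplementedError

    def on_stop(self):
        pass

    def run(self):
        self.on_start()
        while not self._stop_requested.is_set():
            self.run_loop()
            self._stop_requested.wait(self._period_s)
        self.on_stop()

    def terminate(self):
        self._stop_requested.set()
        self.join()


class ThreadManager:
    """Creates threads and stops them all together."""

    def __init__(self):
        self._threads = []

    def create(self, thread_class, *args, **kwargs):
        thread = thread_class(*args, **kwargs)
        self._threads.append(thread)
        thread.start()
        return thread

    def terminate_all(self):
        for thread in self._threads:
            thread.terminate()
        self._threads.clear()


class Poller(ControlLoopThread):
    def __init__(self):
        super().__init__()
        self.ticks = 0
        self.stopped = False

    def run_loop(self):
        self.ticks += 1

    def on_stop(self):
        self.stopped = True


manager = ThreadManager()
poller = manager.create(Poller)
while poller.ticks < 3:
    time.sleep(0.001)
manager.terminate_all()
```

The user code only overrides `run_loop()` and the optional hooks; starting, stopping and joining stay in one place.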
Concurrency patterns
• Concurrent execution is allowed. Access to resources is protected.
• The newly invoked request shall be rejected
• The newly invoked request shall supersede the old one
• The newly invoked request is queued and executed after completion of the previous one
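The last three policies can be sketched with standard-library primitives. All class names here are hypothetical, and the code only illustrates the policies; it is not how ACS implements them. Note that `threading.Lock` serializes callers but does not guarantee first-come-first-served order.

```python
import threading


class Busy(Exception):
    """Raised when a request is rejected."""


class RejectingAction:
    """Reject a new request while another one is running."""

    def __init__(self):
        self._lock = threading.Lock()

    def invoke(self, work):
        if not self._lock.acquire(blocking=False):
            raise Busy("a request is already running")
        try:
            return work()
        finally:
            self._lock.release()


class QueueingAction:
    """Make a new request wait until the running one has completed."""

    def __init__(self):
        self._lock = threading.Lock()

    def invoke(self, work):
        with self._lock:
            return work()


class SupersedingAction:
    """Signal the previous request to stop, then run the new one."""

    def __init__(self):
        self._lock = threading.Lock()
        self._cancel_current = None

    def invoke(self, work):
        cancel = threading.Event()
        with self._lock:
            if self._cancel_current is not None:
                self._cancel_current.set()
            self._cancel_current = cancel
        # work is expected to poll cancel.is_set() and stop early if set.
        return work(cancel)


rejecting = RejectingAction()


def outer():
    # A nested request arrives while the outer one is still running.
    try:
        return rejecting.invoke(lambda: "inner ran")
    except Busy:
        return "inner rejected"


outcome = rejecting.invoke(outer)

superseding = SupersedingAction()
first = superseding.invoke(lambda cancel: cancel)
second = superseding.invoke(lambda cancel: cancel)
```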
Real-time Support
A change in paradigm!
VxWorks → TICS
• Entire LCU in real-time OS
• ACS provides complete Container/Component in real-time environment
• Support will probably remain for other projects (VLT)
RTAI → ALMA
• RT kernel inside Linux
• Component not real-time ↔ small time-critical functions in kernel
• Less code in real-time, but more complex to debug
• ACS provides easy:
  – Communication with kernel modules
  – Logging from kernel modules
  – Kernel module management
Bulk Data Transfer
• Requirements from the correlator:
  – 64 MB (megabyte)/sec
• Based on CORBA A/V streaming service
• TAO C++ implementation
• Very easy interface, based on our use cases
• No CORBA A/V visible
Achieved Performance
• Gigabit P2P Ethernet
• BD throughput around 800 Mbit/s (~100 MB/s): requirements fulfilled
• CORBA throughput around 500 Mbit/s (~55 MB/s)
• Estimated gain in the throughput around 30%
Simulation
• Why simulation?
  – Distributed development
  – Features or entire subsystems not yet available
  – Test a subsystem in isolation
• Simulation of Components from IDL interface spec.
• Dumb default or “intelligent” simulation
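The idea of deriving a simulator from an interface can be sketched in Python, with an annotated class standing in for the IDL specification. `Mount`, `simulate` and the default-value table are hypothetical; the actual ACS simulator works from IDL and is not shown in the slides.

```python
import inspect


class Mount:
    """Hypothetical interface, standing in for an IDL definition."""

    def get_azimuth(self) -> float: ...

    def get_name(self) -> str: ...

    def is_parked(self) -> bool: ...


# "Dumb" defaults, chosen by declared return type.
DEFAULTS = {float: 0.0, int: 0, str: "", bool: False}


def simulate(interface, overrides=None):
    """Build an object implementing every method of the interface."""
    overrides = overrides or {}
    methods = {}
    for name, func in inspect.getmembers(interface, inspect.isfunction):
        return_type = inspect.signature(func).return_annotation
        default = DEFAULTS.get(return_type)
        # An override supplies "intelligent" behaviour for one method.
        methods[name] = overrides.get(name, lambda self, _value=default: _value)
    return type("Simulated" + interface.__name__, (interface,), methods)()


dumb = simulate(Mount)
smart = simulate(Mount, {"get_azimuth": lambda self: 123.4})
```

Clients written against `Mount` can run against either object while the real implementation is still unavailable.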
CERN Laser
ACS Alarm System: Laser
• Feasibility prototype to re-use the CERN Laser Alarm System
• Keep the same API
• Reuse the Laser Alarm Console
• ACS Component/Container replaces J2EE
• acsjms implements JMS for ACS on top of the Notification Channel
• IDL interfaces replace EJB interfaces
• Reimplementation of Laser interfaces
The challenge: reuse a complete subsystem/service in a very different software infrastructure
Tasks and Parameters
• ACS is also used in ALMA as the data reduction infrastructure framework
• Requirement: data reduction to be started as a stand-alone process. A program which starts up, performs processing and shuts down.
• Implementation:
  – Stand-alone executable
  – Static container
  – Works with and without ACS suite
• Input parameters are complex data sets:
  – Parameter set definition (xml)
  – Parameter set instantiation (xml)
  – Validation
  – Parsing
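The split between a parameter set definition and its instantiation can be illustrated as follows. The XML element and attribute names are invented for this sketch and do not reflect the actual ACS parameter set schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical definition: which parameters exist, their types and defaults.
DEFINITION = """
<psetdef>
  <param name="inputFile" type="string" required="true"/>
  <param name="iterations" type="int" required="false" default="10"/>
</psetdef>
"""

# Hypothetical instantiation: the values supplied for one run.
INSTANCE = """
<pset>
  <param name="inputFile">scan42.dat</param>
</pset>
"""

CASTS = {"string": str, "int": int, "double": float}


def load_parameters(definition_xml, instance_xml):
    """Validate an instance against its definition and return typed values."""
    given = {p.get("name"): (p.text or "").strip()
             for p in ET.fromstring(instance_xml).findall("param")}
    result = {}
    for spec in ET.fromstring(definition_xml).findall("param"):
        name = spec.get("name")
        cast = CASTS[spec.get("type")]
        if name in given:
            result[name] = cast(given[name])
        elif spec.get("required") == "true":
            raise ValueError(f"missing required parameter {name!r}")
        else:
            result[name] = cast(spec.get("default"))
    unknown = set(given) - set(result)
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    return result


params = load_parameters(DEFINITION, INSTANCE)
```

A stand-alone task would run this step at start-up, before any processing begins.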
GUIs
• A LabView prototype has been implemented
• ACS supports ABeans development with an Eclipse plug-in
• Some projects are using Qt
Different projects and different subsystems have different requirements!
Performance and Benchmarking
• ACS has performance requirements to satisfy
• Changes to code and upgrade of external libraries can affect performance
• Created performance measurements and reporting framework
• Performance of Component to Component communication, notification channel, logging system…
http://www.eso.org/~almamgr/AlmaAcs/Performance/BenchmarkDoc/
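A throughput measurement of this kind can be sketched generically. `measure_throughput` is a hypothetical helper, not part of the ACS benchmarking framework, and the list `sink` stands in for a real event channel or logger.

```python
import time


def measure_throughput(send, payload, messages=1000, runs=5):
    """Return average, best and worst rates in messages per second."""
    rates = []
    for _ in range(runs):
        start = time.perf_counter()
        for _ in range(messages):
            send(payload)
        # Guard against a zero interval on very coarse clocks.
        elapsed = max(time.perf_counter() - start, 1e-9)
        rates.append(messages / elapsed)
    return {
        "average": sum(rates) / len(rates),
        "best": max(rates),
        "worst": min(rates),
    }


sink = []
report = measure_throughput(sink.append, b"x" * 100)
```

Running the same measurement after each code change or library upgrade makes performance regressions visible.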
Performance
[Chart: Event Channel Throughput (100 bytes). Average, best and worst throughput for cpp, py and java.]
[Chart: C++ Logging. Logs per second (average, best and worst throughput) for message sizes of 10, 25, 100, 1000 and 5000 bytes.]
Average C++ throughput: 1500 event/s (100 bytes)
Average C++ throughput: 3500 event/s (100 bytes)