ATLAS OpenLab Ideas
Mario Lassnig (CERN), Sami Kama (SMU), Eric Lancon (CEA Saclay), Simone Campana (CERN), Graeme Stewart (U Glasgow), Walter Lampl (U Arizona)
[email protected], [email protected]
Disclaimer
● These slides represent
  … a collection of preliminary ideas
  … by a limited number of people
  … from various ATLAS activities
● We consider these topics
  … long-term,
  … research-driven,
  … and disruptive
Introduction
As you probably already know, upgrades to the LHC will lead to
● More collisions and more data in the next run periods
● Which means more computing and more storage requirements
● Which can be addressed by three things
The challenge is to achieve more computing and storage with a budget similar to or smaller than today's
Topics
● Computing on detector
● Platform independence
  ○ New hardware
● Software tools
  ○ Training
  ○ Profiling
  ○ Checking/validation
● Networking
● Machine learning
Computing on Detector
● Future (and some current) detectors contain embedded compute resources based on commodity chips (such as FPGAs) with custom designs
  ○ Custom programs
  ○ Hard to debug/design/change
● Would it be possible to use radiation-hard/-tolerant, extremely low-power generic chips that support common computing standards?
  ○ IoT chips?
  ○ Fast interconnects
● Use the detector itself as a massively parallel computer?
Platform independence
● Most LHC code is bound to the x86 architecture, which leaves out
  ○ GPGPUs
  ○ ARM
  ○ PowerPC
  ○ Other architectures?
● Multitude of APIs, both open and proprietary
  ○ Vulkan, OpenCL, OpenMP, CUDA, Mantle, …
● Porting software to such platforms will enable more compute resources — see the sketch below
  ○ Geant4 on Xeon Phis
  ○ Many-core ARM racks
  ○ Backfill mode on HPCs
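As a minimal illustration of platform-portable code (Python is used for all sketches in this deck; the variable names are hypothetical), the same array kernel can run on any CPU architecture that NumPy supports, or on NVIDIA GPUs via CuPy, because the two libraries share one array API:

```python
# A sketch of architecture-portable array code: the same kernel runs on
# x86/ARM/POWER CPUs via NumPy, or on NVIDIA GPUs via CuPy, because the two
# libraries share the same array API. The inputs are hypothetical
# placeholders, not real ATLAS data structures.
import numpy as np

try:
    import cupy as xp  # GPU backend, if available
except ImportError:
    xp = np            # CPU fallback on any architecture NumPy supports

def transverse_momentum(px, py):
    """Element-wise pT = sqrt(px^2 + py^2), vectorised by the backend."""
    return xp.sqrt(px * px + py * py)

px = xp.asarray(np.random.rand(1_000_000))
py = xp.asarray(np.random.rand(1_000_000))
pt = transverse_momentum(px, py)
```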
Software tools
● Better utilization of existing platforms, including training
  ○ Vectorization
  ○ Threading
● Profilers/tools to guide design choices — see the profiling sketch below
  ○ Intel tools, GooDa, OProfile, …
  ○ Others?
● Compilers
  ○ Clang, Intel
  ○ Others?
● Static checkers (Coverity), continuous builds, dead/obsolete-code identification, memory profilers?
● Version control, code review, …
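A minimal sketch of profiling to guide design choices, using Python's built-in cProfile as a stand-in for the heavier tools listed above; reconstruct_events is a hypothetical placeholder for any expensive workload:

```python
# Profile a hot function with the standard-library cProfile to find where
# time is actually spent before optimising. reconstruct_events is a
# hypothetical stand-in for an expensive workload.
import cProfile
import pstats

def reconstruct_events(n):
    # Deliberately naive loop; a profiler would flag it as the hotspot.
    total = 0.0
    for i in range(n):
        total += (i % 97) ** 0.5
    return total

profiler = cProfile.Profile()
profiler.enable()
reconstruct_events(1_000_000)
profiler.disable()

# Print the ten most expensive call sites, sorted by cumulative time.
stats = pstats.Stats(profiler).sort_stats("cumulative")
stats.print_stats(10)
```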
Networking
● The network is currently mostly used to move data between storage systems
  ○ Jobs run where the data is located
  ○ Failover to retrieve data from remote sites is possible, though not the primary access path
● Two potential improvements that go hand in hand
  ○ Improve the distribution of data to free up network resources
  ○ Make extensive use of the now-available network resources for remote data access
● Two focus areas
  ○ Long flows: content-delivery networks (CDNs)
  ○ Short flows: edge computing
Networking — CDNs
● Multimedia companies are the incumbent users and innovators of CDNs
  ○ YouTube/Netflix (video on demand), Steam/Origin (videogames), Twitch/NVIDIA GRID (video broadcasting), …
  ○ Data volumes are comparable to our datasets, and sometimes even exceed them!
  ○ But we don't have a billion-USD computing budget …
● Learn from industry best practices
  ○ Geographically dispersed caching strategies based on social metrics, not technical ones
  ○ Multicast of datasets to multiple destinations — feasible over the WAN? LAN?
  ○ Approximate correctness — how to deal with partially available / in-flight files?
● Why don't we just do X like company Y?
  ○ Our macro access patterns do not follow the common assumptions — neither a power law nor Zipf's law
  ○ Large-scale LRU-style caching is thus not effective — see the sketch below
    ■ Data delivery is strongly dependent on analysis workflows
    ■ If data has been read recently, it is most likely not needed again for some time
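A toy sketch of the intuition above: when "recently read" implies "not needed again soon", evicting the most recently used entry (MRU) can outperform classic LRU. This is for illustration only, under the stated access-pattern assumption, and is not the production placement/popularity system:

```python
# Toy cache that evicts the *most* recently used entry instead of the least
# recently used one, matching workloads where recently read data is unlikely
# to be re-read soon.
from collections import OrderedDict

class MRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()  # oldest-accessed first, newest last

    def get(self, dataset):
        if dataset not in self.entries:
            return None
        self.entries.move_to_end(dataset)  # mark as most recently used
        return self.entries[dataset]

    def put(self, dataset, replica):
        if dataset in self.entries:
            self.entries.move_to_end(dataset)
        elif len(self.entries) >= self.capacity:
            self.entries.popitem(last=True)  # evict the MRU entry, not the LRU one
        self.entries[dataset] = replica
```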
Networking — Edge computing 1/2
● Moving towards a remote-IO world requires an even better network
  ○ Make the network aware of the storage and computation demands
  ○ For many data access paths, we only need parts of the file
● All our data access software has intelligence built in
  ○ Usually, data is serialised hierarchically using ROOT trees
  ○ Access-path libraries are highly optimised — only read selected branches or single events (see the sketch below)
● Can we offload some of this intelligence into the hardware?
  ○ Reduce latency and increase bandwidth for random IO
  ○ Bring the data closer to the NIC — cache the file on/close to the NIC via 3D NAND
  ○ Programmable NICs — match the structure of the 3D NAND to the layout of the serialised bytestream
  ○ Edge computing — no need to go "into" the datacentre anymore for the duration of file interaction
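A sketch of the "only read what you need" access pattern described above, using the uproot library purely as a present-day illustration; the URL, tree name, and branch names are hypothetical placeholders:

```python
# uproot reads ROOT trees over local or remote (HTTP/XRootD) paths and
# fetches only the requested branches/entries, rather than the whole file.
# File URL, tree name, and branch names below are hypothetical.
import uproot

with uproot.open("root://some.site.example//data/events.root") as f:
    tree = f["Events"]
    # Read two branches for the first 1000 events only — the library
    # requests just those byte ranges instead of the full bytestream.
    arrays = tree.arrays(["Muon_pt", "Muon_eta"], entry_start=0, entry_stop=1000)
```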
Networking — Edge computing 2/2
● We are not benefitting from software-defined networks (SDNs) at the moment
● Our WANs are optimised for long flows with large MTUs
● Short flows for remote random IO are disruptive
  ○ IP fragmentation
  ○ Packet reordering
● The result: decreased bandwidth and increased latency
● Could a NIC/switch/NIC infrastructure automatically respond to short flows? (see the sketch below)
  ○ Traffic shaping
  ○ VLAN tagging
  ○ IETF Geneve — automatic network virtualisation
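One traffic-shaping hook already available from the host side, as a minimal sketch: marking a flow's packets with a DSCP value via the IP TOS byte, so that switches configured for DSCP-based queueing can treat short flows differently. Whether the WAN actually honours the marking is an assumption, and the endpoint is a placeholder:

```python
# Mark a TCP flow with a DSCP value (here AF21, as an example) so that
# DSCP-aware switches can shape it. Host and port are placeholders.
import socket

DSCP_AF21 = 0x12   # DSCP 18; the TOS byte carries DSCP in its upper 6 bits
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, DSCP_AF21 << 2)
sock.connect(("storage.example.org", 1094))  # all packets on this flow now carry the mark
```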
Machine Learning
● Machine Learning for physics analysis
  ○ E.g., track and vertex reconstruction, physics-process and entity classification, …
  ○ Follow the activities here: http://iml.cern.ch
  ○ Next workshop: http://indico.cern.ch/event/395374/
● Machine Learning for computing
  ○ Improve resource usage to reduce costs
  ○ "Easy" to do in constrained environments — global heterogeneous systems are a different story
  ○ Three main areas
    ■ Storage — dynamic data placement
    ■ Network — data access paths
    ■ Human — operational support 24/7
Machine Learning — Storage and network
● Optimum placement of files given a storage-volume constraint
  ○ Solved theoretically some time in the 1960s
  ○ … but why is it still difficult?
● We do not have full system-state knowledge for optimal point-in-time decisions
  ○ Our current dynamic data placement / popularity system does not handle all of our cases
  ○ Approximate a "good" solution — unfortunately, there are orthogonal notions of "good"
● How can machine learning help? — see the sketch below
  ○ Classification — e.g., which data should be replicated more often, which data should be deleted?
  ○ Regression — e.g., what is the expected bandwidth for a link? The expected IO rate for a storage system?
  ○ Clustering — e.g., which data should be kept together for a particular analysis?
  ○ Reduction — e.g., which infrastructure metrics influence the placement policy the most?
● Eventually, a hybrid ML model using all these results should converge towards
  ○ the least amount of data that has to be stored resident,
  ○ so we can use the remaining capacity for cached/on-demand compute
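A sketch of the classification idea above: predict whether a dataset should get an extra replica from simple popularity/infrastructure features. The feature names and training data are entirely hypothetical; in practice they would come from the popularity and transfer systems:

```python
# Classify datasets as "replicate more" vs "candidate for deletion" from
# hypothetical features. Data and labels below are illustrative only.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Features per dataset: [accesses last week, size in TB, age in days]
X_train = np.array([[120, 1.5,  10],
                    [  2, 8.0, 400],
                    [ 45, 0.3,  30],
                    [  0, 2.2, 900]])
y_train = np.array([1, 0, 1, 0])   # 1 = replicate more, 0 = candidate for deletion

model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

X_new = np.array([[80, 0.9, 15]])
print(model.predict(X_new))         # -> [1], i.e. replicate
print(model.predict_proba(X_new))   # class probabilities, for a softer policy
```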
Machine Learning — Human
● Emergent properties are our daily business — the result of complex systems
  ○ System A degrades, System B fails as a result, System C tries to recover B's failure but causes …
● Manual post-facto analysis, post-mortems, and operational procedures take time
● Through supervised or reinforcement learning, teach the system to
  ○ Automate anomaly detection — see the sketch below
  ○ Escalate preventive measures through anomaly prediction
  ○ Submit early warnings to shifters/operators
● Through unsupervised learning
  ○ Detect correlations between system behaviours
  ○ Classify categories for reactive operations
● Large overlap with the data mining community — cooperation will be very welcome
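An unsupervised starting point for the anomaly detection mentioned above, as a minimal sketch: flag outlying time bins in operational metrics with an isolation forest. The metrics matrix is hypothetical (e.g. transfer rate, failure rate, queue depth per time bin):

```python
# Flag anomalous time bins in (hypothetical) operational metrics with an
# isolation forest; flagged bins could feed early warnings to shifters.
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
# 500 normal time bins of [transfer rate, failure rate, queue depth]
normal = rng.normal(loc=[100.0, 0.02, 50.0], scale=[10.0, 0.005, 5.0], size=(500, 3))
incident = np.array([[20.0, 0.40, 300.0]])   # degraded system, queues piling up
metrics = np.vstack([normal, incident])

detector = IsolationForest(contamination=0.01, random_state=0)
labels = detector.fit_predict(metrics)       # -1 = anomaly, 1 = normal
print(np.where(labels == -1)[0])             # indices of flagged time bins
```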
THANK YOU