DESCRIPTION
Today, data drives discovery, and discoveries are key to creating sustained advantages. The better your critical workflows can create and access data, the better you’ll be able to discover new, innovative solutions to important problems or create entirely new products. More than ever before, data-intensive applications need the sustained performance and virtually unlimited scalability that only parallel storage software delivers. Designed for maximum performance and scale, storage solutions powered by Lustre software deliver the performance at scale to meet today’s storage requirements. As the most widely used parallel storage system for HPC, Lustre-powered storage is the ideal storage foundation. But scalable, high-performance storage by itself solves only half the problem. Today’s users expect storage solutions that deliver sustained performance, scale upward to near-limitless capacities, and are simple to install and manage. Intel® Enterprise Edition for Lustre* software combines the straight-line speed and scale of Lustre with the bottom-line need for lower management complexity and cost. As the recognized leader in the development and support of the Lustre file system, Intel has the expertise to make storage solutions for data-intensive applications faster, smarter and easier.
Intel Confidential — Do Not Forward
The Importance of Fast, Scalable Storage for Today’s HPC
Bill Webster
High Performance Data Division, Intel Corporation
Some Data About Data…
2.5 quadrillion bytes of data created daily¹
>80% of today’s data is unstructured
~90% of the world’s data has been created within the last 2 years
¹ Source: IBM
The Case for Fast, Scalable Storage
Solving important problems drives technology investments
Fast storage is critical for maximum application performance
Lustre software was created for performance at large scale
Storage fueled by Lustre* is stable, flexible and highly efficient
Lustre is the most widely used parallel storage for HPC¹
Over 60% of the fastest 100 HPC sites worldwide rely on Lustre²
¹ Source: IDC research
² Intel analysis of www.top500.org rankings, December 2013
* Some names and brands may be claimed as the property of others.
HPC Places Unique Demands on Storage
• Workloads are diverse and dynamic, and applications are compute- or data-intensive – often both
• The value of HPC storage is measured by speed, scale & IOPS
• To meet these requirements, HPC storage needs to:
  • Scale out for increased I/O and capacity
  • Perform I/O in parallel for maximum throughput
  • Support a virtually unlimited number of clients
• Commercial “HPC” needs the same level of performance
Lustre was architected for speed, scale and IOPS
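The "perform I/O in parallel" requirement on this slide is commonly met by striping a file across several storage targets, so that one large read or write is served by many targets at once. The sketch below illustrates the idea under a simple assumption of round-robin (RAID-0 style) placement. The function names, the 1 MiB stripe size and the stripe count of 4 are illustrative choices, not values taken from the slides or from any Lustre API.

```python
MIB = 1024 * 1024


def stripe_location(offset, stripe_size, stripe_count):
    """Map a file byte offset to (target index, byte offset inside that target's object).

    Assumes plain round-robin striping; all parameters are illustrative.
    """
    if offset < 0 or stripe_size <= 0 or stripe_count <= 0:
        raise ValueError("offset must be >= 0; stripe_size and stripe_count must be > 0")
    stripe_number = offset // stripe_size        # which stripe-sized chunk of the file
    target_index = stripe_number % stripe_count  # chunks rotate across the targets
    full_rounds = stripe_number // stripe_count  # chunks already stored on this target
    object_offset = full_rounds * stripe_size + offset % stripe_size
    return target_index, object_offset


def targets_touched(offset, length, stripe_size, stripe_count):
    """Return the sorted target indices that a read of `length` bytes at `offset` involves."""
    if length <= 0:
        return []
    first = offset // stripe_size
    last = (offset + length - 1) // stripe_size
    chunks = min(last - first + 1, stripe_count)  # beyond this, targets only repeat
    return sorted({(first + i) % stripe_count for i in range(chunks)})


# A 4 MiB read from the start of a file striped over 4 targets involves all 4 of them.
busy_targets = targets_touched(0, 4 * MIB, MIB, 4)
```

In this model a read that spans at least `stripe_count` chunks keeps every target busy, which is why aggregate throughput can grow as targets are added.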
HPC Storage Software
Introducing the Lustre file system
What is Lustre*?
Open source, distributed, parallel, clustered file system
Designed for maximum performance at massive scale
POSIX compliant – key for supporting applications
Global, shared name space – all clients can access all data
Very resource efficient and cost effective
What Makes Lustre* So Important?
Purpose-built for speed and scale
Speed: Unmatched performance
Openness: Choice of storage platforms
Efficiency: Achieves 90%+ utilization of storage resources
Affordable: Low CAPEX and OPEX
Scale-out: Independently scale storage capacity and bandwidth
Stable and reliable: Backed by Intel, the worldwide leader in Lustre support
Good Fit Applications for Lustre*…
Financial analysis – modeling risk exposure & portfolio valuation
Geosciences – weather forecasting and climate modeling
Bioinformatics – genomics, proteomics, drug discovery
Energy – exploration, reservoir modeling, wind energy
Engineering – CAE, CFD and FEA for aerospace, automotive
SCIENCE ANALYTICS ENGINEERING
What Does a Lustre* Solution Look Like?
[Architecture diagram; labeled components:]
• Management Network
• High Performance Data Network (InfiniBand, 10GbE)
• Metadata Servers
• Object Storage Servers
• Object Storage Targets (OSTs)
• Metadata Target (MDT)
• Management Target (MGT)
• Intel Manager for Lustre* (requires Enterprise Edition)
• Lustre Clients – diskless compute servers
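The division of labor in this architecture can be sketched as a toy model: a client asks the metadata side once how a file is laid out, then fetches the data chunks directly from the object storage side in parallel. Everything below is a hypothetical in-memory stand-in written for illustration (class names, the 4-byte chunk size, the file name). It is not the Lustre client API, and it assumes simple round-robin chunk placement.

```python
from concurrent.futures import ThreadPoolExecutor

CHUNK_SIZE = 4  # bytes per chunk; deliberately tiny so the example is easy to follow


class MetadataServer:
    """Toy stand-in: records how many chunks each file has; never holds file data."""

    def __init__(self):
        self.layouts = {}

    def record(self, name, chunk_count):
        self.layouts[name] = chunk_count

    def lookup(self, name):
        return self.layouts[name]


class ObjectStorageTarget:
    """Toy stand-in: holds its share of chunks, keyed by (file name, chunk index)."""

    def __init__(self):
        self.chunks = {}


def write_file(mds, osts, name, data):
    pieces = [data[i:i + CHUNK_SIZE] for i in range(0, len(data), CHUNK_SIZE)]
    for index, piece in enumerate(pieces):
        osts[index % len(osts)].chunks[(name, index)] = piece  # round-robin placement
    mds.record(name, len(pieces))


def read_file(mds, osts, name):
    chunk_count = mds.lookup(name)  # one metadata lookup per file

    def fetch(index):
        return osts[index % len(osts)].chunks[(name, index)]

    # Data requests bypass the metadata server and run against several targets at once.
    with ThreadPoolExecutor(max_workers=len(osts)) as pool:
        return b"".join(pool.map(fetch, range(chunk_count)))


mds = MetadataServer()
osts = [ObjectStorageTarget() for _ in range(3)]
write_file(mds, osts, "genome.dat", b"ACGTACGTACGTAC")
restored = read_file(mds, osts, "genome.dat")
```

Keeping metadata and bulk data on separate servers is what lets capacity and bandwidth grow by adding object storage servers without routing every byte through a single node.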
Management Servers
[Architecture diagram repeated, callout 1]
Storage Servers
[Architecture diagram repeated, callout 2]
Compute clients
[Architecture diagram repeated, callout 3]
Interconnect fabric
[Architecture diagram repeated, callout 4]
The Results? Fast, scalable storage & I/O
• Over 2 TB/s achieved
• 500-750 GB/s production
• 80,000+ IO/s
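To put the production bandwidth range on this slide in perspective, a back-of-the-envelope calculation helps. The sketch below estimates how long it would take to stream a data set at 500 GB/s and at 750 GB/s. The 100 TB data set size is a hypothetical figure chosen for illustration, decimal units are assumed (1 GB = 10^9 bytes), and the result ignores metadata overhead and contention.

```python
GB = 10**9
TB = 10**12


def transfer_seconds(size_bytes, bandwidth_bytes_per_second):
    """Idealized streaming time: size divided by sustained bandwidth."""
    if bandwidth_bytes_per_second <= 0:
        raise ValueError("bandwidth must be positive")
    return size_bytes / bandwidth_bytes_per_second


dataset_bytes = 100 * TB  # hypothetical data set size, not from the slides

slow_case = transfer_seconds(dataset_bytes, 500 * GB)  # lower end of the quoted range
fast_case = transfer_seconds(dataset_bytes, 750 * GB)  # upper end of the quoted range

print(f"100 TB at 500 GB/s: {slow_case:.0f} s")
print(f"100 TB at 750 GB/s: {fast_case:.0f} s")
```

Under these assumptions the full data set streams in roughly two to three and a half minutes.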
Intel® Lustre Solutions
Enterprise Edition for Lustre* software
Intel® Enterprise Edition for Lustre*
Intel® Manager for Lustre is the heart of all Intel EE for Lustre-based solutions.
Intel® Manager for Lustre*
The ‘dashboard’ canvas displays a variety of charts that illustrate performance levels and resource utilization.
Visual system status indicator
Configure, create and optimize Lustre file systems
Intelligent, intuitive logging – understand how your storage is performing quickly and easily
A word about Big Data.
The Convergence of HPC and Big Data
• Big Data problems are getting larger
  • More compute power. More files. More capacity and data throughput
• MapReduce workloads are being added to HPC environments
  • 1 in 3 HPC sites have deployed Hadoop¹
• But MapReduce workloads run differently than typical HPC applications
  • Compute nodes are diskless – no local storage
  • By default, Hadoop expects local storage within each node
• Lustre storage accelerates the value of Hadoop
  • Improves application performance
  • Boosts storage efficiency and lowers management complexity
¹ Source: IDC research
Intel® Enterprise Edition for Lustre* software
Includes the Hadoop ‘adapter’ for Lustre
• Replacement for HDFS
• Shared, parallel storage optimizes performance
• Lowers management complexity
• Maximizes utilization of storage resources
Case Study: Wellcome Trust Sanger Institute
Challenge: Improved processes and lab equipment led to exponential increases in the volume of data being generated, but storage budgets were growing slowly.
Large data sets are difficult to manage proactively and can easily overwhelm storage resources. Unoptimized storage had a direct, negative impact on application performance, slowing the time to breakthrough results.
Solution: Exploit the power and scale of HPC-class storage, powered by Lustre* software and supported by Intel.
Benefits provided:
• Openness – Broad array of storage vendors and products
• Global namespace – All clients can access all data
• Performance – Upwards of 1 TB/s
• Capacity – Virtually unlimited file system and per-file sizes
• Confidence – Backed by Intel expertise with Lustre
• 10-15 TB of processed data weekly
• Processed data is a small fraction of overall storage capacity
• Stored in iRODS data warehouse
• BAM or FASTA format files
• Use pattern matching algorithms like BWA and BLAST
• Lustre offers immense scalable capacity
• Now have 8 production Lustre file systems – and are planning to add more
• Performance was the main goal – but scale, flexibility and efficiency were critical
Thank You.