mark-madsen
DESCRIPTION
Cloud computing is creating a new era for IT by providing a set of services that appear to have infinite capacity, immediate deployment and high availability at trivial cost. These are all appealing to someone running a data warehouse when data volume, use and cost are growing at a rapid rate. Today most organizations look at cloud as a way to lower data center and IT costs. While cost reduction is a real benefit, there is more value in the increased scalability, speed to procure (and give up) resources, and ease of delivery in cloud environments. Database workloads are particularly challenging in the cloud. Cloud deployments beyond a moderate scale favor shared-nothing database architectures designed to run transparently in a multi-node environment. We are still in an early period of standardization and design of software to run in the cloud. Not all workloads are suitable for deployment on a collection of small virtualized servers today. Business intelligence and analytic database workloads fall into this area, raising the importance of analysis for fit with public and private cloud options.
Cloud Computing
" …a model for enabling ubiquitous, convenient, on‐demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction."
What people see: seemingly infinite resources to apply to performance problems on short notice and at low cost
http://csrc.nist.gov/publications/nistpubs/800-145/SP800-145.pdf
Generators: Expensive Product
Generators: Commodity Product
Generators as a Service: Electricity
The Natural Process of Commoditization
Simon Wardley, A Lifecycle Approach to Cloud Computing
Managing Hardware Resources
Systems are sized for the peak workload, with the expectation that it will fluctuate.
[Chart: demand and capacity (resources) over time]
Idle resources = low utilization = money wasted
[Chart: demand and capacity (resources) over time, with idle resources marked]
Not enough resource is (much) worse than too much.
[Chart: demand and capacity (resources) over time]
Maintaining capacity just above the peak as workloads increase is the art of capacity planning.
One problem is the large step when upgrading to more resources, equating to a large capital cost.
[Chart: demand and capacity (resources) over time]
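The step effect described above can be sketched in a few lines. This is a minimal illustration, not a planning method from the presentation: the demand figures, the starting capacity, and the step size are hypothetical, and the function assumes a perfect forecast of demand.

```python
def plan_capacity(demand, start_capacity, step):
    """Return the capacity in each period when capacity can only grow in fixed steps.

    Assumption: demand is known in advance, so an upgrade always lands
    just before demand would exceed capacity.
    """
    capacity = start_capacity
    plan = []
    for d in demand:
        while capacity < d:
            capacity += step  # each upgrade is one large, fixed-size purchase
        plan.append(capacity)
    return plan


# Hypothetical, steadily growing peak demand (arbitrary resource units).
demand = [40, 55, 70, 90, 110, 130]
plan = plan_capacity(demand, start_capacity=50, step=50)
headroom = [c - d for c, d in zip(plan, demand)]

print(plan)      # [50, 100, 100, 100, 150, 150]
print(headroom)  # [10, 45, 30, 10, 40, 20]
```

The headroom jumps right after each upgrade and shrinks until the next one, which is the sawtooth pattern behind the large capital cost at each step.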
Great performance after an upgrade, bad performance at year‐end before the next upgrade.
A steady decline can be worse for user perception than constant mediocre performance.
[Chart: demand and capacity (resources) over time, with idle resources marked]
What everyone would like: elastic capacity
Pay for the resources you use when you use them, not up front for the entire system that supplies them. Just like electricity.
[Chart: capacity (resources) tracking demand over time]
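The difference between paying up front for peak capacity and paying for use can be made concrete with a small calculation. The sketch below is illustrative only: the demand series and both unit prices are made-up numbers, and real cloud pricing is more involved than a single price per unit.

```python
def fixed_cost(demand, price_per_unit):
    """Cost of provisioning for the peak in every period, used or not."""
    return max(demand) * len(demand) * price_per_unit


def elastic_cost(demand, price_per_unit):
    """Cost of paying only for the resources consumed in each period."""
    return sum(demand) * price_per_unit


# Hypothetical demand per period, in arbitrary resource units.
demand = [20, 30, 50, 100, 40, 30]

# Assumption: the elastic unit price carries a 50% premium over owned capacity.
fixed = fixed_cost(demand, price_per_unit=1.0)
elastic = elastic_cost(demand, price_per_unit=1.5)

# Share of the peak-sized system that sits idle on average (about 0.55 here).
idle_fraction = 1 - sum(demand) / (max(demand) * len(demand))

print(fixed, elastic, round(idle_fraction, 2))
```

With these numbers the peak-sized system costs 600 and the elastic one 405, even at a higher unit price. Under this simple model, elastic capacity stays cheaper as long as its price premium is below one divided by the average utilization.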
Five Key Cloud Characteristics
1. On‐demand self‐service
2. Network accessibility
3. Resource pooling
4. Measured service
5. Elasticity
Cloud Architecture
Started with virtual machines
Lots of servers, lots of virtual nodes. But in public clouds:
• Storage can be, and often is, separated
• VMs don’t run across nodes
• Great for OLTP, not so much for BI
• Implies new software architectures
Database Architecture and the Cloud
Virtualizing on a single server makes no sense for a database that needs the full resources.
If your server hardware environment looks like this:
then it’s probably good for lightweight transaction processing, simple storage and retrieval, procedural computations on data.
If you want to use it for a data warehouse, you need:
• A shared‐nothing database
• A proper storage architecture
• Dynamic licensing
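To show what shared-nothing means in practice, here is a toy sketch: each node owns its own slice of the rows, aggregates locally, and only the small partial results are merged. Everything in it is hypothetical (the row layout, the function names, the use of CRC32 as the distribution hash); it is not how any particular database product implements the idea.

```python
import zlib
from collections import defaultdict


def node_for(key, num_nodes):
    """Pick the owning node by hashing the distribution key (stable across runs)."""
    return zlib.crc32(str(key).encode("utf-8")) % num_nodes


def distribute(rows, num_nodes, key):
    """Assign every row to exactly one node; nodes share no data."""
    nodes = [[] for _ in range(num_nodes)]
    for row in rows:
        nodes[node_for(row[key], num_nodes)].append(row)
    return nodes


def local_sum(partition, group_key, value_key):
    """Aggregate using only the rows stored on one node."""
    totals = defaultdict(float)
    for row in partition:
        totals[row[group_key]] += row[value_key]
    return totals


def merge(partials):
    """Combine the per-node partial results into the final answer."""
    result = defaultdict(float)
    for partial in partials:
        for group, value in partial.items():
            result[group] += value
    return dict(result)


# Hypothetical fact rows.
rows = [
    {"customer": "c1", "region": "east", "amount": 10.0},
    {"customer": "c2", "region": "west", "amount": 20.0},
    {"customer": "c3", "region": "east", "amount": 5.0},
    {"customer": "c4", "region": "west", "amount": 15.0},
    {"customer": "c1", "region": "east", "amount": 7.0},
]

nodes = distribute(rows, num_nodes=3, key="customer")
totals = merge(local_sum(p, "region", "amount") for p in nodes)
print(totals)
```

Because each node works only on its own rows, adding nodes adds both storage and processing capacity, which is why this design suits a collection of small virtual servers.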
Three Models of Deployment
1. Public cloud
2. Leased / hosted private cloud
3. Private cloud
Benefits and Rationale
Why did you / are you considering a move to the cloud?
Two primary reasons:
▪ Cost reduction
▪ Reduced time to value
IBM global survey of IT and line-of-business decision makers
Unexpected Benefits
Speed to deploy:
▪ Opex vs. capex means faster approvals and less planning
▪ On‐demand provisioning makes it possible to do all those small projects that needed resources and staff to set up
Performance management:
▪ Resource‐oriented fixes done in minutes
▪ Instead of static resources and fluctuations in performance, set static SLAs and fluctuate the resources
Administration:
▪ No more hardware or operating system upgrades to deal with
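The idea of holding the SLA fixed and letting the resources fluctuate can be sketched as a simple sizing rule. This is a rough illustration under strong assumptions: the per-node capacity figure is hypothetical, and the rule assumes throughput scales linearly with node count, which real database workloads often do not.

```python
import math


def nodes_needed(queries_per_minute, per_node_capacity, min_nodes=1):
    """Return the node count that keeps every node at or below the load
    it can serve within the SLA.

    Assumptions: per_node_capacity is a measured figure for the workload,
    and capacity grows linearly as nodes are added.
    """
    return max(min_nodes, math.ceil(queries_per_minute / per_node_capacity))


# Hypothetical load over four periods, in queries per minute.
load = [120, 300, 950, 400]
plan = [nodes_needed(q, per_node_capacity=200) for q in load]
print(plan)  # [1, 2, 5, 2]
```

The node count follows the load up and back down, while the service level each query sees is meant to stay the same.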
Public Cloud Challenges
1. Multi‐tenant servers and unpredictable I/O performance
2. Legal problems:
▪ Data co‐mingling in multi‐tenant databases
▪ Data locality and national laws
3. Cloud compatibility for data integration and data management tools (environment, data movement)
4. Security requirements
When these are a concern, private clouds may be the better option today.
What are manager preferences?
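[Chart: cloud preference by workload. Workloads: data mining, text mining, or other analytics; data warehouses or data marts. Responses: prefer not to use cloud; private cloud preference; public cloud preference. Values as extracted, in original order: 9%, 21%, 52%, 44%, 39%, 35%.]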
IBM global survey of IT and line-of-business decision makers
Comparison of Models
New and growing use cases drive the need to expand
The use cases are now interactive applications, lower latency data, complex analytics and rapidly growing data volumes.
Image Attributions
Thanks to the people who supplied the images used in this presentation:
Commoditization diagram – from A Lifecycle Approach to Cloud Computing, © Simon Wardley
tesla coil train ‐ http://www.flickr.com/photos/winterhalter/27364687
Amazon Virtual Private Cloud diagram ‐ © Amazon, Inc.
caged_tower_melbourne.jpg ‐ http://www.flickr.com/photos/vermininc/2227512763
About the Presenter
Mark Madsen is president of Third Nature, a technology research and consulting firm focused on business intelligence, analytics and information management. Mark is an award-winning author, architect and former CTO whose work has been featured in numerous industry publications. During his career Mark received awards from the American Productivity & Quality Center, TDWI, Computerworld and the Smithsonian Institute. He is an international speaker, contributing editor at Intelligent Enterprise, and manages the open source channel at the Business Intelligence Network. For more information or to contact Mark, visit http://ThirdNature.net.
About Third Nature
Third Nature is a research and consulting firm focused on new and emerging technology and practices in business intelligence, analytics and performance management. If your question is related to BI, analytics, information strategy and data then you’re at the right place.
Our goal is to help companies take advantage of information-driven management practices and applications. We offer education, consulting and research services to support business and IT organizations as well as technology vendors.
We fill the gap between what the industry analyst firms cover and what IT needs. We specialize in product and technology analysis, so we look at emerging technologies and markets, evaluating technology and how it is applied rather than vendor market positions.