Cloud Computing Primer: Using cloud computing tools in your museum

A presentation by Robert Stein, Charlie Moad and Ari Davidow on cloud computing for the Museum Computer Network Conference in Portland, OR November, 2009

Cloud Computing Primer: Steps for using the Cloud in Your Museum

Ari Davidow – jwa.org
Charles Moad – imamuseum.org
Robert Stein – imamuseum.org

Wikipedia on cloud computing, via wordle.net

CLOUD COMPUTING

CLOUD APPLICATIONS

UTILITY COMPUTING

Cloud Applications

... eliminate the need to install and run the application on the customer's own computer, thus alleviating the burden of software maintenance, ongoing operation, and support.

Utility Computing

… a style of computing where scalable and elastic IT-related capabilities are provided as a service to external customers using Internet technologies. 

Buzz Worthy

Search Trends for “cloud computing”, via Google Search Trends

Source: “Cloud Computing Gains in Currency”, Pew Research, May 2008, http://pewresearch.org/pubs/948/cloud-computing-gains-in-currency

69% of Americans use cloud computing services

image courtesy of gartner.com

Gartner’s Hype Cycle for 2009

21% of companies are piloting SaaS applications, up from 18% last year

– Forrester, Feb 2009

In Forrester’s List of the Top 15 Technology Trends

State of Cloud Computing

Forrester feels that cloud computing is one of the Top 15 Technology Trends and that it warrants investment now so you can gain the experience necessary to take advantage of it in its many forms to transform your organization into a more efficient and responsive service provider to the business

- Forrester, October 13, 2009

http://blogs.forrester.com/it_infrastructure/2009/10/cloud-computing-belongs-on-your-3year-roadmap.html

Gartner’s #1 Strategic Technology Area for 2010

State of Cloud Computing

Cloud computing isn't going to be vapor much longer… It's complicated, poses security risks, and computing technology companies are latching onto the buzzword in droves, but the phenomenon should be taken seriously…

- Gartner, October 20, 2009

http://news.cnet.com/8301-30685_3-10378782-264.html

Concerns about SaaS

Pros of Cloud Computing

- Fast Deployment
- Lower cost / No Capital Expense
- Reduced IT maintenance
- Elastic and Unlimited Scalability
- Energy Efficiency
- Reliability (Service & Data)
- Better Resource Utilization

Cons of Cloud Computing

- Information Security
- Physical Security
- Long Term Offline Storage
- Bandwidth Bottleneck
- Potential Vendor Lock-in
- Lack of control during downtime

Amazon Web Services (AWS) Overview

Amazon Web Services (AWS) Infrastructure Services

- Elastic Compute Cloud (EC2)
- Simple Storage Service (S3)
- SimpleDB
- Simple Queue Service (SQS)
- Elastic Block Store (EBS)
- Elastic MapReduce
- CloudFront Content Delivery Network
- Relational Database Service (RDS)
- Virtual Private Cloud

How to make choices about Cloud Computing

- What sort of security requirements fit your data?

- How granular is the information you’re working with? (documents, images, video?)

- Where are your likely performance bottlenecks? (compute, bandwidth, latency)

- What is your IT staff like? (small but flexible, large)

Jungle Disk

Jungle Disk - $20 / www.jungledisk.com

Requires

- Amazon S3 account and the requisite keys

- JungleDisk software installed

Usage

- Backs up at scheduled times
- Can back up more than one machine, or to more than one backup set
- The first backup may take days, or longer. No problem. The software gracefully goes to sleep when you shut down or hibernate, and resumes upon waking until done
- Can retrieve files through a drag-and-drop interface, using a pull-down to set the date of the view you wish to retrieve from (e.g., let me see the files as they were on July 7, 2008)

- Retrieves files gracefully and quickly
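
The date-based view described above can be pictured as picking, for each file, the newest backed-up version on or before the chosen date. The following is a minimal sketch of that idea only; it is not Jungle Disk's implementation, and the function name, the version labels, and the sample history are all hypothetical.

```python
from bisect import bisect_right
from datetime import date


def version_as_of(versions, view_date):
    """Return the label of the newest version made on or before view_date.

    versions is a list of (backup_date, label) tuples sorted by date,
    oldest first. Returns None if nothing was backed up by view_date.
    """
    backup_dates = [backup_date for backup_date, _ in versions]
    # bisect_right gives the count of versions dated <= view_date
    count = bisect_right(backup_dates, view_date)
    return versions[count - 1][1] if count else None


# Hypothetical backup history for a single file
history = [
    (date(2008, 6, 30), "report-v1"),
    (date(2008, 7, 5), "report-v2"),
    (date(2008, 7, 20), "report-v3"),
]

print(version_as_of(history, date(2008, 7, 7)))  # report-v2
```

Choosing July 7, 2008 as the view date returns the July 5 copy, since the July 20 copy did not exist yet.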

Converse Example

IMA’s SAN

- IMA purchased 32TB of EMC SAN in 2006
- 16TB local and 16TB at an offsite co-location facility
- Due to growth in Collection Photography, Video, and Conservation Imagery, that space is all but full!

IMA’s SAN

- Benchmark Growth Rate
- (Total Current Size – Initial Size) / 36 months
- Ballpark Rate of 142 GB/month

- Yielding 13.9TB estimated in the next 4 years
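
The benchmark formula above is simple enough to sketch in a few lines. The sizes used here are hypothetical placeholders, not IMA's actual figures, and the projection is a plain straight-line one. The deck does not say how the 13.9TB four-year estimate was derived, so the sketch does not try to reproduce it.

```python
def monthly_growth_gb(initial_gb, current_gb, months):
    """Benchmark growth rate: (current size - initial size) / elapsed months."""
    return (current_gb - initial_gb) / months


def projected_growth_gb(rate_gb_per_month, months_ahead):
    """Straight-line projection of additional storage needed."""
    return rate_gb_per_month * months_ahead


# Hypothetical sizes, for illustration only
rate = monthly_growth_gb(initial_gb=4_000, current_gb=9_400, months=36)
print(rate)                           # 150.0 GB/month
print(projected_growth_gb(rate, 48))  # 7200.0 GB over 4 years
```

If growth is accelerating, or if every byte is stored twice for replication, a straight-line figure like this will understate the real need.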

IMA’s SAN

- 16TB Onsite + 14TB AWS = $164,544
- 16TB Onsite + 16TB Colo = $94,200

[Chart: cost, $0 to $140,000 (x-axis: 1 to 47, likely months)]

Hidden Costs

- DS3+Colo = $96,000 (3 yr commitment)

- Maintenance and administration of servers (2 FTEs)

- As a replicated backup for very large file systems, the time to restore would be huge!

Tools

- S3Fox
- AWS Console
- Elasticfox

Moving Drupal to the Cloud: Step by step

Introducing the EC2 Console

Creating a Key Pair

Creating a Security Group

Selecting a Starter AMI

Launching an AMI

Connecting to Your EC2 Instance

Creating an EBS Volume

Configuring Apache and MySQL

Setting up Drupal

Bundling an AMI

ami-764bab1f

Fedora for DAM

Fedora as a testbed on AWS

Project Goals: External vendor to create the smallest possible Fedora instance to enable preservation work

Summary

- Create on AWS and hand over instance when done
- When dev site is completed, create an “Amazon Machine Image” (AMI) and check it into Subversion
- Document installation and everything else in wiki
- We create a new instance from the checked-out AMI
- This ensures that we have maintainable code that we can get up and running before the developer moves on

What we did

- Original server created using developer’s favorite Linux

- We use CentOS, so when we checked out the AMI, we recreated it running under CentOS and bundled a new AMI to S3

- AMIs can be independent of the underlying OS

Bugs

- Our repository, which consists of lots of very large files, uses a unix filesystem called XFS

- XFS supports very large volumes better than the usual filesystem, and supports real-time snapshotting of huge file systems

- AWS updated CentOS and broke XFS
- We (actually, our webmaster) rebuilt the kernel to work around the AWS CentOS bug

Other Gotchas

- An EC2 instance doesn’t preserve state
- When you restart, it restarts from scratch
- All config changes, and anything else that was done and saved to the previous instance, are gone

- So, you use EBS, which acts something like a network drive (think NetApp)
- You purchase blocks of EBS space at a time, but it is cheaper than S3 per GB/month
- This is different from S3 storage, where you pay only for what you consume
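
The difference between the two billing models can be made concrete with a short sketch. The per-GB prices below are assumptions chosen for illustration (the deck does not state them), as are the function names and the volume sizes. The point is only that EBS bills for the space you provision, while S3 bills for the space you actually fill.

```python
def ebs_monthly_cost(provisioned_gb, price_per_gb):
    """EBS model: pay for the whole provisioned volume, full or not."""
    return provisioned_gb * price_per_gb


def s3_monthly_cost(stored_gb, price_per_gb):
    """S3 model: pay only for the data actually stored."""
    return stored_gb * price_per_gb


def break_even_fill(ebs_price, s3_price):
    """Fraction of an EBS volume that must be in use before EBS
    costs no more than S3 would for the same data."""
    return ebs_price / s3_price


# Assumed prices in dollars per GB-month, for illustration only
EBS_PRICE = 0.10
S3_PRICE = 0.15

# A hypothetical 500 GB volume holding 200 GB of data
print(f"EBS: ${ebs_monthly_cost(500, EBS_PRICE):.2f}")
print(f"S3:  ${s3_monthly_cost(200, S3_PRICE):.2f}")
print(f"Break-even fill: {break_even_fill(EBS_PRICE, S3_PRICE):.0%}")
```

Under these assumed prices, a lightly filled EBS volume costs more than S3 would for the same data, even though the per-GB rate is lower.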

AMIs

- Amazon Machine Image
- Sort of like a “ghost”ed server image
- Amazon (and others) provide lots of AMIs to work with
- AMIs can be public or private
- You can use different AMIs on different servers in your AWS setup
- “Bundling” is the AWS term for saving that AMI with your modifications for future use
- We store AMIs on S3; could also use EBS

Lessons Learned

- We liked AWS so much, and saved so much money, that we have now moved all of our web services to AWS.

- Our website used to cost us $1200/mo. It has added about $450/mo to what we already pay for the Fedora instance – about $900/mo total.
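
The savings implied by these figures can be worked out directly. This sketch assumes that the "about $900/mo total" covers both the Fedora instance and the website, which is one reading of the slide. The variable names are ours, and all amounts are the slide's approximate numbers.

```python
# Figures from the slide (approximate, dollars per month)
old_website_cost = 1200   # website hosting before the move
website_on_aws = 450      # what the website added to the AWS bill
aws_total = 900           # assumed to cover Fedora instance + website

# Derived figures
fedora_instance = aws_total - website_on_aws
monthly_savings = old_website_cost - website_on_aws
annual_savings = monthly_savings * 12

print(fedora_instance)   # 450
print(monthly_savings)   # 750
print(annual_savings)    # 9000
```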

Rightscale – basic services free

www.rightscale.com

AWS Infrastructure

ArtBabble.org architecture (diagram)

Components: EC2, CloudFront, S3, Apache/MySQL, EBS, Wowza Streaming Server

Content: ArtBabble videos, WWW images

1. Web request
2. Images served directly from S3/CDN
3. Video stream
4. Videos download from S3/CDN

Scalabble

Video Processing

Total Monthly AWS

Monthly Bill for ArtBabble.org Web Server

Monthly Bill for Wowza Video Server(s)

AWS Bill - CloudFront

AWS Bill – EC2

AWS Bill – S3

AWS Bill - Wowza

The Numbers (so far)

- 150,000 video views (168k visits / 576k pages)

- 81,000 note clicks
- 1:3 of the notes expanded

- 22,400 views of “Behind the Babble”
- 25,015 views of most popular YouTube video, posted Feb. 1st, 2008

- 5,000 registered users
- 44% signed up using OpenID (but didn’t realize it)

The Numbers (so far) for geeks

- 112 hours of video processed
- 525 videos
- 1700 instance hours
- At a cost of ~$0.65 per video

- April 1st – October 30th

- 167,000 visitors
- From 166 countries

- April 1st – September 30th

- 1.1TB of web data transferred out
- At a transfer cost of $200

- 1.25TB of video streamed
- At a transfer cost of $250

- 11 Mbps average transfer on embedded videos
- IMA just upgraded to 5Mb pipe Fall ‘08
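
The unit costs behind these numbers can be checked with a little arithmetic. The sketch assumes an instance rate of $0.20/hour, chosen only because it reproduces the ~$0.65 per video figure; the deck does not state the rate. It also assumes 1TB = 1000GB and reads "5Mb pipe" as 5 Mbps. The function names are ours.

```python
def cost_per_video(instance_hours, hourly_rate, videos):
    """Average processing cost per video."""
    return instance_hours * hourly_rate / videos


def cost_per_gb(total_cost, terabytes, gb_per_tb=1000):
    """Average transfer cost per GB."""
    return total_cost / (terabytes * gb_per_tb)


HOURLY_RATE = 0.20  # assumed dollars per instance hour, not from the deck

print(round(cost_per_video(1700, HOURLY_RATE, 525), 2))  # 0.65
print(round(cost_per_gb(200, 1.1), 2))                   # 0.18 (web data)
print(round(cost_per_gb(250, 1.25), 2))                  # 0.2 (video)

# Average embedded-video transfer relative to the museum's own pipe
print(11 / 5)  # 2.2
```

On this reading, the average embedded-video traffic alone is more than twice what the museum's upgraded connection could carry.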

QUESTIONS?
