Distributed Computing Economics

Jim Gray, Microsoft Research, gray@microsoft.com. Presentation to Microsoft Venture Capital Summit, 28 April 2004.

Distributed Computing Economics

Why is Seti@Home a great idea?

Why is Napster a great deal?

Why is the Computational Grid uneconomic?

When does computing on demand work?

What is the “right” level of abstraction?

Is the Access Grid the real killer app?

Based on: Distributed Computing Economics, Jim Gray, Microsoft Tech report, March 2003, MSR-TR-2003-24

http://research.microsoft.com/research/pubs/view.aspx?tr_id=655

Computing Is Free

Computers cost 1k$ (if you shop right) (yes, there are 1μ$ to 1M$ computers, but...)

So 1 cpu day = 1$ (computers last 3 years)

If you pay the phone bill, internet bandwidth costs 50…500$ per Mbps per month (not including routers and management)

So 1 GB costs 1$ to send and 1$ to receive

Caveat: All numbers rounded to nearest factor of 3.
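
The unit costs on this slide can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, and it assumes a 30-day month and a link that is fully used for the whole month, neither of which the slide states.

```python
# Back-of-envelope check of the slide's unit costs.
# Assumptions (not from the slide): 30-day month, link 100% utilized.

COMPUTER_PRICE_USD = 1_000          # "Computers cost 1k$"
LIFETIME_DAYS = 3 * 365             # "computers last 3 years"
SECONDS_PER_MONTH = 30 * 24 * 3600  # assumed 30-day month


def cpu_day_cost(price_usd=COMPUTER_PRICE_USD, lifetime_days=LIFETIME_DAYS):
    """Capital cost of one day of CPU time, depreciated over the lifetime."""
    return price_usd / lifetime_days


def wan_cost_per_gb(usd_per_mbps_month):
    """Cost of moving 1 GB over a link billed per Mbps per month."""
    bytes_per_month = 1e6 / 8 * SECONDS_PER_MONTH  # bytes moved at 1 Mbps
    gb_per_month = bytes_per_month / 1e9
    return usd_per_mbps_month / gb_per_month


if __name__ == "__main__":
    print(f"1 cpu day    ~ {cpu_day_cost():.2f} $")
    print(f"1 GB at  50$ ~ {wan_cost_per_gb(50):.2f} $")
    print(f"1 GB at 500$ ~ {wan_cost_per_gb(500):.2f} $")
```

Under these assumptions a CPU day comes to about 0.9$ and a gigabyte to roughly 0.15$ to 1.5$, which rounds to the slide's 1$ figures within its stated factor of 3.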

Why Is Seti@Home A Good Deal?

Send 300 KB: Costs 3e-4$

User computes for ½ day: Benefit 5e-1$

ROI: 1500:1
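
The ratio on this slide follows from the earlier unit costs (1$ per GB sent, 1$ per CPU day). A minimal sketch, with hypothetical names, that reproduces it:

```python
# Return on investment for shipping a work unit to a volunteer's computer.
# Unit costs are the slide deck's rounded figures.

USD_PER_GB = 1.0       # "1 GB costs 1$ to send"
USD_PER_CPU_DAY = 1.0  # "1 cpu day = 1$"


def export_roi(payload_bytes, cpu_days_donated):
    """Value of the donated CPU time divided by the cost of sending the data."""
    send_cost = payload_bytes / 1e9 * USD_PER_GB
    benefit = cpu_days_donated * USD_PER_CPU_DAY
    return benefit / send_cost


if __name__ == "__main__":
    roi = export_roi(payload_bytes=300e3, cpu_days_donated=0.5)
    print(f"ROI ~ {roi:.0f}:1")
```

The exact quotient is about 1667:1; the slide's 1500:1 is the same number after rounding.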

Seti@Home

The world's most powerful computer

67 TF is sum of top 4 of Top 500

67 TF is 9x the number 2 system

67 TF is more than the sum of systems 2...10

Seti@Home

http://setiathome.ssl.berkeley.edu/totals.html

26 April 2004

                             Total                          Last 24 Hours
Users                        5 M                            1,138
Results received             1.3 B                          1.5 M
Total CPU time               1.5 M years                    1,199 years
Floating Point Operations    5 E+21 flops (5 zetta flops)   6 E+18 FLOPS/day (67 TeraFLOPs)
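
The "Last 24 Hours" column can be turned into sustained rates with simple unit conversions. The sketch below uses only the table's rounded figures. The variable names are hypothetical, and a 365-day year is assumed.

```python
# Convert the table's "Last 24 Hours" figures into sustained rates.
# Inputs are the rounded values from the table; a 365-day year is assumed.

SECONDS_PER_DAY = 24 * 3600
DAYS_PER_YEAR = 365

ops_last_day = 6e18          # floating point operations in the last 24 hours
cpu_years_last_day = 1_199   # CPU time contributed in the last 24 hours

# Average aggregate rate over the day, in TeraFLOPS.
avg_tflops = ops_last_day / SECONDS_PER_DAY / 1e12

# CPU-years delivered in one day equals the average number of busy CPUs
# once converted to CPU-days.
busy_cpus = cpu_years_last_day * DAYS_PER_YEAR

# Average delivered rate per busy CPU, in MegaFLOPS.
mflops_per_cpu = ops_last_day / SECONDS_PER_DAY / busy_cpus / 1e6

if __name__ == "__main__":
    print(f"aggregate ~ {avg_tflops:.0f} TFLOPS")
    print(f"busy CPUs ~ {busy_cpus:,}")
    print(f"per CPU   ~ {mflops_per_cpu:.0f} MFLOPS")
```

The aggregate comes out near 69 TFLOPS from the rounded 6 E+18 input, in line with the 67 TeraFLOPs shown in the table.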

Why Was Napster A Good Deal?

Send 5 MB costs 5e-3$: ½ a penny per song

Both sender and receiver can afford it

Same logic powers web sites (Yahoo!...)

1e-3$/page view advertising revenue

1e-5$/page view cost of serving web page

100:1 ROI

Computing Equivalents

1$ buys:

1 day of cpu time

4 GB (fast) ram for a day

1 GB of network bandwidth

1 GB of disk storage for 3 years

10 M database accesses

10 TB of disk access (sequential)

10 TB of LAN bandwidth (bulk)

10 KWhrs == 4 days of computer time

Depreciating over 3 years, and there are about 1k days in 3 years.

Some Consequences

Beowulf networking is 10,000x cheaper than WAN networking: factors of 10^5 matter

The cheapest and fastest way to move Terabytes cross country is sneakernet

24 hours = 4 MB/s

50$ shipping vs 1,000$ wan cost

Sending 10PB CERN data via network is silly: buy disk bricks in Geneva, fill them, ship them

TeraScale SneakerNet: Using Inexpensive Disks for Backup, Archiving, and Data Exchange

Jim Gray; Wyman Chong; Tom Barclay; Alex Szalay; Jan vandenBerg. Microsoft Technical Report, May 2002, MSR-TR-2002-54

http://research.microsoft.com/research/pubs/view.aspx?tr_id=569
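
The shipping-versus-WAN comparison reduces to a flat fee against a per-gigabyte charge. A minimal sketch, using the deck's 1$/GB WAN figure and the slide's 50$ shipping price. The function names are hypothetical, and the shipping fee is treated as independent of payload size, which is a simplification.

```python
# Compare WAN transfer cost with a flat shipping fee for a disk package.
# Simplifying assumption: shipping cost does not grow with payload size.

USD_PER_GB_WAN = 1.0   # "1 GB costs 1$ to send"
SHIPPING_USD = 50.0    # "50$ shipping"


def wan_cost(payload_gb, usd_per_gb=USD_PER_GB_WAN):
    """Cost of sending the payload over the wide-area network."""
    return payload_gb * usd_per_gb


def break_even_gb(shipping_usd=SHIPPING_USD, usd_per_gb=USD_PER_GB_WAN):
    """Payload size above which shipping disks is cheaper than the WAN."""
    return shipping_usd / usd_per_gb


if __name__ == "__main__":
    terabyte_gb = 1_000
    print(f"WAN cost for 1 TB: {wan_cost(terabyte_gb):,.0f} $")
    print(f"Shipping wins above {break_even_gb():.0f} GB")
```

For one terabyte this gives the slide's 1,000$ WAN cost, twenty times the shipping fee.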

Computational Grid Economics

To the extent that the computational grid is like Seti@Home or ZetaNet or Folding@home or… it is a great thing

To the extent that the computational grid is MPI or data analysis, it fails on economic grounds: move the programs to the data, not the data to the programs

The Internet is not the cpu backplane

An alternate reality: Nearly free networking

Telcos go bankrupt and price=cost=0

Taxpayers pay your phone bill so price=0 and telcos receive a BIG government subsidy

When To Export A Task

IF instruction density > 100,000 instructions/byte

AND remote computer is free (costs you nothing)

THEN ROI > 0

ELSE ROI < 0
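
The 100,000 instructions/byte threshold can be motivated from the deck's unit costs: 1$ buys either one CPU day or 1 GB of WAN traffic. The sketch below is illustrative only. It assumes a CPU that executes about 1e9 instructions per second, a figure that does not appear on the slides, and all names are hypothetical.

```python
# Break-even instruction density for exporting a task over the WAN.
# ASSUMPTION (not from the slides): a CPU runs ~1e9 instructions/second.

SECONDS_PER_DAY = 24 * 3600
INSTRUCTIONS_PER_SECOND = 1e9   # assumed CPU speed
CPU_DAYS_PER_USD = 1.0          # "1$ buys 1 day of cpu time"
WAN_BYTES_PER_USD = 1e9         # "1$ buys 1 GB of network bandwidth"

SLIDE_THRESHOLD = 100_000       # instructions per byte, from the slide


def break_even_density():
    """Instructions per byte at which local CPU cost equals WAN cost."""
    instructions_per_usd = (
        INSTRUCTIONS_PER_SECOND * SECONDS_PER_DAY * CPU_DAYS_PER_USD
    )
    return instructions_per_usd / WAN_BYTES_PER_USD


def should_export(instructions_per_byte, remote_is_free,
                  threshold=SLIDE_THRESHOLD):
    """The slide's rule: export only dense tasks to free remote computers."""
    return remote_is_free and instructions_per_byte > threshold


if __name__ == "__main__":
    print(f"break-even ~ {break_even_density():,.0f} instructions/byte")
    print(should_export(2e5, remote_is_free=True))
    print(should_export(1e3, remote_is_free=True))
```

Under the assumed CPU speed the break-even lands near 86,000 instructions/byte, which is the slide's 100,000 after rounding.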

Computing On Demand

Was called outsourcing/service bureaus in my youth. CSC and IBM did it

It is not a new way of doing things: think payroll. Payroll is a standard outsourced service

Now Hotmail, Salesforce.com, Oracle.com,…

Works for standard apps

COD works for commoditized services

Airlines outsource reservations. Banks outsource ATMs

But Amazon, Amex, Wal-Mart, eTrade, eBay... can’t outsource their core competence

What’s The Right Abstraction Level For Internet Scale Distributed Computing?

Disk block? No, too low

File? No, too low

Database? No, too low

Application? Yes, of course

Blast search

Google search

Send/Get eMail

Portals that federate astronomy archives (http://skyQuery.Net/)

Web Services (.NET, EJB, OGSA) give this abstraction level

Access Grid

Q: What comes after the telephone?

A: eMail?

A: Instant messaging?

Both seem retro: text & emoticons

Access Grid could revolutionize human communication

But, it needs a new idea

Q: What comes after the telephone?

Supercomputers You Use

Hotmail, Yahoo!, Google: ~10k servers

Amazon, Barnes&Noble

Expedia, Orbitz

Dell, HP,…

Service-oriented architectures

Not computing on demand, but information on demand!

Poll

Is there a market for Supercomputers? Yes: Google, Expedia, Hotmail,…

Is Computing On Demand a high-margin business? I think not

Do you know the equivalent high-margin business? Information on demand

Take Aways

Computing on demand is a service business; probably not high margin; questionable economics; think LoudCloud

Distributed computing is coming, but it is probably via Service Oriented Architecture (SOA)

Web Services is the way to do SOA

Outline

Overview of Microsoft Research

Distributed Computing Economics

Q&A

The Cost Of Computing

Computers are NOT free!

IBM, HP, Dell make $billions

Capital Cost of a TpcC system is mostly storage and storage software (database)

IBM 32 cpu, 512 GB ram, 2,500 disks, 43 TB (680,613 tpmC @ 11.13 $/tpmC, available 11/08/03)

http://www.tpc.org/results/individual_results/IBM/IBMp690es_05092003.pdf

A 7.5M$ super-computer

Total Data Center Cost: 40% capital & facilities, 60% staff (includes app development)

Chart: TpcC Cost Components, DB2/AIX (http://www.tpc.org/results/individual_results/IBM/IBMp690es_05092003.pdf): storage 61%, cpu/mem 29%, software 10%
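
The "7.5M$" figure can be recovered from the benchmark numbers quoted on this slide, since price per tpmC times throughput gives the priced system cost. The sketch below also splits that total by the chart's percentages. Applying the percentages to this exact total is an assumption, and the names are hypothetical.

```python
# Recover the system price from the quoted TPC-C figures and split it
# by the chart's shares. Applying the shares to this total is an assumption.

TPMC = 680_613          # throughput quoted on the slide
USD_PER_TPMC = 11.13    # price/performance quoted on the slide

COST_SHARES = {"storage": 0.61, "cpu/mem": 0.29, "software": 0.10}


def system_price(tpmc=TPMC, usd_per_tpmc=USD_PER_TPMC):
    """Total priced configuration implied by the price/performance figure."""
    return tpmc * usd_per_tpmc


def component_costs(total_usd, shares=COST_SHARES):
    """Split a total price into components according to fractional shares."""
    return {name: total_usd * share for name, share in shares.items()}


if __name__ == "__main__":
    total = system_price()
    print(f"system price ~ {total / 1e6:.2f} M$")
    for name, cost in component_costs(total).items():
        print(f"{name:9s} ~ {cost / 1e6:.2f} M$")
```

The product is about 7.58 M$, matching the slide's rounded 7.5M$, with storage accounting for roughly 4.6 M$ of it under the chart's split.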
