
Distributed Computing Economics


Jim Gray, Microsoft Research, [email protected]. Presentation to the Microsoft Venture Capital Summit, 28 April 2004.


Page 1: Distributed Computing Economics
Page 2: Distributed Computing Economics

Distributed Computing Economics

Jim Gray
Microsoft Research
[email protected]
Presentation To Microsoft Venture Capital Summit
28 April 2004

Page 3: Distributed Computing Economics

Distributed Computing Economics

Why is Seti@Home a great idea?

Why is Napster a great deal?

Why is the Computational Grid uneconomic?

When does computing on demand work?

What is the “right” level of abstraction?

Is the Access Grid the real killer app?

Based on: Distributed Computing Economics, Jim Gray, Microsoft Tech report, March 2003, MSR-TR-2003-24

http://research.microsoft.com/research/pubs/view.aspx?tr_id=655

Page 4: Distributed Computing Economics

Computing Is Free

Computers cost 1k$ (if you shop right) (yes, there are 1μ$ to 1M$ computers, but..)

So 1 cpu day = 1$ (computers last 3 years)

If you pay the phone bill, internet bandwidth costs 50…500$/Mbps/month (not including routers and management)

So 1GB costs 1$ to send and 1$ to receive

Caveat: All numbers rounded to nearest factor of 3.
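
The two "So" lines on this slide follow from simple division. A minimal sketch of that arithmetic, assuming a 30-day month, decimal units (1 GB = 1e9 bytes) and a fully used link; the function names are illustrative and do not come from the talk:

```python
SECONDS_PER_DAY = 86_400

def cpu_day_cost(price_usd=1_000.0, lifetime_years=3.0):
    """Purchase price spread evenly over the machine's lifetime, in $ per day."""
    return price_usd / (lifetime_years * 365)

def wan_cost_per_gb(usd_per_mbps_month, days_per_month=30):
    """Cost of moving 1 GB over a link billed per Mbps per month.

    Assumes the link is used at full rate all month, so this is a lower bound.
    """
    bytes_per_month = 1e6 / 8 * SECONDS_PER_DAY * days_per_month
    return usd_per_mbps_month / (bytes_per_month / 1e9)

print(f"cpu day: {cpu_day_cost():.2f} $")
print(f"1 GB at  50 $/Mbps/month: {wan_cost_per_gb(50):.2f} $")
print(f"1 GB at 500 $/Mbps/month: {wan_cost_per_gb(500):.2f} $")
```

Under these assumptions a cpu day comes to a little under 1$, and a gigabyte costs somewhere between roughly 0.15$ and 1.5$; anything less than full link utilization pushes the per-GB cost up.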

Page 5: Distributed Computing Economics

Why Is Seti@Home A Good Deal?

Send 300 KB: Costs 3e-4$

User computes for ½ day: Benefit 0.5$

ROI: 1500:1
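
The ratio can be reproduced from the unit prices on the "Computing Is Free" slide. A small sketch, assuming 1$ per GB of WAN traffic and 1$ per cpu day (so half a cpu day is worth 0.5$); the variable and function names are illustrative:

```python
WAN_USD_PER_GB = 1.0    # unit price from the "Computing Is Free" slide
CPU_USD_PER_DAY = 1.0   # unit price from the "Computing Is Free" slide

def roi(benefit_usd, cost_usd):
    """Return on investment as a plain benefit-to-cost ratio."""
    return benefit_usd / cost_usd

send_cost = 300e3 / 1e9 * WAN_USD_PER_GB   # 300 KB work unit
benefit = 0.5 * CPU_USD_PER_DAY            # half a day of donated cpu time
seti_roi = roi(benefit, send_cost)

print(f"cost to send: {send_cost:.0e} $")
print(f"benefit:      {benefit:.2f} $")
print(f"ROI:          about {seti_roi:.0f}:1")
```

The exact quotient is a little above the slide's 1500:1, which is consistent with the deck's habit of rounding figures.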

Page 6: Distributed Computing Economics

Seti@Home
The world's most powerful computer

67 TF is the sum of the top 4 of the Top 500
67 TF is 9x the number 2 system
67 TF is more than the sum of systems 2...10

Seti@Home
http://setiathome.ssl.berkeley.edu/totals.html

26 April 2004

                            Total            Last 24 Hours
Users                       5 M              1,138
Results received            1.3 B            1.5 M
Total CPU time              1.5 M years      1,199 years
Floating Point Operations   5 E+21 flops     6 E+18 FLOPS/day
                            (5 zetta flops)  (67 TeraFLOPs)
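
The sustained rate in the last row is the daily operation count divided by the seconds in a day. A quick check, using the rounded 6 E+18 figure from the table (the function name is illustrative):

```python
SECONDS_PER_DAY = 86_400

def sustained_teraflops(flops_per_day):
    """Average rate in TeraFLOPS implied by a daily operation count."""
    return flops_per_day / SECONDS_PER_DAY / 1e12

rate = sustained_teraflops(6e18)
print(f"{rate:.1f} TeraFLOPS")
```

The result is within a few percent of the 67 TeraFLOPs quoted, which is as close as a one-digit daily total allows.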

Page 7: Distributed Computing Economics

Why Was Napster A Good Deal?

Send 5 MB costs 5e-3$: ½ a penny per song

Both sender and receiver can afford it

Same logic powers web sites (Yahoo!...)
1e-3$/page view advertising revenue
1e-5$/page view cost of serving web page
100:1 ROI
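
Both figures on this slide come from the same unit prices. A brief sketch, assuming 1$ per GB of WAN traffic as on the "Computing Is Free" slide; the variable names are illustrative:

```python
WAN_USD_PER_GB = 1.0   # unit price from the "Computing Is Free" slide

# A 5 MB song sent over the WAN.
song_cost_usd = 5e6 / 1e9 * WAN_USD_PER_GB
song_cost_cents = song_cost_usd * 100

# Advertising revenue per page view against the cost of serving the page.
page_view_roi = 1e-3 / 1e-5

print(f"song: {song_cost_cents:.1f} cents")
print(f"page view ROI: about {page_view_roi:.0f}:1")
```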

Page 8: Distributed Computing Economics

Computing Equivalents
1$ buys

1 day of cpu time

4 GB (fast) ram for a day

1 GB of network bandwidth

1 GB of disk storage for 3 years

10 M database accesses

10 TB of disk access (sequential)

10 TB of LAN bandwidth (bulk)

10 KWhrs == 4 days of computer time

Depreciating over 3 years, and there are about 1k days in 3 years.

Page 9: Distributed Computing Economics

Some Consequences

Beowulf networking is 10,000x cheaper than WAN networking; factors of 10^5 matter

The cheapest and fastest way to move Terabytes cross country is sneakernet
24 hours = 4 MB/s
50$ shipping vs 1,000$ wan cost

Sending 10PB CERN data via network is silly: buy disk bricks in Geneva, fill them, ship them

TeraScale SneakerNet: Using Inexpensive Disks for Backup, Archiving, and Data Exchange

Jim Gray; Wyman Chong; Tom Barclay; Alex Szalay; Jan vandenBerg
Microsoft Technical Report May 2002, MSR-TR-2002-54

http://research.microsoft.com/research/pubs/view.aspx?tr_id=569
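
The 10,000x ratio and the 1,000$ WAN cost both follow from the "1$ buys" figures (10 TB of LAN bandwidth, 1 GB of WAN bandwidth). A rough sketch; the 1 TB payload is an assumption chosen because it reproduces the 1,000$ figure, and all names are illustrative:

```python
LAN_BYTES_PER_USD = 10e12   # "10 TB of LAN bandwidth (bulk)" per 1$
WAN_BYTES_PER_USD = 1e9     # "1 GB of network bandwidth" per 1$
SHIPPING_USD = 50.0         # shipping cost quoted on the slide

def wan_cost_usd(n_bytes):
    """Cost of pushing n_bytes across the WAN at the slide's unit price."""
    return n_bytes / WAN_BYTES_PER_USD

lan_vs_wan = LAN_BYTES_PER_USD / WAN_BYTES_PER_USD

payload_bytes = 1e12        # assumption: a single terabyte
wan_usd = wan_cost_usd(payload_bytes)

print(f"LAN is about {lan_vs_wan:,.0f}x cheaper than WAN per byte")
print(f"1 TB over WAN: {wan_usd:,.0f} $ vs {SHIPPING_USD:,.0f} $ shipped")
print(f"shipping is about {wan_usd / SHIPPING_USD:.0f}x cheaper")
```

Because the WAN cost grows linearly with the payload while a shipment of disks costs roughly the same either way, the gap widens as the data set grows.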

Page 10: Distributed Computing Economics

Computational Grid Economics

To the extent that the computational grid is like Seti@Home or ZetaNet or Folding@home or… it is a great thing

To the extent that the computational grid is MPI or data analysis, it fails on economic grounds: move the programs to the data, not the data to the programs

The Internet is not the cpu backplane

An alternate reality: Nearly free networking
Telcos go bankrupt and price=cost=0
Taxpayers pay your phone bill so price=0 and telcos receive a BIG government subsidy

Page 11: Distributed Computing Economics

When To Export A Task

IF instruction density > 100,000 instructions/byte
AND remote computer is free (costs you nothing)
THEN ROI > 0
ELSE ROI < 0
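
The rule reads naturally as a predicate. A minimal sketch that encodes it literally, using the threshold from this slide; the function name, its parameters and the two sample tasks are hypothetical and only illustrate the two branches:

```python
THRESHOLD_INSTRUCTIONS_PER_BYTE = 100_000   # threshold from the slide

def should_export(instructions, bytes_moved, remote_cost_usd):
    """True when the slide's rule predicts ROI > 0 for shipping the task out."""
    if bytes_moved <= 0:
        raise ValueError("bytes_moved must be positive")
    density = instructions / bytes_moved
    remote_is_free = remote_cost_usd == 0
    return density > THRESHOLD_INSTRUCTIONS_PER_BYTE and remote_is_free

# Hypothetical cpu-heavy task: lots of computation on a small input.
print(should_export(instructions=1e12, bytes_moved=300e3, remote_cost_usd=0))

# Hypothetical data-heavy task: a light scan over a large input.
print(should_export(instructions=1e9, bytes_moved=1e9, remote_cost_usd=0))

# Same cpu-heavy task, but the remote machine charges for its time.
print(should_export(instructions=1e12, bytes_moved=300e3, remote_cost_usd=0.10))
```

Only the first call satisfies both conditions; the other two fail on density and on the cost of the remote machine respectively.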

Page 12: Distributed Computing Economics

Computing On Demand

Was called outsourcing/service bureaus in my youth. CSC and IBM did it

It is not a new way of doing things: think payroll. Payroll is a standard outsourced service

Now Hotmail, Salesforce.com, Oracle.com,…

Works for standard apps

COD works for commoditized services

Airlines outsource reservations. Banks outsource ATMs

But Amazon, Amex, Wal-Mart, eTrade, eBay... can’t outsource their core competence

Page 13: Distributed Computing Economics

What’s The Right Abstraction Level For Internet Scale Distributed Computing?

Disk block? No, too low
File? No, too low
Database? No, too low
Application? Yes, of course

Blast search
Google search
Send/Get eMail
Portals that federate astronomy archives (http://skyQuery.Net/)

Web Services (.NET, EJB, OGSA) give this abstraction level

Page 14: Distributed Computing Economics

Access Grid

Q: What comes after the telephone?

A: eMail?

A: Instant messaging?

Both seem retro: text & emoticons

Access Grid could revolutionize human communication

But, it needs a new idea

Q: What comes after the telephone?

Page 15: Distributed Computing Economics

Supercomputers You Use

Hotmail, Yahoo!, Google: ~10k servers

Amazon, Barnes&Noble

Expedia, Orbitz

Dell, HP,…

Service-oriented architectures

Not computing on demand, but information on demand!

Page 16: Distributed Computing Economics

Distributed Computing Economics

Why is Seti@Home a great idea?
Why is Napster a great deal?
Why is the Computational Grid uneconomic?
When does computing on demand work?
What is the “right” level of abstraction?
Is the Access Grid the real killer app?

Based on: Distributed Computing Economics, Jim Gray, Microsoft Tech report, March 2003, MSR-TR-2003-24

http://research.microsoft.com/research/pubs/view.aspx?tr_id=655

Page 17: Distributed Computing Economics

Poll

Is there a market for Supercomputers?
Yes, Google, Expedia, Hotmail,…

Is Computing On Demand a high-margin business?
I think not

Do you know the equivalent high-margin business?
Information on demand

Page 18: Distributed Computing Economics

Take Aways

Computing on demand is a service business; probably not high margin; questionable economics; think LoudCloud

Distributed computing is coming, but it is probably via Service Oriented Architecture (SOA)

Web Services is the way to do SOA

Page 19: Distributed Computing Economics
Page 20: Distributed Computing Economics

Outline

Overview of Microsoft Research

Distributed Computing Economics

Q&A

Page 21: Distributed Computing Economics

The Cost Of Computing
Computers are NOT free!

IBM, HP, Dell make $billions

Capital Cost of a TpcC system is mostly storage and storage software (database)
IBM 32 cpu, 512 GB ram, 2,500 disks, 43 TB
(680,613 tpmC @ 11.13 $/tpmc available 11/08/03)
http://www.tpc.org/results/individual_results/IBM/IBMp690es_05092003.pdf

A 7.5M$ super-computer

Total Data Center Cost: 40% capital & facilities, 60% staff (includes app development)

TpcC Cost Components DB2/AIX
http://www.tpc.org/results/individual_results/IBM/IBMp690es_05092003.pdf

software   10%
storage    61%
cpu/mem    29%
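
The 7.5M$ price tag is the benchmark throughput multiplied by the price per tpmC, and the percentages split that total into components. A short sketch using only the figures on this slide; the variable names are illustrative:

```python
TPMC = 680_613          # benchmark throughput quoted on the slide
USD_PER_TPMC = 11.13    # price/performance quoted on the slide

# Cost shares from the "TpcC Cost Components" chart.
SHARES = {"storage": 0.61, "cpu/mem": 0.29, "software": 0.10}

total_usd = TPMC * USD_PER_TPMC
breakdown = {name: share * total_usd for name, share in SHARES.items()}

print(f"total system price: {total_usd / 1e6:.2f} M$")
for name, usd in breakdown.items():
    print(f"  {name:<9}{usd / 1e6:.2f} M$")
```

Storage alone accounts for well over half of the total, which is the slide's point about where the capital cost goes.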