Transcript
Page 1: How to Do/Evaluate Cloud Computing Researchshhong/yclee.pdfCloud Computing • Cloud computing ... • Load balancing • Cost Efficiency • Resource failure – Fault tolerance

How to Do/Evaluate Cloud Computing Research

Young Choon Lee

Cloud Computing

• Cloud computing is a model for enabling ubiquitous, convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction. (NIST definition)

Cloud Computing

• Broadly falls into distributed computing
• Relevant/similar concepts
  – Internet: largely information sharing
  – Grid computing: primarily used as scientific platforms
• On-demand/elasticity + Utility computing

Cloud Computing

• Service Models
  – IaaS (Infrastructure as a Service)
    • AWS, Microsoft Azure, Google Compute Engine
  – PaaS (Platform as a Service)
    • Rackspace, Microsoft Azure
  – SaaS (Software as a Service)
    • Salesforce.com
• Deployment Models
  – Private cloud
  – Public cloud
  – Hybrid cloud
  – Community cloud

Clouds

• An IaaS cloud is basically a data centre
• Major public clouds host hundreds of thousands of servers, or even millions of them
• A single data centre often consists of thousands of servers or more, and they are “virtualized”

Example application

• Cycle Computing’s cloud computing deployment
  – 16,788 Amazon EC2 instance cluster
    • with 156,314 cores
    • across 8 regions
    • in five continents
  – for materials science experiments that collectively screen some 205,000 candidate molecules.
  – This deployment has powered 2.3 million computing hours at a cost of only $33,000, which is equivalent to $68 million worth of equipment.
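As a rough sanity check on the figures quoted for this deployment, the unit cost can be worked out directly. The sketch below only divides the quoted numbers by each other; the variable names are illustrative, and it assumes the $33,000 covers all 2.3 million computing hours.

```python
# Figures quoted on the slide; variable names are illustrative only.
total_cost_usd = 33_000
computing_hours = 2_300_000
equipment_value_usd = 68_000_000

# Average price paid per computing hour.
cost_per_hour = total_cost_usd / computing_hours

# How many times larger the quoted equipment value is than the rental cost.
cost_ratio = equipment_value_usd / total_cost_usd

print(f"Cost per computing hour: ${cost_per_hour:.4f}")
print(f"Equipment value / rental cost: about {cost_ratio:.0f}x")
```

Under these assumptions the deployment works out to roughly 1.4 cents per computing hour.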

Problems

• Performance
  – High performance/throughput
  – Performance isolation and resource contention due to multi-tenancy
  – Scalability
• Load balancing
• Cost Efficiency
• Resource failure
  – Fault tolerance
  – Failure recovery

Research Process

Ideas / Feedback → Design → Implementation → Experimental Evaluation → Analysis → back to Ideas / Feedback

Research Process: ideas

• Pick a promising/interesting domain, e.g.,
  – Cloud computing
  – “Big data”
    • In scientific discovery, the first three paradigms were experimental, theoretical and computational science (simulation). (The Fourth Paradigm: Data-Intensive Scientific Discovery, Microsoft Research)
  – Energy efficiency
• Survey/review literature
  – Read papers, blog posts and even newspaper articles
  – Watch leading research groups and community/industry activities
• Find a “real(istic)” problem

Research Process: Design

• Algorithms, e.g.,
  – Scheduling
  – Resource allocation
  – Load balancing
  – Fault tolerance
• Systems
  – Hardware: new systems architecture, e.g., FAWN
  – Software
    • Resource management systems, e.g., Mesos
    • Workflow execution systems, e.g., DEWE

Research Process: Implementation

• A realization of the design
• Often involves
  – Prototyping, e.g., Hadoop schedulers
  – Software development (programming)
    • Extensibility
    • Portability
    • Scalability
    • Etc.

Research Process: Evaluation

• Experimental evaluation
  – Simulations
    • Only when you can justify them WELL, e.g., energy efficiency with realistic data like Google cluster data, SPEC benchmarks
    • Use existing simulators, e.g., NS-3, CloudSim
    • Or write your own and make it open source
  – Experiments in real systems
    • Open-source systems, e.g., Hadoop, Xen, Docker
    • Real systems like Amazon EC2 (AWS Education Grants)

Research Process: Analysis

• Cleaning and organising experimental results
• Scrutinizing results
• Conducting comparison study
• Describing results with effective use of figures and tables; colours, patterns and different chart types

Research Process: Analysis

[Figure: CPU utilization (y-axis, 0 to 100) with its data table. The labels of the six groups of bars did not survive extraction, so the groups are numbered 1 to 6 here.]

Group  Config  exec  idle   iowait  system  user
1      4m4r    952   12.8   7.64    10.6    68.8
1      6m6r    945   21.7   8.32    11.3    58.6
1      8m8r    962   10.4   11.7    10.4    67.4
1      SM      870   7.88   7.14    11.5    73.4
1      LRS     810   7.17   5.88    10.0    76.9
2      4m4r    862   15.6   7.79    11.4    65.1
2      6m6r    787   10.2   5.75    12.3    71.6
2      8m8r    780   8.35   5.72    13.9    71.9
2      SM      749   8.29   5.26    12.5    73.9
2      LRS     713   8.13   4.69    10.9    76.2
3      4m4r    905   18.5   11.1    20.7    49.5
3      6m6r    898   9.16   8.20    27.2    55.4
3      8m8r    915   14.2   10.9    26.2    48.5
3      SM      895   9.05   8.72    26.2    55.9
3      LRS     825   5.89   11.1    23.3    59.5
4      4m4r    896   13.9   5.76    10.4    69.8
4      6m6r    870   9.20   3.73    12.4    74.5
4      8m8r    867   7.75   2.82    13.6    75.8
4      SM      821   5.90   2.55    12.7    78.8
4      LRS     763   5.71   2.42    10.7    81.0
5      4m4r    1415  11.7   0.06    2.88    85.3
5      6m6r    1363  9.50   0.04    3.15    87.3
5      8m8r    1273  8.24   0.01    3.52    88.2
5      SM      1231  7.53   0.01    3.46    89.0
5      LRS     1231  7.53   0.01    3.46    89.0
6      4m4r    3127  6.86   3.89    13.7    75.5
6      6m6r    3142  3.38   2.08    17.6    76.9
6      8m8r    3231  2.81   4.32    18.6    74.2
6      SM      2990  1.83   1.35    18.5    78.3
6      LRS     2767  3.52   1.80    12.3    82.3

Issues with Evaluation

• The seven deadly sins of cloud computing research, HotCloud 2012.
  – sin, n.: common simplification or shortcut employed by researchers; may present threat to scientific integrity and practical applicability of research
• Systems Benchmarking Crimes, Gernot Heiser, NICTA/UNSW.
  – “When reviewing systems papers (and sometimes even when reading published papers) I frequently come across highly misleading use of benchmarks.”
  – “I call such cases benchmarking crimes. Not because you can go to jail for them (but probably should?) but because they undermine the integrity of the scientific process.”

Issues with Evaluation: The seven deadly sins of cloud computing research

• Sin 1: Unnecessary distributed parallelism
• Sin 2: Assuming performance homogeneity
• Sin 3: Picking the low-hanging fruit
• Sin 4: Forcing the abstraction
• Sin 5: Unrepresentative workloads
• Sin 6: Assuming perfect elasticity
• Sin 7: Ignoring fault tolerance

Issues with Evaluation: Systems Benchmarking Crimes

• Pretending micro-benchmarks represent overall performance
• Benchmark sub-setting without strong justification
• Selective data set hiding deficiencies
• Throughput degraded by x% ⇒ overhead is x%
• 6% → 13% overhead is a 7% increase
• Same dataset for calibration and validation
• No indication of significance of data
• Benchmarking of simplified simulated system
• Inappropriate and misleading benchmarks
• Relative numbers only
• No proper baseline
• Only evaluate against yourself
• Unfair benchmarking of competitors
• Arithmetic mean for averaging across benchmark scores
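Three of the crimes listed above are arithmetic mistakes, so a small numeric sketch may help. All measurements below are hypothetical, chosen only for illustration, and the helper names are not from the slides. The throughput example follows one common reading of that crime: throughput alone hides a change in CPU load, so overhead is compared as CPU cost per unit of work.

```python
from statistics import geometric_mean


def relative_increase(old, new):
    """Relative change of `new` with respect to `old`, as a fraction."""
    return (new - old) / old


def cpu_cost_overhead(base_tput, base_cpu, new_tput, new_cpu):
    """Overhead as the increase in CPU time spent per unit of work."""
    base_cost = base_cpu / base_tput
    new_cost = new_cpu / new_tput
    return new_cost / base_cost - 1.0


# "6% -> 13% overhead is a 7% increase": it is 7 percentage points,
# but the overhead itself has more than doubled.
percentage_points = 13 - 6
rel = relative_increase(6, 13)

# "Throughput degraded by x% => overhead is x%" (hypothetical numbers):
# baseline 100 units/s at 60% CPU, modified system 90 units/s at 100% CPU.
tput_drop = 1 - 90 / 100
overhead = cpu_cost_overhead(100, 0.60, 90, 1.00)

# "Arithmetic mean for averaging across benchmark scores" (hypothetical):
# system B relative to system A on two benchmarks, and the inverse ratios.
b_over_a = [2.0, 0.5]
a_over_b = [1 / r for r in b_over_a]
arith_b = sum(b_over_a) / len(b_over_a)
arith_a = sum(a_over_b) / len(a_over_b)
geo_b = geometric_mean(b_over_a)

print(f"{percentage_points} points, relative increase {rel:.0%}")
print(f"throughput drop {tput_drop:.0%}, CPU cost overhead {overhead:.0%}")
print(f"arithmetic means: {arith_b:.2f} and {arith_a:.2f}; geometric: {geo_b:.2f}")
```

With these made-up numbers, the arithmetic mean makes each system look 25% better than the other, while the geometric mean of the ratios reports them as equal.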
