How to Do/Evaluate Cloud Computing Research



<ul><li><p>How to Do/Evaluate Cloud Computing Research</p><p>Young Choon Lee</p></li><li><p>Cloud Computing</p><p>"Cloud computing is a model for enabling ubiquitous, convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction." (NIST definition)</p></li><li><p>Cloud Computing</p><ul><li>Broadly falls under distributed computing</li><li>Relevant/similar concepts:<ul><li>The Internet: largely information sharing</li><li>Grid computing: primarily used as a scientific platform</li></ul></li><li>On-demand provisioning/elasticity + utility computing</li></ul></li><li><p>Cloud Computing Service Models</p><ul><li>IaaS (Infrastructure as a Service): AWS, Microsoft Azure, Google Compute Engine</li><li>PaaS (Platform as a Service): Rackspace, Microsoft Azure</li><li>SaaS (Software as a Service)</li></ul><p>Deployment models</p><ul><li>Private cloud</li><li>Public cloud</li><li>Hybrid cloud</li><li>Community cloud</li></ul></li><li><p>Clouds</p><ul><li>An (IaaS) cloud is basically a data centre</li><li>Major public clouds host hundreds of thousands of servers, or even millions of them</li><li>A single data centre often consists of thousands of servers or more, and they are virtualized</li></ul></li><li><p>Example application</p><ul><li>Cycle Computing's cloud computing deployment: a 16,788-instance Amazon EC2 cluster<ul><li>with 156,314 cores across 8 regions on five continents</li><li>for materials science experiments that collectively screen some 205,000 candidate molecules</li></ul></li></ul><p>This deployment has powered 2.3 million computing hours at a cost of only $33,000, which is equivalent to $68 million worth of equipment.
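The cost figures quoted above can be sanity-checked with a little arithmetic. A quick sketch (the $33,000, 2.3 million compute-hour, and $68 million figures are taken directly from the slide; everything else is derived):

```python
# Sanity-check the Cycle Computing deployment figures quoted above.
total_cost_usd = 33_000           # reported cloud bill
compute_hours = 2_300_000         # reported total computing hours
equipment_value_usd = 68_000_000  # reported equivalent equipment cost

cost_per_hour = total_cost_usd / compute_hours
savings_factor = equipment_value_usd / total_cost_usd

print(f"cost per compute-hour: ${cost_per_hour:.4f}")              # ~$0.0143
print(f"cloud was ~{savings_factor:.0f}x cheaper than equipment")  # ~2061x
```

At well under two cents per compute-hour, the claim that this would have cost roughly 2,000 times more as purchased equipment is at least internally consistent.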
</p></li><li><p>Problems</p><ul><li>Performance<ul><li>High performance/throughput</li><li>Performance isolation and resource contention due to multi-tenancy</li><li>Scalability</li></ul></li><li>Load balancing</li><li>Cost efficiency</li><li>Resource failure<ul><li>Fault tolerance</li><li>Failure recovery</li></ul></li></ul></li><li><p>Research Process</p><p>Design → Implementation → Experimental Evaluation → Analysis → Ideas/Feedback (and iterate)</p></li><li><p>Research Process: Ideas</p><ul><li>Pick a promising/interesting domain, e.g.,<ul><li>Cloud computing</li><li>Big data (in scientific discovery, the first three paradigms were experimental, theoretical and computational science (simulation); see The Fourth Paradigm: Data-Intensive Scientific Discovery, Microsoft Research)</li><li>Energy efficiency</li></ul></li><li>Survey/review the literature<ul><li>Read papers, blog posts and even newspaper articles</li><li>Watch leading research groups and community/industry activities</li></ul></li><li>Find a real(istic) problem</li></ul></li><li><p>Research Process: Design</p><ul><li>Algorithms, e.g.,<ul><li>Scheduling</li><li>Resource allocation</li><li>Load balancing</li><li>Fault tolerance</li></ul></li><li>Systems<ul><li>Hardware: new systems architecture, e.g., FAWN</li><li>Software: resource management systems (e.g., Mesos) and workflow execution systems (e.g., DEWE)</li></ul></li></ul></li><li><p>Research Process: Implementation</p><ul><li>A realization of the design</li><li>Often involves<ul><li>Prototyping, e.g., Hadoop schedulers</li><li>Software development (programming)</li></ul></li></ul><p>Key qualities: extensibility, portability, scalability, etc.
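The design step above lists scheduling and load balancing among typical algorithmic targets. Such work usually starts from a very simple baseline; a minimal sketch of one common baseline, greedy "least-loaded first" task placement (the task loads and server count below are made up for illustration):

```python
import heapq

def least_loaded_placement(task_loads, num_servers):
    """Greedily assign each task to the currently least-loaded server.

    A classic baseline policy in load-balancing/scheduling experiments.
    Returns the sorted per-server total loads after all assignments.
    """
    # Min-heap of (current_load, server_id) pairs.
    servers = [(0.0, i) for i in range(num_servers)]
    heapq.heapify(servers)
    for load in task_loads:
        current, sid = heapq.heappop(servers)  # least-loaded server
        heapq.heappush(servers, (current + load, sid))
    return sorted(load for load, _ in servers)

# Hypothetical workload: 6 tasks placed on 3 servers.
print(least_loaded_placement([5, 3, 8, 2, 4, 1], 3))  # [6, 8, 9]
```

A proposed scheduler is typically evaluated against baselines like this one, measuring metrics such as makespan or load imbalance.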
</p></li><li><p>Research Process: Evaluation</p><ul><li>Experimental evaluation</li><li>Simulations<ul><li>Only when you can justify them WELL, e.g., energy efficiency with realistic data such as the Google cluster data or SPEC benchmarks</li><li>Use existing simulators, e.g., NS-3, CloudSim</li><li>Or write your own and make it open source</li></ul></li><li>Experiments on real systems<ul><li>Open-source systems, e.g., Hadoop, Xen, Docker</li><li>Real systems like Amazon EC2 (AWS Education Grants)</li></ul></li></ul></li><li><p>Research Process: Analysis</p><ul><li>Cleaning and organising experimental results</li><li>Scrutinizing results</li><li>Conducting comparison studies</li><li>Describing results with effective use of figures and tables: colours, patterns and different chart types</li></ul></li><li><p>Research Process: Analysis</p><p>[Figure: CPU utilization (%) broken down into user, system, iowait, idle and exec time across cluster configurations (4m4r, 6m6r, 8m8r, SM, LRS)]</p></li><li><p>Issues with Evaluation</p><p>"The seven deadly sins of cloud computing research", HotCloud 2012.</p><p>sin, n.
common simplification or shortcut employed by researchers; may present a threat to the scientific integrity and practical applicability of research</p><p>"Systems Benchmarking Crimes", Gernot Heiser, NICTA/UNSW: "When reviewing systems papers (and sometimes even when reading published papers) I frequently come across highly misleading use of benchmarks. I call such cases benchmarking crimes. Not because you can go to jail for them (but probably should?) but because they undermine the integrity of the scientific process."</p></li><li><p>Issues with Evaluation: The Seven Deadly Sins of Cloud Computing Research</p><ul><li>Sin 1: Unnecessary distributed parallelism</li><li>Sin 2: Assuming performance homogeneity</li><li>Sin 3: Picking the low-hanging fruit</li><li>Sin 4: Forcing the abstraction</li><li>Sin 5: Unrepresentative workloads</li><li>Sin 6: Assuming perfect elasticity</li><li>Sin 7: Ignoring fault tolerance</li></ul></li><li><p>Issues with Evaluation: Systems Benchmarking Crimes</p><ul><li>Pretending micro-benchmarks represent overall performance</li><li>Benchmark sub-setting without strong justification</li><li>Selective data sets hiding deficiencies</li><li>"Throughput degraded by x%" ≠ "overhead is x%"</li><li>6% → 13% overhead reported as "a 7% increase"</li><li>Same dataset for calibration and validation</li><li>No indication of significance of data</li><li>Benchmarking of a simplified simulated system</li><li>Inappropriate and misleading benchmarks</li><li>Relative numbers only</li><li>No proper baseline</li><li>Only evaluating against yourself</li><li>Unfair benchmarking of competitors</li><li>Arithmetic mean for averaging across benchmark scores</li></ul></li></ul>
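The "x% overhead" crimes above come down to simple arithmetic: a jump from 6% to 13% overhead is 7 percentage points, but more than a doubling in relative terms, so calling it "a 7% increase" badly understates the change. A sketch:

```python
# Distinguish percentage points from percent when reporting overheads.
before, after = 6.0, 13.0  # overhead, in percent

absolute_increase = after - before              # in percentage points
relative_increase = (after - before) / before   # as a fraction of the old value

print(f"{absolute_increase:.0f} percentage points")  # 7 percentage points
print(f"{relative_increase:.0%} relative increase")  # 117% relative increase
```

Reporting the percentage-point figure without the relative one (or vice versa) is exactly the kind of misleading framing the benchmarking-crimes list warns against.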
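The last crime listed, averaging benchmark scores with the arithmetic mean, is easy to demonstrate: for normalized ratios the arithmetic mean depends on which system you pick as the baseline, while the geometric mean does not. A minimal example with made-up speedup ratios:

```python
import math

# Made-up speedups of system A over system B on two benchmarks:
# A is 2x faster on one and 2x slower on the other -- a wash overall.
ratios = [2.0, 0.5]

arith = sum(ratios) / len(ratios)                 # 1.25: "A is 25% faster"
geo = math.prod(ratios) ** (1 / len(ratios))      # 1.0: correctly a wash

# Inverting the baseline (B over A) flips the arithmetic verdict:
inv_arith = sum(1 / r for r in ratios) / len(ratios)  # 1.25: "B is 25% faster"

print(arith, geo, inv_arith)
```

Since the arithmetic mean lets both systems claim to be 25% faster than the other, the geometric mean is the standard choice for summarizing ratio-based benchmark scores.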

