appscale: open-source platform-level cloud computing. I2 Joint Techs, February 2nd, 2010. Chandra Krintz, Computer Science Dept., Univ. of California, Santa Barbara

appscale: open-source platform-level cloud computing

I2 Joint Techs February 2nd, 2010

Chandra Krintz Computer Science Dept.

Univ. of California, Santa Barbara

appscale

cloud computing

• Remote access to distributed and shared cluster resources
  • Potentially owned by someone else (e.g. Amazon, Google, …)
  • Users rent access to vast resources
  • Advertised service-level agreements (SLAs)
  • Resources are opaque and isolated
  • Highly scalable, fault tolerant
  • Service-oriented, utility computing
  • Relies on OS, network, and storage virtualization

[Figure labels: SLAs, Web Services, Virtualization]

cloud computing

• 3 types: as-a-Service (aaS)
  • Infrastructure: Amazon Web Services (EC2, S3, EBS)
    • Virtualized, isolated (CPU, Network, Storage) systems on which users execute entire runtime stacks
    • Fully customer self-service
    • Open APIs (IaaS standard), scalable services
  • Platform: Google App Engine, Microsoft Azure
    • Scalable program-level abstractions via well-defined interfaces
    • Enable construction of network-accessible applications
    • Process-level (sandbox) isolation, complete software stack
  • Software: Salesforce.com
    • Applications provided to thin clients over a network
    • Customizable

an opening in the clouds

• Open-source cloud computing systems from the UCSB Computer Science Department
  • Goal: Bring popular cloud fabrics to “on-premise” clusters that are easy to use and are transparent
  • To facilitate investigation of
    • Energy-efficient cloud computing
    • Services, underlying device technology, support technologies
    • Customization (availability, performance, application behavior)
    • Hybrid cloud solutions (public and on-premise)
  • By emulating key cloud layers from the commercial sector
    • Engender user community, access to real applications/users
    • Leverage extant software technologies
  • Not a replacement technology for any Public Cloud service

cloud computing from UCSB

• IaaS:
  • Open-source implementation of all AWS APIs
  • Robust, highly available, scalable emulation
  • Cluster/data center support over Xen, KVM, VMware
• PaaS:
  • Open-source implementation of Google App Engine APIs
  • Pluggable (services), scalable, fault tolerant
  • Runs over virtualization or IaaS layer: AWS, Eucalyptus

google app engine

[Figure: GAE Application (Python, Java) on Google App Engine (GAE), served at MyApp.appspot.com. Services: Images, IM, Memcache, Mail, Users, URL Fetch, Cron, Tasks, Data Store, Blob store. Other labels: Administrator Console, Protobuf Data APIs, SDC, private enterprise data.]

google app engine: the sdk

• Open-source Google App Engine Software Development Kit (SDK)
• python2.5 dev_appserver.py --port=8181 MyApp

[Figure: GAE Application (Python, Java) on Google App Engine (GAE). Services: Images, IM, Mem Cache, Mail, Users, URL Fetch, Cron, Tasks, Data store, Blob Store.]

google app engine: run/test locally

• Open-source Google App Engine Software Development Kit (SDK)
• python2.5 dev_appserver.py --port=8181 MyApp
• Legend: marked services are a simulation of actual API functionality using localhost (flat file, in-memory hash (Memcache))

[Figure: GAE Application (Python, Java) on Google App Engine (GAE). Services: Images, IM, Mem Cache, Mail, Users, URL Fetch, Cron, Tasks, Data store, Blob Store. Annotations on the services: sendmail, curl/wget, framework lib, no auth, on console (three times).]
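A minimal sketch of what "simulation using localhost" can look like: an in-memory hash standing in for Memcache and a flat file standing in for the datastore. The class names `LocalMemcache` and `FlatFileDatastore`, their methods, and the JSON file layout are assumptions made for illustration; they are not the SDK's actual stub classes.

```python
import json
import os
import tempfile
import time


class LocalMemcache:
    """In-memory hash standing in for a Memcache service (hypothetical stub)."""

    def __init__(self):
        self._items = {}  # key -> (value, expiry timestamp or None)

    def set(self, key, value, ttl=None):
        expires = time.time() + ttl if ttl is not None else None
        self._items[key] = (value, expires)

    def get(self, key):
        value, expires = self._items.get(key, (None, None))
        if expires is not None and time.time() >= expires:
            del self._items[key]  # lazily drop expired entries
            return None
        return value


class FlatFileDatastore:
    """Key-value store persisted to one JSON file (hypothetical stub)."""

    def __init__(self, path):
        self._path = path
        self._data = {}
        if os.path.exists(path) and os.path.getsize(path) > 0:
            with open(path) as f:
                self._data = json.load(f)

    def put(self, key, entity):
        self._data[key] = entity
        with open(self._path, "w") as f:
            json.dump(self._data, f)  # rewrite the whole file on every put

    def get(self, key):
        return self._data.get(key)


cache = LocalMemcache()
cache.set("greeting", "hello")

store_path = os.path.join(tempfile.mkdtemp(), "datastore.json")
FlatFileDatastore(store_path).put("user:1", {"name": "alice"})
reloaded = FlatFileDatastore(store_path)  # a fresh instance re-reads the file
```

Stubs like these keep the application code unchanged between local testing and deployment; only the backing implementation differs.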

google app engine: upload to google

• appcfg.py update MyApp/
• Free w/ quotas
• Pay for additional scale: CPU, BW, emails, data
• BigTable
• Automatic scaling
• High availability

[Figure: GAE Application (Python, Java) on Google App Engine (GAE), served at MyApp.appspot.com to GAE app users via the Internet. Services: Images, IM, Mem Cache, Mail, Users, URL Fetch, Cron, Tasks, Data store, Blob Store. Other labels: Administrator Console, SDC, private enterprise data.]

sandbox restrictions

[Figure: GAE Application (Python, Java) on Google App Engine (GAE), MyApp.appspot.com]

• Pure Python or Java, white list of library calls to framework
• No thread/subprocess spawning, system calls
• No writes to file system, reads only to static files uploaded w/app
• Storage using key-value, schema-free datastore (Bigtable-based)
• HTTP/S communication only, CGI to handle page requests
• Limit on number of datastore elements accessed per request
• Limit on response duration, task frequency, request rate
• Enforced quotas (BW, CPU, requests/s, files, app size, …)
• Other things to consider
  • Your code and data on Google resources
  • APIs customized for MVC applications
  • Other application domains not supported
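Two of the restrictions above, the library white list and the per-request datastore limit, can be illustrated with a small sketch. This is not how Google enforces its sandbox; the names `ALLOWED_MODULES`, `sandbox_import`, `RequestBudget`, and the limit of 1000 are hypothetical values chosen for the example.

```python
import importlib

# Hypothetical white list; the real set of permitted modules is not given in the slides.
ALLOWED_MODULES = {"math", "json", "datetime"}


class SandboxViolation(Exception):
    """Raised when application code steps outside the sandbox."""


def sandbox_import(name):
    """Import a module only if it is on the white list."""
    if name not in ALLOWED_MODULES:
        raise SandboxViolation("module not on white list: " + name)
    return importlib.import_module(name)


class RequestBudget:
    """Caps how many datastore elements one request may touch."""

    def __init__(self, max_entities):
        self.max_entities = max_entities
        self.used = 0

    def charge(self, count):
        # Reject the access before counting it, so a failed call costs nothing.
        if self.used + count > self.max_entities:
            raise SandboxViolation("datastore access limit exceeded")
        self.used += count


budget = RequestBudget(max_entities=1000)  # illustrative limit, not GAE's
budget.charge(600)
math_module = sandbox_import("math")
```

A second `budget.charge(500)` in the same request would raise `SandboxViolation`, as would `sandbox_import("subprocess")`.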

from gae to appscale

• GAE SDK extensions
  • Data store: pluggable using open-source distributed database technologies (HBase, Hypertable, Cassandra, Voldemort, MongoDB, MemcacheDB, MySQL)
  • Mem Cache: MemcacheD library (Python and Java)
  • Tasks
    • From console or as background thread (automatically)
    • Interface to Hadoop (MapReduce)
    • Multi-language support: Python, Java, Ruby, Perl, soon: X10
  • Cron: translator to Linux Cron job, similar to Tasks
  • Users: pluggable, built-in cloud-wide authentication via Rails, support for Eucalyptus and EC2 credentials
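A pluggable datastore usually means one narrow interface with interchangeable backends behind it. The sketch below shows that pattern under stated assumptions: `DatastoreBackend`, `InMemoryBackend`, `BACKENDS`, and `make_backend` are hypothetical names, and the in-memory backend only stands in for a real adapter to HBase, Cassandra, or another listed database. It is not AppScale's actual interface.

```python
from abc import ABC, abstractmethod


class DatastoreBackend(ABC):
    """Operations every backend must provide (hypothetical interface)."""

    @abstractmethod
    def put(self, table, key, value):
        ...

    @abstractmethod
    def get(self, table, key):
        ...

    @abstractmethod
    def delete(self, table, key):
        ...


class InMemoryBackend(DatastoreBackend):
    """Stand-in for a real database adapter; keeps everything in a dict."""

    def __init__(self):
        self._tables = {}

    def put(self, table, key, value):
        self._tables.setdefault(table, {})[key] = value

    def get(self, table, key):
        return self._tables.get(table, {}).get(key)

    def delete(self, table, key):
        self._tables.get(table, {}).pop(key, None)


# Registry mapping a configuration name to a backend class.
BACKENDS = {"memory": InMemoryBackend}


def make_backend(name):
    """Look up a backend by name so the choice can come from configuration."""
    if name not in BACKENDS:
        raise ValueError("unknown datastore backend: " + name)
    return BACKENDS[name]()


db = make_backend("memory")
db.put("users", "alice", {"email": "alice@example.com"})
```

Application code calls only `put`, `get`, and `delete`, so swapping the database means registering another class in `BACKENDS` rather than changing the application.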

appscale

[Figure labels: GAE App Developer (AppScale Admin), AppScale tools, HTTPS, AppController, ALB, AS, DB M/P, DB S/P, Tasks (e.g. MapReduce), GAE App Users, AppScale Cloud]

• Distributed system with four key components
  • AppLoadBalancer (ALB)
  • AppServer (AS)
  • Database Master/Peer (DB M/P)
  • Database Slave/Peer (DB S/P)
• Services
  • Automatic deployment, database replication, node & front-end scaling
  • Over Eucalyptus, EC2, and virtualization (Xen, KVM)
  • System-wide performance/availability monitoring, user/admin console
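To make the front-end role concrete, here is a minimal sketch of a load balancer that spreads requests over a growing set of AppServers. The slides do not state which routing policy the ALB uses, so round-robin is an assumption, as are the class and method names.

```python
class AppLoadBalancer:
    """Round-robin router over AppServer addresses (assumed policy)."""

    def __init__(self, servers):
        self._servers = list(servers)
        self._next = 0  # count of requests routed so far

    def add_server(self, address):
        """Front-end scaling: a new AppServer joins the rotation."""
        self._servers.append(address)

    def route(self):
        """Return the AppServer that should handle the next request."""
        if not self._servers:
            raise RuntimeError("no AppServers available")
        server = self._servers[self._next % len(self._servers)]
        self._next += 1
        return server


alb = AppLoadBalancer(["as-1", "as-2"])
first_three = [alb.route() for _ in range(3)]
alb.add_server("as-3")
```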

appscale

[Figure labels: GAE App Developer (AppScale Admin), AppScale tools, HTTPS, AppController, ALB, AS, DB M/P, DB S/P, Tasks (e.g. MapReduce), GAE App Users, AppScale Cloud]

• Implements every AppScale component
  • Can instantiate as a particular role (ALB, AS, DB)
  • Can change functionality and instantiate itself as another
• AppScale tools deploy/control cloud
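The role idea above can be sketched as a node object that starts in one role and can later be re-instantiated as another. Everything here is hypothetical: the `Role` and `Node` names, the `deploy` helper, and its placement rule (first node ALB, second DB, the rest AS) are assumptions for illustration, not the AppScale tools' behavior.

```python
from enum import Enum


class Role(Enum):
    LOAD_BALANCER = "ALB"
    APP_SERVER = "AS"
    DATABASE = "DB"


class Node:
    """One machine that can take on any role."""

    def __init__(self, address):
        self.address = address
        self.role = None  # no role until one is assigned

    def instantiate(self, role):
        """Start (or switch to) the given role and return the previous one."""
        previous, self.role = self.role, role
        return previous


def deploy(addresses):
    """Hypothetical placement: first node ALB, second DB, the rest AS."""
    layout = [Role.LOAD_BALANCER, Role.DATABASE]
    nodes = []
    for i, address in enumerate(addresses):
        node = Node(address)
        node.instantiate(layout[i] if i < len(layout) else Role.APP_SERVER)
        nodes.append(node)
    return nodes


cloud = deploy(["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4"])
previous = cloud[3].instantiate(Role.DATABASE)  # repurpose an AppServer as a DB node
```

Because every node carries all components, changing the shape of the cloud is a matter of reassigning roles rather than provisioning different machine images.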

appscale performance

• 2 VCPUs 2.83GHz, 4GB RAM, 16GB disk

[Figure: Average Time to Query a Database of Size 1000. x-axis: Number of Nodes (1 to 4); y-axis: Query Time [s] (0 to 7). Series: HBase, MongoDB, MemcacheDB, and Google, each with 1 accessor and with 3 accessors.]

appscale projects: http://appscale.cs.ucsb.edu

• Open-source community management
  • Bug fixes, feature additions, releases, user support
• Research (currently only internally available)
  • Automatic scaling of load, demand, other metrics
  • Scheduling and load balancing of apps, tasks, components
  • Hybrid cloud solutions (public/private, multi-zone)
  • Tunable fault-tolerance and availability
  • Efficient communication across isolation boundaries
  • Alternative application domains (streaming, HPC)
  • Distributed profiling/sampling, feedback-driven optimization
  • PaaS/IaaS integration and co-operation
  • Customized, dynamic/adaptive SLAs
  • Platform-aware resource scheduling, isolation, provisioning

appscale http://appscale.cs.ucsb.edu

• Thanks!
  • Leads: Chris Bunch, Navraj Chohan
  • Development and research team: Jovan Chohan, Nupur Garg, Matt Hubert, Jonathan Kupferman, Puneet Lakhina, Yiming Li, Nagy Mostafa, Yoshihide Nomura (Fujitsu), Michal Weigel
• Support
  • NSF, Google, IBM Research