Cloud.la PoolParty Presentation

  • 8/14/2019 Cloud.la PoolParty Presentation

    Part 1: Hadoop

    A demo about setting up a Hadoop cluster with PoolParty

    Service-oriented deployment

    I want to deploy a service (http/mail/etc.)
    I don't care about the platform
    I want it to work
    I want to do it a million times
    I want it now

    Self managing

    Gruntwork

    Why?

    Let's be

    Cutting-edge
    Intuitive
    Tool-driven
    Lazy

    Tools

    Because we are human

    Big frontier

    Shell scripts
    Capistrano
    Package managers (apt, yum, tpkg)
    Chef
    Puppet

    lots of settlements

    PoolParty

    Enjoyable cloud infrastructure

    Demo

    Part 2

    Discussion of distributed algorithms, the Hermes project and NoSQL

    Distributed Algorithms

    What?

    A distributed algorithm is an algorithm designed to run on computer hardware constructed from interconnected processors.

    (Wikipedia)

    Why?

    Because scale is becoming increasingly important
    Datacenters are becoming accessible
    Commodity hardware is cheap
    Network is cheaper

    When?

    Now

    Assorted types

    MapReduce
    Atomic Commit
    Consensus
    Mutual exclusion (distributed mutex)
    Distributed search
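
To make the MapReduce entry in this list concrete, here is a minimal single-process sketch in Python that counts words. The function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are illustrative assumptions, not taken from Hadoop or from the talk. On a real cluster the map and reduce steps would run on different nodes, with the shuffle moving data between them.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def map_phase(document: str) -> List[Tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle: group all emitted values by their key."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(groups)


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce: collapse the values for one key into a single count."""
    return key, sum(values)


def word_count(documents: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle and reduce in sequence (all in one process here)."""
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
```

Calling `word_count(["a b a", "b c"])` counts each word across both documents. Because every `map_phase` call looks at one document only, the map work can be split across machines without coordination.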

    Why it's easy

    Math is fun

    Why it's hard

    Account for failure
    Unsafe networks
    Data sharding
    Job distribution

    Decentralization assumptions

    Nodes are prone to failure
    Nodes are homogeneous*
    The dataset is large
    Nodes are cheap (easy to add/remove)
    Network is unowned

    Scale

    Greater utilization of hardware
    Inexpensive
    Cooperative application space
    And it's green

    NoSQL

    Scaling relational databases is not easy and seriously no fun at all
    Key/Value stores are easier to scale
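
One reason key/value stores scale more easily is that each key can be placed on a node independently of every other key. The sketch below illustrates that idea with simple hash-modulo placement in Python. `shard_for` and `ShardedStore` are hypothetical names for this example, and the "nodes" are plain in-memory dictionaries rather than real servers.

```python
import hashlib
from typing import Any, Dict, List, Optional


def shard_for(key: str, nodes: List[str]) -> str:
    """Pick the node responsible for a key from a stable hash of the key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(nodes)
    return nodes[index]


class ShardedStore:
    """Toy key/value store that spreads keys over named in-memory nodes."""

    def __init__(self, nodes: List[str]) -> None:
        self.nodes = list(nodes)
        # One dictionary per node stands in for one server's local storage.
        self.data: Dict[str, Dict[str, Any]] = {n: {} for n in self.nodes}

    def put(self, key: str, value: Any) -> None:
        self.data[shard_for(key, self.nodes)][key] = value

    def get(self, key: str) -> Optional[Any]:
        return self.data[shard_for(key, self.nodes)].get(key)
```

A lookup touches exactly one node, so adding nodes adds capacity. Modulo placement is the simplest possible choice and is an assumption of this sketch: it remaps most keys when the node list changes, which is why consistent hashing is a common alternative.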

    NoSQL

    BigTable (column-oriented database)
    Cassandra
    Voldemort
    Scalaris

    Paxos

    An algorithm for deciding consensus within a network of unreliable nodes.

    Transaction layer
    Atomic commits
    Strong data consistency
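
As a rough illustration of how consensus can survive unreliable nodes, here is a minimal in-memory sketch of single-decree Paxos in Python. It is a simplification under stated assumptions: `Acceptor` and `propose` are hypothetical names, failure is modelled by an `alive` flag, messages are plain method calls, and there are no retries or learners. It is not the Hermes implementation.

```python
from dataclasses import dataclass
from typing import Any, List, Optional, Tuple


@dataclass
class Acceptor:
    promised: int = -1          # highest proposal number promised so far
    accepted_n: int = -1        # number of the accepted proposal, -1 if none
    accepted_value: Any = None  # value of the accepted proposal
    alive: bool = True          # False simulates a crashed node

    def prepare(self, n: int) -> Optional[Tuple[int, Any]]:
        """Phase 1: promise to ignore proposals numbered below n."""
        if not self.alive or n <= self.promised:
            return None
        self.promised = n
        return (self.accepted_n, self.accepted_value)

    def accept(self, n: int, value: Any) -> bool:
        """Phase 2: accept unless a higher-numbered promise was made."""
        if not self.alive or n < self.promised:
            return False
        self.promised = n
        self.accepted_n = n
        self.accepted_value = value
        return True


def propose(acceptors: List[Acceptor], n: int, value: Any) -> Optional[Any]:
    """Run one proposal round; return the chosen value, or None on failure."""
    majority = len(acceptors) // 2 + 1

    # Phase 1: a majority must promise before we may continue.
    promises = [p for p in (a.prepare(n) for a in acceptors) if p is not None]
    if len(promises) < majority:
        return None

    # If any acceptor already accepted a value, adopt the one with the
    # highest proposal number instead of our own.
    prev_n, prev_value = max(promises, key=lambda p: p[0])
    if prev_n >= 0:
        value = prev_value

    # Phase 2: the value is chosen once a majority accepts it.
    acks = sum(1 for a in acceptors if a.accept(n, value))
    return value if acks >= majority else None
```

With five acceptors, a proposal still succeeds when two have crashed, and a later proposer with a higher number is forced to adopt the value already chosen. That second property is what gives the protocol its consistency.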

    Paxos

    Devised by Leslie Lamport in 1990
    Published in 1998
    Based on viewstamped replication (published 2 years earlier)

    Big names

    Google's Chubby (and BigTable)
    IBM SAN Volume Controller
    Microsoft

    What?

    What? (cont'd)

    Hermes

    Open-source internode communication project

    What

    Erlang-y
    Consensus algorithms
    Distributed mutex
    Mapping/Reduction

    Where? (almost)

    http://github.com/auser/hermes/tree/master
    [email protected]

    Thanks

    Ari Lerner

    AT&T CloudTeam

    And all the various funny image sources

    irc.freenode.net/#poolpartyrb

    You