60
A BRIEF INTRODUCTION TO HIGH ASSURANCE CLOUD COMPUTING WITH ISIS2 Ken Birman 1 Cornell University

A Brief Introduction To High Assurance Cloud Computing With Isis2

  • Upload
    quynh

  • View
    33

  • Download
    1

Embed Size (px)

DESCRIPTION

A Brief Introduction To High Assurance Cloud Computing With Isis2. Cornell University. Ken Birman. Isis 2 System. A prebuilt technology that automates many of the hard tasks involved in replicating services and the data on which they depend Targets cloud computing settings - PowerPoint PPT Presentation

Citation preview

Page 1: A Brief Introduction To High Assurance Cloud Computing With Isis2

A BRIEF INTRODUCTION TO HIGH ASSURANCE CLOUD COMPUTING WITH ISIS2Ken Birman

1

Cornell University

Page 2: A Brief Introduction To High Assurance Cloud Computing With Isis2

2

Isis2 System A prebuilt technology that automates

many of the hard tasks involved in replicating services and the data on which they depend

Targets cloud computing settings Available in open-source from

isis2.codeplex.com Intended to be easy to use… … but lacks commercial support, and also

lacks some of the polish of a commercial product

Page 3: A Brief Introduction To High Assurance Cloud Computing With Isis2

3

Isis2 System

Elasticity (sudden scale changes) Potentially heavily loads High node failure rates Concurrent (multithreaded) apps

Long scheduling delays, resource contention Bursts of message loss Need for very rapid response times Community skeptical of “assurance

properties”

C# library (but callable from any .NET language) offering replication techniques for cloud computing developers

Based on a model that fuses virtual synchrony and state machine replication models

Goal is to make it easy for users to deal with some very hard problems seen in distributed settings

Page 4: A Brief Introduction To High Assurance Cloud Computing With Isis2

4

Supported languages… The main library is written in .NET / C#

…and so it is best used from C# Evolved from Java

IronPython also works, and you can use a version of C++ called C++/CLI

.NET runs on Windows Our developers use Visual Studio

… but also can be used on Linux You work with Mono and use the Mono developer

environment or compile using “mcs” at the command line

Page 5: A Brief Introduction To High Assurance Cloud Computing With Isis2

5

Other ways to access Isis2… New “outboard” option

A recent addition…. It allows native C++ and C users to access a subset of the Isis2 functionality

Initial focus is on file replication for memory-mapped files and other big memory-mapped objects

This outboard feature can be accessed from a Isis2 command-line tool To do so you run the Isis2 system as a server Commands to it can then be issued from any

language

Page 6: A Brief Introduction To High Assurance Cloud Computing With Isis2

6

Why Isis2? The new version of Isis is a completely new

system, but builds on a long history First version was the Isis Toolkit (~1985-1996) That system was used to create the French Air

Traffic Control system, the NYSE trading platform, the US Navy AEGIS core communications library and many other mission-critical applications

After the Isis Toolkit, we created Horus and Ensemble (focus was on ties to formal methods)

Isis2 is our latest and best technology. Pronounced “Isis two”, not “Isis squared”

Page 7: A Brief Introduction To High Assurance Cloud Computing With Isis2

7

What does Isis2 do? A library to help you create highly resilient,

secure, scalable applications. The methods automate important tasks

You can run your program on a few machines of your own, or on a big cluster, or in an enterprise network, or even on a cloud platform like Amazon EC2

Isis2 is a “specialist” in solving problems involving data replication within “groups” of programs

Page 8: A Brief Introduction To High Assurance Cloud Computing With Isis2

8

Imagine that some collection of programs are running on some set of machines.Data is said to be replicated if each of them has a copy, and sees the updates to that copy

Key Concept: Replication

Page 9: A Brief Introduction To High Assurance Cloud Computing With Isis2

9

Time

Replication in pictures This is an example of a non-replicated

server with a client who issues some form of request

Update the monitoring and alarmscriteria for Mrs. Marsh as follows…

Confirmed

Response delay seen by end-user would

include Internet latencies

Service response

delay

Service instance

Page 10: A Brief Introduction To High Assurance Cloud Computing With Isis2

10

Replication in pictures Here we replicate data on multiple

programs. The group of programs offers better reliability than a program, and could perform better tooUpdate the monitoring and

alarmscriteria for Mrs. Marsh as follows…

Confirmed

Response delay seen by end-user would

include Internet latencies

Service response

delay

Service group

Page 11: A Brief Introduction To High Assurance Cloud Computing With Isis2

11

Things to notice Although there is a group of programs, the client still

talks to some representative. Standard “web services” requests work in the usual way

The group members are just normal programs and could be running on the same machines, or different machines. They could be running the same code, or different code. They cooperate to replicate data No variables or memory is physically shared by the programs.

Instead, each has a private replica. When an update

happens, a message is sent to tell all the group members update their copies

Page 12: A Brief Introduction To High Assurance Cloud Computing With Isis2

12

The big picture…

Steps in using Isis2

Page 13: A Brief Introduction To High Assurance Cloud Computing With Isis2

13

First questions Step back and draw a sketch of your

application Client systems, if it has remote access The service you want to build that will use

Isis2 Data files the service may need, or

databases Other subsystems or services it will interact

with

Page 14: A Brief Introduction To High Assurance Cloud Computing With Isis2

14

Our example from earlier

Response

Request

Computes

Service group

Updates that modify

replicated data

Page 15: A Brief Introduction To High Assurance Cloud Computing With Isis2

15

Where will the service run? Will you run your replicated service…

Next to its clients, on the same machines? On EC2, RackSpace, GooglePlex, MSFT

Azure, ….? In a cluster of computers up the hallway?

How will clients connect to it? Web Services remote method invocation? CORBA RPC? TCP connections?

Page 16: A Brief Introduction To High Assurance Cloud Computing With Isis2

16

Now build a non-replicated instance

Before adding Isis2 mechanisms, build a single instance of your server and test access to it from your client, or whatever else might access it

Debug this code. Adding Isis2 functionality after the fact isn’t

hard and may really be much easier than using it from the outset!

But planning ahead can help: using C#, C++/CLI or IronPython will be far easier than other options

Page 17: A Brief Introduction To High Assurance Cloud Computing With Isis2

17

… some people get stuck at step 1! Building a cloud-hosted service involves

steps you may never have tried before For example, with Visual Studio, you create

a new kind of service instance, and need to tell it which methods are remotely available, and then need to install and run it on your machine, which may involve firewall changes to allow clients to access it…

… Isis2 can’t really help with that. But it isn’t terribly hard. Just do a small step at a time following the instructions for the package you decide to use

Page 18: A Brief Introduction To High Assurance Cloud Computing With Isis2

18

Why not native C or native C++? We do support a stripped-down way to use Isis2

from these and other languages, but you only can access part of the functionality

We’ll discuss these features later, but not today

If you know Java, you will find it easy to migrate to C#, and we would recommend that route C# works on both Windows and Linux (via Mono)

Page 19: A Brief Introduction To High Assurance Cloud Computing With Isis2

19

Designing the replicated functions Now you are ready to design the replication

features of your service Ask what data needs to be replicated? How will it be loaded initially? How will it be

updated? Will you do load-balanced queries (if so they can just

access any service member), or parallel queries? What synchronization or coordination is needed?

Start with a good image of how you want it to work

Page 20: A Brief Introduction To High Assurance Cloud Computing With Isis2

20

State machine model Think about a state machine

Some sort of object or class, with state in it (data) The state is updated deterministically with events

occur (updates…), and it can be checkpointed and later loaded back in from the checkpoint

With Isis2 we simply replicate a state machine! You design an object or subsystem that has the

replicated data in it The updates become multicasts, totally ordered,

and also group membership changes

Page 21: A Brief Introduction To High Assurance Cloud Computing With Isis2

21

Replication in pictures By the time you are ready to code, you’ll

have a picture similar to this one, specific to your setting

Update the monitoring and alarmscriteria for Mrs. Marsh as follows…

Confirmed

Response delay seen by end-user would

include Internet latencies

Service response

delay

Service group

Hosted on EC2

Ordered multicast

to replicate update

Page 22: A Brief Introduction To High Assurance Cloud Computing With Isis2

22

From this picture… Now your needs are more concrete

How will I represent the replicated state in my group? How can I copy the current data to a joining member? I don’t want to reply to the client until all the replicas

have been updated. What’s the best way to do that? I’m worried about speed. What will limit response

delays? How fast can a service like this run? This set of MOOC modules will help you answer

such questions, but experiments and optimization of your solution are certainly needed too…

Page 23: A Brief Introduction To High Assurance Cloud Computing With Isis2

23

(some data…)

Let’s replicate!

Page 24: A Brief Introduction To High Assurance Cloud Computing With Isis2

24

How would you solve this problem? Assume you have access to a client-

server technology, like the ones used to build web services

You would need a list of group members Then when an update happens, the

program receiving the request could update the others

Page 25: A Brief Introduction To High Assurance Cloud Computing With Isis2

25

Replication version 0

X=5 X=5 X=5 X=5

Set x=x+150 X=155 X=155

X=155

X=155

Page 26: A Brief Introduction To High Assurance Cloud Computing With Isis2

26

Things you’ll need to think about

Keep track of group membership. Can it change? If new members join, how do they learn the initial value of

x, or other “replicated state”? When someone joins, how do we update the group

membership list? If a member fails, how are they dropped from the group?

Implement the 1-to-(N-1) “multicast” Handle reliability issues: what if a request times out? Handle security: should we use SSL? Something else? What if two conflicting updates happen concurrently?

Page 27: A Brief Introduction To High Assurance Cloud Computing With Isis2

27

This sounds hard! … in fact it is very hard

Any form of replication is equivalent to a famous problem called distributed consensus

There are many published papers on this topic; you’ll want to use a good solution

Once you solve the basic issues you’ll still face many challenges of performance, scale, portability, respecting rules for the particular runtime setting…

Page 28: A Brief Introduction To High Assurance Cloud Computing With Isis2

28

We say that replicated data is consistent if multiple users accessing it can’t detect whether or not it was replicated.

Key Concept: Consistency

Page 29: A Brief Introduction To High Assurance Cloud Computing With Isis2

29

Consistent replicated data

X=5 X=5 X=5 X=5

Set x=x+150 X=155 X=155

X=155

X=155

Set x=22

X=22X=22

X=22

X=22

Page 30: A Brief Introduction To High Assurance Cloud Computing With Isis2

30

Inconsistent replicated data

X=5 X=5 X=5 X=5

Set x=x+150 X=155 X=155

X=155

X=155

Set x=22

X=22X=22

X=22

X=22

Thinks x=155Think

s x=22

Thinks

x=22

Thinks

x=22

Isis2 never lets this happen. It automatically

enforces ordering for you

Page 31: A Brief Introduction To High Assurance Cloud Computing With Isis2

31

Possible sources of inconsistency? Conflicting updates could reach members in different

orders

Maybe someone dropped an update (network message loss), or applied one twice.

Update sender could have failed while sending the updates, so that some copies were sent, but others weren’t sent

Confusion about membership: perhaps some member was joining and the update initiator didn’t realize it, hence didn’t send the update to it, or the initial state was “wrong”

Page 32: A Brief Introduction To High Assurance Cloud Computing With Isis2

32

A subtle risk Failure handling poses hard problems!

Suppose that process A is supposed to send an update reliably to processes B, C and D

A might use TCP, or some other reliable protocol. But if process C fails, A needs to give up.

What if the network temporarily fails, and A can’t reach C? This is indistinguishable from C failing, except that later, C will be accessible again!

Page 33: A Brief Introduction To High Assurance Cloud Computing With Isis2

33

Split brain syndrome We say that the network “partitioning”

issue causes a “split brain” problem

Some members of our group think that A and B the healthy members. Others think that C and D are healthy and that A and B have failed. Which “side” is right? Who should be in charge if this group is doing

something critical, like deciding which plane can land on a particular runway?

Page 34: A Brief Introduction To High Assurance Cloud Computing With Isis2

34

Split brain syndrome

Inconsistency can be very dangerous!

Flight AA 27 clear to land on

runway 3-B

Flight US 1827 clear to take off on runway 3-B

Page 35: A Brief Introduction To High Assurance Cloud Computing With Isis2

35

Split brain syndromeFlight US 1827 clear to take off on runway 3-B

A B C D

Consider this air traffic control “server”. It helps make sure runways are used safely.

Page 36: A Brief Introduction To High Assurance Cloud Computing With Isis2

36

Split brain syndrome

Flight AA 27 clear to land on

runway 3-B

Flight US 1827 clear to take off on runway 3-B

A B C D

Temporary network failure: A and B can talk to each other but can’t reach

C or D, and vice-versa

Isis2 never lets this happen. It automatically

prevents split-brain behavior

Page 37: A Brief Introduction To High Assurance Cloud Computing With Isis2

37

Other settings that need consistency Medical systems that give information about

patient status and current drug regimen Smart power grid system that controls power

infrastructure components such as transformers Self-driving vehicles that coordinate while driving

at high speed on highways Control systems for chemical plants and refineries Corporate accounting systems, and payroll

… you can make quite a long list.

Page 38: A Brief Introduction To High Assurance Cloud Computing With Isis2

38

Cloud computing: CAP theorem The most common tools for building cloud computing

solutions don’t help with consistency They treat the property as a special need and assume

you’ll find your own ways to do this This reflects the so-called CAP theorem

“You can have at most two out of these three:Consistency, Availability and Partition Tolerance.” Eric Brewer, Berkeley

Cloud platforms assume you are not sophisticated enoughto make tradeoffs. They weaken consistency to get the fastest possible response times under the widest possible range of operating conditions.

Page 39: A Brief Introduction To High Assurance Cloud Computing With Isis2

39

By using a preexisting library like Isis2 you don’t need to solve these problems yourselfThe system solves them for you

Isis2 to the rescue!

In Egyptian mythology, Isis rescued Osiris after he was torn to pieces in an epic battle with Horus. She restored him to life and he went on to rule the underworld. Their child,

Set, later defeated Horus and banished him. The story illustrates the value of fault-

tolerance

Page 40: A Brief Introduction To High Assurance Cloud Computing With Isis2

40

Isis2 makes developer’s life easier

Formal model permits us to achieve correctness

Isis2 is based on protocols that can be expressed mathematically and proved correct

Think of Isis2 as a collection of modules, each with rigorously stated properties

Isis2 implementation needs to be fast, lean, easy to use

Developer must see it as easier to use Isis2 than to build from scratch

Seek great performance under “cloudy conditions”

Forced to anticipate many styles of use

Benefits of Using Formal model

Importance of Sound Engineering

Page 41: A Brief Introduction To High Assurance Cloud Computing With Isis2

Isis2 makes developer’s life easier

Group g = new Group(“myGroup”);Dictionary<string,double> Values = new

Dictionary<string,double>();g.ViewHandlers += delegate(View v) {

Console.Title = “myGroup members: “+v.members;};g.Handlers[UPDATE] += delegate(string s, double v) { Values[s] = v;};g.Handlers[LOOKUP] += delegate(string s) { g.Reply(Values[s]);};g.Join();

g.OrderedSend(UPDATE, “Harry”, 20.75);

List<double> resultlist = new List<double>();nr = g.Query(ALL, LOOKUP, “Harry”, EOL, resultlist);

First sets up group

Join makes this entity a member. State transfer isn’t shown

Then can multicast, query. Runtime callbacks to the “delegates” as events arrive

Easy to request security (g.SetSecure), persistence

“Consistency” model dictates the ordering aseen for event upcalls and the assumptions user can make

41

Page 42: A Brief Introduction To High Assurance Cloud Computing With Isis2

Isis2 makes developer’s life easier

Group g = new Group(“myGroup”);Dictionary<string,double> Values = new

Dictionary<string,double>();g.ViewHandlers += delegate(View v) {

Console.Title = “myGroup members: “+v.members;};g.Handlers[UPDATE] += delegate(string s, double v) { Values[s] = v;};g.Handlers[LOOKUP] += delegate(string s) { g.Reply(Values[s]);};g.Join();

g.OrderedSend(UPDATE, “Harry”, 20.75);

List<double> resultlist = new List<double>();nr = g.Query(ALL, LOOKUP, “Harry”, EOL, resultlist);

First sets up group

Join makes this entity a member. State transfer isn’t shown

Then can multicast, query. Runtime callbacks to the “delegates” as events arrive

Easy to request security (g.SetSecure), persistence

“Consistency” model dictates the ordering seen for event upcalls and the assumptions user can make

42

Page 43: A Brief Introduction To High Assurance Cloud Computing With Isis2

Isis2 makes developer’s life easier

Group g = new Group(“myGroup”);Dictionary<string,double> Values = new

Dictionary<string,double>();g.ViewHandlers += delegate(View v) {

Console.Title = “myGroup members: “+v.members;};g.Handlers[UPDATE] += delegate(string s, double v) { Values[s] = v;};g.Handlers[LOOKUP] += delegate(string s) { g.Reply(Values[s]);};g.Join();

g.OrderedSend(UPDATE, “Harry”, 20.75);

List<double> resultlist = new List<double>();nr = g.Query(ALL, LOOKUP, “Harry”, EOL, resultlist);

First sets up group

Join makes this entity a member. State transfer isn’t shown

Then can multicast, query. Runtime callbacks to the “delegates” as events arrive

Easy to request security (g.SetSecure), persistence

“Consistency” model dictates the ordering seen for event upcalls and the assumptions user can make

43

Page 44: A Brief Introduction To High Assurance Cloud Computing With Isis2

Isis2 makes developer’s life easier

Group g = new Group(“myGroup”);Dictionary<string,double> Values = new

Dictionary<string,double>();g.ViewHandlers += delegate(View v) {

Console.Title = “myGroup members: “+v.members;};g.Handlers[UPDATE] += delegate(string s, double v) { Values[s] = v;};g.Handlers[LOOKUP] += delegate(string s) { g.Reply(Values[s]);};g.Join();

g.OrderedSend(UPDATE, “Harry”, 20.75);

List<double> resultlist = new List<double>();nr = g.Query(ALL, LOOKUP, “Harry”, EOL, resultlist);

First sets up group

Join makes this entity a member. State transfer isn’t shown

Then can multicast, query. Runtime callbacks to the “delegates” as events arrive

Easy to request security (g.SetSecure), persistence

“Consistency” model dictates the ordering seen for event upcalls and the assumptions user can make

44

Page 45: A Brief Introduction To High Assurance Cloud Computing With Isis2

Isis2 makes developer’s life easier

Group g = new Group(“myGroup”);Dictionary<string,double> Values = new

Dictionary<string,double>();g.ViewHandlers += delegate(View v) {

Console.Title = “myGroup members: “+v.members;};g.Handlers[UPDATE] += delegate(string s, double v) { Values[s] = v;};g.Handlers[LOOKUP] += delegate(string s) { g.Reply(Values[s]);};g.Join();

g.OrderedSend(UPDATE, “Harry”, 20.75);

List<double> resultlist = new List<double>();nr = g.Query(ALL, LOOKUP, “Harry”, EOL, resultlist);

First sets up group

Join makes this entity a member. State transfer isn’t shown

Then can multicast, query. Runtime callbacks to the “delegates” as events arrive

Easy to request security (g.SetSecure), persistence

“Consistency” model dictates the ordering seen for event upcalls and the assumptions user can make

45

Page 46: A Brief Introduction To High Assurance Cloud Computing With Isis2

46

Each replica is a completely standard program with its own private, local variables. Nothing is “physically shared”!So in particular, the variable “Values” is local: each program has its own copy of this variableThe different copies get the same messages, in the same order. This allows them to apply the same updates.

Key Concept: Local Data

Page 47: A Brief Introduction To High Assurance Cloud Computing With Isis2

47

C# Dictionary Generic Type C# supports generic types

Objects in which some other type is a parameter:

Dictionary<string,double> Values = new Dictionary<string,double>();

A C# Dictionary maps a key to a value. This Dictionary maps strings to doubles.

The variable name is Values. Each program has its own private copy.

Page 48: A Brief Introduction To High Assurance Cloud Computing With Isis2

Isis2 makes developer’s life easier

Group g = new Group(“myGroup”);Dictionary<string,double> Values = new

Dictionary<string,double>();g.ViewHandlers += delegate(View v) {

Console.Title = “myGroup members: “+v.members;};g.Handlers[UPDATE] += delegate(string s, double v) { Values[s] = v;};g.Handlers[LOOKUP] += delegate(string s) { g.Reply(Values[s]);};g.Join();

g.OrderedSend(UPDATE, “Harry”, 20.75);

List<double> resultlist = new List<double>();nr = g.Query(ALL, LOOKUP, “Harry”, EOL, resultlist);

First sets up group

Join makes this entity a member. State transfer isn’t shown

Then can multicast, query. Runtime callbacks to the “delegates” as events arrive

Easy to request security (g.SetSecure), persistence

“Consistency” model dictates the ordering seen for event upcalls and the assumptions user can make

48

Page 49: A Brief Introduction To High Assurance Cloud Computing With Isis2

Isis2 makes developer’s life easier

Group g = new Group(“myGroup”);Dictionary<string,double> Values = new

Dictionary<string,double>();g.ViewHandlers += delegate(View v) {

Console.Title = “myGroup members: “+v.members;};g.Handlers[UPDATE] += delegate(string s, double v) { Values[s] = v;};g.Handlers[LOOKUP] += delegate(string s) { g.Reply(Values[s]);};g.Join();

g.Send(UPDATE, “Harry”, 20.75);

List<double> resultlist = new List<double>();nr = g.Query(ALL, LOOKUP, “Harry”, EOL, resultlist);

First sets up group

Join makes this entity a member. State transfer isn’t shown

Then can multicast, query. Runtime callbacks to the “delegates” as events arrive

Easy to request security (g.SetSecure), persistence

“Consistency” model dictates the ordering seen for event upcalls and the assumptions user can make

49

Page 50: A Brief Introduction To High Assurance Cloud Computing With Isis2

50

Concept: A “multi-query” Our lookup is

Multicast to the group All members respond

A chance for parallelism Each can do part of the

job: e.g. search 1/nth of a database

Reduces response delays

Lookup “Harry” in the Ithaca phone directory

Front end

With n replicas... ... we get an n times speedup!

Names with Harry in them: ....

Page 51: A Brief Introduction To High Assurance Cloud Computing With Isis2

Isis2 makes developer’s life easier

Group g = new Group(“myGroup”);Dictionary<string,double> Values = new

Dictionary<string,double>();g.ViewHandlers += delegate(View v) {

Console.Title = “myGroup members: “+v.members;}; g.Handlers[UPDATE] += delegate(string s, double v) { Values[s] = v;};g.Handlers[LOOKUP] += delegate(string s) { g.Reply(Values[s]);};g.Join();

g.OrderedSend(UPDATE, “Harry”, 20.75);

List<double> resultlist = new List<double>();nr = g.OrderedQuery(ALL, LOOKUP, “Harry”, EOL,

resultlist);

First sets up group

Join makes this entity a member. State transfer isn’t shown

Then can multicast, query. Runtime callbacks to the “delegates” as events arrive

Easy to request security (g.SetSecure), persistence

“Consistency” model dictates the ordering seen for event upcalls and the assumptions user can make

51

Tells Isis2 which kind of multicast order and durability you need

Page 52: A Brief Introduction To High Assurance Cloud Computing With Isis2

52

RPC versus multi-query In a remote

procedure call, the client asks the server to do something A message is sent Server replies in a

second message Result returned

much as from a local method

With a multi-query A multicast is sent Servers respond A list of results is

returned

If the group had N members when the query was done, we get N responses

Page 53: A Brief Introduction To High Assurance Cloud Computing With Isis2

53

But all the replies will be the same! In our example, Values is identical in all replicas

… so what would be the point in asking for ALL replies?

In fact Isis2 has many options You could just ask one group member to do the query.

With N members, the group can handle N times the load You could ask the k’th member to search just 1/k’th of

the Values array. This would be N times faster You could combine these ideas and treat your group as

if it had N members in N/K subsets of size K each…

Page 54: A Brief Introduction To High Assurance Cloud Computing With Isis2

Adding security: Just one line!

Group g = new Group(“myGroup”);Dictionary<string,double> Values = new

Dictionary<string,double>();g.ViewHandlers += delegate(View v) {

Console.Title = “myGroup members: “+v.members;};g.Handlers[UPDATE] += delegate(string s, double v) { Values[s] = v;};g.Handlers[LOOKUP] += delegate(string s) { g.Reply(Values[s]);};g.SetSecure(myKey);g.Join();

g.OrderedSend(UPDATE, “Harry”, 20.75);

List<double> resultlist = new List<double>();nr = g.OrderedQuery(ALL, LOOKUP, “Harry”, EOL, resultlist);

First sets up group

Join makes this entity a member. State transfer isn’t shown

Then can multicast, query. Runtime callbacks to the “delegates” as events arrive

Easy to request security, persistence, UNICAST_ONLY, etc

“Consistency” model dictates the ordering seen for event upcalls and the assumptions user can make

54

Page 55: A Brief Introduction To High Assurance Cloud Computing With Isis2

55

Picking a multicast primitive Optimistic delivery

RawSend, Send, CausalSend, OrderedSend

Use Flush(k) prior to talking to an external entity like a client

Very fast, ideal for services with “soft state”

Very pessimistic SafeSend: a version of

Paxos, like OrderedSend but with stronger durability

Active “disk logger” for strongest assurance

Slow, but suitable for updating a replicated database or persistent external filesWe’ll discuss these issues in detail later

Page 56: A Brief Introduction To High Assurance Cloud Computing With Isis2

56

Our example was overly simple it didn’t show the “state transfer” code

Corresponds to the “white arrows” in time-line figure In Isis2 we have a way

to make checkpoints State transfer: Some

active member makes acheckpoint, and the joinerloads the state from it.

The code looks like other operations in our example Checkpoints can also be used to save group state

during periods when all members are inactive

p

q

r

s

t

Time: 0 10 20 30 40 50 60 70

Time

Page 57: A Brief Introduction To High Assurance Cloud Computing With Isis2

57

Some uses for process groups To replicate data

maintained by the members in memory

To replicate actions taken on an external service such as a replicated database

To ensure that all replicas are configured the same way

To coordinate the processing of requests and load-balance

To offer a way to parallelize processing by having each group member do part of the work

Fault-tolerance via a backup scheme

Page 58: A Brief Introduction To High Assurance Cloud Computing With Isis2

58

Isis2 Summary A library that you can invoke from a normal

program written in a normal way

It does the work of creating groups and sending multicasts and ensuring that the consistency model will be enforced

The developer just tells it what to do. She thinks about a parallel distributed application. Virtual synchrony eliminates many hard problems

Page 59: A Brief Introduction To High Assurance Cloud Computing With Isis2

59

Why not build it yourself from scratch?

Isis2 user

object

Isis2 user

object

Isis2 user

object

Isis2 library

Group instances and multicast protocolsFlow Control

Membership Oracle

Large Group Layer UDP overlayDr. Multicast Platform Security

Reliable Sending Fragmentation Group Security

Sense Runtime EnvironmentSelf-stabilizing

Bootstrap ProtocolSocket Mgt/Send/Rcv

SendCausalSend

OrderedSendSafeSendQuery....

Message Library “Wrapped” locks Bounded Buffers

Oracle Membership

Group membership

Report suspected failures

Other groupmembers

SafeSend and Send are two of the protocol components hosted over what we call the large-

scale properties sandbox. The sandbox addresses issues like flow control, security, etc.

All protocols share and benefit from those properties

These systems are complex, especially if you want to run on platforms like EC2

By using Isis2 you “inherit” 30 years of research on how to make it work

OOB File Replication Key-Value Store

Views

The SandBox itself is mostly composed of “convergent” protocols that use

probabilistic methods

Page 60: A Brief Introduction To High Assurance Cloud Computing With Isis2

60

Isis2: Cornell’s cloud solution

Isis2 makes it easy to write programs in a standard way, using the methods in its library, that replicate data and can coordinate their actions The system automates tasks needed to

ensure consistency Without the system, consistency, security

and other such issues are very hard to solve But you still need to decide where to run

your programs, make sure they can find one-another, etc. Isis2 focuses mostly on keeping them consistent.