Towards Elastic Operating Systems

Amit Gupta, Ehab Ababneh, Richard Han, Eric Keller
University of Colorado, Boulder

OS + Cloud Today

[Figure labels: OS/Process, ELB/Cloud Mgr]

Resources Limited
• Thrashing
• CPUs limited
• I/O bottlenecks
  • Network
  • Storage

Present Workarounds
• Additional scripting/code changes
• Extra modules/frameworks
• Coordination
• Synchronizing/aggregating state

Stretch Process

[Figure label: OS/Process]

Advantages
• Expands available memory
• Extends the scope of multithreaded parallelism (more CPUs available)
• Mitigates I/O bottlenecks
  • Network
  • Storage

ElasticOS: Our Vision

ElasticOS: Our Goals
• “Elasticity” as an OS service
• Elasticize all resources: memory, CPU, network, …
• Single-machine abstraction
  • Apps are unaware whether they’re running on 1 machine or 1000 machines
• Simpler parallelism
• Compatible with an existing OS (e.g., Linux, …)

“Stretched” Process: Unified Address Space

[Figure: OS/Process; Elastic Page Table with columns V, R, and Location]

Movable Execution Context

[Figure label: OS/Process]

• OS handles elasticity: apps don’t change
• Partition locality across multiple nodes
• Useful for single (and multiple) threads
• For multiple threads, seamlessly exploit network I/O and CPU parallelism

Replicate Code, Partition Data

[Figure labels: CODE (×3), Data 1, Data 2]

• Unique copy of data (unlike DSM)
• Execution context follows data (unlike process migration, SSI)
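The placement rule on this slide can be sketched roughly as follows: code pages are copied to every node, while each data page has exactly one owner. The `Placement` class, the node names, and the round-robin assignment are assumptions made for illustration only; the slides do not say how data pages are assigned to nodes.

```python
from dataclasses import dataclass, field


@dataclass
class Placement:
    """Hypothetical placement map: code pages replicated, data pages single-owner."""
    nodes: list
    code_pages: set = field(default_factory=set)
    data_owner: dict = field(default_factory=dict)

    def add_code_page(self, page):
        # Code pages may be copied everywhere; no single owner is recorded.
        self.code_pages.add(page)

    def add_data_page(self, page):
        # Assumption: round-robin assignment; each data page lives on one node only.
        self.data_owner[page] = self.nodes[len(self.data_owner) % len(self.nodes)]

    def holders(self, page):
        """Return the nodes that hold a copy of `page`."""
        if page in self.code_pages:
            return list(self.nodes)
        return [self.data_owner[page]]


placement = Placement(nodes=["node1", "node2"])
placement.add_code_page("CODE")
placement.add_data_page("Data 1")
placement.add_data_page("Data 2")
print(placement.holders("CODE"))    # every node
print(placement.holders("Data 1"))  # a single owner
```

Because each data page has a single holder, an access to a page held elsewhere must either move the page or move the execution context.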

Exploiting Elastic Locality
• We need an adaptive page clustering algorithm
• LRU, NSWAP, i.e., “always pull”
• Execution follows data, i.e., “always jump”
• Hybrid (initial): pull pages, then jump
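To make the two baseline strategies concrete, here is a minimal sketch that counts page pulls under “always pull” and context jumps under “always jump” for a short access trace. The function names, the trace, and the page-to-node map are hypothetical, and the model ignores eviction and real network costs.

```python
def always_pull(trace, owner, start_node):
    """Count page pulls when execution stays put and remote pages are fetched."""
    local = {page for page, node in owner.items() if node == start_node}
    pulls = 0
    for page in trace:
        if page not in local:
            pulls += 1        # fetch the remote page
            local.add(page)   # assumption: unlimited local memory, no eviction
    return pulls


def always_jump(trace, owner, start_node):
    """Count context jumps when execution moves to whichever node holds the page."""
    current = start_node
    jumps = 0
    for page in trace:
        if owner[page] != current:
            jumps += 1
            current = owner[page]
    return jumps


# Hypothetical layout: page "a" on node1, pages "b" and "c" on node2.
owner = {"a": "node1", "b": "node2", "c": "node2"}
trace = ["a", "b", "a", "b", "c"]
print(always_pull(trace, owner, "node1"))  # 2
print(always_jump(trace, owner, "node1"))  # 3
```

On this trace the thread alternates between pages on different nodes, so “always jump” pays for every switch, while “always pull” pays once per remote page.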

Status and Future Work
• Complete our initial prototype
• Improve our page placement algorithm
• Improve context jump efficiency
• Investigate fault tolerance issues

Thank You

Questions?

Contact: amit.gupta@colorado.edu

Algorithm Performance (1)

Algorithm Performance (2)

Page Placement: Multinode Adaptive LRU

[Figure: two nodes, each with CPUs, Mem, and Swap. Pull first; when the pulls threshold is reached, jump the execution context.]
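The “pull first, jump once the pulls threshold is reached” behaviour in this figure might look like the sketch below. The `PullThenJump` class, the per-node pull counter, and resetting the counter after a jump are all assumptions; the slides do not define how the threshold is measured.

```python
class PullThenJump:
    """Hypothetical hybrid policy: pull remote pages until a threshold, then jump."""

    def __init__(self, owner, start_node, pull_threshold):
        self.owner = dict(owner)        # page -> node currently holding it
        self.node = start_node          # node where the thread is executing
        self.pull_threshold = pull_threshold
        self.pulls_from = {}            # remote node -> pulls since the last jump
        self.log = []

    def access(self, page):
        home = self.owner[page]
        if home == self.node:
            self.log.append(("local", page))
            return
        count = self.pulls_from.get(home, 0) + 1
        if count > self.pull_threshold:
            # Threshold reached: move the execution context to the data.
            self.node = home
            self.pulls_from = {}
            self.log.append(("jump", home))
        else:
            # Below threshold: pull the page to the current node.
            self.pulls_from[home] = count
            self.owner[page] = self.node
            self.log.append(("pull", page))


# Hypothetical layout: p1 on node1, p2..p4 on node2; jump after 2 pulls.
policy = PullThenJump(
    {"p1": "node1", "p2": "node2", "p3": "node2", "p4": "node2"},
    start_node="node1",
    pull_threshold=2,
)
for page in ["p1", "p2", "p3", "p4", "p4"]:
    policy.access(page)
print(policy.log)
```

With a threshold of 2, the first two remote accesses pull pages, the third triggers a jump to node2, and the final access is local there.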

Locality in a Single Thread

[Figure: two nodes, each with CPUs, Mem, and Swap; label: Temporal Locality]

Locality across Multiple Threads

[Figure: multiple nodes with CPUs, Mem, and Swap]

Unlike DSM…

Exploiting Elastic Locality
• Assumptions
  • Replicate code pages, place data pages (vs. DSM)
• We need an adaptive page clustering algorithm
• LRU, NSWAP
• Us (initial): pull pages, then jump

Replicate Code, Distribute Data

[Figure labels: CODE (×3), Data 1, Data 2; accessing Data 1, accessing Data 2, accessing Data 1]

• Unique copy of data (vs. DSM)
• Execution context follows data (vs. process migration)

Benefits
• OS handles elasticity: apps don’t change
• Partition locality across multiple nodes
• Useful for single (and multiple) threads
• For multiple threads, seamlessly exploit network I/O and CPU parallelism

Benefits (delete)
• OS handles elasticity
  • Application ideally runs unmodified
• Application is naturally partitioned …
  • By page access locality
  • By seamlessly exploiting multithreaded parallelism
  • By intelligent page placement

How should we place pages?

Execution Context Jumping: A Single-Thread Example

[Figure: timeline (TIME) of a Process with address spaces on Node 1 and Node 2]

“Stretch” a Process: Unified Address Space

[Figure: a Process with address spaces on Node 1 and Node 2; Page Table with columns V, R, and IP Addr]

Operating Systems Today
Resource Limit = 1 Node

[Figure: one OS with CPUs, Mem, and Disks, running a Process]

Cloud Applications at Scale

[Figure labels: Cloud Manager, Load Balancer, Process (×3), Framework (e.g., MapReduce), Partitioned Data (×3), “More Resources?”, “More Queries?”]

Our Findings
• Important tradeoff
  • Data page pulls vs. execution context jumps
• Latency cost is realistic
• Our algorithm: worst-case scenario
  • “always pull” == NSWAP
  • Marginal improvements

Advantages
• Natural groupings: threads & pages
• Align resources with inherent parallelism
• Leverage existing mechanisms for synchronization

“Stretch” a Process: Unified Address Space

[Figure: two nodes, each with CPUs, Mem, and Swap; Page Table with columns V, R, and IP Addr]

A “stretched” process = collection of pages + other resources {across several machines}

Exec. Context Follows Data (delete)
• Replicate code pages
  • Read-only => no consistency burden
• Smartly distribute data pages
• Execution context can jump
  • Moves towards data
  • *Converse also allowed*

Elasticity in Cloud Apps Today

[Figure labels: Input Data; partitions D1, D2, …, Dx; a node with CPUs, Mem, and Disk; Output Data]

[Figure labels: Input Queries; Load Balancer; partitions D1, D2, …, Dx, Dy; a node with CPUs, Mem, and Disk; Output Data]

Goals: Elasticity Dimensions (delete)
Extend elasticity to
• Memory
• CPU
• I/O
  • Network
  • Storage

Thank You

Bang Head Here!

Stretching a Thread

Overlapping Elastic Processes

*Code Follows Data*

Application Locality

Possible Animation?

Multinode Adaptive LRU

Possible Animation?

Open Topics
• Fault tolerance
• Stack handling
• Dynamic linked libraries
• Locking

Elastic Page Table

Virtual Addr   Phy. Addr   Valid   Node (IP addr)
A              B           1       Localhost
C              D           0       Localhost
E              F           1       128.138.60.1
G              H           0       128.138.60.1

Labels beside the table: Local Mem, Swap space, Remote Mem, Remote Swap
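One way to read this table is that the valid bit and the node field together say where a page lives. The sketch below encodes the four rows and classifies each one. The `ElasticPTE` type and `locate` function are hypothetical, and the mapping of valid=1 to memory and valid=0 to swap is an assumption inferred from the order of the slide's labels.

```python
from dataclasses import dataclass

LOCALHOST = "Localhost"


@dataclass(frozen=True)
class ElasticPTE:
    """One row of the slide's table: physical address, valid bit, owning node."""
    phys_addr: str
    valid: int
    node: str


def locate(entry):
    """Classify a page's location.

    Assumption: valid=1 means resident in memory, valid=0 means swapped out.
    """
    is_local = entry.node == LOCALHOST
    if entry.valid:
        return "Local Mem" if is_local else "Remote Mem"
    return "Swap space" if is_local else "Remote Swap"


# Rows copied from the slide, keyed by virtual address.
page_table = {
    "A": ElasticPTE("B", 1, LOCALHOST),
    "C": ElasticPTE("D", 0, LOCALHOST),
    "E": ElasticPTE("F", 1, "128.138.60.1"),
    "G": ElasticPTE("H", 0, "128.138.60.1"),
}

for vaddr, entry in page_table.items():
    print(vaddr, "->", locate(entry))
```

Under this reading, a fault on a "Remote Mem" or "Remote Swap" entry is the point where the OS would choose between pulling the page and jumping the execution context.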

“Stretch” a Process
Move beyond resource boundaries of ONE machine
• CPU
• Memory
• Network, I/O

[Figure labels: Input Data; partitions D1, D2; two nodes, each with CPUs, Mem, and Disk; Output Data]

[Figure labels: D1 and D2, each with CPUs, Mem, and Disk; Data]

Reinventing Elasticity Wheel
