Ideas for Cooperative Disk Management with ECOSystem


Emily Tennant

Mentor: Carla Ellis

Duke University

Outline

• Introduction

• Building blocks for cooperative file operations in ECOSystem

• Read Requests

• Write Requests

• Unresolved Issues

• Experimental Plan

Vision and Goals

• The next step for ECOSystem
– Energy-efficient policy space
– Energy-aware applications

• Vision
– Create a set of API extensions permitting applications to pass application-specific information to the OS.
– Use this information to manage disk accesses more efficiently.

• Specific Goals
– Simple interface
– Minimal changes (backward compatibility)
– Energy savings in terms of currentcy

Related Work

• Recent submission to OSDI conference:
– Andreas Weissel, Björn Beutel, Frank Bellosa. Cooperative I/O: A Novel I/O Semantics for Energy-Aware Applications.

• Goal: “demonstrate benefits of application involvement in operating system power management.”

• Coop-I/O – an approach to reduce power consumption of devices (disk)

• Encompasses all levels of the computer system

– Hardware

– Operating system

– I/O interface for energy-efficient applications

Main Elements – Coop-I/O

• New cooperative file operations
– read_coop(), write_coop(), open_coop()
– New parameters: time-out and cancel flag
– Delay activating disk for length of time-out parameter
– Possible to abort accesses after time-out period

• Energy-efficient update mechanism
– Update of cached disk blocks is batched to maximize the time the hard disk can spend in standby mode.

• OS controls hard disk modes
– Disk drive is switched into low-power mode according to an adaptive algorithm.
– “device-dependent time-out with early shutdown (DDT/ES)”

Cooperative I/O in ECOSystem

• New cooperative file operations
– Extend system calls (read, write) with extra parameters.
– Deferrable disk accesses
– Options for abortable file operations?

• Energy-efficient update mechanism
– Updates deferred to create bursty disk access
– Energy-efficient update strategies integrated into write process

• OS control of disk drive
– Motivation of adaptive algorithm for powering down disk drive?




Bidding

• How do we couch the cooperative, deferrable file operations of Coop-I/O in terms of ECOSystem’s currentcy?

• “Bidding” process
– Inflated entry price delays disk spinup to ensure that processes have enough currentcy to execute and generate more disk requests.
– Each process “bids” the amount of currentcy it is willing to contribute towards the entry price.
– Is currentcy that has been bid still considered available for use?

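[Figure: total bid plotted against time. Bids from processes P1, P2, and P3 accumulate toward the entry price; the disk spins up once the total reaches it.]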
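The bidding process can be sketched in a few lines of Python. This is only an illustrative model, not ECOSystem code: the `BidPool` class, its method names, and the choice that a new bid from the same process replaces its old one (rather than adding to it) are all assumptions made for the example, as is the use of "total bids >= entry price" as the spin-up test.

```python
from dataclasses import dataclass, field


@dataclass
class BidPool:
    """Hypothetical model: collects per-process bids toward the disk's entry price."""
    entry_price: float                         # currentcy (mJ) needed before spin-up
    bids: dict = field(default_factory=dict)   # process name -> current bid (mJ)

    def total(self) -> float:
        """Sum of all outstanding bids."""
        return sum(self.bids.values())

    def place_bid(self, process: str, amount: float) -> bool:
        """Record a bid (replacing any earlier bid from the same process).

        Returns True when the total now covers the entry price,
        i.e. when the disk would spin up.
        """
        self.bids[process] = amount
        return self.total() >= self.entry_price


pool = BidPool(entry_price=1000.0)
for name, amount in [("P1", 300.0), ("P2", 400.0), ("P3", 300.0)]:
    spin_up = pool.place_bid(name, amount)
    print(name, pool.total(), "spin up" if spin_up else "keep waiting")
```

With these numbers the disk stays idle after the bids from P1 and P2 and spins up only when P3's bid brings the total to the entry price.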

Priorities

• Motivating question: How can we implement the bidding process in a useful and intuitive interface (similar to Coop-I/O)?
– Currentcy is dynamic, so it is difficult for an application to assign directly.
– Create static priorities.
– Map priorities to a currentcy amount for the bid.

• What type of priorities could be created?
– Integer sets (1-10, 1-100)
– Real-numbered intervals
– Dynamic priorities

[Figure: priority (and currentcy bid) plotted against time spent waiting. Diagram labels: Time-out, Priority, Currentcy.]

Dynamic Priorities

• Allow priority to change over time.
– As a process waits for disk access, its priority can increase, causing an increase in the amount of currentcy allocated to its bid.

• A simple case:
– No feedback required from application
– Integer priority levels 1-10
– Include time-out parameter in system call
– If disk access is not scheduled within the time-out, jump up one priority level.

• Other possibilities
– Priority function (priority vs. time) given by application
– Upcalls to application allow on-the-fly changes in priority

• Issues
– Default priorities, time-out, etc.
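The simple case above (integer levels 1-10, one level up per expired time-out) can be written as a small function. This is a sketch under stated assumptions: the function name, the use of seconds as the time unit, and the cap at level 10 are illustrative choices, not part of the proposed interface.

```python
def priority_after_wait(initial_priority: int, waited: float, timeout: float,
                        max_priority: int = 10) -> int:
    """Priority of a request that has waited `waited` seconds.

    The priority rises by one level for every full time-out period that
    has elapsed without the disk access being scheduled, capped at
    `max_priority`.
    """
    if timeout <= 0:
        raise ValueError("timeout must be positive")
    completed_periods = int(waited // timeout)
    return min(max_priority, initial_priority + completed_periods)


# A request that starts at priority 3 with a 2-second time-out:
for waited in (0.0, 2.0, 5.0, 100.0):
    print(waited, priority_after_wait(3, waited, timeout=2.0))
```

The application supplies only the initial priority and the time-out; no feedback from the application is needed while the request waits.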

Mapping Priority to Bid

• Takes place within OS

• Involves resource container

• May require resource container alteration
– Bid, priority, time-out, etc.

[Figure: a resource container holding Available_currentcy and Ticket, next to an altered container holding Available_currentcy, Ticket, and Bid. Labels: priority; percentage of entry price; percentage of available currentcy; Priority?; Time-out?]

A Simple Mapping Model

• Assume numeric priority.

• Priority corresponds directly to the percentage of available currentcy that is allocated to the bid.

BID = available_currentcy * (priority/10)

• Example: priority = 5, available currentcy = 1000 mJ, so BID = 500 mJ.

• Remainder of available currentcy can be used (CPU, NIC) while process waits to access disk.

• Overhead issues
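The mapping formula translates directly into code. The function name and the range check (priorities 1-10, taken from the simple dynamic-priority case) are assumptions for illustration; the arithmetic is the slide's BID = available_currentcy * (priority/10).

```python
def bid_for_priority(available_currentcy: float, priority: int) -> float:
    """Currentcy (mJ) allocated to the bid for a given priority level."""
    if not 1 <= priority <= 10:
        raise ValueError("priority must be between 1 and 10")
    return available_currentcy * priority / 10


available = 1000.0                      # mJ in the resource container
bid = bid_for_priority(available, 5)    # priority 5 -> half of available currentcy
remainder = available - bid             # still usable for CPU, NIC while waiting
print(bid, remainder)
```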

Tackling the Exception of Abortability

• The ability to save energy/currentcy by aborting a disk access is a desirable concept.

• Can abortability be considered in terms of currentcy?

ECOSystem possibilities:

• Time-out must be a specified, finite value for all possibilities.

• Boolean system call parameter
– Abortable accesses can have decreasing priority over time.

• Unique priority level (e.g. 0)
– Does not require extra parameter
– Can automatically allocate zero currentcy (no bid)

Coop-I/O:

• Cancel flag as system call parameter

• Abort disk access after waiting through time-out period.




Sample Read Disk Accesses

1. System call is generated.

read(..., [priority], [time-out]);

2. Verify that data is uncached and disk is inactive.

3. Transform priority into a bid.

4. “Issue bid” – store bid amount in a data structure.

5. For dynamic priorities, enter timed waiting function/loop.

6. Update daemon waits until bids ≥ entry price (reads & writes).

7. If waiting function returns at time-out:
   1. If abortable, cancel access.
   2. Else, increase priority and “re-issue” bid.

8. When bids ≥ entry price, disk spins up.

9. Disk access is scheduled.

10. Data is read into buffer cache.

11. Resource container is debited for cost of access.
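The read steps can be traced with a small discrete-time simulation. Everything here is a hypothetical model rather than the planned kernel code: the function name, the tick-based clock, the `other_bids` parameter standing in for bids from other processes, the priority/10 mapping, and the ">=" comparison against the entry price are all assumptions. It also assumes that a cached block or an already active disk needs no bid.

```python
def simulate_read(priority, timeout, abortable, available, entry_price,
                  other_bids=0.0, cached=False, disk_active=False,
                  max_ticks=100):
    """Walk one read request through the listed steps, one tick at a time.

    Returns (outcome, tick, bid), where outcome is "read", "aborted",
    or "waiting" (still waiting when the simulation ends).
    """
    if cached or disk_active:                  # step 2: no bidding needed
        return ("read", 0, 0.0)
    bid = available * priority / 10            # steps 3-4: priority -> bid
    waited = 0
    for tick in range(max_ticks):
        if bid + other_bids >= entry_price:    # steps 6 and 8: disk spins up
            return ("read", tick, bid)         # steps 9-11 would follow
        waited += 1                            # step 5: timed wait
        if waited == timeout:                  # step 7: time-out expired
            if abortable:
                return ("aborted", tick + 1, 0.0)
            priority = min(10, priority + 1)   # raise priority, re-issue bid
            bid = available * priority / 10
            waited = 0
    return ("waiting", max_ticks, bid)


print(simulate_read(3, 2, False, 1000.0, 500.0))
print(simulate_read(3, 2, True, 1000.0, 500.0))
```

In the first call the request starts at priority 3 (bid 300 mJ), is bumped twice after two-tick time-outs, and is served once its bid reaches 500 mJ. In the second call the same request is abortable and is cancelled at its first time-out.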

Write Disk Access Issues

• More complex!

• What does “deferrable write” mean? Do we defer writing to the buffer cache or flushing the buffers to the disk?

• How do we guarantee consistency of file system when write requests are aborted?

• Coop-I/O: early commit/abort strategy
– Delays writing data to buffer cache until it can be committed, which happens when one of the following holds:
  • Drive is active at time of write request.
  • Drive will be activated by another committed write (dirty buffers exist).
  • Time-out has been reached and write request is non-abortable.
– Forces processes to wait before writing to buffers.
– Defers both writing to buffer cache and flushing buffers to disk.

Write Disk Access, cont.

• ECOSystem: Unified write/update policy defers disk access

• Simplify non-abortable writes– Data can be written to buffer immediately – no need to defer!

– Process bids to flush buffer to disk drive (only requires resource container).

– What happens to the bidding process when buffers are overwritten?

• Implement abortable writes with early commit/abort strategy.

– Must immediately identify abortable writes.

– Do not have to issue a bid.

– Bidding process must take place before writing to buffer.

• All dirty buffers are always flushed to disk.

Update Mechanism

Coop-I/O

• Separate update policy.

• Four drive-specific cooperative update strategies:
– Write back all buffers.
– Update cooperatively.
– Update each drive separately.
– Update on shutdown.

ECOSystem

• Update strategy

– Bidding process creates bursty disk access.

– If disk is active, updates are scheduled immediately.

– Disk spinup/spindown controlled by existence of scheduled disk accesses.

– Decreasing entry price guarantees updates.

– FlushStart vs. entry price
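To see why a decreasing entry price guarantees that updates eventually happen, consider a price that falls over time while the outstanding bids stay fixed. The slides do not say how the price decreases, so the linear decay below, the function names, and the parameter values are assumptions chosen only to make the point concrete.

```python
def entry_price(initial: float, decay_per_second: float, elapsed: float,
                floor: float = 0.0) -> float:
    """Entry price after `elapsed` seconds under an assumed linear decay."""
    return max(floor, initial - decay_per_second * elapsed)


def seconds_until_spinup(total_bids: float, initial: float,
                         decay_per_second: float) -> float:
    """Time until the falling entry price drops to the fixed total of bids."""
    if total_bids >= initial:
        return 0.0
    if decay_per_second <= 0:
        raise ValueError("price must decrease for spin-up to be guaranteed")
    return (initial - total_bids) / decay_per_second


# Bids total 600 mJ against an inflated initial price of 1000 mJ
# that falls by 50 mJ per second.
wait = seconds_until_spinup(600.0, 1000.0, 50.0)
print(wait, entry_price(1000.0, 50.0, wait))
```

Any positive total bid is eventually matched as long as the price keeps falling, so a non-abortable write cannot wait forever.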

Sample Write Disk Accesses

[Flowchart: Generate system call, then branch on “Abortable?” (yes/no). Each branch asks “Is buffer already cached?” and then “Is disk active?”. Depending on the answers, the resulting actions are combinations of: Bid, Bid/Wait, Bid (for read), Bid/Wait (for read), Read buffer, Write to buffer, Flush buffer, and Abort.]




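Unresolved Issues

• Reads, same task
– Same buffer block, same bytes: ?
– Same buffer block, different bytes: Add bids?
– Different buffer blocks: Add bids. One total bid per task for disk access.

• Reads, multiple tasks
– Same buffer block, same bytes: One access. Replace bid?
– Same buffer block, different bytes: Add bids? Charge all tasks? How?
– Different buffer blocks: Normal bidding process.

• Writes, same task
– Same buffer block, same bytes: Overwriting = one disk access. Replace bid?
– Same buffer block, different bytes: Add bids?
– Different buffer blocks: Add bids.

• Writes, multiple tasks
– Same buffer block, same bytes: Overwriting. Replace bid? Who is charged?
– Same buffer block, different bytes: Unlikely! Add bids? Charge all tasks? How?
– Different buffer blocks: Normal bidding process.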

1. How do we accumulate bids?

2. How do we charge for access? Do we weight charge by number of bytes read?
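One possible answer to the second question, weighting the charge by bytes, can be sketched as follows. This is only an illustration of that option, not a decision made in these slides; the function name and the task labels are hypothetical.

```python
def split_charge(access_cost: float, bytes_by_task: dict) -> dict:
    """Divide the cost of one disk access among tasks by bytes requested."""
    total_bytes = sum(bytes_by_task.values())
    if total_bytes <= 0:
        raise ValueError("at least one task must request some bytes")
    return {task: access_cost * n / total_bytes
            for task, n in bytes_by_task.items()}


# Two tasks share one 900 mJ access to the same buffer block.
print(split_charge(900.0, {"task_a": 1024, "task_b": 2048}))
```

Each task's resource container would then be debited its share instead of the full cost of the access.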

Experimental Plan

• Design – 6/14/02

• Implementation of deferrable reads – 6/26/02

• Implementation of deferrable writes – 7/5/02

• Synthetic benchmarks/testing – 7/12/02

• Rewrite cooperative application – 7/19/02

• “Real” application testing – 7/24/02

• Implementation of abortable reads/writes

• Testing of abortable reads/writes – as time allows (7/29/02)

Testing

What do we want to test?

• Synthetic benchmarks
– Simple read/write programs.
– Preliminary results.
– Create scenarios where energy savings are most obvious.
– Simulate different workloads:
  • Multiple simultaneous cooperative tasks.
  • Mix cooperative and non-cooperative tasks.
  • Test same buffer access issues.

• “Real” application
– Audio/video player, image viewer.
– Read file from hard disk.
– Test performance in non-optimized, “real-life” situations.

• Compare results on:
– ECOSystem unthrottled (Coop-I/O)
– Current ECOSystem implementation
– ECOSystem with cooperative implementation

Summary

• Extended system calls for file operations.

• Priority can be specified by application.

• Priority determines amount of currentcy in bid.

• Total bids for disk access must exceed entry price.

• Abortable reads/writes may be cancelled after specific period of time.

• Interaction between decreasing entry price and bids for disk access works toward efficiently batching disk accesses while guaranteeing that non-abortable accesses occur.
