34
Improving Server Applications with System Transactions Sangman Kim, Michael Z. Lee, Alan M. Dunn, Owen S. Hofmann, Xuan Wang, Emmett Witchel, Donald E. Porter 1

Improving Server Applications with System Transactions

  • Upload
    collin

  • View
    41

  • Download
    0

Embed Size (px)

DESCRIPTION

Sangman Kim , Michael Z. Lee, Alan M. Dunn, Owen S. Hofmann, Xuan Wang, Emmett Witchel , Donald E. Porter. Improving Server Applications with System Transactions. Poor OS API Support for Concurrency. Fine-grained locking - Bug-prone, hard to maintain - OS provides poor support. - PowerPoint PPT Presentation

Citation preview

Page 1: Improving Server Applications with System Transactions

1

Improving Server Applications with System Transactions

Sangman Kim, Michael Z. Lee, Alan M. Dunn, Owen S. Hofmann, Xuan Wang, Emmett Witchel, Donald E. Porter

Page 2: Improving Server Applications with System Transactions

Poor OS API Support for Concurrency

2

Para

llelis

m

Maintainability

Fine-grained locking - Bug-prone, hard to maintain - OS provides poor support

Coarse-grained locking - Reduced resource utilization

Page 3: Improving Server Applications with System Transactions

System Transaction Improves OS API Concurrency

3

Para

llelis

m

Server Applications

working with OS API

System Transactio

n

Server Applications

working with OS API

Maintainability

Page 4: Improving Server Applications with System Transactions

4

LinuxTxOS

Improving System Transactions

TxOS provides operating system transaction [Porter et al., SOSP 2009] Transaction for OS objects (e.g., files,

pipes)System transaction in TxOS

TxOS

ApplicationJVM

Middleware state sharing with multithreading

TxOS system callsMiddleware state sharing

Page 5: Improving Server Applications with System Transactions

5

TxOS

Improving System Transactions

TxOS provides operating system transaction [Porter et al., SOSP 2009] Transaction for OS objects (e.g., files ,

pipes)Synchronization in legacy code

ApplicationJVM

TxOS system calls

Synchronization primitivesMiddleware state sharing

Page 6: Improving Server Applications with System Transactions

TxOS

Improving System Transactions

TxOS provides operating system transaction [Porter et al., SOSP 2009] Transaction for OS objects (e.g., files,

pipes) TxOS+: Improved system

transactions

6

ApplicationJVM

TxOS system calls

TxOS+TxOS+: pause/resume,commit ordering, and more

Up to 88% throughput improvement

At most 40 application line changes

Synchronization primitivesMiddleware state sharing

Page 7: Improving Server Applications with System Transactions

7

Background: system transaction

System transactions in action

Challenges for rewriting applications

Implementation and evaluation

Outline

Page 8: Improving Server Applications with System Transactions

Background: System Transaction Transaction Interface and semantics

System calls: xbegin(), xend(), xabort()

ACID semantics ▪ Atomic – all or nothing▪ Consistent – one consistent state to another▪ Isolated – updates as if only one concurrent transaction▪ Durable – committed transactions on disk

Optimistic concurrency control

Fix synchronization issues with OS APIs8

Page 9: Improving Server Applications with System Transactions

Background: System Transaction

Lazy versioning: speculative copy for data

TxOS requires no special hardware9

xbegin();write(f, buf);xend(); CommitAbort

Conflict!inode

header

inode iinumlock

inode datasize

mode…

Copy of inode data

Page 10: Improving Server Applications with System Transactions

10

Background: system transaction

System transactions in action

Challenges for rewriting applications

Implementation and evaluation

Outline

Page 11: Improving Server Applications with System Transactions

11

Applications Parallelized with OS Transactions

Parallelizing applications that synchronize on OS state

Example 1: State-machine replication Constraint: Deterministic state update

Example 2: IMAP Email Server Constraint: Consistent file system

operations

Page 12: Improving Server Applications with System Transactions

12

Example 1: Parallelizing State-machine Replication

Core component of fault tolerant services e.g., Chubby, Zookeeper, Autopilot

Replicas execute the same sequence of operations Often single-threaded to avoid non-determinism

Ordered transaction Makes parallel OS state updates deterministic Applications determine commit order of

transactions

Page 13: Improving Server Applications with System Transactions

13

Example 2: Parallelizing IMAP Email ServersEveryone has concurrent email

clients Desktop, laptop, tablets, phones, .... Need concurrent access to stored emails

Brief history of email storage formats mbox: single file, file locking Lockless Maildir Dovecot Maildir: return of file locking

Page 14: Improving Server Applications with System Transactions

14

mbox Single file mailbox of email messages

Synchronization with file-locking▪ One of fcntl(), flock(), lock file (.mbox.lock)▪ Very coarse-grained locking

mbox: Database Without Parallelism

~/.mboxFrom MAILER-DAEMON Wed Apr 11 09:32:28 2012 From: Sangman Kim <[email protected]> To: EuroSys 2012 audienceSubject: mbox needs file lock. Maildir hides message.…..From MAILER-DAEMON Wed Apr 11 09:34:51 2012From: Sangman Kim <[email protected]> To: EuroSys 2012 audienceSubject: System transactions good, file locks bad!….

Page 15: Improving Server Applications with System Transactions

15

Maildir: Parallelism Through Lockless Design Maildir: Lockless alternative to mbox

Directories of message files Each file contains a message Directory access with no synchronization

(originally)

Message filenames contain flagsMaildir/cur 00000000.00201.host:2,T00001000.00305.host:2,R

00002000.02619.host:2,T00010000.08919.host:2,S00015000.10019.host:2,S

TrashedRepliedTrashedSeenSeen

Page 16: Improving Server Applications with System Transactions

16

Messages Hidden with Lockless Maildir

PROCESS 2 (MARKING)

if (access(“043:2,S”)):

rename(“043:2,S”, “043:2,R”)

PROCESS 1 (LISTING)

while (f = readdir(“Maildir/cur”)):

print f.name

018:2,S 021:2,S 052:2,S 061:2,SSeen

“Maildir/cur” directory

Seen043:2,SSeen Seen Seen

Page 17: Improving Server Applications with System Transactions

17

043:2,RReplied

Messages Hidden with Lockless Maildir

PROCESS 2 (MARKING)

if (access(“043:2,S”)):

rename(“043:2,S”, “043:2,R”)

PROCESS 1 (LISTING)

while (f = readdir(“Maildir/cur”)):

print f.name

018:2,S 021:2,S 052:2,S 061:2,SSeen

“Maildir/cur” directory

Seen043:2,SSeen Seen Seen

Process 1 Result018:2,S021:2,S052:2,S061:2,S

Message missing!

Page 18: Improving Server Applications with System Transactions

18

Return of The Coarse-grained File Locking Maildir synchronization

Lockless

File locks▪ Per-directory coarse-grained locking▪ Complexity of Maildir, performance of mbox

System transactions

“certain anomalous situations may result” – Courier IMAP manpage

Page 19: Improving Server Applications with System Transactions

Maildir Parallelized with System Transaction

PROCESS 1 (MARKING)

xbegin()

if (access(“XXX:2,S”)):

rename(“XXX:2,S”,

“XXX:2,R”)xend()

PROCESS 2 (MESSAGE LISTING)

xbegin()

while (f = readdir(“Maildir/cur”)):

print f.name

xend()

xbegin()

xend()

xbegin()

xend()

Consistent directory accesses with better parallelism

19

Page 20: Improving Server Applications with System Transactions

20

Background: system transaction

System transactions in action

Challenges for rewriting applications

Implementation and evaluation

Outline

Page 21: Improving Server Applications with System Transactions

21

Challenges of Rewriting Applications

1. Middleware state sharing

2. Deterministic parallel update for system state

3. Composing with other synchronization primitives

Page 22: Improving Server Applications with System Transactions

22

Middleware and System Transaction Problem with memory management

Multiple threads share the same heap

Thread 1 Thread 2In Transaction

xbegin();p1 = malloc();

xabort();p2 = malloc();

*p2 = 1;

Middleware (libc)

Kernel mmap()

Heap

Transactional object for heap

p1

Page 23: Improving Server Applications with System Transactions

23

Middleware and System Transaction Problem with memory management

Multiple threads share the same heap

Thread 1 Thread 2In Transaction

xbegin();p1 = malloc();

xabort();p2 = malloc();

*p2 = 1;

Middleware (libc)

Kernel

Transactional object for heap

Heap

FAULT!Certain middleware actions should not roll back

p1 p2unmapped

Page 24: Improving Server Applications with System Transactions

24

Two Types of Actions on Middleware State

USER-INITIATED ACTION User changes system

state Most file accesses Most synchronization

MIDDLEWARE-INITIATED System state changed

as side effect of user action malloc() memory mapping Java garbage collection Dynamic linking

Middleware state shared among user threads Can’t just roll back!

Page 25: Improving Server Applications with System Transactions

25

Handling Middleware-Initiated Actions Transaction pause/resume

Expose state changes by middleware-initiated actions to other threads

Additional system calls ▪ xpause(), xresume()

Limited complexity increase▪ We used pause/resume 8 times in glibc, 4 times in

JVM▪ Only used in application for debugging

Page 26: Improving Server Applications with System Transactions

26

Pause/Resume In JVM Execution

SysTransaction.begin();

files = dir.list();

SysTransaction.end();

Java code

xpause()

xresume()

xbegin();

files = dir.list();

VM operations(garbage collection)

xend();

JVM Execution

Page 27: Improving Server Applications with System Transactions

27

Other Challenges for Maturing TxOS 17,000 lines of kernel changes

Transactionalizing file descriptor table Handling page lock for disk I/O Memory protection Optimization with directory caching Reorganizing data structure and more

Details in the paper

Page 28: Improving Server Applications with System Transactions

28

Background: system transaction

System transactions in action

Challenges for rewriting applications

Implementation and evaluation

Outline

Page 29: Improving Server Applications with System Transactions

29

Application 1: Parallelized BFT Application

Implemented in UpRight BFT library

Fault tolerant routing backend Graph stored in a file Compute shortest path Edge add/remove

Ordered transactions for deterministic update

Page 30: Improving Server Applications with System Transactions

30

Minimal Application Code Change

Component Total LOC Changed LOC

Routing application

1,006 18 (1.8%)

Upright Library 22,767 174 (0.7%)

JVM 496,305 384 (0.0008%)

glibc 1,027,399 826 (0.0008%)

Page 31: Improving Server Applications with System Transactions

31

0 10 20 30 40 50 60 70 80 90 1000

5001000150020002500300035004000

TxOS, denseLinux,dense

Deterministic State Update with Better Throughput

Thro

ughp

ut (

req/

s)

Dense graph: 88%

tput

Sparse graph:

11% tput

Work to add/delete edges small compared to scheduling

overhead

Write ratio (%)BFT graph server

Page 32: Improving Server Applications with System Transactions

32

Application 2: Dovecot Maildir access Dovecot mail server

Uses directory lock files for maildir accesses

Locking is replaced with system transactions Changed LoC: 40 out of 138,723

Benchmark: Parallel IMAP clients Each client executes operations on a random

message▪ Read: message read▪ Write: message creation/deletion▪ 1500 messages total

Page 33: Improving Server Applications with System Transactions

33

Mailbox Consistency with Better Throughput

Dovecot benchmark with 4 clients

0102030405060708090

0 10 25 50 100Write ratio (%)

Tput

Impr

ovem

ent

(%)

Better block scheduling

enhances write performance

Page 34: Improving Server Applications with System Transactions

34

Conclusion: OS Transactions Improve Server Performance

System transactions parallelize tricky server applications Parallel Dovecot maildir operations Parallel BFT state update

System transaction improves throughput with few application changes Up to 88% throughput improvement At most 40 changed lines of application

code