Università di Roma “La Sapienza” Dipartimento di Informatica e Sistemistica Middleware Laboratory MIDLAB Registers


Registers


Register: definition

A register is a shared variable accessed by processes through read and write operations


Distributed Systems: register abstraction

• Multiprocessor machine:

  - Processes typically communicate through registers at the hardware level

  - The set of these registers constitutes the physical memory

• Distributed message-passing system:

  - No physical shared memory

  - Processes communicate by exchanging messages over a network

• The register abstraction supports the design of distributed solutions by hiding the complexity of the underlying message-passing system and the distribution of the data

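[Figure: two diagrams. In one, processes p1, p2, ..., pi, ..., pn access a hardware register through a bus; in the other, the same processes access a register (abstraction) through a network.]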


Register operations

A process accesses a register through:

• Read operation, read()→v: returns the “current” value v of the register; this operation does not modify the content of the register

• Write operation, write(v): writes the value v in the register and returns true at the end of the operation

Each operation starts with an invocation and terminates when the corresponding response is received


Register: Assumptions

• A register stores only positive integers and is initialized to 0

• Each value written is uniquely identified

• Processes are sequential: a process cannot invoke a new operation before the one it previously invoked (if any) has returned


Register: Notation

(X,Y) denotes a register where X processes can write and Y processes can read

• (1,1) denotes a register that only one process can write and only one process can read. Which process writes and which process reads is known a priori

• (1,N) denotes a register that a single process, known a priori, can write and N processes can read


Register Semantics: Serial System, No Failures

Assumptions:

• Serial access: a process does not invoke an operation on the register while an operation previously invoked by another process has not yet completed

• No failures

Sequential specification:

• Liveness. Each operation eventually terminates

• Safety. Each read operation returns the last value written

[Figure: serial execution on the timelines of p1 and p2. The operations are write(5) and read()→8 on p1, and write(8) and read()→5 on p2; no two operations overlap, and each read returns the last value written.]


Register Semantics: Concurrency

Assumptions:

• Several processes can access the register

• Concurrent access

• No failures

Which value should the read operation return?

[Figure: two executions on the timelines of p1 and p2, each with the operations write(5), write(8) and read()→?, where operations overlap in time.]


Register Semantics: Failures

Assumptions:

• Several processes can access the register

• Serial access

• Processes can fail by crashing, i.e. after some point in time they stop executing their algorithm forever

Failed operation: the invoking process crashes at some time between the invocation and the response of the operation

Which value should the read operation return?

[Figure: timelines of p1 and p2 with the operations write(5), write(8) and read()→?.]


Register Semantics: Concurrency & Failures

A process can invoke a write operation and then crash before the corresponding response event is generated. The write operation may or may not have taken effect

Register semantics: a read may return either:

• The value written by the last write operation that completed

• The value given as input to the last write operation, even though this operation fails

[Figure: timelines of p1 and p2 with the operations write(5), write(8) and read()→?.]


Operations

• Every operation is characterized by two events:

  - Invocation

  - Return (a confirmation for a write operation, a value for a read)

• Each of these events occurs at a single indivisible point in time

• An operation is complete if both its invocation event and its return event have occurred

• A failed operation is an operation invoked by some process pi that crashes before obtaining a return

[Figure: time axis t. Operation Op has an invocation event followed by a return event; operation Op’ has an invocation event followed by a crash.]


Precedence between Operations

• The execution of an operation invoked by a process p is the time interval delimited by its invocation event and its return event

• Given two operations o and o’, o precedes o’ if the response event of o precedes the invocation event of o’

• An operation o invoked by a process p may precede an operation o’ invoked by p’ only if o completes

• If neither of two operations precedes the other, they are said to be concurrent
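The precedence relation can be sketched in a few lines of Python. This is only an illustration: the `Op` record, the numeric timestamps and the use of `None` for a missing return event are assumptions made for the example, not part of the slides.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Op:
    """An operation as an interval between its invocation and return events."""
    name: str
    invoked: float
    returned: Optional[float]  # None: the invoking process crashed before the return


def precedes(a: Op, b: Op) -> bool:
    """a precedes b iff a completed and its return event is before b's invocation."""
    return a.returned is not None and a.returned < b.invoked


def concurrent(a: Op, b: Op) -> bool:
    """Two operations are concurrent when neither precedes the other."""
    return not precedes(a, b) and not precedes(b, a)


op1 = Op("Op1", invoked=0, returned=2)
op2 = Op("Op2", invoked=3, returned=5)        # starts after Op1 has returned
op3 = Op("Op3", invoked=1, returned=4)        # overlaps with Op1
failed = Op("Op'", invoked=0, returned=None)  # never returns
```

With these definitions a failed operation never precedes anything, so it is concurrent with every operation invoked after it.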


Example

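[Figure: two pairs of timelines for p1 (Op1) and p2 (Op2). In the first, Op1 returns before Op2 is invoked: Op1 precedes Op2. In the second, the two intervals overlap: Op1 is concurrent with Op2.]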


Register Specification:

Regular-Atomic


(1,N) Regular Register: Specification

Termination. If a correct process invokes an operation, then the operation eventually receives the corresponding confirmation

Validity. A read operation returns the last value written or the value concurrently being written

[Figure: two executions in which p1 performs write(5) and p2 performs read()→0 followed by read()→5. The first execution is labeled not regular, the second regular.]
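As a rough illustration of the validity property, the sketch below computes the set of values a read is allowed to return in a (1,N) regular register, given the writer's history. The `Write` record, the numeric timestamps and the function name `allowed_values` are assumptions introduced for this example.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass(frozen=True)
class Write:
    value: int
    invoked: float
    returned: Optional[float]  # None: the writer crashed during this write


def allowed_values(writes: List[Write], read_invoked: float,
                   read_returned: float, initial: int = 0) -> Set[int]:
    """Values a regular register may return for a read over the given interval."""
    last = initial          # value of the last write that precedes the read
    overlapping = set()     # values of writes concurrent with the read
    for w in sorted(writes, key=lambda w: w.invoked):
        if w.returned is not None and w.returned < read_invoked:
            last = w.value
        elif w.invoked <= read_returned:
            overlapping.add(w.value)
    return {last} | overlapping


history = [Write(5, invoked=2, returned=6)]
before = allowed_values(history, 0, 1)   # read completes before write(5) starts
during = allowed_values(history, 3, 4)   # read overlaps write(5)
after = allowed_values(history, 7, 8)    # read starts after write(5) returned
```

A read that returns 0 after write(5) has completed falls outside `after`, which is the situation the slide labels as not regular.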


(1,N) Regular Register: Scenario

NOTE: In a regular register, a process can read a value v and then a value v’, even if the writer has written v’ and then v, as long as the write and the read operations are concurrent

This behavior is not allowed in an ATOMIC register

[Figure: p1 performs write(5) and write(6); p2 performs read()→5, read()→6 and read()→5.]


(1,N) Atomic Register: Specification

IDEA: regular register + ordering.

Properties:

Termination. If a correct process invokes an operation, then the operation eventually receives the corresponding confirmation.

Validity. A read operation returns the last value written or the value concurrently being written.

Ordering. If a read returns v2 after a read that precedes it has returned v1, then v1 cannot have been written after v2
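The ordering property can be checked mechanically on a recorded history. The following is a minimal sketch, assuming a single writer whose values are distinct (and different from the initial value) and are listed in the order they were written; the `Read` record and `ordering_holds` are names invented for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Read:
    process: str
    invoked: float
    returned: float
    value: int


def ordering_holds(write_order: List[int], reads: List[Read],
                   initial: int = 0) -> bool:
    """True iff no read returns an older value than a read that precedes it."""
    rank = {initial: 0}                      # position of each value in write order
    for i, v in enumerate(write_order, start=1):
        rank[v] = i
    for r1 in reads:
        for r2 in reads:
            r1_precedes_r2 = r1.returned < r2.invoked
            if r1_precedes_r2 and rank[r2.value] < rank[r1.value]:
                return False
    return True


# write(5) precedes write(6); p2 reads 6 and then 5: regular but not atomic
new_then_old = [Read("p2", 1, 2, 6), Read("p2", 3, 4, 5)]
# p2 reads 5, 6, 6: compatible with an atomic register
monotone = [Read("p2", 1, 2, 5), Read("p2", 3, 4, 6), Read("p2", 5, 6, 6)]
```

The check compares reads of any two processes, not only reads of the same process, because the precedence relation is defined on operations regardless of who invoked them.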


(1,N) Atomic Register: scenario

Example 1. Regular but not atomic register: write(5) precedes write(6), but process p2 first reads the value 6 and then the value 5

[Figure: p1 performs write(5) and write(6); p2 performs read()→6 and read()→5.]

Example 2. The register is atomic

[Figure: p1 performs write(5) and write(6); p2 performs read()→5, read()→6 and read()→6.]


(1,N) Atomic register: scenario

Not an atomic register: the precedence relation also applies to read operations issued by different processes

[Figure: p1 performs write(5) and write(6); p2 performs read()→6 and p3 performs read()→5.]


Scenario 1

[Figure: timelines of p1 and p2 with the operations write(5), write(6) and read()→5.]

ATOMIC and REGULAR.


Scenario 2

[Figure: timelines of p1 and p2 with the operations write(5), write(6) and read()→6.]

NOT ATOMIC and NOT REGULAR


Scenario 3

[Figure: p1 invokes write(5) and crashes before it returns; p2 performs read()→0.]

ATOMIC and so REGULAR. The write(5) executed by p1 fails, so it does not complete and it is concurrent with the read by p2. Validity is respected.


Scenario 4

[Figure: p1 performs write(5); p2 performs read()→5 and p3 performs read()→0.]

REGULAR but not ATOMIC. The ordering property is violated.


Scenario 5

[Figure: timelines of p1, p2 and p3 with the operations write(5), write(6), read()→6 and, on p3, read()→5.]

ATOMIC and REGULAR


Regular Register: Implementation


on-rreg denotes the implementation of a regular register where 1 process can write (writer) and N processes can read (readers).

Events:

• Request: <on-rregRead | reg>

Used to invoke a read operation on register reg

• Confirmation: <on-rregReadReturn | reg, v>

Used to return v as a response to the read invocation on register reg; indicates that the operation has completed

• Request: <on-rregWrite | reg, v>

Used to invoke a write operation of value v on register reg.

• Confirmation: <on-rregWriteReturn | reg>

Confirms that the write operation has taken place at register reg and is complete.


Termination. If a correct process invokes an operation, then the operation eventually receives the corresponding confirmation

Validity. A read operation returns the last value written or the value concurrently being written

NOTATION: RR is sometimes used to denote a Regular Register


Fail-Stop Algorithm: processes can crash but the crashes can be reliably detected by all the other processes

• Failure model: crash

• Perfect failure detector:

  - Strong completeness. The crash of a process is eventually detected by every correct process

  - Strong accuracy. No process is detected to have crashed until it has really crashed


Perfect point-to-point links:

1. Reliable delivery: let pi be any process that sends a message m to a process pj. If neither pi nor pj crashes, then pj eventually delivers m.

2. No duplication: no message is delivered by a process more than once.

3. No creation: if a message m is delivered by some process pj, then m was previously sent to pj by some process pi.

Best-Effort Broadcast (bebBroadcast):

1. Best-effort validity: for any two processes pi and pj, if pi and pj are correct, then every message broadcast by pi is eventually delivered by pj.

2. No duplication: no message is delivered more than once.

3. No creation: if a message m is delivered by some process pj, then m was previously broadcast by some process pi.


Algorithm Idea:

• Each process stores a local copy of the register

• Read-One: each read operation returns the value stored in the reader's local copy of the register

• Write-All: each write operation updates the value locally stored at each process that the writer considers not to have crashed

• A write completes when the writer has received an ack from each process that has not crashed


(1,N) regular register

NOTE. The algorithm implements an array of RR. We consider a single entry, i.e. a single Regular Register

• value[r]: current value of the register

• writeSet[r]: used by the writer to track when a write has been propagated to all correct processes

• correct: set of correct processes.


(1,N) regular register: write

upon event < on-rregWrite | reg, val > do
    trigger < bebBroadcast | [Write, reg, val] >;

upon event < bebDeliver | pj, [Write, reg, val] > do
    value[reg] := val;
    trigger < pp2pSend | pj, [Ack, reg] >;

upon event < pp2pDeliver | pj, [Ack, reg] > do
    writeSet[reg] := writeSet[reg] ∪ {pj};

upon exists r such that correct ⊆ writeSet[r] do
    writeSet[r] := ∅;
    trigger < on-rregWriteReturn | r >;


upon event < on-rregRead | reg > do

trigger < on-rregReadReturn | reg, value[reg] >;
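The event handlers above can be mimicked by a small in-memory simulation. This is a sketch under strong simplifying assumptions, not the algorithm itself: message delivery is collapsed into a synchronous loop, a crash is detected at the instant it happens (an idealized perfect failure detector), and the class name `ReadOneWriteAll` and its methods are invented for the example.

```python
class ReadOneWriteAll:
    """Toy single-register model of the read-one write-all scheme."""

    def __init__(self, process_ids, initial=0):
        self.value = {p: initial for p in process_ids}  # local copy at each process
        self.correct = set(process_ids)                 # failure detector output
        self.messages = 0                               # messages sent so far

    def crash(self, p):
        """Crash p; the idealized detector reports it immediately."""
        self.correct.discard(p)

    def write(self, val):
        """Broadcast the value and collect acks from the processes that deliver it."""
        write_set = set()
        for p in self.value:              # bebBroadcast of [Write, val]
            self.messages += 1
            if p in self.correct:         # crashed processes never deliver
                self.value[p] = val
                self.messages += 1        # pp2pSend of [Ack] back to the writer
                write_set.add(p)
        # The write returns once every process believed correct has acked.
        return self.correct <= write_set

    def read(self, reader):
        """A read is purely local: return the reader's own copy."""
        return self.value[reader]


reg = ReadOneWriteAll(["p1", "p2", "p3"])
done = reg.write(5)
first_write_messages = reg.messages   # N broadcast messages plus N acks
reg.crash("p3")
reg.write(6)
```

In this model a read sends no message, while a failure-free write sends one broadcast message and one ack per process.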


(1,N) regular register

Correctness:

• Termination:

  - read: trivial, it is local.

  - write: from the properties of the communication primitives and from the completeness property of the perfect failure detector.

• Validity: because of the strong accuracy property of the perfect failure detector, each write operation can complete only after all processes that do not crash have updated their local copy of the register. So, one of the two following cases holds:

  - The read operation is not concurrent with the last write that has been invoked: the process reads the last value written

  - The read operation is concurrent with the last write: by the no creation property of the channels, the value returned is either the last value written or the one being written. The latter write is concurrent with the read operation

Performance:

• Write: at most 2N messages.

• Read: 0 messages, it is local


• The algorithm does not ensure validity if the failure detector is not perfect. The following scenario could happen:

• p1 invokes write(6) and then falsely suspects p2. Thus, p1 completes the write operation without waiting for the ack of p2, i.e. without being sure that the value 6 has been written in the local copy of the register at p2

[Figure: p1 performs write(5) and write(6); p2 then performs read()→5.]
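The scenario can be replayed with a tiny model of one write-all round. This is a sketch only: the function `write_all` and its `suspected` and `delayed` parameters are hypothetical, and a "delayed" process simply stands for one whose Write message is still in transit when the writer decides.

```python
def write_all(values, suspected, delayed, val):
    """Run one write-all round and report whether the writer returns.

    values:    local copy of the register at each process (updated in place)
    suspected: processes the writer's failure detector reports as crashed
    delayed:   processes whose Write message has not been delivered yet
    """
    acks = set()
    for p in values:
        if p not in delayed:
            values[p] = val      # p delivers the Write and acks it
            acks.add(p)
    waiting_for = set(values) - suspected
    return waiting_for <= acks   # True: the writer completes the write


values = {"p1": 0, "p2": 0}
write_all(values, suspected=set(), delayed=set(), val=5)

# p1 falsely suspects p2 while the Write message for 6 is still in transit
completed = write_all(values, suspected={"p2"}, delayed={"p2"}, val=6)
stale_read = values["p2"]        # p2 reads its local copy after write(6) returned
```

Here `completed` is true although p2 still holds 5, so a read at p2 that follows write(6) returns a value that is neither the last written nor concurrently written. Without the false suspicion the writer would keep waiting for p2's ack.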


Fail-silent algorithm: “process crashes can never be reliably detected”

• Failure model: crash

• No perfect failure detector

Assumptions:

• N processes: 1 writer and N readers

• A majority of correct processes

Communication Primitives:

• Perfect point-to-point links

• Best-effort broadcast


IDEA:

• Each process locally stores a copy of the current value of the register

• Each written value is uniquely associated with a timestamp

• The writer and the reader processes use a set of witness processes to track the last value written

• Quorum: the intersection of any two sets of witness processes is not empty

• “Majority Voting”: each set consists of a majority of the processes
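Why majorities work as quorums can be verified by brute force for small N. The sketch below, with the hypothetical helpers `majority` and `quorums_intersect`, enumerates every pair of witness sets of a given size and checks that they share at least one process.

```python
from itertools import combinations


def majority(n: int) -> int:
    """Smallest number of processes that forms a majority out of n."""
    return n // 2 + 1


def quorums_intersect(n: int, size: int) -> bool:
    """True iff any two sets of `size` processes out of n have a common member."""
    sets = [set(c) for c in combinations(range(n), size)]
    return all(a & b for a in sets for b in sets)


majority_ok = all(quorums_intersect(n, majority(n)) for n in range(1, 7))
half_ok = quorums_intersect(4, 2)   # {0, 1} and {2, 3} are disjoint
```

The underlying argument is a counting one: two sets of more than N/2 processes each contain more than N members in total, so they cannot be disjoint.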


1. sn[r]: timestamp for register r

2. v[r]: value of register r

3. acks[r]: data structure used by the writer to track how many processes have updated their copy of the register

4. req[r] = i: denotes that a given process has invoked its i-th read operation


(1,N) regular register

When the writer writes:

• it increments the timestamp by 1

• it locally stores the value written

• it tracks 1 ack

• it broadcasts a message to propagate the write


When a process receives a write message m, it checks whether the value to be written is more recent than the last value it stored locally. In that case:

• it locally stores the new value and the corresponding timestamp

• it sends an ack to the writer. The ack piggybacks the register id and the timestamp associated with the value written


(1,N) regular register

When the writer receives an ack:

• It compares the timestamp in the message with its current one. If they are equal, it increments by 1 the control structure that tracks the number of received acks.

• When the writer has received acks from a majority of processes, i.e. a majority of processes have stored the value, the write completes.


When a process invokes a read on a register r

• it increments reqid[r] by 1

• it empties readSet[r]

• it broadcasts the read request


(1,N) regular register

When a process receives a read request, it sends back the value in its local copy of the register and the corresponding timestamp

When a process receives a response to a read request:

• It checks whether the response is for the current request

• If so, it inserts the pair (timestamp, value) in the readSet


When a process has received a majority of responses to its read request:

• It chooses the value with the highest timestamp

• It updates its local copy of the register and timestamp

• It returns the value
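The write and read steps described above can be put together in a compact simulation. It is a sketch under stated assumptions rather than the slides' algorithm: messages are replaced by direct access to the replicas, the `reachable` argument (a hypothetical parameter) names the processes that answer in a given operation, and an operation that cannot reach a majority raises an error instead of blocking.

```python
from typing import List, Optional, Set, Tuple


class MajorityVotingRegister:
    """Toy (1,N) regular register based on timestamps and majority quorums."""

    def __init__(self, n: int):
        self.n = n
        # (sn, v) stored at each process: timestamp and value, both initially 0
        self.copies: List[Tuple[int, int]] = [(0, 0)] * n
        self.writer_sn = 0

    def _quorum(self, reachable: Optional[Set[int]]) -> Set[int]:
        ids = set(range(self.n)) if reachable is None else set(reachable)
        if len(ids) <= self.n // 2:
            raise RuntimeError("no majority reachable: the operation would block")
        return ids

    def write(self, val: int, reachable: Optional[Set[int]] = None) -> None:
        ids = self._quorum(reachable)
        self.writer_sn += 1                      # new timestamp for this value
        for i in ids:
            if self.writer_sn > self.copies[i][0]:
                self.copies[i] = (self.writer_sn, val)

    def read(self, reader: int, reachable: Optional[Set[int]] = None) -> int:
        ids = self._quorum(reachable)
        sn, v = max(self.copies[i] for i in ids)  # pair with the highest timestamp
        if sn > self.copies[reader][0]:
            self.copies[reader] = (sn, v)         # refresh the reader's local copy
        return v


reg = MajorityVotingRegister(3)          # processes 0, 1, 2; process 0 is the writer
reg.write(5, reachable={0, 1})           # write quorum {0, 1}
r1 = reg.read(2, reachable={1, 2})       # read quorum {1, 2} shares process 1
reg.write(6, reachable={0, 2})
r2 = reg.read(1, reachable={0, 1})
```

The read at process 2 returns 5 even though process 2 never received the write, because its read quorum overlaps the write quorum in process 1.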


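Functioning scenario: (1,3) regular register

• Π = {p1, p2, p3}

• p1 is the writer

• I = invocation

• R = response

[Figure: timelines of p1, p2 and p3 with invocation (I) and response (R) events. write(5) is acknowledged by a write quorum whose members store v=5, sn=1; a read() collects replies from a read quorum and returns 5; write(6) is acknowledged by a write quorum whose members store v=6, sn=2.]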


(1,N) regular register

Correctness:

• Termination: from the properties of the communication primitives and the assumption of a majority of correct processes

• Validity: from the intersection property of the quorums

Performance:

• Write: at most 2N messages

• Read: at most 2N messages


Atomic Register Implementation


Events:

• Request: <on-aregRead | reg>


(1,N) Atomic Register: Specification

Termination. If a correct process invokes an operation, then the operation eventually receives the corresponding confirmation

Validity. A read operation returns the last value written or the value concurrently being written

Ordering. If a read returns v2 after a read that precedes it has returned v1, then v1 cannot have been written after v2


The algorithm consists of two phases

PHASE 1. We use a (1,N) regular register to build a (1,1) atomic register

PHASE 2. We use a set of (1,1) atomic registers to build a (1,N) atomic register

NOTATION:

Hereafter, rr and ra will sometimes be used to denote regular register and atomic register, respectively


IDEA:

• p1 is the writer and p2 is the reader of the (1,1) atomic register we aim to implement

• We use a (1,N) regular register where p1 is the writer and p2 is the reader

• Each write operation on the atomic register writes the pair (value, timestamp) into the underlying regular register

• The reader tracks the timestamp of previously read values to avoid returning an older value


To read the value stored in the atomic register, the reader:

• reads the value and the corresponding timestamp from the rr

• checks whether the timestamp of the value read from the rr is greater than the timestamp of the last value read and, if so, locally stores the new value and its timestamp

• returns the locally stored value
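The read and write steps above can be sketched as follows. This is a minimal single-process illustration, not the slides' pseudocode: the class names (`RegularRegister`, `AtomicWriter`, `AtomicReader`) are hypothetical, and the regular register is replaced by a trivial in-memory stand-in.

```python
class RegularRegister:
    """In-memory stand-in for the underlying regular register (rr).

    Assumption: a real rr may return either the old or the new pair while
    a write is in progress; this toy version simply stores one pair.
    """

    def __init__(self):
        self._pair = (None, 0)  # (value, timestamp)

    def write(self, pair):
        self._pair = pair

    def read(self):
        return self._pair


class AtomicWriter:
    """Writer p1: tags every value with an increasing timestamp."""

    def __init__(self, rr):
        self.rr = rr
        self.ts = 0

    def write(self, value):
        self.ts += 1
        self.rr.write((value, self.ts))


class AtomicReader:
    """Reader p2: remembers the most recent pair it has returned."""

    def __init__(self, rr):
        self.rr = rr
        self.last_value = None
        self.last_ts = 0

    def read(self):
        value, ts = self.rr.read()
        # Adopt the pair only if it is newer than the one already seen.
        if ts > self.last_ts:
            self.last_value, self.last_ts = value, ts
        return self.last_value
```

If the rr ever hands back an older pair after a newer one has been returned, the timestamp check makes the reader ignore it, which is exactly what the ordering property needs.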

Correctness:

• Termination – from the termination property of the regular register

• Validity – from the validity property of the regular register

• Ordering – from the validity property and from the fact that the reader tracks the last value read and its timestamp. A read operation always returns a value with a timestamp greater than or equal to that of the previously read value

Performance:

• Write – each write operation requires one write on a (1,N) regular register

• Read – each read operation requires one read on a (1,N) regular register

• NOTE: no additional messages w.r.t. the (1,1) regular register implementation

IDEA:

A (1,N) atomic register has one writer p1 and N readers

The writer p1 communicates with each reader through one of N (1,1) atomic registers

• p1 is the writer of the (1,N) atomic register and also the writer of the N (1,1) atomic registers used to communicate with the readers

• the value of variable writer[r,i] is the identifier of the (1,1) atomic register whose reader is process pi

• Each time the writer wants to write a value, it writes it into all N atomic registers

A set of N² (1,1) atomic registers is used for communication among readers

readers[r,i,j] stores the identifier of the (1,1) atomic register used by process pj to inform process pi about the last value pj read

Each time a reader pi wants to read:

• for every j, it reads the value written in register readers[r,i,j]

• it reads the value written by the writer in the atomic register shared with pi, writer[r,i]

• it decides which is the last value written, v

• it writes v into register readers[r,j,i] for every j

• it returns v

When the writer invokes a write operation on the (1,N) atomic register:

• It increments the timestamp

• It sets reading to false

• It invokes the write of the pair (timestamp, value) on all the N (1,1) atomic registers shared with the readers, i.e. writer[r,j] for all j

When all the write operations on the (1,1) atomic registers complete, the write on the (1,N) atomic register completes

Each time a write on a (1,1) atomic register writer[r,j] completes, the writer counts one more ack

When a reader pi wants to read the (1,N) atomic register:

• It empties the readSet

• For each j, it invokes the read operation on the (1,1) atomic register readers[r,i,j]

When the reader receives a response from a (1,1) atomic register, it inserts the corresponding pair (timestamp, value) in the readSet

When the reader pi has read all the registers shared with the other readers, it invokes the read on the (1,1) atomic register shared with the writer

When reader pi has also read the (1,1) atomic register shared with the writer:

• pi chooses the value with the highest timestamp

• NOTICE: pi informs all the readers about the value it read by writing into readers[r,j,i] for each j.

Each time reader pi obtains an ack for one of its writes on readers[r,j,i], it increments the number of received acks by 1

When pi has received all the acks, it returns the chosen value as the response of the (1,N) atomic register
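The write and read procedures of this phase can be put together in a short sketch. It is a sequential, single-process illustration under stated assumptions: `Atomic11` and `Atomic1N` are hypothetical names, indices are 0-based, each (1,1) atomic register is a plain in-memory cell, and acks are implicit because every call returns only when the operation is done.

```python
class Atomic11:
    """Stand-in for a (1,1) atomic register holding a (timestamp, value) pair."""

    def __init__(self):
        self.pair = (0, None)

    def write(self, pair):
        self.pair = pair

    def read(self):
        return self.pair


class Atomic1N:
    """(1,N) atomic register built from N + N*N (1,1) atomic registers."""

    def __init__(self, n):
        self.n = n
        self.ts = 0
        # writer[i]: written by the writer, read by reader i.
        self.writer = [Atomic11() for _ in range(n)]
        # readers[i][j]: written by reader j, read by reader i.
        self.readers = [[Atomic11() for _ in range(n)] for _ in range(n)]

    def write(self, value):
        self.ts += 1
        for j in range(self.n):
            self.writer[j].write((self.ts, value))

    def read(self, i):
        # Collect what the other readers reported, then the writer's register.
        read_set = [self.readers[i][j].read() for j in range(self.n)]
        read_set.append(self.writer[i].read())
        # Choose the pair with the highest timestamp.
        ts, value = max(read_set, key=lambda pair: pair[0])
        # Inform every reader about the chosen pair before returning.
        for j in range(self.n):
            self.readers[j][i].write((ts, value))
        return value
```

For example, if a write has reached only `writer[0]` when reader 0 reads, reader 0 returns the new value and records it in `readers[j][0]`, so a later read by any other reader also returns it instead of the older value still stored in its own `writer[j]`.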

Correctness:

Termination – from the termination of the (1,1) atomic register

Validity – from the validity of the (1,1) atomic register.

Ordering – Consider a write operation w1 which writes value v1 with timestamp s1. Let w2 be a write which follows w1. Let v2 and s2 (s1 < s2) be the value and the timestamp corresponding to w2.

Assume that a read by pi returns v2: by the algorithm, for each j in [1,N], pi has written (s2,v2) in readers[r,j,i].

By the ordering property of the underlying (1,1) atomic registers, each successive read will return a value with timestamp greater than or equal to s2. Hence v1, whose timestamp is s1, cannot be returned.

Performance:

• Write – each write operation on a (1,N) atomic register requires N write operations on the (1,1) atomic registers.

• Read – each read operation on a (1,N) atomic register requires reading N (1,1) atomic registers and writing N (1,1) atomic registers.

The algorithm is a modified version of the Read-One Write-All (1,N) Regular Register

IDEA: “the read operation writes”

The algorithm is called “Read-Impose Write-All” because a read operation forces all correct processes to update their local copy of the register with the value read, unless they already store a more recent value

A process can use the writeSet both when reading and writing.

The variable reading is used to distinguish the current operation

When a reader wants to read the value of the (1,N) atomic register:

• It increments by 1 the number of read requests

• It sets reading = true (“I am reading”)

• It stores the value of the local copy of the register in the variable readval

• It broadcasts a message to write that value

When the writer invokes the write of a value into the atomic register:

• It increments by 1 the number of write requests

• It broadcasts the write message

When a process delivers a write request:

• If the value in the message is more recent than the one already stored (i.e. it has a bigger timestamp), the process locally applies the value

• In any case, it sends back the ack

When an ack is delivered, if it corresponds to the current operation, the process that sent it is inserted in the writeSet

When the process has received an ack from all correct processes:

• If it was reading (reading = true), it terminates the read on the (1,N) atomic register by returning the value of readval

• If it was writing, it returns the ack to signal the completion of the write
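A rough sketch of these steps, assuming a synchronous single-process simulation in place of real message passing: `Process` and `ReadImposeWriteAll` are hypothetical names, process 0 is taken to be the writer, the `crashed` flag plays the role of the perfect failure detector, and the `reading` flag is not needed because each call runs one operation to completion.

```python
class Process:
    """Local copy of the register kept by one process."""

    def __init__(self, pid):
        self.pid = pid
        self.ts = 0
        self.val = None
        self.crashed = False

    def deliver_write(self, ts, val):
        """Apply the value only if it is more recent; ack in any case."""
        if ts > self.ts:
            self.ts, self.val = ts, val
        return self.pid  # the ack identifies its sender


class ReadImposeWriteAll:
    def __init__(self, n):
        self.procs = [Process(i) for i in range(n)]

    def _correct(self):
        return {p.pid for p in self.procs if not p.crashed}

    def _impose(self, ts, val):
        """Broadcast (ts, val) and wait for an ack from every correct process."""
        write_set = set()
        for p in self.procs:
            if not p.crashed:
                write_set.add(p.deliver_write(ts, val))
        assert self._correct() <= write_set

    def write(self, val):
        writer = self.procs[0]  # assumption: process 0 is the only writer
        self._impose(writer.ts + 1, val)

    def read(self, pid):
        reader = self.procs[pid]
        # readval is the reader's local copy; it is imposed before returning.
        readval_ts, readval = reader.ts, reader.val
        self._impose(readval_ts, readval)
        return readval
```

The point of the write-back is visible when a write reaches only some processes: once a reader that holds the new value completes a read, every correct process holds it too, so no later read can return the older one.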

Correctness:

• Termination – as for the Read-One Write-All (1,N) Regular Register.

• Validity – as for the Read-One Write-All (1,N) Regular Register.

• Ordering – to complete a read operation, the reader process has to be sure that every other correct process has in its local copy of the register a value with timestamp greater than or equal to the timestamp of the value read. In this way, no successive read can return an older value.

Performance:

• Write – a write requires at most 2N messages

• Read – a read requires at most 2N messages

Failure model: crash

A majority of correct processes is assumed.

The algorithm is a variation of the Majority Voting (1,N) Regular Register

IDEA: A read forces a majority of processes to hold the value read

When a process delivers a write message, if it contains a more recent value, that value is locally stored

In any case, it sends the ack

A process tracks the number of acks delivered

When a majority of acks is delivered, the current operation completes: if it was a read, the value to be read is returned

When a process invokes a read on the (1,N) atomic register, a read request is broadcast

Each process answers a read request with its locally stored value

When a process has received a majority of read responses, it chooses the one with the highest timestamp

The process then broadcasts a request to write this value
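The two rounds of a read (collect a majority of responses, then impose the chosen value on a majority) can be sketched as below. This is a synchronous single-process simulation under stated assumptions: `Replica`, `ReadImposeWriteMajority` and the `reachable` parameter (the list of processes that answer in this operation) are hypothetical and stand in for real message passing and crashes.

```python
class Replica:
    """Local (timestamp, value) copy held by one process."""

    def __init__(self):
        self.ts = 0
        self.val = None

    def on_write(self, ts, val):
        # Store the value only if it is more recent; ack in any case.
        if ts > self.ts:
            self.ts, self.val = ts, val
        return True

    def on_read(self):
        return (self.ts, self.val)


class ReadImposeWriteMajority:
    def __init__(self, n):
        self.replicas = [Replica() for _ in range(n)]
        self.majority = n // 2 + 1
        self.wts = 0  # writer's timestamp

    def _write_phase(self, ts, val, reachable):
        acks = sum(self.replicas[i].on_write(ts, val) for i in reachable)
        if acks < self.majority:
            raise RuntimeError("no majority of acks")

    def write(self, val, reachable):
        self.wts += 1
        self._write_phase(self.wts, val, reachable)

    def read(self, reachable):
        replies = [self.replicas[i].on_read() for i in reachable]
        if len(replies) < self.majority:
            raise RuntimeError("no majority of read responses")
        # Choose the response with the highest timestamp ...
        ts, val = max(replies, key=lambda reply: reply[0])
        # ... and impose it on a majority before returning it.
        self._write_phase(ts, val, reachable)
        return val
```

With five processes, a write acknowledged by processes 0, 1, 2 followed by a read served by processes 2, 3, 4 returns the written value, because the two majorities share process 2; after the read, processes 3 and 4 store that value as well.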

Correctness:

• Termination – as for the Majority Voting (1,N) Regular Register.

• Validity – as for the Majority Voting (1,N) Regular Register.

• Ordering – due to the fact that the read imposes the write of the value read on a majority of processes, and to the intersection property of quorums.

Performance:

• Write – at most 2N messages

• Read – at most 4N messages
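The quorum intersection property used in the ordering argument says that any two majorities of the same N processes share at least one process, since two sets of more than N/2 elements cannot be disjoint. A small brute-force check, with the hypothetical helpers `majorities` and `all_majorities_intersect`, illustrates it for small N:

```python
from itertools import combinations


def majorities(n):
    """All smallest majorities (size n // 2 + 1) of processes 0..n-1.

    Larger majorities are supersets of these, so checking the smallest
    ones is enough.
    """
    size = n // 2 + 1
    return [set(c) for c in combinations(range(n), size)]


def all_majorities_intersect(n):
    """True if every pair of majorities shares at least one process."""
    quorums = majorities(n)
    return all(a & b for a in quorums for b in quorums)
```

The 4N bound for a read is consistent with its two rounds: a read request and its responses (at most 2N messages), followed by the write-back and its acks (at most 2N more).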

(N,N) Atomic Register: Specification

Termination: If a correct process invokes an operation, then the operation eventually receives the corresponding confirmation

Atomicity: Every failed operation appears to be complete or does not appear to have been invoked at all, and every complete operation appears to have been executed at some instant between its invocation and the corresponding confirmation event.

[Figure: Scenario 1, labeled NOT ATOMIC. Timeline of processes p1 and p2 with operations write(5), write(8), read()→8, read()→5]

[Figure: Scenario 2, labeled ATOMIC. Timeline of processes p1, p2 and p3 with operations write(5), write(8), read()→8, read()→5]