Retrieval Multimedia Data from Disks Presented by Yuni Xia


Fundamental characteristics:

• Real-time storage and retrieval
• Large data transfer rate and storage space requirement

Why choose magnetic disks?

• Storage capacity
• Speed
• Moderate cost / Random access / Writing

[Figure: side view of a disk drive showing the read/write head, platters, and spindle; top view of a platter showing tracks and sectors]

Suppose we wish to read data from sector si on track ti, and the read head is currently over sector sj on track tj:

Readtime = seek(ti, tj) + rotation(si, sj) + data / dtr

seek(ti, tj) = abs(ti - tj) / rv

rotation(si, sj) = (abs(si - sj) / snum) / ss

Symbol   Meaning
tnum     total # of tracks
snum     total # of sectors
itd      intertrack distance
ss       spin speed
rv       radial velocity
dtr      data transfer rate
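The read-time formula above can be tried out numerically. The following is a minimal sketch, assuming rv is expressed in tracks per second, ss in revolutions per second, and dtr in bytes per second; the drive parameters in the example are hypothetical and not taken from the slides.

```python
def seek_time(ti, tj, rv):
    """Time to move the head between tracks ti and tj (rv assumed in tracks/second)."""
    return abs(ti - tj) / rv


def rotation_time(si, sj, snum, ss):
    """Rotational delay between sectors si and sj (ss assumed in revolutions/second)."""
    return (abs(si - sj) / snum) / ss


def read_time(ti, si, tj, sj, data, snum, ss, rv, dtr):
    """Readtime = seek + rotation + transfer, as on the slide."""
    return seek_time(ti, tj, rv) + rotation_time(si, sj, snum, ss) + data / dtr


# Hypothetical drive: 64 sectors, 120 rev/s, 20,000 tracks/s, 50 MB/s.
example = read_time(ti=500, si=10, tj=100, sj=42, data=1_000_000,
                    snum=64, ss=120.0, rv=20_000.0, dtr=50_000_000.0)
print(f"read time: {example * 1000:.2f} ms")
```

With these made-up numbers the seek and the transfer each take 20 ms and the rotation about 4 ms, so the total is roughly 44 ms.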

RAID Arrays and Placement Methods

• By spreading data across several hard disks:
  faster performance, greater storage capacity, higher data security

• Six standards: 0-5 (cross-type variations, such as 0/1, 3/5)

• Implemented in software or in hardware

RAID 0: Striped Disk Array without Fault Tolerance

RAID Level 0 requires a minimum of 2 drives to implement

Disk 0: A E I M
Disk 1: B F J N
Disk 2: C G K O
Disk 3: D H L etc.
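The striped layout in the RAID 0 diagram corresponds to a round-robin assignment of blocks to disks. A small sketch of that idea follows; the function name `stripe` is illustrative, and the four-disk count is taken from the diagram rather than required by RAID 0.

```python
def stripe(blocks, num_disks):
    """Distribute blocks round-robin: block k goes to disk k mod num_disks."""
    disks = [[] for _ in range(num_disks)]
    for index, block in enumerate(blocks):
        disks[index % num_disks].append(block)
    return disks


layout = stripe("ABCDEFGHIJKLMNO", 4)
for number, contents in enumerate(layout):
    print(f"Disk {number}: {' '.join(contents)}")
```

Consecutive blocks land on different disks, so a sequential read can draw on all drives at once.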

RAID 1: Mirroring and Duplexing

RAID Level 1 requires a minimum of 2 drives to implement

Mirrored pairs (each disk holds the same blocks as its mirror):

A B C D = A B C D
E F G H = E F G H
I J K L = I J K L
M N O P = M N O P

RAID 5: Independent Data Disks with Distributed Parity Blocks

RAID Level 5 requires a minimum of 3 drives to implement

Stripe   Disk 0     Disk 1     Disk 2     Disk 3     Disk 4
A        A0         A1         A2         A3         4 parity
B        B0         B1         B2         3 parity   B4
C        C0         C1         2 parity   C3         C4
D        D0         1 parity   D2         D3         D4
E        0 parity   E1         E2         E3         E4
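The rotating parity position in the RAID 5 diagram can be generated programmatically. The sketch below rebuilds that layout and also shows how a lost block is recovered from the others. The slides do not say how parity is computed; bytewise XOR is the usual RAID 5 choice and is used here as an assumption. The function names are illustrative.

```python
from functools import reduce


def raid5_layout(stripes, num_disks):
    """Place parity on disk (num_disks - 1 - r) for stripe r, as in the diagram."""
    layout = []
    for r, name in enumerate(stripes):
        parity_disk = (num_disks - 1 - r) % num_disks
        layout.append([f"{parity_disk} parity" if d == parity_disk else f"{name}{d}"
                       for d in range(num_disks)])
    return layout


def xor_blocks(blocks):
    """Bytewise XOR of equally sized blocks (assumed parity function)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


layout = raid5_layout("ABCDE", 5)

# Recover a lost data block from the surviving blocks plus parity.
data = [b"\x01\x02", b"\x04\x08", b"\x10\x20"]
parity = xor_blocks(data)
rebuilt = xor_blocks([data[0], data[2], parity])  # data[1] was "lost"
```

Because XOR is its own inverse, any single missing block in a stripe equals the XOR of all remaining blocks in that stripe.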

A model of heterogeneous disk servers

[Figure: a router connected to Server 1 (disks d1, d2, d3, ..., dm) through Server n (disks d1, d2, d3, ..., dn)]

What needs to be modeled?

• The intrinsic characteristics of each disk server
• The intrinsic characteristics/capabilities of each client
• The relationship between the disk servers and clients
• The distribution of data across the disk servers

Disk Server Characteristics

1. Dtr(i):

Total disk bandwidth of disk server i

2. Buf(i):

Total buffer space associated with server i

3. Switchtime(i, t):

Time required for si to switch between clients at time t

4. Cyctime(i, t):

Length of one cycle of read operations executed by si at time t

Client Characteristics

1. Cons(i,t):

The consumption rate of client Ci at time t

2. Data(i, t): (M, b)

Play: data(i, t) = {(m,b), (m, b+1), …}

FF: data(i, t) = {(m,b), (m, b+ffs), (m, b+2ffs), …}

RW: data(i, t) = {(m,b), (m, b-rws), (m, b-2rws), …}

Pause: data(i, t) = {(m,b)}

Client Characteristics

Data(i, t): (M, b, len, step)

Blocks read: b, (b+step), (b+2*step), …, (b+(len-1)*step)

1. Play: step =1

2. FF: step = ffs

3. RW: step = -rws

4. Pause: step = 0
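The (M, b, len, step) encoding can be expanded into the actual block numbers a client will read. The following sketch does that for the four modes listed above; the helper names `make_request` and `blocks` and the mode strings are illustrative, not part of the slides.

```python
def make_request(mode, movie, b, length, ffs=1, rws=1):
    """Build a (movie, b, len, step) tuple for play / ff / rw / pause."""
    step = {"play": 1, "ff": ffs, "rw": -rws, "pause": 0}[mode]
    return (movie, b, length, step)


def blocks(request):
    """Expand a request into block numbers b, b+step, ..., b+(len-1)*step."""
    _, b, length, step = request
    return [b + k * step for k in range(length)]


print(blocks(make_request("play", "m", 10, 4)))           # normal playback
print(blocks(make_request("ff", "m", 10, 4, ffs=3)))      # fast forward
print(blocks(make_request("rw", "m", 10, 4, rws=2)))      # rewind
print(blocks(make_request("pause", "m", 10, 4)))          # pause repeats block b
```

One tuple format covers all four modes, which is why the later constraints only ever need to reason about (m, b, len, step).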

Client-Server Characteristics

1. Timealloc(i, j, t):

• In any given cycle of disk server i, each client cj has a time-slice, timealloc(i, j, t)
• cyctime(i, t) >= sum(timealloc(i, j, t)) + (n_{i,t} * switchtime(i, t))

2. active(t): The set of all clients that are active at time t.

3. d_active(i, t): The set of clients served by server i at time t.

active(t) = Union(d_active(i, t))

Client-Server Characteristics

4. Ut(i):

The set of servers which are handling the requests of client Ci.

Ut(i) = { s | Ci ∈ d_active(s, t) }

5. Bufreq(j, i, t): The amount of buffer that is required at server Si so that data that client Cj needs to read doesn’t get overwritten.

Buf(i) >= sum(bufreq(j, i, t))
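The client-server relations above translate directly into set operations and two inequality checks. The sketch below assumes d_active is stored as a dictionary from server id to a set of client ids, and that n_{i,t} is the number of clients with a time-slice on server i; both are modeling assumptions, and all names are illustrative.

```python
def active(d_active):
    """active(t): union of d_active(i, t) over all servers i."""
    return set().union(*d_active.values())


def servers_of(client, d_active):
    """Ut(client): servers s with client in d_active(s, t)."""
    return {s for s, clients in d_active.items() if client in clients}


def cycle_ok(cyctime, timealloc, switchtime):
    """cyctime >= sum of time-slices + n * switchtime (n assumed = number of slices)."""
    return cyctime >= sum(timealloc.values()) + len(timealloc) * switchtime


def buffer_ok(buf, bufreq):
    """Buf(i) >= sum of bufreq(j, i, t) over clients j."""
    return buf >= sum(bufreq.values())


d_active = {1: {"c1", "c2"}, 2: {"c2", "c3"}}
print(active(d_active))            # all three clients
print(servers_of("c2", d_active))  # servers 1 and 2
print(cycle_ok(1.0, {"c1": 0.4, "c2": 0.4}, switchtime=0.05))
```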

Distribution of Data

M (mi , b) : placement mapping

• The set of all servers that contain block b of mi

• M ( “Sound of Music”, 20 ) = {2, 4, 5}

Placement constraint

data(C, t) = (m, b, len, step) ⇒ (∀ i, 0 ≤ i < len) (∃ j ∈ Ut(C)) j ∈ M(m, b + (i * step))

Suppose: Data(C, t) = (M, 5, 5, 3),

Ut(C) = {1, 3, 4}

Each of the blocks {5, 8, 11, 14, 17} must be stored on at least one of {S1, S3, S4}
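The placement constraint can be checked mechanically once the placement mapping is available. In this sketch the mapping is a dictionary from (movie, block) to the set of servers holding that block; the contents of the dictionary are invented for illustration and only the request (M, 5, 5, 3) and the server set {1, 3, 4} come from the example above.

```python
def placement_ok(request, assigned, placement):
    """True if every requested block is stored on at least one assigned server."""
    movie, b, length, step = request
    return all(assigned & placement.get((movie, b + k * step), set())
               for k in range(length))


# Hypothetical placement of blocks 5, 8, 11, 14, 17 of movie "M".
placement = {
    ("M", 5): {1},
    ("M", 8): {3, 5},
    ("M", 11): {4},
    ("M", 14): {1, 2},
    ("M", 17): {3},
}

print(placement_ok(("M", 5, 5, 3), {1, 3, 4}, placement))  # constraint holds
print(placement_ok(("M", 5, 5, 3), {1, 3}, placement))     # block 11 is unreachable
```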

Definition: State of an MOD System S(t)

1. Active(t)
2. Cyctime(i, t)
3. Cons(i, t)
4. Timealloc(i, j, t)
5. Data(i, t)
6. Ut

Disk availability constraint

1. Consumption rate constraint:

Sum(cons(j, t)) + switchtime(i, t) * dtr(i) / cyctime(i, t) <= dtr(i)

2. Buffer requirement constraint:

sum(bufreq(j, i, t)) <= buf(i)

timealloc (i, j, t) = cyctime (i, t)* cons (j, t) / dtr(i)

bufreq(j,i,t) = (dtr(i)-cons(j,t))* timealloc(i, j, t)
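A worked example may help connect the two formulas to the two constraints. The sketch below implements them exactly as written on the slide for a single server; the numbers (a 100 MB/s server, a 1 s cycle, two clients consuming 20 and 30 MB/s, a 10 ms switch time, 64 MB of buffer) are hypothetical.

```python
def timealloc(cyctime, cons, dtr):
    """Time-slice needed to read one cycle's worth of data for a client."""
    return cyctime * cons / dtr


def bufreq(dtr, cons, alloc):
    """Data that piles up while the server reads faster than the client consumes."""
    return (dtr - cons) * alloc


def consumption_ok(cons_rates, switchtime, dtr, cyctime):
    """Consumption rate constraint, term for term as on the slide."""
    return sum(cons_rates) + switchtime * dtr / cyctime <= dtr


dtr, cyctime, switchtime, buf = 100.0, 1.0, 0.01, 64.0
cons_rates = [20.0, 30.0]

slices = [timealloc(cyctime, c, dtr) for c in cons_rates]       # seconds per client
buffers = [bufreq(dtr, c, a) for c, a in zip(cons_rates, slices)]  # MB per client

print(slices, buffers)
print(consumption_ok(cons_rates, switchtime, dtr, cyctime))
print(sum(buffers) <= buf)
```

Here the clients get slices of 0.2 s and 0.3 s and need about 16 MB and 21 MB of buffer, so both constraints hold with room to spare.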

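[Figure: a router in front of three servers holding blocks 1-150 (Server 1), 151-250 (Server 2), and 200-300 (Server 3). Two requests, (mi, 140, 2, 5) and (mi, 199, 2, 1), read blocks (140, 145) and (199, 200); the following reads are (150, 155) and (201, 202).]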

Trans   Transaction type                        Priority
tr1     exiting client                          5
tr2     continuing client - normal              4
tr3     continuing client - needs switching     3
tr4     continuing client - needs splitting     3
tr5     new client                              2
tr6     new client - needs splitting            1

An event-based algorithm: QuickSOL

• FindSOL
• OptimizeSOL

FindSOL Phase:

1. Split EV(t) into 6 sets:

new(t), exit(t), cont(t), pause(t), ff(t), rew(t)

2. (Handle exiting clients) For each client Ci in exit(t) do

1) free the resources

2) delete Ci from the state table

3. (Handle continuing clients) For each client Ci in cont(t) or ff(t) or rew(t) do

If the servers currently assigned to Ci satisfy .. then modify the state table
else
1) reset Ci's priority to 3
2) move it into new(t)
3) update the resource table

4. (Handle new clients) For each client Ci in new(t) do

1) Identify the servers that have the data required by Ci

2) Determine which servers have enough bandwidth ..
• If no such server is available, split the event into 2 sub-events:
  data(C, t) = (m, s, l/2, 2*step) and
  data(C, t) = (m, s+step, l/2, 2*step)
• Keep splitting till for both sub-events ...
• Update the state table

3) Do the same as 2) in terms of buffer requirement
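The splitting rule in step 4 interleaves the original request into two sub-requests with a doubled step. The sketch below shows that rule and a recursive "keep splitting" loop. Two things are assumptions rather than part of the slides: odd lengths are handled by giving the extra block to the first half, and the test for whether a server can take a request is left to a caller-supplied predicate `can_serve`, since the slides elide that condition.

```python
def split(request):
    """Split (m, s, l, step) into (m, s, l/2, 2*step) and (m, s+step, l/2, 2*step)."""
    m, s, length, step = request
    first = (m, s, (length + 1) // 2, 2 * step)   # assumption: extra block goes here
    second = (m, s + step, length // 2, 2 * step)
    return first, second


def split_until_servable(request, can_serve):
    """Return sub-requests that each pass can_serve, or None if that is impossible."""
    if can_serve(request):
        return [request]
    if request[2] <= 1:          # a single block cannot be split further
        return None
    parts = []
    for sub in split(request):
        served = split_until_servable(sub, can_serve)
        if served is None:
            return None
        parts.extend(served)
    return parts


# Hypothetical capacity rule: a server accepts at most 2 blocks per request.
pieces = split_until_servable(("m", 0, 8, 1), lambda r: r[2] <= 2)
print(pieces)
```

Each split keeps every block of the original request in exactly one sub-request, so the pieces can be assigned to different servers without reading anything twice.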

OptimizeSOL phase:

1. Switching
2. Splitting

Balancing the load, maximizing the # of clients ...