Approximation Algorithms for Task Allocation with QoS and Energy Considerations

Bader N. Alahmad

2

Problems we consider

Task allocation such that QoS score achieved & number of processors minimized

a) Variation 1: number of service classes (QoS levels) fixed, built into platform

b) Variation 2: Minimum service class should be maintained, arbitrary number of service classes

Task Allocation such that QoS score achieved & Overall Energy minimized

3

Problems are NP-Hard

Reduction from Bin-Packing and PARTITION

Bin-Packing (decision): strongly NP-complete

PARTITION (decision): NP-complete in the ordinary (weak) sense

4

Approximability Everywhere!

Input: $I$ (problem instance), $\epsilon$ (error factor)

Algorithm $A$

Output: $s$ (solution), with

$$\frac{|value(s) - OPT(I)|}{\max(value(s),\, OPT(I))} \le \epsilon$$

PTAS: running time polynomial in $|I|$ for every fixed $\epsilon$

FPTAS: running time polynomial in $|I|$ and $1/\epsilon$
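The error measure on this slide can be sketched in a few lines of Python. The function names `relative_error` and `within_factor` are illustrative only, and the sketch assumes strictly positive objective values.

```python
def relative_error(value, opt):
    """|value - opt| / max(value, opt); symmetric in minimization and maximization."""
    if value == opt:
        return 0.0
    return abs(value - opt) / max(value, opt)


def within_factor(value, opt, eps):
    """True if the solution value meets the error factor eps."""
    return relative_error(value, opt) <= eps


# Minimization flavour: 11 processors used when 10 are optimal.
err_min = relative_error(11, 10)
# Maximization flavour: reward 9 achieved when 10 is optimal.
err_max = relative_error(9, 10)
```

Dividing by the larger of the two values keeps the measure in [0, 1] for both minimization and maximization problems.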

5

Quality of Service (QoS)

Tolerable error

Minimum precision required

Service provider

Client

Service Level Agreement

The service will provide a response within 300 ms for 99.9% of its requests for a peak client load of 500 requests per second

6

QoS example: Search Engines

“Woongki Baek and Trishul M. Chilimbi. 2010. Green: a framework for supporting energy-conscious programming using controlled approximation. In Proceedings of the 2010 ACM SIGPLAN conference on Programming language design and implementation (PLDI '10).”

Accept user query

Search index for matching docs

Rank docs, return the top N docs in ranked order

QoS loss metric: % of queries that return

a different set of top N docs

the same set of top N docs in a different rank order

Tradeoff (approximation)

Execute fewer ranking cycles for a faster response
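A minimal sketch of how the two QoS loss percentages on this slide could be measured, assuming each query's answer is a ranked list of document ids and that an exact (non-approximated) ranking is available for comparison. The function name `qos_loss` and the sample data are hypothetical.

```python
def qos_loss(exact, approx, n):
    """Return (% of queries with a different top-n set,
               % of queries with the same top-n set in a different order)."""
    different_set = 0
    reordered = 0
    for e, a in zip(exact, approx):
        e, a = list(e[:n]), list(a[:n])
        if set(e) != set(a):
            different_set += 1      # membership of the top n changed
        elif e != a:
            reordered += 1          # same docs, different rank order
    total = len(exact)
    return 100.0 * different_set / total, 100.0 * reordered / total


exact = [["d1", "d2", "d3"], ["d4", "d5", "d6"], ["d7", "d8", "d9"], ["d1", "d5", "d9"]]
approx = [["d1", "d2", "d3"], ["d5", "d4", "d6"], ["d7", "d8", "d2"], ["d1", "d5", "d9"]]
loss_set, loss_rank = qos_loss(exact, approx, n=3)
```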

9

Quality of Service

A task comes equipped with service class specifications (utilization, reward) per processor

$\epsilon \le utilization \le 1$, $\epsilon \le reward \le 1$

10

Task Allocation with Service Classes (TASC)

Tasks associated with resource function.

Identical processors of unit capacity

[Figure: Task1, Task2, Task3]

General Solution Method: Memoization

11

Fixed Service Classes

Variation 1: Service classes are fixed

Service classes (pool) predetermined by provider (platform):

$(u_1, q_1), (u_2, q_2), \dots, (u_{k-1}, q_{k-1}), (u_k, q_k)$

[Figure: Task1, Task2, Task3 next to the pool of service classes]

12

FSC - Assign Tasks to Service Classes

For each assignment of the form , verify that there is a set of tasks such that

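The number of such assignments follows from a bins-and-balls count.

Since the number of service classes is fixed, the number of assignments is polynomial in the number of tasks.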
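The bins-and-balls count can be made concrete under one assumption that the extracted slide does not confirm: an assignment is taken to be a vector (n_1, ..., n_k) saying how many of the n tasks run in each of the k service classes. The helper names below are illustrative.

```python
from itertools import combinations
from math import comb


def count_vectors(n, k):
    """Number of ways to split n identical balls over k bins (stars and bars)."""
    return comb(n + k - 1, k - 1)


def enumerate_vectors(n, k):
    """Yield every (n_1, ..., n_k) of non-negative integers summing to n."""
    # Choose the positions of the k-1 "bars" among n+k-1 slots;
    # the gaps between consecutive bars are the bin sizes.
    for bars in combinations(range(n + k - 1), k - 1):
        prev = -1
        vec = []
        for b in bars:
            vec.append(b - prev - 1)
            prev = b
        vec.append(n + k - 1 - prev - 1)
        yield tuple(vec)


vectors = list(enumerate_vectors(4, 3))
```

For fixed k the count is a polynomial of degree k-1 in n, which is why enumerating all such vectors stays polynomial.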

13

Verify Assignment - Max Flow in a bipartite graph for each assignment

Example assignment
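A sketch of the max-flow check, under assumptions not stated on the slide: each task lists the service classes it can run in, an assignment gives a required task count per class, and the assignment is feasible when a flow saturating every task exists. The graph layout and the names `max_flow` and `assignment_feasible` are illustrative, and Edmonds-Karp is used only because it is short.

```python
from collections import defaultdict, deque


def max_flow(capacity, source, sink):
    """Edmonds-Karp on a dict-of-dicts residual graph (mutated in place)."""
    flow = 0
    while True:
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:      # BFS for a shortest augmenting path
            u = queue.popleft()
            for v, cap in capacity[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return flow
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:             # find the path bottleneck
            bottleneck = min(bottleneck, capacity[parent[v]][v])
            v = parent[v]
        v = sink
        while parent[v] is not None:             # push flow along the path
            u = parent[v]
            capacity[u][v] -= bottleneck
            capacity[v][u] += bottleneck
            v = u
        flow += bottleneck


def assignment_feasible(supported, counts):
    """supported: task -> set of classes it may use; counts: class -> required tasks."""
    capacity = defaultdict(lambda: defaultdict(int))
    for task, classes in supported.items():
        capacity["src"][("task", task)] = 1
        for c in classes:
            capacity[("task", task)][("class", c)] = 1
    for c, n_c in counts.items():
        capacity[("class", c)]["sink"] = n_c
    return max_flow(capacity, "src", "sink") == len(supported) == sum(counts.values())


supported = {"T1": {"a", "b"}, "T2": {"a"}, "T3": {"b"}}
ok = assignment_feasible(supported, {"a": 2, "b": 1})
bad = assignment_feasible(supported, {"a": 3, "b": 0})
```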

14

Minimum Makespan & Bin-Packing are DUAL problems

There exists a packing of items with execution times to bins with capacity t each iff there exists a schedule with makespan at most t.

[Figure: the same items $e_1, \dots, e_6$ shown three ways: as a schedule with makespan t, as a bin packing with bin capacity t, and, after normalizing each $e_i$ to $e_i/t$, as a bin packing with bin capacity 1]

15

Find minimum number of bins for each assignment

Apply bin-packing PTAS for each assignment n times. Why?

[bin packing: de la Vega and Lueker 81] [minimum makespan: Hochbaum and Shmoys 88]

Result

PTAS for TASC!

16

Minimum Service Class (MSC)

Variation 2: Relax fixed service classes

But: require minimum service class to be maintained. Why?

What is the maximum number of service classes? Are they bounded?

Solution: Quantize!

Why??

17

Quantize Service Classes - qMSC

$(u, q) \to (\epsilon + \alpha\epsilon^2,\ \epsilon + \beta\epsilon^2)$

Setting will do the trick !

At most $\lceil 1/\epsilon^4 \rceil$ service classes!

Catch! Processors may need to have capacity $(1+\epsilon)$

Results

PTAS for qMSC
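A sketch of the quantization grid, with one loud assumption: the choice of α and β did not survive extraction, so the code rounds both utilization and reward down to the nearest grid point ε + αε². Under that assumption, rounding costs at most ε², which is at most ε times the quantized value, consistent with the (1+ε) capacity catch. Function names are illustrative.

```python
from fractions import Fraction
from math import ceil, floor


def quantize(x, eps):
    """Round x in [eps, 1] down to the grid eps + alpha * eps**2 (assumed direction)."""
    alpha = floor((x - eps) / eps**2)
    return eps + alpha * eps**2


def quantize_class(u, q, eps):
    """Quantize a (utilization, reward) service class."""
    return quantize(u, eps), quantize(q, eps)


eps = Fraction(1, 4)
u, q = Fraction(3, 5), Fraction(9, 10)
uq, qq = quantize_class(u, q, eps)

# Grid points per coordinate in [eps, 1]; pairs of them bound the class count.
levels = floor((1 - eps) / eps**2) + 1
```

With ε = 1/4 there are 13 grid values per coordinate, so at most 169 quantized classes, below the bound of 256.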

18

Incorporate Energy Expenditure and platform heterogeneity!

19

Platform

K heterogeneous physical processor types: AMD, Intel, …

At most S speeds per processor

[Figure: logical processors. Processor1 with speeds $s_{1,1}, s_{1,2}$; Processor2 with speeds $s_{2,1}, s_{2,2}, s_{2,3}$]

Distinct physical processors have different speed levels

Processor speeds dynamically adjustable

20

Platform - Continued

Platform consists of M processor “slots” to be filled

Need to build platform with the available processors.

M logical processors: (physical processor, speed) pairs

possible platform configurations
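Enumerating platform configurations might look like the sketch below. The count on the slide was lost in extraction, so the code assumes the M slots are interchangeable and treats a configuration as a multiset of (processor type, speed) pairs; if slots were distinguishable, an ordered product would be the right enumeration instead. All names and speed values are hypothetical.

```python
from itertools import combinations_with_replacement


def logical_processors(speeds_by_type):
    """All (physical processor type, speed) pairs."""
    return [(ptype, s) for ptype, speeds in speeds_by_type.items() for s in speeds]


def platform_configurations(speeds_by_type, m):
    """Every way to fill m interchangeable slots with logical processors."""
    return combinations_with_replacement(logical_processors(speeds_by_type), m)


speeds = {"P1": [1.0, 2.0], "P2": [1.0, 1.5, 3.0]}
configs = list(platform_configurations(speeds, 2))
```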

21

Tasks look like this

[Figure: Task1 and Task2, each with one specification per processor type (processor type 1, processor type 2, processor type 3)]

22

Prior attempts related to our work

Closest to our efforts:

C.-Y. Yang, J.-J. Chen, T.-W. Kuo, and L. Thiele. An approximation scheme for energy-efficient scheduling of real-time tasks in heterogeneous multiprocessor systems. In DATE, pages 694–699, 2009.

Energy model is weak

Utilization scales linearly with speed

Cannot capture holistic energy expenditure (devices)

Interpolates speed! Might get very inaccurate

23

Linear Scaling of Utilization: I/O bound Tasks

Our work

Relax all previous assumptions

Completely discrete, arbitrarily structured setting

[Figure: at speed f, Task1 takes t (cpu) and Task2 takes t/2 (I/O) + t/2 (cpu); at speed 2f, Task1 takes t/2 while Task2 takes t/2 (I/O) + t/4 (cpu)]
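The point about I/O-bound tasks can be checked numerically, assuming (as the figure suggests) that only the CPU portion of a task shrinks when the speed doubles. The function name and the value t = 8 are illustrative.

```python
def execution_time(cpu_time, io_time, speed_factor):
    """Only the CPU part scales with processor speed; I/O time is unaffected."""
    return io_time + cpu_time / speed_factor


t = 8.0
# Task1 is purely CPU bound; Task2 spends half its time on I/O.
task1_f, task1_2f = execution_time(t, 0.0, 1), execution_time(t, 0.0, 2)
task2_f, task2_2f = execution_time(t / 2, t / 2, 1), execution_time(t / 2, t / 2, 2)
```

Doubling the speed halves Task1's time but only brings Task2 down to 3t/4, so a model in which utilization scales linearly with speed misjudges Task2.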

26

Energy Model

Dynamic power consumption per (processor/speed)

Static power consumption per physical processor (independent of speed)

Idle power: when processor in dormant mode


28

Objectives

Overall Energy Expenditure minimized

Quality of Service Score achieved

29

Offline Dynamic Program

Enumerate platform configurations

Iterate through tasks

Assign tasks to processors

Build state space per task from states of previous task

The solution is the state of the last task

30

State space

Each task maintains a set of states

A state is a partial feasible schedule up to the current task across all processors.

[Figure: two partial schedules of tasks T1 to T8 across the processors, labeled as states $\varphi$ and $\psi$]

31

Trimming the state space

How to bring down the size of the state space to polynomial while controlling the error propagation?

32

Measure of nearness: Delta-close states

$$\frac{1}{\Delta}\, E_j^{\psi_i} \le E_j^{\varphi_i} \le \Delta\, E_j^{\psi_i}$$

Need to choose $\Delta$ properly $\to$ $\left(1 + \frac{\epsilon}{2N}\right)$

33

State Space partitioned into $\Delta$-Boxes

[Figure: the states $\phi_{i,1}, \dots, \phi_{i,22}$ of task $i$ scattered over a grid of $\Delta$-boxes]

34

Choose the dominating state in each $\Delta$-box

Dominating: has maximum Reward value
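The trimming step can be sketched as follows. The representation is an assumption: a state is a pair (per-processor energies, reward) with strictly positive energies, a box is a cell of the geometric grid with ratio Δ in every coordinate, and N is taken to be the number of tasks. Names and sample values are illustrative.

```python
import math


def box_index(energies, delta):
    """Cell of the geometric grid: energy e lies in box floor(log_delta(e))."""
    return tuple(math.floor(math.log(e, delta)) for e in energies)


def trim(states, delta):
    """Keep, per box, only the state with the maximum reward."""
    best = {}
    for energies, reward in states:
        key = box_index(energies, delta)
        if key not in best or reward > best[key][1]:
            best[key] = (energies, reward)
    return sorted(best.values())


eps, n_tasks = 0.5, 2
delta = 1 + eps / (2 * n_tasks)          # 1.125, the slide's choice with N = n_tasks
states = [
    ((1.01, 2.00), 0.5),                 # same box as the next state
    ((1.05, 2.02), 0.9),                 # higher reward, so it dominates
    ((3.00, 2.00), 0.4),                 # alone in its box
]
trimmed = trim(states, delta)
```

Two states in the same box differ by a factor of at most Δ in every energy coordinate, so discarding the dominated one loses at most a factor of Δ per task.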

35

Results

Quality of Solution: $Energy \le (1+\epsilon)\, OPT$

Running Time

36

Approximation Schemes vs. Heuristics

Approximation Algorithms

Guaranteed worst-case bounds on quality of solution

Running time might be too large to be used in practice

Heuristics

Stochastic local search, greedy, …

Hard to design “good” ones

Hard to obtain guarantees on quality of returned solution

Might converge quickly, depending on input

38

Heuristics were hard to analyze

Go Greedy!

Compute energy per processor for each task

Pack the task with the smallest (energy per processor / reward per same processor) first, if it fits that processor
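One possible reading of the greedy rule, as a sketch: among all (task, processor) pairs that still fit, repeatedly pick the one with the smallest energy-to-reward ratio. The input layout (one (utilization, energy, reward) triple per processor), unit processor capacity, tie-breaking by name, and all sample values are assumptions.

```python
def greedy_allocate(tasks, num_processors):
    """tasks: name -> list of (utilization, energy, reward), one entry per processor."""
    load = [0.0] * num_processors
    placement = {}
    remaining = set(tasks)
    while remaining:
        candidates = [
            (energy / reward, name, p)
            for name in sorted(remaining)
            for p, (util, energy, reward) in enumerate(tasks[name])
            if load[p] + util <= 1.0                 # task must fit that processor
        ]
        if not candidates:
            break                                    # nothing else fits anywhere
        _, name, p = min(candidates)                 # smallest energy / reward first
        load[p] += tasks[name][p][0]
        placement[name] = p
        remaining.remove(name)
    return placement, sorted(remaining)


tasks = {
    "T1": [(0.5, 4.0, 0.8), (0.5, 2.0, 0.5)],
    "T2": [(0.75, 3.0, 1.0), (0.5, 6.0, 1.0)],
    "T3": [(0.5, 1.0, 0.5), (0.75, 9.0, 0.9)],
}
placement, unplaced = greedy_allocate(tasks, 2)
```

In this run T2 ends up on its more expensive processor because its cheaper option no longer fits once T3 is placed, which illustrates why such heuristics are hard to bound.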

39

Was it Fun?

Thank you

40

Mathematical Program