Multi-core Real-Time Scheduling for Generalized Parallel Task Models
Abusayeed Saifullah, Kunal Agrawal, Chenyang Lu, Christopher Gill
Multi-core processors provide an opportunity to schedule computation-intensive tasks in real time
Most of these tasks exhibit intra-task parallelism
Real-time systems need to be developed to exploit intra-task parallelism
Real-Time Systems on Multi-core
Traditional multiprocessor scheduling
    Focuses on inter-task parallelism
    Mostly restricted to sequential task models
Computation-intensive complex real-time tasks are growing
    Video surveillance
    Radar tracking
    Hybrid real-time structural testing
Parallel Task Model
Each horizontal bar indicates a thread of execution (a sequence of instructions)
Parallel threads form a segment
Threads of each segment synchronize at the end of the segment
Lakshmanan et al. (RTSS ’10) have addressed a restricted synchronous model where
    A task is an alternating sequence of parallel and sequential segments
    The total number of threads in each segment ≤ number of cores
    All parallel segments have an equal number of threads
[Figure: synchronous task model. A task with five segments (Segment 1 to Segment 5); the threads of Segment 1 synchronize at the end of the segment.]
Our Contributions
We address a general synchronous parallel task model
    Different segments may have different numbers of threads
    Each segment can have an arbitrary number of threads
Example: such tasks are generated by
    Parallel for loops in OpenMP, CilkPlus
    Barrier primitives in thread libraries
This model is more portable
    The same program can execute on machines with different numbers of cores
A Task Example
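/* Each loop is a parallel for; with OpenMP, build with e.g. gcc -fopenmp */
void parallel_task(float *a, float *b, float *c, float *d) {
    int n = 7;
    /* First parallel segment: 7 independent iterations */
    #pragma omp parallel for
    for (int i = 0; i < n; i++) c[i] = a[i] + b[i];
    n = 4;
    /* Second parallel segment: 4 independent iterations */
    #pragma omp parallel for
    for (int i = 0; i < n; i++) d[i] = a[i] - b[i];
}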
Our Contributions (contd..)
We propose a task decomposition for the general synchronous parallel task model
    Decomposes each parallel task into a set of sequential subtasks
    Subtasks are scheduled like traditional tasks
Why decomposition?
    We can exploit the rich literature of multiprocessor scheduling
    The proposed decomposition ensures that if the decomposed tasks are schedulable, the original task set is also schedulable
Our Contributions (contd..)
We analyze schedulability in terms of a processor speed augmentation bound
    Speed augmentation bound ν for an algorithm A: if an optimal algorithm can schedule a synchronous parallel task set on unit-speed processor cores, then A can schedule the decomposed tasks on ν-speed processor cores
We prove that the proposed decomposition requires a speed augmentation of at most
    4 for Global Earliest Deadline First (G-EDF) scheduling
    5 for Partitioned Deadline Monotonic (P-DM) scheduling
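The definition of the speed augmentation bound can be written compactly. This is only a sketch under assumed notation: τ for the original parallel task set, τ^dec for its decomposed version, and m for the number of cores are symbols introduced here, not taken from the slides.

\[
\tau \text{ schedulable by an optimal algorithm on } m \text{ cores of speed } 1
\;\Longrightarrow\;
\tau^{\mathrm{dec}} \text{ schedulable by } A \text{ on } m \text{ cores of speed } \nu
\]

On a core of speed ν, a thread with execution requirement e finishes in e/ν time units, so any density computed for unit-speed cores shrinks by a factor of ν on ν-speed cores.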
Overview of a Task Decomposition
Each thread of the task becomes an individual task with
    An intermediate subdeadline
    A release offset to retain precedence relations in the original task
Deadlines are assigned by distributing slack among segments
    Deadline of a thread = execution requirement + assigned slack
How much slack a segment demands depends on
    Available slack of the task
    Execution requirement of the segment
Execution requirement of a segment is the product of
    Total number of parallel threads in the segment and
    Execution requirement of each thread in the segment
Larger execution requirement implies more demand for slack
    In the figure, Segment 1 requires more slack than Segment 2
Slack Distribution
We use the following principle to distribute slack
    All segments that receive slack will achieve an equal density
Reasons to equalize the density among segments
    Fairness: deadline of each segment becomes proportional to its execution requirement
    We can bound the density of the decomposed tasks
    We can exploit existing density-based analyses for multiprocessor scheduling
\[ \text{Density of a task} = \frac{\text{execution requirement}}{\text{deadline}} \]
\[ \text{Density of a segment } S = \frac{(\text{total threads in } S) \times (\text{exec. req. of a thread})}{\text{assigned deadline}} \]
Slack Distribution (contd..)
…
Slack of each segment is determined by solving the equalities
    Sum of subdeadlines = task deadline (total assigned slack = task slack)
    Density of Segment 1 = density of Segment 2 = and so on
All threads in a segment have the same deadline and offset
    Deadline = execution requirement of the thread + segment slack
    Release offset = sum of deadlines of preceding segments
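Solving these equalities gives a closed form in the simplified case where every segment receives slack. The symbols are introduced here for illustration only: segment i has m_i threads, each with execution requirement e_i, so its execution requirement is C_i = m_i e_i; d_i is its subdeadline, D is the task deadline, and ρ is the common density.

\[
\frac{C_i}{d_i} = \rho \ \text{ for all } i, \qquad \sum_i d_i = D
\quad\Longrightarrow\quad
\rho = \frac{\sum_j C_j}{D}, \qquad d_i = \frac{C_i}{\rho} = D \cdot \frac{C_i}{\sum_j C_j}
\]

The release offset of segment i is then \(\sum_{k<i} d_k\). A subdeadline cannot be shorter than the thread it contains, so this form applies only when d_i ≥ e_i, that is m_i ≥ ρ, for every segment; otherwise some segments cannot take part in the equalization, and this sketch does not cover that case.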
An Example of Task Decomposition
Segment 1:
deadline=20
density=(5*4)/20=1
Segment 2:
deadline=4
density=(2*2)/4=1
Segment 3:
deadline=9
density=(3*3)/9=1
Segment 4:
deadline=16
density=(4*4)/16=1
Segment 5:
deadline=3
density=(1*3)/3=1
All segments have an equal density!
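The numbers in this example can be reproduced with a small routine. The following is a minimal sketch, not the authors' implementation: it assumes every segment receives slack, reads each product such as (5*4) as 5 threads with execution requirement 4 (following the order used in the density formula), and takes the task deadline to be 52, the sum of the listed segment deadlines. The type and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* One segment of a synchronous parallel task (field names are illustrative). */
typedef struct {
    int threads;        /* number of parallel threads in the segment */
    double thread_exec; /* execution requirement of each thread */
    double deadline;    /* output: subdeadline shared by all threads */
    double offset;      /* output: release offset shared by all threads */
} segment_t;

/*
 * Assign subdeadlines so that all segments have the same density and the
 * subdeadlines sum to the task deadline. Simplifying assumption: every
 * segment receives slack. Returns the common density, or -1.0 (leaving the
 * segments untouched) if some subdeadline would be shorter than its thread.
 */
double decompose_equal_density(segment_t *seg, size_t n, double task_deadline)
{
    assert(seg != NULL && task_deadline > 0.0);

    /* Total execution requirement: sum of threads * per-thread requirement. */
    double total_exec = 0.0;
    for (size_t i = 0; i < n; i++)
        total_exec += seg[i].threads * seg[i].thread_exec;

    double density = total_exec / task_deadline;
    if (density <= 0.0)
        return -1.0;

    /* deadline >= thread_exec holds exactly when threads >= density. */
    for (size_t i = 0; i < n; i++)
        if (seg[i].threads < density)
            return -1.0;

    /* Subdeadline = segment requirement / density; offsets accumulate. */
    double offset = 0.0;
    for (size_t i = 0; i < n; i++) {
        seg[i].deadline = seg[i].threads * seg[i].thread_exec / density;
        seg[i].offset = offset;
        offset += seg[i].deadline;
    }
    return density;
}
```

Called on the five segments above with a task deadline of 52, the routine returns a density of 1 and assigns the subdeadlines 20, 4, 9, 16, 3 with release offsets 0, 20, 24, 33, 49. With a tighter task deadline such as 26 the common density would be 2, Segment 5 (one thread) could not meet it, and the routine reports failure instead of guessing how the authors handle that case.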
Global EDF (G-EDF) Schedulability
A sufficient condition for G-EDF scheduling on m unit-speed cores [Baruah RTSS ’07]
A necessary condition for any task set for any scheduler
The conditions are expressed in terms of the total density and the max density of the task set
If the original task set is schedulable by some scheduler on m unit-speed cores, the decomposed tasks are schedulable under G-EDF on 4-speed cores
    Using the density bounds for decomposed tasks
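A density-based G-EDF check is easy to express in code. The sketch below assumes the sufficient condition has the widely used form: total density ≤ m − (m − 1) × max density. That inequality, the type, and the function name are assumptions added here for illustration, not text from the slides. The speed parameter models faster cores by dividing each execution requirement by the core speed.

```c
#include <assert.h>
#include <stddef.h>

/* A sequential (decomposed) task; field names are illustrative. */
typedef struct {
    double exec;     /* execution requirement on a unit-speed core */
    double deadline; /* relative deadline (subdeadline after decomposition) */
} seq_task_t;

/*
 * Returns 1 if the assumed density condition
 *     total density <= m - (m - 1) * max density
 * holds on m cores of the given speed, and 0 otherwise.
 */
int gedf_density_test(const seq_task_t *tasks, size_t n, unsigned m, double speed)
{
    assert(tasks != NULL || n == 0);
    if (m == 0 || speed <= 0.0)
        return 0;

    double total = 0.0, max = 0.0;
    for (size_t i = 0; i < n; i++) {
        if (tasks[i].deadline <= 0.0)
            return 0;
        /* A core of speed s completes s units of work per time unit. */
        double density = tasks[i].exec / speed / tasks[i].deadline;
        total += density;
        if (density > max)
            max = density;
    }
    return total <= (double)m - ((double)m - 1.0) * max;
}
```

For instance, three tasks that each need 1 unit of work within a deadline of 1 fail the check on 3 unit-speed cores (total density 3, max density 1), but pass it on 3 cores of speed 4, where each density drops to 0.25.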
Partitioned DM (P-DM) Schedulability
A sufficient condition for FBB-FFD scheduling on m unit-speed cores
    FBB-FFD (Fisher Baruah Baker – First-Fit Decreasing) is a well-known P-DM scheduler [ECRTS ’06]
A necessary condition for any scheduler
    Load: max cumulative execution requirement of tasks divided by time length
If the original task set is schedulable by some scheduler on m unit-speed cores, the decomposed tasks are FBB-FFD schedulable on 5-speed cores
    Using load and density bounds for decomposed tasks
Conclusion
Multi-core processors provide opportunities to schedule computation-intensive tasks in real time
    Real-time systems need to exploit intra-task parallelism
We have addressed real-time scheduling for the generalized synchronous parallel task model
    Different segments may have different numbers of threads
    Each segment can have an arbitrary number of threads
We have proposed a task decomposition that achieves
    A processor-speed augmentation bound of 4 for Global EDF
    A processor-speed augmentation bound of 5 for Partitioned DM