ajal-jose
Multitasking and Process Management in Embedded Systems
Real-Time Scheduling Algorithms
Differences Between Multithreading and Multitasking
Processing Elements Architecture
Flynn’s Classification Of Computer Architectures
• In 1966, Michael Flynn proposed a classification of computer architectures based on the number of instruction streams and data streams (Flynn's taxonomy).
• Flynn uses the stream concept to describe a machine's structure.
• A stream is simply a sequence of items (data or instructions).
Simple classification by Flynn (by number of instruction and data streams):
• SISD - conventional
• SIMD - data parallel, vector computing
• MISD - systolic arrays
• MIMD - very general, multiple approaches
Current focus is on the MIMD model, using general-purpose processors (no shared memory).
Processing Elements
Computer Architecture Classifications
Processor organizations:
• Single Instruction, Single Data Stream (SISD): uniprocessor
• Single Instruction, Multiple Data Stream (SIMD): vector processor, array processor
• Multiple Instruction, Single Data Stream (MISD)
• Multiple Instruction, Multiple Data Stream (MIMD): shared memory (tightly coupled), multicomputer (loosely coupled)
SISD : A Conventional Computer
Speed is limited by the rate at which the computer can transfer information internally.
[Figure: a single instruction stream drives one processor, which transforms a data input stream into a data output stream]
Ex: PC, Macintosh, workstations
SISD
SISD (Single-Instruction stream, Single-Data stream)
SISD corresponds to the traditional mono-processor (von Neumann computer): a single data stream is processed by one instruction stream. Equivalently, it is a single-processor (uniprocessor) computer in which a single stream of instructions is generated from the program.
SISD
where CU= Control Unit, PE= Processing Element, M= Memory
SIMD (Single-Instruction stream, Multiple-Data streams)
Each instruction is executed on a different set of data by different processors, i.e., multiple processing units of the same type operate on multiple data streams.
This group is dedicated to array-processing machines.
Sometimes vector processors can also be seen as part of this group.
SIMD
where CU= Control Unit, PE= Processing Element, M= Memory
MISD (Multiple-Instruction streams, Single-Data stream)
Each processor executes a different sequence of instructions: multiple processing units operate on one single data stream.
In practice, this kind of organization has never been used.
MISD
where CU= Control Unit, PE= Processing Element, M= Memory
MIMD (Multiple-Instruction streams, Multiple-Data streams)
Each processor has a separate program, and an instruction stream is generated from each program; each instruction operates on different data.
This machine type forms the group of traditional multiprocessors: several processing units operate on multiple data streams.
MIMD Diagram
The MISD Architecture
More of an intellectual exercise than a practical configuration: few were built, and none is commercially available.
[Figure: three processors A, B, and C, each driven by its own instruction stream (A, B, C), all operating on a single data input stream to produce the data output stream]
SIMD Architecture
Ex: Cray vector-processing machines, Thinking Machines CM*
Ci <= Ai * Bi
[Figure: one instruction stream drives processors A, B, and C in lockstep; each processor reads its own data input stream (A, B, C) and produces its own data output stream (A, B, C)]
MIMD Architecture
Unlike SISD and MISD machines, an MIMD computer works asynchronously:
• Shared-memory (tightly coupled) MIMD
• Distributed-memory (loosely coupled) MIMD
[Figure: processors A, B, and C, each with its own instruction stream (A, B, C) and its own data input and output streams (A, B, C)]
Shared-Memory MIMD Machine
Communication: a source PE writes data to global memory and the destination PE retrieves it.
Easy to build; conventional SISD operating systems can easily be ported.
Limitation: reliability and expandability - a memory component or any processor failure affects the whole system, and increasing the number of processors leads to memory contention.
Ex.: Silicon Graphics supercomputers
[Figure: processors A, B, and C connected through a memory bus to a global memory system]
Distributed-Memory MIMD
Communication: interprocess communication (IPC) over a high-speed network; the network can be configured as a tree, mesh, cube, etc.
Unlike shared-memory MIMD, it is easily/readily expandable and highly reliable (a CPU failure does not affect the whole system).
[Figure: processors A, B, and C, each with its own memory bus and local memory system (A, B, C), connected by IPC channels]
Single Core
Figure 1. Single-core systems schedule tasks on one CPU to multitask
• Applications that take advantage of multithreading have numerous benefits, including the following:
1. More efficient CPU use
2. Better system reliability
3. Improved performance on multiprocessor computers
Multicore
Figure 2. Dual-core systems enable multitasking operating systems to execute two tasks simultaneously
• The OS executes multiple applications more efficiently by splitting the different applications, or processes, between the separate CPU cores. The computer can spread the work - each core is managing and switching through half as many applications as before - and deliver better overall throughput and performance. In effect, the applications are running in parallel.
3. Multithreading
Threads
• Threads are lightweight processes, as the overhead of switching between threads is low
• They can be easily spawned
• The Java Virtual Machine spawns a thread, called the main thread, when your program is run
Multi-Processor Computing System
[Figure: computing elements and programming paradigms - applications use a threads interface over the operating system (microkernel), which schedules threads of processes onto hardware processors (P)]
Why do we need threads?
• To enhance parallel processing
• To increase responsiveness to the user
• To utilize the idle time of the CPU
• To prioritize work by importance
Example
• Consider a simple web server
• The web server listens for requests and serves them
• If the web server were not multithreaded, requests would be processed in a queue, increasing the response time; a bad request could even hang the server
• Implemented in a multithreaded environment, the web server can serve multiple requests simultaneously, improving response time
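The multithreaded request handling described above can be sketched with a thread pool, so one slow request does not block the others. This is a hypothetical illustration (the class name TinyServer and the "served:" responses are invented), not a real server:

```java
import java.util.*;
import java.util.concurrent.*;

// Sketch: each "request" is handled on a worker thread from a pool,
// so requests are served concurrently rather than queued one by one.
public class TinyServer {
    private final ExecutorService pool = Executors.newFixedThreadPool(4);

    // Submit every request to the pool, then collect the responses.
    // Results come back in request order even though work overlaps.
    public List<String> handleAll(List<String> requests) {
        List<Future<String>> futures = new ArrayList<>();
        for (String r : requests) {
            futures.add(pool.submit(() -> "served: " + r));
        }
        List<String> out = new ArrayList<>();
        for (Future<String> f : futures) {
            try { out.add(f.get()); }
            catch (Exception e) { out.add("error"); }
        }
        return out;
    }

    public void shutdown() { pool.shutdown(); }
}
```

A real server would accept sockets instead of strings, but the structure - dispatch each request to its own thread - is the same.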
Synchronization
• Synchronization prevents data corruption
• Synchronization allows only one thread at a time to perform an operation on an object
• If multiple threads require access to an object, synchronization helps maintain consistency
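A minimal sketch of this one-thread-at-a-time guarantee (the class Counter is invented for illustration): two threads increment a shared counter, and because the methods are synchronized on the object's intrinsic lock, no updates are lost.

```java
// Hypothetical shared counter: without synchronized, two threads could
// interleave the read-modify-write of count++ and lose updates.
public class Counter {
    private int count = 0;

    // Only one thread at a time may run this on a given Counter instance.
    public synchronized void increment() { count++; }

    public synchronized int get() { return count; }

    // Two threads each add n increments; returns the final total.
    public static int raceTest(int n) {
        Counter c = new Counter();
        Runnable work = () -> { for (int i = 0; i < n; i++) c.increment(); };
        Thread t1 = new Thread(work), t2 = new Thread(work);
        t1.start(); t2.start();
        try { t1.join(); t2.join(); } catch (InterruptedException e) { }
        return c.get();
    }
}
```

Removing the synchronized keyword would make the final total nondeterministic and usually less than 2n.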
Threads Concept
Multiple threads on multiple CPUs
Multiple threads sharing a single CPU
Creating Tasks and Threads
// Custom task class
public class TaskClass implements Runnable {
    ...
    public TaskClass(...) {
        ...
    }

    // Implement the run method in Runnable
    public void run() {
        // Tell system how to run custom thread
        ...
    }
    ...
}
// Client class
public class Client {
    ...
    public void someMethod() {
        ...
        // Create an instance of TaskClass
        TaskClass task = new TaskClass(...);
        // Create a thread
        Thread thread = new Thread(task);
        // Start a thread
        thread.start();
        ...
    }
    ...
}
[UML: TaskClass implements the java.lang.Runnable interface]
Figure 3. Dual-core system enables multithreading
Multithreading - Uniprocessors
Concurrency vs. Parallelism
Concurrency: the number of simultaneous execution units is greater than the number of CPUs.
[Figure: processes P1, P2, and P3 time-share a single CPU]
Multithreading - Multiprocessors
Concurrency vs. Parallelism
Parallelism: the number of executing processes equals the number of CPUs.
[Figure: processes P1, P2, and P3 each run on their own CPU over time]
Multithreaded Compiler
[Figure: the preprocessor thread feeds the compiler thread, turning source code into object code]
Thread Programming models
1. The boss/worker model
2. The peer model
3. A thread pipeline
The scheduling algorithm is one of the most important parts of an embedded operating system; its performance influences the performance of the whole system.
The boss/worker model
[Figure: a boss thread in main() receives the input stream and dispatches worker threads taskX, taskY, and taskZ, which use program resources: files, databases, disks, and special devices]
The peer model
[Figure: peer threads taskX, taskY, and taskZ each receive their own (static) input and use program resources: files, databases, disks, and special devices]
A thread pipeline
[Figure: the input stream passes through filter threads in stages 1, 2, and 3; each stage uses program resources: files, databases, disks, and special devices]
Thread State Diagram
A thread can be in one of five states: New, Ready, Running, Blocked, or Finished.
Cooperation Among Threads
The conditions can be used to facilitate communications among threads. A thread can specify what to do under a certain condition. Conditions are objects created by invoking the newCondition() method on a Lock object. Once a condition is created, you can use its await(), signal(), and signalAll() methods for thread communications. The await() method causes the current thread to wait until the condition is signaled. The signal() method wakes up one waiting thread, and the signalAll() method wakes all waiting threads.
«interface» java.util.concurrent.Condition
+await(): void - causes the current thread to wait until the condition is signaled
+signal(): void - wakes up one waiting thread
+signalAll(): void - wakes up all waiting threads
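The await()/signal() protocol above can be sketched with a hypothetical one-slot buffer (the class name Slot is invented): a producer waits until the slot is empty before putting, a consumer waits until it is full before taking, and each side signals the other.

```java
import java.util.concurrent.locks.*;

// One-slot buffer sketch using Lock and Condition as described above.
public class Slot {
    private final Lock lock = new ReentrantLock();
    private final Condition notEmpty = lock.newCondition();
    private final Condition notFull = lock.newCondition();
    private Integer value = null;

    public void put(int v) throws InterruptedException {
        lock.lock();
        try {
            while (value != null) notFull.await();  // wait until the slot is free
            value = v;
            notEmpty.signal();                      // wake one waiting consumer
        } finally { lock.unlock(); }
    }

    public int take() throws InterruptedException {
        lock.lock();
        try {
            while (value == null) notEmpty.await(); // wait until the slot is filled
            int v = value;
            value = null;
            notFull.signal();                       // wake one waiting producer
            return v;
        } finally { lock.unlock(); }
    }

    // Producer thread fills the slot; the calling thread takes the value.
    public static int demo() {
        Slot s = new Slot();
        new Thread(() -> {
            try { s.put(42); } catch (InterruptedException e) { }
        }).start();
        try { return s.take(); } catch (InterruptedException e) { return -1; }
    }
}
```

Note the while loops around await(): a woken thread must re-check its condition, since signals only hint that the state may have changed.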
• In many applications, you make synchronous calls to resources, such as instruments. These instrument calls often take a long time to complete. In a single-threaded application, a synchronous call effectively blocks, or prevents, any other task within the application from executing until the operation completes. Multithreading prevents this blocking.
• While the synchronous call runs on one thread, other parts of the program that do not depend on this call run on different threads. Execution of the application progresses instead of stalling until the synchronous call completes.
• In this way, a multithreaded application maximizes the efficiency of the CPU, because the CPU does not idle as long as any thread of the application is ready to run.
4. Multithreading with LabVIEW
Semaphores & Deadlock
Semaphores (Optional)
Semaphores can be used to restrict the number of threads that access a shared resource. Before accessing the resource, a thread must acquire a permit from the semaphore. After finishing with the resource, the thread must return the permit back to the semaphore, as shown:
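The acquire/release protocol just described can be sketched with java.util.concurrent.Semaphore. The class Limited and its peak-concurrency bookkeeping are invented for illustration: six threads contend for a resource, but the two permits ensure no more than two are ever inside at once.

```java
import java.util.concurrent.Semaphore;

// Sketch: at most 2 threads may use the shared resource at a time.
public class Limited {
    private static final Semaphore permits = new Semaphore(2);
    private static int active = 0, peak = 0;

    static void useResource() {
        permits.acquireUninterruptibly();   // block until a permit is free
        synchronized (Limited.class) { active++; peak = Math.max(peak, active); }
        try { Thread.sleep(5); }            // pretend to work with the resource
        catch (InterruptedException e) { }
        synchronized (Limited.class) { active--; }
        permits.release();                  // return the permit to the semaphore
    }

    // Start the given number of threads and report the peak concurrency seen.
    public static int run(int threads) {
        Thread[] ts = new Thread[threads];
        for (int i = 0; i < threads; i++) ts[i] = new Thread(Limited::useResource);
        for (Thread t : ts) t.start();
        for (Thread t : ts) {
            try { t.join(); } catch (InterruptedException e) { }
        }
        return peak;
    }
}
```

However many threads contend, the reported peak never exceeds the number of permits.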
Thread Priority
• Each thread inherits its default priority from the thread that spawned it. The priority of the main thread is Thread.NORM_PRIORITY. You can reset the priority using setPriority(int priority).
• Some constants for priorities include Thread.MIN_PRIORITY : 1
Thread.MAX_PRIORITY : 10
Thread.NORM_PRIORITY : 5
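A minimal sketch of these priority APIs (the class PriorityDemo is invented): a new thread starts with the inherited priority, and setPriority() changes it.

```java
// Sketch of the priority constants and setPriority described above.
public class PriorityDemo {
    // Returns { priority at creation, priority after setPriority }.
    public static int[] demo() {
        Thread t = new Thread(() -> { });
        int before = t.getPriority();          // inherited from the spawning thread
        t.setPriority(Thread.MAX_PRIORITY);    // raise it to 10
        return new int[] { before, t.getPriority() };
    }
}
```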
Deadlock
Sometimes two or more threads need to acquire the locks on several shared objects. This can cause deadlock, in which each thread holds the lock on one of the objects and is waiting for the lock on the other. Consider the scenario with two threads and two objects, as shown: Thread 1 acquired a lock on object1 and Thread 2 acquired a lock on object2. Now Thread 1 is waiting for the lock on object2 and Thread 2 for the lock on object1. Each thread waits for the other to release the lock it needs, so neither can continue to run.
synchronized (object1) {
    // do something here
    synchronized (object2) {
        // do something here
    }
}
Thread 1
synchronized (object2) {
    // do something here
    synchronized (object1) {
        // do something here
    }
}
Thread 2
[Figure: over steps 1-6, Thread 1 waits for Thread 2 to release the lock on object2 while Thread 2 waits for Thread 1 to release the lock on object1]
Deadlocks
Thread 1:          Thread 2:
    lock( M1 );        lock( M2 );
    lock( M2 );        lock( M1 );

Thread 1 is waiting for the resource (M2) locked by Thread 2, and Thread 2 is waiting for the resource (M1) locked by Thread 1.
Preventing Deadlock
Deadlock can easily be avoided by using a simple technique known as resource ordering. With this technique, you impose an order on all the objects whose locks must be acquired and ensure that every thread acquires the locks in that order. For the example, suppose the objects are ordered as object1, object2. Using the resource-ordering technique, Thread 2 must acquire a lock on object1 first and then on object2. Once Thread 1 has acquired a lock on object1, Thread 2 has to wait for the lock on object1, so Thread 1 can acquire the lock on object2 and no deadlock occurs.
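A sketch of resource ordering in Java, assuming two hypothetical lock objects a and b that every thread acquires in the same fixed order. Because no thread ever holds b while waiting for a, the circular wait from the earlier example cannot arise.

```java
// Resource-ordering sketch: both threads lock a, then b, in that order.
public class Ordered {
    private static final Object a = new Object(), b = new Object();
    private static int done = 0;

    static void task() {
        synchronized (a) {          // always acquire a first...
            synchronized (b) {      // ...then b, so no circular wait is possible
                synchronized (Ordered.class) { done++; }
            }
        }
    }

    // Run two contending threads; both complete, so no deadlock occurred.
    public static int run() {
        Thread t1 = new Thread(Ordered::task), t2 = new Thread(Ordered::task);
        t1.start(); t2.start();
        try { t1.join(); t2.join(); } catch (InterruptedException e) { }
        return done;
    }
}
```

If one thread instead locked b before a, the two threads could block each other forever, exactly as in the object1/object2 scenario above.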
Process scheduling in embedded systems
Process management - three key requirements:
1. The timing behavior of the OS must be predictable: for all services of the OS there must be an upper bound on the execution time.
2. The OS must manage the timing and scheduling: the OS possibly has to be aware of task deadlines (unless scheduling is done off-line).
3. The OS must be fast.
Scheduling States: Simplified View of Thread State Transitions
[Figure: threads move between the RUNNABLE, ACTIVE, SLEEPING, and STOPPED states via the transitions Preempt, Continue, Sleep, Wakeup, and Stop]
Hard and Soft Real Time Systems(Operational Definition)
• Hard Real-Time System
- Validation, by provably correct procedures or extensive simulation, that the system always meets the timing constraints
• Soft Real-Time System
- Demonstration that jobs meet some statistical constraints suffices
• Example - multimedia system: 25 frames per second on average
Real-Time Systems (continued)
• Soft RTS: meets timing constraints most of the time; it is not necessary that every time constraint be met, and some deadline misses are tolerated.
• Hard RTS: meets all time constraints exactly; every resource management system must work in the correct order to meet time constraints, and no deadline miss is allowed.
Role of an OS in Real Time Systems
1. Standalone applications
- Often no OS involved
- Microcontroller-based embedded systems
2. Some real-time applications are huge and complex
- Multiple threads
- Complicated synchronization requirements
- Filesystem / network / windowing support
- OS primitives reduce the software design time
Task Categories
• Invocation (triggered operations)
- Periodic (time-triggered)
- Aperiodic (event-triggered)
• Creation
- Static
- Dynamic
• Multi-tasking system
- Preemptive: a higher-priority process can take control of the processor from a lower-priority one
- Non-preemptive: each task can control the CPU for as long as it needs it
Features of RTOS’s
1. Scheduling.
2. Resource Allocation.
3. Interrupt Handling.
4. Other issues like kernel size.
1. Scheduling in RTOS
• More information about the tasks is known:
- Number of tasks
- Resource requirements
- Release time
- Execution time
- Deadlines
• Being a more deterministic system, better scheduling algorithms can be devised.
Why do we need scheduling?
• Each computation (task) we want to execute needs resources
• Resources: processor, memory segments, communication, I/O devices, etc.
• The computations must be executed in a particular order (relative to each other and/or relative to time)
• The possible ordering is either completely or statistically known a priori
• Scheduling: assignment of the processor to computations
• Allocation: assignment of other resources to computations
Real-Time Scheduling Algorithms
1. Fixed (static) priority algorithms
- Rate Monotonic scheduling
- Deadline Monotonic scheduling
2. Dynamic priority algorithms
- Earliest Deadline First
- Least Laxity First
3. Hybrid algorithms
- Maximum Urgency First
Scheduling Algorithm: Static vs. Dynamic
Static scheduling:
• All scheduling decisions are made at compile time
• The temporal task structure is fixed
• Precedence and mutual exclusion are satisfied by the schedule (implicit synchronization)
• One solution is sufficient
• Any solution is a sufficient schedulability test
Benefits: simplicity
Scheduling Algorithm: Static vs. Dynamic
Dynamic scheduling:
• All scheduling decisions are made at run time, based on the set of ready tasks
• Mutual exclusion and synchronization are enforced by explicit synchronization constructs
• Benefits: flexibility; only actually used resources are claimed
• Disadvantages: guarantees are difficult to support; computational resources are required for scheduling
Scheduling Algorithm: Preemptive vs. Non-preemptive
• Preemptive scheduling:
- Event-driven: each event causes interruption of running tasks, and the choice of running tasks is reconsidered after each interruption
- Benefits: can minimize response time to events
- Disadvantages: requires considerable computational resources for scheduling
• Non-preemptive scheduling:
- Tasks remain active until completion; scheduling decisions are made only after task completion
- Benefits: reasonable when task execution times are comparable to task-switching times; fewer computational resources needed for scheduling
- Disadvantages: can lead to starvation (missed deadlines), especially for real-time or high-priority tasks
1.A Rate Monotonic scheduling
• Priority assignment is based on the rates (periods) of tasks
• A higher-rate (shorter-period) task is assigned higher priority
• Schedulable utilization bound = 0.693 (Liu and Layland):
  U = sum_i (Ci / Ti) <= n(2^(1/n) - 1)
  where Ci is the computation time and Ti is the release period of task i
• If U < 0.693, schedulability is guaranteed for any number of tasks, since n(2^(1/n) - 1) approaches ln 2 ≈ 0.693 as n grows
• Tasks may still be schedulable even if U > 0.693
RM example
Process Execution Time Period
P1 1 8
P2 2 5
P3 2 10
The utilization will be:
U = 1/8 + 2/5 + 2/10 = 0.125 + 0.4 + 0.2 = 0.725
The theoretical limit for 3 processes, under which we can conclude that the system is schedulable, is:
3(2^(1/3) - 1) ≈ 0.7798
Since 0.725 < 0.7798, the system is schedulable!
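The example's arithmetic can be checked with a small sketch (the class RmCheck is invented): one method sums Ci/Ti, the other evaluates the Liu-Layland bound n(2^(1/n) - 1).

```java
// Worked check of the RM example: utilization vs. the Liu-Layland bound.
public class RmCheck {
    // U = sum over i of (Ci / Ti)
    public static double utilization(double[] c, double[] t) {
        double u = 0;
        for (int i = 0; i < c.length; i++) u += c[i] / t[i];
        return u;
    }

    // n(2^(1/n) - 1): the RM schedulability bound for n tasks
    public static double bound(int n) {
        return n * (Math.pow(2, 1.0 / n) - 1);
    }
}
```

For the table above, utilization is 0.725 and the 3-task bound is about 0.7798, confirming schedulability.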
1.B Deadline Monotonic scheduling
• Priority assignment is based on the relative deadlines of tasks
• The shorter the relative deadline, the higher the priority
2.A Earliest Deadline First (EDF)
• Dynamic priority scheduling
• Priorities are assigned according to deadlines:
- Earlier deadline, higher priority
- Later deadline, lower priority
• The first, and the most widely and effectively used, dynamic priority-driven scheduling algorithm
• Effective for both preemptive and non-preemptive scheduling
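An EDF ready queue can be sketched as a priority queue ordered by absolute deadline, so the scheduler always dispatches the head. The class and task names here are invented for illustration:

```java
import java.util.Comparator;
import java.util.PriorityQueue;

// EDF sketch: the ready queue is ordered by absolute deadline, so
// next() always returns the task the scheduler should run next.
public class EdfQueue {
    public static class Task {
        public final String name;
        public final long deadline;   // absolute deadline (time units)
        public Task(String name, long deadline) {
            this.name = name;
            this.deadline = deadline;
        }
    }

    private final PriorityQueue<Task> ready =
        new PriorityQueue<>(Comparator.comparingLong((Task t) -> t.deadline));

    public void release(Task t) { ready.add(t); }     // task becomes ready
    public Task next() { return ready.poll(); }       // earliest deadline first
}
```

In a preemptive EDF scheduler, a newly released task with an earlier deadline than the running one would also trigger a preemption; this sketch only models the queue ordering.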
Two Periodic Tasks
• Execution profile of two periodic tasks:
- Process A: arrives at 0, 20, 40, ...; execution time 10; must end by 20, 40, 60, ...
- Process B: arrives at 0, 50, 100, ...; execution time 25; must end by 50, 100, 150, ...
• Question: is there enough time for the execution of the two periodic tasks?
Five Periodic Tasks
• Execution profile of five periodic tasks:

Process | Arrival Time | Execution Time | Starting Deadline
A       | 10           | 20             | 110
B       | 20           | 20             | 20
C       | 40           | 20             | 50
D       | 50           | 20             | 90
E       | 60           | 20             | 70
2.B Least Laxity First (LLF)
• Dynamic preemptive scheduling with dynamic priorities
• Laxity: the difference between the time until a task's completion deadline and its remaining processing-time requirement
• A laxity is assigned to each task in the system, and minimum-laxity tasks are executed first
• Larger overhead than EDF, due to the higher number of context switches caused by laxity changes at run time
- Less studied than EDF for this reason
Least Laxity First (cont.)
• LLF considers the execution time of a task, which EDF does not.
• LLF assigns higher priority to the task with the least laxity.
• A task with zero laxity must be scheduled right away and executed without preemption, or it will fail to meet its deadline.
• Negative laxity indicates that the task will miss its deadline no matter when it is picked up for execution.
(The dictionary definition of laxity is the quality or condition of being loose; here it is the slack a task has before its deadline.)
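The laxity rule above reduces to one formula: laxity = deadline - current time - remaining execution time. A minimal sketch (the class Laxity is invented):

```java
// Laxity sketch: slack a task has before it must start running to finish
// on time. Negative laxity means the deadline can no longer be met.
public class Laxity {
    public static long laxity(long deadline, long now, long remaining) {
        return deadline - now - remaining;
    }
}
```

An LLF scheduler would evaluate this for every ready task at each scheduling point and dispatch the task with the smallest value.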
Least Laxity First Example
Maximum Urgency First Algorithm
• This algorithm is a combination of fixed- and dynamic-priority scheduling, also called mixed-priority scheduling.
• With this algorithm, each task is given an urgency, defined as a combination of two fixed priorities (criticality and user priority) and a dynamic priority that is inversely proportional to the laxity.
• The MUF algorithm assigns priorities in two phases:
- Phase One concerns the assignment of static priorities to tasks
- Phase Two deals with the run-time behavior of the MUF scheduler
3.A Maximum Urgency First Algorithm: Phase 1
The first phase consists of these steps:
1) Sort the tasks from the shortest period to the longest period, then define the critical set as the first N tasks such that the total CPU load factor does not exceed 100%. These tasks are guaranteed not to fail even during a transient overload.
2) Assign high criticality to all tasks in the critical set; the remaining tasks are considered to have low criticality.
3) Assign every task in the system an optional unique user priority.
Maximum Urgency First Algorithm: Phase 2
In the second phase, the MUF scheduler follows an algorithm to select a task for execution. This algorithm is executed whenever a new task arrives in the ready queue. The algorithm is as follows:
1) If there is only one highly critical task, pick it and execute it.
2) If there is more than one highly critical task, select the one with the highest dynamic priority; here, the task with the least laxity is considered the one with the highest priority.
3) If there is more than one task with the same laxity, select the one with the highest user priority.
Scheduling Algorithms in RTOS
1. Clock Driven Scheduling
2. Weighted Round Robin Scheduling
3. Priority Scheduling (Greedy / List / Event Driven)
Scheduling Algorithms in RTOS (contd)
• Clock-Driven
- All parameters about jobs (release time, execution time, deadline) are known in advance
- The schedule can be computed offline or at some regular time instances
- Minimal runtime overhead
- Not suitable for many applications
Scheduling Algorithms in RTOS (contd)
• Weighted Round Robin
- Jobs are scheduled in FIFO manner
- The time quantum given to a job is proportional to its weight
- Example use: high-speed switching networks (QoS guarantees)
- Not suitable for precedence-constrained jobs: if Job A can run only after Job B, there is no point in giving a time quantum to Job A before Job B completes
Scheduling Algorithms in RTOS (contd)
• Priority Scheduling (Greedy / List / Event-Driven)
- The processor is never left idle when there are ready tasks
- The processor is allocated to processes according to priorities
- Priorities may be static (assigned at design time) or dynamic (assigned at runtime)
Priority Scheduling
• Earliest Deadline First (EDF)
- The process with the earliest deadline is given the highest priority
• Least Slack Time First (LSF)
- slack = relative deadline - execution time left
• Rate Monotonic Scheduling (RMS)
- For periodic tasks
- A task's priority is inversely proportional to its period