A practical guide to writing multithreaded code - Mark Bartosik February 2001
Guide to multithreaded code
“All I need to know about writing multithreaded code is that I am not smart enough to do it right.” - Terry Way
Intended audience:
• C/C++ programmers not already expert in threading.
• Visual Basic and Java programmers - it’s not all done for you!
• Managers - see the issues that must be addressed.
•Part I Overview of the issues - Managers and developers
•Part II Coding issues - Developers
Part I
• An overview for managers and developers
• Performance issues are marked with a red arrow.
• Most of the other issues relate to reliability.
• Best practice advice is marked with a blue tick.
Part II
• Coding issues, mostly for developers
– Support within COM (limited amount of COM+)
– Support within Visual Basic
– Support within Java
– How to use C++ effectively
• Avoiding reliability problems
• Tackling performance problems
Common myths
• Threads will make the system run faster.
• We don’t need to test on SMP machines.
• The chances of two threads doing X are so small.
• Only C/C++ programmers need concern themselves.
• We can bother about that later.
The right tools
• SMP machines
• Appropriate libraries:
– Dinkumware STL www.dinkumware.com
– Smartheap www.microquill.com
– Hoard www.cs.utexas.edu/users/emery/hoard/
– Threads.h++ www.roguewave.com
• Configure msvcrt optimally www.wdj.com/archive/1105/feature.html
– MSVCRT_HEAP_SELECT
Why use SMP?
• You may not be targeting SMP machines, but you still need to test with SMP machines.
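[Figure: timeline of three threads. Thread A runs the setup code and starts threads B and C. About one 25ms quantum apart, one worker thread executes strcpy(global, src) inside its loop while the other executes n = strlen(global), marked "Bang!".]

thread A:
char * global = new char[1000];
global[0] = '\0';
_beginthread(thread_B);
_beginthread(thread_C);
WaitForMultipleObjects(...);

thread B / thread C:
for (int i = 0; i != x; ++i)
{
    Waitxxxx();
    do_stuff();
    strcpy(global, src);
    ...
    n = strlen(global); // Bang!

25ms is the standard "quantum" size on NT/2000 (it can be changed).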
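One way to close the race shown in this example is to make the writer and the reader take the same lock. The sketch below is an illustration only: it uses std::mutex and std::thread from C++11 (which postdate these slides), and the names shared_buf, write_string, read_length and run_demo are hypothetical stand-ins, not part of the original code.

```cpp
#include <cassert>
#include <cstring>
#include <mutex>
#include <thread>

// Hypothetical stand-in for the slide's `global` buffer.
static char shared_buf[1000] = "";
static std::mutex buf_mutex;            // protects shared_buf

// Writer: replaces the whole string while holding the lock.
void write_string(const char* src)
{
    std::lock_guard<std::mutex> guard(buf_mutex);
    std::strcpy(shared_buf, src);       // src must fit in shared_buf
}

// Reader: measures the string while holding the same lock.
std::size_t read_length()
{
    std::lock_guard<std::mutex> guard(buf_mutex);
    return std::strlen(shared_buf);
}

// Runs one writer and one reader concurrently. Returns true if the reader
// only ever saw a complete string (length 0, 5 or 11).
bool run_demo()
{
    bool ok = true;
    std::thread writer([] {
        for (int i = 0; i != 10000; ++i)
            write_string((i % 2) ? "hello" : "hello world");
    });
    std::thread reader([&ok] {
        for (int i = 0; i != 10000; ++i) {
            const std::size_t n = read_length();
            if (n != 0 && n != 5 && n != 11) ok = false;
        }
    });
    writer.join();
    reader.join();
    return ok;
}
```

Because both functions hold buf_mutex for the whole copy or the whole scan, the reader can never observe a half-written string, whether the threads share one CPU or run on two.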
Why test on SMP?
• There are categories of bugs that almost never occur on a uniprocessor machine, but are much more likely to occur on an SMP machine. Your choice is:
– Test on a uniprocessor machine with a very varied data set, and on all CPU speeds that you will ever run on (or vary the quantum size from 1 to 100ms).
– Let your most important customer find the bugs, because he has the most unusual data set.
– Test on an SMP machine.
• Which costs most now?
You’re exaggerating right?
• Yes and no.
• In 25ms a huge number of instructions can be performed, so the chance of a thread swap occurring in the critical place is small.
• This is a demonstration of a bug that is almost impossible to test for on a uniprocessor machine. Other, more common classes of bug are easier to test for without SMP, but still much easier to test for on an SMP machine. With SMP you reduce the number of test case iterations needed to find a bug.
SMP benefits
• Reduced testing and/or more confidence.
• Scalability (for servers).
• More horsepower. You can buy a 2 x 1GHz machine for less than a 1.4GHz one-way machine.
• Motivation - put a smile on a developer’s face and upgrade his machine.
SMP downsides
• Not everybody’s software runs on SMP.
– Hey, this just proves my point, SMP finds bugs for you!
• Badly written code can actually run slower on SMP.
Why use threads?
• Wait on a blocking call without hanging the UI thread.
• Smooth scheduling.
• Apparent performance.
• Division of truly independent and parallel tasks (e.g. background tasks, or multiple clients).
• Avoid polling.
Polling is almost never right. Poll too often and you burn CPU power; poll too slowly and you appear unresponsive. You can also completely miss events.
Threads have an overhead
• Every thread that you create has an overhead:
– it needs kernel resources to manage it
– it needs a stack (memory, with a typical reserved size of 1MB)
– it needs scheduling
– it needs creating
– it needs to be cleaned up after it quits
– OS handles must be duplicated
– it can add complexity to the program
– it may compete for resources
– you may need to swap between threads
• Only create a thread if there is a real need.
KISS
• Keep it simple stupid!
• If you don’t have a justified requirement for multiple threads, then you are causing unnecessary complication.
• This talk is non-trivial and so is threading.
Advice
• Illustrate the object relationships, clearly showing the tiers.
• Document the threading model of your design.
• Illustrate what objects reside in what threads or apartments.
Terminology
• SMP
– Symmetrical multiprocessing. More than one CPU in a machine, each with equal access to system resources.
[Figure: CPU 0 and CPU 1, each with an on-board cache; the on-board caches are kept in sync; external cache and main memory sit below the CPUs.]
Terminology
• Serialize
– To “order sequentially”. Often we need to serialize requests (function calls). This means that if two threads call a function foo(), we must ensure that thread A has fully executed foo() before thread B is permitted to start executing foo().
Terminology
• Thread safe lock (abbreviated to lock)
– A resource that only one thread can hold at any one time.
thread_safe_lock_t my_lock; // Protects data below
std::string my_string;
std::vector<int> my_numbers;
void foo()
{
my_lock.lock();
my_numbers.push_back(my_string.size());
my_lock.unlock();
}
How locks work
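[Figure: thread A and thread B each pass through foo(); the numbers 1 to 6 mark the steps listed on the next slide, and thread B waits at my_lock.lock() until thread A unlocks.]
void foo()
{
my_lock.lock();
my_numbers.push_back(my_string.size());
my_lock.unlock();
}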
How locks work
1) Thread A enters the function and acquires the lock.
2) Thread A gets on with executing the function.
3) Thread B enters the function but cannot acquire the lock, so it waits until the lock is available.
4) Thread A unlocks the lock and leaves the function.
5) Thread B is now able to acquire the lock, and execute the function.
6) Thread B unlocks the lock and leaves the function.
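The six steps can be tried out with a small program. This is a minimal sketch that assumes std::mutex (C++11, later than these slides) as a stand-in for the slides' thread_safe_lock_t; run_two_threads is a hypothetical helper added only for the demonstration.

```cpp
#include <cassert>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

std::mutex my_lock;                 // stands in for thread_safe_lock_t; protects data below
std::string my_string = "abc";
std::vector<int> my_numbers;

void foo()
{
    my_lock.lock();                 // steps 1, 3 and 5: acquire the lock, or wait for it
    my_numbers.push_back(static_cast<int>(my_string.size()));   // step 2
    my_lock.unlock();               // steps 4 and 6: let the next thread in
}

// Hypothetical driver: two threads call foo() concurrently.
// Returns the number of elements pushed.
std::size_t run_two_threads(int calls_per_thread)
{
    my_numbers.clear();
    auto worker = [calls_per_thread] {
        for (int i = 0; i != calls_per_thread; ++i) foo();
    };
    std::thread a(worker);
    std::thread b(worker);
    a.join();
    b.join();
    return my_numbers.size();
}
```

With the lock in place every push_back completes before the next one starts, so no element is lost; without it the vector could be corrupted.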
Overview of threading problems
1) Deadlocks
– This is where a thread stops in its tracks waiting for some event to be signaled, or some resource to become free, but that never happens. So the thread waits forever.
2) Lack of atomic calls
– It is possible for the object to be accessed whilst in an inconsistent state, because what is logically a single operation is split across more than one function call.
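As an illustration of problem 2, consider a shared stack. The class below is a hypothetical example (int_stack_t and its members are not from the slides) and assumes C++11 std::mutex. Even though every member function locks, a caller that checks empty() and then pops in a second call can be beaten to the last element by another thread; try_pop() performs the check and the removal as one operation.

```cpp
#include <cassert>
#include <mutex>
#include <vector>

class int_stack_t
{
public:
    void push(int value)
    {
        std::lock_guard<std::mutex> guard(m_mutex);
        m_items.push_back(value);
    }

    // Risky when used as "if (!s.empty()) pop": the answer can be out of
    // date by the time the caller makes its next call.
    bool empty() const
    {
        std::lock_guard<std::mutex> guard(m_mutex);
        return m_items.empty();
    }

    // Atomic alternative: the check and the removal happen under one lock.
    // Returns false, leaving `out` untouched, if the stack was empty.
    bool try_pop(int& out)
    {
        std::lock_guard<std::mutex> guard(m_mutex);
        if (m_items.empty()) return false;
        out = m_items.back();
        m_items.pop_back();
        return true;
    }

private:
    mutable std::mutex m_mutex;     // protects m_items
    std::vector<int> m_items;
};
```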
Overview of threading problems
3) Race conditions
– e.g. A sent event is generated by one thread and an acknowledgement by another thread. It may be possible to receive the acknowledgement before receiving the initial sent event.
4) Data corruption
– e.g. A string is copied before it is completely written. Similar to the SMP example.
Overview of threading problems
5) Indirect freezes
– The UI freezes because it is waiting on a resource held by a background thread that is performing a blocking operation.
6) Lost locks
– This is a special case of deadlock, where a lock has been acquired but the code forgets to release it. The next thread to attempt to acquire the lock will wait forever.
Overview of threading problems
7) Resource contention
– This is where there is a very frequently used resource that only one thread can use at a time. The CPU can spend more time swapping threads than it does doing the real work.
8) Programmed thread context swapping
– Typically caused by a local/remote procedure call (lrpc) being made from thread A to thread B. This is almost as expensive as interprocess communication.
Overview of threading problems
9) Mixed component capabilities
– Some are thread safe and some are not.
10) Mixed environment expectations
– Visual Basic, C++, Java, COM, COM+ and third party solutions may all have differing threading expectations and solutions, creating confusion.
11) Everybody has a solution
– If I had $1 for every CThread class I’ve seen
Tools
• The programming environment
– VB/Java/C++ COM COM+
• Debugger
• Perfmon
• QSlice (quick slice)
• Profiler
Questions ?
End of part I
Environment support
• Windows NT/2000 - far too much to cover, and mostly too low level
• COM
• Visual Basic
• Java
• C++
COM support
• The problem:
– Component A was written in VB and is not thread safe.
– Component B was written in C++ or Java and is thread safe.
– These components cannot safely interact.
• Either the whole system must be written using one scheme or we must devise a way to interconnect them safely.
COM support
How COM solves the problem:
[Figure: a process boundary containing one primary STA and 0 - n further STAs. Key: STA = Single Threaded Apartment, holding 1 thread.]
COM support
How COM solves the problem:
[Figure: a process boundary containing one primary STA, 0 - n further STAs and 0 - 1 MTA. Key: STA = Single Threaded Apartment, holding 1 thread; MTA = Multi-threaded Apartment, holding 1 - n threads.]
Cross apartment calls
Cross apartment calls are very expensive. A large amount of RPC code gets executed, and two thread swaps have to occur. One thread swap to execute the code in object B, and another to get back to A’s thread.
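[Figure: a process with a primary STA, 0 - n further STAs (1 thread each) and 0 - 1 MTA (1 - n threads). Objects A and B live in different apartments and are connected by lrpc.]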
COM - example 1
A dialog in the primary STA creates a line object. The line object creates two points.
The line object is threading model “both”.
The points are threading model “apartment”.
[Figure: the primary STA contains the dialog, the line, and the points p1 and p2.]
COM example 2
The points cannot be instantiated in the MTA so a proxy is used. This is very inefficient.
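[Figure labels: MTA, STA, file saver, line, proxy, p1, p2, lrpc. The line in the MTA reaches the points p1 and p2 in the STA through proxies, each call going via lrpc.]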
Management implications
• In an ideal world:
– all developers would be expert in all languages.
– each component would be written in the most suitable language for the domain.
• In the real world:
– most developers are proficient in just one or two languages.
– resource and scheduling issues force managers to choose the language that the component is written in based on the available skills.
• Result:
– major inefficiencies can be built into a product.
– if you are lucky and you find the performance bottleneck, you may need to completely rewrite the component in another language.
More inefficiencies
[Figure: an MTA and an STA, with triangle and line objects shown in both apartments; calls between the apartments go through lrpc.]
Free threaded marshaler
[Figure: an MTA and an STA, with triangle and line objects shown in both apartments. The line aggregates the free threaded marshaler.]
Free threaded marshaler
[Figure: an MTA and two STAs, with triangle and line objects and their point objects (p). The line aggregates the free threaded marshaler.]
Free threaded marshaler
[Figure: an MTA and two STAs, with triangle and line objects and their point objects (p). The line aggregates the free threaded marshaler. One access is marked "bang!".]
One Solution
• NT 4.0 SP3 introduced the GIT (global interface table). You store a “handle to an object” rather than a direct pointer. Before accessing an object you can give the GIT the handle and ask it for a valid pointer (to a proxy).
• GITPtr - see code handout.
– If anyone wants to use this, I can talk about it separately.
COM support summary
• For all this work COM has only solved a couple of problems:
* It has directly solved problem (9) mixed component capabilities.
* Because it is a standard on Microsoft platforms it indirectly solves problem (10) mixed environment expectations.
• However, it solves these problems at the cost of problem (8), thread context swapping (though not with the thread neutral model).
Enter COM+
• Windows 2000 only.
• Much of COM+ is about transaction processing.
• In addition to a “threading model” components can now have a “synchronization” attribute.
• There is a finer grain of isolation of incompatible code: each apartment can have one or more contexts.
• Thread neutral apartment.
• I haven’t used it.
Thread neutral apartment
• COM+ only.
• Now Microsoft’s recommended threading model for non-UI code. Seriously consider it when targeting Windows 2000.
• Like FTM but without the problems. No need for GITPtr.
COM+ synchronization
• Disabled
– Just like COM
• Not Supported
– Objects make no promises, it is not synchronized. (These ought to be apartment threaded.)
• Supported
– Objects provide their own (custom) synchronization.
• Required
– Requires COM+ to provide the synchronization. Even when it’s not needed.
• Requires New
– ?
Questions?
• Next we’ll see how Visual Basic fits into the COM framework.
Theme
• A recurring theme in writing robust software is to restrict things as much as you can. Please watch out for it.
– Prefer private over public
– Prefer variables in this order:
• automatics, parameters, members, globals
– Prefer const over non-const
– The more you expose, the more there is to go wrong.
VB variables
Sub foo()
dim x as Integer
x = Something
End Sub
• This variable does exactly what you expect. It has the life time of the function call. It is only visible within the function.
• See handout “vb_rules”.
VB variables
Private m_credits_earned as Integer
Sub Grade()
If m_credits_earned > 50 Then
call TopGrade
End If
End Sub
• The code excerpt comes from a VB ‘student’ class.
• This variable does exactly what you expect. The lifetime of m_credits_earned is the same as the lifetime of the student. Each separate student has a different instance of m_credits_earned.
VB variables
Sub Bar()
static is_recursive as boolean
If is_recursive Then
Exit Sub
End If
is_recursive = true
call DoStuff ' beware DoStuff can call Bar
is_recursive = false
End Sub
• This use of static does exactly what you expect and want. Although it is rather different to the C++ meaning with the same syntax. It declares a member variable only visible to the function. Member variables have the lifetime of the form or class, or module they are part of.
VB variables
Private g_total_something as Integer
Sub Foo()
If g_total_something < 10 Then
call DoStuff
g_total_something = g_total_something + 1
End If
End Sub
• The code excerpt comes from a VB module. The behavior is different for a form or class, and anyway you would use m_total_something in a form or class.
• This variable does exactly what you expect and want until the code is run in a different apartment. The code in the new apartment uses a new instance of g_total_something and initializes it to zero.
Visual Basic Support
• All Visual Basic objects use the apartment model. All objects exist in an STA.
• To avoid the problem of concurrent access, global variables are really “per apartment” variables. Since most Visual Basic objects exist in the primary STA, in effect globals are usually per process. However, you should not rely upon this.
Visual Basic Support
• To serve as a simple demonstration we have two simple Visual Basic programs. Code listings are provided.
• How the server (counter) is compiled with regard to threading dramatically affects how the client code works.
• The moral is: beware of module level variables in VB, otherwise called globals. They are thread safe, but they make the code very hard to test in a multithreaded environment and are open to misuse.
Visual Basic globals
[Figure: one STA containing two Class1 objects that share a single g_number.]
Typically both objects will be in the same apartment, the primary STA.
In this case the client is accessing two objects that are both in the same apartment.
Visual Basic globals
• However, depending on the build options and where the objects are obtained from, the following situation can arise:
[Figure: two STAs, each containing one Class1 object and its own g_number.]
In this case the client is accessing two objects in different apartments.
Visual Basic and reentrance
Private m_divisor as Integer
Private Sub Form_Load
m_divisor = 1
End Sub
Public Sub foo()
m_divisor = 0
End Sub
Private Sub Bar()
If m_divisor <> 0 Then
RaiseEvent some_event ' or otherwise yield
MsgBox 100 / m_divisor
End If
End Sub
Visual Basic and reentrance
• When the event is raised there is nothing stopping a client that handles that event from calling foo().
• Even if none of the clients call foo(), we cannot be sure that one of the clients is not in another apartment. If there is a client in another apartment it will be called through lrpc, and will yield. Yielding permits COM to unblock a previously blocked request either foo() or bar().
• The ideal solution is to queue the event and raise it later, but this is a lot of trouble in VB.
Reentrance solutions
Private Sub Bar()
If m_divisor <> 0 Then
dim divisor as Integer
divisor = m_divisor ' copy the member variable
RaiseEvent some_event ' or otherwise yield
MsgBox 100 / divisor
End If
End Sub
or
Private Sub Bar()
If m_divisor <> 0 Then
MsgBox 100 / m_divisor
RaiseEvent some_event ' or otherwise yield
End If
End Sub
Accessing multithreaded objects
Sub Bar()
If x.number <> 0 Then
call ExpensiveOperation()
MsgBox 100 / x.number
End If
End Sub
Just because only one thread at a time can access your object do not assume that other objects cannot be accessed by more than one thread!
VB - what have we learnt?
• Be sure that you understand the meaning of a global variable. VB does not have real globals.
• Global constants are okay.
• Per object variables (member variables) have a clearer meaning, but beware of reentrance.
• When possible raise events at the end of functions.
• Avoid calling across apartments; if you must, prefer to do it at the end of functions.
• Never yield (never call DoEvents).
• The state of other objects can change underneath you.
Questions ?
• How many people want to cover Java support for threads ?Or shall we move on to C++ ?
My knowledge of Java is not deep
Java support
• Java has more powerful and fine grain support for threads than does VB, it is much more like C++.
• Threads can be created, waited upon, suspended, synchronized.
• The keyword synchronized can be applied to blocks of code, member functions, or static class functions.
• All objects can be waited on, signaled, or locked.
• Some standard Java classes are thread safe and others are not. Read the documentation!
Java support
• Per object synchronization
– Typically this is used to protect member variables of an object from being simultaneously accessed by two threads.
public class MyClass {
    public synchronized void foo() { }
}
Conceptually there is an extra member variable in each object which is locked when any synchronized method is entered and unlocked when it is quit.
Java support
• Per class synchronization
– Typically this is used to protect static members of a class from being simultaneously accessed by two threads.
public class MyClass {
    public synchronized static void bar() { }
}
In this case the Java virtual machine grabs a lock on the “class object”. The “class object” is a special Java object that describes the class. There is one class object for each class.
Java support
• Block synchronization
– Typically this is used to provide finer grain protection to variables.
public void foo() {
    synchronized (this) {
        // protected code
    }
}
This is equivalent to synchronizing the whole function; however, we could just synchronize the else part of an if statement, or any block of code.
C++ support
• Standard C++ (C++98) has no built-in support for threads.
• However compiler vendors usually provide some support for threads.
• Since C++ is extensible (via classes) it potentially has the most powerful support.
• Many of the issues here also apply to Java.
C++ process memory
See code handout with same title
[Figure: layout of process memory. Static storage (fixed size) and the heap (grows on demand; holds all COM objects and everything created with 'new') are accessible by all threads. Stacks (one per thread) grow on demand, and there is thread local storage (TLS). Example variables shown: t, a, b, c, g, int_ptr, int_array (1024 elements, 4096 bytes).]
C++ process memory
• C++ has
– automatic data (allocated from stack)
– static data (static data can be scoped per process, file, class, struct or function)
– heap / free store data
– per machine data (vendor specific)
– per thread data (vendor specific)
– potentially we could extend C++ via a class (most likely a smart pointer) to support per apartment data
C++ memory
• Clearly C++ has the most flexible scheme.
• However, since there is no built-in support for threads, we must be careful to protect the data that can be accessed by multiple threads. This means protecting all static and heap based data.
• Actually the lack of built-in support, as we will see later, can be an advantage.
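The difference between static data, which every thread shares and which therefore needs protecting, and per thread data, which does not, can be sketched as follows. The example assumes a C++11 compiler, where thread_local and std::mutex are standard rather than vendor specific as they were when these slides were written; all names in it are hypothetical.

```cpp
#include <cassert>
#include <mutex>
#include <thread>

static int shared_total = 0;            // static storage: one instance per process
static std::mutex shared_total_mutex;   // protects shared_total
thread_local int per_thread_count = 0;  // one instance per thread, no lock needed

void record_event()
{
    ++per_thread_count;                 // private to the calling thread
    std::lock_guard<std::mutex> guard(shared_total_mutex);
    ++shared_total;                     // visible to every thread
}

// Each worker records `events` events and reports its own per thread count.
int run_worker(int events)
{
    for (int i = 0; i != events; ++i) record_event();
    return per_thread_count;
}

// Returns true if each thread saw only its own count and the shared total
// is the sum of both.
bool run_demo()
{
    shared_total = 0;                   // no other threads are running yet
    int count_a = 0, count_b = 0;
    std::thread a([&count_a] { count_a = run_worker(100); });
    std::thread b([&count_b] { count_b = run_worker(200); });
    a.join();
    b.join();
    return count_a == 100 && count_b == 200 && shared_total == 300;
}
```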
The correct way to use locks with non-member data
static critical_section_t my_critsec; // protects data below
static my_data_t my_data1;
static my_data_t my_data2;
void foo()
{
cs_lock_t auto_lock(my_critsec);
my_data1.do_something();
my_data2.do_something();
}
• Here we are protecting some per file static data, my_data1 and my_data2. The cost is very small, and it is very similar in use to Java’s built-in support.
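The slides do not show how cs_lock_t is written, so the following is a hypothetical reconstruction on top of std::mutex (C++11), not the author's class. The point it illustrates is that the destructor releases the lock on every exit path, including an exception, which is what prevents the "lost lock" problem described earlier.

```cpp
#include <cassert>
#include <mutex>
#include <stdexcept>

// Hypothetical reconstruction of the slide's types on top of std::mutex.
typedef std::mutex critical_section_t;

class cs_lock_t
{
public:
    explicit cs_lock_t(critical_section_t& cs) : m_cs(cs) { m_cs.lock(); }
    ~cs_lock_t() { m_cs.unlock(); }     // runs on normal return and on exception
    cs_lock_t(const cs_lock_t&) = delete;
    cs_lock_t& operator=(const cs_lock_t&) = delete;
private:
    critical_section_t& m_cs;
};

static critical_section_t my_critsec;   // protects my_count
static int my_count = 0;

void foo(bool fail)
{
    cs_lock_t auto_lock(my_critsec);
    if (fail) throw std::runtime_error("early exit");   // lock is still released
    ++my_count;
}

// Calls foo() once so that it throws, then once normally.
int run_demo()
{
    try { foo(true); } catch (const std::runtime_error&) {}
    foo(false);     // would hang here if the first call had leaked the lock
    return my_count;
}
```

In code written today, std::lock_guard provides the same behaviour without a hand-written class.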
The correct way to use locks with member data
class my_class_t
{
.
.
mutable critical_section_t m_critsec; // protects data below
my_data_t m_data;
};
// to be able to use const you must declare the critical section member as mutable
void my_class_t::foo() const
{
cs_lock_t auto_lock(m_critsec);
m_data.do_something();
}
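A complete, compilable version of the same pattern might look like this. It is a sketch that assumes std::mutex and std::lock_guard (C++11) in place of the slide's critical_section_t and cs_lock_t, and the get_name and set_name members are invented for the illustration.

```cpp
#include <cassert>
#include <mutex>
#include <string>

class my_class_t
{
public:
    explicit my_class_t(const std::string& name) : m_data(name) {}

    // Logically read-only, so it is const; locking only alters m_mutex,
    // which is why m_mutex has to be mutable.
    std::string get_name() const
    {
        std::lock_guard<std::mutex> auto_lock(m_mutex);
        return m_data;      // the copy is made while the lock is held
    }

    void set_name(const std::string& name)
    {
        std::lock_guard<std::mutex> auto_lock(m_mutex);
        m_data = name;
    }

private:
    mutable std::mutex m_mutex;     // protects m_data
    std::string m_data;
};
```

Returning the string by value rather than by reference matters here: a reference would let the caller read m_data after the lock has been released.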
Deadlocks - the problem
crit_sect_t cs_a;
object_t obj_a;
int a_val;

crit_sect_t cs_b;
object_t obj_b;
int b_val;

void foo()
{
    cs_lock_t lock1(cs_a);
    obj_a.process(a_val);

    do_something();

    cs_lock_t lock2(cs_b);
    obj_b.process(b_val);
}

void bar()
{
    cs_lock_t lock1(cs_b);
    obj_b.process(b_val);

    do_something_else();

    cs_lock_t lock2(cs_a);
    obj_a.process(a_val);
}

Deadlock: foo() holds cs_a and waits for cs_b while bar() holds cs_b and waits for cs_a.
Deadlocks a solution
void foo()
{
    {
        cs_lock_t lock1(cs_a);
        obj_a.process(a_val);
    }

    do_something();

    {
        cs_lock_t lock2(cs_b);
        obj_b.process(b_val);
    }
}

void bar()
{
    {
        cs_lock_t lock1(cs_b);
        obj_b.process(b_val);
    }

    do_something_else();

    {
        cs_lock_t lock2(cs_a);
        obj_a.process(a_val);
    }
}
Deadlocks - another solution
void foo()
{
    cs_lock_t lock1(cs_a);
    cs_lock_t lock2(cs_b);
    obj_a.process(a_val);

    do_something();
    obj_b.process(b_val);
}

void bar()
{
    cs_lock_t lock1(cs_a);
    cs_lock_t lock2(cs_b);
    obj_b.process(b_val);

    do_something_else();
    obj_a.process(a_val);
}
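The fixed-order rule can be exercised with two threads. In this sketch foo() takes the locks in the agreed order, as on the slide, while bar() shows a different technique that was not available in 2001: std::lock from C++11, which acquires several mutexes without deadlocking whatever order they are named in. The counters and run_demo are hypothetical additions.

```cpp
#include <cassert>
#include <mutex>
#include <thread>

static std::mutex cs_a;     // protects a_val
static std::mutex cs_b;     // protects b_val
static int a_val = 0;
static int b_val = 0;

void foo()
{
    // Agreed order: cs_a first, then cs_b.
    std::lock_guard<std::mutex> lock1(cs_a);
    std::lock_guard<std::mutex> lock2(cs_b);
    ++a_val;
    ++b_val;
}

void bar()
{
    // std::lock takes both mutexes using a deadlock-avoidance algorithm,
    // so the order in which they are listed does not matter.
    std::lock(cs_b, cs_a);
    std::lock_guard<std::mutex> lock1(cs_b, std::adopt_lock);
    std::lock_guard<std::mutex> lock2(cs_a, std::adopt_lock);
    ++b_val;
    ++a_val;
}

// Runs foo() and bar() concurrently many times; returns true if both
// counters reached the expected totals.
bool run_demo()
{
    a_val = 0;
    b_val = 0;
    std::thread t1([] { for (int i = 0; i != 10000; ++i) foo(); });
    std::thread t2([] { for (int i = 0; i != 10000; ++i) bar(); });
    t1.join();
    t2.join();
    return a_val == 20000 && b_val == 20000;
}
```

The price of either approach is that both locks are held for the whole function, which increases contention compared with the narrowly scoped locks of the previous solution.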
Deadlocks - subtle risks
void my_class_t::foo()
{
cs_lock_t auto_lock(m_crit_sec);
m_client->update();
m_val = m_val + 1;
}
• The problem is that we have no knowledge of, and no control over, what the client does in the call to update(). If the client attempts to lock a resource we could hit a variation on our old deadlock situation.
Dead locks - more subtle risks
void my_class_t::foo()
{
cs_lock_t auto_lock(m_crit_sec);
FireUpdateEvent();
m_val = m_val + 1;
}
• The problem is exacerbated by Microsoft. Microsoft code almost always fires events synchronously.
• To be fair C# has support for asynchronous events.
A possible work around
void my_class_t::foo()
{
{
cs_lock_t auto_lock(m_crit_sec);
m_val = m_val + 1;
}
FireUpdateEvent();
}
• This avoids the risk of deadlock but is not always practical. Also we do not know whether the caller of foo() has locked a resource. The only reliable solution is to use asynchronous events.
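The workaround can be shown end to end with a listener that calls back into the object. This is a sketch under assumptions: std::mutex and std::function (C++11) replace the slide's types, and set_listener, get_val and run_demo are hypothetical. Because the lock is released before the event fires, the listener can call get_val() without deadlocking on m_crit_sec.

```cpp
#include <cassert>
#include <functional>
#include <mutex>

class my_class_t
{
public:
    // Assumed to be called before any other thread uses the object.
    void set_listener(std::function<void()> listener) { m_listener = listener; }

    int get_val() const
    {
        std::lock_guard<std::mutex> auto_lock(m_crit_sec);
        return m_val;
    }

    void foo()
    {
        {
            std::lock_guard<std::mutex> auto_lock(m_crit_sec);
            m_val = m_val + 1;
        }                               // lock released here
        if (m_listener) m_listener();   // plays the role of FireUpdateEvent()
    }

private:
    mutable std::mutex m_crit_sec;      // protects m_val
    int m_val = 0;
    std::function<void()> m_listener;
};

// The listener reads the object from inside the event and records the value.
int run_demo()
{
    my_class_t obj;
    int seen = -1;
    obj.set_listener([&obj, &seen] { seen = obj.get_val(); });
    obj.foo();
    return seen;
}
```

Had foo() fired the event while still holding m_crit_sec, the listener's call to get_val() would have tried to lock a non-recursive mutex that its own thread already holds.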
Support for asynchronous events
• A good implementation is non-trivial.
• None of ATL, MFC, Visual Basic, or Java supports asynchronous events (C# does).
• Do not implement your own per case solutions. Use library code.
• I know of only one good source for reusable source code to implement asynchronous events in C++: www.cuj.com/archive/1610/1610list.html
If you would like to use this, please see me because I’ve got some improvements and fixes for it; there are also issues with scripting clients.
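To make the idea concrete, here is a deliberately small sketch of deferred events: the object posts a handler to a queue while it may still hold locks, and the handlers run later from a point where no locks are held. It is not the CUJ implementation referenced above and it ignores everything that makes a real implementation non-trivial (threads, apartments, lifetime of the receivers); event_queue_t and its members are hypothetical names, and the code assumes C++11.

```cpp
#include <cassert>
#include <functional>
#include <mutex>
#include <queue>

class event_queue_t
{
public:
    // Called by the object raising the event; safe to call with locks held
    // because no handler runs here.
    void post(std::function<void()> handler)
    {
        std::lock_guard<std::mutex> guard(m_mutex);
        m_pending.push(handler);
    }

    // Called later, from a point where the caller holds no locks.
    // Returns the number of handlers that were run.
    int dispatch()
    {
        int count = 0;
        for (;;) {
            std::function<void()> handler;
            {
                std::lock_guard<std::mutex> guard(m_mutex);
                if (m_pending.empty()) break;
                handler = m_pending.front();
                m_pending.pop();
            }
            handler();      // runs outside the queue lock, so it may post() again
            ++count;
        }
        return count;
    }

private:
    std::mutex m_mutex;     // protects m_pending
    std::queue<std::function<void()>> m_pending;
};
```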
Deadlocks - very subtle risks
static critical_section_t file_scoped_critsec;
void my_class_t::foo()
{
cs_lock_t auto_lock(file_scoped_critsec);
p_low_level_obj->foo();
m_val = m_val + 1;
}
DOWNWARD CALLS ARE OKAY - HAVE A DIAGRAM!!!!!
• Avoid having global locks if you can. Minimize the visibility of lockable resources.
• If you must have one, beware of what you do whilst holding it. Do not do anything that could result in a cross-apartment call.
Global lock example
[Figure: an STA and an MTA; Object A calls into Object B across the apartment boundary.]
Object A locks the global (non-member) resource, and calls into object B. Object B also requires the global resource, and waits.
Causality
causality n : the relation between causes and effects
• The operating system applies locks based on the thread id, not the causality. As long as you stay within its framework, COM is able to track the causality.
• In the previous slide there were two threads but only one causality.
Causality
[Figure: an STA and an MTA; Object A and Object B call each other across the apartment boundary.]
Object A is blocked in an RPC call to object B. Object B calls back object A, but the system (COM+) knows that this is the same causality and allows the RPC to unblock to allow the call to continue (otherwise it would deadlock).
Breaking causality
• If object B were to create another thread, and that thread were to attempt to call back object A, then we would have fooled COM’s causality checking, and deadlock would ensue.
Read/Write locks
• So far we have locked our resource regardless of whether the client is just reading or modifying the data. Only one client at a time can have any access.
• This is inefficient, because we could have 1000 clients all wanting to read the data; they will not interfere with each other, so they could all have access in parallel.
• What we need is a different type of lock. The lock will only deny access to readers if a writer holds the lock. The lock will deny access to writers if anyone holds the lock.
Usage
int my_class_t::get_value() const
{
reader_lock auto_lock(m_critsec);
return m_value;
}
void my_class_t::modify_value(int new_value)
{
writer_lock auto_lock(m_critsec);
m_value = new_value;
}
• The get_value function locks the object for reading, and the modify_value function locks it for writing. However, it is up to the developer to correctly choose whether to perform a read or write lock. It is possible to have the code automatically select a read or write lock - advanced topic.
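The reader_lock and writer_lock classes are not shown in the slides. A present-day equivalent, assuming a C++17 compiler, is std::shared_mutex with std::shared_lock for readers and std::unique_lock for writers; the sketch below uses those in place of the slide's types.

```cpp
#include <cassert>
#include <mutex>
#include <shared_mutex>

class my_class_t
{
public:
    int get_value() const
    {
        // Shared (read) lock: any number of readers may hold it at once.
        std::shared_lock<std::shared_mutex> auto_lock(m_rwlock);
        return m_value;
    }

    void modify_value(int new_value)
    {
        // Exclusive (write) lock: waits until no reader or writer holds it.
        std::unique_lock<std::shared_mutex> auto_lock(m_rwlock);
        m_value = new_value;
    }

private:
    mutable std::shared_mutex m_rwlock;  // protects m_value
    int m_value = 0;
};
```

As on the slide, choosing the right kind of lock in each function remains the developer's responsibility; the compiler will not object if get_value takes the exclusive lock or, worse, if modify_value takes the shared one.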
Temporary freezes
void my_class_t::foo()
{
    cs_lock_t auto_lock(m_crit_sec);
    potentially_expensive_off_machine_function_call();
}
int my_class_t::get_trivial_value()
{
    cs_lock_t auto_lock(m_crit_sec);
    return m_trivial;
}
• Never hold a lock for an extended period of time.
Suppose foo is called from a background thread: m_crit_sec remains locked for several seconds while the off-machine call runs. If the UI thread calls get_trivial_value in the meantime, it is blocked as well, and the UI freezes.
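One way to avoid the freeze is to copy what the slow call needs while holding the lock, release the lock, and only then make the call. A sketch under stated assumptions: std::mutex stands in for the deck's critical section class, the slow function is a hypothetical stub that takes and returns a string, and m_request, m_reply and get_reply are invented for illustration.

```cpp
#include <mutex>
#include <string>

// Hypothetical stub standing in for a slow off-machine request.
inline std::string potentially_expensive_off_machine_function_call(const std::string& request) {
    return "reply to " + request;
}

class my_class_t {
public:
    void foo() {
        std::string request;
        {   // Hold the lock only long enough to copy the input.
            std::lock_guard<std::mutex> auto_lock(m_crit_sec);
            request = m_request;
        }
        // No lock is held here, so get_trivial_value() is not blocked.
        std::string reply = potentially_expensive_off_machine_function_call(request);
        {   // Re-acquire briefly to publish the result.
            std::lock_guard<std::mutex> auto_lock(m_crit_sec);
            m_reply = reply;
        }
    }
    int get_trivial_value() const {
        std::lock_guard<std::mutex> auto_lock(m_crit_sec);
        return m_trivial;
    }
    std::string get_reply() const {
        std::lock_guard<std::mutex> auto_lock(m_crit_sec);
        return m_reply;
    }
private:
    mutable std::mutex m_crit_sec;
    std::string m_request = "ping";
    std::string m_reply;
    int m_trivial = 7;
};
```

The cost is that the object's state may change between the two locked sections, so foo must tolerate m_request having been modified by the time the reply arrives.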
When not to use a lock
• Locks add complexity, cost CPU cycles to acquire and release, and risk deadlock, so it’s best to avoid them when you can.
• When the data is only accessed by one thread:
– Automatics (stack based)
– Thread local storage
• Data that does not change:
– const
• Stateless objects
• Naturally self consistent data without race conditions:
– e.g. a single bool set by the user, and read by a background thread.
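A minimal sketch of the last case, with one adjustment for current compilers: under the C++11 memory model a plain bool written by one thread and read by another is formally a data race, so the flag below is a std::atomic<bool>. It still needs no lock. The names g_cancel_requested and background_work are hypothetical.

```cpp
#include <atomic>
#include <thread>

// Set by the UI thread when the user asks to cancel (hypothetical flag).
std::atomic<bool> g_cancel_requested{false};

// Runs on the background thread: spins until the flag is seen.
// Returns the number of loop iterations completed before cancellation.
long background_work() {
    long iterations = 0;
    while (!g_cancel_requested.load()) {
        ++iterations;
        std::this_thread::yield();   // stand-in for one unit of real work
    }
    return iterations;
}
```

The UI thread cancels with g_cancel_requested.store(true); the worker notices on its next loop iteration.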
Good use of const
const double pi = 3.14159265359;                 // point 0
class person_t
{
public:
    person_t(const std::string & fore_name,      // point 1
             const std::string & family_name) :
        m_fore_name(fore_name),
        m_family_name(family_name)
    {}
    std::string get_fullname() const              // point 2
    {
        return m_fore_name + m_family_name;
    }
private:
    mutable crit_sec_t m_critsec;                 // point 3
    const std::string m_fore_name;                // point 4
    const std::string m_family_name;
};
Good use of const
0> const type identifier = value;
Most people’s simple view of “const”: an item of data (often global) with a fixed value.
1> foo(const T & arg)
Pass by “const reference”. The function promises not to modify the arguments that have been passed by reference.
2> class_t::func() const
A “const member function”. The function promises not to modify the “logical state” of the object. It may change the bit pattern of “mutable” members, but it will preserve the logical state of the object.
Good use of const
3> mutable type identifier;
A “mutable data member”. This declares a member variable that does not contribute to the logical state of the object. It can be modified by “const member functions”.
4> const type identifier;
A “const data member”. This declares a data member that never changes. It must be given its value at the moment it is created, that is, in the constructor’s member initializer list.
Good use of const
• The code for person_t is almost completely thread safe. Once constructed, the data clearly never changes, so no locks are required.
• “Almost”: there is still the issue of object lifetime. Sometimes this is simple, sometimes not.
• In a Microsoft environment we can use COM to do reference counting, by inheriting a very lightweight implementation of IUnknown. Alternatively we can use boost::shared_ptr (but make it thread safe).
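As an illustration of the lifetime point, here is a sketch that uses std::shared_ptr, the standard-library descendant of the Boost smart pointer, whose reference count is updated atomically. That substitution is an assumption, as are the person_ptr alias, the make_person helper, and the space inserted between the two names.

```cpp
#include <memory>
#include <string>

class person_t {
public:
    person_t(const std::string& fore_name, const std::string& family_name)
        : m_fore_name(fore_name), m_family_name(family_name) {}
    std::string get_fullname() const { return m_fore_name + " " + m_family_name; }
private:
    const std::string m_fore_name;     // never changes after construction
    const std::string m_family_name;
};

// Threads share read-only access; the last owner to let go destroys the object.
using person_ptr = std::shared_ptr<const person_t>;

person_ptr make_person(const std::string& fore, const std::string& family) {
    return std::make_shared<person_t>(fore, family);
}
```

Each thread keeps its own person_ptr copy. Copying and destroying those copies concurrently is safe, and because the object itself is immutable no further locking is needed.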
Guidance for locks and const
• In general, acquire a lock only in externally visible member functions. That means public, protected, or any virtual functions, plus functions called internally by asynchronous means. The remaining internal functions do not need a lock, because it will already have been acquired.
• Acquiring a lock within internal functions is harmless (provided the lock is recursive, as a Win32 critical section is), but there is a performance penalty.
• If you wish to distinguish read and write locks:
– Non-const functions require a write lock.
– const functions require at least a read lock.
Guidance for locks and const
• Non-private member functions that only access const members need not acquire a lock. Such functions must not call other member functions.
public:
    int get_foo() const    // no lock required
    {
        return m_foo;
    }
private:
    const int m_foo;
    mutable crit_sec_t m_critsec;
Guidance for locks and const
• Non-private or virtual const member functions need only acquire a read lock.
public: // or protected
    void foo1() const
    {
        read_lock_t auto_lock(m_critsec);
        // ...
    }
    virtual void foo2() const
    {
        read_lock_t auto_lock(m_critsec);
        // ...
    }
Guidance for locks and const
• Other non-private or virtual member functions must acquire a write lock.
public: // or protected
    void foo3()
    {
        write_lock_t auto_lock(m_critsec);
        // ...
    }
    virtual void foo4()
    {
        write_lock_t auto_lock(m_critsec);
        // ...
    }
Guidance for locks and const
• Private non-virtual functions need not acquire locks.
private:
    void foo5()
    {
        // ...
    }
• “Non-private” == public or protected
• Restrict the visibility of functions as much as possible.
• This assumes that mutable data is thread safe, and that no thread is started internally with a private member function as its entry point.
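The guidance can be pulled together in one small class. This is a sketch only: counter_t and its members are hypothetical, and std::shared_mutex with std::shared_lock / std::unique_lock (C++17) stands in for the deck's read_lock_t and write_lock_t.

```cpp
#include <mutex>
#include <shared_mutex>

class counter_t {
public:
    explicit counter_t(int id) : m_id(id) {}

    // Touches only a const member: no lock required.
    int get_id() const { return m_id; }

    // Public const function: a read lock is enough.
    int get_count() const {
        std::shared_lock<std::shared_mutex> auto_lock(m_critsec);
        return m_count;
    }

    // Public non-const function: must take the write lock.
    void add(int amount) {
        std::unique_lock<std::shared_mutex> auto_lock(m_critsec);
        apply(amount);
    }

private:
    // Private non-virtual helper: the caller already holds the write lock.
    void apply(int delta) { m_count += delta; }

    const int m_id;
    mutable std::shared_mutex m_critsec;  // mutable so const functions can lock
    int m_count = 0;
};
```

Unlike a Win32 critical section, std::shared_mutex is not recursive, so with this substitution the private helper must not try to take the lock again.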
Granularity
void my_class_t::append(const std::string & new_data)
{
    if (new_data.empty())
    {
        return;
    }
    else
    {
        cs_lock_t auto_lock(m_crit_sec);
        m_existing_data = m_existing_data + ',' + new_data;
    }
}
Granularity
void my_class_t::append(const std::string & new_data)
{
    if (new_data.empty())
        return;
    cs_lock_t auto_lock(m_crit_sec);
    m_existing_data = m_existing_data + "," + new_data;
}
• Put the lock either at the beginning of the function or soon after the beginning. If it is hidden in the depths of the function, it is too easy for a maintenance programmer to break the code. On a uniprocessor machine, delaying the lock is only rarely a performance optimization, unless of course there are expensive off-machine calls involved.
Thread local storage
DWORD TlsAlloc();
BOOL TlsSetValue(DWORD, LPVOID);
LPVOID TlsGetValue(DWORD);
BOOL TlsFree(DWORD);
__declspec(thread) int my_per_thread_var = 0;
• Be warned, __declspec(thread) is unsafe to use in a DLL loaded using LoadLibrary, which includes any COM DLL.
• Thread local storage can be used to detect recursion in a multithreaded environment.
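A sketch of recursion detection with per-thread storage. It uses the portable C++11 thread_local keyword rather than TlsAlloc or __declspec(thread), which is a substitution; recursion_guard, t_depth and on_event are hypothetical names.

```cpp
// One counter per thread: other threads entering on_event do not affect it.
thread_local int t_depth = 0;

// Increments the per-thread depth on entry and restores it on exit.
class recursion_guard {
public:
    recursion_guard() : m_reentered(t_depth++ > 0) {}
    ~recursion_guard() { --t_depth; }
    bool reentered() const { return m_reentered; }
private:
    bool m_reentered;
};

// Returns true if the function found itself re-entered on the same thread.
// nested_calls simulates a callback that calls back into on_event.
bool on_event(int nested_calls) {
    recursion_guard guard;
    if (guard.reentered())
        return true;               // already active on this thread
    return nested_calls > 0 ? on_event(nested_calls - 1) : false;
}
```

A global counter could not distinguish genuine recursion from two threads being inside the function at the same time; the per-thread counter can.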
Per machine storage
-------- Shared.h --------
namespace shared
{
    extern bool __declspec(dllexport) per_machine_val;
}
------- Shared.cpp -------
#pragma data_seg(".ANAME")
#pragma comment(linker, "/SECTION:.ANAME,RWS")
namespace shared
{
    bool per_machine_val = false;
}
#pragma data_seg()
Per machine storage
• You cannot store pointers in this storage.
• If you have consistency issues, you must use a Mutex rather than a critical section for serialization.
• The DLL containing shared::per_machine_val must be loaded into each process sharing the data.
• If you have multiple copies of the DLL on the machine there is potential for multiple out-of-sync instances of the global value.
• The value is initialized when the DLL is first loaded.
Atomic calls
• Consider a rectangle with the following properties:
– set/get: center, width, height
– set/get: top_left, top_right, bottom_left, bottom_right
– set(center, width, height), get(center, width, height)
• The first interface gives an inherently self-consistent view of an object. However, another thread can still read the details of a rectangle that is valid but never existed (for example, the new width together with the old height).
• The second interface is just plain dangerous. At any given instant the object may be an arbitrary quadrangle and not a rectangle at all (an invalid state).
• The third interface is safe but unfriendly.
Atomic calls
• The main alternatives are:
– Provide a lock and unlock method
– Provide a copy/clone method
– Provide a snapshot method
Atomic calls
• lock/unlock methods can be both risky and inefficient. They are risky because the client could lock an object but forget to unlock it. The remedy is to provide a helper class that locks in its constructor and unlocks in its destructor, and to expose the lock/unlock methods only to that helper class. They can also be inefficient with heavily used global objects, because they can lead to contention.
• copy or clone methods may be inappropriate especially if the object has an identity e.g. an employee object.
• snapshot methods add the complexity of a new interface with just read-only properties. Copying the internal state of the original object may be expensive.
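A sketch of the snapshot alternative applied to the rectangle example. The type names, the split of center into center_x and center_y, and the use of std::mutex in place of the deck's critical section class are all assumptions.

```cpp
#include <mutex>

// Read-only copy of the rectangle's state at one instant.
struct rect_snapshot_t {
    double center_x, center_y, width, height;
};

class rectangle_t {
public:
    // All properties change together under one lock, so no reader can
    // observe a half-updated rectangle.
    void set(double cx, double cy, double w, double h) {
        std::lock_guard<std::mutex> auto_lock(m_critsec);
        m_state = {cx, cy, w, h};
    }
    // Returns a consistent copy; the caller reads it without any lock.
    rect_snapshot_t snapshot() const {
        std::lock_guard<std::mutex> auto_lock(m_critsec);
        return m_state;
    }
private:
    mutable std::mutex m_critsec;
    rect_snapshot_t m_state{0, 0, 0, 0};
};
```

The snapshot is cheap here because the state is four numbers; for an object with large internal state the copy is the expense the slide warns about.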
Summary of rules for locks
• When possible hold at most one lock at a time.
• Acquire and release locks in “nested” order.
• If you must acquire multiple locks always do it in the same order.
• Read/write locks can be more efficient.
• Hold locks for the minimum time period (no off machine calls).
• Do not call external (non-downward) code whilst holding a lock.
• Use asynchronous events and callbacks.
• Use smart or very smart lock classes.
• Never expose a public lock/unlock method - it invites abuse.
• Each object should protect its own data.
• Make maximum use of const.
• Make use of mutable.
• Only use a lock when there is a need.
• Follow the “Guidance for locks and const”
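When two locks really must be held together, the "same order" rule can be delegated to the library. The sketch below uses std::lock, which acquires both mutexes with a deadlock-avoidance algorithm; that is a swapped-in technique rather than a hand-maintained ordering, and account_t and transfer are hypothetical.

```cpp
#include <mutex>

struct account_t {
    std::mutex m;
    int balance = 0;
};

// Moves money between two accounts while holding both locks.
// Safe even if another thread calls transfer(b, a, ...) at the same time.
void transfer(account_t& from, account_t& to, int amount) {
    if (&from == &to)
        return;                        // never lock the same mutex twice
    std::lock(from.m, to.m);           // acquires both without deadlock
    std::lock_guard<std::mutex> g1(from.m, std::adopt_lock);
    std::lock_guard<std::mutex> g2(to.m, std::adopt_lock);
    from.balance -= amount;
    to.balance += amount;
}
```

The two lock_guard objects adopt the already-held mutexes, so both are released on every exit path, including exceptions.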
Library support - locks
• In large scale multithreaded development we want the following facilities in locks:
– Exception safety
– Correct copy/assignment semantics
– Automatic read/write locking
– Fixed efficient DLL interface on lockable object
– Warning when lock held too long (freeze)
– Minimum overhead in a single threaded environment
– Adapts to operating system (selects fastest method available)
– Optional deadlock detection
Library support - events
• Facilities we want for events and callbacks:
– The ability to fire events and callbacks asynchronously
– Support for multiple arguments
– Type safety
– Scripting support
Transactions
• Transactions may occur in parallel threads, and typically lock database records. Database locks, like any other lock, should not be held for any length of time, and they are relatively expensive as well.
• Keeping transactions short is essential in a multiuser transactional system. Transactions must be isolated from each other. This means that objects currently working on Transaction A cannot see any of the data being used by other objects working on Transaction B. Neither transaction knows whether the other is going to commit or abort, so neither knows whether the data in the other ongoing transaction is good or not. You cannot be allowed to see data that is potentially bogus, for fear you would misuse it, so you get “locked out”.
Transactions
• Keep the granularity fine: lock the minimum of data.
• When possible use a “contra-transaction”. Immediately commit data without waiting for a user response / confirmation, and achieve the effect of a rollback by using a “contra-transaction”.
– For example, do not lock the credit for a bank whilst waiting for a trade to be confirmed. Instead draw down the credit immediately, and if the deal is not confirmed, generate a contra-transaction to restore the credit.
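The credit example can be sketched in memory as follows. A real system would commit each step to the database; here a std::mutex-protected counter stands in for the credit record, and credit_line_t with its member names is hypothetical.

```cpp
#include <mutex>

class credit_line_t {
public:
    explicit credit_line_t(long limit) : m_available(limit) {}

    // Commits the draw-down immediately; the lock is held only for the update,
    // not while waiting for the trade to be confirmed.
    bool draw_down(long amount) {
        std::lock_guard<std::mutex> auto_lock(m_critsec);
        if (amount > m_available)
            return false;              // insufficient credit
        m_available -= amount;
        return true;
    }

    // Contra-transaction: restores the credit if the deal is not confirmed.
    void contra(long amount) {
        std::lock_guard<std::mutex> auto_lock(m_critsec);
        m_available += amount;
    }

    long available() const {
        std::lock_guard<std::mutex> auto_lock(m_critsec);
        return m_available;
    }

private:
    mutable std::mutex m_critsec;
    long m_available;
};
```

Other traders see the reduced credit straight away, which is the conservative outcome: at worst a trade is refused that would have been allowed after the contra-transaction.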
Brief summary
• Clearly model threads and apartments in your design, and illustrate the context in which objects and code run.
• Use const to the maximum.
• Beware locks and expensive calls.
• Avoid global or shared data or objects.
• Use asynchronous events and callbacks.
• Use very smart locks.
• Use fine granularity with transactions and consider “contra-transactions”.
Brief summary
• COM
– VB is okay for the UI and strictly apartment model code.
– C++ provides the most flexibility for infrastructure. It can exist in an MTA.
• COM+
– Consider using threading model ‘Neutral’.
– Consider letting COM+ do the synchronization (‘Required’).
• Read the code handouts
– critical_section_t, cs_lock_t, asynchronous event firing, GITPtr
Commonly misused functions
• CreateThread
– Use _beginthread, _beginthreadex or AfxBeginThread instead.
• _beginthread, _beginthreadex
– Read what MSDN says about the handles returned.
• Sleep(0), SleepEx(0, TRUE)
– The former will yield your thread, the latter enters an alertable wait state.
• SendMessage
– This will cause two thread swaps if the window is in a different thread.
• DllMain
– NT holds the process critical section whilst calling DllMain, the same one acquired by GetModuleHandle among other functions. MSJ 96 sidebar.
Areas not covered
• Win32 thread primitives
• Priorities and scheduling
• Out of process schemes using COM
Recommended reading
• Books
– Most books about threads are like vegetables: they are sold by weight, and past their sell-by date.
– There is no single source.
– Java Threads - Scott Oaks & Henry Wong: about 300 pages, well presented, covers general issues in a Java context, and Java specifics.
Recommended reading
• Internet
– http://www.lsc.nd.edu/~jsiek/tlc/btl/libs/threads/Synchronization.html
– www.wdj.com/archive/1105/feature.html
– comp.programming.threads (news group)
– www.cuj.com/archive/1610/1610list.html
– www.boost.org (will probably support threads in the future)
• Products
– www.cs.utexas.edu/users/emery/hoard/
– www.microquill.com Smartheap
– www.roguewave.com Threads.h++
– www.objectspace.com The Foundations Thread<ToolKit>
– www.ooc.com/jtc/ JThreads/C++
Recommended reading
• MSDN
– Apartment-Model Threading in Visual Basic
– Neutral Apartments
– Windows 2000 Brings Significant Refinements to the COM(+) Programming Model
• item: Contexts and Apartments
– Services provided by COM+
• item: Concurrency
More help?
• If you would like more detail in a particular area, or some help or guidance (for example, writing the really smart locking class, extending GITPtr, or event firing), I am very happy to help.
– extension 6830, cube 3412
– email: [email protected]
– More than an hour? This will probably require you to bribe Lauren Lenoble and/or Angelo Bartolotta.