ESSENTIALS OF MULTITHREADED SYSTEM PROGRAMMING IN C++
Shuo Chen, 2011/02
blog.csdn.net/Solstice [email protected]
@bnu_chenshuo
Shuo Chen (blog.csdn.net/Solstice)
Contents
Challenges in multithreaded system programming
Thread safety of C and C++ libraries
RAII and fork()
fork() and signal handling in multithreaded programs
Audience: C++ programmers
Familiar with the Pthreads and Sockets APIs
Know thread safety, deadlock, race conditions, etc.
In a word: have read through APUE2e and UNP3e (vol. 1) by W. Richard Stevens et al.
All discussions are based on Linux 2.6.x, x >= 28
  There are new syscalls, e.g. signalfd, eventfd, and timerfd
  x86 and x64 platforms
Multi-threaded system programming
Multithreading is inevitable in this multi-core era
The difficulty is not learning the synchronization primitives (mutexes, condition variables)
  ~10 functions are sufficient to do it right
The difficulty is understanding the interactions between existing system calls and library functions
  Understand how threads affect system design
  Use them wisely and effectively
  Avoid common pitfalls and fallacies
11 essential Pthreads functions
11 out of 110+ Pthreads functions
  2 -> create and join threads
  4 -> init/destroy, lock/unlock mutexes
  5 -> init/destroy, wait/signal/broadcast condition variables
Think twice if you need more
  Some are okay, e.g. once and key, maybe rwlock
  Some are bad, e.g. cancel and kill, semaphores
Check muduo/base for an encapsulation in C++
  http://code.google.com/p/muduo
  http://github.com/chenshuo/recipes (click thread)
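To see how far the 11 functions go, here is a minimal sketch that uses exactly those calls and nothing else from Pthreads. The `IntQueue` type and the `consume`/`sumOnWorker` functions are hypothetical names made up for this illustration; they are not taken from muduo.

```cpp
#include <pthread.h>
#include <cassert>
#include <deque>

// Hypothetical example type: a queue of ints guarded by one mutex and one
// condition variable.
struct IntQueue {
  pthread_mutex_t mutex;
  pthread_cond_t notEmpty;
  std::deque<int> items;
  bool closed;
  long sum;  // filled in by the consumer thread
};

// Consumer thread: adds up items until the queue is closed and drained.
void* consume(void* arg) {
  IntQueue* q = static_cast<IntQueue*>(arg);
  pthread_mutex_lock(&q->mutex);
  for (;;) {
    while (q->items.empty() && !q->closed) {
      pthread_cond_wait(&q->notEmpty, &q->mutex);  // always wait in a loop
    }
    if (q->items.empty()) break;  // closed and drained
    q->sum += q->items.front();
    q->items.pop_front();
  }
  pthread_mutex_unlock(&q->mutex);
  return NULL;  // return normally from the thread function
}

// Pushes 1..n to a consumer thread and returns the sum it computed.
long sumOnWorker(int n) {
  IntQueue q;
  q.closed = false;
  q.sum = 0;
  int rc = pthread_mutex_init(&q.mutex, NULL);
  assert(rc == 0);
  rc = pthread_cond_init(&q.notEmpty, NULL);
  assert(rc == 0);

  pthread_t worker;
  rc = pthread_create(&worker, NULL, consume, &q);
  assert(rc == 0);
  (void)rc;  // silence "unused" warnings when NDEBUG is defined

  for (int i = 1; i <= n; ++i) {
    pthread_mutex_lock(&q.mutex);
    q.items.push_back(i);
    pthread_mutex_unlock(&q.mutex);
    pthread_cond_signal(&q.notEmpty);
  }
  pthread_mutex_lock(&q.mutex);
  q.closed = true;
  pthread_mutex_unlock(&q.mutex);
  pthread_cond_broadcast(&q.notEmpty);

  pthread_join(worker, NULL);  // join makes q.sum visible to this thread
  pthread_cond_destroy(&q.notEmpty);
  pthread_mutex_destroy(&q.mutex);
  return q.sum;
}
```

In real code the init/destroy and lock/unlock pairs would live in small RAII classes, which is the kind of encapsulation the slide points to.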
An asynchronous world
Never assume the sequence of events without proper synchronization.
  Know the happens-before relation, memory visibility, etc.
The effect of an interaction between two [thread]s must be independent of the speed at which it is carried out. --- Brinch Hansen 1973
Standards and practices
The latest official standards of the C and C++ languages (C99 and C++03) do not say a word about processes or threads
Yet we write multi-process and/or multi-threaded C/C++ programs in real life, as a real-world need
We can't wait for threads to be standardized, as standards usually fall behind practice for years
  btw, if there were no real-life multi-threaded programs, how would people know what/how to standardize?
We adhere to some de facto standards
  A lot simpler if we focus on one hardware platform and one OS
Thread identifier on Linux
Use pid_t as the thread id, instead of pthread_t, on Linux
  pthread_t thid = pthread_self(); thid is opaque (uintptr_t)
  pid_t tid = ::gettid(); tid is the task id, usually a small integer
  /proc/tid/, /proc/pid/task/tid/, ps, and top all work fine
How to implement gettid() efficiently? Thread local?
  gettid(2) is a syscall, but its result never changes for a given thread
  getpid(2) caches the result; should gettid() do the same?
  What about fork()? Will the child process see the old cached value?
  How about pthread_atfork() to clear it up?
  Check muduo/base/Thread.cc for details
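One way to answer the questions above, as a minimal sketch: cache the id in a `__thread` variable and clear it in the child with `pthread_atfork()`. The names `currentTid`, `t_cachedTid` and the helpers are hypothetical, and this is not the code in muduo/base/Thread.cc. The raw `syscall(SYS_gettid)` is used because older glibc versions have no `gettid()` wrapper.

```cpp
#include <pthread.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>
#include <cassert>

namespace {

__thread pid_t t_cachedTid = 0;  // 0 means "not cached yet"

// Runs in the child after fork(): the surviving thread has a new task id,
// so the value copied from the parent must be dropped.
void clearCachedTid() { t_cachedTid = 0; }

pthread_once_t g_registerOnce = PTHREAD_ONCE_INIT;

void registerAtfork() {
  int rc = pthread_atfork(NULL, NULL, clearCachedTid);
  assert(rc == 0);
  (void)rc;
}

}  // namespace

// Returns the kernel task id of the calling thread. Only the first call in
// each thread (and the first call after fork() in a child) pays for the
// system call.
pid_t currentTid() {
  if (t_cachedTid == 0) {
    pthread_once(&g_registerOnce, registerAtfork);
    t_cachedTid = static_cast<pid_t>(::syscall(SYS_gettid));
  }
  return t_cachedTid;
}
```

The fast path is a single thread-local load and compare, so the function is cheap enough to call on every log line.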
Creation of threads
A library should not create its own 'background' thread without prior informed consent
  It makes a program non-forkable
Never create a thread before main()
  Avoid creating threads in the ctor of a static or global object
  It breaks the construction of static objects, e.g. protobuf registration
The number of threads created should be independent of system load (e.g. # of connections, # of requests), otherwise the design is non-scalable
Reuse threads by assigning multiple roles to each
  Do IO and timers with the muduo EventLoop class
  For simple tasks, do them within IO callbacks in IO threads
Three ways of termination
http://blog.csdn.net/program_think/archive/2009/03/14/3991107.aspx
Natural death: return from the thread function, good
Suicide: call pthread_exit()
Murdered: killed by pthread_cancel()
Rule: let it die; never suicide or murder a thread
  Why? Inherently deadlock-prone: no chance to unlock
  Design your program so that a thread can be woken up and exit safely
For reference
  Java Thread.{stop, suspend and destroy} are deprecated
  Boost Threads doesn't provide thread::cancel()
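A minimal sketch of "woken up and exits safely": the thread sleeps on a condition variable, and `stop()` sets a flag, wakes it, and joins it. The `Worker` class is a hypothetical illustration, not something from the slides or from muduo.

```cpp
#include <pthread.h>
#include <cassert>

// Hypothetical worker whose thread ends by returning from its thread
// function ("natural death"); nobody calls pthread_exit() or pthread_cancel().
class Worker {
 public:
  Worker() : quit_(false), joined_(false), cleanExit_(false) {
    pthread_mutex_init(&mutex_, NULL);
    pthread_cond_init(&cond_, NULL);
    int rc = pthread_create(&thread_, NULL, &Worker::threadMain, this);
    assert(rc == 0);
    (void)rc;
  }

  ~Worker() {
    stop();
    pthread_cond_destroy(&cond_);
    pthread_mutex_destroy(&mutex_);
  }

  // Asks the thread to leave its loop, wakes it up, and waits for it.
  void stop() {
    if (joined_) return;
    pthread_mutex_lock(&mutex_);
    quit_ = true;
    pthread_mutex_unlock(&mutex_);
    pthread_cond_signal(&cond_);
    pthread_join(thread_, NULL);
    joined_ = true;
  }

  // Only meaningful after stop() has returned.
  bool exitedCleanly() const { return cleanExit_; }

 private:
  Worker(const Worker&);             // noncopyable (C++03 style)
  Worker& operator=(const Worker&);

  static void* threadMain(void* self) {
    static_cast<Worker*>(self)->loop();
    return NULL;
  }

  void loop() {
    pthread_mutex_lock(&mutex_);
    while (!quit_) {
      pthread_cond_wait(&cond_, &mutex_);  // real work would be picked up here
    }
    pthread_mutex_unlock(&mutex_);  // every lock is released before returning
    cleanExit_ = true;
  }

  pthread_mutex_t mutex_;
  pthread_cond_t cond_;
  pthread_t thread_;
  bool quit_;       // guarded by mutex_
  bool joined_;     // touched only by the owning thread
  bool cleanExit_;  // written by the worker, read after pthread_join()
};
```

Because the thread leaves through its normal control flow, it holds no mutex when it ends, which is exactly what cancellation cannot guarantee.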
pthread_cancel() and C++
In C, we have the concept of a 'cancellation point'
In C++, pthread_cancel() throws an exception in that thread, which helps unwind objects on the stack
The exception must reach the outermost function, otherwise core dump:
  FATAL: exception not rethrown
  Aborted (core dumped)
Always rethrow in the catch(...) clause
  Ulrich Drepper, "Cancellation and C++ Exceptions"
Better: never cancel or kill a thread
exit() is not thread safe in C++
exit() destructs static and global objects (_exit() doesn't)
  The destructor may try to acquire a lock
  The caller may already hold the same lock
  The program ends up in a deadlock
Check the following code for an example of the deadlock
  github.com/chenshuo/recipes/blob/master/thread/test/ExitDeadLock.cc
How to quit a multi-threaded program safely?
  An irregular but simple solution: make the process killable, e.g. p.29
  blog.csdn.net/Solstice/archive/2010/10/19/5950190.aspx
It's not the fault of exit(), but of static or global objects
  Try to avoid static or global objects in C++, except for PODs
Thread local __thread in g++
Thread safe by nature, unless it escapes to another thread
More efficient implementation than pthread_key_t
  See "ELF Handling For Thread-Local Storage"
In C++, it must be initialized with a constant expression
  No: __thread string t_obj("Chen Shuo");
  No: __thread string* t_obj = new string;
  Only: __thread string* t_obj = NULL;
  More rules: http://gcc.gnu.org/onlinedocs/gcc/Thread_002dLocal.html
Use pthread_key_t if you want automatic destruction
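The "Only" form above is usually paired with lazy allocation. A minimal sketch, assuming g++ on Linux; `t_name`, `threadName` and the other functions are hypothetical names used only here:

```cpp
#include <pthread.h>
#include <cassert>
#include <string>

// Allowed: a __thread pointer initialized with a constant expression.
__thread std::string* t_name = NULL;

// Lazily creates the calling thread's own string on first use.
std::string& threadName() {
  if (t_name == NULL) {
    t_name = new std::string("unnamed");
  }
  return *t_name;
}

// __thread gives no automatic destruction, so each thread cleans up itself.
void destroyThreadName() {
  delete t_name;
  t_name = NULL;
}

// Thread function: renames its own copy and reports what it sees.
void* renameInOtherThread(void* out) {
  threadName() = "worker";
  *static_cast<std::string*>(out) = threadName();
  destroyThreadName();
  return NULL;
}

// Returns the name seen by a second thread; the caller's copy is untouched.
std::string nameSeenByWorker() {
  std::string seen;
  pthread_t t;
  int rc = pthread_create(&t, NULL, renameInOtherThread, &seen);
  assert(rc == 0);
  (void)rc;
  pthread_join(t, NULL);
  return seen;
}
```

The explicit `destroyThreadName()` call is the price of `__thread`; a `pthread_key_t` with a destructor function would do that cleanup automatically at thread exit.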
Use non-recursive mutex only
A basic assumption of holding a mutex
  Once I lock it, I can modify the guarded object safely
  Which is not true for a recursive mutex, e.g. http://blog.csdn.net/Solstice/archive/2010/02/12/5307710.aspx#_Toc11928
Recursive mutexes, by David Butenhof
  http://zaval.org/resources/library/butenhof1.html
Recursive locks - a blessing or a curse?
  http://www.thinkingparallel.com/2006/09/27/recursive-locks-a-blessing-or-a-curse/
Impacts of introducing threads
Threading is a late patch to the OS kernel
  The Unix kernel and API were formed in the early 1970s
  The first implementations of threads emerged in the early 1990s
  Threads break lots of assumptions made during those 20 years
Library functions with side effects must be revisited
  malloc/free, fread/fseek can be made thread-safe with locks
Functions that return or use statically allocated space are not thread safe, but may have thread-safe variants
  asctime_r, ctime_r, gmtime_r, rand_r, strerror_r, strtok_r
errno is not an 'extern int', but a per-thread value
  extern int *__errno_location(void);
  #define errno (*__errno_location())
Thread safety of C library
Individual system calls must be thread safe
  Be cautious of interference on the same file descriptor from multiple threads
Most glibc library functions are thread safe nowadays
Counterintuitively, the POSIX standard lists the functions that are not required to be thread safe; it's a blacklist.
  http://pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html#tag_15_09
  2.9.1 Thread-Safety: All functions defined by this volume of POSIX.1-2008 shall be thread-safe, except that the following functions need not be thread-safe.
  Notably, getenv/putenv/setenv/system() are not safe
FILE* functions are thread safe
Read 'man flockfile'
But they are not composable, e.g. fseek() followed by fread()
  The file position may be changed in between by a different thread
  Wrap the sequence with flockfile(FILE*) and funlockfile(FILE*)
The same applies to lseek(2) and read(2), but how to lock?
  Use pread(2) instead, which doesn't change the file offset
In general, a function that calls two thread-safe functions is not guaranteed to be thread-safe
  Just like exception safety, thread safety is not composable
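Both remedies fit in a few lines. A minimal sketch; the helper names `readAt` and `readAtFd` are hypothetical:

```cpp
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>
#include <cassert>

// Seeks and reads as one unit with respect to other threads that use `fp`.
// Returns the number of bytes read.
size_t readAt(FILE* fp, long offset, void* buf, size_t len) {
  assert(fp != NULL);
  flockfile(fp);  // the FILE lock is recursive, so fseek/fread may retake it
  size_t n = 0;
  if (fseek(fp, offset, SEEK_SET) == 0) {
    n = fread(buf, 1, len, fp);
  }
  funlockfile(fp);
  return n;
}

// File-descriptor version: pread(2) takes the offset as an argument and
// leaves the descriptor's file offset alone, so no lock is needed.
ssize_t readAtFd(int fd, off_t offset, void* buf, size_t len) {
  return pread(fd, buf, len, offset);
}
```

The lock only helps if every thread that touches the same `FILE*` goes through such wrappers for its multi-call sequences.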
Thread safety is not composable
A solution that works in a single-threaded program may not apply to a multi-threaded program.
  A solution that calls two or more thread-safe functions is not necessarily correct in a multi-threaded program
What's the time in London now? The program runs in New York
  const char* tz = getenv("TZ"); // NULL if TZ is unset
  string oldTz = tz ? tz : ""; // save TZ
  setenv("TZ", "Europe/London", 1); tzset(); // set TZ to London
  time_t now = time(NULL);
  struct tm localTimeInLN = *localtime(&now); // localtime() takes a pointer
  if (tz) setenv("TZ", oldTz.c_str(), 1); else unsetenv("TZ"); // restore old TZ
  tzset();
This code impacts localtime() in other threads
Thread safe functions are not composable unless you carefully design the interface and interactions
Thread safety of C++ std library
Although not required by the standard, the de facto standard says
  Unshared objects are independent: two threads can freely use different objects without any special action on the caller's part. We call it "same level as built-in types."
  This applies to STL containers like map, vector, string
  Pure functions are safe, e.g. most STL algorithms.
The global cin/cout objects are shared by threads, and are not thread safe. Moreover, they can't be made safe
  cout << a << b; is really cout.operator<<(a).operator<<(b);
  Another thread can write between the two function calls
  Use printf(3) instead; it's thread safe and atomic.
Allocators must be thread safe, as they are shared
Thread-Safe vs. Thread-Efficient
printf(3) and malloc(3) are thread safe, but not necessarily efficient enough, esp. on multi-cores
printf(3) locks FILE* stdout, which synchronizes threads
  not good for multi-threaded logging; we need a better lib
your default malloc(3) may not be optimized for multi-threads and multi-cores
  it may lock the global heap for each allocation
  try tcmalloc, Google's thread-caching malloc
  see Intel, "Is your memory management multi-core ready?"
  http://software.intel.com/en-us/blogs/2009/08/21/is-your-memory-management-multi-core-ready/
Operate one fd in one thread
Although system calls on file descriptors are thread safe
  What if a thread closes a fd while another thread is blocked reading it?
  What happens if a thread adds a fd to the epoll watch list while another thread is epoll_wait()ing on it?
  What happens if two threads poll the same fd, and find it readable simultaneously?
  What if two threads read the same TCP socket but each gets partial data? How do you tell which part comes first?
Rule: all operations on one file descriptor should happen in one thread; it makes your life a lot easier
File descriptors in threads
File descriptors are small integers, unlike HANDLE
  When creating a new fd, the kernel picks the lowest unused one
Higher possibility of cross-talk, if careless, e.g.
  A fd is shared by two threads
  The first thread has just close()d it
  The second is about to read() it
  But a third thread happens to create a new fd with the same id (the lowest available int is reused) in between
  What does the second thread read from? Any other impact?
Solution: manage the resource with the RAII idiom
  And use the usual techniques to manage object life cycles
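A minimal sketch of the RAII part of that solution. `ScopedFd` and `isOpen` are hypothetical names for illustration; the slides do not prescribe a particular class.

```cpp
#include <fcntl.h>
#include <unistd.h>
#include <cassert>

// Hypothetical RAII owner of one file descriptor. The descriptor is closed
// exactly once, in the destructor, so its number cannot be recycled by the
// kernel while the owning object is still alive.
class ScopedFd {
 public:
  explicit ScopedFd(int fd) : fd_(fd) { assert(fd_ >= 0); }
  ~ScopedFd() { ::close(fd_); }
  int get() const { return fd_; }

 private:
  ScopedFd(const ScopedFd&);             // noncopyable (C++03 style)
  ScopedFd& operator=(const ScopedFd&);
  const int fd_;
};

// Helper for demonstrations: true if `fd` currently refers to an open file.
bool isOpen(int fd) { return ::fcntl(fd, F_GETFD) != -1; }
```

The cross-talk scenario disappears once the lifetime of the owning object itself is managed, for example (an assumption here, not stated on the slide) by sharing it between threads through a reference-counted smart pointer so the destructor runs only after the last user is done.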
C++ and fork()
An object could be constructed once but destructed twice
  #include <unistd.h> // fork()
  int main()
  {
    Foo foo; // calls Foo::Foo once
    fork(); // forks into two processes
    // Foo::~Foo is called in the parent *and* the child process
  }
It might be a problem if Foo owns some resource that is not inherited by the child process
Again, avoid static or global objects in C++
  In the child process, the object may not be properly initialized
  A global muduo::Timestamp startTime(now()) is wrong
RAII and fork()
fork() doesn't copy all state
  Open file descriptors are inherited by the child process
    Note that an inherited descriptor shares its file offset with the parent's copy
  The child does not inherit
    its parent's memory locks (mlock(2), mlockall(2))
    record locks from its parent (fcntl(2))
    timers from its parent (setitimer(2), alarm(2), timer_create(2)), and others
So the RAII idiom may not work well in a fork()ed process
  A RAII class that wraps timer_create/timer_delete in its ctor/dtor may fail in the child process after fork()
  Use pthread_atfork() as the last resort
C++ and threads
Use scoped lock guards only, check muduo/base/Mutex.h
Don't allow exceptions to propagate across module boundaries
  Don't let exceptions propagate out of the thread main function; catch all exceptions in the outermost function
  But rethrow the one from pthread_cancel(), as we said before
Don't allow exceptions to propagate out of your callbacks, esp. callbacks from a C library, e.g. the init_routine registered with pthread_once()
Better: don't use exceptions in C++
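For readers without the muduo source at hand, a scoped lock guard can be sketched as below. The `Mutex` and `LockGuard` classes are simplified, hypothetical stand-ins and are not copied from muduo/base/Mutex.h.

```cpp
#include <pthread.h>
#include <cassert>

// Thin wrapper over a non-recursive pthread mutex.
class Mutex {
 public:
  Mutex() {
    int rc = pthread_mutex_init(&mutex_, NULL);
    assert(rc == 0);
    (void)rc;
  }
  ~Mutex() { pthread_mutex_destroy(&mutex_); }
  void lock() { pthread_mutex_lock(&mutex_); }
  void unlock() { pthread_mutex_unlock(&mutex_); }

 private:
  Mutex(const Mutex&);             // noncopyable (C++03 style)
  Mutex& operator=(const Mutex&);
  pthread_mutex_t mutex_;
};

// Locks in the constructor and unlocks in the destructor, so every exit
// path from the enclosing scope releases the mutex.
class LockGuard {
 public:
  explicit LockGuard(Mutex& mutex) : mutex_(mutex) { mutex_.lock(); }
  ~LockGuard() { mutex_.unlock(); }

 private:
  LockGuard(const LockGuard&);
  LockGuard& operator=(const LockGuard&);
  Mutex& mutex_;
};

struct Counter {
  Mutex mutex;
  long value;  // guarded by mutex
};

void* addThousand(void* arg) {
  Counter* c = static_cast<Counter*>(arg);
  for (int i = 0; i < 1000; ++i) {
    LockGuard guard(c->mutex);  // critical section == lifetime of `guard`
    ++c->value;
  }
  return NULL;
}

// Runs two threads that increment one shared counter under the guard.
long countWithTwoThreads() {
  Counter c;
  c.value = 0;
  pthread_t a, b;
  pthread_create(&a, NULL, addThousand, &c);
  pthread_create(&b, NULL, addThousand, &c);
  pthread_join(a, NULL);
  pthread_join(b, NULL);
  return c.value;
}
```

Since `lock()` and `unlock()` are never called by hand in user code, a forgotten unlock on an early return is not possible.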
Threads and fork()
The fork() model doesn't fit well with threads
  A fundamental flaw of POSIX OSes: as other threads disappear in the child, the state is not consistent in the child process
After fork()ing a multi-threaded program you may only call async-signal-safe functions in the child, as if in a signal handler
  malloc() is not safe: another thread may hold the lock when fork()ing, with no chance to unlock in the new process
  The same goes for printf(), pthread_* and others.
The only safe way to use fork() in a multi-threaded program is to call exec() immediately in the child process
  And make sure to set the close-on-exec flag on every file descriptor in the parent process, for security reasons.
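The fork-then-exec rule can be sketched as follows. `runCommand` and `openPrivate` are hypothetical helper names, and running the command through `/bin/sh` is a choice made only for this illustration.

```cpp
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <cassert>

// Runs `sh -c command` in a child process and returns its exit status, or
// -1 on failure. Between fork() and exec the child calls only
// async-signal-safe functions (execv and _exit).
int runCommand(const char* command) {
  assert(command != NULL);
  // Build argv before fork() so the child has nothing left to prepare.
  char* const argv[] = {const_cast<char*>("sh"), const_cast<char*>("-c"),
                        const_cast<char*>(command), NULL};
  pid_t pid = fork();
  if (pid < 0) return -1;
  if (pid == 0) {
    execv("/bin/sh", argv);
    _exit(127);  // exec failed; _exit() skips atexit handlers and destructors
  }
  int status = 0;
  if (waitpid(pid, &status, 0) != pid) return -1;
  return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}

// Opens a file whose descriptor will not leak into exec'ed children.
int openPrivate(const char* path) {
  return open(path, O_RDONLY | O_CLOEXEC);
}
```

Setting `O_CLOEXEC` at open time avoids the window between `open()` and a later `fcntl(fd, F_SETFD, FD_CLOEXEC)` in which another thread could fork.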
Signals and threads
The whole POSIX signal mechanism is a mess
Only async-signal-safe functions can be called in a signal handler; they are also called 'reentrant functions'
Most functions are not async-signal-safe, except those listed in the POSIX standard, so it's a whitelist
  http://pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html#tag_15_04_03_03
  'man 7 signal' to get the list on Linux
None of the pthread_* functions are async-signal-safe; you can't notify a cond var or lock a mutex in a signal handler
Surprisingly, gettimeofday(2) is not async-signal-safe
Deal with signals in MT programs
Rule 1: do not use signals
  don't use them as IPC, e.g. SIGUSR1, SIGUSR2, SIGINT, SIGHUP
  don't use library functions built upon signals, e.g. alarm, sleep, usleep, timer_create, etc.
Rule 2: when you absolutely need one, convert the async signal to a synchronous 'file descriptor readable' event
  use signalfd on recent Linux kernels
    Normally, the set of signals to be received via the file descriptor should be blocked using pthread_sigmask(3), to prevent the signals being handled according to their default dispositions.
  or open a pipe(2), write(2) one byte in the signal handler, and read(2) or poll(2) it in the main thread
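The signalfd route of Rule 2 might look like the sketch below. `openSignalFd` and `readSignal` are hypothetical helper names; in a real program the descriptor would be registered with the event loop instead of being read directly.

```cpp
#include <signal.h>
#include <sys/signalfd.h>
#include <sys/types.h>
#include <unistd.h>
#include <cassert>

// Blocks `signo` in the calling thread and returns a descriptor that becomes
// readable when the signal is pending. Call this before creating other
// threads so that they inherit the blocked signal mask.
int openSignalFd(int signo) {
  sigset_t mask;
  sigemptyset(&mask);
  sigaddset(&mask, signo);
  if (pthread_sigmask(SIG_BLOCK, &mask, NULL) != 0) return -1;
  return signalfd(-1, &mask, SFD_CLOEXEC);
}

// Reads one pending signal from the descriptor (blocking until one arrives).
// Returns the signal number, or -1 on error.
int readSignal(int sfd) {
  struct signalfd_siginfo info;
  ssize_t n = read(sfd, &info, sizeof info);
  if (n != static_cast<ssize_t>(sizeof info)) return -1;
  return static_cast<int>(info.ssi_signo);
}
```

No signal handler runs at all in this scheme, so the async-signal-safe whitelist stops being a concern: the signal is consumed by ordinary code in an ordinary thread.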
Other resources
http://pubs.opengroup.org/onlinepubs/9699919799/
http://www.linuxprogrammingblog.com/threads-and-fork-think-twice-before-using-them
http://www.linuxprogrammingblog.com/all-about-linux-signals
http://www.cppblog.com/lymons/archive/2008/06/01/51838.html
Seven posts in http://www.cppblog.com/lymons/category/9446.html
To be continued
Essentials of non-blocking network programming in C++
Birth of a reactor: design and implementation of Muduo
Avoid static or global objects
Except for PODs