Upload
jignesh-shah
View
2.647
Download
14
Embed Size (px)
Citation preview
Understanding PostgreSQL LWLocks
Jignesh Shah
Staff Engineer, VMware Inc
PgCon 2011 - Ottawa
© 2011 VMware Inc 2
About MySelf
§ Joined VMware in 2010 q PostgreSQL performance on vSphere
§ Previously at Sun Microsystems from 2000-‐2010 q Database Performance on Solaris/Sun Systems
§ Work with PostgreSQL Performance Community q Scaling, BoIlenecks using various workloads
§ My Blog: hIp://jkshah.blogspot.com
© 2011 VMware Inc 3
Content
v What are PostgreSQL LWLocks ? v Understanding modes of LWLock v LWLocks API v Architecture of the LWLocks Framework v Internals of LWLock Wait List v Ways to monitors the events related to LWLocks v Top LWLocks.
© 2011 VMware Inc 4
What are PostgreSQL LWLocks?
§ LWLocks – Light Weight Locks of PostgreSQL § Primary Intent
q Mutually Exclusive access to shared memory structures § Offers Shared and Exclusive mode § Not to be confused with the PostgreSQL Locks § PostgreSQL Locks depends on LWLocks to protect its shared
state § Uses SpinLockAcquire/SpinLockRelease and
PGSemaphoreLock/PGSemaphoreUnlock
© 2011 VMware Inc 5
Understanding Modes of LWLocks
ExclusiveLock § For a par\cular lockid there can be only 1 exclusivelock held
(no shared locks and no other exclusive lock for the lockid) SharedLock • For a par\cular lockid there can be 1 or more SharedLock held
(no exclusive lock for the lockid)
© 2011 VMware Inc 6
LWLocks API
§ LWLockAssign(void) § Get a dynamic LWLock number (mostly at module load)
§ LWLockAcquire(LWLockId lockid, LWLockMode mode) § Waits \ll Lock is acquired in EXCLUSIVE or SHARED mode
§ LWLockCondi\onalAcquire(LWLockId lockid, LWLockMode mode) § Returns False if it not available (Non-‐blocking)
§ LWLockRelease(LWLockId lockid) § Releases the lock
§ LWLockReleaseAll (void) § Used aaer an ERROR to cleanup
§ LWLockHeldByMe (LWLockId lockid) § Debug support test
© 2011 VMware Inc 7
Architecture Flow of LWLocks
LWLockCondi\onalAcquire • Easy lock it if available else return false (Non-‐blocking) LWLockAcquire • Lock if available or else put me in wait queue (FIFO) and got
to sleep on a semaphore (and loop) • Only return when lock is available (Blocking) LWLockRelease • Release the Lock and if there are no more shared locks
remaining then wake up the next process wai\ng in the wait queue
© 2011 VMware Inc 8
Wait Queue List
• Wait Queue is protected by a mutex which is accessed using SpinLocks
• When the lock is released and there are no more shared locks pending then the process releasing the lock will wake the next waiter as follows
• If the next process in wait queue is wai\ng for an EXCLUSIVE mode , only that process is removed from the queue and wakes them up
• Or since the next process is wai\ng for SHARED mode, it will try to remove as many consecu\ve SHARED mode processes and wake them up
© 2011 VMware Inc 9
Lock Acquisition Internals
Spins on ‘lock’ mutex
Lock Available
• If Not Add to Wait Queue List • Release ‘lock’ mutex
Release ‘lock’ mutex
Sleep on its process wait semaphore
© 2011 VMware Inc 10
Lock Release Internals
Release Lock
Shared Locks ?
• Wake Up process next in FIFO • If it is Shared wake up all
sequen\al shared waiters
Release ‘lock’ mutex
Spins on ‘lock’ mutex
© 2011 VMware Inc 11
LWLocks Wait Queue List FIFO Example
• Consider something 80-‐20 Shared /Exclusive ra\o It will wake up first 5 shared locks then wait \ll all of them
releases the lock before waking up the process asking for Exclusive
It will do only exclusive one then and one it releases the lock, wakes up the next 1 shared one
S S S S S S S S S E E
© 2011 VMware Inc 12
Observations about LWLocks Wait Queue List
• All Opera\ons on Wait Queue List are serialized • Not scalable on SMP architecture
• Currently only FIFO supported • No restric\on on shared wakeups
© 2011 VMware Inc 13
LWLocks – Defined in lwlock.h
typedef enum LWLockId {
BufFreelistLock, ShmemIndexLock, OidGenLock, XidGenLock, ProcArrayLock, SInvalReadLock, SInvalWriteLock, WALInsertLock, WALWriteLock, ControlFileLock, CheckpointLock, CLogControlLock, SubtransControlLock, MultiXactGenLock, MultiXactOffsetControlLock, MultiXactMemberControlLock, RelCacheInitLock, BgWriterCommLock, TwoPhaseStateLock, TablespaceCreateLock, BtreeVacuumLock, AddinShmemInitLock, AutovacuumLock, AutovacuumScheduleLock, SyncScanLock, RelationMappingLock, AsyncCtlLock, AsyncQueueLock, SerializableXactHashLock, SerializableFinishedListLock, SerializablePredicateLockListLock, OldSerXidLock, SyncRepLock, /* Individual lock IDs end here */ FirstBufMappingLock, FirstLockMgrLock = FirstBufMappingLock + NUM_BUFFER_PARTITIONS, /*16 */ FirstPredicateLockMgrLock = FirstLockMgrLock + NUM_LOCK_PARTITIONS, /* must be last except for MaxDynamicLWLock: */ NumFixedLWLocks = FirstPredicateLockMgrLock + NUM_PREDICATELOCK_PARTITIONS, MaxDynamicLWLock = 1000000000
} LWLockId;
© 2011 VMware Inc 14
Monitoring
LWLOCK_STAT • Special build with LWLOCK_STAT defined LOCK_DEBUG • Puts debug messages in system alert log DYNAMIC TRACING (DTrace, SystemTap) • Recommended • Useful to find hot locks • Not for produc\on use, because it can make the server
unstable
© 2011 VMware Inc 15
Dynamic Tracing for LWLocks
• postgresql-‐lwlock-‐wait-‐start(lockid,mode) • Postgresql-‐lwlock-‐wait-‐done(lockid,mode) • Postgresql-‐lwlock-‐acquire(lockid,mode) • Postgresql-‐lwlock-‐condacquire(lockid,mode) • Postgresql-‐lwlock-‐condacquire-‐fail(lockid,mode) • Postgresql-‐lwlock-‐release(lockid)
© 2011 VMware Inc 16
Example Monitoring using Systemtap (DBT2)
LOCKNAME LWID M W/A COUNT SUM-TIME(us) MAX-TIME(us) AVG-TIME(us)
WALInsertLock 7 Ex W 14013 2746505 3955 195
WALWriteLock 8 Ex W 10006 25508653 286749 2549
LockMgrLock 55 Ex W 2035 203429 3323 99
LockMgrLock 45 Ex W 932 54297 2860 58
LockMgrLock 54 Ex W 673 24362 1062 36
ProcArrayLock 4 Ex W 515 15907 666 30
ProcArrayLock 4 Sh W 176 6064 97 34
LockMgrLock 56 Ex W 171 5826 376 34
CLogControlLock 11 Sh W 111 22490 6127 202
LockMgrLock 57 Ex W 101 5524 1326 54
LockMgrLock 59 Ex W 79 2883 347 36
CLogControlLock 11 Ex W 58 8543 4439 147
LockMgrLock 49 Ex W 57 1848 76 32
LockMgrLock 47 Ex W 57 3166 1468 55
© 2011 VMware Inc 17
Top Locks - WALWriteLock
• Protects writes on WAL • Acquired when WAL records flushed to disk • Acquired when WAL Log switch occurs • Improve the underlying storage of pg_xlog • Synchronous_commit=off helps indirectly ( Do not wait for
flush to the disk) • Full_page_writes=off also helps reduce the stress (but not
recommended since resiliency goes down)
© 2011 VMware Inc 18
Top Locks - WALInsertLock
• Protects WAL Buffers • Increasing wal buffers may help though to only certain extent • Snychnrous_commit off will lead to increased pressure on this
lock ( Not a bad thing ) • However eventually not much can be done once it gets to a
big problem without new commits • Full_page_writes=off certainly helps (again not recommended
since it reduces the resiliency of the database from write errors)
© 2011 VMware Inc 19
Top Locks - ProcArrayLock
• Protects ProcArray structure • It used to be that every transac\on actually acquired this lock
in exclusive mode before commit causing it to be a top lock • Fixed in 9.0
© 2011 VMware Inc 20
Top Locks - SInvalidReadLock
• Protects sinval array • Readers take “Shared” SInvalReadLock • SICleanupQueue and other array-‐wide updates take
“Excluslive” SInvalReadLock to lock out all readers • Long wait \mes to acquire SInvalidReadLock generally results
when the Shared Buffer pool is being stressed • Increase shared_buffers in postgresql.conf corresponding to
ac\ve data size
© 2011 VMware Inc 21
Top Locks - CLogControlLock
• Protects CLogControl structure • Generally not a problem • If it shows on the top lists check
• $PGDATA/pg_clog should be on buffered file system
© 2011 VMware Inc 22
Example Monitoring using Systemtap (Sysbench simple read)
LOCKNAME LWID M W/A COUNT SUM-TIME(us) MAX-TIME(us) AVG-TIME(us)
LockMgrLock 45 Ex W 85343 469682510 13152 5503
LockMgrLock 57 Ex W 57547 30903727 8313 537
LockMgrLock 44 Ex W 390 34061 1670 87
LockMgrLock 59 Ex W 375 41570 2032 110
LockMgrLock 56 Ex W 361 39685 1889 109
LockMgrLock 47 Ex W 344 24548 1564 71
LockMgrLock 54 Ex W 335 67770 2319 202
LockMgrLock 50 Ex W 325 44213 1690 136
LockMgrLock 49 Ex W 325 39280 1475 120
LockMgrLock 55 Ex W 323 39448 1584 122
LockMgrLock 48 Ex W 323 26982 1669 83
© 2011 VMware Inc 23
Top Locks - LockMgrLocks
• Protects rela\ons • Sets of about 16 Lock Par\\ons by default to handle all
rela\ons • Each rela\on is part of only one par\\on (irrespec\ve of size)
© 2011 VMware Inc 24
Other Locks - BufMappingLocks
• Protects regions of Buffers • Sets of about 16 Regions of Buffers by default to handle the
whole Bufferpool • Only taken shared access
© 2011 VMware Inc 25
Problems with LWLock
• Overall system gravitates to certain top locks • Single mutex lock protects adding and deple\ng the wait
queue • Not SMP Scalable • Chance for op\miza\on out there • Performance limited to serialized rate the locks are
processed
© 2011 VMware Inc 26
Questions / More Information
v Email: [email protected] v Learn more about PostgreSQL
q hIp://www.postgresql.org v Blog: hIp://jkshah.blogspot.com