OPS-12: A New Spin on Some Old Latches
Richard Banville, Fellow
Agenda • Monitoring • Latch usage • Performance
Some Definitions First: Single threading access
• What is a “latch”
  • Synchronization mechanism
  • Latches vs “locks”
• What is a critical section
  • Code access to shared resource
  • Types of resources
• How is a latch obtained
  • test, test-and-set, spin loop
  • Performed on “shared” value
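
The "test, test-and-set, spin loop" idea can be sketched in a few lines. This is only an illustration, not the database's implementation: Python has no hardware test-and-set, so `threading.Lock.acquire(blocking=False)` stands in for the atomic instruction, and the names `SpinLatch`, `try_acquire` and `worker` are made up for the example.

```python
import threading


class SpinLatch:
    """Toy test, test-and-set latch (illustrative only)."""

    def __init__(self):
        # Stand-in for the "shared" latch value. A non-blocking acquire
        # plays the role of the atomic test-and-set instruction.
        self._word = threading.Lock()

    def try_acquire(self):
        # Test: cheap read of the shared value; skip the atomic op if busy.
        if self._word.locked():
            return False
        # Test-and-set: atomically claim the latch; can still lose a race.
        return self._word.acquire(blocking=False)

    def acquire(self):
        # Spin loop: retry until the latch is obtained; return retry count.
        spins = 0
        while not self.try_acquire():
            spins += 1
        return spins

    def release(self):
        self._word.release()


counter = 0
latch = SpinLatch()


def worker(n):
    global counter
    for _ in range(n):
        latch.acquire()
        try:
            counter += 1  # critical section: code access to shared resource
        finally:
            latch.release()


threads = [threading.Thread(target=worker, args=(1000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

The read-only test keeps waiters from hammering the shared value with atomic operations while the latch is busy; only when it looks free do they attempt the test-and-set.
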
Latch Info
VST access to latch information, for each _Latch:
• _Latch-Type: Latch type
• _Latch-Name: Name
• _Latch-Lock: # times locked
• _Latch-Wait: Nap count
• _Latch-Hold: User # of last holder
• _Latch-Qhold: User # of last holder of queue latch
_Latch-Type: MT_LT_QUEUE
• Test, test-and-set spin loop
• Multi level latch
• Wakeup order guaranteed
• -spin 0 forces all latches
(Diagram labels: Queue Latch governor; Semaphore Latch)
_Latch-Type: MT_LT_SPIN
• Test, test-and-set spin loop
• Single level latch
• Most performant
(Diagram labels: Spin Latch; Latch; Short Nap “backoff”, -nap to -napmax)
Latch Implementations: “Mux” latch
Spin or queue latch acts as governor of “mux” latch table
• Associates critical section w/resource
• Multi level latch with one governor
• Underlying one to many relationship
• Turns 1 latch into 128 latches
(Diagram labels: request; governor; Mux latch table (128); Resources protected)
Latch Implementations: Object latch - type 1
Spin or queue latch acts as governor of object latch
• Enhanced mux latch
• Associates critical section w/resource
• Multi level latch with one governor
• Underlying one to one relationship
• Turns 1 latch into many latches
(Diagram labels: request; governor; Resources protected)
Latch Implementations: Object latch - type 2
Spin or queue latch maintained within the resource itself
• Associates critical section w/resource
• Single level latch. No governor
• One to one relationship
• Greatly improves concurrency
  • No single access funneling
• Statistics maintained as if 1 latch exists
(Diagram labels: request; Resources protected)
Latch Implementations: Latch families
Table of spin or queue latches
• Modified “mux” latch mechanism
• Associates critical section w/resource
• Single level latch with no governor
• Underlying one to many relationship
• Statistics typically make this look like one latch
(Diagram labels: request; Latch table; Resources protected)
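
A latch family can be pictured as a table of independent latches with a mapping from resource to slot. The sketch below is an assumption-laden illustration: the class name `LatchFamily`, the CRC-based slot mapping and the use of `threading.Lock` for each slot are all invented for the example; the slides do not say how a resource is mapped to a slot.

```python
import threading
import zlib


class LatchFamily:
    """Table of latches with no governor (illustrative only)."""

    def __init__(self, name, size):
        self.name = name
        self._latches = [threading.Lock() for _ in range(size)]
        self._locks = [0] * size  # per-slot acquire counts

    def slot_for(self, resource_id):
        # Hypothetical mapping: many resources share one slot
        # (the "one to many" relationship).
        return zlib.crc32(str(resource_id).encode()) % len(self._latches)

    def acquire(self, resource_id):
        slot = self.slot_for(resource_id)
        self._latches[slot].acquire()
        self._locks[slot] += 1  # updated while holding the slot's latch
        return slot

    def release(self, slot):
        self._latches[slot].release()

    def lock_count(self):
        # Statistics are summed so the family looks like one latch.
        return sum(self._locks)


family = LatchFamily("BHT", 256)
slot = family.acquire(1056)  # 1056 is just a sample resource id
family.release(slot)
```

Two requests contend only when their resources land in the same slot, which is where the concurrency gain over a single latch comes from.
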
Latch Holder: Who am I waiting on?
• _Latch-QHolder
  • Last owner of queue type latch
  • Not last queued user
• _Latch-Holder
  • Last owner of spin latch
  • Not honored for “true” object latches (Type 2)
• Not necessarily current owner. Why not?
Latch Counts: Activity statistics (or lack thereof)
• _Latch-Lock
  • # times latch locked/acquired
  • High numbers are OK
• _Latch-Wait
  • # times user had to wait (after spin)
  • High numbers may not be good
  • …but is not wrong
  • Performance issue potential
  • Balance with -spin value (see tuning)
Latch Names (_Latch-Name)
• MTX • USR • OM • AIB • BIB • SCC • GST • TXT • TXQ • BFP • BHT • BF1 • BF2 • BF3 • BF4 • BF5 • BF6 • BF7 • CPQ • LRU • LR2 • LR3 • LR4 • L27 • L28 • L29 • L30 • L31 • LKP • LKF • LHT • LHT2 • LHT3 • LHT4 • SEQ • PWQ • AIW • BIW
Agenda • Monitoring • Latch usage • Performance
Miscellaneous Latches
• USR*
  • Protects login (user) control structures
• SCC*
  • Protects schema locking operation
• GST
  • Protects shared memory allocation
*Indicates queue latch
Obsolete Latches: Latches removed or reserved
• Removed
  • AIW, LR2, LR3, LR4
• Not in use
  • BIW
• Renamed & reserved for future use
  • BF5 L27
  • BF6 L28
  • BF7 L29
  • BF8 L30
Latches Protecting Database Update Actions
Update Latches - MTX
Latches protecting recovery logging procedure
Top of the food chain: busiest/longest update latch
• Protects txn allocation & bi/ai note order
• Quiet point maintenance
• High activity
• Online backup startup
• High nap rate
(Diagram labels: MTX; User 1, User 2 and User 3, each with DB Update Action: Record BI note, Record AI note, Perform Action; “Not allowed”)
BIB and AIB latch usage
Bi (ai) buffer life cycle
• OLTP latch order: MTX, BIB, AIB
• Page writer latch order: BIB or AIB
• Very little contention until BIW active
(Diagram labels: Forward Processing; -bibufs 10; Current Output Buffer; Modified Queue; Free List; New Notes (Actions); User; BIW; BI)
Latches – LRU & PWQ
-B least recently used (LRU) chain maintenance
• LRU: High activity, high nap rate
(Diagram: chain of -B buffers running from the MRU end to the LRU end)
Latches – LRU & PWQ
Page writer queue (PWQ)
• PWQ: Very little contention unless too many APWs
(Diagram labels: User; Dirty buffer; LRU; PWQ; Page Writer; myDB)
Latches – LRU & CPQ
Check point queue (CPQ)
• CPQ: Very little contention
(Diagram labels: Checkpoint; CPQ; Page writer queue (PWQ); Page Writer; myDB)
Online backup queue - BFP
Online “point in time” backup
• Backup queue
• Very little contention
• Only used w/online backup
(Diagram labels: User 1; Modify request; Online Backup; MTX; TXE; BFP; Area 6, Area 7, Area 8, BI; .bak; “Back it up and mark it”)
Update Latches: Latches for write operation structures
• TXT
  • Protects transaction table modifications
• TXQ
  • Protects acquire/release of transaction end lock (TXE)
• SEQ
  • Protects sequence control structures
• Moderate activity
• Little contention
Latches Protecting Data Access Actions and Housekeeping
Storage Areas and Object Mapping
(Diagram labels. Languages. Logical: Table A, Table B, Table D, Cust Tbl; Index A-1, Index B-1, Index D-1, Index D-2, Cust# Idx, Name Idx, SRep Idx. Mapping. Storage Engine: Schema Area, Area 7, Area 8. Physical: Extents. Disk Storage Files.)
Object Cache – OM Latch
• OM cache loaded on demand
• Get _StorageObject record
• OM latch needed for paging maintenance
• Very high activity
• High nap rate
(Diagram labels: Languages; Logical tables and indexes; Mapping; OM; -omsize; Schema)
Object Cache - OM Latch
• Primary Cache
  • OM cache loaded at startup
  • No latching, no paging
  • Zero contention!
• Secondary Cache
  • 2nd cache for overflow & new objects
  • Get _StorageObject record
  • OM latch needed for paging maintenance
  • Little contention
• Online Schema changes?
(Diagram labels: Languages; Logical tables and indexes; Mapping; OM; -omsize; Schema)
Buffer Pool Hash Table - BHT
Shared memory block lookup
• Buffer pool location lookup single threaded
• High activity, few naps
(Diagram labels: User 1 to User 4, each with a Block ID; BHT; Hash Table (-hash); Ptr to buffer; List of (-B) buffer pool entries (unordered))
Buffer Pool Hash Table - BHT
BHT now latch family of 256
• Buffer pool location lookup multi-threaded
• High activity, few naps
(Diagram labels: User 1 to User 4, each with a Block ID; BHT latches; Hash Table (-hash); Ptr to buffer; List of (-B) buffer pool entries (unordered))
-B buffer pool entry info – BF[1-4]
Database access latches
• Latch family of 4
  • Buffer pool info structure supports max of 4 concurrent threads
• BF latch family of 4 becomes object latch type 2
  • -B info structure supports -B threads
  • 4 latch slots used for statistics only
(Diagram labels: User 1 to User 4, each with a Block ID; BHT latches; BF1 to BF4; Hash Table (-hash); Ptr to buffer; List of (-B) buffer pool entries (unordered))
Lock Table Latches (LKF, LKP & LHT)
• LKF: Free Chain (-L entries)
  • Plain latch
  • Single threaded access
  • High activity
  • Low nap rate
(Diagram labels: Acquire a lock; Free Chain (-L entries); Hash Chain Anchor Table (-lkhash entries))
Lock Table Latches (LKF, LKP & LHT)
• Type 1 object latch
  • Single threaded latch access
  • Multi threaded chain access
• LHT
  • Record locks
  • Record “get” locks
  • High activity/waits
• LKP
  • Table locks
  • Purged record locks
  • Low activity/waits
• Hash Chain Anchor Table (-lkhash entries): protected by LHT (LKT) and LKP latches
(Diagram labels: Free Chain (-L entries); Hash Chain Anchor Table)
Lock Table Latches (LKF, LKP & LHT): Enhanced concurrency
• Type 1 object latch family: added 4 governors (LHT1, LHT2, LHT3, LHT4)
  • Multi threaded latch access
  • Multi threaded chain access
• Record locks
• Record “get” locks
• High activity, few waits
• Low contention/waits
• Hash Chain Anchor Table (-lkhash entries): protected by LHT[1-4]
(Diagram labels: Free Chain (-L entries); LKP; Hash Chain Anchor Table)
Agenda • Monitoring • Latch usage • Performance
Tuning parameters: Nothing new here
• -spin
  • # retries before nap
  • 0 or 1 or 10,000 or 6,000 * # cpus or …
  • Can be changed online
• -mux
  • Use multiplexing/governor (1)
• -nap
  • Initial amount to nap (10 ms)
• -napmax*
  • Max amount to nap (5 sec)
• -napinc and -napstep
  • Both obsolete and ignored
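
How -spin, -nap and -napmax interact can be sketched as "retry -spin times, then nap, with naps growing from -nap up to -napmax". The following is a rough model under stated assumptions: the function names are hypothetical, the doubling of the nap length is an assumption (the slides only give the start and the ceiling), and the special handling of -spin 0 is not modelled.

```python
import time


def nap_schedule(nap_ms, napmax_ms):
    """Yield nap lengths in ms, starting at nap_ms and capped at napmax_ms."""
    current = nap_ms
    while True:
        yield current
        # Assumption: the nap doubles each time; the real step is not stated.
        current = min(current * 2, napmax_ms)


def acquire_with_backoff(try_acquire, spin, nap_ms=10, napmax_ms=5000,
                         sleep=time.sleep):
    """Spin up to `spin` retries, nap, and repeat. Returns the nap count."""
    naps = 0
    schedule = nap_schedule(nap_ms, napmax_ms)
    while True:
        for _ in range(max(spin, 1)):  # -spin 0 is not modelled here
            if try_acquire():
                return naps
        sleep(next(schedule) / 1000.0)  # back off before spinning again
        naps += 1


# Usage with a stand-in latch that becomes free on the 7th attempt.
attempts = {"n": 0}


def fake_try_acquire():
    attempts["n"] += 1
    return attempts["n"] >= 7


slept = []
nap_count = acquire_with_backoff(fake_try_acquire, spin=3, sleep=slept.append)
```

In this model the returned nap count is the kind of number a wait counter would accumulate: a larger spin value trades CPU time for fewer naps.
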
Foresight and insight: What to monitor
• CPU usage
  • -spin ~6,000 * # CPUs
  • CPU thrashes, decrease -spin
  • > 16 CPUs, avoid cache line ping pong
• Activity matters
• Activity
  • # latches acquired
  • # latch waits
• Application throughput
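
Since the lock and wait counters are cumulative, activity is easier to judge as a rate between two samples. Below is a small sketch of that arithmetic, plus the ~6,000 * # CPUs starting point for -spin. The function names and the sample numbers are invented for illustration; how the counters are read from the database is left out.

```python
def latch_rates(before, after, seconds):
    """Turn two cumulative samples {name: (locks, waits)} into rates."""
    rates = {}
    for name, (locks1, waits1) in after.items():
        locks0, waits0 = before.get(name, (0, 0))
        d_locks = locks1 - locks0
        d_waits = waits1 - waits0
        rates[name] = {
            "locks_per_sec": d_locks / seconds,
            "waits_per_sec": d_waits / seconds,
            # Share of acquisitions that ended in a wait.
            "waits_per_lock": d_waits / d_locks if d_locks else 0.0,
        }
    return rates


def starting_spin(cpus, per_cpu=6000):
    """Rule-of-thumb starting value for -spin: ~6,000 * # CPUs."""
    return per_cpu * cpus


# Made-up counter samples taken 10 seconds apart.
sample_1 = {"MTX": (1000, 10), "LRU": (5000, 0)}
sample_2 = {"MTX": (3000, 60), "LRU": (9000, 0)}
rates = latch_rates(sample_1, sample_2, seconds=10)
spin = starting_spin(cpus=4)
```

Watching these rates next to CPU usage and application throughput gives a basis for raising or lowering -spin rather than guessing.
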
Performance – 10.1C vs 10.1B
(Chart: latches/sec, Housekeeping vs Data Access; labels: LRU 31%, Buf 43%)
Performance – wait for it
4,294 (~9%) fewer latch waits/sec in 10.1C
Performance – you got it!
(Chart: x-axis # Users, from 0 to 992; labels: ~40%, ~56%)
In Summary • Better insight • We’ve done some things • You can too • Tuning • Move to 10.1C
Questions?