s390/spinlock: introduce spinlock wait queueing

The queued spinlock code for s390 follows the principles of the common
code qspinlock implementation but with a few notable differences.

The format of the spinlock_t locking word differs: s390 needs to store
the logical CPU number of the lock holder in the spinlock_t to be able
to use the diagnose 9c directed yield hypervisor call.

The inline code sequences for spin_lock and spin_unlock are nice and
short. The inline portion of a spin_lock now typically looks like this:

	lhi	%r0,0			# 0 indicates an empty lock
	l	%r1,0x3a0		# CPU number + 1 from lowcore
	cs	%r0,%r1,<some_lock>	# lock operation
	jnz	call_wait		# on failure call wait function
locked:
	...
call_wait:
	la	%r2,<some_lock>
	brasl	%r14,arch_spin_lock_wait
	j	locked

A spin_unlock is as simple as before:

	lhi	%r0,0
	sth	%r0,2(%r2)		# unlock operation

After a CPU has queued itself it may not enable interrupts again for the
arch_spin_lock_flags() variant. The arch_spin_lock_wait_flags wait
function is removed.

To improve performance the code implements opportunistic lock stealing.
If the wait function finds a spinlock_t that indicates that the lock is
free but there are queued waiters, the CPU may steal the lock up to three
times without queueing itself. Each steal updates the steal counter in
the lock word to prevent more than three steals. The counter is reset at
the time the CPU next in the queue successfully takes the lock.

While the queued spinlocks improve performance in a system with dedicated
CPUs, in a virtualized environment with continuously overcommitted CPUs
the queued spinlocks can have a negative effect on performance. This is
because a queued CPU that is preempted by the hypervisor will block the
queue at some point even without holding the lock. With the classic
spinlock it does not matter if a CPU that waits for the lock is
preempted. Therefore use the queued spinlock code only if the system runs
with dedicated CPUs and fall back to classic spinlocks when running with
shared CPUs.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
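
The fast path, the unlock, and the bounded lock stealing described in the
commit message can be sketched in portable C11 roughly as follows. This is
an illustration only, not the code from this patch: the sketch_* names and
the exact bit layout (owner in the low 16 bits, a two-bit steal counter
above it, queue information in the remaining bits) are assumptions made for
the example, and the real unlock is a plain halfword store rather than an
atomic read-modify-write.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed lock word layout, for illustration only. */
#define SKETCH_OWNER_MASK 0x0000ffffu /* CPU number + 1 of the holder, 0 = free */
#define SKETCH_STEAL_ADD  0x00010000u /* increment for one steal */
#define SKETCH_STEAL_MASK 0x00030000u /* two-bit counter, saturates at 3 */

typedef struct {
	_Atomic uint32_t word;
} sketch_lock_t;

/* Fast path: succeeds only if the whole lock word is zero (free, no queue). */
static bool sketch_trylock(sketch_lock_t *lp, uint32_t cpu)
{
	uint32_t expected = 0;

	return atomic_compare_exchange_strong_explicit(
		&lp->word, &expected, cpu + 1,
		memory_order_acquire, memory_order_relaxed);
}

/*
 * Opportunistic steal, meant to be tried after the fast path failed:
 * take a lock whose owner field is free without queueing, as long as
 * the steal counter has not reached its limit of three.
 */
static bool sketch_trysteal(sketch_lock_t *lp, uint32_t cpu)
{
	uint32_t old = atomic_load_explicit(&lp->word, memory_order_relaxed);
	uint32_t val;

	if (old & SKETCH_OWNER_MASK)
		return false;	/* lock is held */
	if ((old & SKETCH_STEAL_MASK) == SKETCH_STEAL_MASK)
		return false;	/* steal budget used up */
	val = (old + SKETCH_STEAL_ADD) | (cpu + 1);
	return atomic_compare_exchange_strong_explicit(
		&lp->word, &old, val,
		memory_order_acquire, memory_order_relaxed);
}

/* Head of the queue takes the free lock and resets the steal counter. */
static bool sketch_head_take(sketch_lock_t *lp, uint32_t cpu)
{
	uint32_t old = atomic_load_explicit(&lp->word, memory_order_relaxed);
	uint32_t val;

	if (old & SKETCH_OWNER_MASK)
		return false;	/* lock is held */
	val = (old & ~SKETCH_STEAL_MASK) | (cpu + 1);
	return atomic_compare_exchange_strong_explicit(
		&lp->word, &old, val,
		memory_order_acquire, memory_order_relaxed);
}

/* Unlock: clear only the owner field, keep queue and steal bits. */
static void sketch_unlock(sketch_lock_t *lp)
{
	atomic_fetch_and_explicit(&lp->word, ~SKETCH_OWNER_MASK,
				  memory_order_release);
}
```

In this sketch the queue bits are left untouched by sketch_head_take(); a
real implementation also has to maintain the waiter queue itself, which is
omitted here.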
author: Martin Schwidefsky <schwidefsky@de.ibm.com> 2017-03-24 17:25:02 +0100
committer: Martin Schwidefsky <schwidefsky@de.ibm.com> 2017-09-28 07:29:44 +0200
commit: b96f7d881ad94203e997cd2aa7112d4a06d121ef (patch)
tree: 7c942d7f08610684a3cbce30a3052bde6549b92e /arch/s390/include/asm/lowcore.h
parent: s390/spinlock: use the cpu number +1 as spinlock value (diff)
1 files changed, 3 insertions, 2 deletions
diff --git a/arch/s390/include/asm/lowcore.h b/arch/s390/include/asm/lowcore.h
index a6870ea6ea8b..62943af36ac6 100644
--- a/arch/s390/include/asm/lowcore.h
+++ b/arch/s390/include/asm/lowcore.h
@@ -133,8 +133,9 @@ struct lowcore {
 	__u8	pad_0x03b4[0x03b8-0x03b4];	/* 0x03b4 */
 	__u64	gmap;				/* 0x03b8 */
 	__u32	spinlock_lockval;		/* 0x03c0 */
-	__u32	fpu_flags;			/* 0x03c4 */
-	__u8	pad_0x03c8[0x0400-0x03c8];	/* 0x03c8 */
+	__u32	spinlock_index;			/* 0x03c4 */
+	__u32	fpu_flags;			/* 0x03c8 */
+	__u8	pad_0x03cc[0x0400-0x03cc];	/* 0x03cc */
 
 	/* Per cpu primary space access list */
 	__u32	paste[16];			/* 0x0400 */
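
The offset comments in the hunk above can be checked mechanically. A
minimal sketch, assuming a stand-in struct (lowcore_sketch is a
hypothetical name, and everything before gmap is collapsed into one pad
array) rather than the real struct lowcore:

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the part of struct lowcore touched by the hunk. */
struct lowcore_sketch {
	uint8_t  pad_0x0000[0x03b8];		/* earlier fields, elided */
	uint64_t gmap;				/* 0x03b8 */
	uint32_t spinlock_lockval;		/* 0x03c0 */
	uint32_t spinlock_index;		/* 0x03c4 */
	uint32_t fpu_flags;			/* 0x03c8 */
	uint8_t  pad_0x03cc[0x0400 - 0x03cc];	/* 0x03cc */
	uint32_t paste[16];			/* 0x0400 */
};

/* Compile-time checks that the new field shifts fpu_flags by 4 bytes
 * while the pad shrinks so that paste stays at 0x0400. */
_Static_assert(offsetof(struct lowcore_sketch, spinlock_lockval) == 0x03c0,
	       "spinlock_lockval offset");
_Static_assert(offsetof(struct lowcore_sketch, spinlock_index) == 0x03c4,
	       "spinlock_index offset");
_Static_assert(offsetof(struct lowcore_sketch, fpu_flags) == 0x03c8,
	       "fpu_flags offset");
_Static_assert(offsetof(struct lowcore_sketch, paste) == 0x0400,
	       "paste offset");
```

The point of the shrinking pad is that inserting spinlock_index must not
move any field at or beyond 0x0400.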