From a2db55dda9cf9c24427ccd5da8792b028ebff46c Mon Sep 17 00:00:00 2001
From: Bharata B Rao <bharata@linux.ibm.com>
Date: Thu, 9 Jan 2020 14:50:47 +0530
Subject: powerpc: Ultravisor: Fix the dependencies for CONFIG_PPC_UV

Let PPC_UV depend only on DEVICE_PRIVATE which in turn
will satisfy all the other required dependencies

Fixes: 013a53f2d25a ("powerpc: Ultravisor: Add PPC_UV config option")
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Bharata B Rao <bharata@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200109092047.24043-1-bharata@linux.ibm.com
---
 arch/powerpc/Kconfig | 6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)

(limited to 'arch')

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 1ec34e16ed65..e2a412113359 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -455,11 +455,7 @@ config PPC_TRANSACTIONAL_MEM
 config PPC_UV
 	bool "Ultravisor support"
 	depends on KVM_BOOK3S_HV_POSSIBLE
-	select ZONE_DEVICE
-	select DEV_PAGEMAP_OPS
-	select DEVICE_PRIVATE
-	select MEMORY_HOTPLUG
-	select MEMORY_HOTREMOVE
+	depends on DEVICE_PRIVATE
 	default n
 	help
 	  This option paravirtualizes the kernel to run in POWER platforms that
-- 
cgit v1.2.3-59-g8ed1b


From 17328f218fb760c9c6accc5b52494889243a6b98 Mon Sep 17 00:00:00 2001
From: Frederic Barrat <fbarrat@linux.ibm.com>
Date: Mon, 13 Jan 2020 14:01:18 +0100
Subject: powerpc/xive: Discard ESB load value when interrupt is invalid
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

A load on an ESB page returning all 1's means that the underlying
device has invalidated the access to the PQ state of the interrupt
through mmio. It may happen, for example when querying a PHB interrupt
while the PHB is in an error state.

In that case, we should consider the interrupt to be invalid when
checking its state in the irq_get_irqchip_state() handler.

Fixes: da15c03b047d ("powerpc/xive: Implement get_irqchip_state method for XIVE to fix shutdown race")
Cc: stable@vger.kernel.org # v5.4+
Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
[clg: wrote a commit log, introduced XIVE_ESB_INVALID ]
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200113130118.27969-1-clg@kaod.org
---
 arch/powerpc/include/asm/xive-regs.h |  1 +
 arch/powerpc/sysdev/xive/common.c    | 15 ++++++++++++---
 2 files changed, 13 insertions(+), 3 deletions(-)

(limited to 'arch')

diff --git a/arch/powerpc/include/asm/xive-regs.h b/arch/powerpc/include/asm/xive-regs.h
index f2dfcd50a2d3..33aee7490cbb 100644
--- a/arch/powerpc/include/asm/xive-regs.h
+++ b/arch/powerpc/include/asm/xive-regs.h
@@ -39,6 +39,7 @@
 
 #define XIVE_ESB_VAL_P		0x2
 #define XIVE_ESB_VAL_Q		0x1
+#define XIVE_ESB_INVALID	0xFF
 
 /*
  * Thread Management (aka "TM") registers
diff --git a/arch/powerpc/sysdev/xive/common.c b/arch/powerpc/sysdev/xive/common.c
index f5fadbd2533a..9651ca061828 100644
--- a/arch/powerpc/sysdev/xive/common.c
+++ b/arch/powerpc/sysdev/xive/common.c
@@ -972,12 +972,21 @@ static int xive_get_irqchip_state(struct irq_data *data,
 				  enum irqchip_irq_state which, bool *state)
 {
 	struct xive_irq_data *xd = irq_data_get_irq_handler_data(data);
+	u8 pq;
 
 	switch (which) {
 	case IRQCHIP_STATE_ACTIVE:
-		*state = !xd->stale_p &&
-			 (xd->saved_p ||
-			  !!(xive_esb_read(xd, XIVE_ESB_GET) & XIVE_ESB_VAL_P));
+		pq = xive_esb_read(xd, XIVE_ESB_GET);
+
+		/*
+		 * The esb value being all 1's means we couldn't get
+		 * the PQ state of the interrupt through mmio. It may
+		 * happen, for example when querying a PHB interrupt
+		 * while the PHB is in an error state. We consider the
+		 * interrupt to be inactive in that case.
+		 */
+		*state = (pq != XIVE_ESB_INVALID) && !xd->stale_p &&
+			(xd->saved_p || !!(pq & XIVE_ESB_VAL_P));
 		return 0;
 	default:
 		return -EINVAL;
-- 
cgit v1.2.3-59-g8ed1b


From 5d2e5dd5849b4ef5e8ec35e812cdb732c13cd27e Mon Sep 17 00:00:00 2001
From: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
Date: Wed, 8 Jan 2020 11:14:22 +0530
Subject: powerpc/mm/hash: Fix sharing context ids between kernel & userspace

Commit 0034d395f89d ("powerpc/mm/hash64: Map all the kernel regions in
the same 0xc range") has a bug in the definition of MIN_USER_CONTEXT.

The result is that the context id used for the vmemmap and the lowest
context id handed out to userspace are the same. The context id is
essentially the process identifier as far as the first stage of the
MMU translation is concerned.

This can result in multiple SLB entries with the same VSID (Virtual
Segment ID), accessible to the kernel and some random userspace
process that happens to get the overlapping id, which is not expected
eg:

  07 c00c000008000000 40066bdea7000500  1T  ESID=   c00c00  VSID=      66bdea7 LLP:100
  12 0002000008000000 40066bdea7000d80  1T  ESID=      200  VSID=      66bdea7 LLP:100

Even though the user process and the kernel use the same VSID, the
permissions in the hash page table prevent the user process from
reading or writing to any kernel mappings.

It can also lead to SLB entries with different base page size
encodings (LLP), eg:

  05 c00c000008000000 00006bde0053b500 256M ESID=c00c00000  VSID=    6bde0053b LLP:100
  09 0000000008000000 00006bde0053bc80 256M ESID=        0  VSID=    6bde0053b LLP:  0

Such SLB entries can result in machine checks, eg. as seen on a G5:

  Oops: Machine check, sig: 7 [#1]
  BE PAGE SIZE=64K MU-Hash SMP NR_CPUS=4 NUMA Power Mac
  NIP: c00000000026f248 LR: c000000000295e58 CTR: 0000000000000000
  REGS: c0000000erfd3d70 TRAP: 0200 Tainted: G M (5.5.0-rcl-gcc-8.2.0-00010-g228b667d8ea1)
  MSR: 9000000000109032 <SF,HV,EE,ME,IR,DR,RI> CR: 24282048 XER: 00000000
  DAR: c00c000000612c80 DSISR: 00000400 IRQMASK: 0
  ...
  NIP [c00000000026f248] .kmem_cache_free+0x58/0x140
  LR  [c088000008295e58] .putname 8x88/0xa
  Call Trace:
    .putname+0xB8/0xa
    .filename_lookup.part.76+0xbe/0x160
    .do_faccessat+0xe0/0x380
    system_call+0x5c/ex68

This happens with 256MB segments and 64K pages, as the duplicate VSID
is hit with the first vmemmap segment and the first user segment, and
older 32-bit userspace maps things in the first user segment.

On other CPUs a machine check is not seen. Instead the userspace
process can get stuck continuously faulting, with the fault never
properly serviced, due to the kernel not understanding that there is
already a HPTE for the address but with inaccessible permissions.

On machines with 1T segments we've not seen the bug hit other than by
deliberately exercising it. That seems to be just a matter of luck
though, due to the typical layout of the user virtual address space
and the ranges of vmemmap that are typically populated.

To fix it we add 2 to MIN_USER_CONTEXT. This ensures the lowest
context given to userspace doesn't overlap with the VMEMMAP context,
or with the context for INVALID_REGION_ID.

Fixes: 0034d395f89d ("powerpc/mm/hash64: Map all the kernel regions in the same 0xc range")
Cc: stable@vger.kernel.org # v5.2+
Reported-by: Christian Marillat <marillat@debian.org>
Reported-by: Romain Dolbeau <romain@dolbeau.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
[mpe: Account for INVALID_REGION_ID, mostly rewrite change log]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200123102547.11623-1-mpe@ellerman.id.au
---
 arch/powerpc/include/asm/book3s/64/mmu-hash.h | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

(limited to 'arch')

diff --git a/arch/powerpc/include/asm/book3s/64/mmu-hash.h b/arch/powerpc/include/asm/book3s/64/mmu-hash.h
index 15b75005bc34..3fa1b962dc27 100644
--- a/arch/powerpc/include/asm/book3s/64/mmu-hash.h
+++ b/arch/powerpc/include/asm/book3s/64/mmu-hash.h
@@ -600,8 +600,11 @@ extern void slb_set_size(u16 size);
  *
  */
 #define MAX_USER_CONTEXT	((ASM_CONST(1) << CONTEXT_BITS) - 2)
+
+// The + 2 accounts for INVALID_REGION and 1 more to avoid overlap with kernel
 #define MIN_USER_CONTEXT	(MAX_KERNEL_CTX_CNT + MAX_VMALLOC_CTX_CNT + \
-				 MAX_IO_CTX_CNT + MAX_VMEMMAP_CTX_CNT)
+				 MAX_IO_CTX_CNT + MAX_VMEMMAP_CTX_CNT + 2)
+
 /*
  * For platforms that support on 65bit VA we limit the context bits
  */
-- 
cgit v1.2.3-59-g8ed1b