aboutsummaryrefslogtreecommitdiffstats
path: root/arch/powerpc/include/asm/book3s/32
AgeCommit message (Collapse)AuthorFilesLines
2022-09-26powerpc/mm: Make PAGE_KERNEL_xxx macros grep-friendlyChristophe Leroy1-2/+1
Avoid multi-lines to help getting a complete view when using grep. They still remain under the 100 chars limit. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/3bc3f5a51949ee7f52dba36677db23d4337c7995.1662544980.git.christophe.leroy@csgroup.eu
2022-09-26powerpc/mm: Reduce redundancy in pgtable.hChristophe Leroy1-19/+0
PAGE_KERNEL_TEXT, PAGE_KERNEL_EXEC and PAGE_AGP are the same for all powerpcs. Remove duplicated definitions. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/92254499430d13d99e4a4d7e9ad8e8284cb35380.1662544974.git.christophe.leroy@csgroup.eu
2022-03-25Merge tag 'powerpc-5.18-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linuxLinus Torvalds1-15/+22
Pull powerpc updates from Michael Ellerman: "Livepatch support for 32-bit is probably the standout new feature, otherwise mostly just lots of bits and pieces all over the board. There's a series of commits cleaning up function descriptor handling, which touches a few other arches as well as LKDTM. It has acks from Arnd, Kees and Helge. Summary: - Enforce kernel RO, and implement STRICT_MODULE_RWX for 603. - Add support for livepatch to 32-bit. - Implement CONFIG_DYNAMIC_FTRACE_WITH_ARGS. - Merge vdso64 and vdso32 into a single directory. - Fix build errors with newer binutils. - Add support for UADDR64 relocations, which are emitted by some toolchains. This allows powerpc to build with the latest lld. - Fix (another) potential userspace r13 corruption in transactional memory handling. - Cleanups of function descriptor handling & related fixes to LKDTM. Thanks to Abdul Haleem, Alexey Kardashevskiy, Anders Roxell, Aneesh Kumar K.V, Anton Blanchard, Arnd Bergmann, Athira Rajeev, Bhaskar Chowdhury, Cédric Le Goater, Chen Jingwen, Christophe JAILLET, Christophe Leroy, Corentin Labbe, Daniel Axtens, Daniel Henrique Barboza, David Dai, Fabiano Rosas, Ganesh Goudar, Guo Zhengkui, Hangyu Hua, Haren Myneni, Hari Bathini, Igor Zhbanov, Jakob Koschel, Jason Wang, Jeremy Kerr, Joachim Wiberg, Jordan Niethe, Julia Lawall, Kajol Jain, Kees Cook, Laurent Dufour, Madhavan Srinivasan, Mamatha Inamdar, Maxime Bizon, Maxim Kiselev, Maxim Kochetkov, Michal Suchanek, Nageswara R Sastry, Nathan Lynch, Naveen N. Rao, Nicholas Piggin, Nour-eddine Taleb, Paul Menzel, Ping Fang, Pratik R. Sampat, Randy Dunlap, Ritesh Harjani, Rohan McLure, Russell Currey, Sachin Sant, Segher Boessenkool, Shivaprasad G Bhat, Sourabh Jain, Thierry Reding, Tobias Waldekranz, Tyrel Datwyler, Vaibhav Jain, Vladimir Oltean, Wedson Almeida Filho, and YueHaibing" * tag 'powerpc-5.18-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (179 commits) powerpc/pseries: Fix use after free in remove_phb_dynamic() powerpc/time: improve decrementer clockevent processing powerpc/time: Fix KVM host re-arming a timer beyond decrementer range powerpc/tm: Fix more userspace r13 corruption powerpc/xive: fix return value of __setup handler powerpc/64: Add UADDR64 relocation support powerpc: 8xx: fix a return value error in mpc8xx_pic_init powerpc/ps3: remove unneeded semicolons powerpc/64: Force inlining of prevent_user_access() and set_kuap() powerpc/bitops: Force inlining of fls() powerpc: declare unmodified attribute_group usages const powerpc/spufs: Fix build warning when CONFIG_PROC_FS=n powerpc/secvar: fix refcount leak in format_show() powerpc/64e: Tie PPC_BOOK3E_64 to PPC_FSL_BOOK3E powerpc: Move C prototypes out of asm-prototypes.h powerpc/kexec: Declare kexec_paca static powerpc/smp: Declare current_set static powerpc: Cleanup asm-prototypes.c powerpc/ftrace: Use STK_GOT in ftrace_mprofile.S powerpc/ftrace: Regroup PPC64 specific operations in ftrace_mprofile.S ...
2022-03-21powerpc: Add pmd_pfn()Matthew Wilcox (Oracle)1-2/+2
This is straightforward for everything except nohash64 where we indirect through pmd_page(). There must be a better way to do this. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
2022-02-03powerpc/32s: Make pte_update() non atomic on 603 coreChristophe Leroy1-15/+22
On 603 core, TLB miss handler don't do any change to the page tables so pte_update() doesn't need to be atomic. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/cc89d3c11fc9c742d0df3454a657a3a00be24046.1643538554.git.christophe.leroy@csgroup.eu
2022-01-24powerpc/fixmap: Fix VM debug warning on unmapChristophe Leroy1-0/+1
Unmapping a fixmap entry is done by calling __set_fixmap() with FIXMAP_PAGE_CLEAR as flags. Today, powerpc __set_fixmap() calls map_kernel_page(). map_kernel_page() is not happy when called a second time for the same page. WARNING: CPU: 0 PID: 1 at arch/powerpc/mm/pgtable.c:194 set_pte_at+0xc/0x1e8 CPU: 0 PID: 1 Comm: swapper Not tainted 5.16.0-rc3-s3k-dev-01993-g350ff07feb7d-dirty #682 NIP: c0017cd4 LR: c00187f0 CTR: 00000010 REGS: e1011d50 TRAP: 0700 Not tainted (5.16.0-rc3-s3k-dev-01993-g350ff07feb7d-dirty) MSR: 00029032 <EE,ME,IR,DR,RI> CR: 42000208 XER: 00000000 GPR00: c0165fec e1011e10 c14c0000 c0ee2550 ff800000 c0f3d000 00000000 c001686c GPR08: 00001000 b00045a9 00000001 c0f58460 c0f50000 00000000 c0007e10 00000000 GPR16: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 GPR24: 00000000 00000000 c0ee2550 00000000 c0f57000 00000ff8 00000000 ff800000 NIP [c0017cd4] set_pte_at+0xc/0x1e8 LR [c00187f0] map_kernel_page+0x9c/0x100 Call Trace: [e1011e10] [c0736c68] vsnprintf+0x358/0x6c8 (unreliable) [e1011e30] [c0165fec] __set_fixmap+0x30/0x44 [e1011e40] [c0c13bdc] early_iounmap+0x11c/0x170 [e1011e70] [c0c06cb0] ioremap_legacy_serial_console+0x88/0xc0 [e1011e90] [c0c03634] do_one_initcall+0x80/0x178 [e1011ef0] [c0c0385c] kernel_init_freeable+0xb4/0x250 [e1011f20] [c0007e34] kernel_init+0x24/0x140 [e1011f30] [c0016268] ret_from_kernel_thread+0x5c/0x64 Instruction dump: 7fe3fb78 48019689 80010014 7c630034 83e1000c 5463d97e 7c0803a6 38210010 4e800020 81250000 712a0001 41820008 <0fe00000> 9421ffe0 93e1001c 48000030 Implement unmap_kernel_page() which clears an existing pte. Reported-by: Maxime Bizon <mbizon@freebox.fr> Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Tested-by: Maxime Bizon <mbizon@freebox.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/b0b752f6f6ecc60653e873f385c6f0dce4e9ab6a.1638789098.git.christophe.leroy@csgroup.eu
2022-01-16powerpc/32s: Fix kasan_init_region() for KASANChristophe Leroy1-0/+2
It has been reported some configuration where the kernel doesn't boot with KASAN enabled. This is due to wrong BAT allocation for the KASAN area: ---[ Data Block Address Translation ]--- 0: 0xc0000000-0xcfffffff 0x00000000 256M Kernel rw m 1: 0xd0000000-0xdfffffff 0x10000000 256M Kernel rw m 2: 0xe0000000-0xefffffff 0x20000000 256M Kernel rw m 3: 0xf8000000-0xf9ffffff 0x2a000000 32M Kernel rw m 4: 0xfa000000-0xfdffffff 0x2c000000 64M Kernel rw m A BAT must have both virtual and physical addresses alignment matching the size of the BAT. This is not the case for BAT 4 above. Fix kasan_init_region() by using block_size() function that is in book3s32/mmu.c. To be able to reuse it here, make it non static and change its name to bat_block_size() in order to avoid name conflict with block_size() defined in <linux/blkdev.h> Also reuse find_free_bat() to avoid an error message from setbat() when no BAT is available. And allocate memory outside of linear memory mapping to avoid wasting that precious space. With this change we get correct alignment for BATs and KASAN shadow memory is allocated outside the linear memory space. ---[ Data Block Address Translation ]--- 0: 0xc0000000-0xcfffffff 0x00000000 256M Kernel rw 1: 0xd0000000-0xdfffffff 0x10000000 256M Kernel rw 2: 0xe0000000-0xefffffff 0x20000000 256M Kernel rw 3: 0xf8000000-0xfbffffff 0x7c000000 64M Kernel rw 4: 0xfc000000-0xfdffffff 0x7a000000 32M Kernel rw Fixes: 7974c4732642 ("powerpc/32s: Implement dedicated kasan_init_region()") Cc: stable@vger.kernel.org Reported-by: Maxime Bizon <mbizon@freebox.fr> Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Tested-by: Maxime Bizon <mbizon@freebox.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/7a50ef902494d1325227d47d33dada01e52e5518.1641818726.git.christophe.leroy@csgroup.eu
2021-12-09powerpc/kuap: Add kuap_lock()Christophe Leroy1-5/+9
Add kuap_lock() and call it when entering interrupts from user. It is called kuap_lock() as it is similar to kuap_save_and_lock() without the save. However book3s/32 already have a kuap_lock(). Rename it kuap_lock_addr(). Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/4437e2deb9f6f549f7089d45e9c6f96a7e77905a.1634627931.git.christophe.leroy@csgroup.eu
2021-12-09powerpc/kuap: Remove __kuap_assert_locked()Christophe Leroy1-5/+0
__kuap_assert_locked() is redundant with __kuap_get_and_assert_locked(). Move the verification of CONFIG_PPC_KUAP_DEBUG in kuap_assert_locked() and make it call __kuap_get_and_assert_locked() directly. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1a60198a25d2ba38a37f1b92bc7d096435df4224.1634627931.git.christophe.leroy@csgroup.eu
2021-12-09powerpc/kuap: Check KUAP activation in generic functionsChristophe Leroy1-29/+5
Today, every platform checks that KUAP is not de-activated before doing the real job. Move the verification out of platform specific functions. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/894f110397fcd248e125fb855d1e863e4e633a0d.1634627931.git.christophe.leroy@csgroup.eu
2021-12-09powerpc/kuap: Add a generic intermediate layerChristophe Leroy1-11/+11
Make the following functions generic to all platforms. - bad_kuap_fault() - kuap_assert_locked() - kuap_save_and_lock() (PPC32 only) - kuap_kernel_restore() - kuap_get_and_assert_locked() And for all platforms except book3s/64 - allow_user_access() - prevent_user_access() - prevent_user_access_return() - restore_user_access() Prepend __ in front of the name of platform specific ones. For now the generic just calls the platform specific, but next patch will move redundant parts of specific functions into the generic one. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/eaef143a8dae7288cd34565ffa7b49c16aee1ec3.1634627931.git.christophe.leroy@csgroup.eu
2021-12-09powerpc/32s: Save content of sr0 to avoid 'mfsr'Christophe Leroy1-0/+5
Calling 'mfsr' to get the content of segment registers is heavy, in addition it requires clearing of the 'reserved' bits. In order to avoid this operation, save it in mm context and in thread struct. The saved sr0 is the one used by kernel, this means that on locking entry it can be used as is. For unlocking, the only thing to do is to clear SR_NX. This improves null_syscall selftest by 12 cycles, ie 4%. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/b02baf2ed8f09bad910dfaeeb7353b2ae6830525.1634627931.git.christophe.leroy@csgroup.eu
2021-12-09powerpc/32s: Do kuep_lock() and kuep_unlock() in assemblyChristophe Leroy2-35/+76
When interrupt and syscall entries where converted to C, KUEP locking and unlocking was also converted. It improved performance by unrolling the loop, and allowed easily implementing boot time deactivation of KUEP. However, null_syscall selftest shows that KUEP is still heavy (361 cycles with KUEP, 212 cycles without). A way to improve more is to group 'mtsr's together, instead of repeating 'addi' + 'mtsr' several times. In order to do that, more registers need to be available. In C, GCC will always be able to provide the requested number of registers, but at the cost of saving some data on the stack, which is counter performant here. So let's do it in assembly, when we have full control of which register can be used. It also has the advantage of locking earlier and unlocking later and it helps GCC generating less tricky code. The only drawback is to make boot time deactivation less straight forward and require 'hand' instruction patching. Group 'mtsr's by 4. With this change, null_syscall selftest reports 336 cycles. Without the change it was 361 cycles, that's a 7% reduction. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/115cb279e9b9948dfd93a065e047081c59e3a2a6.1634627931.git.christophe.leroy@csgroup.eu
2021-12-09powerpc/32s: Remove capability to disable KUEP at boottimeChristophe Leroy1-2/+1
Disabling KUEP at boottime makes things unnecessarily complex. Still allow disabling KUEP at build time, but when it's built-in it is always there. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/96f583f82423a29a4205c60b9721079111b35567.1634627931.git.christophe.leroy@csgroup.eu
2021-10-07powerpc/32s: Fix kuap_kernel_restore()Christophe Leroy1-0/+8
At interrupt exit, kuap_kernel_restore() calls kuap_unlock() with the value contained in regs->kuap. However, when regs->kuap contains 0xffffffff it means that KUAP was not unlocked so calling kuap_unlock() is unrelevant and results in jeopardising the contents of kernel space segment registers. So check that regs->kuap doesn't contain KUAP_NONE before calling kuap_unlock(). In the meantime it also means that if KUAP has not been correcly locked back at interrupt exit, it must be locked before continuing. This is done by checking the content of current->thread.kuap which was returned by kuap_get_and_assert_locked() Fixes: 16132529cee5 ("powerpc/32s: Rework Kernel Userspace Access Protection") Reported-by: Stan Johnson <userm57@yahoo.com> Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/0d0c4d0f050a637052287c09ba521bad960a2790.1631715131.git.christophe.leroy@csgroup.eu
2021-08-19powerpc/32s: Fix random crashes by adding isync() after locking/unlocking KUEPChristophe Leroy1-0/+20
Commit b5efec00b671 ("powerpc/32s: Move KUEP locking/unlocking in C") removed the 'isync' instruction after adding/removing NX bit in user segments. The reasoning behind this change was that when setting the NX bit we don't mind it taking effect with delay as the kernel never executes text from userspace, and when clearing the NX bit this is to return to userspace and then the 'rfi' should synchronise the context. However, it looks like on book3s/32 having a hash page table, at least on the G3 processor, we get an unexpected fault from userspace, then this is followed by something wrong in the verification of MSR_PR at end of another interrupt. This is fixed by adding back the removed isync() following update of NX bit in user segment registers. Only do it for cores with an hash table, as 603 cores don't exhibit that problem and the two isync increase ./null_syscall selftest by 6 cycles on an MPC 832x. First problem: unexpected WARN_ON() for mysterious PROTFAULT WARNING: CPU: 0 PID: 1660 at arch/powerpc/mm/fault.c:354 do_page_fault+0x6c/0x5b0 Modules linked in: CPU: 0 PID: 1660 Comm: Xorg Not tainted 5.13.0-pmac-00028-gb3c15b60339a #40 NIP: c001b5c8 LR: c001b6f8 CTR: 00000000 REGS: e2d09e40 TRAP: 0700 Not tainted (5.13.0-pmac-00028-gb3c15b60339a) MSR: 00021032 <ME,IR,DR,RI> CR: 42d04f30 XER: 20000000 GPR00: c000424c e2d09f00 c301b680 e2d09f40 0000001e 42000000 00cba028 00000000 GPR08: 08000000 48000010 c301b680 e2d09f30 22d09f30 00c1fff0 00cba000 a7b7ba4c GPR16: 00000031 00000000 00000000 00000000 00000000 00000000 a7b7b0d0 00c5c010 GPR24: a7b7b64c a7b7d2f0 00000004 00000000 c1efa6c0 00cba02c 00000300 e2d09f40 NIP [c001b5c8] do_page_fault+0x6c/0x5b0 LR [c001b6f8] do_page_fault+0x19c/0x5b0 Call Trace: [e2d09f00] [e2d09f04] 0xe2d09f04 (unreliable) [e2d09f30] [c000424c] DataAccess_virt+0xd4/0xe4 --- interrupt: 300 at 0xa7a261dc NIP: a7a261dc LR: a7a253bc CTR: 00000000 REGS: e2d09f40 TRAP: 0300 Not tainted (5.13.0-pmac-00028-gb3c15b60339a) MSR: 0000d032 <EE,PR,ME,IR,DR,RI> CR: 228428e2 XER: 20000000 DAR: 00cba02c DSISR: 42000000 GPR00: a7a27448 afa6b0e0 a74c35c0 a7b7b614 0000001e a7b7b614 00cba028 00000000 GPR08: 00020fd9 00000031 00cb9ff8 a7a273b0 220028e2 00c1fff0 00cba000 a7b7ba4c GPR16: 00000031 00000000 00000000 00000000 00000000 00000000 a7b7b0d0 00c5c010 GPR24: a7b7b64c a7b7d2f0 00000004 00000002 0000001e a7b7b614 a7b7aff4 00000030 NIP [a7a261dc] 0xa7a261dc LR [a7a253bc] 0xa7a253bc --- interrupt: 300 Instruction dump: 7c4a1378 810300a0 75278410 83820298 83a300a4 553b018c 551e0036 4082038c 2e1b0000 40920228 75280800 41820220 <0fe00000> 3b600000 41920214 81420594 Second problem: MSR PR is seen unset allthough the interrupt frame shows it set kernel BUG at arch/powerpc/kernel/interrupt.c:458! Oops: Exception in kernel mode, sig: 5 [#1] BE PAGE_SIZE=4K MMU=Hash SMP NR_CPUS=2 PowerMac Modules linked in: CPU: 0 PID: 1660 Comm: Xorg Tainted: G W 5.13.0-pmac-00028-gb3c15b60339a #40 NIP: c0011434 LR: c001629c CTR: 00000000 REGS: e2d09e70 TRAP: 0700 Tainted: G W (5.13.0-pmac-00028-gb3c15b60339a) MSR: 00029032 <EE,ME,IR,DR,RI> CR: 42d09f30 XER: 00000000 GPR00: 00000000 e2d09f30 c301b680 e2d09f40 83440000 c44d0e68 e2d09e8c 00000000 GPR08: 00000002 00dc228a 00004000 e2d09f30 22d09f30 00c1fff0 afa6ceb4 00c26144 GPR16: 00c25fb8 00c26140 afa6ceb8 90000000 00c944d8 0000001c 00000000 00200000 GPR24: 00000000 000001fb afa6d1b4 00000001 00000000 a539a2a0 a530fd80 00000089 NIP [c0011434] interrupt_exit_kernel_prepare+0x10/0x70 LR [c001629c] interrupt_return+0x9c/0x144 Call Trace: [e2d09f30] [c000424c] DataAccess_virt+0xd4/0xe4 (unreliable) --- interrupt: 300 at 0xa09be008 NIP: a09be008 LR: a09bdfe8 CTR: a09bdfc0 REGS: e2d09f40 TRAP: 0300 Tainted: G W (5.13.0-pmac-00028-gb3c15b60339a) MSR: 0000d032 <EE,PR,ME,IR,DR,RI> CR: 420028e2 XER: 20000000 DAR: a539a308 DSISR: 0a000000 GPR00: a7b90d50 afa6b2d0 a74c35c0 a0a8b690 a0a8b698 a5365d70 a4fa82a8 00000004 GPR08: 00000000 a09bdfc0 00000000 a5360000 a09bde7c 00c1fff0 afa6ceb4 00c26144 GPR16: 00c25fb8 00c26140 afa6ceb8 90000000 00c944d8 0000001c 00000000 00200000 GPR24: 00000000 000001fb afa6d1b4 00000001 00000000 a539a2a0 a530fd80 00000089 NIP [a09be008] 0xa09be008 LR [a09bdfe8] 0xa09bdfe8 --- interrupt: 300 Instruction dump: 80010024 83e1001c 7c0803a6 4bffff80 3bc00800 4bffffd0 486b42fd 4bffffcc 81430084 71480002 41820038 554a0462 <0f0a0000> 80620060 74630001 40820034 Fixes: b5efec00b671 ("powerpc/32s: Move KUEP locking/unlocking in C") Cc: stable@vger.kernel.org # v5.13+ Reported-by: Stan Johnson <userm57@yahoo.com> Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/4856f5574906e2aec0522be17bf3848a22b2cd0b.1629269345.git.christophe.leroy@csgroup.eu
2021-06-17powerpc/kuap: Remove to/from/size parameters of prevent_user_access()Christophe Leroy1-2/+1
prevent_user_access() doesn't use anymore to/from/size parameters. Remove them. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/b7113662fd2c26e4c33e9d705de324bd3860822e.1622708530.git.christophe.leroy@csgroup.eu
2021-06-17powerpc/kuap: Remove KUAP_CURRENT_XXXChristophe Leroy1-1/+0
book3s/32 was the only user of KUAP_CURRENT_XXX. After rework of book3s/32 KUAP, it is not used anymore. Remove them. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/549214ecf6887d965645e664520d4886663c5ffb.1622708530.git.christophe.leroy@csgroup.eu
2021-06-17powerpc/32s: Rework Kernel Userspace Access ProtectionChristophe Leroy1-70/+78
On book3s/32, KUAP is provided by toggling Ks bit in segment registers. One segment register addresses 256M of virtual memory. At the time being, KUAP implements a complex logic to apply the unlock/lock on the exact number of segments covering the user range to access, with saving the boundaries of the range of segments in a member of thread struct. But most if not all user accesses are within a single segment. Rework KUAP with a different approach: - Open only one segment, the one corresponding to the starting address of the range to be accessed. - If a second segment is involved, it will generate a page fault. The segment will then be open by the page fault handler. The kuap member of thread struct will now contain: - The start address of the current on going user access, that will be used to know which segment to lock at the end of the user access. - ~0 when no user access is open - ~1 when additionnal segments are opened by a page fault. Then, at lock time - When only one segment is open, close it. - When several segments are open, close all user segments. Almost 100% of the time, only one segment will be involved. In interrupts, inline the function that unlock/lock all segments, because not inlining them implies a lot of register save/restore. With the patch, writing value 128 in userspace in perf_copy_attr() is done with 16 instructions: 3890: 93 82 04 dc stw r28,1244(r2) 3894: 7d 20 e5 26 mfsrin r9,r28 3898: 55 29 00 80 rlwinm r9,r9,0,2,0 389c: 7d 20 e1 e4 mtsrin r9,r28 38a0: 4c 00 01 2c isync 38a4: 39 20 00 80 li r9,128 38a8: 91 3c 00 00 stw r9,0(r28) 38ac: 81 42 04 dc lwz r10,1244(r2) 38b0: 39 00 ff ff li r8,-1 38b4: 91 02 04 dc stw r8,1244(r2) 38b8: 2c 0a ff fe cmpwi r10,-2 38bc: 41 82 00 88 beq 3944 <perf_copy_attr+0x36c> 38c0: 7d 20 55 26 mfsrin r9,r10 38c4: 65 29 40 00 oris r9,r9,16384 38c8: 7d 20 51 e4 mtsrin r9,r10 38cc: 4c 00 01 2c isync ... 3944: 48 00 00 01 bl 3944 <perf_copy_attr+0x36c> 3944: R_PPC_REL24 kuap_lock_all_ool Before the patch it was 118 instructions. In reality only 42 are executed in most cases, but GCC is not able to see that a properly aligned user access cannot involve more than one segment. 5060: 39 1d 00 04 addi r8,r29,4 5064: 3d 20 b0 00 lis r9,-20480 5068: 7c 08 48 40 cmplw r8,r9 506c: 40 81 00 08 ble 5074 <perf_copy_attr+0x2cc> 5070: 3d 00 b0 00 lis r8,-20480 5074: 39 28 ff ff addi r9,r8,-1 5078: 57 aa 00 06 rlwinm r10,r29,0,0,3 507c: 55 29 27 3e rlwinm r9,r9,4,28,31 5080: 39 29 00 01 addi r9,r9,1 5084: 7d 29 53 78 or r9,r9,r10 5088: 91 22 04 dc stw r9,1244(r2) 508c: 7d 20 ed 26 mfsrin r9,r29 5090: 55 29 00 80 rlwinm r9,r9,0,2,0 5094: 7c 08 50 40 cmplw r8,r10 5098: 40 81 00 c0 ble 5158 <perf_copy_attr+0x3b0> 509c: 7d 46 50 f8 not r6,r10 50a0: 7c c6 42 14 add r6,r6,r8 50a4: 54 c6 27 be rlwinm r6,r6,4,30,31 50a8: 7d 20 51 e4 mtsrin r9,r10 50ac: 3c ea 10 00 addis r7,r10,4096 50b0: 39 29 01 11 addi r9,r9,273 50b4: 7f 88 38 40 cmplw cr7,r8,r7 50b8: 55 29 02 06 rlwinm r9,r9,0,8,3 50bc: 40 9d 00 9c ble cr7,5158 <perf_copy_attr+0x3b0> 50c0: 2f 86 00 00 cmpwi cr7,r6,0 50c4: 41 9e 00 4c beq cr7,5110 <perf_copy_attr+0x368> 50c8: 2f 86 00 01 cmpwi cr7,r6,1 50cc: 41 9e 00 2c beq cr7,50f8 <perf_copy_attr+0x350> 50d0: 2f 86 00 02 cmpwi cr7,r6,2 50d4: 41 9e 00 14 beq cr7,50e8 <perf_copy_attr+0x340> 50d8: 7d 20 39 e4 mtsrin r9,r7 50dc: 39 29 01 11 addi r9,r9,273 50e0: 3c e7 10 00 addis r7,r7,4096 50e4: 55 29 02 06 rlwinm r9,r9,0,8,3 50e8: 7d 20 39 e4 mtsrin r9,r7 50ec: 39 29 01 11 addi r9,r9,273 50f0: 3c e7 10 00 addis r7,r7,4096 50f4: 55 29 02 06 rlwinm r9,r9,0,8,3 50f8: 7d 20 39 e4 mtsrin r9,r7 50fc: 3c e7 10 00 addis r7,r7,4096 5100: 39 29 01 11 addi r9,r9,273 5104: 7f 88 38 40 cmplw cr7,r8,r7 5108: 55 29 02 06 rlwinm r9,r9,0,8,3 510c: 40 9d 00 4c ble cr7,5158 <perf_copy_attr+0x3b0> 5110: 7d 20 39 e4 mtsrin r9,r7 5114: 39 29 01 11 addi r9,r9,273 5118: 3c c7 10 00 addis r6,r7,4096 511c: 55 29 02 06 rlwinm r9,r9,0,8,3 5120: 7d 20 31 e4 mtsrin r9,r6 5124: 39 29 01 11 addi r9,r9,273 5128: 3c c6 10 00 addis r6,r6,4096 512c: 55 29 02 06 rlwinm r9,r9,0,8,3 5130: 7d 20 31 e4 mtsrin r9,r6 5134: 39 29 01 11 addi r9,r9,273 5138: 3c c7 30 00 addis r6,r7,12288 513c: 55 29 02 06 rlwinm r9,r9,0,8,3 5140: 7d 20 31 e4 mtsrin r9,r6 5144: 3c e7 40 00 addis r7,r7,16384 5148: 39 29 01 11 addi r9,r9,273 514c: 7f 88 38 40 cmplw cr7,r8,r7 5150: 55 29 02 06 rlwinm r9,r9,0,8,3 5154: 41 9d ff bc bgt cr7,5110 <perf_copy_attr+0x368> 5158: 4c 00 01 2c isync 515c: 39 20 00 80 li r9,128 5160: 91 3d 00 00 stw r9,0(r29) 5164: 38 e0 00 00 li r7,0 5168: 90 e2 04 dc stw r7,1244(r2) 516c: 7d 20 ed 26 mfsrin r9,r29 5170: 65 29 40 00 oris r9,r9,16384 5174: 40 81 00 c0 ble 5234 <perf_copy_attr+0x48c> 5178: 7d 47 50 f8 not r7,r10 517c: 7c e7 42 14 add r7,r7,r8 5180: 54 e7 27 be rlwinm r7,r7,4,30,31 5184: 7d 20 51 e4 mtsrin r9,r10 5188: 3d 4a 10 00 addis r10,r10,4096 518c: 39 29 01 11 addi r9,r9,273 5190: 7c 08 50 40 cmplw r8,r10 5194: 55 29 02 06 rlwinm r9,r9,0,8,3 5198: 40 81 00 9c ble 5234 <perf_copy_attr+0x48c> 519c: 2c 07 00 00 cmpwi r7,0 51a0: 41 82 00 4c beq 51ec <perf_copy_attr+0x444> 51a4: 2c 07 00 01 cmpwi r7,1 51a8: 41 82 00 2c beq 51d4 <perf_copy_attr+0x42c> 51ac: 2c 07 00 02 cmpwi r7,2 51b0: 41 82 00 14 beq 51c4 <perf_copy_attr+0x41c> 51b4: 7d 20 51 e4 mtsrin r9,r10 51b8: 39 29 01 11 addi r9,r9,273 51bc: 3d 4a 10 00 addis r10,r10,4096 51c0: 55 29 02 06 rlwinm r9,r9,0,8,3 51c4: 7d 20 51 e4 mtsrin r9,r10 51c8: 39 29 01 11 addi r9,r9,273 51cc: 3d 4a 10 00 addis r10,r10,4096 51d0: 55 29 02 06 rlwinm r9,r9,0,8,3 51d4: 7d 20 51 e4 mtsrin r9,r10 51d8: 3d 4a 10 00 addis r10,r10,4096 51dc: 39 29 01 11 addi r9,r9,273 51e0: 7c 08 50 40 cmplw r8,r10 51e4: 55 29 02 06 rlwinm r9,r9,0,8,3 51e8: 40 81 00 4c ble 5234 <perf_copy_attr+0x48c> 51ec: 7d 20 51 e4 mtsrin r9,r10 51f0: 39 29 01 11 addi r9,r9,273 51f4: 3c ea 10 00 addis r7,r10,4096 51f8: 55 29 02 06 rlwinm r9,r9,0,8,3 51fc: 7d 20 39 e4 mtsrin r9,r7 5200: 39 29 01 11 addi r9,r9,273 5204: 3c e7 10 00 addis r7,r7,4096 5208: 55 29 02 06 rlwinm r9,r9,0,8,3 520c: 7d 20 39 e4 mtsrin r9,r7 5210: 39 29 01 11 addi r9,r9,273 5214: 3c ea 30 00 addis r7,r10,12288 5218: 55 29 02 06 rlwinm r9,r9,0,8,3 521c: 7d 20 39 e4 mtsrin r9,r7 5220: 3d 4a 40 00 addis r10,r10,16384 5224: 39 29 01 11 addi r9,r9,273 5228: 7c 08 50 40 cmplw r8,r10 522c: 55 29 02 06 rlwinm r9,r9,0,8,3 5230: 41 81 ff bc bgt 51ec <perf_copy_attr+0x444> 5234: 4c 00 01 2c isync Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> [mpe: Export the ool handlers to fix build errors] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/d9121f96a7c4302946839a0771f5d1daeeb6968c.1622708530.git.christophe.leroy@csgroup.eu
2021-06-17powerpc/32s: Allow disabling KUAP at boot timeChristophe Leroy1-1/+26
PPC64 uses MMU features to enable/disable KUAP at boot time. But feature fixups are applied way too early on PPC32. Now that all KUAP related actions are in C following the conversion of KUAP initial setup and context switch in C, static branches can be used to enable/disable KUAP. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> [mpe: Export disable_kuap_key to fix build errors] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/cd79e8008455fba5395d099f9bb1305c039b931c.1622708530.git.christophe.leroy@csgroup.eu
2021-06-17powerpc/32s: Allow disabling KUEP at boot timeChristophe Leroy1-1/+5
PPC64 uses MMU features to enable/disable KUEP at boot time. But feature fixups are applied way too early on PPC32. Now that all KUEP related actions are in C following the conversion of KUEP initial setup and context switch in C, static branches can be used to enable/disable KUEP. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/7745a2c3a08ec46302920a3f48d1cb9b5469dbbb.1622708530.git.christophe.leroy@csgroup.eu
2021-06-17powerpc/32s: Simplify calculation of segment register contentChristophe Leroy1-18/+22
segment register has VSID on bits 8-31. Bits 4-7 are reserved, there is no requirement to set them to 0. VSIDs are calculated from VSID of SR0 by adding 0x111. Even with highest possible VSID which would be 0xFFFFF0, adding 16 times 0x111 results in 0x1001100. So, the reserved bits are never overflowed, no need to clear the reserved bits after each calculation. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/ddc1cfd2ec8f3b2395c6a4d7f2b0c1aa1b1e64fb.1622708530.git.christophe.leroy@csgroup.eu
2021-06-17powerpc/32s: Convert switch_mmu_context() to CChristophe Leroy1-0/+5
switch_mmu_context() does things that can easily be done in C. For updating user segments, we have update_user_segments(). As mentionned in commit b5efec00b671 ("powerpc/32s: Move KUEP locking/unlocking in C"), update_user_segments() has the loop unrolled which is a significant performance gain. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/05c0875ad8220c03452c3a334946e207c6ca04d6.1622708530.git.christophe.leroy@csgroup.eu
2021-06-17powerpc/32s: move CTX_TO_VSID() into mmu-hash.hChristophe Leroy1-0/+10
In order to reuse it in switch_mmu_context(), this patch moves CTX_TO_VSID() macro into asm/book3s/32/mmu-hash.h Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/26b36ef2939234a04b37baf6ffe50cba81f5d1b7.1622708530.git.christophe.leroy@csgroup.eu
2021-06-17powerpc/32s: Refactor update of user segment registersChristophe Leroy2-0/+48
KUEP implements the update of user segment registers. Move it into mmu-hash.h in order to use it from other places. And inline kuep_lock() and kuep_unlock(). Inlining kuep_lock() is important for system_call_exception(), otherwise system_call_exception() has to save into stack the system call parameters that are used just after, and doing that takes more instructions than kuep_lock() itself. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/24591ca480d14a62ef910e38a5273d551262c4a2.1622708530.git.christophe.leroy@csgroup.eu
2021-05-17powerpc/32s: Remove asm/book3s/32/hash.hChristophe Leroy2-46/+37
Move the PAGE bits into pgtable.h to be more similar to book3s/64. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/7f4aaa479569328a1e5b07c96c08fbca0cd7dd88.1620307890.git.christophe.leroy@csgroup.eu
2021-05-17powerpc/32s: Speed up likely path of kuap_update_sr()Christophe Leroy1-2/+4
In most cases, kuap_update_sr() will update a single segment register. We know that first update will always be done, if there is no segment register to update at all, kuap_update_sr() is not called. Avoid recurring calculations and tests in that case. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/848f18d213b8341939add7302dc4ef80cc7a12e3.1620307636.git.christophe.leroy@csgroup.eu
2021-04-14powerpc/32s: Define a MODULE area below kernel text all the timeChristophe Leroy1-2/+0
On book3s/32, the segment below kernel text is used for module allocation when CONFIG_STRICT_KERNEL_RWX is defined. In order to benefit from the powerpc specific module_alloc() function which allocate modules with 32 Mbytes from end of kernel text, use that segment below PAGE_OFFSET at all time. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/a46dcdd39a9e80b012d86c294c4e5cd8d31665f3.1617283827.git.christophe.leroy@csgroup.eu
2021-03-29powerpc/32: Manage KUAP in CChristophe Leroy1-49/+1
Move all KUAP management in C. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/199365ddb58d579daf724815f2d0acb91cc49d19.1615552867.git.christophe.leroy@csgroup.eu
2021-03-29powerpc/32s: Create C version of kuap save/restore/check helpersChristophe Leroy1-0/+45
In preparation of porting PPC32 to C syscall entry/exit, create C version of kuap_save_and_lock() and kuap_user_restore() and kuap_kernel_restore() and kuap_assert_locked() and kuap_get_and_assert_locked() on book3s/32. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/2be8fb729da4a0f9863b25e1b9d547174fcd5056.1615552867.git.christophe.leroy@csgroup.eu
2021-03-29powerpc/32s: Move KUEP locking/unlocking in CChristophe Leroy1-31/+0
This can be done in C, do it. Unrolling the loop gains approx. 15% performance. From now on, prepare_transfer_to_handler() is only for interrupts from kernel. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/4eadd873927e9a73c3d1dfe2f9497353465514cf.1615552867.git.christophe.leroy@csgroup.eu
2021-03-24powerpc: Fix misspellings in tlbflush.hZhang Yunkai1-1/+1
The comment marking the end of the include guard is wrong, fix it up. Signed-off-by: Zhang Yunkai <zhang.yunkai@zte.com.cn> [mpe: Rewrite commit message] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210304031318.188447-1-zhang.yunkai@zte.com.cn
2021-02-09powerpc/32s: mfsrin()/mtsrin() become mfsr()/mtsr()Christophe Leroy1-4/+4
Function names should tell what the function does, not how. mfsrin() and mtsrin() are read/writing segment registers. They are called that way because they are using mfsrin and mtsrin instructions, but it doesn't matter for the caller. In preparation of following patch, change their name to mfsr() and mtsr() in order to make it obvious they manipulate segment registers without messing up with how they do it. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/f92d99f4349391b77766745900231aa880a0efb5.1612612022.git.christophe.leroy@csgroup.eu
2021-02-09powerpc: remove unneeded semicolonsChengyang Fan1-1/+1
Remove superfluous semicolons after function definitions. Signed-off-by: Chengyang Fan <cy.fan@huawei.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210125095338.1719405-1-cy.fan@huawei.com
2020-12-17Merge tag 'powerpc-5.11-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linuxLinus Torvalds4-20/+79
Pull powerpc updates from Michael Ellerman: - Switch to the generic C VDSO, as well as some cleanups of our VDSO setup/handling code. - Support for KUAP (Kernel User Access Prevention) on systems using the hashed page table MMU, using memory protection keys. - Better handling of PowerVM SMT8 systems where all threads of a core do not share an L2, allowing the scheduler to make better scheduling decisions. - Further improvements to our machine check handling. - Show registers when unwinding interrupt frames during stack traces. - Improvements to our pseries (PowerVM) partition migration code. - Several series from Christophe refactoring and cleaning up various parts of the 32-bit code. - Other smaller features, fixes & cleanups. Thanks to: Alan Modra, Alexey Kardashevskiy, Andrew Donnellan, Aneesh Kumar K.V, Ard Biesheuvel, Athira Rajeev, Balamuruhan S, Bill Wendling, Cédric Le Goater, Christophe Leroy, Christophe Lombard, Colin Ian King, Daniel Axtens, David Hildenbrand, Frederic Barrat, Ganesh Goudar, Gautham R. Shenoy, Geert Uytterhoeven, Giuseppe Sacco, Greg Kurz, Harish, Jan Kratochvil, Jordan Niethe, Kaixu Xia, Laurent Dufour, Leonardo Bras, Madhavan Srinivasan, Mahesh Salgaonkar, Mathieu Desnoyers, Nathan Lynch, Nicholas Piggin, Oleg Nesterov, Oliver O'Halloran, Oscar Salvador, Po-Hsu Lin, Qian Cai, Qinglang Miao, Randy Dunlap, Ravi Bangoria, Sachin Sant, Sandipan Das, Sebastian Andrzej Siewior , Segher Boessenkool, Srikar Dronamraju, Tyrel Datwyler, Uwe Kleine-König, Vincent Stehlé, Youling Tang, and Zhang Xiaoxu. * tag 'powerpc-5.11-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (304 commits) powerpc/32s: Fix cleanup_cpu_mmu_context() compile bug powerpc: Add config fragment for disabling -Werror powerpc/configs: Add ppc64le_allnoconfig target powerpc/powernv: Rate limit opal-elog read failure message powerpc/pseries/memhotplug: Quieten some DLPAR operations powerpc/ps3: use dma_mapping_error() powerpc: force inlining of csum_partial() to avoid multiple csum_partial() with GCC10 powerpc/perf: Fix Threshold Event Counter Multiplier width for P10 powerpc/mm: Fix hugetlb_free_pmd_range() and hugetlb_free_pud_range() KVM: PPC: Book3S HV: Fix mask size for emulated msgsndp KVM: PPC: fix comparison to bool warning KVM: PPC: Book3S: Assign boolean values to a bool variable powerpc: Inline setup_kup() powerpc/64s: Mark the kuap/kuep functions non __init KVM: PPC: Book3S HV: XIVE: Add a comment regarding VP numbering powerpc/xive: Improve error reporting of OPAL calls powerpc/xive: Simplify xive_do_source_eoi() powerpc/xive: Remove P9 DD1 flag XIVE_IRQ_FLAG_EOI_FW powerpc/xive: Remove P9 DD1 flag XIVE_IRQ_FLAG_MASK_FW powerpc/xive: Remove P9 DD1 flag XIVE_IRQ_FLAG_SHIFT_BUG ...
2020-12-17powerpc/32s: Fix cleanup_cpu_mmu_context() compile bugMichael Ellerman1-0/+1
Currently pmac32_defconfig with SMP=y doesn't build: arch/powerpc/platforms/powermac/smp.c: error: implicit declaration of function 'cleanup_cpu_mmu_context' It would be nice for consistency if all platforms clear mm_cpumask and flush TLBs on unplug, but the TLB invalidation bug described in commit 01b0f0eae081 ("powerpc/64s: Trim offlined CPUs from mm_cpumasks") only applies to 64s and for now we only have the TLB flush code for that platform. So just add an empty version for 32-bit Book3S. Fixes: 01b0f0eae081 ("powerpc/64s: Trim offlined CPUs from mm_cpumasks") Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> [mpe: Change log based on comments from Nick] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2020-12-09powerpc/mm: Move the WARN() out of bad_kuap_fault()Christophe Leroy1-5/+1
In order to prepare the removal of calls to search_exception_tables() on the fast path, move the WARN() out of bad_kuap_fault(). Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/9501311014bd6507e04b27a0c3035186ccf65cd5.1607491748.git.christophe.leroy@csgroup.eu
2020-12-09powerpc/32s: Inline flush_hash_entry()Christophe Leroy1-6/+11
flush_hash_entry() is a simple function calling flush_hash_pages() if it's a hash MMU or doing nothing otherwise. Inline it. And use it also in __ptep_test_and_clear_young(). Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/9af895be7d4b404d40e749a2659552fd138e62c4.1603348103.git.christophe.leroy@csgroup.eu
2020-12-09powerpc/32s: Inline tlb_flush()Christophe Leroy1-0/+11
On book3s/32, tlb_flush() does nothing when the CPU has a hash table, it calls _tlbia() otherwise. Inline it. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/ebc933d1c530a19ef3cf7983f6ae94814f6e92ac.1603348103.git.christophe.leroy@csgroup.eu
2020-12-09powerpc/32s: Split and inline flush_range()Christophe Leroy1-1/+12
flush_range() handle both the MMU_FTR_HPTE_TABLE case and the other case. The non MMU_FTR_HPTE_TABLE case is trivial as it is only a call to _tlbie()/_tlbia() which is not worth a dedicated function. Make flush_range() a hash specific and call it from tlbflush.h based on mmu_has_feature(MMU_FTR_HPTE_TABLE). Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/132ab19aae52abc8e06ab524ec86d4229b5b9c3d.1603348103.git.christophe.leroy@csgroup.eu
2020-12-09powerpc/32s: Inline flush_tlb_range() and flush_tlb_kernel_range()Christophe Leroy1-3/+12
flush_tlb_range() and flush_tlb_kernel_range() are trivial calls to flush_range(). Make flush_range() global and inline flush_tlb_range() and flush_tlb_kernel_range(). Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/c7029a78e78709ad9272d7a44260e06b649169b2.1603348103.git.christophe.leroy@csgroup.eu
2020-12-09powerpc/32s: Split and inline flush_tlb_mm() and flush_tlb_page()Christophe Leroy1-2/+18
flush_tlb_mm() and flush_tlb_page() handle both the MMU_FTR_HPTE_TABLE case and the other case. The non MMU_FTR_HPTE_TABLE case is trivial as it is only a call to _tlbie()/_tlbia() which is not worth a dedicated function. Make flush_tlb_mm() and flush_tlb_page() hash specific and call them from tlbflush.h based on mmu_has_feature(MMU_FTR_HPTE_TABLE). Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/11e932ded41ba6d9b251d89b7afa33cc060d3aa4.1603348103.git.christophe.leroy@csgroup.eu
2020-12-09powerpc/32s: Inline _tlbie() on non SMPChristophe Leroy1-0/+7
On non SMP, _tlbie() is just a tlbie plus a sync instruction. Make it static inline. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/475136425541db5c7c8a0395d19d400525b251bc.1603348103.git.christophe.leroy@csgroup.eu
2020-12-09powerpc/32s: Move _tlbie() and _tlbia() prototypes to tlbflush.hChristophe Leroy1-0/+4
In order to use _tlbie() and _tlbia() directly from asm/book3s/32/tlbflush.h, move their prototypes from mm/mm_decl.h to there. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/867587af929973ad65f8ef6972f2474a80c1737a.1603348103.git.christophe.leroy@csgroup.eu
2020-12-09powerpc/mm: Remove flush_tlb_page_nohash() prototype.Christophe Leroy1-1/+0
flush_tlb_page_nohash() was removed by commit 703b41ad1a87 ("powerpc/mm: remove flush_tlb_page_nohash") Remove stale prototype and comment. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/4a58831da6d6ba4fe309b94aa1dd8f02982d46b2.1603348103.git.christophe.leroy@csgroup.eu
2020-12-08powerpc/book3s64/kuap: Improve error reporting with KUAPAneesh Kumar K.V1-2/+2
This partially reverts commit eb232b162446 ("powerpc/book3s64/kuap: Improve error reporting with KUAP") and update the fault handler to print [ 55.022514] Kernel attempted to access user page (7e6725b70000) - exploit attempt? (uid: 0) [ 55.022528] BUG: Unable to handle kernel data access on read at 0x7e6725b70000 [ 55.022533] Faulting instruction address: 0xc000000000e8b9bc [ 55.022540] Oops: Kernel access of bad area, sig: 11 [#1] .... when the kernel access userspace address without unlocking AMR. bad_kuap_fault() is added as part of commit 5e5be3aed230 ("powerpc/mm: Detect bad KUAP faults") to catch userspace access incorrectly blocked by AMR. Hence retain the full stack dump there even with hash translation. Also, add a comment explaining the difference between hash and radix. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20201208031539.84878-1-aneesh.kumar@linux.ibm.com
2020-12-04powerpc: Fix incorrect stw{, ux, u, x} instructions in __set_pte_atMathieu Desnoyers1-2/+2
The placeholder for instruction selection should use the second argument's operand, which is %1, not %0. This could generate incorrect assembly code if the memory addressing of operand %0 is a different form from that of operand %1. Also remove the %Un placeholder because having %Un placeholders for two operands which are based on the same local var (ptep) doesn't make much sense. By the way, it doesn't change the current behaviour because "<>" constraint is missing for the associated "=m". [chleroy: revised commit log iaw segher's comments and removed %U0] Fixes: 9bf2b5cdc5fe ("powerpc: Fixes for CONFIG_PTE_64BIT for SMP support") Cc: <stable@vger.kernel.org> # v2.6.28+ Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Acked-by: Segher Boessenkool <segher@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/96354bd77977a6a933fe9020da57629007fdb920.1603358942.git.christophe.leroy@csgroup.eu
2020-12-04powerpc/book3s64/kuap: Improve error reporting with KUAPAneesh Kumar K.V1-2/+2
With hash translation use DSISR_KEYFAULT to identify a wrong access. With Radix we look at the AMR value and type of fault. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20201127044424.40686-17-aneesh.kumar@linux.ibm.com
2020-12-04powerpc/vdso: Replace vdso_base by vdsoChristophe Leroy1-1/+1
All other architectures but s390 use a void pointer named 'vdso' to reference the VDSO mapping. In a following patch, the VDSO data page will be put in front of text, vdso_base will then not anymore point to VDSO text. To avoid confusion between vdso_base and VDSO text, rename vdso_base into vdso and make it a void __user *. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/8e6cefe474aa4ceba028abb729485cd46c140990.1601197618.git.christophe.leroy@csgroup.eu
2020-11-16arch: pgtable: define MAX_POSSIBLE_PHYSMEM_BITS where neededArnd Bergmann1-0/+2
Stefan Agner reported a bug when using zsram on 32-bit Arm machines with RAM above the 4GB address boundary: Unable to handle kernel NULL pointer dereference at virtual address 00000000 pgd = a27bd01c [00000000] *pgd=236a0003, *pmd=1ffa64003 Internal error: Oops: 207 [#1] SMP ARM Modules linked in: mdio_bcm_unimac(+) brcmfmac cfg80211 brcmutil raspberrypi_hwmon hci_uart crc32_arm_ce bcm2711_thermal phy_generic genet CPU: 0 PID: 123 Comm: mkfs.ext4 Not tainted 5.9.6 #1 Hardware name: BCM2711 PC is at zs_map_object+0x94/0x338 LR is at zram_bvec_rw.constprop.0+0x330/0xa64 pc : [<c0602b38>] lr : [<c0bda6a0>] psr: 60000013 sp : e376bbe0 ip : 00000000 fp : c1e2921c r10: 00000002 r9 : c1dda730 r8 : 00000000 r7 : e8ff7a00 r6 : 00000000 r5 : 02f9ffa0 r4 : e3710000 r3 : 000fdffe r2 : c1e0ce80 r1 : ebf979a0 r0 : 00000000 Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment user Control: 30c5383d Table: 235c2a80 DAC: fffffffd Process mkfs.ext4 (pid: 123, stack limit = 0x495a22e6) Stack: (0xe376bbe0 to 0xe376c000) As it turns out, zsram needs to know the maximum memory size, which is defined in MAX_PHYSMEM_BITS when CONFIG_SPARSEMEM is set, or in MAX_POSSIBLE_PHYSMEM_BITS on the x86 architecture. The same problem will be hit on all 32-bit architectures that have a physical address space larger than 4GB and happen to not enable sparsemem and include asm/sparsemem.h from asm/pgtable.h. After the initial discussion, I suggested just always defining MAX_POSSIBLE_PHYSMEM_BITS whenever CONFIG_PHYS_ADDR_T_64BIT is set, or provoking a build error otherwise. This addresses all configurations that can currently have this runtime bug, but leaves all other configurations unchanged. I looked up the possible number of bits in source code and datasheets, here is what I found: - on ARC, CONFIG_ARC_HAS_PAE40 controls whether 32 or 40 bits are used - on ARM, CONFIG_LPAE enables 40 bit addressing, without it we never support more than 32 bits, even though supersections in theory allow up to 40 bits as well. - on MIPS, some MIPS32r1 or later chips support 36 bits, and MIPS32r5 XPA supports up to 60 bits in theory, but 40 bits are more than anyone will ever ship - On PowerPC, there are three different implementations of 36 bit addressing, but 32-bit is used without CONFIG_PTE_64BIT - On RISC-V, the normal page table format can support 34 bit addressing. There is no highmem support on RISC-V, so anything above 2GB is unused, but it might be useful to eventually support CONFIG_ZRAM for high pages. Fixes: 61989a80fb3a ("staging: zsmalloc: zsmalloc memory allocation library") Fixes: 02390b87a945 ("mm/zsmalloc: Prepare to variable MAX_PHYSMEM_BITS") Acked-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Reviewed-by: Stefan Agner <stefan@agner.ch> Tested-by: Stefan Agner <stefan@agner.ch> Acked-by: Mike Rapoport <rppt@linux.ibm.com> Link: https://lore.kernel.org/linux-mm/bdfa44bf1c570b05d6c70898e2bbb0acf234ecdf.1604762181.git.stefan@agner.ch/ Signed-off-by: Arnd Bergmann <arnd@arndb.de>