linux-dev - Linux kernel development work

Age	Commit message (Collapse)	Author	Files	Lines
2016-06-17	powerpc/mm/hash: Don't add memory coherence if cache inhibited is set	Aneesh Kumar K.V	1	-5/+9
	H_ENTER hcall handling in qemu had assumptions that a cache inhibited hpte entry won't have memory conference set. Also older kernel mentioned that some version of pHyp required this (the code removed by the below commit says: /* Make pHyp happy */ if ((rflags & _PAGE_NO_CACHE) && !(rflags & _PAGE_WRITETHRU)) hpte_r &= ~HPTE_R_M; But with older kernel we had some inconsistent memory conherence mapping. We always enabled memory conherence in the page fault path and removed memory conherence is _PAGE_NO_CACHE was set when we mapped the page via htab_bolt_mapping. The commit mentioned below tried to consolidate that by always enabling memory conherence. But as mentioned above that breaks Qemu H_ENTER handling. This patch update this such that we enable memory conherence only if cache inhibited is not set and bring fault handling, lpar and bolt mapping in sync. Fixes: commit 30bda41aba4e("powerpc/mm: Drop WIMG in favour of new constant") Reported-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-06-14	powerpc/mm/hash: Use the correct PPP mask when updating HPTE	Aneesh Kumar K.V	2	-4/+5
	With commit e58e87adc8bf9 "powerpc/mm: Update _PAGE_KERNEL_RO" we now use all the three PPP bits. The top bit is now used to have a PPP value of 0b110 which will be mapped to kernel read only. When updating the hpte entry use right mask such that we update the 63rd bit (top 'P' bit) too. Prior to e58e87adc8bf we didn't support KERNEL_RO at all (it was == KERNEL_RW), so this isn't a regression as such. Fixes: e58e87adc8bf ("powerpc/mm: Update _PAGE_KERNEL_RO") Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-06-10	powerpc/mm/radix: Flush page walk cache when freeing page table	Aneesh Kumar K.V	6	-7/+73
	Even though a tlb_flush() does a flush with invalidate all cache, we can end up doing an RCU page table free before calling tlb_flush(). That means we can have page walk cache entries even after we free the page table pages. This can result in us doing wrong page table walk. Avoid this by doing pwc flush on every page table free. We can't batch the pwc flush, because the rcu call back function where we free the page table pages doesn't have information of the mmu gather. Thus we have to do a pwc on every page table page freed. Note: I also removed the dummy tlb_flush_pgtable call functions for hash 32. Fixes: 1a472c9dba6b ("powerpc/mm/radix: Add tlbflush routines") Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-06-10	powerpc/mm/radix: Update to tlb functions ric argument	Aneesh Kumar K.V	1	-21/+22
	Radix invalidate control (RIC) is used to control which cache to flush using tlb instructions. When doing a PID flush, we currently flush everything including page walk cache. For address range flush, we flush only the TLB. In the next patch, we add support for flushing only the page walk cache. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-06-10	powerpc/nohash: Fix build break with 64K pages	Michael Ellerman	1	-1/+1
	Commit 74701d5947a6 "powerpc/mm: Rename function to indicate we are allocating fragments" renamed page_table_free() to pte_fragment_free(). One occurrence was mistyped as pte_fragment_fre(). This only breaks the nohash 64K page build, which is not the default or enabled in any defconfig. Fixes: 74701d5947a6 ("powerpc/mm: Rename function to indicate we are allocating fragments") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-06-08	powerpc/mm/hash: Compute the segment size correctly for ISA 3.0	Aneesh Kumar K.V	1	-1/+5
	PowerISA 3.0 encodes the segment size in the second half of hash page table entry. Update hpte_decode() accordingly. Fixes: 50de596de8be ("powerpc/mm/hash: Add support for Power9 Hash") Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-06-08	powerpc/mm/radix: Fix always false comparison against MMU_NO_CONTEXT	Aneesh Kumar K.V	1	-4/+4
	In some of the radix TLB flush routines, we use a local to store the mm->context.id, AKA the PID. Currently we use an int, but the PID is unsigned long, so large values of PID will be truncated. In particular MMU_NO_CONTEXT is -1, which means all our comparisons against that value can never be true. This means we'll issue TLB flushes when we shouldn't on radix enabled machines. Fix it by using an unsigned long for the local. Discovered by Coverity. Fixes: 1a472c9dba6b ("powerpc/mm/radix: Add tlbflush routines") Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Reviewed-by: Balbir Singh <bsingharora@gmail.com> [mpe: Write change log] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-06-08	of: fix autoloading due to broken modalias with no 'compatible'	Wolfram Sang	1	-1/+1
	Because of an improper dereference, a stray 'C' character was output to the modalias when no 'compatible' was specified. This is the case for some old PowerMac drivers which only set the 'name' property. Fix it to let them match again. Reported-by: Mathieu Malaterre <malat@debian.org> Signed-off-by: Wolfram Sang <wsa@the-dreams.de> Tested-by: Mathieu Malaterre <malat@debian.org> Cc: Philipp Zabel <p.zabel@pengutronix.de> Cc: Andreas Schwab <schwab@linux-m68k.org> Fixes: 6543becf26fff6 ("mod/file2alias: make modalias generation safe for cross compiling") Cc: stable@vger.kernel.org # v3.9+ Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-06-08	powerpc/pseries: Fix IBM_ARCH_VEC_NRCORES_OFFSET since POWER8NVL was added	Michael Ellerman	1	-1/+1
	The recent commit 7cc851039d64 ("powerpc/pseries: Add POWER8NVL support to ibm,client-architecture-support call") added a new PVR mask & value to the start of the ibm_architecture_vec[] array. However it missed the fact that further down in the array, we hard code the offset of one of the fields, and then at boot use that value to patch the value in the array. This means every update to the array must also update the #define, ugh. This means that on pseries machines we will misreport to firmware the number of cores we support, by a factor of threads_per_core. Fix it for now by updating the #define. Fixes: 7cc851039d64 ("powerpc/pseries: Add POWER8NVL support to ibm,client-architecture-support call") Cc: stable@vger.kernel.org # v4.0+ Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-06-06	powerpc/pseries: Fix PCI config address for DDW	Gavin Shan	1	-2/+2
	In commit 8445a87f7092 "powerpc/iommu: Remove the dependency on EEH struct in DDW mechanism", the PE address was replaced with the PCI config address in order to remove dependency on EEH. According to PAPR spec, firmware (pHyp or QEMU) should accept "xxBBSSxx" format PCI config address, not "xxxxBBSS" provided by the patch. Note that "BB" is PCI bus number and "SS" is the combination of slot and function number. This fixes the PCI address passed to DDW RTAS calls. Fixes: 8445a87f7092 ("powerpc/iommu: Remove the dependency on EEH struct in DDW mechanism") Cc: stable@vger.kernel.org # v3.4+ Reported-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Tested-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-06-06	powerpc/ptrace: Fix out of bounds array access warning	Khem Raj	1	-2/+2
	gcc-6 correctly warns about a out of bounds access arch/powerpc/kernel/ptrace.c:407:24: warning: index 32 denotes an offset greater than size of 'u64[32][1] {aka long long unsigned int[32][1]}' [-Warray-bounds] offsetof(struct thread_fp_state, fpr[32][0])); ^ check the end of array instead of beginning of next element to fix this Signed-off-by: Khem Raj <raj.khem@gmail.com> Cc: Kees Cook <keescook@chromium.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Segher Boessenkool <segher@kernel.crashing.org> Tested-by: Aaro Koskinen <aaro.koskinen@iki.fi> Acked-by: Olof Johansson <olof@lixom.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-06-01	powerpc/pseries: Add POWER8NVL support to ibm,client-architecture-support call	Thomas Huth	1	-0/+1
	If we do not provide the PVR for POWER8NVL, a guest on this system currently ends up in PowerISA 2.06 compatibility mode on KVM, since QEMU does not provide a generic PowerISA 2.07 mode yet. So some new instructions from POWER8 (like "mtvsrd") get disabled for the guest, resulting in crashes when using code compiled explicitly for POWER8 (e.g. with the "-mcpu=power8" option of GCC). Fixes: ddee09c099c3 ("powerpc: Add PVR for POWER8NVL processor") Cc: stable@vger.kernel.org # v4.0+ Signed-off-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-06-01	powerpc/mm/radix: Add missing tlb flush	Aneesh Kumar K.V	1	-4/+1
	This should not have any impact on hash, because hash does tlb invalidate with every pte update and we don't implement flush_tlb_* functions for hash. With radix we should make an explicit call to flush tlb outside pte update. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-06-01	powerpc/mm/hash: Fix the reference bit update when handling hash fault	Aneesh Kumar K.V	1	-2/+20
	When we converted the asm routines to C functions, we missed updating HPTE_R_R based on _PAGE_ACCESSED. ASM code used to copy over the lower bits from pte via. andi. r3,r30,0x1fe /* Get basic set of flags */ We also update the code such that we won't update the Change bit ('C' bit) always. This was added by commit c5cf0e30bf3d8 ("powerpc: Fix buglet with MMU hash management"). With hash64, we need to make sure that hardware doesn't do a pte update directly. This is because we do end up with entries in TLB with no hash page table entry. This happens because when we find a hash bucket full, we "evict" a more/less random entry from it. When we do that we don't invalidate the TLB (hpte_remove) because we assume the old translation is still technically "valid". For more info look at commit 0608d692463("powerpc/mm: Always invalidate tlb on hpte invalidate and update"). Thus it's critical that valid hash PTEs always have reference bit set and writeable ones have change bit set. We do this by hashing a non-dirty linux PTE as read-only and always setting _PAGE_ACCESSED (and thus R) when hashing anything else in. Any attempt by Linux at clearing those bits also removes the corresponding hash entry. Commit 5cf0e30bf3d8 did that for 'C' bit by enabling 'C' bit always. We don't really need to do that because we never map a RW pte entry without setting 'C' bit. On READ fault on a RW pte entry, we still map it READ only, hence a store update in the page will still cause a hash pte fault. This patch reverts the part of commit c5cf0e30bf3d8 ("[PATCH] powerpc: Fix buglet with MMU hash management") and retain the updatepp part. - If we hit the updatepp path on native, the old code without that commit, would fail to set C bcause native_hpte_updatepp() was implemented to filter the same bits as H_PROTECT and not let C through thus we would "upgrade" a RO HPTE to RW without setting C thus causing the bug. So the real fix in that commit was the change to native_hpte_updatepp Fixes: 89ff725051d1 ("powerpc/mm: Convert __hash_page_64K to C") Cc: stable@vger.kernel.org # v4.5+ Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-06-01	powerpc/mm/radix: Update LPCR only if it is powernv	Aneesh Kumar K.V	1	-13/+10
	LPCR cannot be updated when running in guest mode. Fixes: 2bfd65e45e87 ("powerpc/mm/radix: Add radix callbacks for early init routines") Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-31	powerpc: Use privileged SPR number for MMCR2	Thomas Huth	1	-1/+1
	We are already using the privileged versions of MMCR0, MMCR1 and MMCRA in the kernel, so for MMCR2, we should better use the privileged versions, too, to be consistent. Fixes: 240686c13687 ("powerpc: Initialise PMU related regs on Power8") Cc: stable@vger.kernel.org # v3.10+ Suggested-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Thomas Huth <thuth@redhat.com> Acked-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-31	powerpc: Fix definition of SIAR and SDAR registers	Thomas Huth	1	-2/+2
	The SIAR and SDAR registers are available twice, one time as SPRs 780 / 781 (unprivileged, but read-only), and one time as the SPRs 796 / 797 (privileged, but read and write). The Linux kernel code currently uses the unprivileged SPRs - while this is OK for reading, writing to that register of course does not work. Since the KVM code tries to write to this register, too (see the mtspr in book3s_hv_rmhandlers.S), the contents of this register sometimes get lost for the guests, e.g. during migration of a VM. To fix this issue, simply switch to the privileged SPR numbers instead. Cc: stable@vger.kernel.org Signed-off-by: Thomas Huth <thuth@redhat.com> Acked-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-30	powerpc/pseries/eeh: Refactor the configure_bridge RTAS tokens	Russell Currey	1	-16/+12
	The RTAS calls "ibm,configure-pe" and "ibm,configure-bridge" perform the same actions, however the former can skip configuration if unnecessary. The existing code treats them as different tokens even though only one will ever be called. Refactor this by making a single token that is assigned during init. Signed-off-by: Russell Currey <ruscur@russell.cc> Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-30	powerpc/pseries/eeh: Handle RTAS delay requests in configure_bridge	Russell Currey	1	-15/+36
	In the "ibm,configure-pe" and "ibm,configure-bridge" RTAS calls, the spec states that values of 9900-9905 can be returned, indicating that software should delay for 10^x (where x is the last digit, i.e. 990x) milliseconds and attempt the call again. Currently, the kernel doesn't know about this, and respecting it fixes some PCI failures when the hypervisor is busy. The delay is capped at 0.2 seconds. Cc: <stable@vger.kernel.org> # 3.10+ Signed-off-by: Russell Currey <ruscur@russell.cc> Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-29	Linux 4.7-rc1	Linus Torvalds	1	-3/+3

2016-05-29	hash_string: Fix zero-length case for !DCACHE_WORD_ACCESS	George Spelvin	1	-2/+2
	The self-test was updated to cover zero-length strings; the function needs to be updated, too. Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: George Spelvin <linux@sciencehorizons.net> Fixes: fcfd2fbf22d2 ("fs/namei.c: Add hashlen_string() function") Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-05-28	Rename other copy of hash_string to hashlen_string	George Spelvin	1	-2/+2
	The original name was simply hash_string(), but that conflicted with a function with that name in drivers/base/power/trace.c, and I decided that calling it "hashlen_" was better anyway. But you have to do it in two places. [ This caused build errors for architectures that don't define CONFIG_DCACHE_WORD_ACCESS - Linus ] Signed-off-by: George Spelvin <linux@sciencehorizons.net> Reported-by: Guenter Roeck <linux@roeck-us.net> Fixes: fcfd2fbf22d2 ("fs/namei.c: Add hashlen_string() function") Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-05-28	hpfs: implement the show_options method	Mikulas Patocka	1	-11/+32
	The HPFS filesystem used generic_show_options to produce string that is displayed in /proc/mounts. However, there is a problem that the options may disappear after remount. If we mount the filesystem with option1 and then remount it with option2, /proc/mounts should show both option1 and option2, however it only shows option2 because the whole option string is replaced with replace_mount_options in hpfs_remount_fs. To fix this bug, implement the hpfs_show_options function that prints options that are currently selected. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-05-28	affs: fix remount failure when there are no options changed	Mikulas Patocka	1	-2/+3
	Commit c8f33d0bec99 ("affs: kstrdup() memory handling") checks if the kstrdup function returns NULL due to out-of-memory condition. However, if we are remounting a filesystem with no change to filesystem-specific options, the parameter data is NULL. In this case, kstrdup returns NULL (because it was passed NULL parameter), although no out of memory condition exists. The mount syscall then fails with ENOMEM. This patch fixes the bug. We fail with ENOMEM only if data is non-NULL. The patch also changes the call to replace_mount_options - if we didn't pass any filesystem-specific options, we don't call replace_mount_options (thus we don't erase existing reported options). Fixes: c8f33d0bec99 ("affs: kstrdup() memory handling") Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Cc: stable@vger.kernel.org # v4.1+ Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-05-28	hpfs: fix remount failure when there are no options changed	Mikulas Patocka	1	-2/+3
	Commit ce657611baf9 ("hpfs: kstrdup() out of memory handling") checks if the kstrdup function returns NULL due to out-of-memory condition. However, if we are remounting a filesystem with no change to filesystem-specific options, the parameter data is NULL. In this case, kstrdup returns NULL (because it was passed NULL parameter), although no out of memory condition exists. The mount syscall then fails with ENOMEM. This patch fixes the bug. We fail with ENOMEM only if data is non-NULL. The patch also changes the call to replace_mount_options - if we didn't pass any filesystem-specific options, we don't call replace_mount_options (thus we don't erase existing reported options). Fixes: ce657611baf9 ("hpfs: kstrdup() out of memory handling") Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-05-28	fs: fix binfmt_aout.c build error	Guenter Roeck	1	-1/+0
	Various builds (such as i386:allmodconfig) fail with fs/binfmt_aout.c:133:2: error: expected identifier or '(' before 'return' fs/binfmt_aout.c:134:1: error: expected identifier or '(' before '}' token [ Oops. My bad, I had stupidly thought that "allmodconfig" covered this on x86-64 too, but it obviously doesn't. Egg on my face. - Linus ] Fixes: 5d22fc25d4fc ("mm: remove more IS_ERR_VALUE abuses") Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-05-28	h8300: Add <asm/hash.h>	George Spelvin	2	-0/+54
	This will improve the performance of hash_32() and hash_64(), but due to complete lack of multi-bit shift instructions on H8, performance will still be bad in surrounding code. Designing H8-specific hash algorithms to work around that is a separate project. (But if the maintainers would like to get in touch...) Signed-off-by: George Spelvin <linux@sciencehorizons.net> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: uclinux-h8-devel@lists.sourceforge.jp
2016-05-28	microblaze: Add <asm/hash.h>	George Spelvin	2	-0/+82
	Microblaze is an FPGA soft core that can be configured various ways. If it is configured without a multiplier, the standard __hash_32() will require a call to __mulsi3, which is a slow software loop. Instead, use a shift-and-add sequence for the constant multiply. GCC knows how to do this, but it's not as clever as some. Signed-off-by: George Spelvin <linux@sciencehorizons.net> Cc: Alistair Francis <alistair.francis@xilinx.com> Cc: Michal Simek <michal.simek@xilinx.com>
2016-05-28	m68k: Add <asm/hash.h>	George Spelvin	2	-0/+60
	This provides a multiply by constant GOLDEN_RATIO_32 = 0x61C88647 for the original mc68000, which lacks a 32x32-bit multiply instruction. Yes, the amount of optimization effort put in is excessive. :-) Shift-add chain found by Yevgen Voronenko's Hcub algorithm at http://spiral.ece.cmu.edu/mcm/gen.html Signed-off-by: George Spelvin <linux@sciencehorizons.net> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Greg Ungerer <gerg@linux-m68k.org> Cc: Andreas Schwab <schwab@linux-m68k.org> Cc: Philippe De Muyter <phdm@macq.eu> Cc: linux-m68k@lists.linux-m68k.org
2016-05-28	<linux/hash.h>: Add support for architecture-specific functions	George Spelvin	6	-4/+299
	This is just the infrastructure; there are no users yet. This is modelled on CONFIG_ARCH_RANDOM; a CONFIG_ symbol declares the existence of <asm/hash.h>. That file may define its own versions of various functions, and define HAVE_* symbols (no CONFIG_ prefix!) to suppress the generic ones. Included is a self-test (in lib/test_hash.c) that verifies the basics. It is NOT in general required that the arch-specific functions compute the same thing as the generic, but if a HAVE_* symbol is defined with the value 1, then equality is tested. Signed-off-by: George Spelvin <linux@sciencehorizons.net> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Greg Ungerer <gerg@linux-m68k.org> Cc: Andreas Schwab <schwab@linux-m68k.org> Cc: Philippe De Muyter <phdm@macq.eu> Cc: linux-m68k@lists.linux-m68k.org Cc: Alistair Francis <alistai@xilinx.com> Cc: Michal Simek <michal.simek@xilinx.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: uclinux-h8-devel@lists.sourceforge.jp
2016-05-28	fs/namei.c: Improve dcache hash function	George Spelvin	1	-40/+81
	Patch 0fed3ac866 improved the hash mixing, but the function is slower than necessary; there's a 7-instruction dependency chain (10 on x86) each loop iteration. Word-at-a-time access is a very tight loop (which is good, because link_path_walk() is one of the hottest code paths in the entire kernel), and the hash mixing function must not have a longer latency to avoid slowing it down. There do not appear to be any published fast hash functions that: 1) Operate on the input a word at a time, and 2) Don't need to know the length of the input beforehand, and 3) Have a single iterated mixing function, not needing conditional branches or unrolling to distinguish different loop iterations. One of the algorithms which comes closest is Yann Collet's xxHash, but that's two dependent multiplies per word, which is too much. The key insights in this design are: 1) Barring expensive ops like multiplies, to diffuse one input bit across 64 bits of hash state takes at least log2(64) = 6 sequentially dependent instructions. That is more cycles than we'd like. 2) An operation like "hash ^= hash << 13" requires a second temporary register anyway, and on a 2-operand machine like x86, it's three instructions. 3) A better use of a second register is to hold a two-word hash state. With careful design, no temporaries are needed at all, so it doesn't increase register pressure. And this gets rid of register copying on 2-operand machines, so the code is smaller and faster. 4) Using two words of state weakens the requirement for one-round mixing; we now have two rounds of mixing before cancellation is possible. 5) A two-word hash state also allows operations on both halves to be done in parallel, so on a superscalar processor we get more mixing in fewer cycles. I ended up using a mixing function inspired by the ChaCha and Speck round functions. It is 6 simple instructions and 3 cycles per iteration (assuming multiply by 9 can be done by an "lea" instruction): x ^= input++; y ^= x; x = ROL(x, K1); x += y; y = ROL(y, K2); y = 9; Not only is this reversible, two consecutive rounds are reversible: if you are given the initial and final states, but not the intermediate state, it is possible to compute both input words. This means that at least 3 words of input are required to create a collision. (It also has the property, used by hash_name() to avoid a branch, that it hashes all-zero to all-zero.) The rotate constants K1 and K2 were found by experiment. The search took a sample of random initial states (I used 1023) and considered the effect of flipping each of the 64 input bits on each of the 128 output bits two rounds later. Each of the 8192 pairs can be considered a biased coin, and adding up the Shannon entropy of all of them produces a score. The best-scoring shifts also did well in other tests (flipping bits in y, trying 3 or 4 rounds of mixing, flipping all 64*63/2 pairs of input bits), so the choice was made with the additional constraint that the sum of the shifts is odd and not too close to the word size. The final state is then folded into a 32-bit hash value by a less carefully optimized multiply-based scheme. This also has to be fast, as pathname components tend to be short (the most common case is one iteration!), but there's some room for latency, as there is a fair bit of intervening logic before the hash value is used for anything. (Performance verified with "bonnie++ -s 0 -n 1536:-2" on tmpfs. I need a better benchmark; the numbers seem to show a slight dip in performance between 4.6.0 and this patch, but they're too noisy to quote.) Special thanks to Bruce fields for diligent testing which uncovered a nasty fencepost error in an earlier version of this patch. [checkpatch.pl formatting complaints noted and respectfully disagreed with.] Signed-off-by: George Spelvin <linux@sciencehorizons.net> Tested-by: J. Bruce Fields <bfields@redhat.com>
2016-05-28	Eliminate bad hash multipliers from hash_32() and hash_64()	George Spelvin	2	-53/+36
	The "simplified" prime multipliers made very bad hash functions, so get rid of them. This completes the work of 689de1d6ca. To avoid the inefficiency which was the motivation for the "simplified" multipliers, hash_64() on 32-bit systems is changed to use a different algorithm. It makes two calls to hash_32() instead. drivers/media/usb/dvb-usb-v2/af9015.c uses the old GOLDEN_RATIO_PRIME_32 for some horrible reason, so it inherits a copy of the old definition. Signed-off-by: George Spelvin <linux@sciencehorizons.net> Cc: Antti Palosaari <crope@iki.fi> Cc: Mauro Carvalho Chehab <m.chehab@samsung.com>
2016-05-28	Change hash_64() return value to 32 bits	George Spelvin	1	-3/+3
	That's all that's ever asked for, and it makes the return type of hash_long() consistent. It also allows (upcoming patch) an optimized implementation of hash_64 on 32-bit machines. I tried adding a BUILD_BUG_ON to ensure the number of bits requested was never more than 32 (most callers use a compile-time constant), but adding <linux/bug.h> to <linux/hash.h> breaks the tools/perf compiler unless tools/perf/MANIFEST is updated, and understanding that code base well enough to update it is too much trouble. I did the rest of an allyesconfig build with such a check, and nothing tripped. Signed-off-by: George Spelvin <linux@sciencehorizons.net>
2016-05-28	<linux/sunrpc/svcauth.h>: Define hash_str() in terms of hashlen_string()	George Spelvin	1	-31/+9
	Finally, the first use of previous two patches: eliminate the separate ad-hoc string hash functions in the sunrpc code. Now hash_str() is a wrapper around hash_string(), and hash_mem() is likewise a wrapper around full_name_hash(). Note that sunrpc code does call hash_mem() with a zero length, which is why the previous patch needed to handle that in full_name_hash(). (Thanks, Bruce, for finding that!) This also eliminates the only caller of hash_long which asks for more than 32 bits of output. The comment about the quality of hashlen_string() and full_name_hash() is jumping the gun by a few patches; they aren't very impressive now, but will be improved greatly later in the series. Signed-off-by: George Spelvin <linux@sciencehorizons.net> Tested-by: J. Bruce Fields <bfields@redhat.com> Acked-by: J. Bruce Fields <bfields@redhat.com> Cc: Jeff Layton <jlayton@poochiereds.net> Cc: linux-nfs@vger.kernel.org
2016-05-28	fs/namei.c: Add hashlen_string() function	George Spelvin	3	-9/+53
	We'd like to make more use of the highly-optimized dcache hash functions throughout the kernel, rather than have every subsystem create its own, and a function that hashes basic null-terminated strings is required for that. (The name is to emphasize that it returns both hash and length.) It's actually useful in the dcache itself, specifically d_alloc_name(). Other uses in the next patch. full_name_hash() is also tweaked to make it more generally useful: 1) Take a "char " rather than "unsigned char " argument, to be consistent with hash_name(). 2) Handle zero-length inputs. If we want more callers, we don't want to make them worry about corner cases. Signed-off-by: George Spelvin <linux@sciencehorizons.net>
2016-05-28	Pull out string hash to <linux/stringhash.h>	George Spelvin	2	-26/+73
	... so they can be used without the rest of <linux/dcache.h> The hashlen_* macros will make sense next patch. Signed-off-by: George Spelvin <linux@sciencehorizons.net>
2016-05-28	Revert "platform/chrome: chromeos_laptop: Add Leon Touch"	Benson Leung	1	-15/+0
	This reverts commit bff3c624dc7261a084a4d25a0b09c3fb0fec872a. Board "Leon" is otherwise known as "Toshiba CB35" and we already have the entry that supports that board as of this commit : 963cb6f platform/chrome: chromeos_laptop - Add Toshiba CB35 Touch Remove this duplicate. Signed-off-by: Benson Leung <bleung@chromium.org> Signed-off-by: Olof Johansson <olof@lixom.net>
2016-05-28	i2c: dev: use after free in detach	Dan Carpenter	1	-1/+1
	The call to put_i2c_dev() frees "i2c_dev" so there is a use after free when we call cdev_del(&i2c_dev->cdev). Fixes: d6760b14d4a1 ('i2c: dev: switch from register_chrdev to cdev API') Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2016-05-28	MIPS: Add missing FROZEN hotplug notifier transitions	Anna-Maria Gleixner	1	-1/+1
	The corresponding FROZEN hotplug notifier transitions used on suspend/resume are ignored. Therefore the switch case action argument is masked with the frozen hotplug notifier transition mask. Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de> Cc: linux-mips@linux-mips.org Cc: linux-kernel@vger.kernel.org Cc: rt@linutronix.de Patchwork: https://patchwork.linux-mips.org/patch/13351/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2016-05-28	MIPS: Build microMIPS VDSO for microMIPS kernels	James Hogan	1	-0/+1
	MicroMIPS kernels may be expected to run on microMIPS only cores which don't support the normal MIPS instruction set, so be sure to pass the -mmicromips flag through to the VDSO cflags. Fixes: ebb5e78cc634 ("MIPS: Initial implementation of a VDSO") Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paul Burton <paul.burton@imgtec.com> Cc: linux-mips@linux-mips.org Cc: <stable@vger.kernel.org> # 4.4.x- Patchwork: https://patchwork.linux-mips.org/patch/13349/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2016-05-28	MIPS: Fix sigreturn via VDSO on microMIPS kernel	James Hogan	1	-8/+0
	In microMIPS kernels, handle_signal() sets the isa16 mode bit in the vdso address so that the sigreturn trampolines (which are offset from the VDSO) get executed as microMIPS. However commit ebb5e78cc634 ("MIPS: Initial implementation of a VDSO") changed the offsets to come from the VDSO image, which already have the isa16 mode bit set correctly since they're extracted from the VDSO shared library symbol table. Drop the isa16 mode bit handling from handle_signal() to fix sigreturn for cores which support both microMIPS and normal MIPS. This doesn't fix microMIPS only cores, since the VDSO is still built for normal MIPS, but thats a separate problem. Fixes: ebb5e78cc634 ("MIPS: Initial implementation of a VDSO") Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paul Burton <paul.burton@imgtec.com> Cc: linux-mips@linux-mips.org Cc: <stable@vger.kernel.org> # 4.4.x- Patchwork: https://patchwork.linux-mips.org/patch/13348/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2016-05-28	MIPS: devicetree: fix cpu interrupt controller node-names	Antony Pavlov	7	-7/+7
	Here is the quote from [1]: The unit-address must match the first address specified in the reg property of the node. If the node has no reg property, the @ and unit-address must be omitted and the node-name alone differentiates the node from other nodes at the same level This patch adjusts MIPS dts-files and devicetree binding documentation in accordance with [1]. [1] Power.org(tm) Standard for Embedded Power Architecture(tm) Platform Requirements (ePAPR). Version 1.1 – 08 April 2011. Chapter 2.2.1.1 Node Name Requirements Signed-off-by: Antony Pavlov <antonynpavlov@gmail.com> Cc: Paul Burton <paul.burton@imgtec.com> Cc: Zubair Lutfullah Kakakhel <Zubair.Kakakhel@imgtec.com> Cc: Rob Herring <robh+dt@kernel.org> Cc: Pawel Moll <pawel.moll@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Ian Campbell <ijc+devicetree@hellion.org.uk> Cc: Kumar Gala <galak@codeaurora.org> Cc: linux-mips@linux-mips.org Cc: devicetree@vger.kernel.org Cc: linux-kernel@vger.kernel.org Patchwork: https://patchwork.linux-mips.org/patch/13345/ Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2016-05-28	MIPS: VDSO: Build with `-fno-strict-aliasing'	Maciej W. Rozycki	1	-1/+2
	Avoid an aliasing issue causing a build error in VDSO: In file included from include/linux/srcu.h:34:0, from include/linux/notifier.h:15, from ./arch/mips/include/asm/uprobes.h:9, from include/linux/uprobes.h:61, from include/linux/mm_types.h:13, from ./arch/mips/include/asm/vdso.h:14, from arch/mips/vdso/vdso.h:27, from arch/mips/vdso/gettimeofday.c:11: include/linux/workqueue.h: In function 'work_static': include/linux/workqueue.h:186:2: error: dereferencing type-punned pointer will break strict-aliasing rules [-Werror=strict-aliasing] return work_data_bits(work) & WORK_STRUCT_STATIC; ^ cc1: all warnings being treated as errors make[2]: ** [arch/mips/vdso/gettimeofday.o] Error 1 with a CONFIG_DEBUG_OBJECTS_WORK configuration and GCC 5.2.0. Include `-fno-strict-aliasing' along with compiler options used, as required for kernel code, fixing a problem present since the introduction of VDSO with commit ebb5e78cc634 ("MIPS: Initial implementation of a VDSO"). Thanks to Tejun for diagnosing this properly! Signed-off-by: Maciej W. Rozycki <macro@imgtec.com> Reviewed-by: James Hogan <james.hogan@imgtec.com> Fixes: ebb5e78cc634 ("MIPS: Initial implementation of a VDSO") Cc: Tejun Heo <tj@kernel.org> Cc: linux-mips@linux-mips.org Cc: stable@vger.kernel.org # v4.3+ Patchwork: https://patchwork.linux-mips.org/patch/13357/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2016-05-28	MIPS: Pistachio: Enable KASLR	Matt Redfearn	2	-2/+7
	Allow KASLR to be selected on Pistachio based systems. Tested on a Creator Ci40. Signed-off-by: Matt Redfearn <matt.redfearn@imgtec.com> Reviewed-by: James Hogan <james.hogan@imgtec.com> Cc: Andrew Bresticker <abrestic@chromium.org> Cc: Jonas Gorski <jogo@openwrt.org> Cc: linux-kernel@vger.kernel.org Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/13356/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2016-05-28	MIPS: lib: Mark intrinsics notrace	Harvey Hunt	7	-7/+7
	On certain MIPS32 devices, the ftrace tracer "function_graph" uses __lshrdi3() during the capturing of trace data. ftrace then attempts to trace __lshrdi3() which leads to infinite recursion and a stack overflow. Fix this by marking __lshrdi3() as notrace. Mark the other compiler intrinsics as notrace in case the compiler decides to use them in the ftrace path. Signed-off-by: Harvey Hunt <harvey.hunt@imgtec.com> Cc: <linux-mips@linux-mips.org> Cc: <linux-kernel@vger.kernel.org> Cc: <stable@vger.kernel.org> # 4.2.x- Patchwork: https://patchwork.linux-mips.org/patch/13354/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2016-05-28	MIPS: Fix 64-bit HTW configuration	James Hogan	1	-2/+12
	The Hardware page Table Walker (HTW) is being misconfigured on 64-bit kernels. The PWSize.PS (pointer size) bit determines whether pointers within directories are loaded as 32-bit or 64-bit addresses, but was never being set to 1 for 64-bit kernels where the unsigned long in pgd_t is 64-bits wide. This actually reduces rather than improves performance when the HTW is enabled on P6600 since the HTW is initiated lots, but walks are all aborted due I think to bad intermediate pointers. Since we were already taking the width of the PTEs into account by setting PWSize.PTEW, which is the left shift applied to the page table index in addition to the native pointer size, we also need to reduce PTEW by 1 when PS=1. This is done by calculating PTEW based on the relative size of pte_t compared to pgd_t. Finally in order for the HTW to be used when PS=1, the appropriate XK/XS/XU bits corresponding to the different 64-bit segments need to be set in PWCtl. We enable only XU for now to enable walking for XUSeg. Supporting walking for XKSeg would be a bit more involved so is left for a future patch. It would either require the use of a per-CPU top level base directory if supported by the HTW (a bit like pgd_current but with a second entry pointing at swapper_pg_dir), or the HTW would prepend bit 63 of the address to the global directory index which doesn't really match how we split user and kernel page directories. Fixes: cab25bc7537b ("MIPS: Extend hardware table walking support to MIPS64") Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paul Burton <paul.burton@imgtec.com> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/13364/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2016-05-28	MIPS: Add 64-bit HTW fields	James Hogan	2	-2/+14
	Add field definitions for some of the 64-bit specific Hardware page Table Walker (HTW) register fields in PWSize and PWCtl, in preparation for fixing the 64-bit HTW configuration. Also print these fields out along with the others in print_htw_config(). Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paul Burton <paul.burton@imgtec.com> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/13363/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2016-05-28	MAINTAINERS: Add file patterns for mips device tree bindings	Geert Uytterhoeven	1	-0/+1
	Submitters of device tree binding documentation may forget to CC the subsystem maintainer if this is missing. Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Cc: linux-mips@linux-mips.org Cc: devicetree@vger.kernel.org Cc: linux-kernel@vger.kernel.org Patchwork: https://patchwork.linux-mips.org/patch/13340/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2016-05-28	MAINTAINERS: Add file patterns for mips brcm device tree bindings	Geert Uytterhoeven	1	-0/+1
	Submitters of device tree binding documentation may forget to CC the subsystem maintainer if this is missing. Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Hauke Mehrtens <hauke@hauke-m.de> Cc: Rafał Miłecki <zajec5@gmail.com> Cc: linux-mips@linux-mips.org Cc: devicetree@vger.kernel.org Cc: linux-kernel@vger.kernel.org Patchwork: https://patchwork.linux-mips.org/patch/13339/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2016-05-28	MIPS: Simplify DSP instruction encoding macros	James Hogan	1	-90/+17
	Simplify the DSP instruction wrapper macros which use explicit encodings for microMIPS and normal MIPS by using the new encoding macros and removing duplication. To me this makes it easier to read since it is much shorter, but it also ensures .insn is used, preventing objdump disassembling the microMIPS code as normal MIPS. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/13314/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>