wireguard-linux - WireGuard for the Linux kernel

Age	Commit message (Collapse)	Author	Files	Lines
2017-03-09	selftests/powerpc: Replace stxvx and lxvx with stxvd2x/lxvd2x	Cyril Bur	1	-24/+24
	On POWER8 (ISA 2.07) lxvx and stxvx are defined to be extended mnemonics of lxvd2x and stxvd2x. For POWER9 (ISA 3.0) the HW architects in their infinite wisdom made lxvx and stxvx instructions in their own right. POWER9 aware GCC will use the POWER9 instruction for lxvx and stxvx causing these selftests to fail on POWER8. Further compounding the issue, because of the way -mvsx works it will cause the power9 instructions to be used regardless of -mcpu=power8 to GCC or -mpower8 to AS. The safest way to address the problem for now is to not use the extended mnemonic. We don't care how the CPU loads the values from memory since the tests only performs register comparisons, so using stdvd2x/lxvd2x does not impact the test. Signed-off-by: Cyril Bur <cyrilbur@gmail.com> Acked-by: Balbir Singh<bsingharora@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-03-09	powerpc/perf: Handle sdar_mode for marked event in power9	Madhavan Srinivasan	2	-7/+37
	MMCRA[SDAR_MODE] specifices how the SDAR should be updated in continous sampling mode. On P9 it must be set to 0b00 when MMCRA[63] is set. Fixes: c7c3f568beff2 ('powerpc/perf: macros for power9 format encoding') Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-03-09	powerpc/perf: Fix perf_get_data_addr() for power9 DD1	Madhavan Srinivasan	1	-0/+2
	Power9 DD1 do not support PMU_HAS_SIER flag and sdsync in perf_get_data_addr() defaults to MMCRA_SDSYNC which is wrong. Since power9 MMCRA does not support SDSYNC bit, patch includes PPMU_NO_SIAR flag to the check and set the sdsync with MMCRA_SAMPLE_ENABLE; Fixes: 27593d72c4ad ("powerpc/perf: Use MSR to report privilege level on P9 DD1") Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-03-08	powerpc/boot: Fix zImage TOC alignment	Michael Ellerman	1	-0/+1
	Recent toolchains force the TOC to be 256 byte aligned. We need to enforce this alignment in the zImage linker script, otherwise pointers to our TOC variables (__toc_start) could be incorrect. If the actual start of the TOC and __toc_start don't have the same value we crash early in the zImage wrapper. Cc: stable@vger.kernel.org Suggested-by: Alan Modra <amodra@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-03-06	ucount: Remove the atomicity from ucount->count	Eric W. Biederman	2	-8/+12
	Always increment/decrement ucount->count under the ucounts_lock. The increments are there already and moving the decrements there means the locking logic of the code is simpler. This simplification in the locking logic fixes a race between put_ucounts and get_ucounts that could result in a use-after-free because the count could go zero then be found by get_ucounts and then be freed by put_ucounts. A bug presumably this one was found by a combination of syzkaller and KASAN. JongWhan Kim reported the syzkaller failure and Dmitry Vyukov spotted the race in the code. Cc: stable@vger.kernel.org Fixes: f6b2db1a3e8d ("userns: Make the count of user namespaces per user") Reported-by: JongHwan Kim <zzoru007@gmail.com> Reported-by: Dmitry Vyukov <dvyukov@google.com> Reviewed-by: Andrei Vagin <avagin@gmail.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2017-03-06	powerpc: Sort the selects under CONFIG_PPC	Michael Ellerman	1	-66/+72
	We have a big list of selects under CONFIG_PPC, and currently they're completely unsorted. This means people tend to add new selects at the bottom of the list, and so two commits which both add a new select will often conflict. Instead sort it alphabetically. This is nicer in and of itself, but also means two commits that add a new select will have a greater chance of not conflicting. Add a note at the top and bottom asking people to keep it sorted. And while we're here pad out the 'if' expressions to make them stand out. Suggested-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-03-06	powerpc/64: Fix L1D cache shape vector reporting L1I values	Michael Ellerman	1	-2/+2
	It seems we didn't pay quite enough attention when testing the new cache shape vectors, which means we didn't notice the bug where the vector for the L1D was using the L1I values. Fix it, resulting in eg: L1I cache size: 0x8000 32768B 32K L1I line size: 0x80 8-way associative L1D cache size: 0x10000 65536B 64K L1D line size: 0x80 8-way associative Fixes: 98a5f361b862 ("powerpc: Add new cache geometry aux vectors") Cut-and-paste-bug-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Badly-reviewed-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-03-06	powerpc/64: Avoid panic during boot due to divide by zero in init_cache_info()	Anton Blanchard	1	-1/+4
	I see a panic in early boot when building with a recent gcc toolchain. The issue is a divide by zero, which is undefined. Older toolchains let us get away with it: int foo(int a) { return a / 0; } foo: li 9,0 divw 3,3,9 extsw 3,3 blr But newer ones catch it: foo: trap Add a check to avoid the divide by zero. Fixes: e2827fe5c156 ("powerpc/64: Clean up ppc64_caches using a struct per cache") Signed-off-by: Anton Blanchard <anton@samba.org> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-03-06	powerpc: Update to new option-vector-5 format for CAS	Suraj Jitindar Singh	3	-14/+150
	On POWER9 the ibm,client-architecture-support (CAS) negotiation process has been updated to change how the host to guest negotiation is done for the new hash/radix mmu as well as the nest mmu, process tables and guest translation shootdown (GTSE). This is documented in the unreleased PAPR ACR "CAS option vector additions for P9". The host tells the guest which options it supports in ibm,arch-vec-5-platform-support. The guest then chooses a subset of these to request in the CAS call and these are agreed to in the ibm,architecture-vec-5 property of the chosen node. Thus we read ibm,arch-vec-5-platform-support and make our selection before calling CAS. We then parse the ibm,architecture-vec-5 property of the chosen node to check whether we should run as hash or radix. ibm,arch-vec-5-platform-support format: index value pairs: <index, val> ... <index, val> index: Option vector 5 byte number val: Some representation of supported values Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com> Acked-by: Paul Mackerras <paulus@ozlabs.org> [mpe: Don't print about unknown options, be consistent with OV5_FEAT] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-03-06	powerpc: Parse the command line before calling CAS	Suraj Jitindar Singh	1	-5/+5
	On POWER9 the hypervisor requires the guest to decide whether it would like to use a hash or radix mmu model at the time it calls ibm,client-architecture-support (CAS) based on what the hypervisor has said it's allowed to do. It is possible to disable radix by passing "disable_radix" on the command line. The next patch will add support for the new CAS format, thus we need to parse the command line before calling CAS so we can correctly select which mmu we would like to use. Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com> Reviewed-by: Paul Mackerras <paulus@ozlabs.org> Acked-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-03-06	powerpc/xics: Work around limitations of OPAL XICS priority handling	Balbir Singh	2	-3/+24
	The CPPR (Current Processor Priority Register) of a XICS interrupt presentation controller contains a value N, such that only interrupts with a priority "more favoured" than N will be received by the CPU, where "more favoured" means "less than". So if the CPPR has the value 5 then only interrupts with a priority of 0-4 inclusive will be received. In theory the CPPR can support a value of 0 to 255 inclusive. In practice Linux only uses values of 0, 4, 5 and 0xff. Setting the CPPR to 0 rejects all interrupts, setting it to 0xff allows all interrupts. The values 4 and 5 are used to differentiate IPIs from external interrupts. Setting the CPPR to 5 allows IPIs to be received but not external interrupts. The CPPR emulation in the OPAL XICS implementation only directly supports priorities 0 and 0xff. All other priorities are considered equivalent, and mapped to a single priority value internally. This means when using icp-opal we can not allow IPIs but not externals. This breaks Linux's use of priority values when a CPU is hot unplugged. After migrating IRQs away from the CPU that is being offlined, we set the priority to 5, meaning we still want the offline CPU to receive IPIs. But the effect of the OPAL XICS emulation's use of a single priority value is that all interrupts are rejected by the CPU. With the CPU offline, and not receiving IPIs, we may not be able to wake it up to bring it back online. The first part of the fix is in icp_opal_set_cpu_priority(). CPPR values of 0 to 4 inclusive will correctly cause all interrupts to be rejected, so we pass those CPPR values through to OPAL. However if we are called with a CPPR of 5 or greater, the caller is expecting to be able to allow IPIs but not external interrupts. We know this doesn't work, so instead of rejecting all interrupts we choose the opposite which is to allow all interrupts. This is still not correct behaviour, but we know for the only existing caller (xics_migrate_irqs_away()), that it is the better option. The other part of the fix is in xics_migrate_irqs_away(). Instead of setting priority (CPPR) to 0, and then back to 5 before migrating IRQs, we migrate the IRQs before setting the priority back to 5. This should have no effect on an ICP backend with a working set_priority(), and on icp-opal it means we will keep all interrupts blocked until after we've finished doing the IRQ migration. Additionally we wait for 5ms after doing the migration to make sure there are no IRQs in flight. Fixes: d74361881f0d ("powerpc/xics: Add ICP OPAL backend") Cc: stable@vger.kernel.org # v4.8+ Suggested-by: Michael Ellerman <mpe@ellerman.id.au> Reported-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com> Tested-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com> Signed-off-by: Balbir Singh <bsingharora@gmail.com> [mpe: Rewrote comments and change log, change delay to 5ms] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-03-05	Linux 4.11-rc1	Linus Torvalds	1	-2/+2

2017-03-04	powerpc/64: Fix checksum folding in csum_add()	Shile Zhang	1	-1/+1
	Paul's patch to fix checksum folding, commit b492f7e4e07a ("powerpc/64: Fix checksum folding in csum_tcpudp_nofold and ip_fast_csum_nofold") missed a case in csum_add(). Fix it. Signed-off-by: Shile Zhang <shile.zhang@nokia.com> Acked-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-03-04	powerpc/powernv: Fix opal tracepoints with JUMP_LABEL=n	Alexey Kardashevskiy	1	-2/+2
	The recent commit to allow calling OPAL calls in real mode, commit ab9bad0ead9a ("powerpc/powernv: Remove separate entry for OPAL real mode calls"), introduced a bug when CONFIG_JUMP_LABEL=n. The commit moved the "mfmsr r12" prior to the call to OPAL_BRANCH, but we missed that OPAL_BRANCH clobbers r12 when jump labels are disabled. This leads to us using the tracepoint refcount as the MSR value, typically zero, and saving that into PACASAVEDMSR. When we return from OPAL we use that value as the MSR value for rfid, meaning we switch to 32-bit BE real mode - hilarity ensues. Fix it by using r11 in OPAL_BRANCH, which is not live at the time the macro is used in OPAL_CALL. Fixes: ab9bad0ead9a ("powerpc/powernv: Remove separate entry for OPAL real mode calls") Suggested-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-03-03	strparser: destroy workqueue on module exit	WANG Cong	1	-0/+1
	Fixes: 43a0c6751a32 ("strparser: Stream parser for messages") Cc: Tom Herbert <tom@herbertland.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-03	Documentation/sphinx: fix primary_domain configuration	John Keeping	1	-1/+1
	With Sphinx 1.5.3 I get the warning: WARNING: primary_domain 'C' not found, ignored. It seems that domain names in Sphinx are case-sensitive and for the C domain the name must be lower case. Signed-off-by: John Keeping <john@metanate.com> Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2017-03-03	docs: Fix htmldocs build failure	Martyn Welch	1	-2/+2
	Build of HTML docs failing due to conversion of deviceiobook.tmpl in 8a8a602f and regulator.tmpl in 028f2533 to RST without removing from DOCBOOKS in Makefile, resulting (in the case of deviceiobook) the following error: make[1]: * No rule to make target 'Documentation/DocBook/deviceiobook.xml', needed by 'Documentation/DocBook/deviceiobook.aux.xml'. Stop. Makefile:1452: recipe for target 'htmldocs' failed make: * [htmldocs] Error 2 Update DOCBOOKS to reflect available books. Signed-off-by: Martyn Welch <martyn.welch@collabora.co.uk> Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2017-03-03	doc/ko_KR/memory-barriers: Update control-dependencies section	SeongJae Park	1	-31/+37
	This commit applies upstream change, commit c8241f8553e8 ("doc: Update control-dependencies section of memory-barriers.txt"), to Korean translation. Signed-off-by: SeongJae Park <sj38.park@gmail.com> Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2017-03-03	pcieaer doc: update the link	Cao jin	1	-1/+1
	The original link is empty, replace it. Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com> Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2017-03-03	Documentation: Update path to sysrq.txt	Krzysztof Kozlowski	4	-6/+6
	Commit 9d85025b0418 ("docs-rst: create an user's manual book") moved the sysrq.txt leaving old paths in the kernel docs. Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org> Reviewed-by: Mauro Carvalho Chehab <mchehab@s-opensource.com> Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2017-03-03	sfc: fix IPID endianness in TSOv2	Edward Cree	1	-1/+1
	The value we read from the header is in network byte order, whereas EFX_POPULATE_QWORD_* takes values in host byte order (which it then converts to little-endian, as MCDI is little-endian). Fixes: e9117e5099ea ("sfc: Firmware-Assisted TSO version 2") Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-03	sfc: avoid max() in array size	Edward Cree	1	-5/+5
	It confuses sparse, which thinks the size isn't constant. Let's achieve the same thing with a BUILD_BUG_ON, since we know which one should be bigger and don't expect them ever to change. Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-03	rds: remove unnecessary returned value check	Zhu Yanjun	4	-14/+4
	The function rds_trans_register always returns 0. As such, it is not necessary to check the returned value. Cc: Joe Jin <joe.jin@oracle.com> Cc: Junxiao Bi <junxiao.bi@oracle.com> Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-03	rxrpc: Fix potential NULL-pointer exception	David Howells	1	-7/+8
	Fix a potential NULL-pointer exception in rxrpc_do_sendmsg(). The call state check that I added should have gone into the else-body of the if-statement where we actually have a call to check. Found by CoverityScan CID#1414316 ("Dereference after null check"). Fixes: 540b1c48c37a ("rxrpc: Fix deadlock between call creation and sendmsg/recvmsg") Reported-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-03	nfp: correct DMA direction in XDP DMA sync	Jakub Kicinski	1	-2/+2
	dma_sync_single_for_*() takes the direction in which the buffer was mapped, not the direction of the sync. We should sync XDP buffers bidirectionally. Fixes: ecd63a0217d5 ("nfp: add XDP support in the driver") Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-03	nfp: don't tell FW about the reserved buffer space	Jakub Kicinski	1	-1/+2
	Since commit c0f031bc8866 ("nfp_net: use alloc_frag() and build_skb()") we are allocating buffers which have to hold both the data and skb to be created in place by build_skb(). FW should only be told about the buffer space it can DMA to, that is without the build_skb() headroom and tailroom. Note: firmware applications should validate the buffers against both MTU and free list buffer size so oversized packets would not pass through the NIC anyway. Fixes: c0f031bc8866 ("nfp: use alloc_frag() and build_skb()") Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-03	net: ethernet: bgmac: mac address change bug	Hari Vyas	1	-1/+5
	ndo_set_mac_address() passes struct sockaddr * as 2nd parameter to bgmac_set_mac_address() but code assumed u8 *. This caused two bytes chopping and the wrong mac address was configured. Signed-off-by: Hari Vyas <hariv@broadcom.com> Signed-off-by: Jon Mason <jon.mason@broadcom.com> Fixes: 4e209001b86 ("bgmac: write mac address to hardware in ndo_set_mac_address") Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-03	net: ethernet: bgmac: init sequence bug	Jon Mason	2	-9/+34
	Fix a bug in the 'bgmac' driver init sequence that blind writes for init sequence where it should preserve most bits other than the ones it is deliberately manipulating. The code now checks to see if the adapter needs to be brought out of reset (where as before it was doing an IDM write to bring it out of reset regardless of whether it was in reset or not). Also, removed unnecessary usleeps (as there is already a read present to flush the IDM writes). Signed-off-by: Zac Schroff <zschroff@broadcom.com> Signed-off-by: Jon Mason <jon.mason@broadcom.com> Fixes: f6a95a24957 ("net: ethernet: bgmac: Add platform device support") Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-03	xen-netback: don't vfree() queues under spinlock	Paul Durrant	1	-1/+4
	This leads to a BUG of the following form: [ 174.512861] switch: port 2(vif3.0) entered disabled state [ 174.522735] BUG: sleeping function called from invalid context at /home/build/linux-linus/mm/vmalloc.c:1441 [ 174.523451] in_atomic(): 1, irqs_disabled(): 0, pid: 28, name: xenwatch [ 174.524131] CPU: 1 PID: 28 Comm: xenwatch Tainted: G W 4.10.0upstream-11073-g4977ab6-dirty #1 [ 174.524819] Hardware name: MSI MS-7680/H61M-P23 (MS-7680), BIOS V17.0 03/14/2011 [ 174.525517] Call Trace: [ 174.526217] show_stack+0x23/0x60 [ 174.526899] dump_stack+0x5b/0x88 [ 174.527562] ___might_sleep+0xde/0x130 [ 174.528208] __might_sleep+0x35/0xa0 [ 174.528840] ? _raw_spin_unlock_irqrestore+0x13/0x20 [ 174.529463] ? __wake_up+0x40/0x50 [ 174.530089] remove_vm_area+0x20/0x90 [ 174.530724] __vunmap+0x1d/0xc0 [ 174.531346] ? delete_object_full+0x13/0x20 [ 174.531973] vfree+0x40/0x80 [ 174.532594] set_backend_state+0x18a/0xa90 [ 174.533221] ? dwc_scan_descriptors+0x24d/0x430 [ 174.533850] ? kfree+0x5b/0xc0 [ 174.534476] ? xenbus_read+0x3d/0x50 [ 174.535101] ? xenbus_read+0x3d/0x50 [ 174.535718] ? xenbus_gather+0x31/0x90 [ 174.536332] ? ___might_sleep+0xf6/0x130 [ 174.536945] frontend_changed+0x6b/0xd0 [ 174.537565] xenbus_otherend_changed+0x7d/0x80 [ 174.538185] frontend_changed+0x12/0x20 [ 174.538803] xenwatch_thread+0x74/0x110 [ 174.539417] ? woken_wake_function+0x20/0x20 [ 174.540049] kthread+0xe5/0x120 [ 174.540663] ? xenbus_printf+0x50/0x50 [ 174.541278] ? __kthread_init_worker+0x40/0x40 [ 174.541898] ret_from_fork+0x21/0x2c [ 174.548635] switch: port 2(vif3.0) entered disabled state This patch defers the vfree() until after the spinlock is released. Reported-by: Juergen Gross <jgross@suse.com> Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-03	xen-netback: keep a local pointer for vif in backend_disconnect()	Paul Durrant	1	-14/+18
	This patch replaces use of 'be->vif' with 'vif' and hence generally makes the function look tidier. No semantic change. Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-03	ftrace/graph: Add ftrace_graph_max_depth kernel parameter	Todd Brandt	2	-0/+15
	Early trace callgraphs can be extremely large on systems with several seconds of boot time. The max_depth parameter limits how deep the graph trace goes and reduces the output size. This parameter is the same as the max_graph_depth file in tracefs. Link: http://lkml.kernel.org/r/1488499935-23216-1-git-send-email-todd.e.brandt@linux.intel.com Signed-off-by: Todd Brandt <todd.e.brandt@linux.intel.com> [ changed comments about debugfs to tracefs ] Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-03-03	tracing: Add #undef to fix compile error	Rik van Riel	1	-0/+1
	There are several trace include files that define TRACE_INCLUDE_FILE. Include several of them in the same .c file (as I currently have in some code I am working on), and the compile will blow up with a "warning: "TRACE_INCLUDE_FILE" redefined #define TRACE_INCLUDE_FILE syscalls" Every other include file in include/trace/events/ avoids that issue by having a #undef TRACE_INCLUDE_FILE before the #define; syscalls.h should have one, too. Link: http://lkml.kernel.org/r/20160928225554.13bd7ac6@annuminas.surriel.com Cc: stable@vger.kernel.org Fixes: b8007ef74222 ("tracing: Separate raw syscall from syscall tracer") Signed-off-by: Rik van Riel <riel@redhat.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-03-03	jump_label: Add comment about initialization order for anonymous unions	Steven Rostedt (VMware)	1	-0/+7
	Commit 3821fd35b58d ("jump_label: Reduce the size of struct static_key") broke old compilers that could not handle static initialization of anonymous unions. Boris fixed it with a patch that added brackets around the static initializer. But this creates a dependency between those initializers and the structure's order of its fields. Document this dependency in case new fields are added to struct static_key in the future. Noted-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Suggested-by: Chris Mason <clm@fb.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-03-03	jump_label: Fix anonymous union initialization	Boris Ostrovsky	1	-2/+2
	Pre-4.6 gcc do not allow direct static initialization of members of anonymous structs/unions. After commit 3821fd35b58d ("jump_label: Reduce the size of struct static_key") STATIC_KEY_INIT_{TRUE\|FALSE} definitions cannot be compiled with those older compilers. Placing initializers inside curved brackets works around this problem. Link: http://lkml.kernel.org/r/1488299542-30765-1-git-send-email-boris.ostrovsky@oracle.com Fixes: 3821fd35b58d ("jump_label: Reduce the size of struct static_key") Reviewed-by: Jason Baron <jbaron@akamai.com> Compiled-by: Chris Mason <clm@fb.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-03-03	module: set __jump_table alignment to 8	David Daney	1	-0/+2
	For powerpc the __jump_table section in modules is not aligned, this causes a WARN_ON() splat when loading a module containing a __jump_table. Strict alignment became necessary with commit 3821fd35b58d ("jump_label: Reduce the size of struct static_key"), currently in linux-next, which uses the two least significant bits of pointers to __jump_table elements. Fix by forcing __jump_table to 8, which is the same alignment used for this section in the kernel proper. Link: http://lkml.kernel.org/r/20170301220453.4756-1-david.daney@cavium.com Reviewed-by: Jason Baron <jbaron@akamai.com> Acked-by: Jessica Yu <jeyu@redhat.com> Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc) Tested-by: Sachin Sant <sachinp@linux.vnet.ibm.com> Signed-off-by: David Daney <david.daney@cavium.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-03-03	ftrace/graph: Do not modify the EMPTY_HASH for the function_graph filter	Steven Rostedt (VMware)	1	-4/+8
	On boot up, if the kernel command line sets a graph funtion with the kernel command line options "ftrace_graph_filter" or "ftrace_graph_notrace" then it updates the corresponding function graph hash, ftrace_graph_hash or ftrace_graph_notrace_hash respectively. Unfortunately, at boot up, these variables are pointers to the "EMPTY_HASH" which is a constant used as a placeholder when a hash has no entities. The problem was that the comand line version to set the hashes updated the actual EMPTY_HASH instead of creating a new hash for the function graph. This broke the EMPTY_HASH because not only did it modify a constant (not sure how that was allowed to happen, except maybe because it was done at early boot, const variables were still mutable), but it made the filters have functions listed in them when they were actually empty. The kernel command line function needs to allocate a new hash for the function graph filters and assign the necessary variables to that new hash instead. Link: http://lkml.kernel.org/r/1488420091.7212.17.camel@linux.intel.com Cc: Namhyung Kim <namhyung@kernel.org> Fixes: b9b0c831bed2 ("ftrace: Convert graph filter to use hash tables") Reported-by: Todd Brandt <todd.e.brandt@linux.intel.com> Tested-by: Todd Brandt <todd.e.brandt@linux.intel.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-03-03	netfilter: nf_tables: don't call nfnetlink_set_err() if nfnetlink_send() fails	Pablo Neira Ayuso	2	-81/+58
	The underlying nlmsg_multicast() already sets sk->sk_err for us to notify socket overruns, so we should not do anything with this return value. So we just call nfnetlink_set_err() if: 1) We fail to allocate the netlink message. or 2) We don't have enough space in the netlink message to place attributes, which means that we likely need to allocate a larger message. Before this patch, the internal ESRCH netlink error code was propagated to userspace, which is quite misleading. Netlink semantics mandate that listeners just hit ENOBUFS if the socket buffer overruns. Reported-by: Alexander Alemayhu <alexander@alemayhu.com> Tested-by: Alexander Alemayhu <alexander@alemayhu.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-03-03	netfilter: nft_set_rbtree: incorrect assumption on lower interval lookups	Pablo Neira Ayuso	1	-5/+4
	In case of adjacent ranges, we may indeed see either the high part of the range in first place or the low part of it. Remove this incorrect assumption, let's make sure we annotate the low part of the interval in case of we have adjacent interva intervals so we hit a matching in lookups. Reported-by: Simon Hanisch <hanisch@wh2.tu-dresden.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-03-03	netfilter: nf_conntrack_sip: fix wrong memory initialisation	Christophe Leroy	1	-2/+0
	In commit 82de0be6862cd ("netfilter: Add helper array register/unregister functions"), struct nf_conntrack_helper sip[MAX_PORTS][4] was changed to sip[MAX_PORTS * 4], so the memory init should have been changed to memset(&sip[4 * i], 0, 4 * sizeof(sip[i])); But as the sip[] table is allocated in the BSS, it is already set to 0 Fixes: 82de0be6862cd ("netfilter: Add helper array register/unregister functions") Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-03-03	can: flexcan: fix typo in comment	Marc Kleine-Budde	1	-1/+1
	This patch fixes the typo "Disble" -> "Disable". Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2017-03-03	can: usb_8dev: Fix memory leak of priv->cmd_msg_buffer	Marc Kleine-Budde	1	-6/+3
	The priv->cmd_msg_buffer is allocated in the probe function, but never kfree()ed. This patch converts the kzalloc() to resource-managed kzalloc. Cc: linux-stable <stable@vger.kernel.org> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2017-03-03	can: gs_usb: fix coding style	Ethan Zonca	1	-6/+5
	This patch fixes five minor style issues, spaces are between bitwise OR operators. Signed-off-by: Ethan Zonca <e@ethanzonca.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2017-03-03	can: gs_usb: Don't use stack memory for USB transfers	Ethan Zonca	1	-11/+29
	Fixes: 05ca5270005c can: gs_usb: add ethtool set_phys_id callback to locate physical device The gs_usb driver is performing USB transfers using buffers allocated on the stack. This causes the driver to not function with vmapped stacks. Instead, allocate memory for the transfer buffers. Signed-off-by: Ethan Zonca <e@ethanzonca.com> Cc: linux-stable <stable@vger.kernel.org> # >= v4.8 Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2017-03-02	smb2: Enforce sec= mount option	Sachin Prabhu	8	-7/+49
	If the security type specified using a mount option is not supported, the SMB2 session setup code changes the security type to RawNTLMSSP. We should instead fail the mount and return an error. The patch changes the code for SMB2 to make it similar to the code used for SMB1. Like in SMB1, we now use the global security flags to select the security method to be used when no security method is specified and to return an error when the requested auth method is not available. For SMB2, we also use ntlmv2 as a synonym for nltmssp. Signed-off-by: Sachin Prabhu <sprabhu@redhat.com> Acked-by: Pavel Shilovsky <pshilov@microsoft.com> Acked-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <smfrench@gmail.com>
2017-03-02	CIFS: Fix sparse warnings	Steve French	2	-4/+4
	Fix two minor sparse compile check warnings Signed-off-by: Steve French <steve.french@primarydata.com> Acked-by: Pavel Shilovsky <pshilov@microsoft.com> Reviewed-by: Aurelien Aptel <aaptel@suse.com>
2017-03-02	ixgbe: Limit use of 2K buffers on architectures with 256B or larger cache lines	Alexander Duyck	2	-2/+3
	On architectures that have a cache line size larger than 64 Bytes we start running into issues where the amount of headroom for the frame starts shrinking. The size of skb_shared_info on a system with a 64B L1 cache line size is 320. This increases to 384 with a 128B cache line, and 512 with a 256B cache line. In addition the NET_SKB_PAD value increases as well consistent with the cache line size. As a result when we get to a 256B cache line as seen on the s390 we end up 768 bytes used by padding and shared info leaving us with only 1280 bytes to use for data storage. On architectures such as this we should default to using 3K Rx buffers out of a 8K page instead of trying to do 1.5K buffers out of a 4K page. To take all of this into account I have added one small check so that we compare the max_frame to the amount of actual data we can store. This was already occurring for igb, but I had overlooked it for ixgbe as it doesn't have strict limits for 82599 once we enable jumbo frames. By adding this check we will automatically enable 3K Rx buffers as soon as the maximum frame size we can handle drops below the standard Ethernet MTU. I also went through and fixed one small typo that I found where I had left an IGB in a variable name due to a copy/paste error. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-03-02	statx: Add a system call to make enhanced file info available	David Howells	72	-214/+822
	Add a system call to make extended file information available, including file creation and some attribute flags where available through the underlying filesystem. The getattr inode operation is altered to take two additional arguments: a u32 request_mask and an unsigned int flags that indicate the synchronisation mode. This change is propagated to the vfs_getattr() function. Functions like vfs_stat() are now inline wrappers around new functions vfs_statx() and vfs_statx_fd() to reduce stack usage. ======== OVERVIEW ======== The idea was initially proposed as a set of xattrs that could be retrieved with getxattr(), but the general preference proved to be for a new syscall with an extended stat structure. A number of requests were gathered for features to be included. The following have been included: (1) Make the fields a consistent size on all arches and make them large. (2) Spare space, request flags and information flags are provided for future expansion. (3) Better support for the y2038 problem [Arnd Bergmann] (tv_sec is an __s64). (4) Creation time: The SMB protocol carries the creation time, which could be exported by Samba, which will in turn help CIFS make use of FS-Cache as that can be used for coherency data (stx_btime). This is also specified in NFSv4 as a recommended attribute and could be exported by NFSD [Steve French]. (5) Lightweight stat: Ask for just those details of interest, and allow a netfs (such as NFS) to approximate anything not of interest, possibly without going to the server [Trond Myklebust, Ulrich Drepper, Andreas Dilger] (AT_STATX_DONT_SYNC). (6) Heavyweight stat: Force a netfs to go to the server, even if it thinks its cached attributes are up to date [Trond Myklebust] (AT_STATX_FORCE_SYNC). And the following have been left out for future extension: (7) Data version number: Could be used by userspace NFS servers [Aneesh Kumar]. Can also be used to modify fill_post_wcc() in NFSD which retrieves i_version directly, but has just called vfs_getattr(). It could get it from the kstat struct if it used vfs_xgetattr() instead. (There's disagreement on the exact semantics of a single field, since not all filesystems do this the same way). (8) BSD stat compatibility: Including more fields from the BSD stat such as creation time (st_btime) and inode generation number (st_gen) [Jeremy Allison, Bernd Schubert]. (9) Inode generation number: Useful for FUSE and userspace NFS servers [Bernd Schubert]. (This was asked for but later deemed unnecessary with the open-by-handle capability available and caused disagreement as to whether it's a security hole or not). (10) Extra coherency data may be useful in making backups [Andreas Dilger]. (No particular data were offered, but things like last backup timestamp, the data version number and the DOS archive bit would come into this category). (11) Allow the filesystem to indicate what it can/cannot provide: A filesystem can now say it doesn't support a standard stat feature if that isn't available, so if, for instance, inode numbers or UIDs don't exist or are fabricated locally... (This requires a separate system call - I have an fsinfo() call idea for this). (12) Store a 16-byte volume ID in the superblock that can be returned in struct xstat [Steve French]. (Deferred to fsinfo). (13) Include granularity fields in the time data to indicate the granularity of each of the times (NFSv4 time_delta) [Steve French]. (Deferred to fsinfo). (14) FS_IOC_GETFLAGS value. These could be translated to BSD's st_flags. Note that the Linux IOC flags are a mess and filesystems such as Ext4 define flags that aren't in linux/fs.h, so translation in the kernel may be a necessity (or, possibly, we provide the filesystem type too). (Some attributes are made available in stx_attributes, but the general feeling was that the IOC flags were to ext[234]-specific and shouldn't be exposed through statx this way). (15) Mask of features available on file (eg: ACLs, seclabel) [Brad Boyer, Michael Kerrisk]. (Deferred, probably to fsinfo. Finding out if there's an ACL or seclabal might require extra filesystem operations). (16) Femtosecond-resolution timestamps [Dave Chinner]. (A __reserved field has been left in the statx_timestamp struct for this - if there proves to be a need). (17) A set multiple attributes syscall to go with this. =============== NEW SYSTEM CALL =============== The new system call is: int ret = statx(int dfd, const char filename, unsigned int flags, unsigned int mask, struct statx buffer); The dfd, filename and flags parameters indicate the file to query, in a similar way to fstatat(). There is no equivalent of lstat() as that can be emulated with statx() by passing AT_SYMLINK_NOFOLLOW in flags. There is also no equivalent of fstat() as that can be emulated by passing a NULL filename to statx() with the fd of interest in dfd. Whether or not statx() synchronises the attributes with the backing store can be controlled by OR'ing a value into the flags argument (this typically only affects network filesystems): (1) AT_STATX_SYNC_AS_STAT tells statx() to behave as stat() does in this respect. (2) AT_STATX_FORCE_SYNC will require a network filesystem to synchronise its attributes with the server - which might require data writeback to occur to get the timestamps correct. (3) AT_STATX_DONT_SYNC will suppress synchronisation with the server in a network filesystem. The resulting values should be considered approximate. mask is a bitmask indicating the fields in struct statx that are of interest to the caller. The user should set this to STATX_BASIC_STATS to get the basic set returned by stat(). It should be noted that asking for more information may entail extra I/O operations. buffer points to the destination for the data. This must be 256 bytes in size. ====================== MAIN ATTRIBUTES RECORD ====================== The following structures are defined in which to return the main attribute set: struct statx_timestamp { __s64 tv_sec; __s32 tv_nsec; __s32 __reserved; }; struct statx { __u32 stx_mask; __u32 stx_blksize; __u64 stx_attributes; __u32 stx_nlink; __u32 stx_uid; __u32 stx_gid; __u16 stx_mode; __u16 __spare0[1]; __u64 stx_ino; __u64 stx_size; __u64 stx_blocks; __u64 __spare1[1]; struct statx_timestamp stx_atime; struct statx_timestamp stx_btime; struct statx_timestamp stx_ctime; struct statx_timestamp stx_mtime; __u32 stx_rdev_major; __u32 stx_rdev_minor; __u32 stx_dev_major; __u32 stx_dev_minor; __u64 __spare2[14]; }; The defined bits in request_mask and stx_mask are: STATX_TYPE Want/got stx_mode & S_IFMT STATX_MODE Want/got stx_mode & ~S_IFMT STATX_NLINK Want/got stx_nlink STATX_UID Want/got stx_uid STATX_GID Want/got stx_gid STATX_ATIME Want/got stx_atime{,_ns} STATX_MTIME Want/got stx_mtime{,_ns} STATX_CTIME Want/got stx_ctime{,_ns} STATX_INO Want/got stx_ino STATX_SIZE Want/got stx_size STATX_BLOCKS Want/got stx_blocks STATX_BASIC_STATS [The stuff in the normal stat struct] STATX_BTIME Want/got stx_btime{,_ns} STATX_ALL [All currently available stuff] stx_btime is the file creation time, stx_mask is a bitmask indicating the data provided and __spares[] are where as-yet undefined fields can be placed. Time fields are structures with separate seconds and nanoseconds fields plus a reserved field in case we want to add even finer resolution. Note that times will be negative if before 1970; in such a case, the nanosecond fields will also be negative if not zero. The bits defined in the stx_attributes field convey information about a file, how it is accessed, where it is and what it does. The following attributes map to FS__FL flags and are the same numerical value: STATX_ATTR_COMPRESSED File is compressed by the fs STATX_ATTR_IMMUTABLE File is marked immutable STATX_ATTR_APPEND File is append-only STATX_ATTR_NODUMP File is not to be dumped STATX_ATTR_ENCRYPTED File requires key to decrypt in fs Within the kernel, the supported flags are listed by: KSTAT_ATTR_FS_IOC_FLAGS [Are any other IOC flags of sufficient general interest to be exposed through this interface?] New flags include: STATX_ATTR_AUTOMOUNT Object is an automount trigger These are for the use of GUI tools that might want to mark files specially, depending on what they are. Fields in struct statx come in a number of classes: (0) stx_dev_, stx_blksize. These are local system information and are always available. (1) stx_mode, stx_nlinks, stx_uid, stx_gid, stx_[amc]time, stx_ino, stx_size, stx_blocks. These will be returned whether the caller asks for them or not. The corresponding bits in stx_mask will be set to indicate whether they actually have valid values. If the caller didn't ask for them, then they may be approximated. For example, NFS won't waste any time updating them from the server, unless as a byproduct of updating something requested. If the values don't actually exist for the underlying object (such as UID or GID on a DOS file), then the bit won't be set in the stx_mask, even if the caller asked for the value. In such a case, the returned value will be a fabrication. Note that there are instances where the type might not be valid, for instance Windows reparse points. (2) stx_rdev_*. This will be set only if stx_mode indicates we're looking at a blockdev or a chardev, otherwise will be 0. (3) stx_btime. Similar to (1), except this will be set to 0 if it doesn't exist. ======= TESTING ======= The following test program can be used to test the statx system call: samples/statx/test-statx.c Just compile and run, passing it paths to the files you want to examine. The file is built automatically if CONFIG_SAMPLES is enabled. Here's some example output. Firstly, an NFS directory that crosses to another FSID. Note that the AUTOMOUNT attribute is set because transiting this directory will cause d_automount to be invoked by the VFS. [root@andromeda ~]# /tmp/test-statx -A /warthog/data statx(/warthog/data) = 0 results=7ff Size: 4096 Blocks: 8 IO Block: 1048576 directory Device: 00:26 Inode: 1703937 Links: 125 Access: (3777/drwxrwxrwx) Uid: 0 Gid: 4041 Access: 2016-11-24 09:02:12.219699527+0000 Modify: 2016-11-17 10:44:36.225653653+0000 Change: 2016-11-17 10:44:36.225653653+0000 Attributes: 0000000000001000 (-------- -------- -------- -------- -------- -------- ---m---- --------) Secondly, the result of automounting on that directory. [root@andromeda ~]# /tmp/test-statx /warthog/data statx(/warthog/data) = 0 results=7ff Size: 4096 Blocks: 8 IO Block: 1048576 directory Device: 00:27 Inode: 2 Links: 125 Access: (3777/drwxrwxrwx) Uid: 0 Gid: 4041 Access: 2016-11-24 09:02:12.219699527+0000 Modify: 2016-11-17 10:44:36.225653653+0000 Change: 2016-11-17 10:44:36.225653653+0000 Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-03-02	ixgbe: update the rss key on h/w, when ethtool ask for it	Paolo Abeni	3	-4/+20
	Currently ixgbe_set_rxfh() updates the rss_key copy in the driver memory, but does not push the new value into the h/w. This commit add a new helper for the latter operation and call it in ixgbe_set_rxfh(), so that the h/w rss key value can be really updated via ethtool. Signed-off-by: Paolo Abeni <pabeni@redhat.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-03-03	sched/headers: Clean up <linux/sched.h>	Ingo Molnar	1	-571/+651
	Now that <linux/sched.h> dependencies have been sorted out, do various trivial cleanups: - remove unnecessary structure predeclarations - fix various typos - update comments where necessary - remove pointless comments - use consistent types - tabulate consistently - use a consistent comment style - clean up the header section a bit - use a consistent style of a single field per line - remove line-breaks where they make the code look worse - etc ... No change in functionality. Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-03	sched/headers: Remove #ifdefs from <linux/sched.h>	Ingo Molnar	1	-8/+4
	We can remove two pairs of #ifdefs by defining structures in a smarter way. Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org>