linux-dev - Linux kernel development work

Age	Commit message (Collapse)	Author	Files	Lines
2018-02-09	selftests/bpf: add test program for loading BPF ELF files	Jesper Dangaard Brouer	2	-1/+151
	V2: Moved program into selftests/bpf from tools/libbpf This program can be used on its own for testing/debugging if a BPF ELF-object file can be loaded with libbpf (from tools/lib/bpf). If something is wrong with the ELF object, the program have a --debug mode that will display the ELF sections and especially the skipped sections. This allows for quickly identifying the problematic ELF section number, which can be corrolated with the readelf tool. The program signal error via return codes, and also have a --quiet mode, which is practical for use in scripts like selftests/bpf. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-02-09	tools/libbpf: improve the pr_debug statements to contain section numbers	Jesper Dangaard Brouer	1	-12/+13
	While debugging a bpf ELF loading issue, I needed to correlate the ELF section number with the failed relocation section reference. Thus, add section numbers/index to the pr_debug. In debug mode, also print section that were skipped. This helped me identify that a section (.eh_frame) was skipped, and this was the reason the relocation section (.rel.eh_frame) could not find that section number. The section numbers corresponds to the readelf tools Section Headers [Nr]. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-02-09	bpf: Sync kernel ABI header with tooling header for bpf_common.h	Jesper Dangaard Brouer	1	-3/+4
	I recently fixed up a lot of commits that forgot to keep the tooling headers in sync. And then I forgot to do the same thing in commit cb5f7334d479 ("bpf: add comments to BPF ld/ldx sizes"). Let correct that before people notice ;-). Lawrence did partly fix/sync this for bpf.h in commit d6d4f60c3a09 ("bpf: add selftest for tcpbpf"). Fixes: cb5f7334d479 ("bpf: add comments to BPF ld/ldx sizes") Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-02-08	tools: bpftool: add bash completion for cgroup commands	Quentin Monnet	2	-6/+62
	Add bash completion for "bpftool cgroup" command family. While at it, also fix the formatting of some keywords in the man page for cgroups. Fixes: 5ccda64d38cc ("bpftool: implement cgroup bpf operations") Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-02-08	tools: bpftool: add bash completion for `bpftool prog load`	Quentin Monnet	1	-2/+6
	Add bash completion for bpftool command `prog load`. Completion for this command is easy, as it only takes existing file paths as arguments. Fixes: 49a086c201a9 ("bpftool: implement prog load command") Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-02-08	tools: bpftool: make syntax for program map update explicit in man page	Quentin Monnet	1	-1/+2
	Specify in the documentation that when using bpftool to update a map of type BPF_MAP_TYPE_PROG_ARRAY, the syntax for the program used as a value should use the "id\|tag\|pinned" keywords convention, as used with "bpftool prog" commands. Fixes: ff69c21a85a4 ("tools: bpftool: add documentation") Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-02-08	tools: bpftool: exit doc Makefile early if rst2man is not available	Quentin Monnet	1	-0/+5
	If rst2man is not available on the system, running `make doc` from the bpftool directory fails with an error message. However, it creates empty manual pages (.8 files in this case). A subsequent call to `make doc-install` would then succeed and install those empty man pages on the system. To prevent this, raise a Makefile error and exit immediately if rst2man is not available before generating the pages from the rst documentation. Fixes: ff69c21a85a4 ("tools: bpftool: add documentation") Reported-by: Jason van Aaardt <jason.vanaardt@netronome.com> Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-02-08	libbpf: complete list of strings for guessing program type	Quentin Monnet	1	-0/+5
	It seems that the type guessing feature for libbpf, based on the name of the ELF section the program is located in, was inspired from samples/bpf/prog_load.c, which was not used by any sample for loading programs of certain types such as TC actions and classifiers, or LWT-related types. As a consequence, libbpf is not able to guess the type of such programs and to load them automatically if type is not provided to the `bpf_load_prog()` function. Add ELF section names associated to those eBPF program types so that they can be loaded with e.g. bpftool as well. Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-02-08	nfp: bpf: fix immed relocation for larger offsets	Jakub Kicinski	1	-1/+1
	Immed relocation is missing a shift which means for larger offsets the lower and higher part of the address would be ORed together. Fixes: ce4ebfd859c3 ("nfp: bpf: add helpers for updating immediate instructions") Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Jiong Wang <jiong.wang@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-02-08	selftests: bpf: test_kmod.sh: check the module path before insmod	Naresh Kamboju	1	-3/+15
	test_kmod.sh reported false failure when module not present. Check test_bpf.ko is present in the path before loading it. Two cases to be addressed here, In the development process of test_bpf.c unit testing will be done by developers by using "insmod $SRC_TREE/lib/test_bpf.ko" On the other hand testers run full tests by installing modules on device under test (DUT) and followed by modprobe to insert the modules accordingly. Signed-off-by: Naresh Kamboju <naresh.kamboju@linaro.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-02-06	bpf: sockmap, fix leaking maps with attached but not detached progs	John Fastabend	1	-5/+14
	When a program is attached to a map we increment the program refcnt to ensure that the program is not removed while it is potentially being referenced from sockmap side. However, if this same program also references the map (this is a reasonably common pattern in my programs) then the verifier will also increment the maps refcnt from the verifier. This is to ensure the map doesn't get garbage collected while the program has a reference to it. So we are left in a state where the map holds the refcnt on the program stopping it from being removed and releasing the map refcnt. And vice versa the program holds a refcnt on the map stopping it from releasing the refcnt on the prog. All this is fine as long as users detach the program while the map fd is still around. But, if the user omits this detach command we are left with a dangling map we can no longer release. To resolve this when the map fd is released decrement the program references and remove any reference from the map to the program. This fixes the issue with possibly dangling map and creates a user side API constraint. That is, the map fd must be held open for programs to be attached to a map. Fixes: 174a79ff9515 ("bpf: sockmap with sk redirect support") Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-02-06	bpf: sockmap, add sock close() hook to remove socks	John Fastabend	2	-67/+103
	The selftests test_maps program was leaving dangling BPF sockmap programs around because not all psock elements were removed from the map. The elements in turn hold a reference on the BPF program they are attached to causing BPF programs to stay open even after test_maps has completed. The original intent was that sk_state_change() would be called when TCP socks went through TCP_CLOSE state. However, because socks may be in SOCK_DEAD state or the sock may be a listening socket the event is not always triggered. To resolve this use the ULP infrastructure and register our own proto close() handler. This fixes the above case. Fixes: 174a79ff9515 ("bpf: sockmap with sk redirect support") Reported-by: Prashant Bhole <bhole_prashant_q7@lab.ntt.co.jp> Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-02-06	net: add a UID to use for ULP socket assignment	John Fastabend	3	-5/+62
	Create a UID field and enum that can be used to assign ULPs to sockets. This saves a set of string comparisons if the ULP id is known. For sockmap, which is added in the next patches, a ULP is used to hook into TCP sockets close state. In this case the ULP being added is done at map insert time and the ULP is known and done on the kernel side. In this case the named lookup is not needed. Because we don't want to expose psock internals to user space socket options a user visible flag is also added. For TLS this is set for BPF it will be cleared. Alos remove pr_notice, user gets an error code back and should check that rather than rely on logs. Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-02-06	tools/bpf: fix batch-mode test failure of test_xdp_redirect.sh	Yonghong Song	2	-0/+3
	The tests at tools/testing/selftests/bpf can run in patch mode, e.g., make -C tools/testing/selftests/bpf run_tests With the batch mode, I experimented intermittent test failure of test_xdp_redirect.sh. .... selftests: test_xdp_redirect [PASS] selftests: test_xdp_redirect.sh [PASS] RTNETLINK answers: File exists selftests: test_xdp_meta [FAILED] selftests: test_xdp_meta.sh [FAIL] .... The following illustrates what caused the failure: (1). test_xdp_redirect creates veth pairs (veth1,veth11) and (veth2,veth22), and assign veth11 and veth22 to namespace ns1 and ns2 respectively. (2). at the end of test_xdp_redirect test, ns1 and ns2 are deleted. During this process, the deletion of actual namespace resources, including deletion of veth1{1} and veth2{2}, is put into a workqueue to be processed asynchronously. (3). test_xdp_meta tries to create veth pair (veth1, veth2). The previous veth deletions in step (2) have not finished yet, and veth1 or veth2 may be still valid in the kernel, thus causing the failure. The fix is to explicitly delete the veth pair before test_xdp_redirect exits. Only one end of veth needs deletion as the kernel will delete the other end automatically. Also test_xdp_meta is also fixed in similar manner to avoid future potential issues. Fixes: 996139e801fd ("selftests: bpf: add a test for XDP redirect") Fixes: 22c8852624fc ("bpf: improve selftests and add tests for meta pointer") Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-02-05	bpf: fix selftests/bpf test_kmod.sh failure when CONFIG_BPF_JIT_ALWAYS_ON=y	Yonghong Song	1	-5/+26
	With CONFIG_BPF_JIT_ALWAYS_ON is defined in the config file, tools/testing/selftests/bpf/test_kmod.sh failed like below: [root@localhost bpf]# ./test_kmod.sh sysctl: setting key "net.core.bpf_jit_enable": Invalid argument [ JIT enabled:0 hardened:0 ] [ 132.175681] test_bpf: #297 BPF_MAXINSNS: Jump, gap, jump, ... FAIL to prog_create err=-524 len=4096 [ 132.458834] test_bpf: Summary: 348 PASSED, 1 FAILED, [340/340 JIT'ed] [ JIT enabled:1 hardened:0 ] [ 133.456025] test_bpf: #297 BPF_MAXINSNS: Jump, gap, jump, ... FAIL to prog_create err=-524 len=4096 [ 133.730935] test_bpf: Summary: 348 PASSED, 1 FAILED, [340/340 JIT'ed] [ JIT enabled:1 hardened:1 ] [ 134.769730] test_bpf: #297 BPF_MAXINSNS: Jump, gap, jump, ... FAIL to prog_create err=-524 len=4096 [ 135.050864] test_bpf: Summary: 348 PASSED, 1 FAILED, [340/340 JIT'ed] [ JIT enabled:1 hardened:2 ] [ 136.442882] test_bpf: #297 BPF_MAXINSNS: Jump, gap, jump, ... FAIL to prog_create err=-524 len=4096 [ 136.821810] test_bpf: Summary: 348 PASSED, 1 FAILED, [340/340 JIT'ed] [root@localhost bpf]# The test_kmod.sh load/remove test_bpf.ko multiple times with different settings for sysctl net.core.bpf_jit_{enable,harden}. The failed test #297 of test_bpf.ko is designed such that JIT always fails. Commit 290af86629b2 (bpf: introduce BPF_JIT_ALWAYS_ON config) introduced the following tightening logic: ... if (!bpf_prog_is_dev_bound(fp->aux)) { fp = bpf_int_jit_compile(fp); #ifdef CONFIG_BPF_JIT_ALWAYS_ON if (!fp->jited) { *err = -ENOTSUPP; return fp; } #endif ... With this logic, Test #297 always gets return value -ENOTSUPP when CONFIG_BPF_JIT_ALWAYS_ON is defined, causing the test failure. This patch fixed the failure by marking Test #297 as expected failure when CONFIG_BPF_JIT_ALWAYS_ON is defined. Fixes: 290af86629b2 (bpf: introduce BPF_JIT_ALWAYS_ON config) Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-02-04	dt-bindings: mailbox: qcom: Document the APCS clock binding	Georgi Djakov	1	-0/+18
	Update the binding documentation for APCS to mention that the APCS hardware block also expose a clock controller functionality. The APCS clock controller is a mux and half-integer divider. It has the main CPU PLL as an input and provides the clock for the application CPU. Signed-off-by: Georgi Djakov <georgi.djakov@linaro.org> Reviewed-by: Rob Herring <robh@kernel.org> Acked-by: Bjorn Andersson <bjorn.andersson@linaro.org> Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
2018-02-04	mailbox: qcom: Create APCS child device for clock controller	Georgi Djakov	1	-0/+11
	There is a clock controller functionality provided by the APCS hardware block of msm8916 devices. The device-tree would represent an APCS node with both mailbox and clock provider properties. Create a platform child device for the clock controller functionality so the driver can probe and use APCS as parent. Signed-off-by: Georgi Djakov <georgi.djakov@linaro.org> Acked-by: Bjorn Andersson <bjorn.andersson@linaro.org> Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
2018-02-04	mailbox: qcom: Convert APCS IPC driver to use regmap	Georgi Djakov	1	-5/+19
	This hardware block provides more functionalities that just IPC. Convert it to regmap to allow other child platform devices to use the same regmap. Signed-off-by: Georgi Djakov <georgi.djakov@linaro.org> Acked-by: Bjorn Andersson <bjorn.andersson@linaro.org> Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
2018-02-03	KVM/SVM: Allow direct access to MSR_IA32_SPEC_CTRL	KarimAllah Ahmed	1	-0/+88
	[ Based on a patch from Paolo Bonzini <pbonzini@redhat.com> ] ... basically doing exactly what we do for VMX: - Passthrough SPEC_CTRL to guests (if enabled in guest CPUID) - Save and restore SPEC_CTRL around VMExit and VMEntry only if the guest actually used it. Signed-off-by: KarimAllah Ahmed <karahmed@amazon.de> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Darren Kenny <darren.kenny@oracle.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jun Nakajima <jun.nakajima@intel.com> Cc: kvm@vger.kernel.org Cc: Dave Hansen <dave.hansen@intel.com> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Asit Mallick <asit.k.mallick@intel.com> Cc: Arjan Van De Ven <arjan.van.de.ven@intel.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Ashok Raj <ashok.raj@intel.com> Link: https://lkml.kernel.org/r/1517669783-20732-1-git-send-email-karahmed@amazon.de
2018-02-03	KVM/VMX: Allow direct access to MSR_IA32_SPEC_CTRL	KarimAllah Ahmed	3	-6/+110
	[ Based on a patch from Ashok Raj <ashok.raj@intel.com> ] Add direct access to MSR_IA32_SPEC_CTRL for guests. This is needed for guests that will only mitigate Spectre V2 through IBRS+IBPB and will not be using a retpoline+IBPB based approach. To avoid the overhead of saving and restoring the MSR_IA32_SPEC_CTRL for guests that do not actually use the MSR, only start saving and restoring when a non-zero is written to it. No attempt is made to handle STIBP here, intentionally. Filtering STIBP may be added in a future patch, which may require trapping all writes if we don't want to pass it through directly to the guest. [dwmw2: Clean up CPUID bits, save/restore manually, handle reset] Signed-off-by: KarimAllah Ahmed <karahmed@amazon.de> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Darren Kenny <darren.kenny@oracle.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Jim Mattson <jmattson@google.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jun Nakajima <jun.nakajima@intel.com> Cc: kvm@vger.kernel.org Cc: Dave Hansen <dave.hansen@intel.com> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Asit Mallick <asit.k.mallick@intel.com> Cc: Arjan Van De Ven <arjan.van.de.ven@intel.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Ashok Raj <ashok.raj@intel.com> Link: https://lkml.kernel.org/r/1517522386-18410-5-git-send-email-karahmed@amazon.de
2018-02-03	KVM/VMX: Emulate MSR_IA32_ARCH_CAPABILITIES	KarimAllah Ahmed	3	-1/+17
	Intel processors use MSR_IA32_ARCH_CAPABILITIES MSR to indicate RDCL_NO (bit 0) and IBRS_ALL (bit 1). This is a read-only MSR. By default the contents will come directly from the hardware, but user-space can still override it. [dwmw2: The bit in kvm_cpuid_7_0_edx_x86_features can be unconditional] Signed-off-by: KarimAllah Ahmed <karahmed@amazon.de> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Darren Kenny <darren.kenny@oracle.com> Reviewed-by: Jim Mattson <jmattson@google.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jun Nakajima <jun.nakajima@intel.com> Cc: kvm@vger.kernel.org Cc: Dave Hansen <dave.hansen@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Asit Mallick <asit.k.mallick@intel.com> Cc: Arjan Van De Ven <arjan.van.de.ven@intel.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Ashok Raj <ashok.raj@intel.com> Link: https://lkml.kernel.org/r/1517522386-18410-4-git-send-email-karahmed@amazon.de
2018-02-03	KVM/x86: Add IBPB support	Ashok Raj	3	-3/+116
	The Indirect Branch Predictor Barrier (IBPB) is an indirect branch control mechanism. It keeps earlier branches from influencing later ones. Unlike IBRS and STIBP, IBPB does not define a new mode of operation. It's a command that ensures predicted branch targets aren't used after the barrier. Although IBRS and IBPB are enumerated by the same CPUID enumeration, IBPB is very different. IBPB helps mitigate against three potential attacks: * Mitigate guests from being attacked by other guests. - This is addressed by issing IBPB when we do a guest switch. * Mitigate attacks from guest/ring3->host/ring3. These would require a IBPB during context switch in host, or after VMEXIT. The host process has two ways to mitigate - Either it can be compiled with retpoline - If its going through context switch, and has set !dumpable then there is a IBPB in that path. (Tim's patch: https://patchwork.kernel.org/patch/10192871) - The case where after a VMEXIT you return back to Qemu might make Qemu attackable from guest when Qemu isn't compiled with retpoline. There are issues reported when doing IBPB on every VMEXIT that resulted in some tsc calibration woes in guest. * Mitigate guest/ring0->host/ring0 attacks. When host kernel is using retpoline it is safe against these attacks. If host kernel isn't using retpoline we might need to do a IBPB flush on every VMEXIT. Even when using retpoline for indirect calls, in certain conditions 'ret' can use the BTB on Skylake-era CPUs. There are other mitigations available like RSB stuffing/clearing. * IBPB is issued only for SVM during svm_free_vcpu(). VMX has a vmclear and SVM doesn't. Follow discussion here: https://lkml.org/lkml/2018/1/15/146 Please refer to the following spec for more details on the enumeration and control. Refer here to get documentation about mitigations. https://software.intel.com/en-us/side-channel-security-support [peterz: rebase and changelog rewrite] [karahmed: - rebase - vmx: expose PRED_CMD if guest has it in CPUID - svm: only pass through IBPB if guest has it in CPUID - vmx: support !cpu_has_vmx_msr_bitmap()] - vmx: support nested] [dwmw2: Expose CPUID bit too (AMD IBPB only for now as we lack IBRS) PRED_CMD is a write-only MSR] Signed-off-by: Ashok Raj <ashok.raj@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: KarimAllah Ahmed <karahmed@amazon.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: kvm@vger.kernel.org Cc: Asit Mallick <asit.k.mallick@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Arjan Van De Ven <arjan.van.de.ven@intel.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: Jun Nakajima <jun.nakajima@intel.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Tim Chen <tim.c.chen@linux.intel.com> Link: http://lkml.kernel.org/r/1515720739-43819-6-git-send-email-ashok.raj@intel.com Link: https://lkml.kernel.org/r/1517522386-18410-3-git-send-email-karahmed@amazon.de
2018-02-03	KVM/x86: Update the reverse_cpuid list to include CPUID_7_EDX	KarimAllah Ahmed	2	-5/+4
	[dwmw2: Stop using KF() for bits in it, too] Signed-off-by: KarimAllah Ahmed <karahmed@amazon.de> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Jim Mattson <jmattson@google.com> Cc: kvm@vger.kernel.org Cc: Radim Krčmář <rkrcmar@redhat.com> Link: https://lkml.kernel.org/r/1517522386-18410-2-git-send-email-karahmed@amazon.de
2018-02-03	pinctrl: remove include file from <linux/device.h>	Linus Torvalds	2	-1/+2
	When pulling the recent pinctrl merge, I was surprised by how a pinctrl-only pull request ended up rebuilding basically the whole kernel. The reason for that ended up being that <linux/device.h> included <linux/pinctrl/devinfo.h>, so any change to that file ended up causing pretty much every driver out there to be rebuilt. The reason for that was because 'struct device' has this in it: #ifdef CONFIG_PINCTRL struct dev_pin_info pins; #endif but we already avoid header includes for these kinds of things in that header file, preferring to just use a forward-declaration of the structure instead. Exactly to avoid this kind of header dependency. Since some drivers seem to expect that <linux/pinctrl/devinfo.h> header to come in automatically, move the include to <linux/pinctrl/pinctrl.h> instead. It might be better to just make the includes more targeted, but I'm not going to review every driver. It would definitely be good to have a tool for finding and minimizing header dependencies automatically - or at least help with them. Right now we almost certainly end up having way too many of these things, and it's hard to test every single configuration. FWIW, you can get a sense of the "hotness" of a header file with something like this after doing a full build: find . -name '..o.cmd' -print0 \| xargs -0 tail --lines=+2 \| grep -v 'wildcard ' \| tr ' \\' '\n' \| sort \| uniq -c \| sort -n \| less -S which isn't exact (there are other things in those '*.o.cmd' than just the dependencies, and the "--lines=+2" only removes the header), but might a useful approximation. With this patch, <linux/pinctrl/devinfo.h> drops to "only" having 833 users in the current x86-64 allmodconfig. In contrast, <linux/device.h> has 14857 build files including it directly or indirectly. Of course, the headers that absolutely _everybody_ includes (things like <linux/types.h> etc) get a score of 23000+. Cc: Linus Walleij <linus.walleij@linaro.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-02-03	firmware: dmi: handle missing DMI data gracefully	Ard Biesheuvel	2	-5/+3
	Currently, when booting a kernel with DMI support on a platform that has no DMI tables, the following output is emitted into the kernel log: [ 0.128818] DMI not present or invalid. ... [ 1.306659] dmi: Firmware registration failed. ... [ 2.908681] dmi-sysfs: dmi entry is absent. The first one is a pr_info(), but the subsequent ones are pr_err()s that complain about a condition that is not really an error to begin with. So let's clean this up, and give up silently if dma_available is not set. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Acked-by: Martin Hundebøll <mnhu@prevas.dk> Signed-off-by: Jean Delvare <jdelvare@suse.de>
2018-02-03	firmware: dmi_scan: Fix handling of empty DMI strings	Jean Delvare	1	-13/+9
	The handling of empty DMI strings looks quite broken to me: * Strings from 1 to 7 spaces are not considered empty. * True empty DMI strings (string index set to 0) are not considered empty, and result in allocating a 0-char string. * Strings with invalid index also result in allocating a 0-char string. * Strings starting with 8 spaces are all considered empty, even if non-space characters follow (sounds like a weird thing to do, but I have actually seen occurrences of this in DMI tables before.) * Strings which are considered empty are reported as 8 spaces, instead of being actually empty. Some of these issues are the result of an off-by-one error in memcmp, the rest is incorrect by design. So let's get it square: missing strings and strings made of only spaces, regardless of their length, should be treated as empty and no memory should be allocated for them. All other strings are non-empty and should be allocated. Signed-off-by: Jean Delvare <jdelvare@suse.de> Fixes: 79da4721117f ("x86: fix DMI out of memory problems") Cc: Parag Warudkar <parag.warudkar@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de>
2018-02-03	firmware: dmi_scan: Drop dmi_initialized	Jean Delvare	1	-13/+8
	I don't think it makes sense to check for a possible bad initialization order at run time on every system when it is all decided at build time. A more efficient way to make sure developers do not introduce new calls to dmi_check_system() too early in the initialization sequence is to simply document the expected call order. That way, developers have a chance to get it right immediately, without having to test-boot their kernel, wonder why it does not work, and parse the kernel logs for a warning message. And we get rid of the run-time performance penalty as a nice side effect. Signed-off-by: Jean Delvare <jdelvare@suse.de> Cc: Ingo Molnar <mingo@kernel.org>
2018-02-03	firmware: dmi: Optimize dmi_matches	Jean Delvare	1	-8/+11
	Function dmi_matches can me made a bit faster: * The documented purpose of dmi_initialized is to catch too early calls to dmi_check_system(). I'm not fully convinced it justifies slowing down the initialization of all systems out there, but at least the check should not have been moved from dmi_check_system() to dmi_matches(). dmi_matches() is being called for every entry of the table passed to dmi_check_system(), causing the same redundant check to be performed again and again. So move it back to dmi_check_system(), reverting this specific portion of commit d7b1956fed33 ("DMI: Introduce dmi_first_match to make the interface more flexible"). * Don't check for the exact_match flag again when we already know its value. Signed-off-by: Jean Delvare <jdelvare@suse.de> Fixes: d7b1956fed33 ("DMI: Introduce dmi_first_match to make the interface more flexible") Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Daniel Vetter <daniel.vetter@intel.com> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Jeff Garzik <jgarzik@redhat.com>
2018-02-02	samples/bpf: use bpf_set_link_xdp_fd	Eric Leblond	9	-126/+24
	Use bpf_set_link_xdp_fd instead of set_link_xdp_fd to remove some code duplication and benefit of netlink ext ack errors message. Signed-off-by: Eric Leblond <eric@regit.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-02-02	libbpf: add missing SPDX-License-Identifier	Eric Leblond	4	-0/+8
	Signed-off-by: Eric Leblond <eric@regit.org> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-02-02	libbpf: add error reporting in XDP	Eric Leblond	5	-2/+272
	Parse netlink ext attribute to get the error message returned by the card. Code is partially take from libnl. We add netlink.h to the uapi include of tools. And we need to avoid include of userspace netlink header to have a successful build of sample so nlattr.h has a define to avoid the inclusion. Using a direct define could have been an issue as NLMSGERR_ATTR_MAX can change in the future. We also define SOL_NETLINK if not defined to avoid to have to copy socket.h for a fixed value. Signed-off-by: Eric Leblond <eric@regit.org> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-02-02	libbpf: add function to setup XDP	Eric Leblond	3	-0/+128
	Most of the code is taken from set_link_xdp_fd() in bpf_load.c and slightly modified to be library compliant. Signed-off-by: Eric Leblond <eric@regit.org> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-02-02	tools: add netlink.h and if_link.h in tools uapi	Eric Leblond	3	-0/+1200
	The headers are necessary for libbpf compilation on system with older version of the headers. Signed-off-by: Eric Leblond <eric@regit.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-02-02	Revert "defer call to mem_cgroup_sk_alloc()"	Roman Gushchin	3	-5/+15
	This patch effectively reverts commit 9f1c2674b328 ("net: memcontrol: defer call to mem_cgroup_sk_alloc()"). Moving mem_cgroup_sk_alloc() to the inet_csk_accept() completely breaks memcg socket memory accounting, as packets received before memcg pointer initialization are not accounted and are causing refcounting underflow on socket release. Actually the free-after-use problem was fixed by commit c0576e397508 ("net: call cgroup_sk_alloc() earlier in sk_clone_lock()") for the cgroup pointer. So, let's revert it and call mem_cgroup_sk_alloc() just before cgroup_sk_alloc(). This is safe, as we hold a reference to the socket we're cloning, and it holds a reference to the memcg. Also, let's drop BUG_ON(mem_cgroup_is_root()) check from mem_cgroup_sk_alloc(). I see no reasons why bumping the root memcg counter is a good reason to panic, and there are no realistic ways to hit it. Signed-off-by: Roman Gushchin <guro@fb.com> Cc: Eric Dumazet <edumazet@google.com> Cc: David S. Miller <davem@davemloft.net> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-03	bpf: fix bpf_prog_array_copy_to_user() issues	Alexei Starovoitov	1	-8/+24
	1. move copy_to_user out of rcu section to fix the following issue: ./include/linux/rcupdate.h:302 Illegal context switch in RCU read-side critical section! stack backtrace: __dump_stack lib/dump_stack.c:17 [inline] dump_stack+0x194/0x257 lib/dump_stack.c:53 lockdep_rcu_suspicious+0x123/0x170 kernel/locking/lockdep.c:4592 rcu_preempt_sleep_check include/linux/rcupdate.h:301 [inline] ___might_sleep+0x385/0x470 kernel/sched/core.c:6079 __might_sleep+0x95/0x190 kernel/sched/core.c:6067 __might_fault+0xab/0x1d0 mm/memory.c:4532 _copy_to_user+0x2c/0xc0 lib/usercopy.c:25 copy_to_user include/linux/uaccess.h:155 [inline] bpf_prog_array_copy_to_user+0x217/0x4d0 kernel/bpf/core.c:1587 bpf_prog_array_copy_info+0x17b/0x1c0 kernel/bpf/core.c:1685 perf_event_query_prog_array+0x196/0x280 kernel/trace/bpf_trace.c:877 _perf_ioctl kernel/events/core.c:4737 [inline] perf_ioctl+0x3e1/0x1480 kernel/events/core.c:4757 2. move *prog under rcu, since it's not ok to dereference it afterwards 3. in a rare case of prog array being swapped between bpf_prog_array_length() and bpf_prog_array_copy_to_user() calls make sure to copy zeros to user space, so the user doesn't walk over uninited prog_ids while kernel reported uattr->query.prog_cnt > 0 Reported-by: syzbot+7dbcd2d3b85f9b608b23@syzkaller.appspotmail.com Fixes: 468e2f64d220 ("bpf: introduce BPF_PROG_QUERY command") Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-02-02	soreuseport: fix mem leak in reuseport_add_sock()	Eric Dumazet	1	-15/+20
	reuseport_add_sock() needs to deal with attaching a socket having its own sk_reuseport_cb, after a prior setsockopt(SO_ATTACH_REUSEPORT_?BPF) Without this fix, not only a WARN_ONCE() was issued, but we were also leaking memory. Thanks to sysbot and Eric Biggers for providing us nice C repros. ------------[ cut here ]------------ socket already in reuseport group WARNING: CPU: 0 PID: 3496 at net/core/sock_reuseport.c:119 reuseport_add_sock+0x742/0x9b0 net/core/sock_reuseport.c:117 Kernel panic - not syncing: panic_on_warn set ... CPU: 0 PID: 3496 Comm: syzkaller869503 Not tainted 4.15.0-rc6+ #245 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:17 [inline] dump_stack+0x194/0x257 lib/dump_stack.c:53 panic+0x1e4/0x41c kernel/panic.c:183 __warn+0x1dc/0x200 kernel/panic.c:547 report_bug+0x211/0x2d0 lib/bug.c:184 fixup_bug.part.11+0x37/0x80 arch/x86/kernel/traps.c:178 fixup_bug arch/x86/kernel/traps.c:247 [inline] do_error_trap+0x2d7/0x3e0 arch/x86/kernel/traps.c:296 do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:315 invalid_op+0x22/0x40 arch/x86/entry/entry_64.S:1079 Fixes: ef456144da8e ("soreuseport: define reuseport groups") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot+c0ea2226f77a42936bf7@syzkaller.appspotmail.com Acked-by: Craig Gallek <kraig@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-02	net: qlge: use memmove instead of skb_copy_to_linear_data	Arnd Bergmann	1	-2/+1
	gcc-8 points out that the skb_copy_to_linear_data() argument points to the skb itself, which makes it run into a problem with overlapping memcpy arguments: In file included from include/linux/ip.h:20, from drivers/net/ethernet/qlogic/qlge/qlge_main.c:26: drivers/net/ethernet/qlogic/qlge/qlge_main.c: In function 'ql_realign_skb': include/linux/skbuff.h:3378:2: error: 'memcpy' source argument is the same as destination [-Werror=restrict] memcpy(skb->data, from, len); It's unclear to me what the best solution is, maybe it ought to use a different helper that adjusts the skb data in a safe way. Simply using memmove() here seems like the easiest workaround. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-02	net: qed: use correct strncpy() size	Arnd Bergmann	1	-4/+2
	passing the strlen() of the source string as the destination length is pointless, and gcc-8 now warns about it: drivers/net/ethernet/qlogic/qed/qed_debug.c: In function 'qed_grc_dump': include/linux/string.h:253: error: 'strncpy' specified bound depends on the length of the source argument [-Werror=stringop-overflow=] This changes qed_grc_dump_big_ram() to instead uses the length of the destination buffer, and use strscpy() to guarantee nul-termination. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-02	net: cxgb4: avoid memcpy beyond end of source buffer	Arnd Bergmann	1	-1/+1
	Building with link-time-optimizations revealed that the cxgb4 driver does a fixed-size memcpy() from a variable-length constant string into the network interface name: In function 'memcpy', inlined from 'cfg_queues_uld.constprop' at drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.c:335:2, inlined from 'cxgb4_register_uld.constprop' at drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.c:719:9: include/linux/string.h:350:3: error: call to '__read_overflow2' declared with attribute error: detected read beyond size of object passed as 2nd parameter __read_overflow2(); ^ I can see two equally workable solutions: either we use a strncpy() instead of the memcpy() to stop at the end of the input, or we make the source buffer fixed length as well. This implements the latter. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-02	cls_u32: add missing RCU annotation.	Paolo Abeni	1	-5/+7
	In a couple of points of the control path, n->ht_down is currently accessed without the required RCU annotation. The accesses are safe, but sparse complaints. Since we already held the rtnl lock, let use rtnl_dereference(). Fixes: a1b7c5fd7fe9 ("net: sched: add cls_u32 offload hooks for netdevs") Fixes: de5df63228fc ("net: sched: cls_u32 changes to knode must appear atomic to readers") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Acked-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-02	r8152: set rx mode early when linking on	Hayes Wang	1	-2/+3
	Set rx mode before calling netif_wake_queue() when linking on to avoid the device missing the receiving packets. The transmission may start after calling netif_wake_queue(), and the packets of resopnse may reach before calling rtl8152_set_rx_mode() which let the device could receive packets. Then, the packets of response would be missed. Signed-off-by: Hayes Wang <hayeswang@realtek.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-02	r8152: fix wrong checksum status for received IPv4 packets	Hayes Wang	1	-5/+3
	The device could only check the checksum of TCP and UDP packets. Therefore, for the IPv4 packets excluding TCP and UDP, the check of checksum is necessary, even though the IP checksum is correct. Take ICMP for example, The IP checksum may be correct, but the ICMP checksum may be wrong. Signed-off-by: Hayes Wang <hayeswang@realtek.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-02	nfp: fix TLV offset calculation	Edwin Peer	1	-1/+1
	The data pointer in the config space TLV parser already includes NFP_NET_CFG_TLV_BASE, it should not be added again. Incorrect offset values were only used in printed user output, rendering the bug merely cosmetic. Fixes: 73a0329b057e ("nfp: add TLV capabilities to the BAR") Signed-off-by: Edwin Peer <edwin.peer@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-02	x86/power: Fix swsusp_arch_resume prototype	Arnd Bergmann	4	-5/+4
	The declaration for swsusp_arch_resume marks it as 'asmlinkage', but the definition in x86-32 does not, and it fails to include the header with the declaration. This leads to a warning when building with link-time-optimizations: kernel/power/power.h:108:23: error: type of 'swsusp_arch_resume' does not match original declaration [-Werror=lto-type-mismatch] extern asmlinkage int swsusp_arch_resume(void); ^ arch/x86/power/hibernate_32.c:148:0: note: 'swsusp_arch_resume' was previously declared here int swsusp_arch_resume(void) This moves the declaration into a globally visible header file and fixes up both x86 definitions to match it. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Len Brown <len.brown@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Nicolas Pitre <nico@linaro.org> Cc: linux-pm@vger.kernel.org Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Pavel Machek <pavel@ucw.cz> Cc: Bart Van Assche <bart.vanassche@wdc.com> Link: https://lkml.kernel.org/r/20180202145634.200291-2-arnd@arndb.de
2018-02-02	x86/dumpstack: Avoid uninitlized variable	Arnd Bergmann	1	-1/+1
	In some configurations, 'partial' does not get initialized, as shown by this gcc-8 warning: arch/x86/kernel/dumpstack.c: In function 'show_trace_log_lvl': arch/x86/kernel/dumpstack.c:156:4: error: 'partial' may be used uninitialized in this function [-Werror=maybe-uninitialized] show_regs_if_on_stack(&stack_info, regs, partial); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ This initializes it to false, to get the previous behavior in this case. Fixes: a9cdbe72c4e8 ("x86/dumpstack: Fix partial register dumps") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andi Kleen <ak@linux.intel.com> Cc: Nicolas Pitre <nico@linaro.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Borislav Petkov <bpetkov@suse.de> Cc: Vlastimil Babka <vbabka@suse.cz> Link: https://lkml.kernel.org/r/20180202145634.200291-1-arnd@arndb.de
2018-02-02	x86/speculation: Fix typo IBRS_ATT, which should be IBRS_ALL	Darren Kenny	1	-1/+1
	Fixes: 117cc7a908c83 ("x86/retpoline: Fill return stack buffer on vmexit") Signed-off-by: Darren Kenny <darren.kenny@oracle.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: David Woodhouse <dwmw@amazon.co.uk> Link: https://lkml.kernel.org/r/20180202191220.blvgkgutojecxr3b@starbug-vm.ie.oracle.com
2018-02-02	x86/pti: Mark constant arrays as __initconst	Arnd Bergmann	1	-2/+2
	I'm seeing build failures from the two newly introduced arrays that are marked 'const' and '__initdata', which are mutually exclusive: arch/x86/kernel/cpu/common.c:882:43: error: 'cpu_no_speculation' causes a section type conflict with 'e820_table_firmware_init' arch/x86/kernel/cpu/common.c:895:43: error: 'cpu_no_meltdown' causes a section type conflict with 'e820_table_firmware_init' The correct annotation is __initconst. Fixes: fec9434a12f3 ("x86/pti: Do not enable PTI on CPUs which are not vulnerable to Meltdown") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Ricardo Neri <ricardo.neri-calderon@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@suse.de> Cc: Thomas Garnier <thgarnie@google.com> Cc: David Woodhouse <dwmw@amazon.co.uk> Link: https://lkml.kernel.org/r/20180202213959.611210-1-arnd@arndb.de
2018-02-02	block: skd: fix incorrect linux/slab_def.h inclusion	Arnd Bergmann	1	-3/+4
	skd includes slab_def.h to get access to the slab cache object size. However, including this header breaks when we use SLUB or SLOB instead of the SLAB allocator, since the structure layout is completely different, as shown by this warning when we build this driver in one of the invalid configurations with link-time optimizations enabled: include/linux/slab.h:715:0: error: type of 'kmem_cache_size' does not match original declaration [-Werror=lto-type-mismatch] unsigned int kmem_cache_size(struct kmem_cache s); mm/slab_common.c:77:14: note: 'kmem_cache_size' was previously declared here unsigned int kmem_cache_size(struct kmem_cache s) ^ mm/slab_common.c:77:14: note: code may be misoptimized unless -fno-strict-aliasing is used include/linux/slab.h:147:0: error: type of 'kmem_cache_destroy' does not match original declaration [-Werror=lto-type-mismatch] void kmem_cache_destroy(struct kmem_cache ); mm/slab_common.c:858:6: note: 'kmem_cache_destroy' was previously declared here void kmem_cache_destroy(struct kmem_cache s) ^ mm/slab_common.c:858:6: note: code may be misoptimized unless -fno-strict-aliasing is used include/linux/slab.h:140:0: error: type of 'kmem_cache_create' does not match original declaration [-Werror=lto-type-mismatch] struct kmem_cache kmem_cache_create(const char name, size_t size, mm/slab_common.c:534:1: note: 'kmem_cache_create' was previously declared here kmem_cache_create(const char *name, size_t size, size_t align, ^ This removes the header inclusion and instead uses the kmem_cache_size() interface to get the size in a reliable way. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-02-02	buffer: Avoid setting buffer bits that are already set	Kemi Wang	1	-1/+4
	It's expensive to set buffer flags that are already set, because that causes a costly cache line transition. A common case is setting the "verified" flag during ext4 writes. This patch checks for the flag being set first. With the AIM7/creat-clo benchmark testing on a 48G ramdisk based-on ext4 file system, we see 3.3%(15431->15936) improvement of aim7.jobs-per-min on a 2-sockets broadwell platform. What the benchmark does is: it forks 3000 processes, and each process do the following: a) open a new file b) close the file c) delete the file until loop=100*1000 times. The original patch is contributed by Andi Kleen. Signed-off-by: Andi Kleen <ak@linux.intel.com> Tested-by: Kemi Wang <kemi.wang@intel.com> Signed-off-by: Kemi Wang <kemi.wang@intel.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-02-02	x86/spectre: Simplify spectre_v2 command line parsing	KarimAllah Ahmed	1	-30/+56
	[dwmw2: Use ARRAY_SIZE] Signed-off-by: KarimAllah Ahmed <karahmed@amazon.de> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: peterz@infradead.org Cc: bp@alien8.de Link: https://lkml.kernel.org/r/1517484441-1420-3-git-send-email-dwmw@amazon.co.uk