linux-dev - Linux kernel development work

Age	Commit message (Collapse)	Author	Files	Lines
2018-01-24	selftests/bpf: make 'dubious pointer arithmetic' test useful	Alexei Starovoitov	1	-7/+23
	mostly revert the previous workaround and make 'dubious pointer arithmetic' test useful again. Use (ptr - ptr) << const instead of ptr << const to generate large scalar. The rest stays as before commit 2b36047e7889. Fixes: 2b36047e7889 ("selftests/bpf: fix test_align") Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-01-23	bpf: test_maps: cleanup sockmaps when test ends	Prashant Bhole	1	-4/+12
	Bug: BPF programs and maps related to sockmaps test exist in memory even after test_maps ends. This patch fixes it as a short term workaround (sockmap kernel side needs real fixing) by empyting sockmaps when test ends. Fixes: 6f6d33f3b3d0f ("bpf: selftests add sockmap tests") Signed-off-by: Prashant Bhole <bhole_prashant_q7@lab.ntt.co.jp> [ daniel: Note on workaround. ] Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-01-23	selftests/bpf: fix test_dev_cgroup	Alexei Starovoitov	1	-1/+1
	The test incorrectly doing mkdir /mnt/cgroup-test-work-dirtest-bpf-based-device-cgroup instead of mkdir /mnt/cgroup-test-work-dir/test-bpf-based-device-cgroup somehow such mkdir succeeds and new directory appears: /mnt/cgroup-test-work-dir/cgroup-test-work-dirtest-bpf-based-device-cgroup Later cleanup via nftw("/mnt/cgroup-test-work-dir", ...); doesn't walk this directory. "rmdir /mnt/cgroup-test-work-dir" succeeds, but bpf program and dangling cgroup stays in memory. That's a separate issue on a cgroup side. For now fix the test. Fixes: 37f1ba0909df ("selftests/bpf: add a test for device cgroup controller") Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-01-23	selftests/bpf: speedup test_maps	Alexei Starovoitov	1	-6/+10
	test_hashmap_walk takes very long time on debug kernel with kasan on. Reduce the number of iterations in this test without sacrificing test coverage. Also add printfs as progress indicator. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-01-23	tools/bpf: fix a test failure in selftests prog test_verifier	Yonghong Song	1	-0/+1
	Commit 111e6b45315c ("selftests/bpf: make test_verifier run most programs") enables tools/testing/selftests/bpf/test_verifier unit cases to run via bpf_prog_test_run command. With the latest code base, test_verifier had one test case failure: ... #473/p check deducing bounds from const, 2 FAIL retval 1 != 0 0: (b7) r0 = 1 1: (75) if r0 s>= 0x1 goto pc+1 R0=inv1 R1=ctx(id=0,off=0,imm=0) R10=fp0,call_-1 2: (95) exit from 1 to 3: R0=inv1 R1=ctx(id=0,off=0,imm=0) R10=fp0,call_-1 3: (d5) if r0 s<= 0x1 goto pc+1 R0=inv1 R1=ctx(id=0,off=0,imm=0) R10=fp0,call_-1 4: (95) exit from 3 to 5: R0=inv1 R1=ctx(id=0,off=0,imm=0) R10=fp0,call_-1 5: (1f) r1 -= r0 6: (95) exit processed 7 insns (limit 131072), stack depth 0 ... The test case does not set return value in the test structure and hence the return value from the prog run is assumed to be 0. However, the actual return value is 1. As a result, the test failed. The fix is to correctly set the return value in the test structure. Fixes: 111e6b45315c ("selftests/bpf: make test_verifier run most programs") Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-01-20	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next	David S. Miller	13	-34/+610
	Alexei Starovoitov says: ==================== pull-request: bpf-next 2018-01-19 The following pull-request contains BPF updates for your net-next tree. The main changes are: 1) bpf array map HW offload, from Jakub. 2) support for bpf_get_next_key() for LPM map, from Yonghong. 3) test_verifier now runs loaded programs, from Alexei. 4) xdp cpumap monitoring, from Jesper. 5) variety of tests, cleanups and small x64 JIT optimization, from Daniel. 6) user space can now retrieve HW JITed program, from Jiong. Note there is a minor conflict between Russell's arm32 JIT fixes and removal of bpf_jit_enable variable by Daniel which should be resolved by keeping Russell's comment and removing that variable. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-19	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	David S. Miller	2	-4/+152
	The BPF verifier conflict was some minor contextual issue. The TUN conflict was less trivial. Cong Wang fixed a memory leak of tfile->tx_array in 'net'. This is an skb_array. But meanwhile in net-next tun changed tfile->tx_arry into tfile->tx_ring which is a ptr_ring. Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-19	bpf: add couple of test cases for div/mod by zero	Daniel Borkmann	1	-0/+87
	Add couple of missing test cases for eBPF div/mod by zero to the new test_verifier prog runtime feature. Also one for an empty prog and only exit. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-01-19	tools/bpf: add a testcase for MAP_GET_NEXT_KEY command of LPM_TRIE map	Yonghong Song	1	-0/+122
	A test case is added in tools/testing/selftests/bpf/test_lpm_map.c for MAP_GET_NEXT_KEY command. A four node trie, which is described in kernel/bpf/lpm_trie.c, is built and the MAP_GET_NEXT_KEY results are checked. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-01-19	selftests: bpf: update .gitignore with missing generated files	Shuah Khan	1	-0/+7
	Update .gitignore with missing generated files. Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-01-19	bpftool: recognize BPF_MAP_TYPE_CPUMAP maps	Roman Gushchin	1	-0/+1
	Add BPF_MAP_TYPE_CPUMAP map type to the list of map type recognized by bpftool and define corresponding text representation. Signed-off-by: Roman Gushchin <guro@fb.com> Cc: Quentin Monnet <quentin.monnet@netronome.com> Cc: Jakub Kicinski <jakub.kicinski@netronome.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Alexei Starovoitov <ast@kernel.org> Acked-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-01-19	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	Linus Torvalds	1	-3/+149
	Pull networking fixes from David Miller: 1) Fix BPF divides by zero, from Eric Dumazet and Alexei Starovoitov. 2) Reject stores into bpf context via st and xadd, from Daniel Borkmann. 3) Fix a memory leak in TUN, from Cong Wang. 4) Disable RX aggregation on a specific troublesome configuration of r8152 in a Dell TB16b dock. 5) Fix sw_ctx leak in tls, from Sabrina Dubroca. 6) Fix program replacement in cls_bpf, from Daniel Borkmann. 7) Fix uninitialized station_info structures in cfg80211, from Johannes Berg. 8) Fix miscalculation of transport header offset field in flow dissector, from Eric Dumazet. 9) Fix LPM tree leak on failure in mlxsw driver, from Ido Schimmel. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (29 commits) ibmvnic: Fix IPv6 packet descriptors ibmvnic: Fix IP offload control buffer ipv6: don't let tb6_root node share routes with other node ip6_gre: init dev->mtu and dev->hard_header_len correctly mlxsw: spectrum_router: Free LPM tree upon failure flow_dissector: properly cap thoff field fm10k: mark PM functions as __maybe_unused cfg80211: fix station info handling bugs netlink: reset extack earlier in netlink_rcv_skb can: af_can: canfd_rcv(): replace WARN_ONCE by pr_warn_once can: af_can: can_rcv(): replace WARN_ONCE by pr_warn_once bpf: mark dst unknown on inconsistent {s, u}bounds adjustments bpf: fix cls_bpf on filter replace Net: ethernet: ti: netcp: Fix inbound ping crash if MTU size is greater than 1500 tls: reset crypto_info when do_tls_setsockopt_tx fails tls: return -EBUSY if crypto_info is already set tls: fix sw_ctx leak net/tls: Only attach to sockets in ESTABLISHED state net: fs_enet: do not call phy_stop() in interrupts r8152: disable RX aggregation on Dell TB16 dock ...
2018-01-18	selftest/bpf: extend the offload test with map checks	Jakub Kicinski	3	-25/+218
	Check map device information is reported correctly, and perform basic map operations. Check device destruction gets rid of the maps and map allocation failure path by telling netdevsim to reject map offload via DebugFS. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-01-18	tools: bpftool: report device information for offloaded maps	Jakub Kicinski	1	-1/+6
	Print the information about device on which map is created. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-01-18	bpf: offload: report device information about offloaded maps	Jakub Kicinski	1	-0/+3
	Tell user space about device on which the map was created. Unfortunate reality of user ABI makes sharing this code with program offload difficult but the information is the same. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-01-18	selftests/bpf: make test_verifier run most programs	Alexei Starovoitov	1	-1/+49
	to improve test coverage make test_verifier run all successfully loaded programs on 64-byte zero initialized data. For clsbpf and xdp it means empty 64-byte packet. For lwt and socket_filters it's 64-byte packet where skb->data points after L2. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-01-18	bpf: Sync kernel ABI header with tooling header	Jesper Dangaard Brouer	1	-1/+11
	Update tools/include/uapi/linux/bpf.h to bring it in sync with include/uapi/linux/bpf.h. The listed commits forgot to update it. Fixes: 02dd3291b2f0 ("bpf: finally expose xdp_rxq_info to XDP bpf-programs") Fixes: f19397a5c656 ("bpf: Add access to snd_cwnd and others in sock_ops") Fixes: 06ef0ccb5a36 ("bpf/cgroup: fix a verification error for a CGROUP_DEVICE type prog") Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-01-18	tools/bpf_jit_disasm: silence a static checker warning	Dan Carpenter	1	-3/+4
	There is a static checker warning that "proglen" has an upper bound but no lower bound. The allocation will just fail harmlessly so it's not a big deal. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-01-18	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf	David S. Miller	1	-3/+149
	Daniel Borkmann says: ==================== pull-request: bpf 2018-01-18 The following pull-request contains BPF updates for your net tree. The main changes are: 1) Fix a divide by zero due to wrong if (src_reg == 0) check in 64-bit mode. Properly handle this in interpreter and mask it also generically in verifier to guard against similar checks in JITs, from Eric and Alexei. 2) Fix a bug in arm64 JIT when tail calls are involved and progs have different stack sizes, from Daniel. 3) Reject stores into BPF context that are not expected BPF_STX \| BPF_MEM variant, from Daniel. 4) Mark dst reg as unknown on {s,u}bounds adjustments when the src reg has derived bounds from dead branches, from Daniel. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-18	tools: bpftool: improve architecture detection by using ifindex	Jiong Wang	4	-3/+102
	The current architecture detection method in bpftool is designed for host case. For offload case, we can't use the architecture of "bpftool" itself. Instead, we could call the existing "ifindex_to_name_ns" to get DEVNAME, then read pci id from /sys/class/dev/DEVNAME/device/vendor, finally we map vendor id to bfd arch name which will finally be used to select bfd backend for the disassembler. Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-01-17	bpf: mark dst unknown on inconsistent {s, u}bounds adjustments	Daniel Borkmann	1	-1/+122
	syzkaller generated a BPF proglet and triggered a warning with the following: 0: (b7) r0 = 0 1: (d5) if r0 s<= 0x0 goto pc+0 R0=inv0 R1=ctx(id=0,off=0,imm=0) R10=fp0 2: (1f) r0 -= r1 R0=inv0 R1=ctx(id=0,off=0,imm=0) R10=fp0 verifier internal error: known but bad sbounds What happens is that in the first insn, r0's min/max value are both 0 due to the immediate assignment, later in the jsle test the bounds are updated for the min value in the false path, meaning, they yield smin_val = 1, smax_val = 0, and when ctx pointer is subtracted from r0, verifier bails out with the internal error and throwing a WARN since smin_val != smax_val for the known constant. For min_val > max_val scenario it means that reg_set_min_max() and reg_set_min_max_inv() (which both refine existing bounds) demonstrated that such branch cannot be taken at runtime. In above scenario for the case where it will be taken, the existing [0, 0] bounds are kept intact. Meaning, the rejection is not due to a verifier internal error, and therefore the WARN() is not necessary either. We could just reject such cases in adjust_{ptr,scalar}_min_max_vals() when either known scalars have smin_val != smax_val or umin_val != umax_val or any scalar reg with bounds smin_val > smax_val or umin_val > umax_val. However, there may be a small risk of breakage of buggy programs, so handle this more gracefully and in adjust_{ptr,scalar}_min_max_vals() just taint the dst reg as unknown scalar when we see ops with such kind of src reg. Reported-by: syzbot+6d362cadd45dc0a12ba4@syzkaller.appspotmail.com Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-01-17	Merge branch 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip	Linus Torvalds	1	-1/+3
	Pull x86 pti bits and fixes from Thomas Gleixner: "This last update contains: - An objtool fix to prevent a segfault with the gold linker by changing the invocation order. That's not just for gold, it's a general robustness improvement. - An improved error message for objtool which spares tearing hairs. - Make KASAN fail loudly if there is not enough memory instead of oopsing at some random place later - RSB fill on context switch to prevent RSB underflow and speculation through other units. - Make the retpoline/RSB functionality work reliably for both Intel and AMD - Add retpoline to the module version magic so mismatch can be detected - A small (non-fix) update for cpufeatures which prevents cpu feature clashing for the upcoming extra mitigation bits to ease backporting" * 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: module: Add retpoline tag to VERMAGIC x86/cpufeature: Move processor tracing out of scattered features objtool: Improve error message for bad file argument objtool: Fix seg fault with gold linker x86/retpoline: Add LFENCE to the retpoline/RSB filling RSB macros x86/retpoline: Fill RSB on context switch for affected CPUs x86/kasan: Panic if there is not enough memory to boot
2018-01-17	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	David S. Miller	6	-9/+606
	Overlapping changes all over. The mini-qdisc bits were a little bit tricky, however. Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-17	libbpf: Makefile set specified permission mode	Jesper Dangaard Brouer	1	-1/+1
	The third parameter to do_install was not used by $(INSTALL) command. Fix this by only setting the -m option when the third parameter is supplied. The use of a third parameter was introduced in commit eb54e522a000 ("bpf: install libbpf headers on 'make install'"). Without this change, the header files are install as executables files (755). Fixes: eb54e522a000 ("bpf: install libbpf headers on 'make install'") Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-01-17	libbpf: cleanup Makefile, remove unused elements	Jesper Dangaard Brouer	1	-13/+2
	The plugin_dir_SQ variable is not used, remove it. The function update_dir is also unused, remove it. The variable $VERSION_FILES is empty, remove it. These all originates from the introduction of the Makefile, and is likely a copy paste from tools/lib/traceevent/Makefile. Fixes: 1b76c13e4b36 ("bpf tools: Introduce 'bpf' library and add bpf feature check") Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-01-17	libbpf: install the header file libbpf.h	Jesper Dangaard Brouer	1	-1/+2
	It seems like an oversight not to install the header file for libbpf, given the libbpf.so + libbpf.a files are installed. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-01-17	libbpf: fix string comparison for guessing eBPF program type	Quentin Monnet	1	-1/+1
	libbpf is able to deduce the type of a program from the name of the ELF section in which it is located. However, the comparison is made on the first n characters, n being determined with sizeof() applied to the reference string (e.g. "xdp"). When such section names are supposed to receive a suffix separated with a slash (e.g. "kprobe/"), using sizeof() takes the final NUL character of the reference string into account, which implies that both strings must be equal. Instead, the desired behaviour would consist in taking the length of the string, without accounting for the ending NUL character, and to make sure the reference string is a prefix to the ELF section name. Subtract 1 to the total size of the string for obtaining the length for the comparison. Fixes: 583c90097f72 ("libbpf: add ability to guess program type based on section name") Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-01-17	tools: bpftool: add -DPACKAGE when including bfd.h	Jiong Wang	2	-2/+2
	bfd.h is requiring including of config.h except when PACKAGE or PACKAGE_VERSION are defined. /* PR 14072: Ensure that config.h is included first. */ #if !defined PACKAGE && !defined PACKAGE_VERSION #error config.h must be included before this header #endif This check has been introduced since May-2012. It doesn't show up in bfd.h on some Linux distribution, probably because distributions have remove it when building the package. However, sometimes the user might just build libfd from source code then link bpftool against it. For this case, bfd.h will be original that we need to define PACKAGE or PACKAGE_VERSION. Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-01-16	bpf: reject stores into ctx via st and xadd	Daniel Borkmann	1	-2/+27
	Alexei found that verifier does not reject stores into context via BPF_ST instead of BPF_STX. And while looking at it, we also should not allow XADD variant of BPF_STX. The context rewriter is only assuming either BPF_LDX_MEM- or BPF_STX_MEM-type operations, thus reject anything other than that so that assumptions in the rewriter properly hold. Add test cases as well for BPF selftests. Fixes: d691f9e8d440 ("bpf: allow programs to write to certain skb fields") Reported-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-01-16	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	Linus Torvalds	1	-0/+40
	Pull networking fixes from David Miller: 1) Two read past end of buffer fixes in AF_KEY, from Eric Biggers. 2) Memory leak in key_notify_policy(), from Steffen Klassert. 3) Fix overflow with bpf arrays, from Daniel Borkmann. 4) Fix RDMA regression with mlx5 due to mlx5 no longer using pci_irq_get_affinity(), from Saeed Mahameed. 5) Missing RCU read locking in nl80211_send_iface() when it calls ieee80211_bss_get_ie(), from Dominik Brodowski. 6) cfg80211 should check dev_set_name()'s return value, from Johannes Berg. 7) Missing module license tag in 9p protocol, from Stephen Hemminger. 8) Fix crash due to too small MTU in udp ipv6 sendmsg, from Mike Maloney. 9) Fix endless loop in netlink extack code, from David Ahern. 10) TLS socket layer sets inverted error codes, resulting in an endless loop. From Robert Hering. 11) Revert openvswitch erspan tunnel support, it's mis-designed and we need to kill it before it goes into a real release. From William Tu. 12) Fix lan78xx failures in full speed USB mode, from Yuiko Oshino. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (54 commits) net, sched: fix panic when updating miniq {b,q}stats qed: Fix potential use-after-free in qed_spq_post() nfp: use the correct index for link speed table lan78xx: Fix failure in USB Full Speed sctp: do not allow the v4 socket to bind a v4mapped v6 address sctp: return error if the asoc has been peeled off in sctp_wait_for_sndbuf sctp: reinit stream if stream outcnt has been change by sinit in sendmsg ibmvnic: Fix pending MAC address changes netlink: extack: avoid parenthesized string constant warning ipv4: Make neigh lookup keys for loopback/point-to-point devices be INADDR_ANY net: Allow neigh contructor functions ability to modify the primary_key sh_eth: fix dumping ARSTR Revert "openvswitch: Add erspan tunnel support." net/tls: Fix inverted error codes to avoid endless loop ipv6: ip6_make_skb() needs to clear cork.base.dst sctp: avoid compiler warning on implicit fallthru net: ipv4: Make "ip route get" match iif lo rules again. netlink: extack needs to be reset each time through loop tipc: fix a memory leak in tipc_nl_node_get_link() ipv6: fix udpv6 sendmsg crash caused by too small MTU ...
2018-01-16	bpftool: recognize BPF_PROG_TYPE_CGROUP_DEVICE programs	Roman Gushchin	1	-0/+1
	Bpftool doesn't recognize BPF_PROG_TYPE_CGROUP_DEVICE programs, so the prog show command prints the numeric type value: $ bpftool prog show 1: type 15 name bpf_prog1 tag ac9f93dbfd6d9b74 loaded_at Jan 15/07:58 uid 0 xlated 96B jited 105B memlock 4096B This patch defines the corresponding textual representation: $ bpftool prog show 1: cgroup_device name bpf_prog1 tag ac9f93dbfd6d9b74 loaded_at Jan 15/07:58 uid 0 xlated 96B jited 105B memlock 4096B Signed-off-by: Roman Gushchin <guro@fb.com> Cc: Jakub Kicinski <jakub.kicinski@netronome.com> Cc: Quentin Monnet <quentin.monnet@netronome.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Alexei Starovoitov <ast@kernel.org> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-01-16	objtool: Improve error message for bad file argument	Josh Poimboeuf	1	-1/+3
	If a nonexistent file is supplied to objtool, it complains with a non-helpful error: open: No such file or directory Improve it to: objtool: Can't open 'foo': No such file or directory Reported-by: Markus <M4rkusXXL@web.de> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/406a3d00a21225eee2819844048e17f68523ccf6.1516025651.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-01-14	bpf: offload: add map offload infrastructure	Jakub Kicinski	1	-0/+1
	BPF map offload follow similar path to program offload. At creation time users may specify ifindex of the device on which they want to create the map. Map will be validated by the kernel's .map_alloc_check callback and device driver will be called for the actual allocation. Map will have an empty set of operations associated with it (save for alloc and free callbacks). The real device callbacks are kept in map->offload->dev_ops because they have slightly different signatures. Map operations are called in process context so the driver may communicate with HW freely, msleep(), wait() etc. Map alloc and free callbacks are muxed via existing .ndo_bpf, and are always called with rtnl lock held. Maps and programs are guaranteed to be destroyed before .ndo_uninit (i.e. before unregister_netdev() returns). Map callbacks are invoked with bpf_devs_lock read locked, drivers must take care of exclusive locking if necessary. All offload-specific branches are marked with unlikely() (through bpf_map_is_dev_bound()), given that branch penalty will be negligible compared to IO anyway, and we don't want to penalize SW path unnecessarily. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-01-14	Merge branch 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip	Linus Torvalds	4	-8/+565
	Pull x86 pti updates from Thomas Gleixner: "This contains: - a PTI bugfix to avoid setting reserved CR3 bits when PCID is disabled. This seems to cause issues on a virtual machine at least and is incorrect according to the AMD manual. - a PTI bugfix which disables the perf BTS facility if PTI is enabled. The BTS AUX buffer is not globally visible and causes the CPU to fault when the mapping disappears on switching CR3 to user space. A full fix which restores BTS on PTI is non trivial and will be worked on. - PTI bugfixes for EFI and trusted boot which make sure that the user space visible page table entries have the NX bit cleared - removal of dead code in the PTI pagetable setup functions - add PTI documentation - add a selftest for vsyscall to verify that the kernel actually implements what it advertises. - a sysfs interface to expose vulnerability and mitigation information so there is a coherent way for users to retrieve the status. - the initial spectre_v2 mitigations, aka retpoline: + The necessary ASM thunk and compiler support + The ASM variants of retpoline and the conversion of affected ASM code + Make LFENCE serializing on AMD so it can be used as speculation trap + The RSB fill after vmexit - initial objtool support for retpoline As I said in the status mail this is the most of the set of patches which should go into 4.15 except two straight forward patches still on hold: - the retpoline add on of LFENCE which waits for ACKs - the RSB fill after context switch Both should be ready to go early next week and with that we'll have covered the major holes of spectre_v2 and go back to normality" * 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (28 commits) x86,perf: Disable intel_bts when PTI security/Kconfig: Correct the Documentation reference for PTI x86/pti: Fix !PCID and sanitize defines selftests/x86: Add test_vsyscall x86/retpoline: Fill return stack buffer on vmexit x86/retpoline/irq32: Convert assembler indirect jumps x86/retpoline/checksum32: Convert assembler indirect jumps x86/retpoline/xen: Convert Xen hypercall indirect jumps x86/retpoline/hyperv: Convert assembler indirect jumps x86/retpoline/ftrace: Convert ftrace assembler indirect jumps x86/retpoline/entry: Convert entry assembler indirect jumps x86/retpoline/crypto: Convert crypto assembler indirect jumps x86/spectre: Add boot time option to select Spectre v2 mitigation x86/retpoline: Add initial retpoline support objtool: Allow alternatives to be ignored objtool: Detect jumps to retpoline thunks x86/pti: Make unpoison of pgd for trusted boot work for real x86/alternatives: Fix optimize_nops() checking sysfs/cpu: Fix typos in vulnerability documentation x86/cpu/AMD: Use LFENCE_RDTSC in preference to MFENCE_RDTSC ...
2018-01-13	tools/objtool/Makefile: don't assume sync-check.sh is executable	Andrew Morton	1	-1/+1
	patch(1) loses the x bit. So if a user follows our patching instructions in Documentation/admin-guide/README.rst, their kernel will not compile. Fixes: 3bd51c5a371de ("objtool: Move kernel headers/code sync check to a script") Reported-by: Nicolas Bock <nicolasbock@gentoo.org> Reported-by Joakim Tjernlund <Joakim.Tjernlund@infinera.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-01-13	selftests/x86: Add test_vsyscall	Andy Lutomirski	2	-1/+501
	This tests that the vsyscall entries do what they're expected to do. It also confirms that attempts to read the vsyscall page behave as expected. If changes are made to the vsyscall code or its memory map handling, running this test in all three of vsyscall=none, vsyscall=emulate, and vsyscall=native are helpful. (Because it's easy, this also compares the vsyscall results to their vDSO equivalents.) Note to KAISER backporters: please test this under all three vsyscall modes. Also, in the emulate and native modes, make sure that test_vsyscall_64 agrees with the command line or config option as to which mode you're in. It's quite easy to mess up the kernel such that native mode accidentally emulates or vice versa. Greg, etc: please backport this to all your Meltdown-patched kernels. It'll help make sure the patches didn't regress vsyscalls. CSigned-off-by: Andy Lutomirski <luto@kernel.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Hugh Dickins <hughd@google.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/2b9c5a174c1d60fd7774461d518aa75598b1d8fd.1515719552.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-01-11	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	David S. Miller	1	-21/+1
	BPF alignment tests got a conflict because the registers are output as Rn_w instead of just Rn in net-next, and in net a fixup for a testcase prohibits logical operations on pointers before using them. Also, we should attempt to patch BPF call args if JIT always on is enabled. Instead, if we fail to JIT the subprogs we should pass an error back up and fail immediately. Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-12	objtool: Allow alternatives to be ignored	Josh Poimboeuf	2	-7/+57
	Getting objtool to understand retpolines is going to be a bit of a challenge. For now, take advantage of the fact that retpolines are patched in with alternatives. Just read the original (sane) non-alternative instruction, and ignore the patched-in retpoline. This allows objtool to understand the control flow around the retpoline, even if it can't yet follow what's inside. This means the ORC unwinder will fail to unwind from inside a retpoline, but will work fine otherwise. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: gnomes@lxorguk.ukuu.org.uk Cc: Rik van Riel <riel@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: thomas.lendacky@amd.com Cc: Peter Zijlstra <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Jiri Kosina <jikos@kernel.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Kees Cook <keescook@google.com> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org> Cc: Paul Turner <pjt@google.com> Link: https://lkml.kernel.org/r/1515707194-20531-3-git-send-email-dwmw@amazon.co.uk
2018-01-12	objtool: Detect jumps to retpoline thunks	Josh Poimboeuf	1	-0/+7
	A direct jump to a retpoline thunk is really an indirect jump in disguise. Change the objtool instruction type accordingly. Objtool needs to know where indirect branches are so it can detect switch statement jump tables. This fixes a bunch of warnings with CONFIG_RETPOLINE like: arch/x86/events/intel/uncore_nhmex.o: warning: objtool: nhmex_rbox_msr_enable_event()+0x44: sibling call from callable instruction with modified stack frame kernel/signal.o: warning: objtool: copy_siginfo_to_user()+0x91: sibling call from callable instruction with modified stack frame ... Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: gnomes@lxorguk.ukuu.org.uk Cc: Rik van Riel <riel@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: thomas.lendacky@amd.com Cc: Peter Zijlstra <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Jiri Kosina <jikos@kernel.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Kees Cook <keescook@google.com> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org> Cc: Paul Turner <pjt@google.com> Link: https://lkml.kernel.org/r/1515707194-20531-2-git-send-email-dwmw@amazon.co.uk
2018-01-10	bpf: arsh is not supported in 32 bit alu thus reject it	Daniel Borkmann	1	-0/+40
	The following snippet was throwing an 'unknown opcode cc' warning in BPF interpreter: 0: (18) r0 = 0x0 2: (7b) (u64 )(r10 -16) = r0 3: (cc) (u32) r0 s>>= (u32) r0 4: (95) exit Although a number of JITs do support BPF_ALU \| BPF_ARSH \| BPF_{K,X} generation, not all of them do and interpreter does neither. We can leave existing ones and implement it later in bpf-next for the remaining ones, but reject this properly in verifier for the time being. Fixes: 17a5267067f3 ("bpf: verifier (add verifier core)") Reported-by: syzbot+93c4904c5c70348a6890@syzkaller.appspotmail.com Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-01-10	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf	David S. Miller	1	-21/+1
	Daniel Borkmann says: ==================== pull-request: bpf 2018-01-09 The following pull-request contains BPF updates for your net tree. The main changes are: 1) Prevent out-of-bounds speculation in BPF maps by masking the index after bounds checks in order to fix spectre v1, and add an option BPF_JIT_ALWAYS_ON into Kconfig that allows for removing the BPF interpreter from the kernel in favor of JIT-only mode to make spectre v2 harder, from Alexei. 2) Remove false sharing of map refcount with max_entries which was used in spectre v1, from Daniel. 3) Add a missing NULL psock check in sockmap in order to fix a race, from John. 4) Fix test_align BPF selftest case since a recent change in verifier rejects the bit-wise arithmetic on pointers earlier but test_align update was missing, from Alexei. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-09	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	David S. Miller	12	-49/+197

2018-01-07	selftests: fib_tests: Add test cases for netdev carrier change	Ido Schimmel	1	-0/+142
	Check that IPv4 and IPv6 react the same when the carrier of a netdev is toggled. Local routes should not be affected by this, whereas unicast routes should. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-07	selftests: fib_tests: Add test cases for netdev down	Ido Schimmel	1	-0/+141
	Check that IPv4 and IPv6 react the same when a netdev is being put administratively down. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-07	selftests: fib_tests: Add test cases for IPv4/IPv6 FIB	Ido Schimmel	2	-0/+147
	Add test cases to check that IPv4 and IPv6 react to a netdev being unregistered as expected. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-07	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next	David S. Miller	22	-60/+482
	Daniel Borkmann says: ==================== pull-request: bpf-next 2018-01-07 The following pull-request contains BPF updates for your net-next tree. The main changes are: 1) Add a start of a framework for extending struct xdp_buff without having the overhead of populating every data at runtime. Idea is to have a new per-queue struct xdp_rxq_info that holds read mostly data (currently that is, queue number and a pointer to the corresponding netdev) which is set up during rxqueue config time. When a XDP program is invoked, struct xdp_buff holds a pointer to struct xdp_rxq_info that the BPF program can then walk. The user facing BPF program that uses struct xdp_md for context can use these members directly, and the verifier rewrites context access transparently by walking the xdp_rxq_info and net_device pointers to load the data, from Jesper. 2) Redo the reporting of offload device information to user space such that it works in combination with network namespaces. The latter is reported through a device/inode tuple as similarly done in other subsystems as well (e.g. perf) in order to identify the namespace. For this to work, ns_get_path() has been generalized such that the namespace can be retrieved not only from a specific task (perf case), but also from a callback where we deduce the netns (ns_common) from a netdevice. bpftool support using the new uapi info and extensive test cases for test_offload.py in BPF selftests have been added as well, from Jakub. 3) Add two bpftool improvements: i) properly report the bpftool version such that it corresponds to the version from the kernel source tree. So pick the right linux/version.h from the source tree instead of the installed one. ii) fix bpftool and also bpf_jit_disasm build with bintutils >= 2.9. The reason for the build breakage is that binutils library changed the function signature to select the disassembler. Given this is needed in multiple tools, add a proper feature detection to the tools/build/features infrastructure, from Roman. 4) Implement the BPF syscall command BPF_MAP_GET_NEXT_KEY for the stacktrace map. It is currently unimplemented, but there are use cases where user space needs to walk all stacktrace map entries e.g. for dumping or deleting map entries w/o having to close and recreate the map. Add BPF selftests along with it, from Yonghong. 5) Few follow-up cleanups for the bpftool cgroup code: i) rename the cgroup 'list' command into 'show' as we have it for other subcommands as well, ii) then alias the 'show' command such that 'list' is accepted which is also common practice in iproute2, and iii) remove couple of newlines from error messages using p_err(), from Jakub. 6) Two follow-up cleanups to sockmap code: i) remove the unused bpf_compute_data_end_sk_skb() function and ii) only build the sockmap infrastructure when CONFIG_INET is enabled since it's only aware of TCP sockets at this time, from John. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-07	selftests/bpf: fix test_align	Alexei Starovoitov	1	-21/+1
	since commit 82abbf8d2fc4 the verifier rejects the bit-wise arithmetic on pointers earlier. The test 'dubious pointer arithmetic' now has less output to match on. Adjust it. Fixes: 82abbf8d2fc4 ("bpf: do not allow root to mangle valid pointers") Reported-by: kernel test robot <xiaolong.ye@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-01-06	tools/bpf: add a bpf selftest for stacktrace	Yonghong Song	3	-1/+190
	Added a bpf selftest in test_progs at tools directory for stacktrace. The test will populate a hashtable map and a stacktrace map at the same time with the same key, stackid. The user space will compare both maps, using BPF_MAP_LOOKUP_ELEM command and BPF_MAP_GET_NEXT_KEY command, to ensure that both have the same set of keys. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-01-04	tools: bpftool: remove new lines from errors	Jakub Kicinski	2	-11/+11
	It's a little bit unusual for kernel style, but we add the new line character to error strings inside the p_err() function. We do this because new lines at the end of error strings will break JSON output. Fix a few p_err("..\n") which snuck in recently. Fixes: 5ccda64d38cc ("bpftool: implement cgroup bpf operations") Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-01-04	tools: bpftool: alias show and list commands	Jakub Kicinski	8	-19/+22
	iproute2 seems to accept show and list as aliases. Let's do the same thing, and by allowing both bring cgroup syntax back in line with maps and progs. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>