linux-dev - Linux kernel development work

Age	Commit message	Author	Files	Lines
2018-02-08	tuntap: add missing xdp flush	Jason Wang	1	-0/+15
	When using devmap to redirect packets between interfaces, xdp_do_flush() is usually required to flush any batched packets. Unfortunately, this is missing from the current tuntap implementation. Most hardware drivers run XDP inside the NAPI loop and call xdp_do_flush() at the end of each round of polling, whereas TAP runs it in process context, e.g. in tun_get_user(). Fix this by counting the pending redirected packets and flushing when the count exceeds NAPI_POLL_WEIGHT or when MSG_MORE was cleared by the sendmsg() caller. With this fix, xdp_redirect_map works again between two TAPs. Fixes: 761876c857cb ("tap: XDP support") Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
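The batching rule described in this commit can be sketched as a small Python model. This is not the driver code: the `RedirectBatcher` class, its method names, and the default weight of 64 (standing in for NAPI_POLL_WEIGHT) are assumptions made for illustration only.

```python
class RedirectBatcher:
    """Toy model: count redirected packets and flush them in batches."""

    def __init__(self, flush, weight=64):
        self.flush = flush      # callback standing in for xdp_do_flush()
        self.weight = weight    # stand-in for NAPI_POLL_WEIGHT (assumed value)
        self.pending = 0        # redirected packets not yet flushed

    def on_redirect(self, more):
        """Record one redirected packet; `more` models MSG_MORE being set."""
        self.pending += 1
        if self.pending > self.weight or not more:
            self.flush()
            self.pending = 0


flushes = []
batcher = RedirectBatcher(lambda: flushes.append("flush"))
for _ in range(3):
    batcher.on_redirect(more=True)   # still batching, nothing flushed yet
batcher.on_redirect(more=False)      # caller cleared MSG_MORE: flush now
```

The point of the sketch is that a flush happens either when the batch grows past the weight or when the sender signals that no more data follows, so redirected packets are never left waiting indefinitely.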
2018-02-09	kconfig: send error messages to stderr	Masahiro Yamada	4	-19/+24
	These messages should be directed to stderr. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Reviewed-by: Ulf Magnusson <ulfalizer@gmail.com>
2018-02-09	kconfig: echo stdin to stdout if either is redirected	Masahiro Yamada	1	-3/+4
	If stdio is not a tty, conf_askvalue() emits an additional newline to prevent prompts from being concatenated into a single line. This care is missing in conf_choice(), so a 'choice' prompt and the next prompt are shown on the same line. Move the code into xfgets() to cater to all cases. To improve this further, echo stdin to stdout. This clarifies which keys were input from stdio, and the stdout looks as if it came from a tty. I removed the isatty(2) check since stderr is unrelated here. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Reviewed-by: Ulf Magnusson <ulfalizer@gmail.com>
2018-02-09	kconfig: remove check_stdin()	Masahiro Yamada	1	-14/+0
	Except for silentoldconfig, valid_stdin is 1, so check_stdin() is a no-op. oldconfig and silentoldconfig work almost in the same way except that the latter generates additional files under include/. Both ask users for input for new symbols. I do not know why only silentoldconfig requires stdio to be a tty. $ rm -f .config; touch .config $ yes "" | make oldconfig > stdout $ rm -f .config; touch .config $ yes "" | make silentoldconfig > stdout make[1]: *** [silentoldconfig] Error 1 make: *** [silentoldconfig] Error 2 $ tail -n 4 stdout Console input/output is redirected. Run 'make oldconfig' to update configuration. scripts/kconfig/Makefile:40: recipe for target 'silentoldconfig' failed Makefile:507: recipe for target 'silentoldconfig' failed Redirection is useful, for example, for testing where we want to give particular key inputs from a test file, then check the result. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Reviewed-by: Ulf Magnusson <ulfalizer@gmail.com>
2018-02-09	kconfig: remove 'config*' pattern from .gitignore	Masahiro Yamada	1	-1/+0
	I could not figure out why this pattern should be ignored. Checking commit 1e65174a3378 ("Add some basic .gitignore files") did not help. Let's remove this pattern, then see if it is really needed. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Reviewed-by: Ulf Magnusson <ulfalizer@gmail.com>
2018-02-09	kconfig: show '?' prompt even if no help text is available	Masahiro Yamada	1	-7/+2
	'make config', 'make oldconfig', etc. always receive '?' as a valid input and show useful information even if no help text is available. ------------------------>8------------------------ foo (FOO) [N/y] (NEW) ? There is no help available for this option. Symbol: FOO [=n] Type : bool Prompt: foo Defined at Kconfig:1 ------------------------>8------------------------ However, '?' is not shown in the prompt if its help text is missing. Let's show '?' all the time so that the prompt and the behavior match. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Reviewed-by: Ulf Magnusson <ulfalizer@gmail.com>
2018-02-09	kconfig: do not write choice values when their dependency becomes n	Masahiro Yamada	1	-9/+7
	"# CONFIG_... is not set" for choice values are wrongly written into the .config file if they are once visible, then become invisible later. Test case --------- ---------------------------(Kconfig)---------------------------- config A bool "A" choice prompt "Choice ?" depends on A config CHOICE_B bool "Choice B" config CHOICE_C bool "Choice C" endchoice ---------------------------------------------------------------- ---------------------------(.config)---------------------------- CONFIG_A=y ---------------------------------------------------------------- With the Kconfig and .config above, $ make config scripts/kconfig/conf --oldaskconfig Kconfig * * Linux Kernel Configuration * A (A) [Y/n] n # # configuration written to .config # $ cat .config # # Automatically generated file; DO NOT EDIT. # Linux Kernel Configuration # # CONFIG_A is not set # CONFIG_CHOICE_B is not set # CONFIG_CHOICE_C is not set Here, # CONFIG_CHOICE_B is not set # CONFIG_CHOICE_C is not set should not be written into the .config file because their dependency "depends on A" is unmet. Currently, there is no code that clears SYMBOL_WRITE of choice values. Clear SYMBOL_WRITE for all symbols in sym_calc_value(), then set it again after calculating visibility. To simplify the logic, set the flag if they have non-n visibility, regardless of types, and regardless of whether they are choice values or not. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Reviewed-by: Ulf Magnusson <ulfalizer@gmail.com>
2018-02-08	netlink: ensure to loop over all netns in genlmsg_multicast_allns()	Nicolas Dichtel	1	-2/+10
	Nowadays, nlmsg_multicast() returns only 0 or -ESRCH but this was not the case when commit 134e63756d5f was pushed. However, there was no reason to stop the loop if a netns does not have listeners. Return -ESRCH only if there were no listeners in any netns. To avoid having the same problem in the future, I did not assume that nlmsg_multicast() returns only 0 or -ESRCH. Fixes: 134e63756d5f ("genetlink: make netns aware") CC: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
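The return-value rule in this commit can be sketched in Python. The `multicast_allns` function and its `send` callback are hypothetical stand-ins, not the kernel API, and returning immediately on an unexpected error is a choice made for this sketch.

```python
import errno


def multicast_allns(namespaces, send):
    """Try every namespace; `send(ns)` returns 0 or a negative errno."""
    delivered = False
    for ns in namespaces:
        err = send(ns)
        if err == 0:
            delivered = True          # at least one namespace had listeners
        elif err != -errno.ESRCH:
            return err                # unexpected error: report it as-is
        # -ESRCH only means "no listeners here", so keep looping
    return 0 if delivered else -errno.ESRCH


# Hypothetical namespaces: only "ns2" has a listener.
result = multicast_allns(
    ["ns1", "ns2", "ns3"],
    lambda ns: 0 if ns == "ns2" else -errno.ESRCH,
)
```

Here `result` is 0 because one namespace accepted the message, even though the first namespace in the list had no listeners.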
2018-02-08	rxrpc: Don't put crypto buffers on the stack	David Howells	2	-41/+52
	Don't put buffers of data to be handed to crypto on the stack as this may cause an assertion failure in the kernel (see below). Fix this by using a kmalloc'd buffer instead. kernel BUG at ./include/linux/scatterlist.h:147! ... RIP: 0010:rxkad_encrypt_response.isra.6+0x191/0x1b0 [rxrpc] RSP: 0018:ffffbe2fc06cfca8 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff989277d59900 RCX: 0000000000000028 RDX: 0000259dc06cfd88 RSI: 0000000000000025 RDI: ffffbe30406cfd88 RBP: ffffbe2fc06cfd60 R08: ffffbe2fc06cfd08 R09: ffffbe2fc06cfd08 R10: 0000000000000000 R11: 0000000000000000 R12: 1ffff7c5f80d9f95 R13: ffffbe2fc06cfd88 R14: ffff98927a3f7aa0 R15: ffffbe2fc06cfd08 FS: 0000000000000000(0000) GS:ffff98927fc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000055b1ff28f0f8 CR3: 000000001b412003 CR4: 00000000003606f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: rxkad_respond_to_challenge+0x297/0x330 [rxrpc] rxrpc_process_connection+0xd1/0x690 [rxrpc] ? process_one_work+0x1c3/0x680 ? __lock_is_held+0x59/0xa0 process_one_work+0x249/0x680 worker_thread+0x3a/0x390 ? process_one_work+0x680/0x680 kthread+0x121/0x140 ? kthread_create_worker_on_cpu+0x70/0x70 ret_from_fork+0x3a/0x50 Reported-by: Jonathan Billings <jsbillings@jsbillings.org> Reported-by: Marc Dionne <marc.dionne@auristor.com> Signed-off-by: David Howells <dhowells@redhat.com> Tested-by: Jonathan Billings <jsbillings@jsbillings.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-08	svcrdma: Fix Read chunk round-up	Chuck Lever	1	-4/+8
	A single NFSv4 WRITE compound can often have three operations: PUTFH, WRITE, then GETATTR. When the WRITE payload is sent in a Read chunk, the client places the GETATTR in the inline part of the RPC/RDMA message, just after the WRITE operation (sans payload). The position value in the Read chunk enables the receiver to insert the Read chunk at the correct place in the received XDR stream; that is between the WRITE and GETATTR. According to RFC 8166, an NFS/RDMA client does not have to add XDR round-up to the Read chunk that carries the WRITE payload. The receiver adds XDR round-up padding if it is absent and the receiver's XDR decoder requires it to be present. Commit 193bcb7b3719 ("svcrdma: Populate tail iovec when receiving") attempted to add support for receiving such a compound so that just the WRITE payload appears in rq_arg's page list, and the trailing GETATTR is placed in rq_arg's tail iovec. (TCP just strings the whole compound into the head iovec and page list, without regard to the alignment of the WRITE payload). The server transport logic also had to accommodate the optional XDR round-up of the Read chunk, which it did simply by lengthening the tail iovec when round-up was needed. This approach is adequate for the NFSv2 and NFSv3 WRITE decoders. Unfortunately it is not sufficient for nfsd4_decode_write. When the Read chunk length is a couple of bytes less than PAGE_SIZE, the computation at the end of nfsd4_decode_write allows argp->pagelen to go negative, which breaks the logic in read_buf that looks for the tail iovec. The result is that a WRITE operation whose payload length is just less than a multiple of a page succeeds, but the subsequent GETATTR in the same compound fails with NFS4ERR_OP_ILLEGAL because the XDR decoder can't find it. Clients ignore the error, but they must update their attribute cache via a separate round trip. 
As nfsd4_decode_write appears to expect the payload itself to always have appropriate XDR round-up, have svc_rdma_build_normal_read_chunk add the Read chunk XDR round-up to the page_len rather than lengthening the tail iovec. Reported-by: Olga Kornievskaia <kolga@netapp.com> Fixes: 193bcb7b3719 ("svcrdma: Populate tail iovec when receiving") Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Tested-by: Olga Kornievskaia <kolga@netapp.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-02-08	NFSD: hide unused svcxdr_dupstr()	Arnd Bergmann	1	-3/+2
	There is now only one caller left for svcxdr_dupstr() and this is inside of an #ifdef, so we can get a warning when the option is disabled: fs/nfsd/nfs4xdr.c:241:1: error: 'svcxdr_dupstr' defined but not used [-Werror=unused-function] This changes the remaining caller to use a nicer IS_ENABLED() check, which lets the compiler drop the unused code silently. Fixes: e40d99e6183e ("NFSD: Clean up symlink argument XDR decoders") Suggested-by: Rasmus Villemoes <rasmus.villemoes@prevas.dk> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-02-08	nfsd: store stat times in fill_pre_wcc() instead of inode times	Amir Goldstein	3	-24/+37
	The time values in stat and inode may differ for overlayfs and stat time values are the correct ones to use. This is also consistent with the fact that fill_post_wcc() also stores stat time values. This means introducing a stat call that could fail, where previously we were just copying values out of the inode. To be conservative about changing behavior, we fall back to copying values out of the inode in the error case. It might be better just to clear fh_pre_saved (though note the BUG_ON in set_change_info). Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-02-08	nfsd: encode stat->mtime for getattr instead of inode->i_mtime	Amir Goldstein	2	-4/+3
	The values of stat->mtime and inode->i_mtime may differ for overlayfs and stat->mtime is the correct value to use when encoding getattr. This is also consistent with the fact that other attr times are also encoded from stat values. Both callers of lease_get_mtime() already have the value of stat->mtime, so the only needed change is that lease_get_mtime() will not overwrite this value with inode->i_mtime in case the inode does not have an exclusive lease. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-02-08	nfsd: return RESOURCE not GARBAGE_ARGS on too many ops	J. Bruce Fields	2	-2/+10
	A client that sends more than a hundred ops in a single compound currently gets an rpc-level GARBAGE_ARGS error. It would be more helpful to return NFS4ERR_RESOURCE, since that gives the client a better idea how to recover (for example by splitting up the compound into smaller compounds). This is all a bit academic since we've never actually seen a reason for clients to send such long compounds, but we may as well fix it. While we're there, just use NFSD4_MAX_OPS_PER_COMPOUND == 16, the constant we already use in the 4.1 case, instead of hard-coding 100. Chances anyone actually uses even 16 ops per compound are small enough that I think there's a negligible risk of any regression. This fixes pynfs test COMP6. Reported-by: "Lu, Xinyu" <luxy.fnst@cn.fujitsu.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-02-08	selftests/ftrace: Add more tests for removing of function probes	Steven Rostedt (VMware)	1	-0/+37
	Al Viro discovered a bug in the removing of function probes where if it had a '*' at the beginning, it would fail to find any matches. That is because it reset the glob search string to the initial string with a "MATCH_END" type; instead of skipping the wildcard "*" it included it, where it would not match any functions because "*" was being treated as a normal character and not a wildcard one. Link: http://lkml.kernel.org/r/20180127031706.GE13338@ZenIV.linux.org.uk Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-02-08	selftests/ftrace: Add some missing glob checks	Steven Rostedt (VMware)	1	-0/+6
	Al Viro discovered a bug in the glob ftrace filtering code where "*a*b" is treated the same as "a*b", and functions that would be selected by "*a*b" but not "a*b" are not selected with "*a*b". Add tests for patterns "*a*b" and "a*b" to the glob selftest. Link: http://lkml.kernel.org/r/20180127170748.GF13338@ZenIV.linux.org.uk Cc: Shuah Khan <shuah@kernel.org> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-02-08	selftests/ftrace: Have reset_ftrace_filter handle multiple instances	Steven Rostedt (VMware)	1	-0/+3
	If a probe is attached to a static function that is in multiple files with the same name, removing it by name will remove all instances: # grep jump_label_unlock set_ftrace_filter jump_label_unlock:traceoff:unlimited jump_label_unlock:traceoff:unlimited # echo '!jump_label_unlock:traceoff' >> set_ftrace_filter # grep jump_label_unlock set_ftrace_filter # But the loop in reset_ftrace_filter will try to remove multiple instances multiple times. If this happens the second time will error and cause the test to fail. At each iteration of the loop, check to see if the probe being removed still exists. Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
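The "check before removing" loop described here can be modelled in Python. `FakeFilter` is a hypothetical stand-in for the set_ftrace_filter file, written only to reproduce the behaviour the commit message describes (removal by name drops every instance, and removing a missing probe is an error). It is not the selftest script itself.

```python
class FakeFilter:
    """Stand-in for set_ftrace_filter: removal by name drops every instance."""

    def __init__(self, entries):
        self.entries = list(entries)

    def remove(self, name):
        if not any(e.startswith(name + ":") for e in self.entries):
            raise OSError("no such probe")   # models the failing write
        self.entries = [e for e in self.entries if not e.startswith(name + ":")]


def reset_filter(filt):
    """Remove every probe, skipping entries an earlier removal already dropped."""
    for entry in list(filt.entries):         # snapshot taken before any removal
        if entry not in filt.entries:
            continue                         # already gone, do not remove twice
        filt.remove(entry.split(":")[0])


filt = FakeFilter(["jump_label_unlock:traceoff:unlimited"] * 2)
reset_filter(filt)                           # completes without raising OSError
```

Without the membership check, the second pass over the duplicate entry would call `remove()` on a probe that no longer exists and raise, which mirrors the test failure described above.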
2018-02-08	selftests/ftrace: Have reset_ftrace_filter handle modules	Steven Rostedt (VMware)	1	-3/+4
	If a function probe in set_ftrace_filter belongs to a module, it will contain the module name. Like: wmi_query_block [wmi]:traceoff:unlimited But writing: '!wmi_query_block [wmi]:traceoff' > set_ftrace_filter will cause an error. We still need to write: '!wmi_query_block:traceoff' > set_ftrace_filter Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
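A minimal Python sketch of the transformation described here, turning a line read from set_ftrace_filter into the string that has to be written back to remove the probe. The helper name `removal_command` is hypothetical, and the real selftest is a shell script, so this only illustrates the string handling.

```python
import re


def removal_command(entry):
    """Build the '!func:action' removal string from a set_ftrace_filter line."""
    func, action = entry.split(":")[:2]            # drop the trailing count field
    func = re.sub(r"\s*\[[^\]]*\]", "", func).strip()  # strip a " [module]" suffix
    return f"!{func}:{action}"


cmd = removal_command("wmi_query_block [wmi]:traceoff:unlimited")
```

For the example line from the commit message, `cmd` is `!wmi_query_block:traceoff`, which is the form the message says still has to be written.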
2018-02-08	tracing: Fix parsing of globs with a wildcard at the beginning	Steven Rostedt (VMware)	1	-5/+4
	Al Viro reported: For substring - sure, but what about something like "*a*b" and "a*b"? AFAICS, filter_parse_regex() ends up with identical results in both cases - MATCH_GLOB and search = "a*b". And no way for the caller to tell one from another. Testing this with the following: # cd /sys/kernel/tracing # echo '*raw*lock' > set_ftrace_filter bash: echo: write error: Invalid argument With this patch: # echo '*raw*lock' > set_ftrace_filter # cat set_ftrace_filter _raw_read_trylock _raw_write_trylock _raw_read_unlock _raw_spin_unlock _raw_write_unlock _raw_spin_trylock _raw_spin_lock _raw_write_lock _raw_read_lock Al recommended not setting the search buffer to skip the first '*' unless we know we are not using MATCH_GLOB. This implements his suggested logic. Link: http://lkml.kernel.org/r/20180127170748.GF13338@ZenIV.linux.org.uk Cc: stable@vger.kernel.org Fixes: 60f1d5e3bac44 ("ftrace: Support full glob matching") Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org> Reported-by: Al Viro <viro@ZenIV.linux.org.uk> Suggested-by: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
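Why dropping a leading wildcard changes the result can be shown with a short Python sketch. It uses Python's `fnmatch` module as a stand-in for the kernel's glob matcher, and assumes the filter written above was a glob with a leading wildcard, such as `*raw*lock`. The function list mixes names from the output above with two hypothetical ones (`raw_lock`, `spin_lock`).

```python
from fnmatch import fnmatchcase

# Sample function names; "raw_lock" and "spin_lock" are made up for contrast.
functions = ["_raw_spin_lock", "_raw_read_trylock", "raw_lock", "spin_lock"]


def select(pattern):
    """Return the functions that the glob pattern selects."""
    return [f for f in functions if fnmatchcase(f, pattern)]


with_leading = select("*raw*lock")     # "raw" may appear anywhere before "lock"
without_leading = select("raw*lock")   # name must start with "raw"
```

The two patterns select different sets, so a parser that silently skips the first wildcard and still treats the rest as a glob cannot give the result the user asked for.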
2018-02-08	ftrace: Remove incorrect setting of glob search field	Steven Rostedt (VMware)	1	-1/+0
	__unregister_ftrace_function_probe() will incorrectly parse the glob filter because it resets the search variable that was set up by filter_parse_regex(). Al Viro reported this: After that call of filter_parse_regex() we could have func_g.search not equal to glob only if glob started with '!' or '*'. In the former case we would've buggered off with -EINVAL (not = 1). In the latter we would've set func_g.search equal to glob + 1, calculated the length of that thing in func_g.len and proceeded to reset func_g.search back to glob. Suppose the glob is e.g. foo. We end up with func_g.type = MATCH_MIDDLE_ONLY; func_g.len = 3; func_g.search = "foo"; Feeding that to ftrace_match_record() will not do anything sane - we will be looking for names containing "*foo" (->len is ignored for that one). Link: http://lkml.kernel.org/r/20180127031706.GE13338@ZenIV.linux.org.uk Cc: stable@vger.kernel.org Fixes: 3ba009297149f ("ftrace: Introduce ftrace_glob structure") Reviewed-by: Dmitry Safonov <0x7f454c46@gmail.com> Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org> Reported-by: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-02-08	nfp: populate MODULE_VERSION	Jakub Kicinski	1	-0/+1
	DKMS and similar out-of-tree module replacement services use module version to make sure the out-of-tree software is not older than the module shipped with the kernel. We use the kernel version in ethtool -i output, put it into MODULE_VERSION as well. Reported-by: Jan Gutter <jan.gutter@netronome.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-08	nfp: limit the number of TSO segments	Jakub Kicinski	2	-1/+6
	Most FWs limit the number of TSO segments a frame can produce to 64. This is for fairness and efficiency (of FW datapath) reasons. If a frame with larger number of segments is submitted the FW will drop it. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-08	nfp: forbid disabling hw-tc-offload on representors while offload active	Jakub Kicinski	7	-21/+33
	All netdevs which can accept TC offloads must implement .ndo_set_features(). nfp_reprs currently do not do that, which means hw-tc-offload can be turned on and off even when offloads are active. Whether the offloads are active is really a question to nfp_ports, so remove the per-app tc_busy callback indirection thing, and simply count the number of offloaded items in nfp_port structure. Fixes: 8a2768732a4d ("nfp: provide infrastructure for offloading flower based TC filters") Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Tested-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-08	nfp: don't advertise hw-tc-offload on non-port netdevs	Jakub Kicinski	1	-1/+1
	nfp_port is a structure which represents an ASIC port, both PCIe vNIC (on a PF or a VF) or the external MAC port. vNIC netdev (struct nfp_net) and pure representor netdev (struct nfp_repr) both have a pointer to this structure. nfp_reprs always have a port associated. nfp_nets, however, only represent a device port in legacy mode, where they are considered the MAC port. In switchdev mode they are just the CPU's side of the PCIe link. By definition TC offloads only apply to device ports. Don't set the flag on vNICs without a port (i.e. in switchdev mode). Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Tested-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-08	nfp: bpf: require ETH table	Jakub Kicinski	1	-0/+12
	Upcoming changes will require all netdevs supporting TC offloads to have a full struct nfp_port. Require those for BPF offload. The operation without management FW reporting information about Ethernet ports is something we only support for very old and very basic NIC firmwares anyway. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Tested-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-08	Revert "ath10k: add sanity check to ie_len before parsing fw/board ie"	Ryan Hsu	1	-7/+7
	This reverts commit 9ed4f91628737c820af6a1815b65bc06bd31518f. The commit introduced a regression that over read the ie with the padding. - the expected IE information ath10k_pci 0000:03:00.0: found firmware features ie (1 B) ath10k_pci 0000:03:00.0: Enabling feature bit: 6 ath10k_pci 0000:03:00.0: Enabling feature bit: 7 ath10k_pci 0000:03:00.0: features ath10k_pci 0000:03:00.0: 00000000: c0 00 00 00 00 00 00 00 - the wrong IE with padding is read (0x77) ath10k_pci 0000:03:00.0: found firmware features ie (4 B) ath10k_pci 0000:03:00.0: Enabling feature bit: 6 ath10k_pci 0000:03:00.0: Enabling feature bit: 7 ath10k_pci 0000:03:00.0: Enabling feature bit: 8 ath10k_pci 0000:03:00.0: Enabling feature bit: 9 ath10k_pci 0000:03:00.0: Enabling feature bit: 10 ath10k_pci 0000:03:00.0: Enabling feature bit: 12 ath10k_pci 0000:03:00.0: Enabling feature bit: 13 ath10k_pci 0000:03:00.0: Enabling feature bit: 14 ath10k_pci 0000:03:00.0: Enabling feature bit: 16 ath10k_pci 0000:03:00.0: Enabling feature bit: 17 ath10k_pci 0000:03:00.0: Enabling feature bit: 18 ath10k_pci 0000:03:00.0: features ath10k_pci 0000:03:00.0: 00000000: c0 77 07 00 00 00 00 00 Tested-by: Mike Lothian <mike@fireburn.co.uk> Signed-off-by: Ryan Hsu <ryanhsu@codeaurora.org> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2018-02-08	tools: bpftool: add bash completion for cgroup commands	Quentin Monnet	2	-6/+62
	Add bash completion for "bpftool cgroup" command family. While at it, also fix the formatting of some keywords in the man page for cgroups. Fixes: 5ccda64d38cc ("bpftool: implement cgroup bpf operations") Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-02-08	tools: bpftool: add bash completion for `bpftool prog load`	Quentin Monnet	1	-2/+6
	Add bash completion for bpftool command `prog load`. Completion for this command is easy, as it only takes existing file paths as arguments. Fixes: 49a086c201a9 ("bpftool: implement prog load command") Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-02-08	tools: bpftool: make syntax for program map update explicit in man page	Quentin Monnet	1	-1/+2
	Specify in the documentation that when using bpftool to update a map of type BPF_MAP_TYPE_PROG_ARRAY, the syntax for the program used as a value should use the "id|tag|pinned" keywords convention, as used with "bpftool prog" commands. Fixes: ff69c21a85a4 ("tools: bpftool: add documentation") Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-02-08	tools: bpftool: exit doc Makefile early if rst2man is not available	Quentin Monnet	1	-0/+5
	If rst2man is not available on the system, running `make doc` from the bpftool directory fails with an error message. However, it creates empty manual pages (.8 files in this case). A subsequent call to `make doc-install` would then succeed and install those empty man pages on the system. To prevent this, raise a Makefile error and exit immediately if rst2man is not available before generating the pages from the rst documentation. Fixes: ff69c21a85a4 ("tools: bpftool: add documentation") Reported-by: Jason van Aaardt <jason.vanaardt@netronome.com> Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-02-08	libbpf: complete list of strings for guessing program type	Quentin Monnet	1	-0/+5
	It seems that the type guessing feature for libbpf, based on the name of the ELF section the program is located in, was inspired from samples/bpf/prog_load.c, which was not used by any sample for loading programs of certain types such as TC actions and classifiers, or LWT-related types. As a consequence, libbpf is not able to guess the type of such programs and to load them automatically if type is not provided to the `bpf_load_prog()` function. Add ELF section names associated to those eBPF program types so that they can be loaded with e.g. bpftool as well. Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-02-08	nfp: bpf: fix immed relocation for larger offsets	Jakub Kicinski	1	-1/+1
	Immed relocation is missing a shift which means for larger offsets the lower and higher part of the address would be ORed together. Fixes: ce4ebfd859c3 ("nfp: bpf: add helpers for updating immediate instructions") Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Jiong Wang <jiong.wang@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
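The effect of a missing shift can be illustrated with a small Python sketch. The 8-bit field width and the function names are assumptions for illustration only. The log does not show the actual instruction encoding used by the nfp JIT.

```python
LO_BITS = 8                       # assumed field width, for illustration only
LO_MASK = (1 << LO_BITS) - 1


def pack_buggy(offset):
    """Combine the two halves without shifting the high part."""
    lo = offset & LO_MASK
    hi = offset >> LO_BITS
    return lo | hi                # missing shift: hi lands on top of lo


def pack_fixed(offset):
    """Combine the two halves with the high part in its own bit range."""
    lo = offset & LO_MASK
    hi = offset >> LO_BITS
    return lo | (hi << LO_BITS)


small = (pack_buggy(0x34), pack_fixed(0x34))      # high part is zero
large = (pack_buggy(0x1234), pack_fixed(0x1234))  # high part is non-zero
```

For the small offset both functions agree, which is why a bug of this kind only shows up once the offset is large enough to have a non-zero high part.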
2018-02-08	CRIS: Restore mistakenly cleared kernel Makefile	Jesper Nilsson	1	-0/+17
	Commit 0fbc0b67a89d7 ("cris: remove arch specific early DT functions") was a bit overzealous in removing the CRIS DT handling, and the complete contents of the Makefile were erased instead of just the line for the devicetree file. This led to a complete link failure for all SoCs in the CRIS port due to missing symbols. Restore the contents except the line for the devicetree file. Signed-off-by: Jesper Nilsson <jesper.nilsson@axis.com> Fixes: 0fbc0b67a89d7
2018-02-08	xen: Fix {set,clear}_foreign_p2m_mapping on autotranslating guests	Simon Gaiser	1	-0/+6
	Commit 82616f9599a7 ("xen: remove tests for pvh mode in pure pv paths") removed the check for autotranslation from {set,clear}_foreign_p2m_mapping but those are called by grant-table.c also on PVH/HVM guests. Cc: <stable@vger.kernel.org> # 4.14 Fixes: 82616f9599a7 ("xen: remove tests for pvh mode in pure pv paths") Signed-off-by: Simon Gaiser <simon@invisiblethingslab.com> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Juergen Gross <jgross@suse.com>
2018-02-08	arm: imx: Add MODULE_ALIAS for cpufreq	Nicolas Chauvet	1	-0/+1
	Without this, the imx6q-cpufreq driver isn't loaded automatically when built as a module. Tested on a Wandboard Quad with a Fedora 27 kernel RPM. Signed-off-by: Nicolas Chauvet <kwizart@gmail.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-02-08	cpufreq: Add and use cpufreq_for_each_{valid_,}entry_idx()	Dominik Brodowski	8	-63/+100
	Pointer subtraction is slow and tedious. Therefore, replace all instances where cpufreq_for_each_{valid_,}entry loops contained such subtractions with an iteration macro providing an index to the frequency_table entry. Suggested-by: Al Viro <viro@ZenIV.linux.org.uk> Link: http://lkml.kernel.org/r/20180120020237.GM13338@ZenIV.linux.org.uk Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-02-08	cpufreq: intel_pstate: Enable HWP during system resume on CPU0	Chen Yu	1	-0/+5
	When maxcpus=1 is in the kernel command line, the BP is responsible for re-enabling the HWP - because currently only the APs invoke intel_pstate_hwp_enable() during their online process - which might put the system into an unstable state after resume. Fix this by enabling the HWP explicitly on the BP during resume. Reported-by: Doug Smythies <dsmythies@telus.net> Suggested-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Yu Chen <yu.c.chen@intel.com> [ rjw: Subject/changelog, minor modifications ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-02-08	cpufreq: scpi: fix error return code in scpi_cpufreq_init()	Wei Yongjun	1	-0/+1
	Fix to return a negative error code from the clk_get() error handling case instead of 0, as done elsewhere in this function. Fixes: 343a8d17fa8d (cpufreq: scpi: remove arm_big_little dependency) Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-02-08	ACPI: sbshc: remove raw pointer from printk() message	Greg Kroah-Hartman	1	-2/+2
	There's no need to be printing a raw kernel pointer to the kernel log at every boot. So just remove it, and change the whole message to use the correct dev_info() call at the same time. Reported-by: Wang Qize <wang_qize@venustech.com.cn> Cc: All applicable <stable@vger.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-02-08	s390/kconfig: Remove ARCH_WANTS_PROT_NUMA_PROT_NONE select	Ulf Magnusson	1	-1/+0
	The ARCH_WANTS_PROT_NUMA_PROT_NONE symbol was removed by commit 6a33979d5bd7 ("mm: remove misleading ARCH_USES_NUMA_PROT_NONE"), but S390 still selects it. Remove the ARCH_WANTS_PROT_NUMA_PROT_NONE select from the S390 symbol. Discovered with the https://github.com/ulfalizer/Kconfiglib/blob/master/examples/list_undefined.py script. Signed-off-by: Ulf Magnusson <ulfalizer@gmail.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-02-08	KVM: PPC: Book3S PR: Fix broken select due to misspelling	Ulf Magnusson	1	-1/+1
	Commit 76d837a4c0f9 ("KVM: PPC: Book3S PR: Don't include SPAPR TCE code on non-pseries platforms") added a reference to the globally undefined symbol PPC_SERIES. Looking at the rest of the commit, PPC_PSERIES was probably intended. Change PPC_SERIES to PPC_PSERIES. Discovered with the https://github.com/ulfalizer/Kconfiglib/blob/master/examples/list_undefined.py script. Fixes: 76d837a4c0f9 ("KVM: PPC: Book3S PR: Don't include SPAPR TCE code on non-pseries platforms") Cc: stable@vger.kernel.org # v4.12+ Signed-off-by: Ulf Magnusson <ulfalizer@gmail.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2018-02-07	tcp: tracepoint: only call trace_tcp_send_reset with full socket	Song Liu	2	-2/+4
	tracepoint tcp_send_reset requires a full socket to work. However, it may be called when in TCP_TIME_WAIT: case TCP_TW_RST: tcp_v6_send_reset(sk, skb); inet_twsk_deschedule_put(inet_twsk(sk)); goto discard_it; To avoid this problem, this patch checks the socket with sk_fullsock() before calling trace_tcp_send_reset(). Fixes: c24b14c46bb8 ("tcp: add tracepoint trace_tcp_send_reset") Signed-off-by: Song Liu <songliubraving@fb.com> Reviewed-by: Lawrence Brakmo <brakmo@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-07	sch_netem: Bug fixing in calculating Netem interval	Md. Islam	1	-1/+1
	In kernel 4.15.0+, Netem does not work properly. Netem setup: tc qdisc add dev h1-eth0 root handle 1: netem delay 10ms 2ms Result: PING 172.16.101.2 (172.16.101.2) 56(84) bytes of data. 64 bytes from 172.16.101.2: icmp_seq=1 ttl=64 time=22.8 ms 64 bytes from 172.16.101.2: icmp_seq=2 ttl=64 time=10.9 ms 64 bytes from 172.16.101.2: icmp_seq=3 ttl=64 time=10.9 ms 64 bytes from 172.16.101.2: icmp_seq=5 ttl=64 time=11.4 ms 64 bytes from 172.16.101.2: icmp_seq=6 ttl=64 time=11.8 ms 64 bytes from 172.16.101.2: icmp_seq=4 ttl=64 time=4303 ms 64 bytes from 172.16.101.2: icmp_seq=10 ttl=64 time=11.2 ms 64 bytes from 172.16.101.2: icmp_seq=11 ttl=64 time=10.3 ms 64 bytes from 172.16.101.2: icmp_seq=7 ttl=64 time=4304 ms 64 bytes from 172.16.101.2: icmp_seq=8 ttl=64 time=4303 ms Patch: (rnd % (2 * sigma)) - sigma was overflowing s32. After applying the patch, I got the following output, which is the desired behavior. PING 172.16.101.2 (172.16.101.2) 56(84) bytes of data. 64 bytes from 172.16.101.2: icmp_seq=1 ttl=64 time=21.1 ms 64 bytes from 172.16.101.2: icmp_seq=2 ttl=64 time=8.46 ms 64 bytes from 172.16.101.2: icmp_seq=3 ttl=64 time=9.00 ms 64 bytes from 172.16.101.2: icmp_seq=4 ttl=64 time=11.8 ms 64 bytes from 172.16.101.2: icmp_seq=5 ttl=64 time=8.36 ms 64 bytes from 172.16.101.2: icmp_seq=6 ttl=64 time=11.8 ms 64 bytes from 172.16.101.2: icmp_seq=7 ttl=64 time=8.11 ms 64 bytes from 172.16.101.2: icmp_seq=8 ttl=64 time=10.0 ms 64 bytes from 172.16.101.2: icmp_seq=9 ttl=64 time=11.3 ms 64 bytes from 172.16.101.2: icmp_seq=10 ttl=64 time=11.5 ms 64 bytes from 172.16.101.2: icmp_seq=11 ttl=64 time=10.2 ms Reviewed-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
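The arithmetic problem named in the message can be reproduced in userspace. The sketch below assumes `rnd` is an unsigned 32-bit value, `sigma` a signed 32-bit jitter and `mu` a signed 64-bit mean delay, all in nanoseconds; those types and both function names are assumptions made for illustration. The second function shows one way to keep the subtraction out of 32-bit arithmetic and is not necessarily the exact change in the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Models `(rnd % (2 * sigma)) - sigma + mu` when rnd is unsigned 32-bit:
 * C evaluates the subtraction in unsigned 32-bit arithmetic, so it wraps
 * around whenever the remainder is smaller than sigma. The conversions
 * that C applies implicitly are written out here.
 */
static int64_t jitter_wrapping(uint32_t rnd, int32_t sigma, int64_t mu)
{
    assert(sigma > 0); /* avoid modulo by zero */
    uint32_t r = rnd % (2u * (uint32_t)sigma);
    uint32_t d = r - (uint32_t)sigma; /* wraps modulo 2^32 if r < sigma */
    return (int64_t)d + mu;
}

/* Widen to 64 bits before subtracting, so the result can go below mu. */
static int64_t jitter_widened(uint32_t rnd, int32_t sigma, int64_t mu)
{
    assert(sigma > 0);
    uint32_t r = rnd % (2u * (uint32_t)sigma);
    return (int64_t)r + mu - (int64_t)sigma;
}
```

With `rnd = 0`, `sigma = 2000000` (2 ms) and `mu = 10000000` (10 ms), the wrapping version returns 4302967296 ns, roughly 4303 ms, which is consistent with the outliers in the first ping run; the widened version returns 8000000 ns (8 ms).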
2018-02-07	net: ethernet: ti: cpsw: fix net watchdog timeout	Grygorii Strashko	1	-2/+14
	It was discovered that a simple program which indefinitely sends 200b UDP packets and runs on a TI AM574x SoC (SMP) under an RT kernel triggers a network watchdog timeout in the TI CPSW driver (<6 hours run). The network watchdog timeout is triggered by a race between cpsw_ndo_start_xmit() and cpsw_tx_handler() [NAPI]: cpsw_ndo_start_xmit() if (unlikely(!cpdma_check_free_tx_desc(txch))) { txq = netdev_get_tx_queue(ndev, q_idx); netif_tx_stop_queue(txq); ^^ as per [1], a barrier has to be used after set_bit(), otherwise the new value might not be visible to other CPUs } cpsw_tx_handler() if (unlikely(netif_tx_queue_stopped(txq))) netif_tx_wake_queue(txq); When this happens, the ndev TX queue becomes disabled forever while the driver's HW TX queue is empty. Fix this by adding smp_mb__after_atomic() after netif_tx_stop_queue() calls and double-checking for free TX descriptors after stopping the ndev TX queue: if there are free TX descriptors, wake up the ndev TX queue. [1] https://www.kernel.org/doc/html/latest/core-api/atomic_ops.html Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Reviewed-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
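The stop, barrier, re-check pattern from this message can be modelled in userspace with C11 atomics. This is a sketch under stated assumptions rather than the driver code: `struct txq_model`, `xmit_stop_if_full()` and `tx_complete()` are hypothetical names, and `atomic_thread_fence(memory_order_seq_cst)` stands in for the kernel's smp_mb__after_atomic().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model of one TX queue; not the driver's data structures. */
struct txq_model {
    atomic_bool stopped;   /* models the ndev TX queue "stopped" bit */
    atomic_int  free_desc; /* models free hardware TX descriptors */
};

/* Transmit side: returns true if the queue is left stopped. */
static bool xmit_stop_if_full(struct txq_model *q)
{
    if (atomic_load(&q->free_desc) > 0)
        return false;

    atomic_store_explicit(&q->stopped, true, memory_order_relaxed);
    /* Full fence, standing in for smp_mb__after_atomic(). */
    atomic_thread_fence(memory_order_seq_cst);

    /* Re-check: the completion side may have freed descriptors meanwhile. */
    if (atomic_load_explicit(&q->free_desc, memory_order_relaxed) > 0) {
        atomic_store(&q->stopped, false); /* wake the queue again */
        return false;
    }
    return true;
}

/* Completion side: frees n descriptors and wakes a stopped queue. */
static void tx_complete(struct txq_model *q, int n)
{
    assert(n > 0);
    atomic_fetch_add(&q->free_desc, n);
    if (atomic_load(&q->stopped))
        atomic_store(&q->stopped, false);
}
```

The re-check after the fence is what closes the window: if the completion side ran between the first descriptor check and the stop, the transmit side sees the freed descriptors and wakes the queue itself instead of leaving it stopped with nothing left to wake it.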
2018-02-07	ibmvnic: Ensure that buffers are NULL after free	Thomas Falcon	1	-0/+5
	This change will guard against a double free in the case that the buffers were previously freed at some other time, such as during a device reset. It resolves a kernel oops that occurred when changing the VNIC device's MTU. Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
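The idiom behind this change can be sketched in plain C. `struct buf_model` and `release_buf()` are hypothetical names used only for illustration; the sketch uses the C library's free(), which, like the kernel's kfree(), does nothing when passed NULL.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical holder for one buffer; not the driver's structure. */
struct buf_model {
    void *data;
};

/* Free the buffer and clear the pointer so a repeated call is harmless. */
static void release_buf(struct buf_model *b)
{
    assert(b != NULL);
    free(b->data);  /* free(NULL) is a no-op */
    b->data = NULL; /* a later release now sees NULL, not a stale pointer */
}
```

Calling `release_buf()` twice on the same holder is safe in this model: the second call passes NULL to free() instead of freeing the same memory again.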
2018-02-07	ibmvnic: Fix rx queue cleanup for non-fatal resets	John Allen	1	-1/+2
	At some point, a check was added to exit the polling routine during resets. This makes sense for most reset conditions, but for a non-fatal error, we expect the polling routine to continue running to properly clean up the rx queues. This patch checks if we are performing a non-fatal reset and if we are, continues normal polling operation. Signed-off-by: John Allen <jallen@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-07	i40e: Fix the number of queues available to be mapped for use	Amritha Nambiar	1	-13/+14
	Fix the number of queues per enabled TC and report available queues to the kernel without having to limit them to the max RSS limit so they are available to be mapped for XPS. This allows a queue per processing thread available for handling traffic for the given traffic class. Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-07	net/ipv6: onlink nexthop checks should default to main table	David Ahern	1	-2/+3
	Because of differences in how IPv4 and IPv6 handle FIB lookups, verification of nexthops with the onlink flag needs to default to the main table rather than the local table used by IPv4. As it stands, an address within a connected route on device 1 can be used with onlink on device 2. Updating the table properly rejects the route due to the egress device mismatch. Update the extack message as well to show it could be a device mismatch for the nexthop spec. Fixes: fc1e64e1092f ("net/ipv6: Add support for onlink flag") Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-07	net/ipv6: Handle reject routes with onlink flag	David Ahern	1	-1/+2
	Verification of nexthops with the onlink flag needs to handle unreachable routes. The lookup is only intended to validate that the gateway address is not a local address and, if the gateway resolves, that the egress device matches the given device. Hence, hitting any default reject route is ok. Fixes: fc1e64e1092f ("net/ipv6: Add support for onlink flag") Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-07	sun: Add SPDX license tags to Sun network drivers	Shannon Nelson	11	-0/+11
	Add the appropriate SPDX license tags to the Sun network drivers as outlined in Documentation/process/license-rules.rst. Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com> Reviewed-by: Zhu Yanjun <yanjun.zhu@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>