linux-dev - Linux kernel development work

Age	Commit message (Collapse)	Author	Files	Lines
2017-10-05	Merge branch 'libbpf-support-more-map-options'	David S. Miller	2	-30/+43
	Craig Gallek says: ==================== libbpf: support more map options The functional change to this series is the ability to use flags when creating maps from object files loaded by libbpf. In order to do this, the first patch updates the library to handle map definitions that differ in size from libbpf's struct bpf_map_def. For object files with a larger map definition, libbpf will continue to load if the unknown fields are all zero, otherwise the map is rejected. If the map definition in the object file is smaller than expected, libbpf will use zero as a default value in the missing fields. ==================== Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-05	libbpf: use map_flags when creating maps	Craig Gallek	2	-1/+2
	This is required to use BPF_MAP_TYPE_LPM_TRIE or any other map type which requires flags. Signed-off-by: Craig Gallek <kraig@google.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-05	libbpf: parse maps sections of varying size	Craig Gallek	1	-29/+41
	This library previously assumed a fixed-size map options structure. Any new options were ignored. In order to allow the options structure to grow and to support parsing older programs, this patch updates the maps section parsing to handle varying sizes. Object files with maps sections smaller than expected will have the new fields initialized to zero. Object files which have larger than expected maps sections will be rejected unless all of the unrecognized data is zero. This change still assumes that each map definition in the maps section is the same size. Signed-off-by: Craig Gallek <kraig@google.com> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-05	net: qcom/emac: make function emac_isr static	Colin Ian King	1	-1/+1
	The function emac_isr is local to the source and does not need to be in global scope, so make it static. Cleans up sparse warnings: symbol 'emac_isr' was not declared. Should it be static? Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-05	Merge branch 'tcp-improving-RACK-cpu-performance'	David S. Miller	8	-49/+93
	Yuchung Cheng says: ==================== tcp: improving RACK cpu performance This patch set improves the CPU consumption of the RACK TCP loss recovery algorithm, in particular for high-speed networks. Currently, for every ACK in recovery RACK can potentially iterate over all sent packets in the write queue. On large BDP networks with non-trivial losses the RACK write queue walk CPU usage becomes unreasonably high. This patch introduces a new queue in TCP that keeps only skbs sent and not yet (s)acked or marked lost, in time order instead of sequence order. With that, RACK can examine this time-sorted list and only check packets that were sent recently, within the reordering window, per ACK. This is the fastest way without any write queue walks. The number of skbs examined per ACK is reduced by orders of magnitude. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-05	tcp: a small refactor of RACK loss detection	Yuchung Cheng	1	-22/+18
	Refactor the RACK loop to improve readability and speed up the checks. Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-05	tcp: more efficient RACK loss detection	Yuchung Cheng	1	-15/+5
	Use the new time-ordered list to speed up RACK. The detection logic is identical. But since the list is chronologically ordered by skb_mstamp and contains only skbs not yet acked or sacked, RACK can abort the loop upon hitting skbs that were sent more recently. On YouTube servers this patch reduces the iterations on write queue by 40x. The improvement is even bigger with large BDP networks. Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-05	tcp: new list for sent but unacked skbs for RACK recovery	Eric Dumazet	7	-16/+74
	This patch adds a new queue (list) that tracks the sent but not yet acked or SACKed skbs for a TCP connection. The list is chronologically ordered by skb->skb_mstamp (the head is the oldest sent skb). This list will be used to optimize TCP Rack recovery, which checks an skb's timestamp to judge if it has been lost and needs to be retransmitted. Since TCP write queue is ordered by sequence instead of sent time, RACK has to scan over the write queue to catch all eligible packets to detect lost retransmission, and iterates through SACKed skbs repeatedly. Special cares for rare events: 1. TCP repair fakes skb transmission so the send queue needs adjusted 2. SACK reneging would require re-inserting SACKed skbs into the send queue. For now I believe it's not worth the complexity to make RACK work perfectly on SACK reneging, so we do nothing here. 3. Fast Open: currently for non-TFO, send-queue correctly queues the pure SYN packet. For TFO which queues a pure SYN and then a data packet, send-queue only queues the data packet but not the pure SYN due to the structure of TFO code. This is okay because the SYN receiver would never respond with a SACK on a missing SYN (i.e. SYN is never fast-retransmitted by SACK/RACK). In order to not grow sk_buff, we use an union for the new list and _skb_refdst/destructor fields. This is a bit complicated because we need to make sure _skb_refdst and destructor are properly zeroed before skb is cloned/copied at transmit, and before being freed. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-05	RDS: IB: Initialize max_items based on underlying device attributes	Avinash Repaka	1	-2/+2
	Use max_1m_mrs/max_8k_mrs while setting max_items, as the former variables are set based on the underlying device attributes. Signed-off-by: Avinash Repaka <avinash.repaka@oracle.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-05	RDS: IB: Limit the scope of has_fr/has_fmr variables	Avinash Repaka	2	-7/+6
	This patch fixes the scope of has_fr and has_fmr variables as they are needed only in rds_ib_add_one(). Signed-off-by: Avinash Repaka <avinash.repaka@oracle.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-05	net/ipv4: Remove unused variable in route.c	Tim Hansen	1	-2/+1
	int rc is unmodified after initalization in net/ipv4/route.c, this patch simply cleans up that variable and returns 0. This was found with coccicheck M=net/ipv4/ on linus' tree. Signed-off-by: Tim Hansen <devtimhansen@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-05	tcp: clean up TFO server's initial tcp_rearm_rto() call	Wei Wang	1	-12/+9
	This commit does a cleanup and moves tcp_rearm_rto() call in the TFO server case into a previous spot in tcp_rcv_state_process() to make it more compact. This is only a cosmetic change. Suggested-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Wei Wang <weiwan@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Acked-by: Yuchung Cheng <ycheng@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-05	tcp: uniform the set up of sockets after successful connection	Wei Wang	4	-24/+17
	Currently in the TCP code, the initialization sequence for cached metrics, congestion control, BPF, etc, after successful connection is very inconsistent. This introduces inconsistent bevhavior and is prone to bugs. The current call sequence is as follows: (1) for active case (tcp_finish_connect() case): tcp_mtup_init(sk); icsk->icsk_af_ops->rebuild_header(sk); tcp_init_metrics(sk); tcp_call_bpf(sk, BPF_SOCK_OPS_ACTIVE_ESTABLISHED_CB); tcp_init_congestion_control(sk); tcp_init_buffer_space(sk); (2) for passive case (tcp_rcv_state_process() TCP_SYN_RECV case): icsk->icsk_af_ops->rebuild_header(sk); tcp_call_bpf(sk, BPF_SOCK_OPS_PASSIVE_ESTABLISHED_CB); tcp_init_congestion_control(sk); tcp_mtup_init(sk); tcp_init_buffer_space(sk); tcp_init_metrics(sk); (3) for TFO passive case (tcp_fastopen_create_child()): inet_csk(child)->icsk_af_ops->rebuild_header(child); tcp_init_congestion_control(child); tcp_mtup_init(child); tcp_init_metrics(child); tcp_call_bpf(child, BPF_SOCK_OPS_PASSIVE_ESTABLISHED_CB); tcp_init_buffer_space(child); This commit uniforms the above functions to have the following sequence: tcp_mtup_init(sk); icsk->icsk_af_ops->rebuild_header(sk); tcp_init_metrics(sk); tcp_call_bpf(sk, BPF_SOCK_OPS_ACTIVE/PASSIVE_ESTABLISHED_CB); tcp_init_congestion_control(sk); tcp_init_buffer_space(sk); This sequence is the same as the (1) active case. We pick this sequence because this order correctly allows BPF to override the settings including congestion control module and initial cwnd, etc from the route, and then allows the CC module to see those settings. Suggested-by: Neal Cardwell <ncardwell@google.com> Tested-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Wei Wang <weiwan@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Acked-by: Yuchung Cheng <ycheng@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-05	Merge branch 'VSOCK-sock_diag'	David S. Miller	21	-73/+1360
	Stefan Hajnoczi says: ==================== VSOCK: add sock_diag interface v3: * Rebased onto net-next/master and resolved Hyper-V transport conflict v2: * Moved tests to tools/testing/vsock/. I was unable to put them in selftests/ because they require manual setup of a VMware/KVM guest. * Moved to __vsock_in_bound/connected_table() to af_vsock.h * Fixed local variable ordering in Patch 4 There is currently no way for userspace to query open AF_VSOCK sockets. This means ss(8), netstat(8), and other utilities cannot display AF_VSOCK sockets. This patch series adds the netlink sock_diag interface for AF_VSOCK. Userspace programs sent a DUMP request including an sk_state bitmap to filter sockets based on their state (connected, listening, etc). The vsock_diag.ko module replies with information about matching sockets. This userspace ABI is defined in <linux/vm_sockets_diag.h>. The final patch adds a test suite that exercises the basic cases. Jorgen and Dexuan: I have only tested the virtio transport but this should also work for VMCI and Hyper-V. Please give it a shot if you have time. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-05	VSOCK: add tools/testing/vsock/vsock_diag_test	Stefan Hajnoczi	9	-0/+1039
	This patch adds tests for the vsock_diag.ko module. These tests are not self-tests because they require manual set up of a KVM or VMware guest. Please see tools/testing/vsock/README for instructions. The control.h and timeout.h infrastructure can be used for additional AF_VSOCK tests in the future. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-05	VSOCK: add sock_diag interface	Stefan Hajnoczi	5	-0/+234
	This patch adds the sock_diag interface for querying sockets from userspace. Tools like ss(8) and netstat(8) can use this interface to list open sockets. The userspace ABI is defined in <linux/vm_sockets_diag.h> and includes netlink request and response structs. The request can query sockets based on their sk_state (e.g. listening sockets only) and the response contains socket information fields including the local/remote addresses, inode number, etc. This patch does not dump VMCI pending sockets because I have only tested the virtio transport, which does not use pending sockets. Support can be added later by extending vsock_diag_dump() if needed by VMCI users. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-05	VSOCK: use TCP state constants for sk_state	Stefan Hajnoczi	8	-59/+64
	There are two state fields: socket->state and sock->sk_state. The socket->state field uses SS_UNCONNECTED, SS_CONNECTED, etc while the sock->sk_state typically uses values that match TCP state constants (TCP_CLOSE, TCP_ESTABLISHED). AF_VSOCK does not follow this convention and instead uses SS_* constants for both fields. The sk_state field will be exposed to userspace through the vsock_diag interface for ss(8), netstat(8), and other programs. This patch switches sk_state to TCP state constants so that the meaning of this field is consistent with other address families. Not just AF_INET and AF_INET6 use the TCP constants, AF_UNIX and others do too. The following mapping was used to convert the code: SS_FREE -> TCP_CLOSE SS_UNCONNECTED -> TCP_CLOSE SS_CONNECTING -> TCP_SYN_SENT SS_CONNECTED -> TCP_ESTABLISHED SS_DISCONNECTING -> TCP_CLOSING VSOCK_SS_LISTEN -> TCP_LISTEN In __vsock_create() the sk_state initialization was dropped because sock_init_data() already initializes sk_state to TCP_CLOSE. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-05	VSOCK: move __vsock_in_bound/connected_table() to af_vsock.h	Stefan Hajnoczi	2	-10/+12
	The vsock_diag.ko module will need to check socket table membership. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-05	VSOCK: export socket tables for sock_diag interface	Stefan Hajnoczi	2	-4/+11
	The socket table symbols need to be exported from vsock.ko so that the vsock_diag.ko module will be able to traverse sockets. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-05	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	David S. Miller	566	-3479/+6079
	Just simple overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-05	Merge tag 'pm-4.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm	Linus Torvalds	1	-7/+11
	Pull power management fix from Rafael Wysocki: "This fixes a code ordering issue in the main suspend-to-idle loop that causes some "low power S0 idle" conditions to be incorrectly reported as unmet with suspend/resume debug messages enabled" * tag 'pm-4.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: PM / s2idle: Invoke the ->wake() platform callback earlier
2017-10-06	Merge branch 'pm-sleep'	Rafael J. Wysocki	1	-7/+11
	* pm-sleep: PM / s2idle: Invoke the ->wake() platform callback earlier
2017-10-05	Merge tag 'for-4.14/dm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm	Linus Torvalds	7	-22/+47
	Pull device mapper fixes from Mike Snitzer: - a stable fix for the alignment of the event number reported at the end of the 'DM_LIST_DEVICES' ioctl. - a couple stable fixes for the DM crypt target. - a DM raid health status reporting fix. * tag 'for-4.14/dm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: dm raid: fix incorrect status output at the end of a "recover" process dm crypt: reject sector_size feature if device length is not aligned to it dm crypt: fix memory leak in crypt_ctr_cipher_old() dm ioctl: fix alignment of event number in the device list
2017-10-05	dm raid: fix incorrect status output at the end of a "recover" process	Jonathan Brassow	2	-5/+7
	There are three important fields that indicate the overall health and status of an array: dev_health, sync_ratio, and sync_action. They tell us the condition of the devices in the array, and the degree to which the array is synchronized. This commit fixes a condition that is reported incorrectly. When a member of the array is being rebuilt or a new device is added, the "recover" process is used to synchronize it with the rest of the array. When the process is complete, but the sync thread hasn't yet been reaped, it is possible for the state of MD to be: mddev->recovery = [ MD_RECOVERY_RUNNING MD_RECOVERY_RECOVER MD_RECOVERY_DONE ] curr_resync_completed = <max dev size> (but not MaxSector) and all rdevs to be In_sync. This causes the 'array_in_sync' output parameter that is passed to rs_get_progress() to be computed incorrectly and reported as 'false' -- or not in-sync. This in turn causes the dev_health status characters to be reported as all 'a', rather than the proper 'A'. This can cause erroneous output for several seconds at a time when tools will want to be checking the condition due to events that are raised at the end of a sync process. Fix this by properly calculating the 'array_in_sync' return parameter in rs_get_progress(). Also, remove an unnecessary intermediate 'recovery_cp' variable in rs_get_progress(). Signed-off-by: Jonathan Brassow <jbrassow@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2017-10-05	Merge tag 'sound-4.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound	Linus Torvalds	9	-11/+61
	Pull sound fixes from Takashi Iwai: "A collection of small fixes, mostly with stable ones: - X32 ABI fix for PCM; likely not so many people suffer from it, but still better to fix - Two minor kernel warning fixes on USB audio devices spotted by syzkaller - Regression fix of echoaudio due to its inconsistent dimension - Fix for HBR support on Intel DP audio, on some recent chips - USB-audio quirk for yet another Plantronics devices - Fix for potential double-fetch in ASIHPI FIFO queue" * tag 'sound-4.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: ALSA: usx2y: Suppress kernel warning at page allocation failures Revert "ALSA: echoaudio: purge contradictions between dimension matrix members and total number of members" ALSA: usb-audio: Check out-of-bounds access by corrupted buffer descriptor ALSA: pcm: Fix structure definition for X32 ABI ALSA: usb-audio: Add sample rate quirk for Plantronics C310/C520-M ALSA: hda - program ICT bits to support HBR audio ALSA: asihpi: fix a potential double-fetch bug when copying puhm ALSA: compress: Remove unused variable
2017-10-05	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid	Linus Torvalds	8	-27/+118
	Pull HID subsystem fixes from Jiri Kosina: - buffer management size fix for i2c-hid driver, from Adrian Salido - tool ID regression fixes for Wacom driver from Jason Gerecke - a few small assorted fixes and a few device ID additions * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid: Revert "HID: multitouch: Support ALPS PTP stick with pid 0x120A" HID: hidraw: fix power sequence when closing device HID: wacom: Always increment hdev refcount within wacom_get_hdev_data HID: wacom: generic: Clear ABS_MISC when tool leaves proximity HID: wacom: generic: Send MSC_SERIAL and ABS_MISC when leaving prox HID: i2c-hid: allocate hid buffers for real worst case HID: rmi: Make sure the HID device is opened on resume HID: multitouch: Support ALPS PTP stick with pid 0x120A HID: multitouch: support buttons and trackpoint on Lenovo X1 Tab Gen2 HID: wacom: Correct coordinate system of touchring and pen twist HID: wacom: Properly report negative values from Intuos Pro 2 Bluetooth HID: multitouch: Fix system-control buttons not working HID: add multi-input quirk for IDC6680 touchscreen HID: wacom: leds: Don't try to control the EKR's read-only LEDs HID: wacom: bits shifted too much for 9th and 10th buttons
2017-10-05	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	Linus Torvalds	91	-388/+863
	Pull networking fixes from David Miller: 1) Check iwlwifi 9000 reorder buffer out-of-space condition properly, from Sara Sharon. 2) Fix RCU splat in qualcomm rmnet driver, from Subash Abhinov Kasiviswanathan. 3) Fix session and tunnel release races in l2tp, from Guillaume Nault and Sabrina Dubroca. 4) Fix endian bug in sctp_diag_dump(), from Dan Carpenter. 5) Several mlx5 driver fixes from the Mellanox folks (max flow counters cap check, invalid memory access in IPoIB support, etc.) 6) tun_get_user() should bail if skb->len is zero, from Alexander Potapenko. 7) Fix RCU lookups in inetpeer, from Eric Dumazet. 8) Fix locking in packet_do_bund(). 9) Handle cb->start() error properly in netlink dump code, from Jason A. Donenfeld. 10) Handle multicast properly in UDP socket early demux code. From Paolo Abeni. 11) Several erspan bug fixes in ip_gre, from Xin Long. 12) Fix use-after-free in socket filter code, in order to handle the fact that listener lock is no longer taken during the three-way TCP handshake. From Eric Dumazet. 13) Fix infoleak in RTM_GETSTATS, from Nikolay Aleksandrov. 14) Fix tail call generation in x86-64 BPF JIT, from Alexei Starovoitov. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (77 commits) net: 8021q: skip packets if the vlan is down bpf: fix bpf_tail_call() x64 JIT net: stmmac: dwmac-rk: Add RK3128 GMAC support rndis_host: support Novatel Verizon USB730L net: rtnetlink: fix info leak in RTM_GETSTATS call socket, bpf: fix possible use after free mlxsw: spectrum_router: Track RIF of IPIP next hops mlxsw: spectrum_router: Move VRF refcounting net: hns3: Fix an error handling path in 'hclge_rss_init_hw()' net: mvpp2: Fix clock resource by adding an optional bus clock r8152: add Linksys USB3GIGV1 id l2tp: fix l2tp_eth module loading ip_gre: erspan device should keep dst ip_gre: set tunnel hlen properly in erspan_tunnel_init ip_gre: check packet length and mtu correctly in erspan_xmit ip_gre: get key from session_id correctly in erspan_rcv tipc: use only positive error codes in messages ppp: fix __percpu annotation udp: perform source validation for mcast early demux IPv4: early demux can return an error code ...
2017-10-04	Merge branch 'bpftool'	David S. Miller	19	-12/+2180
	Jakub Kicinski says: ==================== tools: add bpftool This set adds bpftool to the tools/ directory. The first patch renames tools/net to tools/bpf, the second one adds the new code, while the third adds simple documentation. v4: - rename docs .txt -> .rst (Jesper). v3: - address Alexei's comments about output and docs. v2: - report names, map ids, load time, uid; - add docs/man pages; - general cleanups & fixes. ==================== Acked-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-04	tools: bpftool: add documentation	Jakub Kicinski	5	-0/+263
	Add documentation for bpftool. Separate files for each subcommand. Use rst format. Documentation is compiled into man pages using rst2man. Signed-off-by: David Beckett <david.beckett@netronome.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-04	tools: bpf: add bpftool	Jakub Kicinski	8	-3/+1909
	Add a simple tool for querying and updating BPF objects on the system. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-04	tools: rename tools/net directory to tools/bpf	Jakub Kicinski	8	-9/+8
	We currently only have BPF tools in the tools/net directory. We are about to add more BPF tools there, not necessarily networking related, rename the directory and related Makefile targets to bpf. Suggested-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-04	Merge branch 'enslavement-extack'	David S. Miller	27	-97/+192
	David Ahern says: ==================== net: Plumb extack error reporting to enslavements Another round of extending extack error reporting, this time for enslavements through ndo_add_slave and notifiers. v2 - changed how the messages are added to bonding driver per Jiri's request - fixed spectrum message for LAG overflow per Ido's comment ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-04	mlxsw: spectrum: Add extack messages for enslave failures	David Ahern	1	-10/+37
	mlxsw fails device enslavement for a number of reasons. Use the extack facility to return an error message to the user stating why the enslave is failing. Messages are prefixed with "spectrum" so users know it is a constraint imposed by the hardware driver. For example: $ ip li add br0.11 link br0 type vlan id 11 $ ip li set swp11 master br0 Error: spectrum: Enslaving a port to a device that already has an upper device is not supported. Signed-off-by: David Ahern <dsahern@gmail.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Tested-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-04	net: bridge: Pass extack to down to netdev_master_upper_dev_link	David Ahern	4	-7/+15
	Pass extack arg to br_add_if. Add messages for a couple of failures and pass arg to netdev_master_upper_dev_link. Signed-off-by: David Ahern <dsahern@gmail.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-04	net: bonding: Add extack messages for some enslave failures	David Ahern	1	-0/+7
	A number of bond_enslave errors are logged using the netdev_err API. Return those messages to userspace via the extack facility. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-04	net: vrf: Add extack messages for enslave errors	David Ahern	1	-2/+11
	Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-04	net: Add extack to upper device linking	David Ahern	19	-32/+44
	Add extack arg to netdev_upper_dev_link and netdev_master_upper_dev_link Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-04	net: Add extack to ndo_add_slave	David Ahern	9	-13/+22
	Pass extack to do_set_master and down to ndo_add_slave Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-04	net: Add extack to netdev_notifier_info	David Ahern	2	-33/+56
	Add netlink_ext_ack to netdev_notifier_info to allow notifier handlers to return errors to userspace. Clean up the initialization in dev.c such that extack is easily added in subsequent patches where relevant. Specifically, remove the init call in call_netdevice_notifiers_info and have callers initalize on stack when info is declared. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-04	net: 8021q: skip packets if the vlan is down	Vishakha Narvekar	1	-0/+6
	If the vlan is down, free the packet instead of proceeding with other processing, or counting it as received. If vlan interfaces are used as slaves for bonding, with arp monitoring for connectivity, if the rx counter is seen to be incrementing, then the bond device will not observe that the interface is down. CC: David S. Miller <davem@davemloft.net> Signed-off-by: Vishakha Narvekar <Vishakha.Narvekar@dell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-04	dev: advertise the new nsid when the netns iface changes	Nicolas Dichtel	4	-13/+34
	x-netns interfaces are bound to two netns: the link netns and the upper netns. Usually, this kind of interfaces is created in the link netns and then moved to the upper netns. At the end, the interface is visible only in the upper netns. The link nsid is advertised via netlink in the upper netns, thus the user always knows where is the link part. There is no such mechanism in the link netns. When the interface is moved to another netns, the user cannot "follow" it. This patch adds a new netlink attribute which helps to follow an interface which moves to another netns. When the interface is unregistered, the new nsid is advertised. If the interface is a x-netns interface (ie rtnl_link_ops->get_link_net is defined), the nsid is allocated if needed. CC: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-04	Merge tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc	Linus Torvalds	44	-503/+810
	Pull ARM SoC fixes from Olof Johansson: "Our first batch of fixes this release cycle, unfortunately a bit noisier than usual. Two major groups stand out: - Some pinctril dts/dtsi changes for stm32 due to a new driver being merged during the merge window, and this aligns the DT contents between the old format and the new. This could arguably be moved to the next merge window but it also seemed relatively harmless to include now. - Amlogic/meson had driver changes merged that required devicetree changes to avoid functional/performance regressions. I've already asked them to be more careful about this going forward, and making sure drivers are compatible with older DTs when they make these kind of changes. The platform is actively being upstreamed so there's a few things in flight, we've seen this happen before and sometimes it's hard to catch in time. Besides that there is the usual mix of minor fixes" * tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (33 commits) ARM: dts: stm32: use right pinctrl compatible for stm32f469 ARM: dts: stm32: Fix STMPE1600 binding on stm32429i-eval board ARM: defconfig: update Gemini defconfig ARM: defconfig: FRAMEBUFFER_CONSOLE can no longer be =m arm64: dts: rockchip: add the grf clk for dw-mipi-dsi on rk3399 reset: Restrict RESET_HSDK to ARC_SOC_HSDK or COMPILE_TEST ARM: dts: da850-evm: add serial and ethernet aliases ARM: dts: am43xx-epos-evm: Remove extra CPSW EMAC entry ARM: dts: am33xx: Add spi alias to match SOC schematics ARM: OMAP2+: hsmmc: fix logic to call either omap_hsmmc_init or omap_hsmmc_late_init but not both ARM: dts: dra7: Set a default parent to mcasp3_ahclkx_mux ARM: OMAP2+: dra7xx: Set OPT_CLKS_IN_RESET flag for gpio1 ARM: dts: nokia n900: drop unneeded/undocumented parts of the dts arm64: dts: rockchip: Correct MIPI DPHY PLL clock on rk3399 arm64: dt marvell: Fix AP806 system controller size MAINTAINERS: add Macchiatobin maintainers entry ARC: reset: remove the misleading v1 suffix all over ARC: reset: add missing DT binding documentation for HSDKv1 reset driver ARC: reset: Only build on archs that have IOMEM ARM: at91: Replace uses of virt_to_phys with __pa_symbol ...
2017-10-04	Update James Hogan's email address	James Hogan	5	-6/+8
	Update my imgtec.com and personal email address to my kernel.org one in a few places as MIPS will soon no longer be part of Imagination Technologies, and add mappings in .mailcap so get_maintainer.pl reports the right address. Signed-off-by: James Hogan <jhogan@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-10-04	Merge branch 'bpf-cgroup-multi-prog'	David S. Miller	15	-182/+1086
	Alexei Starovoitov says: ==================== bpf: muli prog support for cgroup-bpf v1->v2: - fixed accidentally swapped two lines which caused static_key not going to zero - addressed Martin's feedback and changed prog_query to be consistent with verifier output: return -enospc and fill supplied buffer instead of just returning -enospc when buffer is too small to fit all prog_ids v1: cgroup-bpf use cases are getting more advanced and running only one program per cgroup is no longer enough. Therefore introduce support for attaching multiple programs per cgroup and running a set of effective programs. These patches introduces BPF_F_ALLOW_MULTI flag for BPF_PROG_ATTACH cmd. The default is still NONE and behavior of BPF_F_ALLOW_OVERRIDE flag is unchanged. The difference between three possible flags for BPF_PROG_ATTACH command: - NONE(default): No further bpf programs allowed in the subtree. - BPF_F_ALLOW_OVERRIDE: If a sub-cgroup installs some bpf program, the program in this cgroup yields to sub-cgroup program. - BPF_F_ALLOW_MULTI: If a sub-cgroup installs some bpf program, that cgroup program gets run in addition to the program in this cgroup. Most of the logic is in patch 1. Even when cgroup doesn't have any programs attached its set of effective program can be non-empty. To quickly execute them and avoid penalizing cgroups without any effective programs introduce 'struct bpf_prog_array' which has an optimization for cgroups with zero effective programs. Patch 2 introduces BPF_PROG_QUERY command for introspection Patch 3 makes verifier more strict for cgroup-bpf program types. Patch 4+ are tests. More details in individual patches ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-04	samples/bpf: use bpf_prog_query() interface	Alexei Starovoitov	1	-0/+36
	use BPF_PROG_QUERY command to strengthen test coverage Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-04	libbpf: add support for BPF_PROG_QUERY	Alexei Starovoitov	2	-1/+22
	add support for BPF_PROG_QUERY command to libbpf Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-04	libbpf: sync bpf.h	Alexei Starovoitov	1	-3/+52
	tools/include/uapi/linux/bpf.h got out of sync with actual kernel header. Update it. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-04	samples/bpf: add multi-prog cgroup test case	Alexei Starovoitov	2	-7/+185
	create 5 cgroups, attach 6 progs and check that progs are executed as: cgrp1 (MULTI progs A, B) -> cgrp2 (OVERRIDE prog C) -> cgrp3 (MULTI prog D) -> cgrp4 (OVERRIDE prog E) -> cgrp5 (NONE prog F) the event in cgrp5 triggers execution of F,D,A,B in that order. if prog F is detached, the execution is E,D,A,B if prog F and D are detached, the execution is E,A,B if prog F, E and D are detached, the execution is C,A,B Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-04	libbpf: introduce bpf_prog_detach2()	Alexei Starovoitov	2	-0/+13
	introduce bpf_prog_detach2() that takes one more argument prog_fd vs bpf_prog_detach() that takes only attach_fd and type. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-04	bpf: enforce return code for cgroup-bpf programs	Alexei Starovoitov	2	-0/+112
	with addition of tnum logic the verifier got smart enough and we can enforce return codes at program load time. For now do so for cgroup-bpf program types. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>