aboutsummaryrefslogtreecommitdiffstats
AgeCommit message (Collapse)AuthorFilesLines
2017-02-17net: hamachi: use new api ethtool_{get|set}_link_ksettingsPhilippe Reynes1-6/+8
The ethtool api {get|set}_settings is deprecated. We move this driver to new api {get|set}_link_ksettings. As I don't have the hardware, I'd be very pleased if someone may test this patch. Signed-off-by: Philippe Reynes <tremyfr@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-17bridge: don't indicate expiry on NTF_EXT_LEARNED fdb entriesRoopa Prabhu1-1/+1
added_by_external_learn fdb entries are added and expired by external entities like switchdev driver or external controllers. ageing is already disabled for such entries. Hence, don't indicate expiry for such fdb entries. CC: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> CC: Jiri Pirko <jiri@resnulli.us> CC: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Tested-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-17Merge branch 'bpf-misc'David S. Miller19-110/+453
Daniel Borkmann says: ==================== Misc BPF improvements This last series for this window adds various misc improvements to BPF, one is to mark registered map and prog types as __ro_after_init, another one for removing cBPF stubs in eBPF JITs and moving the stub to the core and last also improving JITs is to make generated images visible to the kernel and kallsyms, so they can be seen in traces. For details, please have a look at the individual patches. Thanks a lot! ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-17bpf: make jited programs visible in tracesDaniel Borkmann13-63/+419
Long standing issue with JITed programs is that stack traces from function tracing check whether a given address is kernel code through {__,}kernel_text_address(), which checks for code in core kernel, modules and dynamically allocated ftrace trampolines. But what is still missing is BPF JITed programs (interpreted programs are not an issue as __bpf_prog_run() will be attributed to them), thus when a stack trace is triggered, the code walking the stack won't see any of the JITed ones. The same for address correlation done from user space via reading /proc/kallsyms. This is read by tools like perf, but the latter is also useful for permanent live tracing with eBPF itself in combination with stack maps when other eBPF types are part of the callchain. See offwaketime example on dumping stack from a map. This work tries to tackle that issue by making the addresses and symbols known to the kernel. The lookup from *kernel_text_address() is implemented through a latched RB tree that can be read under RCU in fast-path that is also shared for symbol/size/offset lookup for a specific given address in kallsyms. The slow-path iteration through all symbols in the seq file done via RCU list, which holds a tiny fraction of all exported ksyms, usually below 0.1 percent. Function symbols are exported as bpf_prog_<tag>, in order to aide debugging and attribution. This facility is currently enabled for root-only when bpf_jit_kallsyms is set to 1, and disabled if hardening is active in any mode. The rationale behind this is that still a lot of systems ship with world read permissions on kallsyms thus addresses should not get suddenly exposed for them. If that situation gets much better in future, we always have the option to change the default on this. Likewise, unprivileged programs are not allowed to add entries there either, but that is less of a concern as most such programs types relevant in this context are for root-only anyway. If enabled, call graphs and stack traces will then show a correct attribution; one example is illustrated below, where the trace is now visible in tooling such as perf script --kallsyms=/proc/kallsyms and friends. Before: 7fff8166889d bpf_clone_redirect+0x80007f0020ed (/lib/modules/4.9.0-rc8+/build/vmlinux) f5d80 __sendmsg_nocancel+0xffff006451f1a007 (/usr/lib64/libc-2.18.so) After: 7fff816688b7 bpf_clone_redirect+0x80007f002107 (/lib/modules/4.9.0-rc8+/build/vmlinux) 7fffa0575728 bpf_prog_33c45a467c9e061a+0x8000600020fb (/lib/modules/4.9.0-rc8+/build/vmlinux) 7fffa07ef1fc cls_bpf_classify+0x8000600020dc (/lib/modules/4.9.0-rc8+/build/vmlinux) 7fff81678b68 tc_classify+0x80007f002078 (/lib/modules/4.9.0-rc8+/build/vmlinux) 7fff8164d40b __netif_receive_skb_core+0x80007f0025fb (/lib/modules/4.9.0-rc8+/build/vmlinux) 7fff8164d718 __netif_receive_skb+0x80007f002018 (/lib/modules/4.9.0-rc8+/build/vmlinux) 7fff8164e565 process_backlog+0x80007f002095 (/lib/modules/4.9.0-rc8+/build/vmlinux) 7fff8164dc71 net_rx_action+0x80007f002231 (/lib/modules/4.9.0-rc8+/build/vmlinux) 7fff81767461 __softirqentry_text_start+0x80007f0020d1 (/lib/modules/4.9.0-rc8+/build/vmlinux) 7fff817658ac do_softirq_own_stack+0x80007f00201c (/lib/modules/4.9.0-rc8+/build/vmlinux) 7fff810a2c20 do_softirq+0x80007f002050 (/lib/modules/4.9.0-rc8+/build/vmlinux) 7fff810a2cb5 __local_bh_enable_ip+0x80007f002085 (/lib/modules/4.9.0-rc8+/build/vmlinux) 7fff8168d452 ip_finish_output2+0x80007f002152 (/lib/modules/4.9.0-rc8+/build/vmlinux) 7fff8168ea3d ip_finish_output+0x80007f00217d (/lib/modules/4.9.0-rc8+/build/vmlinux) 7fff8168f2af ip_output+0x80007f00203f (/lib/modules/4.9.0-rc8+/build/vmlinux) [...] 7fff81005854 do_syscall_64+0x80007f002054 (/lib/modules/4.9.0-rc8+/build/vmlinux) 7fff817649eb return_from_SYSCALL_64+0x80007f002000 (/lib/modules/4.9.0-rc8+/build/vmlinux) f5d80 __sendmsg_nocancel+0xffff01c484812007 (/usr/lib64/libc-2.18.so) Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Cc: linux-kernel@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-17bpf: remove stubs for cBPF from arch codeDaniel Borkmann6-27/+14
Remove the dummy bpf_jit_compile() stubs for eBPF JITs and make that a single __weak function in the core that can be overridden similarly to the eBPF one. Also remove stale pr_err() mentions of bpf_jit_compile. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-17bpf: mark all registered map/prog types as __ro_after_initDaniel Borkmann6-23/+23
All map types and prog types are registered to the BPF core through bpf_register_map_type() and bpf_register_prog_type() during init and remain unchanged thereafter. As by design we don't (and never will) have any pluggable code that can register to that at any later point in time, lets mark all the existing bpf_{map,prog}_type_list objects in the tree as __ro_after_init, so they can be moved to read-only section from then onwards. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-17bridge: vlan_tunnel: explicitly reset metadata attrs to NULL on failureRoopa Prabhu1-0/+2
Fixes: efa5356b0d97 ("bridge: per vlan dst_metadata netlink support") Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-17net: bgmac: store MAC address directly in netdev->dev_addrTobias Klauser4-8/+6
After commit 34a5102c3235 ("net: bgmac: allocate struct bgmac just once & don't copy it") the mac_addr member of struct bgmac is no longer necessary to pass the MAC address to bgmac_enet_probe(). Instead it can directly be stored in netdev->dev_addr. Also use eth_hw_addr_random() instead of eth_random_addr() in case a random MAC is nedded. This will make sure netdev->addr_assign_type will be properly set. Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Acked-by: Jon Mason <jon.mason@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-17Merge tag 'wireless-drivers-next-for-davem-2017-02-16' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-nextDavid S. Miller28-431/+213
Kalle Valo says: ==================== wireless-drivers-next patches for 4.11 Mostly small fixes, not really any new features. Major changes: ath10k * when trying older firmware versions don't confuse user with error messages ath9k * fix crash in AP mode (regression) * fix relayfs crash (regression) * fix initialisation with AR9340 and AR9550 ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-17net: ethoc: Use eth_hw_addr_random()Tobias Klauser1-8/+2
Use eth_hw_addr_random() to set a random dev_addr and update addr_assign_type instead of open-coding it. Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-17Merge branch 'rhashtable-allocation-failure-during-insertion'David S. Miller5-94/+316
Herbert Xu says: ==================== rhashtable: Handle table allocation failure during insertion v2 - Added Ack to patch 2. Fixed RCU annotation in code path executed by rehasher by using rht_dereference_bucket. v1 - This series tackles the problem of table allocation failures during insertion. The issue is that we cannot vmalloc during insertion. This series deals with this by introducing nested tables. The first two patches removes manual hash table walks which cannot work on a nested table. The final patch introduces nested tables. I've tested this with test_rhashtable and it appears to work. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-17rhashtable: Add nested tablesHerbert Xu2-72/+276
This patch adds code that handles GFP_ATOMIC kmalloc failure on insertion. As we cannot use vmalloc, we solve it by making our hash table nested. That is, we allocate single pages at each level and reach our desired table size by nesting them. When a nested table is created, only a single page is allocated at the top-level. Lower levels are allocated on demand during insertion. Therefore for each insertion to succeed, only two (non-consecutive) pages are needed. After a nested table is created, a rehash will be scheduled in order to switch to a vmalloced table as soon as possible. Also, the rehash code will never rehash into a nested table. If we detect a nested table during a rehash, the rehash will be aborted and a new rehash will be scheduled. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-17tipc: Fix tipc_sk_reinit race conditionsHerbert Xu2-11/+23
There are two problems with the function tipc_sk_reinit. Firstly it's doing a manual walk over an rhashtable. This is broken as an rhashtable can be resized and if you manually walk over it during a resize then you may miss entries. Secondly it's missing memory barriers as previously the code used spinlocks which provide the barriers implicitly. This patch fixes both problems. Fixes: 07f6c4bc048a ("tipc: convert tipc reference table to...") Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-17gfs2: Use rhashtable walk interface in glock_hash_walkHerbert Xu1-11/+17
The function glock_hash_walk walks the rhashtable by hand. This is broken because if it catches the hash table in the middle of a rehash, then it will miss entries. This patch replaces the manual walk by using the rhashtable walk interface. Fixes: 88ffbf3e037e ("GFS2: Use resizable hash table for glocks") Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-17net: mvneta: make mvneta_eth_tool_ops staticJisheng Zhang1-1/+1
The mvneta_eth_tool_ops is only used internally in mvneta driver, so make it static. Signed-off-by: Jisheng Zhang <jszhang@marvell.com> Acked-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-17Merge branch 'net-sched-reflect-hw-offload-in-classifiers'David S. Miller6-7/+50
Or Gerlitz says: ==================== net/sched: Reflect HW offload status in classifiers Currently there is no way of querying whether a filter is offloaded to HW or not when using "both" policy (where none of skip_sw or skip_hw flags are set by user-space). Added two new flags, "in hw" and "not in hw" such that user space can determine if a filter is actually offloaded to hw. The "in hw" UAPI semantics was chosen so it's similar to the "skip hw" flag logic. If none of these two flags are set, this signals running over older kernel. As an example, add one vlan push + fwd rule, one matchall rule and one u32 rule without any flags, and another vlan + fwd skip_sw rule, such that the different TC classifier attempt to offload all of them -- all over mlx5 SRIOV VF rep: flower skip_sw indev eth2_0 src_mac e4:11:22:33:44:50 dst_mac e4:1d:2d:a5:f3:9d action vlan push id 52 action mirred egress redirect dev eth2 flower indev eth2_0 src_mac e4:11:22:33:44:50 dst_mac e4:11:22:33:44:51 action vlan push id 53 action mirred egress redirect dev eth2 u32 ht 800: flowid 800:1 match ip src 192.168.1.0/24 action drop Since that VF rep doesn't offload matchall/u32 and can currently offload only one vlan push rule we expect three of the rules not to be offloaded: filter protocol ip pref 99 u32 filter protocol ip pref 99 u32 fh 800: ht divisor 1 filter protocol ip pref 99 u32 fh 800::1 order 1 key ht 800 bkt 0 flowid 800:1 not in_hw match c0a80100/ffffff00 at 12 action order 1: gact action drop random type none pass val 0 index 8 ref 1 bind 1 filter protocol all pref 49150 matchall filter protocol all pref 49150 matchall handle 0x1 not in_hw action order 1: mirred (Egress Mirror to device veth1) pipe index 27 ref 1 bind 1 filter protocol ip pref 49151 flower filter protocol ip pref 49151 flower handle 0x1 indev eth2_0 dst_mac e4:11:22:33:44:51 src_mac e4:11:22:33:44:50 eth_type ipv4 not in_hw action order 1: vlan push id 53 protocol 802.1Q priority 0 pipe index 20 ref 1 bind 1 action order 2: mirred (Egress Redirect to device eth2) stolen index 26 ref 1 bind 1 filter protocol ip pref 49152 flower filter protocol ip pref 49152 flower handle 0x1 indev eth2_0 dst_mac e4:1d:2d:a5:f3:9d src_mac e4:11:22:33:44:50 eth_type ipv4 skip_sw in_hw action order 1: vlan push id 52 protocol 802.1Q priority 0 pipe index 19 ref 1 bind 1 action order 2: mirred (Egress Redirect to device eth2) stolen index 25 ref 1 bind 1 v3 --> v4 changes: - removed extra parenthesis (Dave) v2 --> v3 changes: - fixed the matchall dump flags patch to do proper checks (Jakub) - added the same proper checks to flower where they were missing - that flower patch was added as #1 and hence all the other patches are offed-by-one v1 --> v2 changes: - applied feedback from Jakub and Dave -- where none of the skip flags were set, the suggested approach didn't allow user space to distringuish between old kernel to a case when offloading to HW worked fine. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-17net/sched: cls_bpf: Reflect HW offload statusOr Gerlitz1-2/+11
BPF classifier support for the "in hw" offloading flags. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Amir Vadai <amir@vadai.me> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-17net/sched: cls_u32: Reflect HW offload statusOr Gerlitz1-0/+10
U32 support for the "in hw" offloading flags. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Amir Vadai <amir@vadai.me> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-17net/sched: cls_matchall: Reflect HW offloading statusOr Gerlitz1-2/+10
Matchall support for the "in hw" offloading flags. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Amir Vadai <amir@vadai.me> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-17net/sched: cls_flower: Reflect HW offload statusOr Gerlitz1-0/+5
Flower support for the "in hw" offloading flags. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Amir Vadai <amir@vadai.me> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-17net/sched: Reflect HW offload statusOr Gerlitz2-2/+9
Currently there is no way of querying whether a filter is offloaded to HW or not when using "both" policy (where none of skip_sw or skip_hw flags are set by user-space). Add two new flags, "in hw" and "not in hw" such that user space can determine if a filter is actually offloaded to hw or not. The "in hw" UAPI semantics was chosen so it's similar to the "skip hw" flag logic. If none of these two flags are set, this signals running over older kernel. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Amir Vadai <amir@vadai.me> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-17net/sched: cls_matchall: Dump the classifier flagsOr Gerlitz1-0/+3
The classifier flags are not dumped to user-space, do that. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Yotam Gigi <yotamg@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-17net/sched: cls_flower: Properly handle classifier flags dumpingOr Gerlitz1-1/+2
Dump the classifier flags only if non zero and make sure to check the return status of the handler that puts them into the netlink msg. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-17net: oki-semi: pch_gbe: use new api ethtool_{get|set}_link_ksettingsPhilippe Reynes1-20/+38
The ethtool api {get|set}_settings is deprecated. We move this driver to new api {get|set}_link_ksettings. As I don't have the hardware, I'd be very pleased if someone may test this patch. Signed-off-by: Philippe Reynes <tremyfr@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-17net: ethernet: ti: cpsw: correct ale dev to cpswIvan Khoronzhuk1-1/+1
The ale is a property of cpsw, so change dev to cpsw->dev, aka pdev->dev, to be consistent. Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-17net: nvidia: forcedeth: use new api ethtool_{get|set}_link_ksettingsPhilippe Reynes1-41/+50
The ethtool api {get|set}_settings is deprecated. We move this driver to new api {get|set}_link_ksettings. As I don't have the hardware, I'd be very pleased if someone may test this patch. Signed-off-by: Philippe Reynes <tremyfr@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-17Merge branch 'ptp-attribute-cleanup'David S. Miller3-116/+80
Dmitry Torokhov says: ==================== PTP attribute handling cleanup PTP core was creating some attributes, such as "period" and "fifo", and the entire "pins" attribute group, after creating class deevice, which creates a race for userspace: uevent may arrive before all attributes are created. This series of patches switches PTP to use is_visible() to control visibility of attributes in a group, and device_create_with_groups() to ensure that attributes are created before we notify userspace of a new device. v2: - added Richard's acked-by to patch #1 - removed use of kmalloc_array in favor of kcalloc in patch #2 at Richard's request - added a cover letter v1: - initial patch set ==================== Acked-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-17ptp: create "pins" together with the rest of attributesDmitry Torokhov3-42/+24
Let's switch to using device_create_with_groups(), which will allow us to create "pins" attribute group together with the rest of ptp device attributes, and before userspace gets notified about ptp device creation. Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-17ptp: use is_visible method to hide unused attributesDmitry Torokhov1-70/+55
Instead of creating selected attributes after the device is created (and after userspace potentially seen uevent), lets use attribute group is_visible() method to control which attributes are shown. This will allow us to create all attributes (except "pins" group, which will be taken care of later) before userspace gets notified about new ptp class device. Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-17ptp: use kcalloc when allocating arraysDmitry Torokhov1-3/+2
kcalloc is more semantically correct when allocating arrays of objects, and overflow-safe. Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-17ptp: do not explicitly set drvdata in ptp_clock_register()Dmitry Torokhov1-2/+0
We do not need explicitly call dev_set_drvdata(), as it is done for us by device_create(). Acked-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-17mlx4: do not fire tasklet unless necessaryEric Dumazet2-9/+6
All rx and rx netdev interrupts are handled by respectively by mlx4_en_rx_irq() and mlx4_en_tx_irq() which simply schedule a NAPI. But mlx4_eq_int() also fires a tasklet to service all items that were queued via mlx4_add_cq_to_tasklet(), but this handler was not called unless user cqe was handled. This is very confusing, as "mpstat -I SCPU ..." show huge number of tasklet invocations. This patch saves this overhead, by carefully firing the tasklet directly from mlx4_add_cq_to_tasklet(), removing four atomic operations per IRQ. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Tariq Toukan <tariqt@mellanox.com> Cc: Saeed Mahameed <saeedm@mellanox.com> Acked-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-16Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-nextDavid S. Miller23-148/+457
Steffen Klassert says: ==================== pull request (net-next): ipsec-next 2017-02-16 1) Make struct xfrm_input_afinfo const, nothing writes to it. From Florian Westphal. 2) Remove all places that write to the afinfo policy backend and make the struct const then. From Florian Westphal. 3) Prepare for packet consuming gro callbacks and add ESP GRO handlers. ESP packets can be decapsulated at the GRO layer then. It saves a round through the stack for each ESP packet. Please note that this has a merge coflict between commit 63fca65d0863 ("net: add confirm_neigh method to dst_ops") from net-next and 3d7d25a68ea5 ("xfrm: policy: remove garbage_collect callback") a2817d8b279b ("xfrm: policy: remove family field") from ipsec-next. The conflict can be solved as it is done in linux-next. Please pull or let me know if there are problems. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-16Merge branch '10GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queueDavid S. Miller5-230/+481
Jeff Kirsher says: ==================== 10GbE Intel Wired LAN Driver Updates 2017-02-16 This series contains updates to ixgbe only. Tony updates the driver to advertise 2.5Gb and 5.0Gb if the adapter supports it. Stephen Hemminger renames our dcbnl_ops since it is global to ixgbe_dcbnl_ops to avoid namespace issues. Mark updates the driver version based on the recent changes. Alex has the remainder of the changes, starting with consolidating functions that represent logical steps in the receive process so we can later update them more easily (and align with igb). Modify the receive path to only synchronize the length of the frame versus the entire buffer. Provided performance improvements by adding support for DMA_ATTR_SKIP_CPU_SYNC and DMA_ATTR_WEAK_ORDERING. Also made additional performance gains by batching the page count updates instead of doing them one at a time. Adjusted the receive path to use 3k buffers with 8k backing them in order to support build_skb with jumbo frames. Made additional driver improvements by using the length of the packet instead of the DD status to determine if a new descriptor is ready to be processed, which cuts down on reads. To reduce code duplication, pulled apart the receive path into separate functions. Added support for providing a buffer with headroom and tailroom to allow for shared info for NET_SKB_PAD and NET_IP_ALIGN. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-16Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller66-290/+594
2017-02-16cxgb4: Remove redundant code in t4_uld_clean_up()Ganesh Goudar1-2/+0
Remove variable rxq_info and also remove redundant assignment to it. Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-16cxgb4: Add new T5 and T6 pci device id'sGanesh Goudar1-0/+5
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-16cxgb4: Increase max number of tc u32 linksArjun V3-13/+7
Make max number of supported tc u32 links equal to max number of filters supported by hardware. Signed-off-by: Arjun V <arjun@chelsio.com> Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com> Signed-off-by: Casey Leedom <leedom@chelsio.com> Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-16Merge tag 'media/v4.10-5' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-mediaLinus Torvalds1-5/+13
Pull media fix from Mauro Carvalho Chehab: "A regression fix that makes the Siano driver to work again after the CONFIG_VMAP_STACK change" * tag 'media/v4.10-5' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: [media] siano: make it work again with CONFIG_VMAP_STACK
2017-02-16vfs: fix uninitialized flags in splice_to_pipe()Miklos Szeredi1-0/+1
Flags (PIPE_BUF_FLAG_PACKET, PIPE_BUF_FLAG_GIFT) could remain on the unused part of the pipe ring buffer. Previously splice_to_pipe() left the flags value alone, which could result in incorrect behavior. Uninitialized flags appears to have been there from the introduction of the splice syscall. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Cc: <stable@vger.kernel.org> # 2.6.17+ Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-16Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuseLinus Torvalds1-0/+5
Pull fuse fixes from Miklos Szeredi: "Fix a use after free bug introduced in 4.2 and using an uninitialized value introduced in 4.9" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: fuse: fix uninitialized flags in pipe_buffer fuse: fix use after free issue in fuse_dev_do_read()
2017-02-16Merge tag 'pci-v4.10-fixes-4' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pciLinus Torvalds1-0/+12
Pull PCI fix from Bjorn Helgaas: "Add back pcie_pme_remove() so we free the IRQ when removing PCIe port devices; previously the leaked IRQ caused an MSI BUG_ON" * tag 'pci-v4.10-fixes-4' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: PCI/PME: Restore pcie_pme_driver.remove
2017-02-16Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds33-200/+355
Pull networking fixes from David Miller: 1) In order to avoid problems in the future, make cgroup bpf overriding explicit using BPF_F_ALLOW_OVERRIDE. From Alexei Staovoitov. 2) LLC sets skb->sk without proper skb->destructor and this explodes, fix from Eric Dumazet. 3) Make sure when we have an ipv4 mapped source address, the destination is either also an ipv4 mapped address or ipv6_addr_any(). Fix from Jonathan T. Leighton. 4) Avoid packet loss in fec driver by programming the multicast filter more intelligently. From Rui Sousa. 5) Handle multiple threads invoking fanout_add(), fix from Eric Dumazet. 6) Since we can invoke the TCP input path in process context, without BH being disabled, we have to accomodate that in the locking of the TCP probe. Also from Eric Dumazet. 7) Fix erroneous emission of NETEVENT_DELAY_PROBE_TIME_UPDATE when we aren't even updating that sysctl value. From Marcus Huewe. 8) Fix endian bugs in ibmvnic driver, from Thomas Falcon. [ This is the second version of the pull that reverts the nested rhashtable changes that looked a bit too scary for this late in the release - Linus ] * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (27 commits) rhashtable: Revert nested table changes. ibmvnic: Fix endian errors in error reporting output ibmvnic: Fix endian error when requesting device capabilities net: neigh: Fix netevent NETEVENT_DELAY_PROBE_TIME_UPDATE notification net: xilinx_emaclite: fix freezes due to unordered I/O net: xilinx_emaclite: fix receive buffer overflow bpf: kernel header files need to be copied into the tools directory tcp: tcp_probe: use spin_lock_bh() uapi: fix linux/if_pppol2tp.h userspace compilation errors packet: fix races in fanout_add() ibmvnic: Fix initial MTU settings net: ethernet: ti: cpsw: fix cpsw assignment in resume kcm: fix a null pointer dereference in kcm_sendmsg() net: fec: fix multicast filtering hardware setup ipv6: Handle IPv4-mapped src to in6addr_any dst. ipv6: Inhibit IPv4-mapped src address on the wire. net/mlx5e: Disable preemption when doing TC statistics upcall rhashtable: Add nested tables tipc: Fix tipc_sk_reinit race conditions gfs2: Use rhashtable walk interface in glock_hash_walk ...
2017-02-16fuse: fix uninitialized flags in pipe_bufferMiklos Szeredi1-0/+1
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Fixes: d82718e348fe ("fuse_dev_splice_read(): switch to add_to_pipe()") Cc: <stable@vger.kernel.org> # 4.9+
2017-02-16ixgbe: Don't bother clearing buffer memory for descriptor ringsAlexander Duyck2-71/+99
This patch makes it so that we don't need to bother with clearing the memory out for the descriptor rings. The general idea is to only free buffers associated with buffers in use which are located between the next_to_clean and next_to_use or next_to_alloc values. Everything outside of those regions can be safely ignored since they should have no buffers associated with them. The advantage to doing things this way is that is should speed up bring-up and tear-down of the rings. Specifically we can avoid the 512 or more cycles required to memset the rings in tear-down. In the bring-up phase we then clear the memory as a part of initialization. The general idea is that the clearing in initialization can act as a prefetch of sorts for the buffer info structures so they are in the local CPU when we go to populate them. This should help to improve overall time needed to perform a suspend/resume. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-02-16ixgbe: Add support for build_skbAlexander Duyck1-1/+48
This patch adds build_skb support to the Rx path. There are several advantages to this change. 1. It avoids the memcpy and skb->head allocation for small packets which improves performance by about 5% in my tests. 2. It avoids the memcpy, skb->head allocation, and eth_get_headlen for larger packets improving performance by about 10% in my tests. 3. For VXLAN packets it allows the full header to be in skb->data which improves the performance by as much as 30% in some of my tests. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-02-16ixgbe: Add private flag to control buffer modeAlexander Duyck1-0/+47
Since there are potential drawbacks to the new Rx allocation approach I thought it best to add a "chicken bit" so that we can turn the feature off if in the event that a problem is found. It also provides a means of validating the legacy Rx path in the event that we are forced to fall back. At some point in the future when we are convinced we don't need it anymore we might be able to drop the legacy-rx flag. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-02-16ixgbe: Add support for padding packetAlexander Duyck2-4/+56
This patch adds support for providing a buffer with headroom and tailroom to allow for shared info, NET_SKB_PAD, and NET_IP_ALIGN. With this combined with the DMA changes we can start using build_skb to build frames around an incoming Rx buffer instead of having to memcpy the headers. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-02-16ixgbe: Break out Rx buffer page managementAlexander Duyck1-109/+122
We are going to be expanding the number of Rx paths in the driver. Instead of duplicating all that code I am pulling it apart into separate functions so that we don't have so much code duplication. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-02-16ixgbe: Use length to determine if descriptor is doneAlexander Duyck2-7/+9
This change makes it so that we use the length of the packet instead of the DD status bit to determine if a new descriptor is ready to be processed. The obvious advantage is that it cuts down on reads as we don't really even need the DD bit if going from a 0 to a non-zero value on size is enough to inform us that the packet has been completed. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>