aboutsummaryrefslogtreecommitdiffstats
AgeCommit message (Collapse)AuthorFilesLines
2016-07-25drivers: net: phy: xgene: Add MDIO driverIyappan Subramanian4-0/+627
Currently, SGMII based 1G rely on the hardware registers for link state and sometimes it's not reliable. To get most accurate link state, this interface has to use the MDIO bus to poll the PHY. In X-Gene SoC, MDIO bus is shared across RGMII and SGMII based 1G interfaces, so adding this driver to manage MDIO bus. This driver registers the mdio bus and registers the PHYs connected to it. Signed-off-by: Iyappan Subramanian <isubramanian@apm.com> Tested-by: Fushen Chen <fchen@apm.com> Tested-by: Toan Le <toanle@apm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-25drivers: net: xgene: Fix module unload crash - clkrst sequenceIyappan Subramanian4-18/+86
This patch fixes clock reset sequence. - Added clock reset sequence for ACPI - Added delay in clock reset sequence to make sure pulse is generated - Added clk_unprepare_disable() in port shutdown to make sure clock increment/decrement counts are matching - Removed MII_MGMT_CONFIG programming, since it is not required - Fixed programming XGENET_CONFIG_REG to enable SGMII mode Signed-off-by: Iyappan Subramanian <isubramanian@apm.com> Tested-by: Fushen Chen <fchen@apm.com> Tested-by: Toan Le <toanle@apm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-25drivers: net: xgene: Fix module unload crash - change sw sequenceIyappan Subramanian2-49/+73
When the driver is configured as kernel module and when it gets unloaded and reloaded, kernel crash was observed. This patch addresses the software cleanup by doing the following, - Moved register_netdev call after hardware is ready - Since ndev is not ready, added set_irq_name to set irq name - Since ndev is not ready, changed mdio_bus->parent to pdev->dev - Replaced netif_start(stop)_queue by netif_tx_start(stop)_queues - Removed napi_del call since it's called by free_netdev - Added dev_close call, within remove - Added shutdown callback - Changed to use dmam_ APIs Signed-off-by: Iyappan Subramanian <isubramanian@apm.com> Tested-by: Fushen Chen <fchen@apm.com> Tested-by: Toan Le <toanle@apm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-25drivers: net: xgene: Fix module unload crash - hw resource cleanupIyappan Subramanian6-29/+149
When the driver is configured as kernel module and when it gets unloaded and reloaded, kernel crash was observed. This patch address the hardware resource cleanups by doing the following, - Added mac_ops->clear() to do prefetch buffer clean up - Fixed delete freepool buffers logic - Reordered mac_enable and mac_disable - Added Tx completion ring free - Moved down delete_desc_rings after ring cleanup Signed-off-by: Iyappan Subramanian <isubramanian@apm.com> Tested-by: Fushen Chen <fchen@apm.com> Tested-by: Toan Le <toanle@apm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-25drivers: net: xgene: Separate set_speed from mac_initIyappan Subramanian7-33/+151
Since mac_init is too heavy to be called when the link changes, moved the speed_set configuration to a new function and added mac_ops->set_speed function pointer. This function will be called from adjust_link callback. Added cases for 10/100 support for SGMII based 1G interface. Signed-off-by: Iyappan Subramanian <isubramanian@apm.com> Tested-by: Fushen Chen <fchen@apm.com> Tested-by: Toan Le <toanle@apm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-25Merge branch 'refactor-tc_action-structs'David S. Miller27-417/+378
Cong Wang says: ==================== net_sched: refactor tc action structures These two patches factor out the struct tcf_common. v2: fix a compile warning ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-25net_sched: get rid of struct tcf_commonWANG Cong15-122/+114
After the previous patch, struct tc_action should be enough to represent the generic tc action, tcf_common is not necessary any more. This patch gets rid of it to make tc action code more readable. Cc: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-25net_sched: move tc_action into tcf_commonWANG Cong27-329/+298
struct tc_action is confusing, currently we use it for two purposes: 1) Pass in arguments and carry out results from helper functions 2) A generic representation for tc actions The first one is error-prone, since we need to make sure we don't miss anything. This patch aims to get rid of this use, by moving tc_action into tcf_common, so that they are allocated together in hashtable and can be cast'ed easily. And together with the following patch, we could really make tc_action a generic representation for all tc actions and each type of action can inherit from it. Cc: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-25ipvlan: Scrub skb before crossing the namespace boundryMahesh Bandewar1-14/+25
The earlier patch c3aaa06d5a63 (ipvlan: scrub skb before routing in L3 mode.) did this but only for TX path in L3 mode. This patch extends it for both the modes for TX/RX path. Signed-off-by: Mahesh Bandewar <maheshb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-25Merge branch 'bnxt_en-improve-ntuple-and-new-IDs'David S. Miller2-11/+63
Michael Chan says: ==================== bnxt_en: Improve ntuple filters and add new IDs. Improve ntuple filters and add some new PCI device IDs. Please review for net-next. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-25bnxt_en: Add new NPAR and dual media device IDs.Michael Chan1-6/+33
Add 5741X/5731X NPAR device IDs and dual media SFP/10GBase-T device IDs. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-25bnxt_en: Log a message, if enabling NTUPLE filtering fails.Vasundhara Volam1-2/+6
If there are not enough resources to enable ntuple filtering, log a warning message. v2: Use single message and add missing newline. Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-25bnxt_en: Improve ntuple filters by checking destination MAC address.Michael Chan2-3/+24
Include the destination MAC address in the ntuple filter structure. The current code assumes that the destination MAC address is always the MAC address of the NIC. This may not be true if there are macvlans, for example. Add destination MAC address checking and configure the filter correctly using the correct index for the destination MAC address. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-25qed: Fix setting/clearing bit in completion bitmapManish Chopra1-4/+3
Slowpath completion handling is incorrectly changing SPQ_RING_SIZE bits instead of a single one. Fixes: 76a9a3642a0b ("qed: fix handling of concurrent ramrods") Signed-off-by: Manish Chopra <manish.chopra@qlogic.com> Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-25udp: use sk_filter_trim_cap for udp{,6}_queue_rcv_skbDaniel Borkmann2-6/+2
After a612769774a3 ("udp: prevent bugcheck if filter truncates packet too much"), there followed various other fixes for similar cases such as f4979fcea7fd ("rose: limit sk_filter trim to payload"). Latter introduced a new helper sk_filter_trim_cap(), where we can pass the trim limit directly to the socket filter handling. Make use of it here as well with sizeof(struct udphdr) as lower cap limit and drop the extra skb->len test in UDP's input path. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Willem de Bruijn <willemb@google.com> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-25caif-hsi: Remove deprecated create_singlethread_workqueueBhaktipriya Shridhar1-4/+1
alloc_workqueue replaces deprecated create_singlethread_workqueue(). A dedicated workqueue has been used since the workitems are being used on a packet tx/rx path. Hence, WQ_MEM_RECLAIM has been set to guarantee forward progress under memory pressure. An ordered workqueue has been used since workitems &cfhsi->wake_up_work and &cfhsi->wake_down_work cannot be run concurrently. Calls to flush_workqueue() before destroy_workqueue() have been dropped since destroy_workqueue() itself calls drain_workqueue() which flushes repeatedly till the workqueue becomes empty. Signed-off-by: Bhaktipriya Shridhar <bhaktipriya96@gmail.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-25Merge branch 'bpf-probe-write-user'David S. Miller6-0/+191
Sargun Dhillon says: ==================== bpf: add bpf_probe_write_user helper & example This patch series contains two patches that add support for a probe_write helper to BPF programs. This allows them to manipulate user memory during the course of tracing. The second patch in the series has an example that uses it, in one the intended ways to divert execution. Thanks to Alexei Starovoitov, and Daniel Borkmann for being patient, review, and helping me get familiar with the code base. I've made changes based on their recommendations. This helper should be considered for experimental usage and debugging, so we print a warning to dmesg when it is along with the command and pid when someone tries to install a proglet that uses it. A follow-up patchset will contain a mechanism to verify the safety of the probe beyond what was done by hand. ---- v1->v2: restrict writing to user space, as opposed to globally v2->v3: Fixed formatting issues v3->v4: Rename copy_to_user -> bpf_probe_write Simplify checking of whether or not it's safe to write Add warnings to dmesg v4->v5: Raise warning level Cleanup location of warning code Make test fail when helper is broken v5->v6: General formatting cleanup Rename bpf_probe_write -> bpf_probe_write_user v6->v7: More formatting cleanup. Clarifying a few comments Clarified log message ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-25samples/bpf: Add test/example of using bpf_probe_write_user bpf helperSargun Dhillon3-0/+134
This example shows using a kprobe to act as a dnat mechanism to divert traffic for arbitrary endpoints. It rewrite the arguments to a syscall while they're still in userspace, and before the syscall has a chance to copy the argument into kernel space. Although this is an example, it also acts as a test because the mapped address is 255.255.255.255:555 -> real address, and that's not a legal address to connect to. If the helper is broken, the example will fail on the intermediate steps, as well as the final step to verify the rewrite of userspace memory succeeded. Signed-off-by: Sargun Dhillon <sargun@sargun.me> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-25bpf: Add bpf_probe_write_user BPF helper to be called in tracersSargun Dhillon3-0/+57
This allows user memory to be written to during the course of a kprobe. It shouldn't be used to implement any kind of security mechanism because of TOC-TOU attacks, but rather to debug, divert, and manipulate execution of semi-cooperative processes. Although it uses probe_kernel_write, we limit the address space the probe can write into by checking the space with access_ok. We do this as opposed to calling copy_to_user directly, in order to avoid sleeping. In addition we ensure the threads's current fs / segment is USER_DS and the thread isn't exiting nor a kernel thread. Given this feature is meant for experiments, and it has a risk of crashing the system, and running programs, we print a warning on when a proglet that attempts to use this helper is installed, along with the pid and process name. Signed-off-by: Sargun Dhillon <sargun@sargun.me> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-25net/mlx4_core: Check device state before unregistering itAlex Vesker1-0/+3
Verify that the device state is registered before un-registering it. This check is required to prevent an OOPS on flows that do re-registration of the device and its previous state was unregistered. Fixes: 225c7b1feef1 ("IB/mlx4: Add a driver Mellanox ConnectX InfiniBand adapters") Signed-off-by: Alex Vesker <valex@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-25mlxsw: spectrum: Fix compilation error when CLS_ACT isn't setIdo Schimmel2-6/+9
When CONFIG_NET_CLS_ACT isn't set 'struct tcf_exts' has no member named 'actions' and we therefore must not access it. Otherwise compilation fails. Fix this by introducing a new macro similar to tc_no_actions(), which always returns 'false' if CONFIG_NET_CLS_ACT isn't set. Fixes: 763b4b70afcd ("mlxsw: spectrum: Add support in matchall mirror TC offloading") Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-25net: davinci_cpdma: remove excessive dump of register values to kernel logUwe Kleine-König3-219/+0
Such a big dump of register values is hardly useful on a production system. Another downside of the now removed functions is that calling emac_dump_regs resulted in at least 87 calls to dev_info while holding a spinlock and having irqs off which is a big source of latency. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-25gtp: #define #define _GTP_H_ and not #define _GTP_HColin Ian King1-1/+1
Fix clang build warning: ./include/net/gtp.h:1:9: warning: '_GTP_H_' is used as a header guard here, followed by #define of a different macro [-Wheader-guard] fix by defining _GTP_H_ and not _GTP_H Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-25Merge branch 'mlx5-minimum-inline-header-mode'David S. Miller7-7/+105
Saeed Mahameed says: ==================== Mellanox 100G mlx5 minimum inline header mode This small series from Hadar adds the support for minimum inline header mode query in mlx5e NIC driver. Today on TX the driver copies to the HW descriptor only up to L2 header which is the default required mode and sufficient for today's needs. The header in the HW descriptor is used for HW loopback steering decision, without it packets will go directly to the wire with no questions asked. For TX loopback steering according to L2/L3/L4 headers, ConnectX-4 requires to copy the corresponding headers into the send queue(SQ) WQE HW descriptor so it can decide whether to loop it back or to forward to wire. For legacy E-Switch mode only L2 headers copy is required. For advanced steering (E-Switch offloads) more header layers may be required to be copied, the required mode will be advertised by FW to each VF and PF according to the corresponding E-Switch configuration. Changes V2: - Allocate query_nic_vport_context_out on the stack ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-25net/mlx5e: Query minimum required header copy during xmitHadar Hen Zion5-3/+52
Add support for query the minimum inline mode from the Firmware. It is required for correct TX steering according to L3/L4 packet headers. Each send queue (SQ) has inline mode that defines the minimal required headers that needs to be copied into the SQ WQE. The driver asks the Firmware for the wqe_inline_mode device capability value. In case the device capability defined as "vport context" the driver must check the reported min inline mode from the vport context before creating its SQs. Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-25net/mlx5e: Check the minimum inline header mode before xmitHadar Hen Zion3-4/+53
Each send queue (SQ) has inline mode that defines the minimal required inline headers in the SQ WQE. Before sending each packet check that the minimum required headers on the WQE are copied. Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-25net/sctp: terminate rhashtable walk correctlyVegard Nossum1-0/+1
I was seeing a lot of these: BUG: sleeping function called from invalid context at mm/slab.h:388 in_atomic(): 0, irqs_disabled(): 0, pid: 14971, name: trinity-c2 Preemption disabled at:[<ffffffff819bcd46>] rhashtable_walk_start+0x46/0x150 [<ffffffff81149abb>] preempt_count_add+0x1fb/0x280 [<ffffffff83295722>] _raw_spin_lock+0x12/0x40 [<ffffffff811aac87>] console_unlock+0x2f7/0x930 [<ffffffff811ab5bb>] vprintk_emit+0x2fb/0x520 [<ffffffff811aba6a>] vprintk_default+0x1a/0x20 [<ffffffff812c171a>] printk+0x94/0xb0 [<ffffffff811d6ed0>] print_stack_trace+0xe0/0x170 [<ffffffff8115835e>] ___might_sleep+0x3be/0x460 [<ffffffff81158490>] __might_sleep+0x90/0x1a0 [<ffffffff8139b823>] kmem_cache_alloc+0x153/0x1e0 [<ffffffff819bca1e>] rhashtable_walk_init+0xfe/0x2d0 [<ffffffff82ec64de>] sctp_transport_walk_start+0x1e/0x60 [<ffffffff82edd8ad>] sctp_transport_seq_start+0x4d/0x150 [<ffffffff8143a82b>] seq_read+0x27b/0x1180 [<ffffffff814f97fc>] proc_reg_read+0xbc/0x180 [<ffffffff813d471b>] __vfs_read+0xdb/0x610 [<ffffffff813d4d3a>] vfs_read+0xea/0x2d0 [<ffffffff813d615b>] SyS_pread64+0x11b/0x150 [<ffffffff8100334c>] do_syscall_64+0x19c/0x410 [<ffffffff832960a5>] return_from_SYSCALL_64+0x0/0x6a [<ffffffffffffffff>] 0xffffffffffffffff Apparently we always need to call rhashtable_walk_stop(), even when rhashtable_walk_start() fails: * rhashtable_walk_start - Start a hash table walk * @iter: Hash table iterator * * Start a hash table walk. Note that we take the RCU lock in all * cases including when we return an error. So you must always call * rhashtable_walk_stop to clean up. otherwise we never call rcu_read_unlock() and we get the splat above. Fixes: 53fa1036 ("sctp: fix some rhashtable functions using in sctp proc/diag") See-also: 53fa1036 ("sctp: fix some rhashtable functions using in sctp proc/diag") See-also: f2dba9c6 ("rhashtable: Introduce rhashtable_walk_*") Cc: Xin Long <lucien.xin@gmail.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: stable@vger.kernel.org Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-25Merge branch '10GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queueDavid S. Miller12-107/+160
Jeff Kirsher says: ==================== 10GbE Intel Wired LAN Driver Updates 2016-07-22 This series contains updates to ixgbe and ixgbevf only. Emil fixes the NACK check in ixgbevf_set_uc_addr_vf() for instances where the index is not equal to zero. Fixes an issue where mac->ops.setup_fc can be NULL for backplanes which can cause the driver to crash on load. Don fixes the second parameter of the LED functions, which is the index to the LED we are interested in affecting. Fixed variable to store register reads to unsigned integer. Adds support for the new x553 hardware into ixgbevf. Fixed a missing rtnl lock around ixgbevf_reinit_locked(). Fixed an issue where in ixgbevf_reset_subtask() was not verifying that the port has been removed. Cleans up the initial crosstalk fix, since the SFP that indicates the presence of a SFP+ module changes between hardware types. Babu Moger fixes typo in freeing IRQ, since the array subscript increments after the execution of the statement. Wei Yongjun adds the missing destroy_workqueue() before returning from ixgbe_init_module() in the error handling case. Tony adds range checking for setting the MTU from the VF, where the PF can return a NACK but this was not passed on to the VF, so propagate the results from the PF to the VF so errors can be reported. Consolidates mailbox read and write functions, since the recent changes to ixgbevf_write_msg_read_ack(), other functions are performing the same operations done here. Colin Ian King removes a redundant check on ret_val, since ret_val has not changed since the previous check. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-25net/irda: fix NULL pointer dereference on memory allocation failureVegard Nossum1-2/+5
I ran into this: kasan: CONFIG_KASAN_INLINE enabled kasan: GPF could be caused by NULL-ptr deref or user memory access general protection fault: 0000 [#1] PREEMPT SMP KASAN CPU: 2 PID: 2012 Comm: trinity-c3 Not tainted 4.7.0-rc7+ #19 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014 task: ffff8800b745f2c0 ti: ffff880111740000 task.ti: ffff880111740000 RIP: 0010:[<ffffffff82bbf066>] [<ffffffff82bbf066>] irttp_connect_request+0x36/0x710 RSP: 0018:ffff880111747bb8 EFLAGS: 00010286 RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 0000000069dd8358 RDX: 0000000000000009 RSI: 0000000000000027 RDI: 0000000000000048 RBP: ffff880111747c00 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000069dd8358 R11: 1ffffffff0759723 R12: 0000000000000000 R13: ffff88011a7e4780 R14: 0000000000000027 R15: 0000000000000000 FS: 00007fc738404700(0000) GS:ffff88011af00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fc737fdfb10 CR3: 0000000118087000 CR4: 00000000000006e0 Stack: 0000000000000200 ffff880111747bd8 ffffffff810ee611 ffff880119f1f220 ffff880119f1f4f8 ffff880119f1f4f0 ffff88011a7e4780 ffff880119f1f232 ffff880119f1f220 ffff880111747d58 ffffffff82bca542 0000000000000000 Call Trace: [<ffffffff82bca542>] irda_connect+0x562/0x1190 [<ffffffff825ae582>] SYSC_connect+0x202/0x2a0 [<ffffffff825b4489>] SyS_connect+0x9/0x10 [<ffffffff8100334c>] do_syscall_64+0x19c/0x410 [<ffffffff83295ca5>] entry_SYSCALL64_slow_path+0x25/0x25 Code: 41 89 ca 48 89 e5 41 57 41 56 41 55 41 54 41 89 d7 53 48 89 fb 48 83 c7 48 48 89 fa 41 89 f6 48 c1 ea 03 48 83 ec 20 4c 8b 65 10 <0f> b6 04 02 84 c0 74 08 84 c0 0f 8e 4c 04 00 00 80 7b 48 00 74 RIP [<ffffffff82bbf066>] irttp_connect_request+0x36/0x710 RSP <ffff880111747bb8> ---[ end trace 4cda2588bc055b30 ]--- The problem is that irda_open_tsap() can fail and leave self->tsap = NULL, and then irttp_connect_request() almost immediately dereferences it. Cc: stable@vger.kernel.org Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-25sctp: also point GSO head_skb to the sk when it's availableMarcelo Ricardo Leitner1-0/+3
The head skb for GSO packets won't travel through the inner depths of SCTP stack as it doesn't contain any chunks on it. That means skb->sk doesn't get set and then when sctp_recvmsg() calls sctp_inet6_skb_msgname() on the head_skb it panics, as this last needs to check flags at the socket (sp->v4mapped). The fix is to initialize skb->sk for th head skb once we are able to do it. That is, when the first chunk is processed. Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-25sctp: fix BH handling on socket backlogMarcelo Ricardo Leitner2-2/+2
Now that the backlog processing is called with BH enabled, we have to disable BH before taking the socket lock via bh_lock_sock() otherwise it may dead lock: sctp_backlog_rcv() bh_lock_sock(sk); if (sock_owned_by_user(sk)) { if (sk_add_backlog(sk, skb, sk->sk_rcvbuf)) sctp_chunk_free(chunk); else backloged = 1; } else sctp_inq_push(inqueue, chunk); bh_unlock_sock(sk); while sctp_inq_push() was disabling/enabling BH, but enabling BH triggers pending softirq, which then may try to re-lock the socket in sctp_rcv(). [ 219.187215] <IRQ> [ 219.187217] [<ffffffff817ca3e0>] _raw_spin_lock+0x20/0x30 [ 219.187223] [<ffffffffa041888c>] sctp_rcv+0x48c/0xba0 [sctp] [ 219.187225] [<ffffffff816e7db2>] ? nf_iterate+0x62/0x80 [ 219.187226] [<ffffffff816f1b14>] ip_local_deliver_finish+0x94/0x1e0 [ 219.187228] [<ffffffff816f1e1f>] ip_local_deliver+0x6f/0xf0 [ 219.187229] [<ffffffff816f1a80>] ? ip_rcv_finish+0x3b0/0x3b0 [ 219.187230] [<ffffffff816f17a8>] ip_rcv_finish+0xd8/0x3b0 [ 219.187232] [<ffffffff816f2122>] ip_rcv+0x282/0x3a0 [ 219.187233] [<ffffffff810d8bb6>] ? update_curr+0x66/0x180 [ 219.187235] [<ffffffff816abac4>] __netif_receive_skb_core+0x524/0xa90 [ 219.187236] [<ffffffff810d8e00>] ? update_cfs_shares+0x30/0xf0 [ 219.187237] [<ffffffff810d557c>] ? __enqueue_entity+0x6c/0x70 [ 219.187239] [<ffffffff810dc454>] ? enqueue_entity+0x204/0xdf0 [ 219.187240] [<ffffffff816ac048>] __netif_receive_skb+0x18/0x60 [ 219.187242] [<ffffffff816ad1ce>] process_backlog+0x9e/0x140 [ 219.187243] [<ffffffff816ac8ec>] net_rx_action+0x22c/0x370 [ 219.187245] [<ffffffff817cd352>] __do_softirq+0x112/0x2e7 [ 219.187247] [<ffffffff817cc3bc>] do_softirq_own_stack+0x1c/0x30 [ 219.187247] <EOI> [ 219.187248] [<ffffffff810aa1c8>] do_softirq.part.14+0x38/0x40 [ 219.187249] [<ffffffff810aa24d>] __local_bh_enable_ip+0x7d/0x80 [ 219.187254] [<ffffffffa0408428>] sctp_inq_push+0x68/0x80 [sctp] [ 219.187258] [<ffffffffa04190f1>] sctp_backlog_rcv+0x151/0x1c0 [sctp] [ 219.187260] [<ffffffff81692b07>] __release_sock+0x87/0xf0 [ 219.187261] [<ffffffff81692ba0>] release_sock+0x30/0xa0 [ 219.187265] [<ffffffffa040e46d>] sctp_accept+0x17d/0x210 [sctp] [ 219.187266] [<ffffffff810e7510>] ? prepare_to_wait_event+0xf0/0xf0 [ 219.187268] [<ffffffff8172d52c>] inet_accept+0x3c/0x130 [ 219.187269] [<ffffffff8168d7a3>] SYSC_accept4+0x103/0x210 [ 219.187271] [<ffffffff817ca2ba>] ? _raw_spin_unlock_bh+0x1a/0x20 [ 219.187272] [<ffffffff81692bfc>] ? release_sock+0x8c/0xa0 [ 219.187276] [<ffffffffa0413e22>] ? sctp_inet_listen+0x62/0x1b0 [sctp] [ 219.187277] [<ffffffff8168f2d0>] SyS_accept+0x10/0x20 Fixes: 860fbbc343bf ("sctp: prepare for socket backlog behavior change") Cc: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-25hv_netvsc: Fix VF register on bonding devicesHaiyang Zhang1-2/+2
Added a condition to avoid bonding devices with same MAC registering as VF. Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Reviewed-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-25kcm: remove redundant -ve error check and return pathColin Ian King1-5/+1
The check for a -ve error is redundant, remove it and just immediately return the return value from the call to seq_open_net. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-25net: ipv6: Always leave anycast and multicast groups on link downMike Manning1-0/+4
Default kernel behavior is to delete IPv6 addresses on link down, which entails deletion of the multicast and the subnet-router anycast addresses. These deletions do not happen with sysctl setting to keep global IPv6 addresses on link down, so every link down/up causes an increment of the anycast and multicast refcounts. These bogus refcounts may stop these addrs from being removed on subsequent calls to delete them. The solution is to leave the groups for the multicast and subnet anycast on link down for the callflow when global IPv6 addresses are kept. Fixes: f1705ec197e7 ("net: ipv6: Make address flushing on ifdown optional") Signed-off-by: Mike Manning <mmanning@brocade.com> Acked-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-25Merge tag 'wireless-drivers-next-for-davem-2016-07-22' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-nextDavid S. Miller78-644/+1178
Kalle Valo says: ==================== pull-request: wireless-drivers-next 2016-07-22 I'm sick so I have to keep this short, but here's the last pull request to net-next. This time there's a trivial conflict with mtd tree: http://lkml.kernel.org/g/20160720123133.44dab209@canb.auug.org.au We concluded with Brian (CCed) that it's best that we ask Linus to fix this. The patches have been in linux-next for a couple of days. This time I haven't done any merge tests so I don't know if there are any other conflicts etc. Please let me know if there are any problems. wireless-drivers-next patches for 4.8 Major changes: wl18xx * add initial mesh support bcma * serial flash support on non-MIPS SoCs ath10k * enable support for QCA9888 * disable wake_tx_queue() mac80211 op for older devices to workaround throughput regression ath9k * implement temperature compensation support for AR9003+ ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-25libcxgb: remove unused including <linux/version.h>Wei Yongjun1-1/+0
Remove including <linux/version.h> that don't need it. Signed-off-by: Wei Yongjun <weiyj.lk@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-25sctp: use inet_recvmsg to support sctp RFS wellXin Long2-2/+2
Commit 486bdee0134c ("sctp: add support for RPS and RFS") saves skb->hash into sk->sk_rxhash so that the inet_* can record it to flow table. But sctp uses sock_common_recvmsg as .recvmsg instead of inet_recvmsg, sock_common_recvmsg doesn't invoke sock_rps_record_flow to record the flow. It may cause that the receiver has no chances to record the flow if it doesn't send msg or poll the socket. So this patch fixes it by using inet_recvmsg as .recvmsg in sctp. Fixes: 486bdee0134c ("sctp: add support for RPS and RFS") Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-25Merge branch 'macsec-icv-fixes'David S. Miller2-25/+53
Davide Caratti says: ==================== macsec: fix configurable ICV length This series provides a fix for macsec configurable ICV length. The maximum length of ICV element has been made compliant to IEEE 802.1AE, and error reporting in case of cipher suite configuration failure has been improved. Finally, a test has been added to netlink verify() callback in order to avoid creation of macsec interfaces having user-provided ICV length values that are not supported by the cipher suite. ==================== Signed-off-by: Davide Caratti <dcaratti@redhat.com> Acked-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-25macsec: validate ICV length on link creationDavide Caratti1-1/+13
Test the cipher suite initialization in case ICV length has a value different than its default. If this test fails, creation of a new macsec link will also fail. This avoids situations where further security associations can't be added due to failures of crypto_aead_setauthsize(), caused by unsupported user-provided values of the ICV length. Signed-off-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-25macsec: fix error codes when a SA is createdDavide Caratti1-22/+36
preserve the return value of AEAD functions that are called when a SA is created, to avoid inappropriate display of "RTNETLINK answers: Cannot allocate memory" message. Signed-off-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-25macsec: limit ICV length to 16 octetsDavide Caratti2-2/+4
IEEE 802.1AE-2006 standard recommends that the ICV element in a MACsec frame should not exceed 16 octets: add MACSEC_STD_ICV_LEN in uapi definitions accordingly, and avoid accepting configurations where the ICV length exceeds the standard value. Leave definition of MACSEC_MAX_ICV_LEN unchanged for backwards compatibility with userspace programs. Fixes: dece8d2b78d1 ("uapi: add MACsec bits") Signed-off-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-25bridge: Fix incorrect re-injection of LLDP packetsIdo Schimmel1-0/+8
Commit 8626c56c8279 ("bridge: fix potential use-after-free when hook returns QUEUE or STOLEN verdict") caused LLDP packets arriving through a bridge port to be re-injected to the Rx path with skb->dev set to the bridge device, but this breaks the lldpad daemon. The lldpad daemon opens a packet socket with protocol set to ETH_P_LLDP for any valid device on the system, which doesn't not include soft devices such as bridge and VLAN. Since packet sockets (ptype_base) are processed in the Rx path after the Rx handler, LLDP packets with skb->dev set to the bridge device never reach the lldpad daemon. Fix this by making the bridge's Rx handler re-inject LLDP packets with RX_HANDLER_PASS, which effectively restores the behaviour prior to the mentioned commit. This means netfilter will never receive LLDP packets coming through a bridge port, as I don't see a way in which we can have okfn() consume the packet without breaking existing behaviour. I've already carried out a similar fix for STP packets in commit 56fae404fb2c ("bridge: Fix incorrect re-injection of STP packets"). Fixes: 8626c56c8279 ("bridge: fix potential use-after-free when hook returns QUEUE or STOLEN verdict") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Cc: Florian Westphal <fw@strlen.de> Cc: John Fastabend <john.fastabend@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-25sctp: support ipv6 nonlocal bindXin Long1-1/+3
This patch makes sctp support ipv6 nonlocal bind by adding sp->inet.freebind and net->ipv6.sysctl.ip_nonlocal_bind check in sctp_v6_available as what sctp did to support ipv4 nonlocal bind (commit cdac4e077489). Reported-by: Shijoe George <spanjikk@redhat.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-25Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queueDavid S. Miller12-352/+350
Conflicts: drivers/net/ethernet/intel/i40e/i40e_main.c Jeff Kirsher says: ==================== 40GbE Intel Wired LAN Driver Updates 2016-07-22 This series contains updates to i40e and i40evf. Heinrich Schuchardt found a possible null pointer being dereferenced in i40e_debug_aq(), fixed the issue by doing the variable assignment after we are sure the pointer is not null. Avinash fixed an issue when link was down, we were not showing the correct advertised link modes. Mitch cleans up a useless initializer since the variable is assigned right away. Refactors the receive filter handling to properly track filter adds and deletes so the driver will not lose filters during a reset and up/down cycles. Also added a tracking mechanism so that the driver knows when to enter and leave promiscuous mode. Catherine removes a device id which is not needed (or used). Moves a mutex lock since we need to lock the client list around the i40e_client_release() call to prevent the release from interrupting the client instances while they are being added. Joshua adds Hyper-V specific VF device ids. Amitoj Kaur Chawla cleans up a redundant memset() call before a memcpy(). Stefan Assmann adds the missing link advertise for some x710 NICs. Tushar Dave fixes and issue found on SPARC, where a PF reset clears MAC filters and if a platform-specific MAC address is used, the driver has to explicitly write default MAC address to MAC filters otherwise all incoming traffic destined to the default MAC address will be dropped after reset. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-25bpf, events: fix offset in skb copy handlerDaniel Borkmann4-9/+14
This patch fixes the __output_custom() routine we currently use with bpf_skb_copy(). I missed that when len is larger than the size of the current handle, we can issue multiple invocations of copy_func, and __output_custom() advances destination but also source buffer by the written amount of bytes. When we have __output_custom(), this is actually wrong since in that case the source buffer points to a non-linear object, in our case an skb, which the copy_func helper is supposed to walk. Therefore, since this is non-linear we thus need to pass the offset into the helper, so that copy_func can use it for extracting the data from the source object. Therefore, adjust the callback signatures properly and pass offset into the skb_header_pointer() invoked from bpf_skb_copy() callback. The __DEFINE_OUTPUT_COPY_BODY() is adjusted to accommodate for two things: i) to pass in whether we should advance source buffer or not; this is a compile-time constant condition, ii) to pass in the offset for __output_custom(), which we do with help of __VA_ARGS__, so everything can stay inlined as is currently. Both changes allow for adapting the __output_* fast-path helpers w/o extra overhead. Fixes: 555c8a8623a3 ("bpf: avoid stack copy and use skb ctx for event output") Fixes: 7e3f977edd0b ("perf, events: add non-linear data support for raw records") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-25net/ncsi: avoid maybe-uninitialized warningArnd Bergmann1-13/+19
gcc-4.9 and higher warn about the newly added NSCI code: net/ncsi/ncsi-manage.c: In function 'ncsi_process_next_channel': net/ncsi/ncsi-manage.c:1003:2: error: 'old_state' may be used uninitialized in this function [-Werror=maybe-uninitialized] The warning is a false positive and therefore harmless, but it would be good to avoid it anyway. I have determined that the barrier in the spin_unlock_irqsave() is what confuses gcc to the point that it cannot track whether the variable was unused or not. This rearranges the code in a way that makes it obvious to gcc that old_state is always initialized at the time of use, functionally this should not change anything. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-25Merge branch 'libcxgb'David S. Miller19-866/+540
Varun Prakash says: ==================== common library for Chelsio drivers. This patch series adds common library module(libcxgb.ko) for Chelsio drivers to remove duplicate code. This series moves common iSCSI DDP Page Pod manager code from cxgb4.ko to libcxgb.ko, earlier this code was used by only cxgbit.ko now it is used by three Chelsio iSCSI drivers cxgb3i, cxgb4i, cxgbit. In future this module will have common connection management and hardware specific code that can be shared by multiple Chelsio drivers(cxgb4, csiostor, iw_cxgb4, cxgb4i, cxgbit). Please review. Thanks -v3 - removed unused module init and exit functions. -v2 - updated CONFIG_CHELSIO_LIB to an invisible option - changed libcxgb.ko module license from GPL to Dual BSD/GPL ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-25cxgb3i, cxgb4i: fix symbol not declared sparse warningVarun Prakash2-3/+3
Fix following sparse warnings warning: symbol 'cxgb3i_ofld_init' was not declared. Should it be static? warning: symbol 'cxgb4i_cplhandlers' was not declared. Should it be static? warning: symbol 'cxgb4i_ofld_init' was not declared. Should it be static? Signed-off-by: Varun Prakash <varun@chelsio.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-25libcxgb: export ppm release and tagmask set apiVarun Prakash4-0/+6
Export cxgbi_ppm_release() to release ppod manager and cxgbi_tagmask_set() to set tag mask, they are used by cxgb3i, cxgb4i and cxgbit. Signed-off-by: Varun Prakash <varun@chelsio.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-25cxgb3i: add iSCSI DDP supportVarun Prakash1-1/+118
Add iSCSI DDP support in cxgb3i driver using common iSCSI DDP Page Pod Manager. Signed-off-by: Varun Prakash <varun@chelsio.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: David S. Miller <davem@davemloft.net>