path: root/net
Age | Commit message | Author | Files | Lines
2019-02-28 | net/smc: allow pnetid-less configuration | Ursula Braun | 1 | -1/+41
Without hardware pnetid support there must currently be a pnet table configured to determine the IB device port to be used for SMC RDMA traffic. This patch enables a setup without pnet table, if the used handshake interface belongs already to a RoCE port. Signed-off-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-28 | net: sched: pie: avoid slow division in drop probability decay | Leslie Monis | 1 | -1/+2
As per RFC 8033, it is sufficient for the drop probability decay factor to have a value of (1 - 1/64) instead of 98%. This avoids the need to do slow division. Suggested-by: David Laight <David.Laight@aculab.com> Signed-off-by: Leslie Monis <lesliemonis@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
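The arithmetic behind this change can be sketched in userspace C. The helper names below are hypothetical and are not taken from the qdisc source; the sketch only contrasts a 98% decay, which needs a 64-bit division by 100, with a (1 - 1/64) decay, where the divisor is a power of two and the compiler can emit a shift.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: decay to 98% of the value.
 * Needs a multiply and a 64-bit division by 100. */
static uint64_t decay_by_percent(uint64_t prob)
{
	assert(prob <= UINT64_MAX / 98); /* keep the multiply from overflowing */
	return prob * 98 / 100;
}

/* Hypothetical helper: decay by a factor of (1 - 1/64).
 * Dividing an unsigned value by 64 reduces to a right shift by 6. */
static uint64_t decay_by_64th(uint64_t prob)
{
	return prob - prob / 64;
}
```

The two factors are close (0.98 versus roughly 0.984), which is why the cheaper form is an acceptable substitute for a decay step.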
2019-02-27 | net: sched: act_csum: Fix csum calc for tagged packets | Eli Britstein | 1 | -2/+29
The csum calculation is different for IPv4/6. For VLAN packets, tc_skb_protocol returns the VLAN protocol rather than the packet's one (e.g. IPv4/6), so csum is not calculated. Furthermore, VLAN may not be stripped so csum is not calculated in this case too. Calculate the csum for those cases. Fixes: d8b9605d2697 ("net: sched: fix skb->protocol use in case of accelerated vlan path") Signed-off-by: Eli Britstein <elibr@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-27 | net: sched: act_tunnel_key: fix metadata handling | Vlad Buslov | 1 | -9/+9
Tunnel key action params->tcft_enc_metadata is only set when action is TCA_TUNNEL_KEY_ACT_SET. However, metadata pointer is incorrectly dereferenced during tunnel key init and release without verifying that action is if correct type, which causes NULL pointer dereference. Metadata tunnel dst_cache is also leaked on action overwrite. Fix metadata handling: - Verify that metadata pointer is not NULL before dereferencing it in tunnel_key_init error handling code. - Move dst_cache destroy code into tunnel_key_release_params() function that is called in both action overwrite and release cases (fixes resource leak) and verifies that actions has correct type before dereferencing metadata pointer (fixes NULL pointer dereference). Oops with KASAN enabled during tdc tests execution: [ 261.080482] ================================================================== [ 261.088049] BUG: KASAN: null-ptr-deref in dst_cache_destroy+0x21/0xa0 [ 261.094613] Read of size 8 at addr 00000000000000b0 by task tc/2976 [ 261.102524] CPU: 14 PID: 2976 Comm: tc Not tainted 5.0.0-rc7+ #157 [ 261.108844] Hardware name: Supermicro SYS-2028TP-DECR/X10DRT-P, BIOS 2.0b 03/30/2017 [ 261.116726] Call Trace: [ 261.119234] dump_stack+0x9a/0xeb [ 261.122625] ? dst_cache_destroy+0x21/0xa0 [ 261.126818] ? dst_cache_destroy+0x21/0xa0 [ 261.131004] kasan_report+0x176/0x192 [ 261.134752] ? idr_get_next+0xd0/0x120 [ 261.138578] ? dst_cache_destroy+0x21/0xa0 [ 261.142768] dst_cache_destroy+0x21/0xa0 [ 261.146799] tunnel_key_release+0x3a/0x50 [act_tunnel_key] [ 261.152392] tcf_action_cleanup+0x2c/0xc0 [ 261.156490] tcf_generic_walker+0x4c2/0x5c0 [ 261.160794] ? tcf_action_dump_1+0x390/0x390 [ 261.165163] ? tunnel_key_walker+0x5/0x1a0 [act_tunnel_key] [ 261.170865] ? tunnel_key_walker+0xe9/0x1a0 [act_tunnel_key] [ 261.176641] tca_action_gd+0x600/0xa40 [ 261.180482] ? tca_get_fill.constprop.17+0x200/0x200 [ 261.185548] ? __lock_acquire+0x588/0x1d20 [ 261.189741] ? __lock_acquire+0x588/0x1d20 [ 261.193922] ? 
mark_held_locks+0x90/0x90 [ 261.197944] ? mark_held_locks+0x90/0x90 [ 261.202018] ? __nla_parse+0xfe/0x190 [ 261.205774] tc_ctl_action+0x218/0x230 [ 261.209614] ? tcf_action_add+0x230/0x230 [ 261.213726] rtnetlink_rcv_msg+0x3a5/0x600 [ 261.217910] ? lock_downgrade+0x2d0/0x2d0 [ 261.222006] ? validate_linkmsg+0x400/0x400 [ 261.226278] ? find_held_lock+0x6d/0xd0 [ 261.230200] ? match_held_lock+0x1b/0x210 [ 261.234296] ? validate_linkmsg+0x400/0x400 [ 261.238567] netlink_rcv_skb+0xc7/0x1f0 [ 261.242489] ? netlink_ack+0x470/0x470 [ 261.246319] ? netlink_deliver_tap+0x1f3/0x5a0 [ 261.250874] netlink_unicast+0x2ae/0x350 [ 261.254884] ? netlink_attachskb+0x340/0x340 [ 261.261647] ? _copy_from_iter_full+0xdd/0x380 [ 261.268576] ? __virt_addr_valid+0xb6/0xf0 [ 261.275227] ? __check_object_size+0x159/0x240 [ 261.282184] netlink_sendmsg+0x4d3/0x630 [ 261.288572] ? netlink_unicast+0x350/0x350 [ 261.295132] ? netlink_unicast+0x350/0x350 [ 261.301608] sock_sendmsg+0x6d/0x80 [ 261.307467] ___sys_sendmsg+0x48e/0x540 [ 261.313633] ? copy_msghdr_from_user+0x210/0x210 [ 261.320545] ? save_stack+0x89/0xb0 [ 261.326289] ? __lock_acquire+0x588/0x1d20 [ 261.332605] ? entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 261.340063] ? mark_held_locks+0x90/0x90 [ 261.346162] ? do_filp_open+0x138/0x1d0 [ 261.352108] ? may_open_dev+0x50/0x50 [ 261.357897] ? match_held_lock+0x1b/0x210 [ 261.364016] ? __fget_light+0xa6/0xe0 [ 261.369840] ? __sys_sendmsg+0xd2/0x150 [ 261.375814] __sys_sendmsg+0xd2/0x150 [ 261.381610] ? __ia32_sys_shutdown+0x30/0x30 [ 261.388026] ? lock_downgrade+0x2d0/0x2d0 [ 261.394182] ? mark_held_locks+0x1c/0x90 [ 261.400230] ? 
do_syscall_64+0x1e/0x280 [ 261.406172] do_syscall_64+0x78/0x280 [ 261.411932] entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 261.419103] RIP: 0033:0x7f28e91a8b87 [ 261.424791] Code: 64 89 02 48 c7 c0 ff ff ff ff eb b9 0f 1f 80 00 00 00 00 8b 05 6a 2b 2c 00 48 63 d2 48 63 ff 85 c0 75 18 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 59 f3 c3 0f 1f 80 00 00 00 00 53 48 89 f3 48 [ 261.448226] RSP: 002b:00007ffdc5c4e2d8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e [ 261.458183] RAX: ffffffffffffffda RBX: 000000005c73c202 RCX: 00007f28e91a8b87 [ 261.467728] RDX: 0000000000000000 RSI: 00007ffdc5c4e340 RDI: 0000000000000003 [ 261.477342] RBP: 0000000000000000 R08: 0000000000000001 R09: 000000000000000c [ 261.486970] R10: 000000000000000c R11: 0000000000000246 R12: 0000000000000001 [ 261.496599] R13: 000000000067b4e0 R14: 00007ffdc5c5248c R15: 00007ffdc5c52480 [ 261.506281] ================================================================== [ 261.516076] Disabling lock debugging due to kernel taint [ 261.523979] BUG: unable to handle kernel NULL pointer dereference at 00000000000000b0 [ 261.534413] #PF error: [normal kernel read fault] [ 261.541730] PGD 8000000317400067 P4D 8000000317400067 PUD 316878067 PMD 0 [ 261.551294] Oops: 0000 [#1] SMP KASAN PTI [ 261.557985] CPU: 14 PID: 2976 Comm: tc Tainted: G B 5.0.0-rc7+ #157 [ 261.568306] Hardware name: Supermicro SYS-2028TP-DECR/X10DRT-P, BIOS 2.0b 03/30/2017 [ 261.578874] RIP: 0010:dst_cache_destroy+0x21/0xa0 [ 261.586413] Code: f4 ff ff ff eb f6 0f 1f 00 0f 1f 44 00 00 41 56 41 55 49 c7 c6 60 fe 35 af 41 54 55 49 89 fc 53 bd ff ff ff ff e8 ef 98 73 ff <49> 83 3c 24 00 75 35 eb 6c 4c 63 ed e8 de 98 73 ff 4a 8d 3c ed 40 [ 261.611247] RSP: 0018:ffff888316447160 EFLAGS: 00010282 [ 261.619564] RAX: 0000000000000000 RBX: ffff88835b3e2f00 RCX: ffffffffad1c5071 [ 261.629862] RDX: 0000000000000003 RSI: dffffc0000000000 RDI: 0000000000000297 [ 261.640149] RBP: 00000000ffffffff R08: fffffbfff5dd4e89 R09: fffffbfff5dd4e89 [ 261.650467] R10: 
0000000000000001 R11: fffffbfff5dd4e88 R12: 00000000000000b0 [ 261.660785] R13: ffff8883267a10c0 R14: ffffffffaf35fe60 R15: 0000000000000001 [ 261.671110] FS: 00007f28ea3e6400(0000) GS:ffff888364200000(0000) knlGS:0000000000000000 [ 261.682447] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 261.691491] CR2: 00000000000000b0 CR3: 00000003178ae004 CR4: 00000000001606e0 [ 261.701283] Call Trace: [ 261.706374] tunnel_key_release+0x3a/0x50 [act_tunnel_key] [ 261.714522] tcf_action_cleanup+0x2c/0xc0 [ 261.721208] tcf_generic_walker+0x4c2/0x5c0 [ 261.728074] ? tcf_action_dump_1+0x390/0x390 [ 261.734996] ? tunnel_key_walker+0x5/0x1a0 [act_tunnel_key] [ 261.743247] ? tunnel_key_walker+0xe9/0x1a0 [act_tunnel_key] [ 261.751557] tca_action_gd+0x600/0xa40 [ 261.757991] ? tca_get_fill.constprop.17+0x200/0x200 [ 261.765644] ? __lock_acquire+0x588/0x1d20 [ 261.772461] ? __lock_acquire+0x588/0x1d20 [ 261.779266] ? mark_held_locks+0x90/0x90 [ 261.785880] ? mark_held_locks+0x90/0x90 [ 261.792470] ? __nla_parse+0xfe/0x190 [ 261.798738] tc_ctl_action+0x218/0x230 [ 261.805145] ? tcf_action_add+0x230/0x230 [ 261.811760] rtnetlink_rcv_msg+0x3a5/0x600 [ 261.818564] ? lock_downgrade+0x2d0/0x2d0 [ 261.825433] ? validate_linkmsg+0x400/0x400 [ 261.832256] ? find_held_lock+0x6d/0xd0 [ 261.838624] ? match_held_lock+0x1b/0x210 [ 261.845142] ? validate_linkmsg+0x400/0x400 [ 261.851729] netlink_rcv_skb+0xc7/0x1f0 [ 261.857976] ? netlink_ack+0x470/0x470 [ 261.864132] ? netlink_deliver_tap+0x1f3/0x5a0 [ 261.870969] netlink_unicast+0x2ae/0x350 [ 261.877294] ? netlink_attachskb+0x340/0x340 [ 261.883962] ? _copy_from_iter_full+0xdd/0x380 [ 261.890750] ? __virt_addr_valid+0xb6/0xf0 [ 261.897188] ? __check_object_size+0x159/0x240 [ 261.903928] netlink_sendmsg+0x4d3/0x630 [ 261.910112] ? netlink_unicast+0x350/0x350 [ 261.916410] ? netlink_unicast+0x350/0x350 [ 261.922656] sock_sendmsg+0x6d/0x80 [ 261.928257] ___sys_sendmsg+0x48e/0x540 [ 261.934183] ? copy_msghdr_from_user+0x210/0x210 [ 261.940865] ? 
save_stack+0x89/0xb0 [ 261.946355] ? __lock_acquire+0x588/0x1d20 [ 261.952358] ? entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 261.959468] ? mark_held_locks+0x90/0x90 [ 261.965248] ? do_filp_open+0x138/0x1d0 [ 261.970910] ? may_open_dev+0x50/0x50 [ 261.976386] ? match_held_lock+0x1b/0x210 [ 261.982210] ? __fget_light+0xa6/0xe0 [ 261.987648] ? __sys_sendmsg+0xd2/0x150 [ 261.993263] __sys_sendmsg+0xd2/0x150 [ 261.998613] ? __ia32_sys_shutdown+0x30/0x30 [ 262.004555] ? lock_downgrade+0x2d0/0x2d0 [ 262.010236] ? mark_held_locks+0x1c/0x90 [ 262.015758] ? do_syscall_64+0x1e/0x280 [ 262.021234] do_syscall_64+0x78/0x280 [ 262.026500] entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 262.033207] RIP: 0033:0x7f28e91a8b87 [ 262.038421] Code: 64 89 02 48 c7 c0 ff ff ff ff eb b9 0f 1f 80 00 00 00 00 8b 05 6a 2b 2c 00 48 63 d2 48 63 ff 85 c0 75 18 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 59 f3 c3 0f 1f 80 00 00 00 00 53 48 89 f3 48 [ 262.060708] RSP: 002b:00007ffdc5c4e2d8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e [ 262.070112] RAX: ffffffffffffffda RBX: 000000005c73c202 RCX: 00007f28e91a8b87 [ 262.079087] RDX: 0000000000000000 RSI: 00007ffdc5c4e340 RDI: 0000000000000003 [ 262.088122] RBP: 0000000000000000 R08: 0000000000000001 R09: 000000000000000c [ 262.097157] R10: 000000000000000c R11: 0000000000000246 R12: 0000000000000001 [ 262.106207] R13: 000000000067b4e0 R14: 00007ffdc5c5248c R15: 00007ffdc5c52480 [ 262.115271] Modules linked in: act_tunnel_key act_skbmod act_simple act_connmark nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 act_csum libcrc32c act_meta_skbtcindex act_meta_skbprio act_meta_mark act_ife ife act_police act_sample psample act_gact veth nfsv3 nfs_acl nfs lockd grace fscache bridge stp llc intel_rapl sb_edac mlx5_ib x86_pkg_temp_thermal sunrpc intel_powerclamp coretemp ib_uverbs kvm_intel ib_core kvm irqbypass mlx5_core crct10dif_pclmul crc32_pclmul crc32c_intel igb ghash_clmulni_intel intel_cstate mlxfw iTCO_wdt devlink intel_uncore iTCO_vendor_support ipmi_ssif ptp 
mei_me intel_rapl_perf ioatdma joydev pps_core ses mei i2c_i801 pcspkr enclosure lpc_ich dca wmi ipmi_si ipmi_devintf ipmi_msghandler acpi_pad acpi_power_meter pcc_cpufreq ast i2c_algo_bit drm_kms_helper ttm drm mpt3sas raid_class scsi_transport_sas [ 262.204393] CR2: 00000000000000b0 [ 262.210390] ---[ end trace 2e41d786f2c7901a ]--- [ 262.226790] RIP: 0010:dst_cache_destroy+0x21/0xa0 [ 262.234083] Code: f4 ff ff ff eb f6 0f 1f 00 0f 1f 44 00 00 41 56 41 55 49 c7 c6 60 fe 35 af 41 54 55 49 89 fc 53 bd ff ff ff ff e8 ef 98 73 ff <49> 83 3c 24 00 75 35 eb 6c 4c 63 ed e8 de 98 73 ff 4a 8d 3c ed 40 [ 262.258311] RSP: 0018:ffff888316447160 EFLAGS: 00010282 [ 262.266304] RAX: 0000000000000000 RBX: ffff88835b3e2f00 RCX: ffffffffad1c5071 [ 262.276251] RDX: 0000000000000003 RSI: dffffc0000000000 RDI: 0000000000000297 [ 262.286208] RBP: 00000000ffffffff R08: fffffbfff5dd4e89 R09: fffffbfff5dd4e89 [ 262.296183] R10: 0000000000000001 R11: fffffbfff5dd4e88 R12: 00000000000000b0 [ 262.306157] R13: ffff8883267a10c0 R14: ffffffffaf35fe60 R15: 0000000000000001 [ 262.316139] FS: 00007f28ea3e6400(0000) GS:ffff888364200000(0000) knlGS:0000000000000000 [ 262.327146] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 262.335815] CR2: 00000000000000b0 CR3: 00000003178ae004 CR4: 00000000001606e0 Fixes: 41411e2fd6b8 ("net/sched: act_tunnel_key: Add dst_cache support") Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-27 | route: Add multipath_hash in flowi_common to make user-define hash | wenxu | 3 | -4/+8
The current fib_multipath_hash_policy can hash on L3 or L4, but it only works on the outer IP header, so a specific tunnel always has the same hash value even though it may contain many inner connections. This patch provides a generic multipath_hash in flowi_common, which allows a user-defined hash that can be mixed with the L3 or L4 hash. Signed-off-by: wenxu <wenxu@ucloud.cn> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-27 | net: Remove switchdev_ops | Florian Fainelli | 1 | -5/+0
Now that we have converted all possible callers to using a switchdev notifier for attributes, we no longer need to implement switchdev_ops, and it can be removed from all drivers and from the net_device structure. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-27 | net: switchdev: Replace port attr set SDO with a notification | Florian Fainelli | 2 | -30/+31
Drop switchdev_ops.switchdev_port_attr_set. Drop the uses of this field from all clients, which were migrated to use switchdev notification in the previous patches. Add a new function switchdev_port_attr_notify() that sends the switchdev notifications SWITCHDEV_PORT_ATTR_SET and calls the blocking (process) notifier chain. We have one odd case within net/bridge/br_switchdev.c with the SWITCHDEV_ATTR_ID_PORT_PRE_BRIDGE_FLAGS attribute identifier that requires executing from atomic context, we deal with that one specifically. Drop __switchdev_port_attr_set() and update switchdev_port_attr_set() likewise. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-27 | net: dsa: Handle SWITCHDEV_PORT_ATTR_SET | Florian Fainelli | 1 | -0/+18
Following patches will change the way we communicate setting a port's attribute and use notifiers towards that goal. Prepare DSA to support receiving notifier events targeting SWITCHDEV_PORT_ATTR_SET from both atomic and process context and use a small helper to translate the event notifier into something that dsa_slave_port_attr_set() can process. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-27 | switchdev: Add SWITCHDEV_PORT_ATTR_SET | Florian Fainelli | 1 | -0/+51
In preparation for allowing switchdev enabled drivers to veto specific attribute settings from within the context of the caller, introduce a new switchdev notifier type for port attributes. Suggested-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-27 | Revert "net: sched: fw: don't set arg->stop in fw_walk() when empty" | Vlad Buslov | 1 | -1/+4
This reverts commit 31a998487641 ("net: sched: fw: don't set arg->stop in fw_walk() when empty") Cls API function tcf_proto_is_empty() was changed in commit 6676d5e416ee ("net: sched: set dedicated tcf_walker flag when tp is empty") to no longer depend on arg->stop to determine that classifier instance is empty. Instead, it adds dedicated arg->nonempty field, which makes the fix in fw classifier no longer necessary. Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-27 | ethtool: Use explicit designated initializers for .cmd | Li RongQing | 1 | -2/+2
Initialize the .cmd member by using a designated struct initializer. This fixes warning of missing field initializers, and makes code a little easier to read. Signed-off-by: Li RongQing <lirongqing@baidu.com> Reviewed-by: Michal Kubecek <mkubecek@suse.cz> Signed-off-by: David S. Miller <davem@davemloft.net>
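As an illustration of the difference, here is a minimal sketch with a hypothetical two-member structure. `struct example_cmd` and `EXAMPLE_GET_VALUE` are invented for this example and are not the ethtool definitions. Both forms produce the same value, but the positional form depends on member order and can trigger GCC's `-Wmissing-field-initializers`, which does not warn about designated initializers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a command structure with a leading .cmd member. */
struct example_cmd {
	uint32_t cmd;
	uint32_t data;
};

#define EXAMPLE_GET_VALUE 0x10u /* hypothetical command number */

/* Positional form: correct only as long as .cmd stays the first member. */
static void init_positional(struct example_cmd *out)
{
	assert(out != NULL);
	*out = (struct example_cmd){ EXAMPLE_GET_VALUE };
}

/* Designated form: names the member; the remaining members are zeroed. */
static void init_designated(struct example_cmd *out)
{
	assert(out != NULL);
	*out = (struct example_cmd){ .cmd = EXAMPLE_GET_VALUE };
}
```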
2019-02-26 | net: sched: pie: fix 64-bit division | Leslie Monis | 1 | -1/+1
Use div_u64() to resolve build failures on 32-bit platforms. Fixes: 3f7ae5f3dc52 ("net: sched: pie: add more cases to auto-tune alpha and beta") Signed-off-by: Leslie Monis <lesliemonis@gmail.com> Reported-by: Randy Dunlap <rdunlap@infradead.org> Tested-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-26 | net: Use RCU_POINTER_INITIALIZER() to init static variable | Li RongQing | 1 | -1/+1
This pointer is RCU protected, so proper primitives should be used. Signed-off-by: Zhang Yu <zhangyu31@baidu.com> Signed-off-by: Li RongQing <lirongqing@baidu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-26 | tcp: remove tcp_queue argument from tso_fragment() | Eric Dumazet | 1 | -7/+6
tso_fragment() is only called for packets still in write queue. Remove the tcp_queue parameter to make this more obvious, even if the comment clearly states this. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-26 | tcp: use tcp_md5_needed for timewait sockets | Eric Dumazet | 1 | -8/+13
This might speed up tcp_twsk_destructor() a bit by avoiding a cache line miss. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-26 | tcp: convert tcp_md5_needed to static_branch API | Eric Dumazet | 3 | -4/+4
We prefer static_branch_unlikely() over static_key_false() these days. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-26 | tcp: get rid of tcp_check_send_head() | Eric Dumazet | 1 | -1/+2
This helper is used only once, and its name is no longer relevant. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-26 | net: sched: fix typo in walker_check_empty() | Vlad Buslov | 1 | -2/+2
Function walker_check_empty() incorrectly verifies that the tp pointer is not NULL, instead of the actual filter pointer. Fix the conditional to check the right pointer. Adjust the filter pointer naming to match other cls API functions. Fixes: 6676d5e416ee ("net: sched: set dedicated tcf_walker flag when tp is empty") Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Reported-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-26 | net: sched: pie: fix mistake in reference link | Leslie Monis | 1 | -1/+1
Fix the incorrect reference link to RFC 8033 Signed-off-by: Leslie Monis <lesliemonis@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-26 | devlink: require non-NULL ops for devlink instances | Jakub Kicinski | 1 | -26/+22
Commit 76726ccb7f46 ("devlink: add flash update command") and commit 2d8dc5bbf4e7 ("devlink: Add support for reload") access devlink ops without NULL-checking. There is, however, no driver which would pass in NULL ops, so let's just make that a requirement. Remove the now unnecessary NULL-checking. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-26 | devlink: hold a reference to the netdevice around ethtool compat | Jakub Kicinski | 2 | -11/+15
When ethtool is calling into devlink compat code make sure we have a reference on the netdevice on which the operation was invoked. v3: move the hold/lock logic into devlink_compat_* functions (Florian) Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-26 | devlink: create a special NDO for getting the devlink instance | Jakub Kicinski | 1 | -39/+17
Instead of iterating over all devlink ports add a NDO which will return the devlink instance from the driver. v2: add the netdev_to_devlink() helper (Michal) v3: check that devlink has ops (Florian) v4: hold devlink_mutex (Jiri) Suggested-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-26 | net: devlink: turn devlink into a built-in | Jakub Kicinski | 3 | -24/+4
Being able to build devlink as a module causes growing pains. First, all drivers had to add a meta dependency to make sure they are not built in when devlink is built as a module. Now we are struggling to invoke ethtool compat code reliably. Make the devlink code built-in. Users can still choose not to build it at all, but the dynamically loadable module option is removed. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-26 | net: remove unused struct inet_frag_queue.fragments field | Peter Oskolkov | 5 | -36/+13
Now that all users of struct inet_frag_queue have been converted to use 'rb_fragments', remove the unused 'fragments' field. Build with `make allyesconfig` succeeded. ip_defrag selftest passed. Signed-off-by: Peter Oskolkov <posk@google.com> Acked-by: Stefan Schmidt <stefan@datenfreihafen.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-25 | net: sched: pie: update references | Mohit P. Tahiliani | 1 | -3/+1
RFC 8033 replaces the IETF draft for PIE Signed-off-by: Mohit P. Tahiliani <tahiliani@nitk.edu.in> Signed-off-by: Dhaval Khandla <dhavaljkhandla26@gmail.com> Signed-off-by: Hrishikesh Hiraskar <hrishihiraskar@gmail.com> Signed-off-by: Manish Kumar B <bmanish15597@gmail.com> Signed-off-by: Sachin D. Patil <sdp.sachin@gmail.com> Signed-off-by: Leslie Monis <lesliemonis@gmail.com> Acked-by: Dave Taht <dave.taht@gmail.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-25 | net: sched: pie: add derandomization mechanism | Mohit P. Tahiliani | 1 | -1/+27
Random dropping of packets to achieve latency control may introduce outlier situations where packets are dropped too close to each other or too far from each other. This can cause the real drop percentage to temporarily deviate from the intended drop probability. In certain scenarios, such as a small number of simultaneous TCP flows, these deviations can cause significant deviations in link utilization and queuing latency. RFC 8033 suggests using a derandomization mechanism to avoid these deviations. Signed-off-by: Mohit P. Tahiliani <tahiliani@nitk.edu.in> Signed-off-by: Dhaval Khandla <dhavaljkhandla26@gmail.com> Signed-off-by: Hrishikesh Hiraskar <hrishihiraskar@gmail.com> Signed-off-by: Manish Kumar B <bmanish15597@gmail.com> Signed-off-by: Sachin D. Patil <sdp.sachin@gmail.com> Signed-off-by: Leslie Monis <lesliemonis@gmail.com> Acked-by: Dave Taht <dave.taht@gmail.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
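The derandomization idea can be sketched as follows. This is a floating-point userspace illustration, not the kernel implementation: the structure, the function name, and the caller-supplied random sample are hypothetical, and the 0.85 and 8.5 thresholds are taken to be the example values from RFC 8033, so treat them as assumptions. The drop probability is accumulated on every packet; while the sum is small no packet is dropped, and once it is large a drop is forced.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed example thresholds (taken to match RFC 8033). */
#define ACCU_PROB_LOW  0.85
#define ACCU_PROB_HIGH 8.5

/* Hypothetical per-queue state. */
struct derand_state {
	double accu_prob; /* drop probability accumulated since the last drop */
};

/* Decide whether to drop one arriving packet.
 * rnd is a uniform sample in [0, 1) supplied by the caller. */
static bool derand_should_drop(struct derand_state *s, double drop_prob, double rnd)
{
	bool drop;

	assert(drop_prob >= 0.0 && drop_prob <= 1.0);

	s->accu_prob += drop_prob;
	if (s->accu_prob < ACCU_PROB_LOW)
		drop = false;            /* too soon after the last drop */
	else if (s->accu_prob >= ACCU_PROB_HIGH)
		drop = true;             /* too long without a drop: force one */
	else
		drop = rnd < drop_prob;  /* ordinary random decision */

	if (drop)
		s->accu_prob = 0.0;      /* start accumulating again */
	return drop;
}
```

Passing the random sample in as a parameter keeps the sketch deterministic; a real queue would draw it from its own random source.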
2019-02-25 | net: sched: pie: add more cases to auto-tune alpha and beta | Mohit P. Tahiliani | 1 | -33/+32
The current implementation scales the local alpha and beta variables in the calculate_probability function by the same amount for all values of drop probability below 1%. RFC 8033 suggests using additional cases for auto-tuning alpha and beta when the drop probability is less than 1%. In order to add more auto-tuning cases, MAX_PROB must be scaled by u64 instead of u32 to prevent underflow when scaling the local alpha and beta variables in the calculate_probability function. Signed-off-by: Mohit P. Tahiliani <tahiliani@nitk.edu.in> Signed-off-by: Dhaval Khandla <dhavaljkhandla26@gmail.com> Signed-off-by: Hrishikesh Hiraskar <hrishihiraskar@gmail.com> Signed-off-by: Manish Kumar B <bmanish15597@gmail.com> Signed-off-by: Sachin D. Patil <sdp.sachin@gmail.com> Signed-off-by: Leslie Monis <lesliemonis@gmail.com> Acked-by: Dave Taht <dave.taht@gmail.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-25 | net: sched: pie: change initial value of pie_vars->burst_time | Mohit P. Tahiliani | 1 | -2/+2
RFC 8033 suggests an initial value of 150 milliseconds for the maximum time allowed for a burst of packets. Signed-off-by: Mohit P. Tahiliani <tahiliani@nitk.edu.in> Signed-off-by: Dhaval Khandla <dhavaljkhandla26@gmail.com> Signed-off-by: Hrishikesh Hiraskar <hrishihiraskar@gmail.com> Signed-off-by: Manish Kumar B <bmanish15597@gmail.com> Signed-off-by: Sachin D. Patil <sdp.sachin@gmail.com> Signed-off-by: Leslie Monis <lesliemonis@gmail.com> Acked-by: Dave Taht <dave.taht@gmail.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-25 | net: sched: pie: change default value of pie_params->tupdate | Mohit P. Tahiliani | 1 | -1/+1
RFC 8033 suggests a default value of 15 milliseconds for the update interval. Signed-off-by: Mohit P. Tahiliani <tahiliani@nitk.edu.in> Signed-off-by: Dhaval Khandla <dhavaljkhandla26@gmail.com> Signed-off-by: Hrishikesh Hiraskar <hrishihiraskar@gmail.com> Signed-off-by: Manish Kumar B <bmanish15597@gmail.com> Signed-off-by: Sachin D. Patil <sdp.sachin@gmail.com> Signed-off-by: Leslie Monis <lesliemonis@gmail.com> Acked-by: Dave Taht <dave.taht@gmail.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-25 | net: sched: pie: change default value of pie_params->target | Mohit P. Tahiliani | 1 | -1/+1
RFC 8033 suggests a default value of 15 milliseconds for the target queue delay. Signed-off-by: Mohit P. Tahiliani <tahiliani@nitk.edu.in> Signed-off-by: Dhaval Khandla <dhavaljkhandla26@gmail.com> Signed-off-by: Hrishikesh Hiraskar <hrishihiraskar@gmail.com> Signed-off-by: Manish Kumar B <bmanish15597@gmail.com> Signed-off-by: Sachin D. Patil <sdp.sachin@gmail.com> Signed-off-by: Leslie Monis <lesliemonis@gmail.com> Acked-by: Dave Taht <dave.taht@gmail.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-25 | net: sched: pie: change value of QUEUE_THRESHOLD | Mohit P. Tahiliani | 1 | -1/+1
RFC 8033 recommends a value of 16384 bytes for the queue threshold. Signed-off-by: Mohit P. Tahiliani <tahiliani@nitk.edu.in> Signed-off-by: Dhaval Khandla <dhavaljkhandla26@gmail.com> Signed-off-by: Hrishikesh Hiraskar <hrishihiraskar@gmail.com> Signed-off-by: Manish Kumar B <bmanish15597@gmail.com> Signed-off-by: Sachin D. Patil <sdp.sachin@gmail.com> Signed-off-by: Leslie Monis <lesliemonis@gmail.com> Acked-by: Dave Taht <dave.taht@gmail.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-25 | net: sched: don't release block->lock when dumping chains | Vlad Buslov | 1 | -9/+7
Function tc_dump_chain() obtains and releases block->lock on each iteration of its inner loop that dumps all chains on the block. Outputting chain template info is a fast operation, so locking and unlocking the mutex multiple times is pure overhead when the lock is highly contended. Modify tc_dump_chain() to obtain block->lock only once and dump all chains without releasing it. Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Suggested-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
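The locking pattern described here can be illustrated with a small userspace sketch. Everything below is hypothetical: `struct counted_lock` is a stand-in that only counts acquisitions instead of providing real mutual exclusion, and the two dump functions simply sum an array in place of writing chain info. The point is the number of lock round trips: one per item versus one per dump.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a mutex that records how often it is taken. */
struct counted_lock {
	unsigned long acquisitions;
	int held;
};

static void counted_lock_acquire(struct counted_lock *l)
{
	assert(!l->held);
	l->held = 1;
	l->acquisitions++;
}

static void counted_lock_release(struct counted_lock *l)
{
	assert(l->held);
	l->held = 0;
}

/* Before: take and drop the lock around every item. */
static long dump_lock_per_item(struct counted_lock *l, const int *items, size_t n)
{
	long sum = 0;

	for (size_t i = 0; i < n; i++) {
		counted_lock_acquire(l);
		sum += items[i]; /* cheap per-item work */
		counted_lock_release(l);
	}
	return sum;
}

/* After: take the lock once and process all items under it. */
static long dump_lock_once(struct counted_lock *l, const int *items, size_t n)
{
	long sum = 0;

	counted_lock_acquire(l);
	for (size_t i = 0; i < n; i++)
		sum += items[i];
	counted_lock_release(l);
	return sum;
}
```

Holding the lock across the whole loop is only reasonable because the per-item work is short; a slow loop body would lengthen the time other contenders wait.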
2019-02-25 | net: sched: set dedicated tcf_walker flag when tp is empty | Vlad Buslov | 1 | -4/+9
Using tcf_walker->stop flag to determine when tcf_walker->fn() was called at least once is unreliable. Some classifiers set 'stop' flag on error before calling walker callback, other classifiers used to call it with NULL filter pointer when empty. In order to prevent further regressions, extend tcf_walker structure with dedicated 'nonempty' flag. Set this flag in tcf_walker->fn() implementation that is used to check if classifier has filters configured. Fixes: 8b64678e0af8 ("net: sched: refactor tp insert/delete for concurrent execution") Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Suggested-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
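A simplified sketch of the dedicated-flag approach is shown below. The types and function names are hypothetical simplifications and not the kernel's cls API: the walker carries a `nonempty` flag that is set only when the callback is handed a real (non-NULL) filter, so the emptiness check no longer has to infer anything from `stop`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified walker state. */
struct walker {
	bool stop;     /* may be set for several unrelated reasons */
	bool nonempty; /* set only when a real filter was seen */
};

/* Callback used for the emptiness check: it inspects the filter pointer
 * itself, not some other pointer that happens to be non-NULL. */
static void check_empty_cb(const void *filter, struct walker *w)
{
	if (filter) {
		w->nonempty = true;
		w->stop = true; /* one filter is enough, end the walk */
	}
}

/* Walk the filter slots and report whether none of them holds a filter. */
static bool classifier_is_empty(void *const *filters, size_t n)
{
	struct walker w = { .stop = false, .nonempty = false };

	assert(filters != NULL || n == 0);
	for (size_t i = 0; i < n && !w.stop; i++)
		check_empty_cb(filters[i], &w);
	return !w.nonempty;
}
```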
2019-02-25 | tcp: clean up SOCK_DEBUG() | Yafang Shao | 2 | -20/+1
Per discussion with Daniel[1] and Eric[2], these SOCK_DEBUG() calls in TCP are no longer needed, so clean them up. [1] https://patchwork.ozlabs.org/patch/1035573/ [2] https://patchwork.ozlabs.org/patch/1040533/ Signed-off-by: Yafang Shao <laoar.shao@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-25 | tcp: remove unused parameter of tcp_sacktag_bsearch() | Taehee Yoo | 1 | -10/+6
The state parameter of tcp_sacktag_bsearch() is not used, so it can be removed. Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-24 | switchdev: Complete removal of switchdev_port_attr_get() | Florian Fainelli | 1 | -42/+0
We have no more in tree users of switchdev_port_attr_get() after d0e698d57a94 ("Merge branch 'net-Get-rid-of-switchdev_port_attr_get'") so completely remove the function signature and body. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-24 | dsa: Remove phydev parameter from disable_port call | Andrew Lunn | 3 | -4/+4
No current DSA driver makes use of the phydev parameter passed to the disable_port call. Remove it. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-24 | Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next | David S. Miller | 12 | -71/+107
Johan Hedberg says:
====================
Here's the main bluetooth-next pull request for the 5.1 kernel.
- Fixes & improvements to mediatek, hci_qca, btrtl, and btmrvl HCI drivers
- Fixes to parsing invalid L2CAP config option sizes
- Locking fix to bt_accept_enqueue()
- Add support for new Marvell sd8977 chipset
- Various other smaller fixes & cleanups
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-24 | net: fix double-free in bpf_lwt_xmit_reroute | Peter Oskolkov | 1 | -1/+1
dst_output() frees skb when it fails (see, for example, ip_finish_output2), so it must not be freed in this case. Fixes: 3bd0b15281af ("bpf: add handling of BPF_LWT_REROUTE to lwt_bpf.c") Signed-off-by: Peter Oskolkov <posk@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
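The ownership rule behind this fix can be shown with a userspace sketch. The names are hypothetical and nothing here is the kernel's skb API: `output_consumes()` stands in for an output function that takes ownership of the packet and releases it whether or not it succeeds, so the caller must not free the packet again on the error path.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical packet type. */
struct packet {
	int len;
};

static int packets_freed; /* counts frees so the ownership rule can be checked */

static void packet_free(struct packet *p)
{
	free(p);
	packets_freed++;
}

/* Hypothetical output function: it always consumes the packet,
 * on success and on failure alike. */
static int output_consumes(struct packet *p)
{
	int err;

	assert(p != NULL);
	err = p->len > 0 ? 0 : -1;
	packet_free(p);
	return err;
}

/* Caller: propagates the error but does not free the packet,
 * because ownership already moved to output_consumes(). */
static int reroute(struct packet *p)
{
	return output_consumes(p);
}
```

Adding a `packet_free(p)` to the error path of `reroute()` would release the same allocation twice, which is the class of bug the commit removes.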
2019-02-24 | ip_tunnel: Add ip tunnel tun_info type dst_cache in ip_tunnel_xmit | wenxu | 1 | -11/+27
An ip tunnel device created without a tunnel destination, for example with `ip l add dev tun type gretap key 1000`, can send packets through lwtunnel. This patch provides tun_info dst_cache support for this mode. Signed-off-by: wenxu <wenxu@ucloud.cn> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-24 | ip_tunnel: Add dst_cache support in lwtunnel_state of ip tunnel | wenxu | 2 | -8/+26
The lwtunnel_state does not initialize the dst_cache, so ip_md_tunnel_xmit cannot use it and has to look up the route table for every packet. Signed-off-by: wenxu <wenxu@ucloud.cn> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-24 | tls: Return type of non-data records retrieved using MSG_PEEK in recvmsg | Vakul Garg | 1 | -11/+67
The patch enables returning 'type' in msghdr for records that are retrieved with MSG_PEEK in recvmsg. Further, it prevents records peeked from the socket from getting clubbed with any other record of a different type when records are subsequently dequeued from strparser. For each record, we now retain its type in sk_buff's control buffer cb[]. Inside the control buffer, the record's full length and offset are already stored by strparser in 'struct strp_msg'. We store the record type after 'struct strp_msg' inside 'struct tls_msg'. For tls1.2, the type is stored just after record dequeue. For tls1.3, the type is stored after the record has been decrypted. Inside process_rx_list(), before processing a non-data record, we check that we are able to return the record type to the user application. If not, the decrypted records in the tls context's rx_list are left there without consuming any data. Fixes: 692d7b5d1f912 ("tls: Fix recvmsg() to be able to peek across multiple records") Signed-off-by: Vakul Garg <vakul.garg@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-24 | ipv6: icmp: use percpu allocation | Kefeng Wang | 1 | -6/+5
Use percpu allocation for the ipv6.icmp_sk. Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-24 | ipv6: icmp: use icmpv6_sk_exit() | Kefeng Wang | 1 | -14/+11
Simply use icmpv6_sk_exit() when inet_ctl_sock_create() fails in icmpv6_sk_init(). Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-24 | ipv4: icmp: use icmp_sk_exit() | Kefeng Wang | 1 | -3/+1
Simply use icmp_sk_exit() when inet_ctl_sock_create() fails in icmp_sk_init(). Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-24 | ila: Fix uninitialised return value in ila_xlat_nl_cmd_flush | Herbert Xu | 1 | -1/+1
This patch fixes an uninitialised return value error in ila_xlat_nl_cmd_flush. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Fixes: 6c4128f65857 ("rhashtable: Remove obsolete...") Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-24 | net/sched: act_tunnel_key: Add dst_cache support | wenxu | 1 | -4/+21
The metadata_dst does not initialize the dst_cache, so ip_md_tunnel_xmit cannot use it and has to look up the route table for every packet. Signed-off-by: wenxu <wenxu@ucloud.cn> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-24 | net: Remove switchdev.h inclusion from team/bond/vlan | Florian Fainelli | 1 | -1/+0
This is no longer necessary after eca59f691566 ("net: Remove support for bridge bypass ndos from stacked devices") Suggested-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Andy Gospodarek <andy@greyhouse.net> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-24 | net: dev: add generic protodown handler | Andy Roulin | 1 | -0/+19
Introduce dev_change_proto_down_generic, a generic ndo_change_proto_down implementation, which sets the netdev carrier state according to proto_down. This adds the ability to set protodown on vxlan and macvlan devices in a generic way for use by control protocols like VRRPD. Signed-off-by: Andy Roulin <aroulin@cumulusnetworks.com> Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
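The behaviour described, carrier state following proto_down, can be sketched with a mock device. `struct mock_netdev` and `mock_change_proto_down()` are hypothetical userspace stand-ins and not the kernel's net_device or the function added by this commit; the sketch only shows the intended relationship between the two fields.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the relevant parts of a network device. */
struct mock_netdev {
	bool carrier_ok; /* link is reported as up */
	bool proto_down; /* a control protocol has asked for the link to be held down */
};

/* Generic handler sketch: carrier goes off when proto_down is set
 * and comes back on when it is cleared. */
static int mock_change_proto_down(struct mock_netdev *dev, bool proto_down)
{
	assert(dev != NULL);
	dev->carrier_ok = !proto_down;
	dev->proto_down = proto_down;
	return 0;
}
```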
2019-02-24 | net: Skip GSO length estimation if transport header is not set | Maxim Mikityanskiy | 1 | -1/+1
qdisc_pkt_len_init expects transport_header to be set for GSO packets. Patch [1] skips transport_header validation for GSO packets that don't have network_header set at the moment of calling virtio_net_hdr_to_skb, and allows them to pass into the stack. After patch [2] no placeholder value is assigned to transport_header if dissection fails, so this patch adds a check to the place where the value of transport_header is used. [1] https://patchwork.ozlabs.org/patch/1044429/ [2] https://patchwork.ozlabs.org/patch/1046122/ Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>