aboutsummaryrefslogtreecommitdiffstats
path: root/net (follow)
AgeCommit message (Collapse)AuthorFilesLines
2016-10-29netlink: Add nla_memdup() to wrap kmemdup() use on nlattrThomas Graf3-8/+3
Wrap several common instances of: kmemdup(nla_data(attr), nla_len(attr), GFP_KERNEL); Signed-off-by: Thomas Graf <tgraf@suug.ch> Acked-by: Johannes Berg <johannes@sipsolutions.net> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-29net: ip, diag: include net/inet_sock.hArnd Bergmann1-0/+1
The newly added raw_diag.c fails to build in some configurations unless we include this header: In file included from net/ipv4/raw_diag.c:6:0: include/net/raw.h:71:21: error: field 'inet' has incomplete type net/ipv4/raw_diag.c: In function 'raw_diag_dump': net/ipv4/raw_diag.c:166:29: error: implicit declaration of function 'inet_sk' Fixes: 432490f9d455 ("net: ip, diag -- Add diag interface for raw sockets") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-28net caif: insert missing spaces in pr_* messages and unbreak multi-line stringsColin Ian King1-6/+3
Some of the pr_* messages are missing spaces, so insert these and also unbreak multi-line literal strings in pr_* messages Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-28Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-nextDavid S. Miller1-8/+0
Steffen Klassert says: ==================== pull request (net-next): ipsec-next 2016-10-25 Just a leftover from the last development cycle. 1) Remove some unused code, from Florian Westphal. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-27net: skip genenerating uevents for network namespaces that are exitingAndrey Vagin1-3/+11
No one can see these events, because a network namespace can not be destroyed, if it has sockets. Unlike other devices, uevent-s for network devices are generated only inside their network namespaces. They are filtered in kobj_bcast_filter() My experiments shows that net namespaces are destroyed more 30% faster with this optimization. Here is a perf output for destroying network namespaces without this patch. - 94.76% 0.02% kworker/u48:1 [kernel.kallsyms] [k] cleanup_net - 94.74% cleanup_net - 94.64% ops_exit_list.isra.4 - 41.61% default_device_exit_batch - 41.47% unregister_netdevice_many - rollback_registered_many - 40.36% netdev_unregister_kobject - 14.55% device_del + 13.71% kobject_uevent - 13.04% netdev_queue_update_kobjects + 12.96% kobject_put - 12.72% net_rx_queue_update_kobjects kobject_put - kobject_release + 12.69% kobject_uevent + 0.80% call_netdevice_notifiers_info + 19.57% nfsd_exit_net + 11.15% tcp_net_metrics_exit + 8.25% rpcsec_gss_exit_net It's very critical to optimize the exit path for network namespaces, because they are destroyed under net_mutex and many namespaces can be destroyed for one iteration. v2: use dev_set_uevent_suppress() Cc: Cong Wang <xiyou.wangcong@gmail.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Andrei Vagin <avagin@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-27genetlink: mark families as __ro_after_initJohannes Berg23-34/+34
Now genl_register_family() is the only thing (other than the users themselves, perhaps, but I didn't find any doing that) writing to the family struct. In all families that I found, genl_register_family() is only called from __init functions (some indirectly, in which case I've add __init annotations to clarifly things), so all can actually be marked __ro_after_init. This protects the data structure from accidental corruption. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-27genetlink: use idr to track familiesJohannes Berg1-165/+106
Since generic netlink family IDs are small integers, allocated densely, IDR is an ideal match for lookups. Replace the existing hand-written hash-table with IDR for allocation and lookup. This lets the families only be written to once, during register, since the list_head can be removed and removal of a family won't cause any writes. It also slightly reduces the code size (by about 1.3k on x86-64). Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-27genetlink: statically initialize familiesJohannes Berg23-219/+304
Instead of providing macros/inline functions to initialize the families, make all users initialize them statically and get rid of the macros. This reduces the kernel code size by about 1.6k on x86-64 (with allyesconfig). Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-27genetlink: no longer support using static family IDsJohannes Berg23-40/+22
Static family IDs have never really been used, the only use case was the workaround I introduced for those users that assumed their family ID was also their multicast group ID. Additionally, because static family IDs would never be reserved by the generic netlink code, using a relatively low ID would only work for built-in families that can be registered immediately after generic netlink is started, which is basically only the control family (apart from the workaround code, which I also had to add code for so it would reserve those IDs) Thus, anything other than GENL_ID_GENERATE is flawed and luckily not used except in the cases I mentioned. Move those workarounds into a few lines of code, and then get rid of GENL_ID_GENERATE entirely, making it more robust. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-27genetlink: introduce and use genl_family_attrbuf()Johannes Berg5-35/+53
This helper function allows family implementations to access their family's attrbuf. This gets rid of the attrbuf usage in families, and also adds locking validation, since it's not valid to use the attrbuf with parallel_ops or outside of the dumpit callback. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-27skbedit: allow the user to specify bitmask for markAntonio Quartulli1-3/+18
The user may want to use only some bits of the skb mark in his skbedit rules because the remaining part might be used by something else. Introduce the "mask" parameter to the skbedit actor in order to implement such functionality. When the mask is specified, only those bits selected by the latter are altered really changed by the actor, while the rest is left untouched. Signed-off-by: Antonio Quartulli <antonio@open-mesh.com> Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-26devlink: Prevent port_type_set() callback when it's not neededElad Raz1-0/+2
When a port_type_set() is been called and the new port type set is the same as the old one, just return success. Signed-off-by: Elad Raz <eladr@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-26batman-adv: Revert "use core MTU range checking in misc drivers"Sven Eckelmann1-1/+12
The maximum MTU is defined via the slave devices of an batman-adv interface. Thus it is not possible to calculate the max_mtu during the creation of the batman-adv device when no slave devices are attached. Doing so would for example break non-fragmentation setups which then (incorrectly) allow an MTU of 1500 even when underlying device cannot transport 1500 bytes + batman-adv headers. Checking the dynamically calculated max_mtu via the minimum of the slave devices MTU during .ndo_change_mtu is also used by the bridge interface. Cc: Jarod Wilson <jarod@redhat.com> Fixes: b3e3893e1253 ("net: use core MTU range checking in misc drivers") Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-26sch_htb: do not report fake rate estimatorsEric Dumazet1-1/+1
When I prepared commit d250a5f90e53 ("pkt_sched: gen_estimator: Dont report fake rate estimators"), htb still had an implicit rate estimator for all its classes. Then later, I made this rate estimator optional in commit 64153ce0a7b6 ("net_sched: htb: do not setup default rate estimators"), but I forgot to update htb use of gnet_stats_copy_rate_est() After this patch, "tc -s qdisc ..." no longer report fake rate estimators for HTB classes. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-23net: ip, diag -- Add diag interface for raw socketsCyrill Gorcunov6-4/+303
In criu we are actively using diag interface to collect sockets present in the system when dumping applications. And while for unix, tcp, udp[lite], packet, netlink it works as expected, the raw sockets do not have. Thus add it. v2: - add missing sock_put calls in raw_diag_dump_one (by eric.dumazet@) - implement @destroy for diag requests (by dsa@) v3: - add export of raw_abort for IPv6 (by dsa@) - pass net-admin flag into inet_sk_diag_fill due to changes in net-next branch (by dsa@) v4: - use @pad in struct inet_diag_req_v2 for raw socket protocol specification: raw module carries sockets which may have custom protocol passed from socket() syscall and sole @sdiag_protocol is not enough to match underlied ones - start reporting protocol specifed in socket() call when sockets are raw ones for the same reason: user space tools like ss may parse this attribute and use it for socket matching v5 (by eric.dumazet@): - use sock_hold in raw_sock_get instead of atomic_inc, we're holding (raw_v4_hashinfo|raw_v6_hashinfo)->lock when looking up so counter won't be zero here. v6: - use sdiag_raw_protocol() helper which will access @pad structure used for raw sockets protocol specification: we can't simply rename this member without breaking uapi v7: - sine sdiag_raw_protocol() helper is not suitable for uapi lets rather make an alias structure with proper names. __check_inet_diag_req_raw helper will catch if any of structure unintentionally changed. CC: David S. Miller <davem@davemloft.net> CC: Eric Dumazet <eric.dumazet@gmail.com> CC: David Ahern <dsa@cumulusnetworks.com> CC: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> CC: James Morris <jmorris@namei.org> CC: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org> CC: Patrick McHardy <kaber@trash.net> CC: Andrey Vagin <avagin@openvz.org> CC: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-23lwt: Remove unused len fieldThomas Graf2-7/+2
The field is initialized by ILA and MPLS but never used. Remove it. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-23net: allow to kill a task which waits net_mutex in copy_new_nsAndrey Vagin1-1/+8
net_mutex can be locked for a long time. It may be because many namespaces are being destroyed or many processes decide to create a network namespace. Both these operations are heavy, so it is better to have an ability to kill a process which is waiting net_mutex. Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Andrei Vagin <avagin@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-23net/sched: em_meta: Fix 'meta vlan' to correctly recognize zero VID framesShmulik Ladkani1-4/+5
META_COLLECTOR int_vlan_tag() assumes that if the accel tag (vlan_tci) is zero, then no vlan accel tag is present. This is incorrect for zero VID vlan accel packets, making the following match fail: tc filter add ... basic match 'meta(vlan mask 0xfff eq 0)' ... Apparently 'int_vlan_tag' was implemented prior VLAN_TAG_PRESENT was introduced in 05423b2 "vlan: allow null VLAN ID to be used" (and at time introduced, the 'vlan_tx_tag_get' call in em_meta was not adapted). Fix, testing skb_vlan_tag_present instead of testing skb_vlan_tag_get's value. Fixes: 05423b2413 ("vlan: allow null VLAN ID to be used") Fixes: 1a31f2042e ("netsched: Allow meta match on vlan tag on receive") Signed-off-by: Shmulik Ladkani <shmulik.ladkani@gmail.com> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-22bpf: add helper for retrieving current numa node idDaniel Borkmann1-0/+2
Use case is mainly for soreuseport to select sockets for the local numa node, but since generic, lets also add this for other networking and tracing program types. Suggested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-22udp: use it's own memory accounting schemaPaolo Abeni4-66/+33
Completely avoid default sock memory accounting and replace it with udp-specific accounting. Since the new memory accounting model encapsulates completely the required locking, remove the socket lock on both enqueue and dequeue, and avoid using the backlog on enqueue. Be sure to clean-up rx queue memory on socket destruction, using udp its own sk_destruct. Tested using pktgen with random src port, 64 bytes packet, wire-speed on a 10G link as sender and udp_sink as the receiver, using an l4 tuple rxhash to stress the contention, and one or more udp_sink instances with reuseport. nr readers Kpps (vanilla) Kpps (patched) 1 170 440 3 1250 2150 6 3000 3650 9 4200 4450 12 5700 6250 v4 -> v5: - avoid unneeded test in first_packet_length v3 -> v4: - remove useless sk_rcvqueues_full() call v2 -> v3: - do not set the now unsed backlog_rcv callback v1 -> v2: - add memory pressure support - fixed dropwatch accounting for ipv6 Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-22udp: implement memory accounting helpersPaolo Abeni1-0/+106
Avoid using the generic helpers. Use the receive queue spin lock to protect the memory accounting operation, both on enqueue and on dequeue. On dequeue perform partial memory reclaiming, trying to leave a quantum of forward allocated memory. On enqueue use a custom helper, to allow some optimizations: - use a plain spin_lock() variant instead of the slightly costly spin_lock_irqsave(), - avoid dst_force check, since the calling code has already dropped the skb dst - avoid orphaning the skb, since skb_steal_sock() already did the work for us The above needs custom memory reclaiming on shutdown, provided by the udp_destruct_sock(). v5 -> v6: - don't orphan the skb on enqueue v4 -> v5: - replace the mem_lock with the receive queue spin lock - ensure that the bh is always allowed to enqueue at least a skb, even if sk_rcvbuf is exceeded v3 -> v4: - reworked memory accunting, simplifying the schema - provide an helper for both memory scheduling and enqueuing v1 -> v2: - use a udp specific destrctor to perform memory reclaiming - remove a couple of helpers, unneeded after the above cleanup - do not reclaim memory on dequeue if not under memory pressure - reworked the fwd accounting schema to avoid potential integer overflow Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-22net/socket: factor out helpers for memory and queue manipulationPaolo Abeni2-33/+67
Basic sock operations that udp code can use with its own memory accounting schema. No functional change is introduced in the existing APIs. v4 -> v5: - avoid whitespace changes v2 -> v4: - avoid exporting __sock_enqueue_skb v1 -> v2: - avoid export sock_rmem_free Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-21net: remove MTU limits on a few ether_setup callersJarod Wilson5-2/+11
These few drivers call ether_setup(), but have no ndo_change_mtu, and thus were overlooked for changes to MTU range checking behavior. They previously had no range checks, so for feature-parity, set their min_mtu to 0 and max_mtu to ETH_MAX_MTU (65535), instead of the 68 and 1500 inherited from the ether_setup() changes. Fine-tuning can come after we get back to full feature-parity here. CC: netdev@vger.kernel.org Reported-by: Asbjoern Sloth Toennesen <asbjorn@asbjorn.st> CC: Asbjoern Sloth Toennesen <asbjorn@asbjorn.st> CC: R Parameswaran <parameswaran.r7@gmail.com> Signed-off-by: Jarod Wilson <jarod@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-20ipv4/6: use core net MTU range checkingJarod Wilson4-33/+12
ipv4/ip_tunnel: - min_mtu = 68, max_mtu = 0xFFF8 - dev->hard_header_len - t_hlen - preserve all ndo_change_mtu checks for now to prevent regressions ipv6/ip6_tunnel: - min_mtu = 68, max_mtu = 0xFFF8 - dev->hard_header_len - preserve all ndo_change_mtu checks for now to prevent regressions ipv6/ip6_vti: - min_mtu = 1280, max_mtu = 65535 - remove redundant vti6_change_mtu ipv6/sit: - min_mtu = 1280, max_mtu = 0xFFF8 - t_hlen - remove redundant ipip6_tunnel_change_mtu CC: netdev@vger.kernel.org CC: "David S. Miller" <davem@davemloft.net> CC: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> CC: James Morris <jmorris@namei.org> CC: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org> CC: Patrick McHardy <kaber@trash.net> Signed-off-by: Jarod Wilson <jarod@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-20net: use core MTU range checking in misc driversJarod Wilson5-43/+8
firewire-net: - set min/max_mtu - remove fwnet_change_mtu nes: - set max_mtu - clean up nes_netdev_change_mtu xpnet: - set min/max_mtu - remove xpnet_dev_change_mtu hippi: - set min/max_mtu - remove hippi_change_mtu batman-adv: - set max_mtu - remove batadv_interface_change_mtu - initialization is a little async, not 100% certain that max_mtu is set in the optimal place, don't have hardware to test with rionet: - set min/max_mtu - remove rionet_change_mtu slip: - set min/max_mtu - streamline sl_change_mtu um/net_kern: - remove pointless ndo_change_mtu hsi/clients/ssi_protocol: - use core MTU range checking - remove now redundant ssip_pn_set_mtu ipoib: - set a default max MTU value - Note: ipoib's actual max MTU can vary, depending on if the device is in connected mode or not, so we'll just set the max_mtu value to the max possible, and let the ndo_change_mtu function continue to validate any new MTU change requests with checks for CM or not. Note that ipoib has no min_mtu set, and thus, the network core's mtu > 0 check is the only lower bounds here. mptlan: - use net core MTU range checking - remove now redundant mpt_lan_change_mtu fddi: - min_mtu = 21, max_mtu = 4470 - remove now redundant fddi_change_mtu (including export) fjes: - min_mtu = 8192, max_mtu = 65536 - The max_mtu value is actually one over IP_MAX_MTU here, but the idea is to get past the core net MTU range checks so fjes_change_mtu can validate a new MTU against what it supports (see fjes_support_mtu in fjes_hw.c) hsr: - min_mtu = 0 (calls ether_setup, max_mtu is 1500) f_phonet: - min_mtu = 6, max_mtu = 65541 u_ether: - min_mtu = 14, max_mtu = 15412 phonet/pep-gprs: - min_mtu = 576, max_mtu = 65530 - remove redundant gprs_set_mtu CC: netdev@vger.kernel.org CC: linux-rdma@vger.kernel.org CC: Stefan Richter <stefanr@s5r6.in-berlin.de> CC: Faisal Latif <faisal.latif@intel.com> CC: linux-rdma@vger.kernel.org CC: Cliff Whickman <cpw@sgi.com> CC: Robin Holt <robinmholt@gmail.com> CC: Jes Sorensen <jes@trained-monkey.org> CC: Marek Lindner <mareklindner@neomailbox.ch> CC: Simon Wunderlich <sw@simonwunderlich.de> CC: Antonio Quartulli <a@unstable.cc> CC: Sathya Prakash <sathya.prakash@broadcom.com> CC: Chaitra P B <chaitra.basappa@broadcom.com> CC: Suganath Prabu Subramani <suganath-prabu.subramani@broadcom.com> CC: MPT-FusionLinux.pdl@broadcom.com CC: Sebastian Reichel <sre@kernel.org> CC: Felipe Balbi <balbi@kernel.org> CC: Arvid Brodin <arvid.brodin@alten.se> CC: Remi Denis-Courmont <courmisch@gmail.com> Signed-off-by: Jarod Wilson <jarod@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-20net: use core MTU range checking in core net infraJarod Wilson4-14/+7
geneve: - Merge __geneve_change_mtu back into geneve_change_mtu, set max_mtu - This one isn't quite as straight-forward as others, could use some closer inspection and testing macvlan: - set min/max_mtu tun: - set min/max_mtu, remove tun_net_change_mtu vxlan: - Merge __vxlan_change_mtu back into vxlan_change_mtu - Set max_mtu to IP_MAX_MTU and retain dynamic MTU range checks in change_mtu function - This one is also not as straight-forward and could use closer inspection and testing from vxlan folks bridge: - set max_mtu of IP_MAX_MTU and retain dynamic MTU range checks in change_mtu function openvswitch: - set min/max_mtu, remove internal_dev_change_mtu - note: max_mtu wasn't checked previously, it's been set to 65535, which is the largest possible size supported sch_teql: - set min/max_mtu (note: max_mtu previously unchecked, used max of 65535) macsec: - min_mtu = 0, max_mtu = 65535 macvlan: - min_mtu = 0, max_mtu = 65535 ntb_netdev: - min_mtu = 0, max_mtu = 65535 veth: - min_mtu = 68, max_mtu = 65535 8021q: - min_mtu = 0, max_mtu = 65535 CC: netdev@vger.kernel.org CC: Nicolas Dichtel <nicolas.dichtel@6wind.com> CC: Hannes Frederic Sowa <hannes@stressinduktion.org> CC: Tom Herbert <tom@herbertland.com> CC: Daniel Borkmann <daniel@iogearbox.net> CC: Alexander Duyck <alexander.h.duyck@intel.com> CC: Paolo Abeni <pabeni@redhat.com> CC: Jiri Benc <jbenc@redhat.com> CC: WANG Cong <xiyou.wangcong@gmail.com> CC: Roopa Prabhu <roopa@cumulusnetworks.com> CC: Pravin B Shelar <pshelar@ovn.org> CC: Sabrina Dubroca <sd@queasysnail.net> CC: Patrick McHardy <kaber@trash.net> CC: Stephen Hemminger <stephen@networkplumber.org> CC: Pravin Shelar <pshelar@nicira.com> CC: Maxim Krasnyansky <maxk@qti.qualcomm.com> Signed-off-by: Jarod Wilson <jarod@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-20net: use core MTU range checking in WAN driversJarod Wilson1-10/+1
- set min/max_mtu in all hdlc drivers, remove hdlc_change_mtu - sent max_mtu in lec driver, remove lec_change_mtu - set min/max_mtu in x25_asy driver CC: netdev@vger.kernel.org CC: Krzysztof Halasa <khc@pm.waw.pl> CC: Krzysztof Halasa <khalasa@piap.pl> CC: Jan "Yenya" Kasprzak <kas@fi.muni.cz> CC: Francois Romieu <romieu@fr.zoreil.com> CC: Kevin Curtis <kevin.curtis@farsite.co.uk> CC: Zhao Qiang <qiang.zhao@nxp.com> Signed-off-by: Jarod Wilson <jarod@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-20net: use core MTU range checking in wireless driversJarod Wilson1-11/+4
- set max_mtu in wil6210 driver - set max_mtu in atmel driver - set min/max_mtu in cisco airo driver, remove airo_change_mtu - set min/max_mtu in ipw2100/ipw2200 drivers, remove libipw_change_mtu - set min/max_mtu in p80211netdev, remove wlan_change_mtu - set min/max_mtu in net/mac80211/iface.c and remove ieee80211_change_mtu - set min/max_mtu in wimax/i2400m and remove i2400m_change_mtu - set min/max_mtu in intersil/hostap and remove prism2_change_mtu - set min/max_mtu in intersil/orinoco - set min/max_mtu in tty/n_gsm and remove gsm_change_mtu CC: netdev@vger.kernel.org CC: linux-wireless@vger.kernel.org CC: Maya Erez <qca_merez@qca.qualcomm.com> CC: Simon Kelley <simon@thekelleys.org.uk> CC: Stanislav Yakovlev <stas.yakovlev@gmail.com> CC: Johannes Berg <johannes@sipsolutions.net> CC: Inaky Perez-Gonzalez <inaky.perez-gonzalez@intel.com> Signed-off-by: Jarod Wilson <jarod@redhat.com> Acked-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-20tcp: relax listening_hash operationsEric Dumazet2-15/+7
softirq handlers use RCU protection to lookup listeners, and write operations all happen from process context. We do not need to block BH for dump operations. Also SYN_RECV since request sockets are stored in the ehash table : 1) inet_diag_dump_icsk() no longer need to clear cb->args[3] and cb->args[4] that were used as cursors while iterating the old per listener hash table. 2) Also factorize a test : No need to scan listening_hash[] if r->id.idiag_dport is not zero. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-20ila: Fix tailroom allocation of lwtstateThomas Graf1-1/+1
Tailroom is supposed to be of length sizeof(struct ila_lwt) but sizeof(struct ila_params) is currently allocated. This leads to the dst_cache and connected member of ila_lwt being referenced out of bounds. struct ila_lwt { struct ila_params p; struct dst_cache dst_cache; u32 connected : 1; }; Fixes: 65d7ab8de582 ("net: Identifier Locator Addressing module") Signed-off-by: Thomas Graf <tgraf@suug.ch> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-19openvswitch: remove unnecessary EXPORT_SYMBOLsJiri Benc3-4/+0
Some symbols exported to other modules are really used only by openvswitch.ko. Remove the exports. Tested by loading all 4 openvswitch modules, nothing breaks. Signed-off-by: Jiri Benc <jbenc@redhat.com> Acked-by: Pravin B Shelar <pshelar@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-19openvswitch: remove unused functionsJiri Benc2-17/+0
ovs_vport_deferred_free is not used anywhere. It's the only caller of free_vport_rcu thus this one can be removed, too. Signed-off-by: Jiri Benc <jbenc@redhat.com> Acked-by: Pravin B Shelar <pshelar@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-18net: vlan: Use sizeof instead of literal numberGao Feng1-2/+2
Use sizeof variable instead of literal number to enhance the readability. Signed-off-by: Gao Feng <fgao@ikuai8.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-18net: dev: Improve debug statements for adjacency trackingDavid Ahern1-7/+15
Adjacency code only has debugs for the insert case. Add debugs for the remove path and make both consistently worded to make it easier to follow the insert and removal with reference counts. In addition, change the BUG to a WARN_ON. A missing adjacency at removal time is not cause for a panic. Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-18net: Add warning if any lower device is still in adjacency listDavid Ahern1-0/+15
Lower list should be empty just like upper. Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-18net: Remove all_adj_list and its referencesDavid Ahern1-205/+18
Only direct adjacencies are maintained. All upper or lower devices can be learned via the new walk API which recursively walks the adj_list for upper devices or lower devices. Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-18net: Introduce new api for walking upper and lower devicesDavid Ahern1-0/+155
This patch introduces netdev_walk_all_upper_dev_rcu, netdev_walk_all_lower_dev and netdev_walk_all_lower_dev_rcu. These functions recursively walk the adj_list of devices to determine all upper and lower devices. The functions take a callback function that is invoked for each device in the list. If the callback returns non-0, the walk is terminated and the functions return that code back to callers. v3 - simplified netdev_has_upper_dev_all_rcu and __netdev_has_upper_dev and removed typecast as suggested by Stephen v2 - fixed definition of netdev_next_lower_dev_rcu to mirror the upper_dev version. Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-18net: Remove refnr arg when inserting link adjacenciesDavid Ahern1-15/+12
Commit 93409033ae65 ("net: Add netdev all_adj_list refcnt propagation to fix panic") propagated the refnr to insert and remove functions tracking the netdev adjacency graph. However, for the insert path the refnr can only be 1. Accordingly, remove the refnr argument to make that clear. ie., the refnr arg in 93409033ae65 was only needed for the remove path. Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-18vlan: Remove unnecessary comparison of unsigned against 0Tobias Klauser1-2/+1
args.u.name_type is of type unsigned int and is always >= 0. This fixes the following GCC warning: net/8021q/vlan.c: In function ‘vlan_ioctl_handler’: net/8021q/vlan.c:574:14: warning: comparison of unsigned expression >= 0 is always true [-Wtype-limits] Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-17net: report right mtu value in error messageJakub Kicinski1-1/+1
Check is for max_mtu but message reports min_mtu. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-17rds: Remove duplicate prefix from rds_conn_path_error useJoe Perches1-2/+1
rds_conn_path_error already prefixes "RDS:" to the output. Signed-off-by: Joe Perches <joe@perches.com> Acked-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-17rds: Remove unused rds_conn_errorJoe Perches2-19/+0
This macro's last use was removed in commit d769ef81d5b59 ("RDS: Update rds_conn_shutdown to work with rds_conn_path") so make the macro and the __rds_conn_error function definition and declaration disappear. Signed-off-by: Joe Perches <joe@perches.com> Acked-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-17ila: Don't use dest cache when gateway is setTom Herbert1-0/+8
If the gateway is set on an ILA route we don't need to bother with using the destination cache in the ILA route. Translation does not change the routing in this case so we can stick with orig_output in the lwstate output function. Tested: Ran netperf with and without gateway for LWT route. Signed-off-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-15ila: Cache a route to translated addressTom Herbert1-6/+74
Add a dst_cache to ila_lwt structure. This holds a cached route for the translated address. In ila_output we now perform a route lookup after translation and if possible (destination in original route is full 128 bits) we set the dst_cache. Subsequent calls to ila_output can then use the cache to avoid the route lookup. This eliminates the need to set the gateway on ILA routes as previously was being done. Now we can do something like: ./ip route add 3333::2000:0:0:2/128 encap ila 2222:0:0:2 \ csum-mode neutral-map dev eth0 ## No via needed! Signed-off-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-15lwtunnel: Add destroy state operationTom Herbert1-0/+13
Users of lwt tunnels may set up some secondary state in build_state function. Add a corresponding destroy_state function to allow users to clean up state. This destroy state function is called from lwstate_free. Also, we now free lwstate using kfree_rcu so user can assume structure is not freed before rcu. Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-14net/sched: act_mirred: Implement ingress actionsShmulik Ladkani1-6/+45
Up until now, 'action mirred' supported only egress actions (either TCA_EGRESS_REDIR or TCA_EGRESS_MIRROR). This patch implements the corresponding ingress actions TCA_INGRESS_REDIR and TCA_INGRESS_MIRROR. This allows attaching filters whose target is to hand matching skbs into the rx processing of a specified device. Signed-off-by: Shmulik Ladkani <shmulik.ladkani@gmail.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Cong Wang <xiyou.wangcong@gmail.com> Tested-by: Jamal Hadi Salim <jhs@mojatatu.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-14net/sched: act_mirred: Refactor detection whether dev needs xmit at mac headerShmulik Ladkani1-13/+15
Move detection logic that tests whether device expects skb data to point at mac_header upon xmit into a function. Signed-off-by: Shmulik Ladkani <shmulik.ladkani@gmail.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-14net/sched: act_mirred: Rename tcfm_ok_push to tcfm_mac_header_xmit and make it a boolShmulik Ladkani1-5/+6
'tcfm_ok_push' specifies whether a mac_len sized push is needed upon egress to the target device (if action is performed at ingress). Rename it to 'tcfm_mac_header_xmit' as this is actually an attribute of the target device (and use a bool instead of int). This allows to decouple the attribute from the action to be taken. Signed-off-by: Shmulik Ladkani <shmulik.ladkani@gmail.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-14Revert "net: Add driver helper functions to determine checksum offloadability"stephen hemminger1-136/+0
This reverts commit 6ae23ad36253a8033c5714c52b691b84456487c5. The code has been in kernel since 4.4 but there are no in tree code that uses. Unused code is broken code, remove it. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-14Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller35-542/+950