aboutsummaryrefslogtreecommitdiffstats
path: root/include/net (follow)
AgeCommit message (Collapse)AuthorFilesLines
2015-10-07net: Rename FLOWI_FLAG_VRFSRC to FLOWI_FLAG_L3MDEV_SRCDavid Ahern2-2/+2
Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-05netfilter: remove dead codeFlavio Leitner1-4/+0
Remove __nf_conntrack_find() from headers. Fixes: dcd93ed4cd1 ("netfilter: nf_conntrack: remove dead code") Signed-off-by: Flavio Leitner <fbl@sysclose.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-10-05tcp/dccp: fix old style declarationsRaanan Avargil1-2/+2
I’m using the compilation flag -Werror=old-style-declaration, which requires that the “inline” word would come at the beginning of the code line. $ make drivers/net/ethernet/intel/e1000e/e1000e.ko ... include/net/inet_timewait_sock.h:116:1: error: ‘inline’ is not at beginning of declaration [-Werror=old-style-declaration] static void inline inet_twsk_schedule(struct inet_timewait_sock *tw, int timeo) include/net/inet_timewait_sock.h:121:1: error: ‘inline’ is not at beginning of declaration [-Werror=old-style-declaration] static void inline inet_twsk_reschedule(struct inet_timewait_sock *tw, int timeo) Fixes: ed2e92394589 ("tcp/dccp: fix timewait races in timer handling") Signed-off-by: Raanan Avargil <raanan.avargil@intel.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-05Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/net-nextDavid S. Miller2-4/+4
Eric W. Biederman says: ==================== net: Pass net through ip fragmention This is the next installment of my work to pass struct net through the output path so the code does not need to guess how to figure out which network namespace it is in, and ultimately routes can have output devices in another network namespace. This round focuses on passing net through ip fragmentation which we seem to call from about everywhere. That is the main ip output paths, the bridge netfilter code, and openvswitch. This has to happend at once accross the tree as function pointers are involved. First some prep work is done, then ipv4 and ipv6 are converted and then temporary helper functions are removed. ==================== Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-05ipv4: ICMP packet inspection for multipathPeter Nørlund1-1/+10
ICMP packets are inspected to let them route together with the flow they belong to, minimizing the chance that a problematic path will affect flows on other paths, and so that anycast environments can work with ECMP. Signed-off-by: Peter Nørlund <pch@ordbogen.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-05ipv4: L3 hash-based multipathPeter Nørlund1-3/+11
Replaces the per-packet multipath with a hash-based multipath using source and destination address. Signed-off-by: Peter Nørlund <pch@ordbogen.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-05tcp: avoid two atomic ops for syncookiesEric Dumazet2-4/+10
inet_reqsk_alloc() is used to allocate a temporary request in order to generate a SYNACK with a cookie. Then later, syncookie validation also uses a temporary request. These paths already took a reference on listener refcount, we can avoid a couple of atomic operations. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-05net: use sk_fullsock() in __netdev_pick_tx()Eric Dumazet1-0/+1
SYN_RECV & TIMEWAIT sockets are not full blown, they do not have a sk_dst_cache pointer. Fixes: ca6fb0651883 ("tcp: attach SYNACK messages to request sockets instead of listener") Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-05inet: ip_skb_dst_mtu() should use sk_fullsock()Eric Dumazet1-3/+6
SYN_RECV & TIMEWAIT sockets are not full blown, do not even try to call ip_sk_use_pmtu() on them. Fixes: ca6fb0651883 ("tcp: attach SYNACK messages to request sockets instead of listener") Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-05Bluetooth: Send transport open and close monitor eventsMarcel Holtmann1-0/+2
When the core starts or shuts down the actual HCI transport, send a new monitor event that indicates that this is happening. These new events correspond to HCI_DEV_OPEN and HCI_DEV_CLOSE events. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2015-10-05Bluetooth: Introduce HCI_DEV_OPEN and HCI_DEV_CLOSE eventsMarcel Holtmann1-0/+2
When opening the HCI transport via hdev->open send HCI_DEV_OPEN event and when closing the HCI transport via hdev->close send HCI_DEV_CLOSE. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2015-10-04netfilter: nfnetlink_queue: get rid of nfnetlink_queue_ct.cPablo Neira Ayuso1-51/+0
The original intention was to avoid dependencies between nfnetlink_queue and conntrack without ifdef pollution. However, we can achieve this by moving the conntrack dependent code into ctnetlink and keep some glue code to access the nfq_ct indirection from nfqueue. After this patch, the nfq_ct indirection is always compiled in the netfilter core to avoid polluting nfqueue with ifdefs. Thus, if nf_conntrack is not compiled this results in only 8-bytes of memory waste in x86_64. This patch also adds ctnetlink_nfqueue_seqadj() to avoid that the nf_conn structure layout if exposed to nf_queue, which creates another dependency with nf_conntrack at compilation time. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-10-03tcp/dccp: add SLAB_DESTROY_BY_RCU flag for request socketsEric Dumazet1-1/+3
Before letting request sockets being put in TCP/DCCP regular ehash table, we need to add either : - SLAB_DESTROY_BY_RCU flag to their kmem_cache - add RCU grace period before freeing them. Since we carefully respected the SLAB_DESTROY_BY_RCU protocol like ESTABLISH and TIMEWAIT sockets, use it here. req_prot_init() being only used by TCP and DCCP, I did not add a new slab_flags into their rsk_prot, but reuse prot->slab_flags Since all reqsk_alloc() users are correctly dealing with a failure, add the __GFP_NOWARN flag to avoid traces under pressure. Fixes: 079096f103fa ("tcp/dccp: install syn_recv requests into ehash table") Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-03switchdev: push object ID back to object structureJiri Pirko1-10/+4
Suggested-by: Scott Feldman <sfeldma@gmail.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Scott Feldman <sfeldma@gmail.com> Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-03switchdev: bring back switchdev_obj and use it as a generic object paramJiri Pirko1-11/+31
Replace "void *obj" with a generic structure. Introduce couple of helpers along that. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Scott Feldman <sfeldma@gmail.com> Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-03switchdev: rename switchdev_obj_fdb to switchdev_obj_port_fdbJiri Pirko1-1/+1
Make the struct name in sync with object id name. Suggested-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Scott Feldman <sfeldma@gmail.com> Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-03switchdev: rename switchdev_obj_vlan to switchdev_obj_port_vlanJiri Pirko1-1/+1
Make the struct name in sync with object id name. Suggested-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Scott Feldman <sfeldma@gmail.com> Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-03switchdev: rename SWITCHDEV_ATTR_* enum values to SWITCHDEV_ATTR_ID_*Jiri Pirko1-4/+4
To be aligned with obj. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Scott Feldman <sfeldma@gmail.com> Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-03switchdev: rename SWITCHDEV_OBJ_* enum values to SWITCHDEV_OBJ_ID_*Jiri Pirko1-7/+7
Suggested-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Scott Feldman <sfeldma@gmail.com> Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-03tcp: remove max_qlen_logEric Dumazet2-9/+3
This control variable was set at first listen(fd, backlog) call, but not updated if application tried to increase or decrease backlog. It made sense at the time listener had a non resizeable hash table. Also rounding to powers of two was not very friendly. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-03tcp/dccp: remove struct listen_sockEric Dumazet1-22/+4
It is enough to check listener sk_state, no need for an extra condition. max_qlen_log can be moved into struct request_sock_queue We can remove syn_wait_lock and the alignment it enforced. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-03tcp: attach SYNACK messages to request sockets instead of listenerEric Dumazet1-2/+4
If a listen backlog is very big (to avoid syncookies), then the listener sk->sk_wmem_alloc is the main source of false sharing, as we need to touch it twice per SYNACK re-transmit and TX completion. (One SYN packet takes listener lock once, but up to 6 SYNACK are generated) By attaching the skb to the request socket, we remove this source of contention. Tested: listen(fd, 10485760); // single listener (no SO_REUSEPORT) 16 RX/TX queue NIC Sustain a SYNFLOOD attack of ~320,000 SYN per second, Sending ~1,400,000 SYNACK per second. Perf profiles now show listener spinlock being next bottleneck. 20.29% [kernel] [k] queued_spin_lock_slowpath 10.06% [kernel] [k] __inet_lookup_established 5.12% [kernel] [k] reqsk_timer_handler 3.22% [kernel] [k] get_next_timer_interrupt 3.00% [kernel] [k] tcp_make_synack 2.77% [kernel] [k] ipt_do_table 2.70% [kernel] [k] run_timer_softirq 2.50% [kernel] [k] ip_finish_output 2.04% [kernel] [k] cascade Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-03ipv6: remove obsolete inet6 functionsEric Dumazet1-9/+0
inet6_csk_search_req() and inet6_csk_reqsk_queue_hash_add() no longer exist. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-03tcp/dccp: shrink struct listen_sockEric Dumazet1-3/+0
We no longer use hash_rnd, nr_table_entries and syn_table[] For a listener with a backlog of 10 millions sockets, this saves 80 MBytes of vmalloced memory. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-03tcp/dccp: install syn_recv requests into ehash tableEric Dumazet4-11/+1
In this patch, we insert request sockets into TCP/DCCP regular ehash table (where ESTABLISHED and TIMEWAIT sockets are) instead of using the per listener hash table. ACK packets find SYN_RECV pseudo sockets without having to find and lock the listener. In nominal conditions, this halves pressure on listener lock. Note that this will allow for SO_REUSEPORT refinements, so that we can select a listener using cpu/numa affinities instead of the prior 'consistent hash', since only SYN packets will apply this selection logic. We will shrink listen_sock in the following patch to ease code review. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Ying Cai <ycai@google.com> Cc: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-03tcp/dccp: remove inet_csk_reqsk_queue_added() timeout argumentEric Dumazet1-2/+1
This is no longer used. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-03tcp: get_openreq[46]() changesEric Dumazet1-1/+0
When request sockets are no longer in a per listener hash table but on regular TCP ehash, we need to access listener uid through req->rsk_listener get_openreq6() also gets a const for its request socket argument. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-03tcp/dccp: init sk_prot and call sk_node_init() in reqsk_alloc()Eric Dumazet1-10/+12
We plan to use generic functions to insert request sockets into ehash table. sk_prot needs to be set (to retrieve sk_prot->h.hashinfo) sk_node needs to be cleared. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-03tcp: move synflood_warned into struct request_sock_queueEric Dumazet1-1/+1
long term plan is to remove struct listen_sock when its hash table is no longer there. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-03tcp: move qlen/young out of struct listen_sockEric Dumazet1-30/+10
qlen_inc & young_inc were protected by listener lock, while qlen_dec & young_dec were atomic fields. Everything needs to be atomic for upcoming lockless listener. Also move qlen/young in request_sock_queue as we'll get rid of struct listen_sock eventually. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-03tcp: add a spinlock to protect struct request_sock_queueEric Dumazet1-19/+18
struct request_sock_queue fields are currently protected by the listener 'lock' (not a real spinlock) We need to add a private spinlock instead, so that softirq handlers creating children do not have to worry with backlog notion that the listener 'lock' carries. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-02Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller1-1/+5
Conflicts: net/dsa/slave.c net/dsa/slave.c simply had overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-30mac802154: check on len instead mac_lenAlexander Aring1-1/+1
This patch change the length check to len instead of mac_len for checking if the frame control field is available to dereference. We need to change it because I saw issues with af_packet raw sockets and the mrf24j40 which calls this functionality. The raw socket functionality doesn't set the mac_len but resets the skb_mac_header to skb->data which is still correct. The issue occur at mrf24j40 only, because the driver need to evaluate the fc fields. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-09-30nl802154: add support for security layerAlexander Aring3-75/+322
This patch adds support for accessing mac802154 llsec implementation over nl802154. I added for a new Kconfig entry to provide this functionality CONFIG_IEEE802154_NL802154_EXPERIMENTAL. This interface is still in development. It provides to change security parameters and add/del/dump entries of security tables. Later we can add also a get to get an entry by unique identifier. Cc: Phoebe Buckheister <phoebe.buckheister@itwm.fraunhofer.de> Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-09-30netlink: add nla_get for le32 and le64Alexander Aring1-0/+18
This patch adds missing inline wrappers for nla_get_le32 and nla_get_le64. The 802.15.4 MAC byteorder is little endian and we keep the byteorder for fields like address configuration in the same byteorder as it comes from the MAC layer. To provide these fields for nl802154 userspace applications, we need these inline wrappers for netlink. Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-09-30ipv6: Pass struct net through ip6_fragmentEric W. Biederman1-2/+2
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2015-09-30ipv4: Pass struct net through ip_fragmentEric W. Biederman1-2/+2
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2015-09-29Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-nextDavid S. Miller3-126/+59
Pablo Neira Ayuso says: ==================== Netfilter/IPVS updates for net-next The following pull request contains Netfilter/IPVS updates for net-next containing 90 patches from Eric Biederman. The main goal of this batch is to avoid recurrent lookups for the netns pointer, that happens over and over again in our Netfilter/IPVS code. The idea consists of passing netns pointer from the hook state to the relevant functions and objects where this may be needed. You can find more information on the IPVS updates from Simon Horman's commit merge message: c3456026adc0 ("Merge tag 'ipvs2-for-v4.4' of https://git.kernel.org/pub/scm/linux/kernel/git/horms/ipvs-next"). Exceptionally, this time, I'm not posting the patches again on netdev, Eric already Cc'ed this mailing list in the original submission. If you need me to make, just let me know. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-29net: switchdev: extract struct switchdev_obj_*Vivien Didelot1-27/+26
Now that switchdev and its drivers directly use specific switchdev_obj_* structures, move them out of the switchdev_obj union and get rif of this outer structure. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-29net: switchdev: abstract object in add/del opsVivien Didelot1-6/+12
Similar to the notifier_call callback of a notifier_block, change the function signature of switchdev add and del operations to: int switchdev_port_obj_add/del(struct net_device *dev, enum switchdev_obj_id id, void *obj); This allows the caller to pass a specific switchdev_obj_* structure instead of the generic switchdev_obj one. Drivers implementation of these operations and switchdev have been changed accordingly. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-29net: switchdev: pass callback to dump operationVivien Didelot1-3/+6
Similar to the notifier_call callback of a notifier_block, change the function signature of switchdev dump operation to: int switchdev_port_obj_dump(struct net_device *dev, enum switchdev_obj_id id, void *obj, int (*cb)(void *obj)); This allows the caller to pass and expect back a specific switchdev_obj_* structure instead of the generic switchdev_obj one. Drivers implementation of dump operation can now expect this specific structure and call the callback with it. Drivers have been changed accordingly. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-29net: switchdev: remove dev from switchdev_obj cbVivien Didelot1-1/+1
The net_device associated to a dump operation does not have to be passed to the callback. switchdev stores it in a superset struct, if needed. Also some drivers (such as DSA drivers) may not have easy access to it. This will simplify pushing the callback function down to the drivers. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-29net: Move netif_index_is_l3_master to l3mdev.hDavid Ahern2-0/+25
Change CONFIG dependency to CONFIG_NET_L3_MASTER_DEV as well. Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-29net: Remove vrf header fileDavid Ahern1-29/+0
Move remaining structs to VRF driver and delete the vrf header file. Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-29net: Remove the now unused vrf_ptrDavid Ahern1-6/+0
Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-29net: Replace calls to vrf_dev_get_rthDavid Ahern1-22/+0
Replace calls to vrf_dev_get_rth with l3mdev_get_rtable. The check on the flow flags is handled in the l3mdev operation. Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-29net: Replace vrf_dev_table and friendsDavid Ahern1-80/+0
Replace calls to vrf_dev_table and friends with l3mdev_fib_table and kin. Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-29net: Replace vrf_master_ifindex{, _rcu} with l3mdev equivalentsDavid Ahern1-41/+0
Replace calls to vrf_master_ifindex_rcu and vrf_master_ifindex with either l3mdev_master_ifindex_rcu or l3mdev_master_ifindex. The pattern: oif = vrf_master_ifindex(dev) ? : dev->ifindex; is replaced with oif = l3mdev_fib_oif(dev); And remove the now unused vrf macros. Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-29net: Introduce L3 Master device abstractionDavid Ahern1-0/+125
L3 master devices allow users of the abstraction to influence FIB lookups for enslaved devices. Current API provides a means for the master device to return a specific FIB table for an enslaved device, to return an rtable/custom dst and influence the OIF used for fib lookups. Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-29net: Rename IFF_VRF_MASTER to IFF_L3MDEV_MASTERDavid Ahern2-3/+3
Rename IFF_VRF_MASTER to IFF_L3MDEV_MASTER and update the name of the netif_is_vrf and netif_index_is_vrf macros. Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>