aboutsummaryrefslogtreecommitdiffstats
path: root/net/ipv4 (follow)
AgeCommit message (Collapse)AuthorFilesLines
2016-03-02net/ipv4: remove left over dead codeEric Engestrom1-7/+0
8cc785f6f429c2a3fb81745dc142cbd72a462c4a ("net: ipv4: make the ping /proc code AF-independent") removed the code using it, but renamed this variable instead of removing it. Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-02netfilter: nft_masq: support port rangePablo Neira Ayuso1-1/+6
Complete masquerading support by allowing port range selection. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-03-02netfilter: xtables: don't hook tables by defaultFlorian Westphal8-96/+180
delay hook registration until the table is being requested inside a namespace. Historically, a particular table (iptables mangle, ip6tables filter, etc) was registered on module load. When netns support was added to iptables only the ip/ip6tables ruleset was made namespace aware, not the actual hook points. This means f.e. that when ipt_filter table/module is loaded on a system, then each namespace on that system has an (empty) iptables filter ruleset. In other words, if a namespace sends a packet, such skb is 'caught' by netfilter machinery and fed to hooking points for that table (i.e. INPUT, FORWARD, etc). Thanks to Eric Biederman, hooks are no longer global, but per namespace. This means that we can avoid allocation of empty ruleset in a namespace and defer hook registration until we need the functionality. We register a tables hook entry points ONLY in the initial namespace. When an iptables get/setockopt is issued inside a given namespace, we check if the table is found in the per-namespace list. If not, we attempt to find it in the initial namespace, and, if found, create an empty default table in the requesting namespace and register the needed hooks. Hook points are destroyed only once namespace is deleted, there is no 'usage count' (it makes no sense since there is no 'remove table' operation in xtables api). Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-03-02netfilter: xtables: prepare for on-demand hook registerFlorian Westphal8-46/+55
This change prepares for upcoming on-demand xtables hook registration. We change the protoypes of the register/unregister functions. A followup patch will then add nf_hook_register/unregister calls to the iptables one. Once a hook is registered packets will be picked up, so all assignments of the form net->ipv4.iptable_$table = new_table have to be moved to ip(6)t_register_table, else we can see NULL net->ipv4.iptable_$table later. This patch doesn't change functionality; without this the actual change simply gets too big. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-03-02netfilter: nf_defrag_ipv4: Drop redundant ip_send_check()Joe Stringer1-3/+1
Since commit 0848f6428ba3 ("inet: frags: fix defragmented packet's IP header for af_packet"), ip_send_check() would be called twice for defragmentation that occurs from netfilter ipv4 defrag hooks. Remove the extra call. Signed-off-by: Joe Stringer <joe@ovn.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-03-01net: remove skb_sender_cpu_clear()WANG Cong1-1/+0
After commit 52bd2d62ce67 ("net: better skb->sender_cpu and skb->napi_id cohabitation") skb_sender_cpu_clear() becomes empty and can be removed. Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-01net: ipv4: tcp_probe: Replace timespec with timespec64Deepa Dinamani1-4/+4
TCP probe log timestamps use struct timespec which is not y2038 safe. Even though timespec might be good enough here as it is used to represent delta time, the plan is to get rid of all uses of timespec in the kernel. Replace with struct timespec64 which is y2038 safe. Prints still use unsigned long format and type. Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Cc: "David S. Miller" <davem@davemloft.net> Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Cc: James Morris <jmorris@namei.org> Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org> Cc: Patrick McHardy <kaber@trash.net> Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-01net: ipv4: Convert IP network timestamps to be y2038 safeDeepa Dinamani3-12/+33
ICMP timestamp messages and IP source route options require timestamps to be in milliseconds modulo 24 hours from midnight UT format. Add inet_current_timestamp() function to support this. The function returns the required timestamp in network byte order. Timestamp calculation is also changed to call ktime_get_real_ts64() which uses struct timespec64. struct timespec64 is y2038 safe. Previously it called getnstimeofday() which uses struct timespec. struct timespec is not y2038 safe. Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org> Cc: James Morris <jmorris@namei.org> Cc: Patrick McHardy <kaber@trash.net> Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-26GSO: Provide software checksum of tunneled UDP fragmentation offloadAlexander Duyck2-6/+30
On reviewing the code I realized that GRE and UDP tunnels could cause a kernel panic if we used GSO to segment a large UDP frame that was sent through the tunnel with an outer checksum and hardware offloads were not available. In order to correct this we need to update the feature flags that are passed to the skb_segment function so that in the event of UDP fragmentation being requested for the inner header the segmentation function will correctly generate the checksum for the payload if we cannot segment the outer header. Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-26net: l3mdev: prefer VRF master for source address selectionDavid Lamparter1-0/+17
When selecting an address in context of a VRF, the vrf master should be preferred for address selection. If it isn't, the user has a hard time getting the system to select to their preference - the code will pick the address off the first in-VRF interface it can find, which on a router could well be a non-routable address. Signed-off-by: David Lamparter <equinox@diac24.net> Signed-off-by: David Ahern <dsa@cumulusnetworks.com> [dsa: Fixed comment style and removed extra blank link ] Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-26net: l3mdev: address selection should only consider devices in L3 domainDavid Ahern1-0/+5
David Lamparter noted a use case where the source address selection fails to pick an address from a VRF interface - unnumbered interfaces. Relevant commands from his script: ip addr add 9.9.9.9/32 dev lo ip link set lo up ip link add name vrf0 type vrf table 101 ip rule add oif vrf0 table 101 ip rule add iif vrf0 table 101 ip link set vrf0 up ip addr add 10.0.0.3/32 dev vrf0 ip link add name dummy2 type dummy ip link set dummy2 master vrf0 up --> note dummy2 has no address - unnumbered device ip route add 10.2.2.2/32 dev dummy2 table 101 ip neigh add 10.2.2.2 dev dummy2 lladdr 02:00:00:00:00:02 tcpdump -ni dummy2 & And using ping instead of his socat example: $ ping -I vrf0 -c1 10.2.2.2 ping: Warning: source address might be selected on device other than vrf0. PING 10.2.2.2 (10.2.2.2) from 9.9.9.9 vrf0: 56(84) bytes of data. >From tcpdump: 12:57:29.449128 IP 9.9.9.9 > 10.2.2.2: ICMP echo request, id 2491, seq 1, length 64 Note the source address is from lo and is not a VRF local address. With this patch: $ ping -I vrf0 -c1 10.2.2.2 PING 10.2.2.2 (10.2.2.2) from 10.0.0.3 vrf0: 56(84) bytes of data. >From tcpdump: 12:59:25.096426 IP 10.0.0.3 > 10.2.2.2: ICMP echo request, id 2113, seq 1, length 64 Now the source address comes from vrf0. The ipv4 function for selecting source address takes a const argument. Removing the const requires touching a lot of places, so instead l3mdev_master_ifindex_rcu is changed to take a const argument and then do the typecast to non-const as required by netdev_master_upper_dev_get_rcu. This is similar to what l3mdev_fib_table_rcu does. IPv6 for unnumbered interfaces appears to be selecting the addresses properly. Cc: David Lamparter <david@opensourcerouting.org> Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-24ipv4: only create late gso-skb if skb is already set up with CHECKSUM_PARTIALHannes Frederic Sowa1-1/+4
Otherwise we break the contract with GSO to only pass CHECKSUM_PARTIAL skbs down. This can easily happen with UDP+IPv4 sockets with the first MSG_MORE write smaller than the MTU, second write is a sendfile. Returning -EOPNOTSUPP lets the callers fall back into normal sendmsg path, were we calculate the checksum manually during copying. Commit d749c9cbffd6 ("ipv4: no CHECKSUM_PARTIAL on MSG_MORE corked sockets") started to exposes this bug. Fixes: d749c9cbffd6 ("ipv4: no CHECKSUM_PARTIAL on MSG_MORE corked sockets") Reported-by: Jiri Benc <jbenc@redhat.com> Cc: Jiri Benc <jbenc@redhat.com> Reported-by: Wakko Warner <wakko@animx.eu.org> Cc: Wakko Warner <wakko@animx.eu.org> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-24soreuseport: fix merge conflict in tcp bindCraig Gallek1-0/+1
One of the validation checks for the new array-based TCP SO_REUSEPORT validation was unintentionally dropped in ea8add2b1903. This adds it back. Lack of this check allows the user to allocate multiple sock_reuseport structures (leaking all but the first). Fixes: ea8add2b1903 ("tcp/dccp: better use of ephemeral ports in bind()") Signed-off-by: Craig Gallek <kraig@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-23tunnel: Clear IPCB(skb)->opt before dst_link_failure calledBernie Harris2-1/+4
IPCB may contain data from previous layers (in the observed case the qdisc layer). In the observed scenario, the data was misinterpreted as ip header options, which later caused the ihl to be set to an invalid value (<5). This resulted in an infinite loop in the mips implementation of ip_fast_csum. This patch clears IPCB(skb)->opt before dst_link_failure can be called for various types of tunnels. This change only applies to encapsulated ipv4 packets. The code introduced in 11c21a30 which clears all of IPCB has been removed to be consistent with these changes, and instead the opt field is cleared unconditionally in ip_tunnel_xmit. The change in ip_tunnel_xmit applies to SIT, GRE, and IPIP tunnels. The relevant vti, l2tp, and pptp functions already contain similar code for clearing the IPCB. Signed-off-by: Bernie Harris <bernie.harris@alliedtelesis.co.nz> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-23tcp: convert cached rtt from usec to jiffies when feeding initial rtoKonstantin Khlebnikov1-1/+1
Currently it's converted into msecs, thus HZ=1000 intact. Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Fixes: 740b0f1841f6 ("tcp: switch rtt estimations to usec resolution") Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-23Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller12-46/+136
Conflicts: drivers/net/phy/bcm7xxx.c drivers/net/phy/marvell.c drivers/net/vxlan.c All three conflicts were cases of simple overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-19rtnl: RTM_GETNETCONF: fix wrong return valueAnton Protopopov1-1/+1
An error response from a RTM_GETNETCONF request can return the positive error value EINVAL in the struct nlmsgerr that can mislead userspace. Signed-off-by: Anton Protopopov <a.s.protopopov@gmail.com> Acked-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-18gre: clear IFF_TX_SKB_SHARINGJiri Benc1-2/+3
ether_setup sets IFF_TX_SKB_SHARING but this is not supported by gre as it modifies the skb on xmit. Also, clean up whitespace in ipgre_tap_setup when we're already touching it. Signed-off-by: Jiri Benc <jbenc@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-18iptunnel: scrub packet in iptunnel_pull_headerJiri Benc3-7/+5
Part of skb_scrub_packet was open coded in iptunnel_pull_header. Let it call skb_scrub_packet directly instead. Signed-off-by: Jiri Benc <jbenc@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-18tcp/dccp: fix another race at listener dismantleEric Dumazet2-14/+14
Ilya reported following lockdep splat: kernel: ========================= kernel: [ BUG: held lock freed! ] kernel: 4.5.0-rc1-ceph-00026-g5e0a311 #1 Not tainted kernel: ------------------------- kernel: swapper/5/0 is freeing memory ffff880035c9d200-ffff880035c9dbff, with a lock still held there! kernel: (&(&queue->rskq_lock)->rlock){+.-...}, at: [<ffffffff816f6a88>] inet_csk_reqsk_queue_add+0x28/0xa0 kernel: 4 locks held by swapper/5/0: kernel: #0: (rcu_read_lock){......}, at: [<ffffffff8169ef6b>] netif_receive_skb_internal+0x4b/0x1f0 kernel: #1: (rcu_read_lock){......}, at: [<ffffffff816e977f>] ip_local_deliver_finish+0x3f/0x380 kernel: #2: (slock-AF_INET){+.-...}, at: [<ffffffff81685ffb>] sk_clone_lock+0x19b/0x440 kernel: #3: (&(&queue->rskq_lock)->rlock){+.-...}, at: [<ffffffff816f6a88>] inet_csk_reqsk_queue_add+0x28/0xa0 To properly fix this issue, inet_csk_reqsk_queue_add() needs to return to its callers if the child as been queued into accept queue. We also need to make sure listener is still there before calling sk->sk_data_ready(), by holding a reference on it, since the reference carried by the child can disappear as soon as the child is put on accept queue. Reported-by: Ilya Dryomov <idryomov@gmail.com> Fixes: ebb516af60e1 ("tcp/dccp: fix race at listener dismantle phase") Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-18route: check and remove route cache when we get routeXin Long1-14/+63
Since the gc of ipv4 route was removed, the route cached would has no chance to be removed, and even it has been timeout, it still could be used, cause no code to check it's expires. Fix this issue by checking and removing route cache when we get route. Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-17tcp: correctly crypto_alloc_hash return checkInsu Yun1-1/+1
crypto_alloc_hash never returns NULL Signed-off-by: Insu Yun <wuninsu@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-17ipv4: Remove inet_lro libraryBen Hutchings3-383/+0
There are no longer any in-tree drivers that use it. Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-16net: Export ip fragment sysctl to unprivileged usersNikolay Borisov1-4/+0
Now that all the ip fragmentation related sysctls are namespaceified there is no reason to hide them anymore from "root" users inside containers. Signed-off-by: Nikolay Borisov <kernel@kyup.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-16ipv4: namespacify ip fragment max dist sysctl knobNikolay Borisov1-12/+13
Signed-off-by: Nikolay Borisov <kernel@kyup.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-16ipv4: namespacify ip_early_demux sysctl knobNikolay Borisov2-11/+9
Signed-off-by: Nikolay Borisov <kernel@kyup.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-16ipv4: Namespacify ip_dynaddr sysctl knobNikolay Borisov2-15/+10
Signed-off-by: Nikolay Borisov <kernel@kyup.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-16igmp: net: Move igmp namespace init to correct fileNikolay Borisov2-6/+14
When igmp related sysctl were namespacified their initializatin was erroneously put into the tcp socket namespace constructor. This patch moves the relevant code into the igmp namespace constructor to keep things consistent. Also sprinkle some #ifdefs to silence warnings Signed-off-by: Nikolay Borisov <kernel@kyup.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-16ipv4: Namespaceify ip_default_ttl sysctl knobNikolay Borisov5-15/+18
Signed-off-by: Nikolay Borisov <kernel@kyup.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-16tcp: add tcpi_min_rtt and tcpi_notsent_bytes to tcp_infoEric Dumazet1-0/+6
tcpi_min_rtt reports the minimal rtt observed by TCP stack for the flow, in usec unit. Might be ~0U if not yet known. tcpi_notsent_bytes reports the amount of bytes in the write queue that were not yet sent. This is done in a single patch to not add a temporary 32bit padding hole in tcp_info. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-16tcp: md5: release request socket instead of listenerEric Dumazet1-2/+4
If tcp_v4_inbound_md5_hash() returns an error, we must release the refcount on the request socket, not on the listener. The bug was added for IPv4 only. Fixes: 079096f103fac ("tcp/dccp: install syn_recv requests into ehash table") Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-16net/ipv4: add dst cache support for gre lwtunnelsPaolo Abeni1-3/+10
In case of UDP traffic with datagram length below MTU this gives about 4% performance increase Signed-off-by: Paolo Abeni <pabeni@redhat.com> Suggested-and-Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-16ip_tunnel: replace dst_cache with generic implementationPaolo Abeni2-65/+14
The current ip_tunnel cache implementation is prone to a race that will cause the wrong dst to be cached on cuncurrent dst cache miss and ip tunnel update via netlink. Replacing with the generic implementation fix the issue. Signed-off-by: Paolo Abeni <pabeni@redhat.com> Suggested-and-acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-16tcp: do not set rtt_min to 1Eric Dumazet1-1/+4
There are some cases where rtt_us derives from deltas of jiffies, instead of using usec timestamps. Since we want to track minimal rtt, better to assume a delta of 0 jiffie might be in fact be very close to 1 jiffie. It is kind of sad jiffies_to_usecs(1) calls a function instead of simply using a constant. Fixes: f672258391b42 ("tcp: track min RTT using windowed min-filter") Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Cc: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-13ipv4: fix memory leaks in ip_cmsg_send() callersEric Dumazet4-3/+11
Dmitry reported memory leaks of IP options allocated in ip_cmsg_send() when/if this function returns an error. Callers are responsible for the freeing. Many thanks to Dmitry for the report and diagnostic. Reported-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-12net: ip_tunnel: remove 'csum_help' argument to iptunnel_handle_offloadsEdward Cree4-17/+10
All users now pass false, so we can remove it, and remove the code that was conditional upon it. Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-12net: gre: Implement LCO for GRE over IPv4Edward Cree1-3/+13
Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-12fou: enable LCO in FOU and GUEEdward Cree1-8/+6
Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-12net: udp: always set up for CHECKSUM_PARTIAL offloadEdward Cree1-13/+1
If the dst device doesn't support it, it'll get fixed up later anyway by validate_xmit_skb(). Also, this allows us to take advantage of LCO to avoid summing the payload multiple times. Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-12net: local checksum offload for encapsulationEdward Cree2-15/+15
The arithmetic properties of the ones-complement checksum mean that a correctly checksummed inner packet, including its checksum, has a ones complement sum depending only on whatever value was used to initialise the checksum field before checksumming (in the case of TCP and UDP, this is the ones complement sum of the pseudo header, complemented). Consequently, if we are going to offload the inner checksum with CHECKSUM_PARTIAL, we can compute the outer checksum based only on the packed data not covered by the inner checksum, and the initial value of the inner checksum field. Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-12tcp/dccp: better use of ephemeral ports in bind()Eric Dumazet1-126/+114
Implement strategy used in __inet_hash_connect() in opposite way : Try to find a candidate using odd ports, then fallback to even ports. We no longer disable BH for whole traversal, but one bucket at a time. We also use cond_resched() to yield cpu to other tasks if needed. I removed one indentation level and tried to mirror the loop we have in __inet_hash_connect() and variable names to ease code maintenance. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-12tcp/dccp: better use of ephemeral ports in connect()Eric Dumazet1-85/+85
In commit 07f4c90062f8 ("tcp/dccp: try to not exhaust ip_local_port_range in connect()"), I added a very simple heuristic, so that we got better chances to use even ports, and allow bind() users to have more available slots. It gave nice results, but with more than 200,000 TCP sessions on a typical server, the ~30,000 ephemeral ports are still a rare resource. I chose to go a step further, by looking at all even ports, and if none was available, fallback to odd ports. The companion patch does the same in bind(), but in opposite way. I've seen exec times of up to 30ms on busy servers, so I no longer disable BH for the whole traversal, but only for each hash bucket. I also call cond_resched() to be gentle to other tasks. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-11igmp: Namespacify igmp_qrv sysctl knobNikolay Borisov3-22/+28
Signed-off-by: Nikolay Borisov <kernel@kyup.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-11igmp: Namespaceify igmp_llm_reports sysctl knobNikolay Borisov3-12/+18
This was initially introduced in df2cf4a78e488d26 ("IGMP: Inhibit reports for local multicast groups") by defining the sysctl in the ipv4_net_table array, however it was never implemented to be namespace aware. Fix this by changing the code accordingly. Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-11igmp: Namespaceify igmp_max_msf sysctl knobNikolay Borisov4-13/+12
Signed-off-by: Nikolay Borisov <kernel@kyup.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-11igmp: Namespaceify igmp_max_memberships sysctl knobNikolay Borisov3-10/+10
Signed-off-by: Nikolay Borisov <kernel@kyup.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-11udp: Use uh->len instead of skb->len to compute checksum in segmentationAlexander Duyck1-15/+13
The segmentation code was having to do a bunch of work to pull the skb->len and strip the udp header offset before the value could be used to adjust the checksum. Instead of doing all this work we can just use the value that goes into uh->len since that is the correct value with the correct byte order that we need anyway. By using this value we can save ourselves a bunch of pain as there is no need to do multiple byte swaps. Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Acked-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-11udp: Clean up the use of flags in UDP segmentation offloadAlexander Duyck1-19/+18
This patch goes though and cleans up the logic related to several of the control flags used in UDP segmentation. Specifically the use of dont_encap isn't really needed as we can just check the skb for CHECKSUM_PARTIAL and if it isn't set then we don't need to update the internal headers. As such we can just drop that value. Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Acked-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-11gre: Use inner_proto to obtain inner header protocolAlexander Duyck1-4/+2
Instead of parsing headers to determine the inner protocol we can just pull the value from inner_proto. Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-11gre: Use GSO flags to determine csum need instead of GRE flagsAlexander Duyck1-34/+30
This patch updates the gre checksum path to follow something much closer to the UDP checksum path. By doing this we can avoid needing to do as much header inspection and can just make use of the fields we were already reading in the sk_buff structure. Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Signed-off-by: David S. Miller <davem@davemloft.net>