aboutsummaryrefslogtreecommitdiffstats
path: root/include/linux (follow)
AgeCommit message (Collapse)AuthorFilesLines
2016-11-15Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller10-17/+38
Several cases of bug fixes in 'net' overlapping other changes in 'net-next-. Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-14Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds2-1/+16
Pull networking fixes from David Miller: 1) Fix off by one wrt. indexing when dumping /proc/net/route entries, from Alexander Duyck. 2) Fix lockdep splats in iwlwifi, from Johannes Berg. 3) Cure panic when inserting certain netfilter rules when NFT_SET_HASH is disabled, from Liping Zhang. 4) Memory leak when nft_expr_clone() fails, also from Liping Zhang. 5) Disable UFO when path will apply IPSEC tranformations, from Jakub Sitnicki. 6) Don't bogusly double cwnd in dctcp module, from Florian Westphal. 7) skb_checksum_help() should never actually use the value "0" for the resulting checksum, that has a special meaning, use CSUM_MANGLED_0 instead. From Eric Dumazet. 8) Per-tx/rx queue statistic strings are wrong in qed driver, fix from Yuval MIntz. 9) Fix SCTP reference counting of associations and transports in sctp_diag. From Xin Long. 10) When we hit ip6tunnel_xmit() we could have come from an ipv4 path in a previous layer or similar, so explicitly clear the ipv6 control block in the skb. From Eli Cooper. 11) Fix bogus sleeping inside of inet_wait_for_connect(), from WANG Cong. 12) Correct deivce ID of T6 adapter in cxgb4 driver, from Hariprasad Shenai. 13) Fix potential access past the end of the skb page frag array in tcp_sendmsg(). From Eric Dumazet. 14) 'skb' can legitimately be NULL in inet{,6}_exact_dif_match(). Fix from David Ahern. 15) Don't return an error in tcp_sendmsg() if we wronte any bytes successfully, from Eric Dumazet. 16) Extraneous unlocks in netlink_diag_dump(), we removed the locking but forgot to purge these unlock calls. From Eric Dumazet. 17) Fix memory leak in error path of __genl_register_family(). We leak the attrbuf, from WANG Cong. 18) cgroupstats netlink policy table is mis-sized, from WANG Cong. 19) Several XDP bug fixes in mlx5, from Saeed Mahameed. 20) Fix several device refcount leaks in network drivers, from Johan Hovold. 21) icmp6_send() should use skb dst device not skb->dev to determine L3 routing domain. From David Ahern. 22) ip_vs_genl_family sets maxattr incorrectly, from WANG Cong. 23) We leak new macvlan port in some cases of maclan_common_netlink() errors. Fix from Gao Feng. 24) Similar to the icmp6_send() fix, icmp_route_lookup() should determine L3 routing domain using skb_dst(skb)->dev not skb->dev. Also from David Ahern. 25) Several fixes for route offloading and FIB notification handling in mlxsw driver, from Jiri Pirko. 26) Properly cap __skb_flow_dissect()'s return value, from Eric Dumazet. 27) Fix long standing regression in ipv4 redirect handling, wrt. validating the new neighbour's reachability. From Stephen Suryaputra Lin. 28) If sk_filter() trims the packet excessively, handle it reasonably in tcp input instead of exploding. From Eric Dumazet. 29) Fix handling of napi hash state when copying channels in sfc driver, from Bert Kenward. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (121 commits) mlxsw: spectrum_router: Flush FIB tables during fini net: stmmac: Fix lack of link transition for fixed PHYs sctp: change sk state only when it has assocs in sctp_shutdown bnx2: Wait for in-flight DMA to complete at probe stage Revert "bnx2: Reset device during driver initialization" ps3_gelic: fix spelling mistake in debug message net: ethernet: ixp4xx_eth: fix spelling mistake in debug message ibmvnic: Fix size of debugfs name buffer ibmvnic: Unmap ibmvnic_statistics structure sfc: clear napi_hash state when copying channels mlxsw: spectrum_router: Correctly dump neighbour activity mlxsw: spectrum: Fix refcount bug on span entries bnxt_en: Fix VF virtual link state. bnxt_en: Fix ring arithmetic in bnxt_setup_tc(). Revert "include/uapi/linux/atm_zatm.h: include linux/time.h" tcp: take care of truncations done by sk_filter() ipv4: use new_gw for redirect neigh lookup r8152: Fix error path in open function net: bpqether.h: remove if_ether.h guard net: __skb_flow_dissect() must cap its return value ...
2016-11-13Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-nextDavid S. Miller9-176/+208
Pablo Neira Ayuso says: ==================== Netfilter updates for net-next The following patchset contains a second batch of Netfilter updates for your net-next tree. This includes a rework of the core hook infrastructure that improves Netfilter performance by ~15% according to synthetic benchmarks. Then, a large batch with ipset updates, including a new hash:ipmac set type, via Jozsef Kadlecsik. This also includes a couple of assorted updates. Regarding the core hook infrastructure rework to improve performance, using this simple drop-all packets ruleset from ingress: nft add table netdev x nft add chain netdev x y { type filter hook ingress device eth0 priority 0\; } nft add rule netdev x y drop And generating traffic through Jesper Brouer's samples/pktgen/pktgen_bench_xmit_mode_netif_receive.sh script using -i option. perf report shows nf_tables calls in its top 10: 17.30% kpktgend_0 [nf_tables] [k] nft_do_chain 15.75% kpktgend_0 [kernel.vmlinux] [k] __netif_receive_skb_core 10.39% kpktgend_0 [nf_tables_netdev] [k] nft_do_chain_netdev I'm measuring here an improvement of ~15% in performance with this patchset, so we got +2.5Mpps more. I have used my old laptop Intel(R) Core(TM) i5-3320M CPU @ 2.60GHz 4-cores. This rework contains more specifically, in strict order, these patches: 1) Remove compile-time debugging from core. 2) Remove obsolete comments that predate the rcu era. These days it is well known that a Netfilter hook always runs under rcu_read_lock(). 3) Remove threshold handling, this is only used by br_netfilter too. We already have specific code to handle this from br_netfilter, so remove this code from the core path. 4) Deprecate NF_STOP, as this is only used by br_netfilter. 5) Place nf_state_hook pointer into xt_action_param structure, so this structure fits into one single cacheline according to pahole. This also implicit affects nftables since it also relies on the xt_action_param structure. 6) Move state->hook_entries into nf_queue entry. The hook_entries pointer is only required by nf_queue(), so we can store this in the queue entry instead. 7) use switch() statement to handle verdict cases. 8) Remove hook_entries field from nf_hook_state structure, this is only required by nf_queue, so store it in nf_queue_entry structure. 9) Merge nf_iterate() into nf_hook_slow() that results in a much more simple and readable function. 10) Handle NF_REPEAT away from the core, so far the only client is nf_conntrack_in() and we can restart the packet processing using a simple goto to jump back there when the TCP requires it. This update required a second pass to fix fallout, fix from Arnd Bergmann. 11) Set random seed from nft_hash when no seed is specified from userspace. 12) Simplify nf_tables expression registration, in a much smarter way to save lots of boiler plate code, by Liping Zhang. 13) Simplify layer 4 protocol conntrack tracker registration, from Davide Caratti. 14) Missing CONFIG_NF_SOCKET_IPV4 dependency for udp4_lib_lookup, due to recent generalization of the socket infrastructure, from Arnd Bergmann. 15) Then, the ipset batch from Jozsef, he describes it as it follows: * Cleanup: Remove extra whitespaces in ip_set.h * Cleanup: Mark some of the helpers arguments as const in ip_set.h * Cleanup: Group counter helper functions together in ip_set.h * struct ip_set_skbinfo is introduced instead of open coded fields in skbinfo get/init helper funcions. * Use kmalloc() in comment extension helper instead of kzalloc() because it is unnecessary to zero out the area just before explicit initialization. * Cleanup: Split extensions into separate files. * Cleanup: Separate memsize calculation code into dedicated function. * Cleanup: group ip_set_put_extensions() and ip_set_get_extensions() together. * Add element count to hash headers by Eric B Munson. * Add element count to all set types header for uniform output across all set types. * Count non-static extension memory into memsize calculation for userspace. * Cleanup: Remove redundant mtype_expire() arguments, because they can be get from other parameters. * Cleanup: Simplify mtype_expire() for hash types by removing one level of intendation. * Make NLEN compile time constant for hash types. * Make sure element data size is a multiple of u32 for the hash set types. * Optimize hash creation routine, exit as early as possible. * Make struct htype per ipset family so nets array becomes fixed size and thus simplifies the struct htype allocation. * Collapse same condition body into a single one. * Fix reported memory size for hash:* types, base hash bucket structure was not taken into account. * hash:ipmac type support added to ipset by Tomasz Chilinski. * Use setup_timer() and mod_timer() instead of init_timer() by Muhammad Falak R Wani, individually for the set type families. 16) Remove useless connlabel field in struct netns_ct, patch from Florian Westphal. 17) xt_find_table_lock() doesn't return ERR_PTR() anymore, so simplify {ip,ip6,arp}tables code that uses this. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-13Merge tag 'usb-4.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usbLinus Torvalds1-0/+7
Pull USB / PHY fixes from Greg KH: "Here are a number of small USB and PHY driver fixes for 4.9-rc5 Nothing major, just small fixes for reported issues, all of these have been in linux-next for a while with no reported issues" * tag 'usb-4.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: USB: cdc-acm: fix TIOCMIWAIT cdc-acm: fix uninitialized variable drivers/usb: Skip auto handoff for TI and RENESAS usb controllers usb: musb: remove duplicated actions usb: musb: da8xx: Don't print phy error on -EPROBE_DEFER phy: sun4i: check PMU presence when poking unknown bit of pmu phy-rockchip-pcie: remove deassert of phy_rst from exit callback phy: da8xx-usb: rename the ohci device to ohci-da8xx phy: Add reset callback for not generic phy uwb: fix device reference leaks usb: gadget: u_ether: remove interrupt throttling usb: dwc3: st: add missing <linux/pinctrl/consumer.h> include usb: dwc3: Fix error handling for core init
2016-11-13net: phy: expose phy_aneg_done API for use by driversLendacky, Thomas1-0/+1
Make phy_aneg_done() available to drivers so that the result of the auto-negotiation initiated by phy_start_aneg() can be determined. Remove the local implementation of phy_aneg_done() from the Aeroflex driver and use the phy library version. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-12bpf: Fix bpf_redirect to an ipip/ip6tnl devMartin KaFai Lau1-0/+15
If the bpf program calls bpf_redirect(dev, 0) and dev is an ipip/ip6tnl, it currently includes the mac header. e.g. If dev is ipip, the end result is IP-EthHdr-IP instead of IP-IP. The fix is to pull the mac header. At ingress, skb_postpull_rcsum() is not needed because the ethhdr should have been pulled once already and then got pushed back just before calling the bpf_prog. At egress, this patch calls skb_postpull_rcsum(). If bpf_redirect(dev, BPF_F_INGRESS) is called, it also fails now because it calls dev_forward_skb() which eventually calls eth_type_trans(skb, dev). The eth_type_trans() will set skb->type = PACKET_OTHERHOST because the mac address does not match the redirecting dev->dev_addr. The PACKET_OTHERHOST will eventually cause the ip_rcv() errors out. To fix this, ____dev_forward_skb() is added. Joint work with Daniel Borkmann. Fixes: cfc7381b3002 ("ip_tunnel: add collect_md mode to IPIP tunnel") Fixes: 8d79266bc48c ("ip6_tunnel: add collect_md mode to IPv6 tunnels") Acked-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@fb.com> Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-12bpf, mlx4: fix prog refcount in mlx4_en_try_alloc_resources error pathDaniel Borkmann1-0/+5
Commit 67f8b1dcb9ee ("net/mlx4_en: Refactor the XDP forwarding rings scheme") added a bug in that the prog's reference count is not dropped in the error path when mlx4_en_try_alloc_resources() is failing from mlx4_xdp_set(). We previously took bpf_prog_add(prog, priv->rx_ring_num - 1), that we need to release again. Earlier in the call path, dev_change_xdp_fd() itself holds a reference to the prog as well (hence the '- 1' in the bpf_prog_add()), so a simple atomic_sub() is safe to use here. When an error is propagated, then bpf_prog_put() is called eventually from dev_change_xdp_fd() Fixes: 67f8b1dcb9ee ("net/mlx4_en: Refactor the XDP forwarding rings scheme") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-11Merge tag 'acpi-4.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pmLinus Torvalds1-1/+2
Pull ACPI fix from Rafael Wysocki: "Fix a recent regression in the 8250_dw serial driver introduced by adding a quirk for the APM X-Gene SoC to it which uncovered an issue related to the handling of built-in device properties in the core ACPI device enumeration code (Heikki Krogerus)" * tag 'acpi-4.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: ACPI / platform: Add support for build-in properties
2016-11-11Merge branch 'device-properties'Rafael J. Wysocki1-1/+2
* device-properties: ACPI / platform: Add support for build-in properties
2016-11-11Merge branch 'akpm' (patches from Andrew)Linus Torvalds2-8/+3
Merge misc fixes from Andrew Morton: "15 fixes" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: lib/stackdepot: export save/fetch stack for drivers mm: kmemleak: scan .data.ro_after_init memcg: prevent memcg caches to be both OFF_SLAB & OBJFREELIST_SLAB coredump: fix unfreezable coredumping task mm/filemap: don't allow partially uptodate page for pipes mm/hugetlb: fix huge page reservation leak in private mapping error paths ocfs2: fix not enough credit panic Revert "console: don't prefer first registered if DT specifies stdout-path" mm: hwpoison: fix thp split handling in memory_failure() swapfile: fix memory corruption via malformed swapfile mm/cma.c: check the max limit for cma allocation scripts/bloat-o-meter: fix SIGPIPE shmem: fix pageflags after swapping DMA32 object mm, frontswap: make sure allocated frontswap map is assigned mm: remove extra newline from allocation stall warning
2016-11-11Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfsLinus Torvalds1-1/+1
Pull VFS fixes from Al Viro: "Christoph's and Jan's aio fixes, fixup for generic_file_splice_read (removal of pointless detritus that actually breaks it when used for gfs2 ->splice_read()) and fixup for generic_file_read_iter() interaction with ITER_PIPE destinations." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: splice: remove detritus from generic_file_splice_read() mm/filemap: don't allow partially uptodate page for pipes aio: fix freeze protection of aio writes fs: remove aio_run_iocb fs: remove the never implemented aio_fsync file operation aio: hold an extra file reference over AIO read/write operations
2016-11-11Revert "console: don't prefer first registered if DT specifies stdout-path"Hans de Goede1-6/+0
This reverts commit 05fd007e4629 ("console: don't prefer first registered if DT specifies stdout-path"). The reverted commit changes existing behavior on which many ARM boards rely. Many ARM small-board-computers, like e.g. the Raspberry Pi have both a video output and a serial console. Depending on whether the user is using the device as a more regular computer; or as a headless device we need to have the console on either one or the other. Many users rely on the kernel behavior of the console being present on both outputs, before the reverted commit the console setup with no console= kernel arguments on an ARM board which sets stdout-path in dt would look like this: [root@localhost ~]# cat /proc/consoles ttyS0 -W- (EC p a) 4:64 tty0 -WU (E p ) 4:1 Where as after the reverted commit, it looks like this: [root@localhost ~]# cat /proc/consoles ttyS0 -W- (EC p a) 4:64 This commit reverts commit 05fd007e4629 ("console: don't prefer first registered if DT specifies stdout-path") restoring the original behavior. Fixes: 05fd007e4629 ("console: don't prefer first registered if DT specifies stdout-path") Link: http://lkml.kernel.org/r/20161104121135.4780-2-hdegoede@redhat.com Signed-off-by: Hans de Goede <hdegoede@redhat.com> Cc: Paul Burton <paul.burton@imgtec.com> Cc: Rob Herring <robh+dt@kernel.org> Cc: Frank Rowand <frowand.list@gmail.com> Cc: Thorsten Leemhuis <regressions@leemhuis.info> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Tejun Heo <tj@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-11-11mm, frontswap: make sure allocated frontswap map is assignedVlastimil Babka1-2/+3
Christian Borntraeger reports: With commit 8ea1d2a1985a ("mm, frontswap: convert frontswap_enabled to static key") kmemleak complains about a memory leak in swapon unreferenced object 0x3e09ba56000 (size 32112640): comm "swapon", pid 7852, jiffies 4294968787 (age 1490.770s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: __vmalloc_node_range+0x194/0x2d8 vzalloc+0x58/0x68 SyS_swapon+0xd60/0x12f8 system_call+0xd6/0x270 Turns out kmemleak is right. We now allocate the frontswap map depending on the kernel config (and no longer on the enablement) swapfile.c: [...] if (IS_ENABLED(CONFIG_FRONTSWAP)) frontswap_map = vzalloc(BITS_TO_LONGS(maxpages) * sizeof(long)); but later on this is passed along --> enable_swap_info(p, prio, swap_map, cluster_info, frontswap_map); and ignored if frontswap is disabled --> frontswap_init(p->type, frontswap_map); static inline void frontswap_init(unsigned type, unsigned long *map) { if (frontswap_enabled()) __frontswap_init(type, map); } Thing is, that frontswap map is never freed. The leakage is relatively not that bad, because swapon is an infrequent and privileged operation. However, if the first frontswap backend is registered after a swap type has been already enabled, it will WARN_ON in frontswap_register_ops() and frontswap will not be available for the swap type. Fix this by making sure the map is assigned by frontswap_init() as long as CONFIG_FRONTSWAP is enabled. Fixes: 8ea1d2a1985a ("mm, frontswap: convert frontswap_enabled to static key") Link: http://lkml.kernel.org/r/20161026134220.2566-1-vbabka@suse.cz Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Reported-by: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: David Vrabel <david.vrabel@citrix.com> Cc: Juergen Gross <jgross@suse.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-11-10libceph: initialize last_linger_id with a large integerIlya Dryomov1-0/+2
osdc->last_linger_id is a counter for lreq->linger_id, which is used for watch cookies. Starting with a large integer should ease the task of telling apart kernel and userspace clients. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2016-11-10netfilter: ipset: Count non-static extension memory for userspaceJozsef Kadlecsik2-4/+11
Non-static (i.e. comment) extension was not counted into the memory size. A new internal counter is introduced for this. In the case of the hash types the sizes of the arrays are counted there as well so that we can avoid to scan the whole set when just the header data is requested. Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
2016-11-10netfilter: ipset: Add element count to all set types headerJozsef Kadlecsik2-1/+3
It is better to list the set elements for all set types, thus the header information is uniform. Element counts are therefore added to the bitmap and list types. Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
2016-11-10netfilter: ipset: Regroup ip_set_put_extensions and add externJozsef Kadlecsik1-4/+2
Cleanup: group ip_set_put_extensions and ip_set_get_extensions together and add missing extern. Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
2016-11-10netfilter: ipset: Split extensions into separate filesJozsef Kadlecsik3-93/+123
Cleanup to separate all extensions into individual files. Ported from a patch proposed by Sergey Popovich <popovich_sergei@mail.ua>. Suggested-by: Sergey Popovich <popovich_sergei@mail.ua> Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
2016-11-10netfilter: ipset: Use kmalloc() in comment extension helperJozsef Kadlecsik1-1/+1
Allocate memory with kmalloc() rather than kzalloc(): the string is immediately initialized so it is unnecessary to zero out the allocated memory area. Ported from a patch proposed by Sergey Popovich <popovich_sergei@mail.ua>. Suggested-by: Sergey Popovich <popovich_sergei@mail.ua> Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
2016-11-10netfilter: ipset: Improve skbinfo get/init helpersJozsef Kadlecsik1-19/+11
Use struct ip_set_skbinfo in struct ip_set_ext instead of open coded fields and assign structure members in get/init helpers instead of copying members one by one. Explicitly note that struct ip_set_skbinfo must be padded to prevent non-aligned access in the extension blob. Ported from a patch proposed by Sergey Popovich <popovich_sergei@mail.ua>. Suggested-by: Sergey Popovich <popovich_sergei@mail.ua> Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
2016-11-10netfilter: ipset: Headers file cleanupJozsef Kadlecsik1-21/+21
Group counter helper functions together. Ported from a patch proposed by Sergey Popovich <popovich_sergei@mail.ua>. Suggested-by: Sergey Popovich <popovich_sergei@mail.ua> Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
2016-11-10netfilter: ipset: Mark some helper args as const.Jozsef Kadlecsik3-5/+5
Mark some of the helpers arguments as const. Ported from a patch proposed by Sergey Popovich <popovich_sergei@mail.ua>. Suggested-by: Sergey Popovich <popovich_sergei@mail.ua> Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
2016-11-10netfilter: ipset: Remove extra whitespaces in ip_set.hJozsef Kadlecsik1-6/+7
Remove unnecessary whitespaces. Ported from a patch proposed by Sergey Popovich <popovich_sergei@mail.ua>. Suggested-by: Sergey Popovich <popovich_sergei@mail.ua> Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
2016-11-09ptp: Introduce a high resolution frequency adjustment method.Richard Cochran1-0/+8
The internal PTP Hardware Clock (PHC) interface limits the resolution for frequency adjustments to one part per billion. However, some hardware devices allow finer adjustment, and making use of the increased resolution improves synchronization measurably on such devices. This patch adds an alternative method that allows finer frequency tuning by passing the scaled ppm value to PHC drivers. This value comes from user space, and it has a resolution of about 0.015 ppb. We also deprecate the older method, anticipating its removal once existing drivers have been converted over. Signed-off-by: Richard Cochran <richardcochran@gmail.com> Suggested-by: Ulrik De Bie <ulrik.debie-os@e2big.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-09net: napi_hash_add() is no longer exportedEric Dumazet1-11/+0
There are no more users except from net/core/dev.c napi_hash_add() can now be static. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-09ipv6: sr: add core files for SR HMAC supportDavid Lebrun2-0/+9
This patch adds the necessary functions to compute and check the HMAC signature of an SR-enabled packet. Two HMAC algorithms are supported: hmac(sha1) and hmac(sha256). In order to avoid dynamic memory allocation for each HMAC computation, a per-cpu ring buffer is allocated for this purpose. A new per-interface sysctl called seg6_require_hmac is added, allowing a user-defined policy for processing HMAC-signed SR-enabled packets. A value of -1 means that the HMAC field will always be ignored. A value of 0 means that if an HMAC field is present, its validity will be enforced (the packet is dropped is the signature is incorrect). Finally, a value of 1 means that any SR-enabled packet that does not contain an HMAC signature or whose signature is incorrect will be dropped. Signed-off-by: David Lebrun <david.lebrun@uclouvain.be> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-09ipv6: sr: add support for SRH encapsulation and injection with lwtunnelsDavid Lebrun1-0/+6
This patch creates a new type of interfaceless lightweight tunnel (SEG6), enabling the encapsulation and injection of SRH within locally emitted packets and forwarded packets. >From a configuration viewpoint, a seg6 tunnel would be configured as follows: ip -6 ro ad fc00::1/128 encap seg6 mode encap segs fc42::1,fc42::2,fc42::3 dev eth0 Any packet whose destination address is fc00::1 would thus be encapsulated within an outer IPv6 header containing the SRH with three segments, and would actually be routed to the first segment of the list. If `mode inline' was specified instead of `mode encap', then the SRH would be directly inserted after the IPv6 header without outer encapsulation. The inline mode is only available if CONFIG_IPV6_SEG6_INLINE is enabled. This feature was made configurable because direct header insertion may break several mechanisms such as PMTUD or IPSec AH. Signed-off-by: David Lebrun <david.lebrun@uclouvain.be> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-09ipv6: sr: add code base for control plane support of SR-IPv6David Lebrun1-0/+6
This patch adds the necessary hooks and structures to provide support for SR-IPv6 control plane, essentially the Generic Netlink commands that will be used for userspace control over the Segment Routing kernel structures. The genetlink commands provide control over two different structures: tunnel source and HMAC data. The tunnel source is the source address that will be used by default when encapsulating packets into an outer IPv6 header + SRH. If the tunnel source is set to :: then an address of the outgoing interface will be selected as the source. The HMAC commands currently just return ENOTSUPP and will be implemented in a future patch. Signed-off-by: David Lebrun <david.lebrun@uclouvain.be> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-09ipv6: implement dataplane support for rthdr type 4 (Segment Routing Header)David Lebrun2-0/+7
Implement minimal support for processing of SR-enabled packets as described in https://tools.ietf.org/html/draft-ietf-6man-segment-routing-header-02. This patch implements the following operations: - Intermediate segment endpoint: incrementation of active segment and rerouting. - Egress for SR-encapsulated packets: decapsulation of outer IPv6 header + SRH and routing of inner packet. - Cleanup flag support for SR-inlined packets: removal of SRH if we are the penultimate segment endpoint. A per-interface sysctl seg6_enabled is provided, to accept/deny SR-enabled packets. Default is deny. This patch does not provide support for HMAC-signed packets. Signed-off-by: David Lebrun <david.lebrun@uclouvain.be> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-10ACPI / platform: Add support for build-in propertiesHeikki Krogerus1-1/+2
We have a couple of drivers, acpi_apd.c and acpi_lpss.c, that need to pass extra build-in properties to the devices they create. Previously the drivers added those properties to the struct device which is member of the struct acpi_device, but that does not work. Those properties need to be assigned to the struct device of the platform device instead in order for them to become available to the drivers. To fix this, this patch changes acpi_create_platform_device function to take struct property_entry pointer as parameter. Fixes: 20a875e2e86e (serial: 8250_dw: Add quirk for APM X-Gene SoC) Signed-off-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Tested-by: Yazen Ghannam <yazen.ghannam@amd.com> Tested-by: Jérôme de Bretagne <jerome.debretagne@gmail.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-11-09net/mlx5: Support encap id when setting new steering entryHadar Hen Zion1-2/+7
In order to support steering rules which add encapsulation headers, encap_id parameter is needed. Add new mlx5_flow_act struct which holds action related parameter: action, flow_tag and encap_id. Use mlx5_flow_act struct when adding a new steering rule. This patch doesn't change any functionality. Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-09net/mlx5: Add creation flags when adding new flow tableHadar Hen Zion1-2/+8
When creating flow tables, allow the caller to specify creation flags. Currently no flags are used and as such this patch doesn't add any new functionality. Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-09Revert "net: stmmac: allow to split suspend/resume from init/exit callbacks"Joachim Eastwood1-2/+0
Instead of adding hooks inside stmmac_platform it is better to just use the standard PM callbacks within the specific dwmac-driver. This only used by the dwmac-rk driver. This reverts commit cecbc5563a02 ("stmmac: allow to split suspend/resume from init/exit callbacks"). Signed-off-by: Joachim Eastwood <manabian@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-09tcp: no longer hold ehash lock while calling tcp_get_info()Eric Dumazet1-2/+0
We had various problems in the past in tcp_get_info() and used specific synchronization to avoid deadlocks. We would like to add more instrumentation points for TCP, and avoiding grabing socket lock in tcp_getinfo() was too costly. Being able to lock the socket allows to provide consistent set of fields. inet_diag_dump_icsk() can make sure ehash locks are not held any more when tcp_get_info() is called. We can remove syncp added in commit d654976cbf85 ("tcp: fix a potential deadlock in tcp_get_info()"), but we need to use lock_sock_fast() instead of spin_lock_bh() since TCP input path can now be run from process context. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-07udp: do fwd memory scheduling on dequeuePaolo Abeni1-0/+4
A new argument is added to __skb_recv_datagram to provide an explicit skb destructor, invoked under the receive queue lock. The UDP protocol uses such argument to perform memory reclaiming on dequeue, so that the UDP protocol does not set anymore skb->desctructor. Instead explicit memory reclaiming is performed at close() time and when skbs are removed from the receive queue. The in kernel UDP protocol users now need to call a skb_recv_udp() variant instead of skb_recv_datagram() to properly perform memory accounting on dequeue. Overall, this allows acquiring only once the receive queue lock on dequeue. Tested using pktgen with random src port, 64 bytes packet, wire-speed on a 10G link as sender and udp_sink as the receiver, using an l4 tuple rxhash to stress the contention, and one or more udp_sink instances with reuseport. nr sinks vanilla patched 1 440 560 3 2150 2300 6 3650 3800 9 4450 4600 12 6250 6450 v1 -> v2: - do rmem and allocated memory scheduling under the receive lock - do bulk scheduling in first_packet_length() and in udp_destruct_sock() - avoid the typdef for the dequeue callback Suggested-by: Eric Dumazet <edumazet@google.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-07net: phy: broadcom: Add BCM54810 PHY entryJon Mason1-0/+9
The BCM54810 PHY requires some semi-unique configuration, which results in some additional configuration in addition to the standard config. Also, some users of the BCM54810 require the PHY lanes to be swapped. Since there is no way to detect this, add a device tree query to see if it is applicable. Inspired-by: Vikas Soni <vsoni@broadcom.com> Signed-off-by: Jon Mason <jon.mason@broadcom.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-07net: phy: broadcom: add bcm54xx_auxctl_readJon Mason1-0/+1
Add a helper function to read the AUXCTL register for the BCM54xx. This mirrors the bcm54xx_auxctl_write function already present in the code. Signed-off-by: Jon Mason <jon.mason@broadcom.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-07Merge tag 'phy-for-4.9-rc' of git://git.kernel.org/pub/scm/linux/kernel/git/kishon/linux-phy into usb-linusGreg Kroah-Hartman1-0/+7
Kishon writes: phy: for 4.9 -rc phy fixes: *) Add a empty function for phy_reset when CONFIG_GENERIC_PHY is not set *) change the phy lookup table for da8xx-usb to match it with the name present in the board configuraion file (used for non-dt boot) *) Fix incorrect programming sequence in w.r.t deassert of phy_rst in phy-rockchip-pcie *) Fix to avoid NULL pointer dereferencing error in sun4i phy Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
2016-11-05Merge tag 'for-linus-20161104' of git://git.infradead.org/linux-mtdLinus Torvalds1-1/+1
Pull MTD fixes from Brian Norris: - MAINTAINERS updates to reflect some new maintainers/submaintainers. We have some great volunteers who've been developing and reviewing already. We're going to try a group maintainership model, so eventually you'll probably see pull requests from people besides me. - NAND fixes from Boris: "Three simple fixes: - fix a non-critical bug in the gpmi driver - fix a bug in the 'automatic NAND timings selection' feature introduced in 4.9-rc1 - fix a false positive uninitialized-var warning" * tag 'for-linus-20161104' of git://git.infradead.org/linux-mtd: mtd: mtk: avoid warning in mtk_ecc_encode mtd: nand: Fix data interface configuration logic mtd: nand: gpmi: disable the clocks on errors MAINTAINERS: add more people to the MTD maintainer team MAINTAINERS: add a maintainer for the SPI NOR subsystem
2016-11-05phy: Add reset callback for not generic phyRandy Li1-0/+7
Add a dummy function for phy_reset in case the CONFIG_GENERIC_PHY is disabled. Signed-off-by: Randy Li <ayaka@soulik.info> Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
2016-11-04debugfs: constify argument to debugfs_real_fops()Jakub Kicinski1-1/+2
seq_file users can only access const version of file pointer, because the ->file member of struct seq_operations is marked as such. Make parameter to debugfs_real_fops() const. CC: Greg Kroah-Hartman <gregkh@linuxfoundation.org> CC: Nicolai Stange <nicstange@gmail.com> CC: Christian Lamparter <chunkeey@gmail.com> CC: LKML <linux-kernel@vger.kernel.org> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-03net: tcp: check skb is non-NULL for exact match on lookupsDavid Ahern1-1/+1
Andrey reported the following error report while running the syzkaller fuzzer: general protection fault: 0000 [#1] SMP KASAN Dumping ftrace buffer: (ftrace buffer empty) Modules linked in: CPU: 0 PID: 648 Comm: syz-executor Not tainted 4.9.0-rc3+ #333 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 task: ffff8800398c4480 task.stack: ffff88003b468000 RIP: 0010:[<ffffffff83091106>] [< inline >] inet_exact_dif_match include/net/tcp.h:808 RIP: 0010:[<ffffffff83091106>] [<ffffffff83091106>] __inet_lookup_listener+0xb6/0x500 net/ipv4/inet_hashtables.c:219 RSP: 0018:ffff88003b46f270 EFLAGS: 00010202 RAX: 0000000000000004 RBX: 0000000000004242 RCX: 0000000000000001 RDX: 0000000000000000 RSI: ffffc90000e3c000 RDI: 0000000000000054 RBP: ffff88003b46f2d8 R08: 0000000000004000 R09: ffffffff830910e7 R10: 0000000000000000 R11: 000000000000000a R12: ffffffff867fa0c0 R13: 0000000000004242 R14: 0000000000000003 R15: dffffc0000000000 FS: 00007fb135881700(0000) GS:ffff88003ec00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000020cc3000 CR3: 000000006d56a000 CR4: 00000000000006f0 Stack: 0000000000000000 000000000601a8c0 0000000000000000 ffffffff00004242 424200003b9083c2 ffff88003def4041 ffffffff84e7e040 0000000000000246 ffff88003a0911c0 0000000000000000 ffff88003a091298 ffff88003b9083ae Call Trace: [<ffffffff831100f4>] tcp_v4_send_reset+0x584/0x1700 net/ipv4/tcp_ipv4.c:643 [<ffffffff83115b1b>] tcp_v4_rcv+0x198b/0x2e50 net/ipv4/tcp_ipv4.c:1718 [<ffffffff83069d22>] ip_local_deliver_finish+0x332/0xad0 net/ipv4/ip_input.c:216 ... MD5 has a code path that calls __inet_lookup_listener with a null skb, so inet{6}_exact_dif_match needs to check skb against null before pulling the flag. Fixes: a04a480d4392 ("net: Require exact match for TCP socket lookups if dif is l3mdev") Reported-by: Andrey Konovalov <andreyknvl@google.com> Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Tested-by: Andrey Konovalov <andreyknvl@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-03ipv6: add IPV6_RECVFRAGSIZE cmsgWillem de Bruijn1-2/+3
When reading a datagram or raw packet that arrived fragmented, expose the maximum fragment size if recorded to allow applications to estimate receive path MTU. At this point, the field is only recorded when ipv6 connection tracking is enabled. A follow-up patch will record this field also in the ipv6 input path. Tested using the test for IP_RECVFRAGSIZE plus ip netns exec to ip addr add dev veth1 fc07::1/64 ip netns exec from ip addr add dev veth0 fc07::2/64 ip netns exec to ./recv_cmsg_recvfragsize -6 -u -p 6000 & ip netns exec from nc -q 1 -u fc07::1 6000 < payload Both with and without enabling connection tracking ip6tables -A INPUT -m state --state NEW -p udp -j LOG Signed-off-by: Willem de Bruijn <willemb@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-03netfilter: remove hook_entries field from nf_hook_statePablo Neira Ayuso2-8/+6
This field is only useful for nf_queue, so store it in the nf_queue_entry structure instead, away from the core path. Pass hook_head to nf_hook_slow(). Since we always have a valid entry on the first iteration in nf_iterate(), we can use 'do { ... } while (entry)' loop instead. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-11-03netfilter: x_tables: move hook state into xt_action_param structurePablo Neira Ayuso1-10/+38
Place pointer to hook state in xt_action_param structure instead of copying the fields that we need. After this change xt_action_param fits into one cacheline. This patch also adds a set of new wrapper functions to fetch relevant hook state structure fields. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-11-03netfilter: kill NF_HOOK_THRESH() and state->treshPablo Neira Ayuso2-38/+14
Patch c5136b15ea36 ("netfilter: bridge: add and use br_nf_hook_thresh") introduced br_nf_hook_thresh(). Replace NF_HOOK_THRESH() by br_nf_hook_thresh from br_nf_forward_finish(), so we have no more callers for this macro. As a result, state->thresh and explicit thresh parameter in the hook state structure is not required anymore. And we can get rid of skip-hook-under-thresh loop in nf_iterate() in the core path that is only used by br_netfilter to search for the filter hook. Suggested-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-11-02net: mii: add generic function to support ksetting supportPhilippe Reynes1-0/+4
The old ethtool api (get_setting and set_setting) has generic mii functions mii_ethtool_sset and mii_ethtool_gset. To support the new ethtool api ({get|set}_link_ksettings), we add two generics mii function mii_ethtool_{get|set}_link_ksettings_get. Signed-off-by: Philippe Reynes <tremyfr@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-31net: pim: add all RFC7761 message typesNikolay Aleksandrov1-1/+30
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-31net: pim: add a helper to check for IPv4 all pim routers addressNikolay Aleksandrov1-0/+6
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-31net: pim: add common pimhdr struct and helpersNikolay Aleksandrov1-8/+36
Add the common pimhdr structure and helpers to access it, also cleanup the format of the header file. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>