aboutsummaryrefslogtreecommitdiffstats
AgeCommit message (Collapse)AuthorFilesLines
2014-12-10net: replace remaining users of arch_fast_hash with jhashDaniel Borkmann3-9/+9
This patch effectively reverts commit 500f80872645 ("net: ovs: use CRC32 accelerated flow hash if available"), and other remaining arch_fast_hash() users such as from nfsd via commit 6282cd565553 ("NFSD: Don't hand out delegations for 30 seconds after recalling them.") where it has been used as a hash function for bloom filtering. While we think that these users are actually not much of concern, it has been requested to remove the arch_fast_hash() library bits that arose from [1] entirely as per recent discussion [2]. The main argument is that using it as a hash may introduce bias due to its linearity (see avalanche criterion) and thus makes it less clear (though we tried to document that) when this security/performance trade-off is actually acceptable for a general purpose library function. Lets therefore avoid any further confusion on this matter and remove it to prevent any future accidental misuse of it. For the time being, this is going to make hashing of flow keys a bit more expensive in the ovs case, but future work could reevaluate a different hashing discipline. [1] https://patchwork.ozlabs.org/patch/299369/ [2] https://patchwork.ozlabs.org/patch/418756/ Cc: Neil Brown <neilb@suse.de> Cc: Francesco Fusco <fusco@ntop.org> Cc: Jesse Gross <jesse@nicira.com> Cc: Thomas Graf <tgraf@suug.ch> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-10netlink: use jhash as hashfn for rhashtableDaniel Borkmann1-1/+1
For netlink, we shouldn't be using arch_fast_hash() as a hashing discipline, but rather jhash() instead. Since netlink sockets can be opened by any user, a local attacker would be able to easily create collisions with the DPDK-derived arch_fast_hash(), which trades off performance for security by using crc32 CPU instructions on x86_64. While it might have a legimite use case in other places, it should be avoided in netlink context, though. As rhashtable's API is very flexible, we could later on still decide on other hashing disciplines, if legitimate. Reference: http://thread.gmane.org/gmane.linux.kernel/1844123 Fixes: e341694e3eb5 ("netlink: Convert netlink_lookup() to use RCU protected hash table") Cc: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Thomas Graf <tgraf@suug.ch> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-10Merge branch 'isdn-next'David S. Miller4-46/+38
Tilman Schmidt says: ==================== ISDN patches for net-next Here's a series of patches for the Gigaset ISDN driver and one for the ISDN CAPI subsystem. Please merge as appropriate. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-10isdn/capi: correct argument types of command_2_indexTilman Schmidt1-1/+1
Utility function command_2_index is always called with arguments of type u8. Adapt its declaration accordingly. Signed-off-by: Tilman Schmidt <tilman@imap.cc> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-10isdn/gigaset: enable Kernel CAPI support by defaultTilman Schmidt1-1/+1
Kernel CAPI has been the recommended ISDN subsystem for the Gigaset driver since kernel release 2.6.34.2. It provides full backwards compatibility to the old I4L subsystem thanks to the capidrv module. I4L has been marked as deprecated for more than seven years. Signed-off-by: Tilman Schmidt <tilman@imap.cc> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-10isdn/gigaset: elliminate unnecessary argument from send_cb()Tilman Schmidt1-16/+15
No need to pass a member of the cardstate structure as a separate argument if the entire structure is already passed. Signed-off-by: Tilman Schmidt <tilman@imap.cc> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-10isdn/gigaset: clarify gigaset_modem_fill control structureTilman Schmidt1-28/+24
Replace the flag-controlled retry loop by explicit goto statements in the error branches to make the control structure easier to understand. Signed-off-by: Tilman Schmidt <tilman@imap.cc> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-10isdn/gigaset: drop duplicate declarationTilman Schmidt1-3/+0
Function gigaset_skb_sent was declared twice, identically, in gigaset.h. Signed-off-by: Tilman Schmidt <tilman@imap.cc> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-10tipc: fix broadcast wakeup contention after congestionRichard Alpe2-5/+5
commit 908344cdda80 ("tipc: fix bug in multicast congestion handling") introduced a race in the broadcast link wakeup functionality. This patch eliminates this broadcast link wakeup race caused by operation on the wakeup list without proper locking. If this race hit and corrupted the list all subsequent wakeup messages would be lost, resulting in a considerable memory leak. Signed-off-by: Richard Alpe <richard.alpe@ericsson.com> Signed-off-by: Erik Hugne <erik.hugne@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-10enic: add support for set/get rss hash keyGovindarajulu Varadarajan3-4/+49
This patch adds support for setting/getting rss hash key using ethtool. v2: respin patch to support RSS hash function changes. Signed-off-by: Govindarajulu Varadarajan <_govind@gmx.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-10Merge branch 'napi_page_frags'David S. Miller16-70/+178
Alexander Duyck says: ==================== net: Alloc NAPI page frags from their own pool This patch series implements a means of allocating page fragments without the need for the local_irq_save/restore in __netdev_alloc_frag. By doing this I am able to decrease packet processing time by 11ns per packet in my test environment. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-10ethernet/broadcom: Use napi_alloc_skb instead of netdev_alloc_skb_ip_alignAlexander Duyck3-3/+3
This patch replaces the calls to netdev_alloc_skb_ip_align in the copybreak paths. Cc: Gary Zambrano <zambrano@broadcom.com> Cc: Florian Fainelli <f.fainelli@gmail.com> Cc: Ariel Elior <ariel.elior@qlogic.com> Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-10ethernet/realtek: use napi_alloc_skb instead of netdev_alloc_skb_ip_alignAlexander Duyck3-3/+3
This replaces most of the calls to netdev_alloc_skb_ip_align in the Realtek drivers. The one instance I didn't replace in 8139cp.c is because it was called as a part of init and as such is not always accessed from the softirq context. Cc: Realtek linux nic maintainers <nic_swsd@realtek.com> Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-10cxgb: Use napi_alloc_skb instead of netdev_alloc_skb_ip_alignAlexander Duyck1-5/+6
In order to use napi_alloc_skb I needed to pass a pointer to struct adapter instead of struct pci_dev. This allowed me to access &adapter->napi. Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-10ethernet/intel: Use napi_alloc_skbAlexander Duyck6-11/+10
This change replaces calls to netdev_alloc_skb_ip_align with napi_alloc_skb. The advantage of napi_alloc_skb is currently the fact that the page allocation doesn't make use of any irq disable calls. There are few spots where I couldn't replace the calls as the buffer allocation routine is called as a part of init which is outside of the softirq context. Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-10net: Pull out core bits of __netdev_alloc_skb and add __napi_alloc_skbAlexander Duyck3-8/+77
This change pulls the core functionality out of __netdev_alloc_skb and places them in a new function named __alloc_rx_skb. The reason for doing this is to make these bits accessible to a new function __napi_alloc_skb. In addition __alloc_rx_skb now has a new flags value that is used to determine which page frag pool to allocate from. If the SKB_ALLOC_NAPI flag is set then the NAPI pool is used. The advantage of this is that we do not have to use local_irq_save/restore when accessing the NAPI pool from NAPI context. In my test setup I saw at least 11ns of savings using the napi_alloc_skb function versus the netdev_alloc_skb function, most of this being due to the fact that we didn't have to call local_irq_save/restore. The main use case for napi_alloc_skb would be for things such as copybreak or page fragment based receive paths where an skb is allocated after the data has been received instead of before. Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-10net: Split netdev_alloc_frag into __alloc_page_frag and add __napi_alloc_fragAlexander Duyck2-40/+79
This patch splits the netdev_alloc_frag function up so that it can be used on one of two page frag pools instead of being fixed on the netdev_alloc_cache. By doing this we can add a NAPI specific function __napi_alloc_frag that accesses a pool that is only used from softirq context. The advantage to this is that we do not need to call local_irq_save/restore which can be a significant savings. I also took the opportunity to refactor the core bits that were placed in __alloc_page_frag. First I updated the allocation to do either a 32K allocation or an order 0 page. This is based on the changes in commmit d9b2938aa where it was found that latencies could be reduced in case of failures. Then I also rewrote the logic to work from the end of the page to the start. By doing this the size value doesn't have to be used unless we have run out of space for page fragments. Finally I cleaned up the atomic bits so that we just do an atomic_sub_and_test and if that returns true then we set the page->_count via an atomic_set. This way we can remove the extra conditional for the atomic_read since it would have led to an atomic_inc in the case of success anyway. Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-10Merge branch 'for-davem-2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfsDavid S. Miller48-1038/+628
More iov_iter work for the networking from Al Viro. Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-09dummy: use MODULE_VERSIONFlavio Leitner1-0/+1
Use MODULE_VERSION() now that dummy driver has a version. Signed-off-by: Flavio Leitner <fbl@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-09net: sched: cls: use nla_nest_cancel instead of nlmsg_trimJiri Pirko6-13/+8
To cancel nesting, this function is more convenient. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-09amd-xgbe: Use disable_irq_nosync when in IRQ contextLendacky, Thomas1-1/+1
The disable_irq_nosync function, not the disable_irq function, must be used to disable the DMA channel interrupt from within the interrupt service routine. Change the disable_irq call to disable_irq_nosync. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-09net: fec: avoid kernal crash by NULL pointer when no phy connectionNimrod Andy1-0/+2
On i.MX6SX sabreauto board, when there have no phy daughter board connection, there have kernel crash by NULL pointer: fec 2188000.ethernet eth0: could not attach to PHY Unable to handle kernel NULL pointer dereference at virtual address 00000220 pgd = 80004000 [00000220] *pgd=00000000 Internal error: Oops: 5 [#1] PREEMPT SMP ARM Modules linked in: CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.14.24-01042-g27eaeea-dirty #405 task: d8078000 ti: d8076000 task.ti: d8076000 PC is at mutex_lock+0x10/0x54 LR is at phy_start+0x14/0x68 pc : [<806ad4e4>] lr : [<803b0f90>] psr: 60000113 sp : d8077d80 ip : 00000000 fp : d83cc000 r10: 0000100c r9 : d83cc800 r8 : 00000000 r7 : d83bcd0c r6 : 00000200 r5 : 00000220 r4 : 00000220 r3 : 00000000 r2 : 00000000 r1 : d83bcd90 r0 : 00000220 Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment kernel Control: 10c5387d Table: 8000404a DAC: 00000015 Process swapper/0 (pid: 1, stack limit = 0xd8076240) Stack: (0xd8077d80 to 0xd8078000) 7d80: 00000000 803b0f90 00000001 00000000 d83bc800 803be034 00000007 805c3fb4 7da0: 00000003 80d4e0bc 805efcb8 fffffff1 fffffff0 00000000 00000000 d8077dfc 7dc0: 0000000d 80d6ce80 80d126b0 800499c8 d83bc800 d83bc800 806f0f40 d83bc82c 7de0: 00000000 00000000 80d6ce80 80d126b0 0000016b 80540250 d8076008 d83bc800 7e00: 0000016b d83bc800 00001003 00000001 00001002 805404d4 d83bc800 00000120 7e20: 00001002 00001002 00000000 805405d4 d83bc800 00000001 80d126c0 00001002 7e40: 80dbc5dc 80d02024 00000000 806ae360 00000002 d6128420 d6127198 12400000 7e60: 00000000 00000000 00000002 d61271e8 00000000 12400000 d801674c 800e49f0 7e80: d6127198 d6124e58 00000000 80238848 d61271c4 00000000 00000001 d8016700 7ea0: 80dd2e00 80d752c0 80d752c0 80cfdaec 0000010c 80239430 806c2e90 d800f080 7ec0: d800f380 804e46b4 ffffffbc 80d15cb0 00000007 80d752c0 80d752c0 80d01e94 7ee0: 0000010c d8076030 00000000 800088cc 80dbaba4 80bd411c d80a6f00 806b1e04 7f00: 00000000 00000000 00000000 80125b84 00000000 80d2c56c 60000113 00000001 7f20: ef7ff9df 806c80cc 0000010c 80043f5c 80c95eb8 00000007 ef7ffa1d 00000007 7f40: 80d2c55c 80d15cb0 00000007 80d752c0 80d752c0 80ccc50c 0000010c 80d0a114 7f60: 80d0a10c 80cccc04 00000007 00000007 80ccc50c 806ae410 00000000 8004cb84 7f80: 80d17bc0 00000000 806a4bd4 00000000 00000000 00000000 00000000 00000000 7fa0: 00000000 806a4bdc 00000000 8000e5f8 00000000 00000000 00000000 00000000 7fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 7fe0: 00000000 00000000 00000000 00000000 00000013 00000000 1e79a7bb e5337f77 [<806ad4e4>] (mutex_lock) from [<803b0f90>] (phy_start+0x14/0x68) [<803b0f90>] (phy_start) from [<803be034>] (fec_enet_open+0x448/0x5dc) [<803be034>] (fec_enet_open) from [<80540250>] (__dev_open+0xa8/0x110) [<80540250>] (__dev_open) from [<805404d4>] (__dev_change_flags+0x88/0x170) [<805404d4>] (__dev_change_flags) from [<805405d4>] (dev_change_flags+0x18/0x48) [<805405d4>] (dev_change_flags) from [<80d02024>] (ip_auto_config+0x190/0xf94) [<80d02024>] (ip_auto_config) from [<800088cc>] (do_one_initcall+0xe8/0x144) [<800088cc>] (do_one_initcall) from [<80cccc04>] (kernel_init_freeable+0x104/0x1c8) [<80cccc04>] (kernel_init_freeable) from [<806a4bdc>] (kernel_init+0x8/0xec) [<806a4bdc>] (kernel_init) from [<8000e5f8>] (ret_from_fork+0x14/0x3c) Code: e92d4010 e3a03000 e1a04000 ee073fba (e1903f9f) Add phydev check to fix the issue. Signed-off-by: Fugang Duan <B38611@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-09tipc: avoid double lock 'spin_lock:&seq->lock'Ying Xue1-1/+1
The commit fb9962f3cefe ("tipc: ensure all name sequences are properly protected with its lock") involves below errors: net/tipc/name_table.c:980 tipc_purge_publications() error: double lock 'spin_lock:&seq->lock' Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-09net: systemport: allow changing MAC addressFlorian Fainelli1-0/+22
Hook a ndo_set_mac_address callback, update the internal Ethernet MAC in the netdevice structure, and finally write that address down to the UniMAC registers. If the interface is down, and most likely clock gated, we do not update the registers but just the local copy, such that next ndo_open() call will effectively write down the address. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-09Merge branch 'bridge_mode'David S. Miller3-20/+10
Roopa Prabhu says: ==================== remove bridge mode BRIDGE_MODE_SWDEV BRIDGE_MODE_SWDEV was introduced to indicate switchdev offloads for bridging from user space (In other words to call into the hw switch port driver directly). But user can use existing BRIDGE_FLAGS_SELF to call into the hw switch port driver today. swdev mode is not required anymore. So, this patch removes it. v4 - v5 incorporate comments - Define BRIDGE_MODE_UNDEF to handle cases where mode is not defined - reverse the order of patches - include patch comments in all patches ==================== Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-09bridge: remove mode BRIDGE_MODE_SWDEVRoopa Prabhu1-1/+0
This patch removes bridge mode swdev. Users can use BRIDGE_FLAGS_SELF to indicate swdev offload if needed. Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: Scott Feldman <sfeldma@gmail.com> Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-09rocker: remove swdev modeRoopa Prabhu2-19/+9
Remove use of 'swdev' mode in rocker. rocker dev offloads can use the BRIDGE_FLAGS_SELF to indicate offload to hardware. Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: Scott Feldman <sfeldma@gmail.com> Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-09bridge: new mode flag to indicate mode 'undefined'Roopa Prabhu1-0/+1
This patch adds mode BRIDGE_MODE_UNDEF for cases where mode is not needed. Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: Scott Feldman <sfeldma@gmail.com> Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-09Merge tag 'master-2014-12-08' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-nextDavid S. Miller228-3574/+12428
John W. Linville says: ==================== pull request: wireless-next 2014-12-08 Please pull this last batch of pending wireless updates for the 3.19 tree... For the wireless bits, Johannes says: "This time I have Felix's no-status rate control work, which will allow drivers to work better with rate control even if they don't have perfect status reporting. In addition to this, a small hwsim fix from Patrik, one of the regulatory patches from Arik, and a number of cleanups and fixes I did myself. Of note is a patch where I disable CFG80211_WEXT so that compatibility is no longer selectable - this is intended as a wake-up call for anyone who's still using it, and is still easily worked around (it's a one-line patch) before we fully remove the code as well in the future." For the Bluetooth bits, Johan says: "Here's one more bluetooth-next pull request for 3.19: - Minor cleanups for ieee802154 & mac802154 - Fix for the kernel warning with !TASK_RUNNING reported by Kirill A. Shutemov - Support for another ath3k device - Fix for tracking link key based security level - Device tree bindings for btmrvl + a state update fix - Fix for wrong ACL flags on LE links" And... "In addition to the previous one this contains two more cleanups to mac802154 as well as support for some new HCI features from the Bluetooth 4.2 specification. From the original request: 'Here's what should be the last bluetooth-next pull request for 3.19. It's rather large but the majority of it is the Low Energy Secure Connections feature that's part of the Bluetooth 4.2 specification. The specification went public only this week so we couldn't publish the corresponding code before that. The code itself can nevertheless be considered fairly mature as it's been in development for over 6 months and gone through several interoperability test events. Besides LE SC the pull request contains an important fix for command complete events for mgmt sockets which also fixes some leaks of hci_conn objects when powering off or unplugging Bluetooth adapters. A smaller feature that's part of the pull request is service discovery support. This is like normal device discovery except that devices not matching specific UUIDs or strong enough RSSI are filtered out. Other changes that the pull request contains are firmware dump support to the btmrvl driver, firmware download support for Broadcom BCM20702A0 variants, as well as some coding style cleanups in 6lowpan & ieee802154/mac802154 code.'" For the NFC bits, Samuel says: "With this one we get: - NFC digital improvements for DEP support: Chaining, NACK and ATN support added. - NCI improvements: Support for p2p target, SE IO operand addition, SE operands extensions to support proprietary implementations, and a few fixes. - NFC HCI improvements: OPEN_PIPE and NOTIFY_ALL_CLEARED support, and SE IO operand addition. - A bunch of minor improvements and fixes for STMicro st21nfcb and st21nfca" For the iwlwifi bits, Emmanuel says: "Major works are CSA and TDLS. On top of that I have a new firmware API for scan and a few rate control improvements. Johannes find a few tricks to improve our CPU utilization and adds support for a new spin of 7265 called 7265D. Along with this a few random things that don't stand out." And... "I deprecate here -8.ucode since -9 has been published long ago. Along with that I have a new activity, we have now better a infrastructure for firmware debugging. This will allow to have configurable probes insides the firmware. Luca continues his work on NetDetect, this feature is now complete. All the rest is minor fixes here and there." For the Atheros bits, Kalle says: "Only ath10k changes this time and no major changes. Most visible are: o new debugfs interface for runtime firmware debugging (Yanbo) o fix shared WEP (Sujith) o don't rebuild whenever kernel version changes (Johannes) o lots of refactoring to make it easier to add new hw support (Michal) There's also smaller fixes and improvements with no point of listing here." In addition, there are a few last minute updates to ath5k, ath9k, brcmfmac, brcmsmac, mwifiex, rt2x00, rtlwifi, and wil6210. Also included is a pull of the wireless tree to pick-up the fixes originally included in "pull request: wireless 2014-12-03"... Please let me know if there are problems! ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-09sh_eth: Remove redundant alignment adjustmentMitsuhiro Kimura1-2/+2
PTR_ALIGN macro after skb_reserve is redundant, because skb_reserve function adjusts the alignment of skb->data. Signed-off-by: Mitsuhiro Kimura <mitsuhiro.kimura.kc@renesas.com> Signed-off-by: Yoshihiro Kaneko <ykaneko0929@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-09sh_eth: Optimization for RX excess judgementMitsuhiro Kimura1-5/+5
Both of 'boguscnt' and 'quota' have nearly meaning as the condition of the reception loop. In order to cut down redundant processing, this patch changes excess judgement. Signed-off-by: Mitsuhiro Kimura <mitsuhiro.kimura.kc@renesas.com> Signed-off-by: Yoshihiro Kaneko <ykaneko0929@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-09net: avoid to call skb_queue_len againLi RongQing1-1/+1
the queue length of sd->input_pkt_queue has been put into qlen, and impossible to change, since hold the lock Signed-off-by: Li RongQing <roy.qing.li@gmail.com> Acked-by: Eric Dumazet <edumazet@google.com> Cc: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-09Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-nextDavid S. Miller8-111/+197
Jeff Kirsher says: ==================== Intel Wired LAN Driver Updates 2014-12-09 This series contains updates to i40e and i40evf. Jeff (me) provides a single patch to convert a macro to a static inline function based on feedback from Joe Perches on a previous patch. Shannon provides the remaining twelve patches against i40e. Almost all of Shannon's patches cleanup/fix NVM issues varying in range from adding more detail to debug messages, to removing dead code, to fixing NVM state transitions after an error. Change the handy decoder interface for admin queue return code to help catch and properly report the condition as a useful errno rather than returning a misleading '0'. Added a range check to avoid any possible array index-out-of-bound issues. v2: - fixed up patch 05 in the series to use the ARRAY_SIZE() macro as suggested by Sergei Shtylyov - fix up patch 13 to remove unnecessary parens in the return statement as suggested by Sergei Shtylyov ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-09Merge tag 'linux-can-next-for-3.19-20141207' of git://gitorious.org/linux-can/linux-can-nextDavid S. Miller14-174/+154
Marc Kleine-Budde says: ==================== pull-request: can-next 2014-12-07 this is a pull request of 8 patches for net-next/master. Andri Yngvason contributes 4 patches in which the CAN state change handling is consolidated and unified among the sja1000, mscan and flexcan driver. The three patches by Jeremiah Mahler fix spelling mistakes and eliminate the banner[] variable in various parts. And a patch by me that switches on sparse endianess checking by default. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-09tcp: refine TSO autosizingEric Dumazet3-58/+63
Commit 95bd09eb2750 ("tcp: TSO packets automatic sizing") tried to control TSO size, but did this at the wrong place (sendmsg() time) At sendmsg() time, we might have a pessimistic view of flow rate, and we end up building very small skbs (with 2 MSS per skb). This is bad because : - It sends small TSO packets even in Slow Start where rate quickly increases. - It tends to make socket write queue very big, increasing tcp_ack() processing time, but also increasing memory needs, not necessarily accounted for, as fast clones overhead is currently ignored. - Lower GRO efficiency and more ACK packets. Servers with a lot of small lived connections suffer from this. Lets instead fill skbs as much as possible (64KB of payload), but split them at xmit time, when we have a precise idea of the flow rate. skb split is actually quite efficient. Patch looks bigger than necessary, because TCP Small Queue decision now has to take place after the eventual split. As Neal suggested, introduce a new tcp_tso_autosize() helper, so that tcp_tso_should_defer() can be synchronized on same goal. Rename tp->xmit_size_goal_segs to tp->gso_segs, as this variable contains number of mss that we can put in GSO packet, and is not related to the autosizing goal anymore. Tested: 40 ms rtt link nstat >/dev/null netperf -H remote -l -2000000 -- -s 1000000 nstat | egrep "IpInReceives|IpOutRequests|TcpOutSegs|IpExtOutOctets" Before patch : Recv Send Send Socket Socket Message Elapsed Size Size Size Time Throughput bytes bytes bytes secs. 10^6bits/s 87380 2000000 2000000 0.36 44.22 IpInReceives 600 0.0 IpOutRequests 599 0.0 TcpOutSegs 1397 0.0 IpExtOutOctets 2033249 0.0 After patch : Recv Send Send Socket Socket Message Elapsed Size Size Size Time Throughput bytes bytes bytes secs. 10^6bits/sec 87380 2000000 2000000 0.36 44.27 IpInReceives 221 0.0 IpOutRequests 232 0.0 TcpOutSegs 1397 0.0 IpExtOutOctets 2013953 0.0 Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Acked-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-09bury memcpy_toiovec()Al Viro2-26/+0
no users left Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-12-09skb_copy_datagram_iovec() can dieAl Viro2-86/+0
no callers other than itself. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-12-09ppp_read(): switch to skb_copy_datagram_iter()Al Viro1-1/+3
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-12-09switch memcpy_to_msg() and skb_copy{,_and_csum}_datagram_msg() to primitivesAl Viro3-54/+28
... making both non-draining. That means that tcp_recvmsg() becomes non-draining. And _that_ would break iscsit_do_rx_data() unless we a) make sure tcp_recvmsg() is uniformly non-draining (it is) b) make sure it copes with arbitrary (including shifted) iov_iter (it does, all it uses is iov_iter primitives) c) make iscsit_do_rx_data() initialize ->msg_iter only once. Fortunately, (c) is doable with minimal work and we are rid of one the two places where kernel send/recvmsg users would be unhappy with non-draining behaviour. Actually, that makes all but one of ->recvmsg() instances iov_iter-clean. The exception is skcipher_recvmsg() and it also isn't hard to convert to primitives (iov_iter_get_pages() is needed there). That'll wait a bit - there's some interplay with ->sendmsg() path for that one. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-12-09first fruits - kill l2cap ->memcpy_fromiovec()Al Viro6-48/+6
Just use copy_from_iter(). That's what this method is trying to do in all cases, in a very convoluted fashion. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-12-09put iov_iter into msghdrAl Viro33-112/+90
Note that the code _using_ ->msg_iter at that point will be very unhappy with anything other than unshifted iovec-backed iov_iter. We still need to convert users to proper primitives. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-12-09vmci: propagate msghdr all way down to __qp_memcpy_from_queue()Al Viro3-12/+14
... and switch it to memcpy_to_msg() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-12-09switch l2cap ->memcpy_fromiovec() to msghdrAl Viro3-7/+7
it'll die soon enough - now that kvec-backed iov_iter works regardless of set_fs(), both instances will become copy_from_iter() as soon as we introduce ->msg_iter... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-12-09switch tcp_sock->ucopy from iovec (ucopy.iov) to msghdr (ucopy.msg)Al Viro3-6/+5
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-12-09ip_generic_getfrag, udplite_getfrag: switch to passing msghdrAl Viro7-10/+11
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-12-09ipv6 equivalent of "ipv4: Avoid reading user iov twice after raw_probe_proto_opt"Al Viro1-56/+56
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-12-09raw.c: stick msghdr into raw_frag_vecAl Viro1-4/+4
we'll want access to ->msg_iter Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-12-09Merge branch 'iov_iter' into for-davem-2Al Viro2-638/+426
2014-12-09chelsio: fix misspelling of current function in stringJulia Lawall1-1/+1
Replace a misspelled function name by %s and then __func__. This was done using Coccinelle, including the use of Levenshtein distance, as proposed by Rasmus Villemoes. Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-09hp100: fix misspelling of current function in stringJulia Lawall1-2/+5
Replace a misspelled function name by %s and then __func__. This was done using Coccinelle, including the use of Levenshtein distance, as proposed by Rasmus Villemoes. Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Signed-off-by: David S. Miller <davem@davemloft.net>