path: root/drivers/net/sun3_82586.c
Age | Commit message | Author | Files | Lines
2010-07-05 | ipv4: use skb_dst_copy() in ip_copy_metadata() | Eric Dumazet | 1 | -1/+1
Avoid touching dst refcount in ip_fragment(). Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-05 | ks8842: Replace usage of dev_dbg with netdev_dbg | Richard Röjfors | 1 | -24/+18
This patch replaces all usage of dev_dbg with netdev_dbg. A side effect is that the pointer to the platform device in the adapter struct can be removed. Signed-off-by: Richard Röjfors <richard.rojfors@pelagicore.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-05 | usbnet: remove direct access to urb->status | Oliver Neukum | 2 | -9/+12
USB drivers should not use urb->status directly because it is scheduled to become a parameter. This does the conversion for drivers/net/usb. Signed-off-by: Oliver Neukum <oneukum@suse.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-05 | ixgbe: use netif_<level> instead of netdev_<level> | Emil Tantilov | 7 | -86/+89
This patch restores the ability to set msglvl through ethtool. The issue was introduced by: commit 849c45423c0c108e08d67644728cc9b0ed225fa1 CC: Joe Perches <joe@perches.com> Reported-by: Joe Perches <joe@perches.com> Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-05 | igb: drop support for UDP hashing w/ RSS | Alexander Duyck | 1 | -8/+10
This change removes UDP from the supported protocols for RSS hashing. UDP is removed because IP fragmentation was splitting a network flow into two streams, one fragmented and one non-fragmented, which in turn caused out-of-order issues. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-04 | IB/{nes, ipoib}: Pass supported flags to ethtool_op_set_flags() | Ben Hutchings | 2 | -2/+13
Following commit 1437ce3983bcbc0447a0dedcd644c14fe833d266 "ethtool: Change ethtool_op_set_flags to validate flags", ethtool_op_set_flags takes a third parameter and cannot be used directly as an implementation of ethtool_ops::set_flags. Change the nes and ipoib drivers to pass in the appropriate value. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Acked-by: Roland Dreier <rolandd@cisco.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-04 | xfrm: fix xfrm by MARK logic | Peter Kosyh | 2 | -0/+4
While using xfrm by MARK feature in 2.6.34 - 2.6.35 kernels, the mark is always cleared in flowi structure via memset in _decode_session4 (net/ipv4/xfrm4_policy.c), so the policy lookup fails. IPv6 code is affected by this bug too. Signed-off-by: Peter Kosyh <p.kosyh@gmail.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-04 | bnx2: Update version to 2.0.16. | Michael Chan | 1 | -2/+2
Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-04 | bnx2: Dump some config space registers during TX timeout. | Michael Chan | 1 | -3/+8
These config register values will be useful when the memory registers are returning 0xffffffff which has been reported. Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-04 | bnx2: Add support for skb->rxhash. | Michael Chan | 2 | -1/+17
Add skb->rxhash support for TCP packets only because the bnx2 RSS hash does not hash UDP ports. Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-04 | bnx2: Always enable MSI-X on 5709. | Michael Chan | 1 | -1/+1
Minor change to use MSI-X even if there is only one CPU. This allows the CNIC driver to always have a dedicated MSI-X vector to handle iSCSI events, instead of sharing the MSI vector. Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-04 | netdevice.h: Change netif_<level> macros to call netdev_<level> functions | Joe Perches | 1 | -7/+13
Reduces text by ~300 bytes (woohoo!) in an x86 defconfig.

$ size vmlinux*
   text    data     bss     dec    hex filename
7198526  720112 1366288 9284926 8dad3e vmlinux
7198862  720112 1366288 9285262 8dae8e vmlinux.netdev

Signed-off-by: Joe Perches <joe@perches.com> Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-04 | netdevice.h net/core/dev.c: Convert netdev_<level> logging macros to functions | Joe Perches | 2 | -19/+79
Reduces an x86 defconfig text and data ~2k. text is smaller, data is larger.

$ size vmlinux*
   text    data     bss     dec    hex filename
7198862  720112 1366288 9285262 8dae8e vmlinux
7205273  716016 1366288 9287577 8db799 vmlinux.device_h

Uses %pV and struct va_format. Format arguments are verified before printk. Signed-off-by: Joe Perches <joe@perches.com> Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-04 | device.h drivers/base/core.c Convert dev_<level> logging macros to functions | Joe Perches | 2 | -26/+150
Reduces an x86 defconfig text and data ~55k, .6% smaller.

$ size vmlinux*
   text    data     bss     dec    hex filename
7205273  716016 1366288 9287577 8db799 vmlinux
7258890  719768 1366288 9344946 8e97b2 vmlinux.master

Uses %pV and struct va_format. Format arguments are verified before printk. The dev_info macro is converted to _dev_info because there are existing uses of variables named dev_info in the kernel tree, like drivers/net/pcmcia/pcnet_cs.c. A dev_info macro is created to call _dev_info. Signed-off-by: Joe Perches <joe@perches.com> Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-04 | vsprintf: Recursive vsnprintf: Add "%pV", struct va_format | Joe Perches | 2 | -0/+14
Add the ability to print a format and va_list from a structure pointer. Allows __dev_printk to be implemented as a single printk while minimizing string space duplication. %pV should not be used without some mechanism to verify the format and argument use, a la __attribute__((format(printf, ...))). Signed-off-by: Joe Perches <joe@perches.com> Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: David S. Miller <davem@davemloft.net>
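The idea behind %pV in the commit above can be illustrated in userspace. This is only a sketch under stated assumptions: %pV itself exists only in the kernel's vsnprintf, so the nested format is expanded by hand here, and the names va_format_sketch, format_with_prefix and dev_msg_sketch are hypothetical.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Userspace stand-in for the kernel's struct va_format: a format string
 * plus a pointer to the caller's va_list. Hypothetical name. */
struct va_format_sketch {
    const char *fmt;
    va_list *va;
};

/* Write "<prefix>: " followed by the nested format into one buffer.
 * The kernel does the nested expansion inside vsnprintf() when it meets
 * %pV; this sketch performs the two steps explicitly. */
static int format_with_prefix(char *buf, size_t len, const char *prefix,
                              const struct va_format_sketch *vaf)
{
    int n;

    assert(buf != NULL && len > 0);
    n = snprintf(buf, len, "%s: ", prefix);
    if (n < 0 || (size_t)n >= len)
        return n;
    return n + vsnprintf(buf + n, len - (size_t)n, vaf->fmt, *vaf->va);
}

/* The format attribute lets the compiler verify the arguments at every
 * call site, which is the safeguard the commit message asks for. */
__attribute__((format(printf, 4, 5)))
static int dev_msg_sketch(char *buf, size_t len, const char *dev_name,
                          const char *fmt, ...)
{
    struct va_format_sketch vaf;
    va_list args;
    int ret;

    va_start(args, fmt);
    vaf.fmt = fmt;
    vaf.va = &args;
    ret = format_with_prefix(buf, len, dev_name, &vaf);
    va_end(args);
    return ret;
}
```

A caller would write something like dev_msg_sketch(buf, sizeof(buf), "eth0", "link %s", "up"); a mismatched argument type is then flagged at compile time rather than at run time.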
2010-07-02 | virtio_net: fix oom handling on tx | Rusty Russell | 1 | -8/+13
virtio net will never try to overflow the TX ring, so the only reason add_buf may fail is out of memory. Thus, we can not stop the device until some request completes - there's no guarantee anything at all is outstanding. Make the error message clearer as well: error here does not indicate queue full. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (...and avoid TX_BUSY) Cc: stable@kernel.org # .34.x (s/virtqueue_/vi->svq->vq_ops->/) Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-02 | virtio_net: do not reschedule rx refill forever | Michael S. Tsirkin | 1 | -4/+3
We currently fill all of the RX ring, then add_buf returns ENOSPC, which gets mis-detected as an out of memory condition and causes us to reschedule the work, and so on forever. Fix this by setting oom = err == -ENOMEM; Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Cc: stable@kernel.org # .34.x Signed-off-by: David S. Miller <davem@davemloft.net>
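The distinction this fix relies on can be shown in isolation. A minimal sketch, assuming kernel-style return values (0 or a negative errno); refill_hit_oom is a hypothetical helper name, not a function from the driver.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical helper: decide whether an RX refill attempt failed because
 * memory ran out. A full ring (-ENOSPC) is the normal end of a refill and
 * must not trigger a retry; only -ENOMEM should. */
static bool refill_hit_oom(int err)
{
    assert(err <= 0);    /* kernel-style result: 0 or a negative errno */
    return err == -ENOMEM;
}
```

With this test, a refill loop that ends on -ENOSPC reports no out-of-memory condition, so the work is not rescheduled.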
2010-07-02 | s2io: resolve statistics issues | Jon Mason | 2 | -41/+64
This patch resolves a number of issues in the statistics gathering of the s2io driver. On Xframe adapters, the received multicast statistics counter includes pause frames which are not indicated to the driver. This can cause issues where the multicast packet count is higher than what has actually been received, possibly higher than the number of packets received. The driver software counters are replaced with the adapter hardware statistics for rx_packets, rx_bytes, and tx_bytes. It also uses the overflow registers to determine if the statistics wrapped the 32bit register (removing the window of having a statistic value less than the previous call). rx_length_errors statistic now includes undersized packets in addition to oversized packets in its counting. Finally, rx_crc_errors are now being counted. Signed-off-by: Jon Mason <jon.mason@exar.com> Signed-off-by: David S. Miller <davem@davemloft.net>
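The use of overflow registers described above can be sketched as follows. This is an illustration only: it assumes each statistic is exposed as a 32-bit counter plus a 32-bit overflow word, and the names stat_combine and stat_delta are hypothetical rather than taken from the s2io driver.

```c
#include <assert.h>
#include <stdint.h>

/* Fold the overflow word and the 32-bit counter into one 64-bit value, so
 * a wrap of the low register no longer looks like the statistic shrank. */
static uint64_t stat_combine(uint32_t overflow, uint32_t counter)
{
    return ((uint64_t)overflow << 32) | counter;
}

/* Difference between two successive combined readings. */
static uint64_t stat_delta(uint64_t prev, uint64_t now)
{
    assert(now >= prev);    /* combined readings never go backwards */
    return now - prev;
}
```

For example, a counter that moves from 0xFFFFFFF0 to 0x10 while the overflow word goes from 0 to 1 has advanced by 0x20, not dropped.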
2010-07-02 | linux/net.h: fix kernel-doc warnings | Randy Dunlap | 1 | -2/+1
Fix kernel-doc warnings in linux/net.h:

Warning(include/linux/net.h:151): No description found for parameter 'wq'
Warning(include/linux/net.h:151): Excess struct/union/enum/typedef member 'fasync_list' description in 'socket'
Warning(include/linux/net.h:151): Excess struct/union/enum/typedef member 'wait' description in 'socket'

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-02 | net: decreasing real_num_tx_queues needs to flush qdisc | John Fastabend | 4 | -5/+30
Reducing real_num_queues needs to flush the qdisc, otherwise skbs with queue_mappings greater than real_num_tx_queues can be sent to the underlying driver. The flow for this is:

dev_queue_xmit()
    dev_pick_tx()
        skb_tx_hash()            => hash using real_num_tx_queues
        skb_set_queue_mapping()
    ...
    qdisc_enqueue_root()         => enqueue skb on txq from hash
...
dev->real_num_tx_queues -= n
...
sch_direct_xmit()
    dev_hard_start_xmit()
        ndo_start_xmit(skb,dev)  => skb queue set with old hash

skbs are enqueued on the qdisc with skb->queue_mapping set 0 < queue_mappings < real_num_tx_queues. When the driver decreases real_num_tx_queues, skbs may be dequeued from the qdisc with a queue_mapping greater than real_num_tx_queues. This fixes a case in ixgbe where this was occurring with DCB and FCoE. Because the driver is using queue_mapping to map skbs to tx descriptor rings, we can potentially map skbs to rings that no longer exist. Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
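The stale-mapping problem in this commit can be modelled in a few lines of userspace C. This is a simplified sketch: pkt_sketch, pick_queue and count_stale are hypothetical names, and a plain modulo stands in for the kernel's actual hash-to-queue scaling.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal stand-in for an skb that already has a queue chosen. */
struct pkt_sketch {
    unsigned int queue_mapping;
};

/* Choose a queue from a flow hash using the queue count valid right now.
 * Assumption: modulo replaces the kernel's real scaling for brevity. */
static unsigned int pick_queue(unsigned int hash,
                               unsigned int real_num_tx_queues)
{
    assert(real_num_tx_queues > 0);
    return hash % real_num_tx_queues;
}

/* Count queued packets whose mapping points at a queue that no longer
 * exists once the queue count has been reduced. */
static size_t count_stale(const struct pkt_sketch *q, size_t n,
                          unsigned int real_num_tx_queues)
{
    size_t stale = 0;

    for (size_t i = 0; i < n; i++)
        if (q[i].queue_mapping >= real_num_tx_queues)
            stale++;
    return stale;
}
```

Any packet counted by count_stale() would index a ring the driver has already torn down, which is why the queue has to be flushed when the count shrinks.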
2010-07-02 | sched: qdisc_reset_all_tx is calling qdisc_reset without qdisc_lock | John Fastabend | 1 | -2/+10
When calling qdisc_reset() the qdisc lock needs to be held. In this case there is at least one driver i4l which is using this without holding the lock. Add the locking here. Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-02 | qlge: fix a eeh handler to not add a pending timer | Breno Leitao | 1 | -0/+2
On some occasions the function qlge_io_resume() tries to add a pending timer, which causes the system to hit the BUG() in the add_timer() function. This patch removes the timer during the EEH recovery. Signed-off-by: Breno Leitao <leitao@linux.vnet.ibm.com> Signed-off-by: Ron Mercer <ron.mercer@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-02 | qlge: Replacing add_timer() to mod_timer() | Breno Leitao | 1 | -6/+3
Currently qlge driver calls add_timer() instead of mod_timer(). This patch changes add_timer() to mod_timer(), which seems a better solution. Signed-off-by: Breno Leitao <leitao@linux.vnet.ibm.com> Signed-off-by: Ron Mercer <ron.mercer@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-02 | usbnet: Set parent device early for netdev_printk() | Ben Hutchings | 1 | -2/+3
netdev_printk() follows the net_device's parent device pointer, so we must set that earlier than we previously did. Reported-by: Luís Picciochi Oliveira <pitxyoki@gmail.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-02 | net: Revert "rndis_host: Poll status channel before control channel" | Ben Hutchings | 1 | -12/+6
This reverts commit c17b274dc2aa538b68c1f02b01a3c4e124b435ba. That change was reported to break rndis_wlan support for the WUSB54GS. Reported-by: Luís Picciochi Oliveira <pitxyoki@gmail.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-02 | netfilter: ip6t_REJECT: fix a dst leak in ipv6 REJECT | Eric Dumazet | 1 | -2/+4
We should release dst if dst->error is set. Bug introduced in 2.6.14 by commit e104411b82f5c ([XFRM]: Always release dst_entry on error in xfrm_lookup) Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: stable@kernel.org Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-07-02 | bridge: add per bridge device controls for invoking iptables | Patrick McHardy | 3 | -9/+97
Support more fine grained control of bridge netfilter iptables invocation by adding separate brnf_call_*tables parameters for each device using the sysfs interface. Packets are passed to layer 3 netfilter when either the global parameter or the per bridge parameter is enabled. Acked-by: Stephen Hemminger <shemminger@vyatta.com> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-07-01 | ixgbe: use NETIF_F_LRO | Stanislaw Gruszka | 1 | -1/+1
Both ETH_FLAG_LRO and NETIF_F_LRO have the same value, but NETIF_F_LRO is intended for use with netdev->features. Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com> Acked-by: Don Skidmore <donald.c.skidmore@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-01 | igb: Add comment | Greg Rose | 1 | -0/+4
Add explanatory comment to avoid confusion when a pointer is set to the second word of an array instead of the customary cast of a pointer to the beginning of the array. Signed-off-by: Greg Rose <gregory.v.rose@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-01 | igb: correct link test not being run when link is down | Alexander Duyck | 1 | -5/+3
The igb online link test was always reporting pass because instead of checking for if_running it was checking for netif_carrier_ok. This change corrects the test so that it is run if the interface is running instead of checking for netif carrier ok. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Emil Tantilov <emil.s.tantilov@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-01 | igb: Fix Tx hangs seen when loading igb with max_vfs > 7. | Emil Tantilov | 1 | -4/+1
Check the value of max_vfs at the time of assignment of vfs_allocated_count. The previous check in igb_probe_vfs was too late as by that time the rx/tx rings were initialized with the wrong offset. Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-01 | igb: Use only a single Tx queue in SR-IOV mode | Greg Rose | 1 | -4/+4
The 82576 expects the second rx queue in any pool to receive L2 switch loop back packets sent from the second tx queue in another pool. The 82576 VF driver does not enable the second rx queue, so if the PF driver sends packets destined to a VF from its second tx queue then the VF driver will never see them. In SR-IOV mode limit the number of tx queues used by the PF driver to one. This patch fixes a bug reported in which the PF cannot communicate with the VF and should be considered for 2.6.34 stable. CC: stable@kernel.org Signed-off-by: Greg Rose <gregory.v.rose@intel.com> Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-01 | igb: fix PHY config access on 82580 | Nick Nunley | 2 | -0/+10
82580 NICs can have up to 4 functions. This fixes phy accesses to use the correct locks for functions 2 and 3. Signed-off-by: Nicholas Nunley <nicholasx.d.nunley@intel.com> Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-01 | x86: Drop CONFIG_MCORE2 check around setting of NET_IP_ALIGN | Alexander Duyck | 1 | -2/+0
This patch removes the CONFIG_MCORE2 check from around NET_IP_ALIGN. It is based on a suggestion from Andi Kleen. The assumption is that there are not any x86 cores where unaligned access is really slow, and this change would allow for a performance improvement to still exist on configurations that are not necessarily optimized for Core 2. Cc: Andi Kleen <ak@linux.intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: x86@kernel.org Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Acked-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-01 | ll_temac: add error checking to DMA init path | Denis Kirjanov | 1 | -2/+23
Add error checking to DMA descriptor rings initialization code. Signed-off-by: Denis Kirjanov <dkirjanov@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-01 | be2net: changes to properly provide phy details | Ajit Khaparde | 4 | -14/+104
be2net driver is currently not showing correct phy details in certain cases. This patch fixes it. Signed-off-by: Ajit Khaparde <ajitk@serverengines.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-01 | ehea: Allocate stats buffer with GFP_KERNEL | Brian King | 1 | -1/+1
Since ehea_get_stats calls ehea_h_query_ehea_port, which can sleep, we can also sleep when allocating a page in this function. This fixes some memory allocation failure warnings seen under low memory conditions. Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-01 | drivers: bluetooth: bluecard_cs.c: Fixed include error, changed to linux/io.h | Cody Rester | 1 | -1/+1
Fixed include error, changed to linux/io.h Signed-off-by: Cody Rester <codyrester@gmail.com> Acked-by: Gustavo F. Padovan <padovan@profusion.mobi> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-01 | vhost: add unlikely annotations to error path | Michael S. Tsirkin | 2 | -28/+29
patch 'break out of polling loop on error' caused a minor performance regression on my machine: recover that performance by adding a bunch of unlikely annotations in the error handling. Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2010-06-30 | x86: Align skb w/ start of cacheline on newer core 2/Xeon Arch | Alexander Duyck | 1 | -0/+9
x86 architectures can handle unaligned accesses in hardware, and it has been shown that unaligned DMA accesses can be expensive on Nehalem architectures. As such we should overwrite NET_IP_ALIGN to resolve this issue. Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: x86@kernel.org Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Acked-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-30 | ixgbe: add 1g PHY support for 82599 | Don Skidmore | 5 | -4/+52
Add support for 1G SFP+ PHY's to 82599. Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-30 | sfc: Add support for RX flow hash control | Ben Hutchings | 5 | -10/+105
Allow ethtool to query the number of RX rings, the fields used in RX flow hashing and the hash indirection table. Allow ethtool to update the RX flow hash indirection table. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-30 | ethtool: Add support for control of RX flow hash indirection | Ben Hutchings | 2 | -0/+95
Many NICs use an indirection table to map an RX flow hash value to one of an arbitrary number of queues (not necessarily a power of 2). It can be useful to remove some queues from this indirection table so that they are only used for flows that are specifically filtered there. It may also be useful to weight the mapping to account for user processes with the same CPU-affinity as the RX interrupts. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
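How an indirection table maps a flow hash to a queue, and how some queues can be left out of it, might look like the sketch below. It is an assumption-laden illustration: the function names are hypothetical, and indexing by hash modulo the table size is a simplification of what a given NIC actually does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Look up the RX queue for a flow hash. The table may hold any queue
 * numbers, so the queue count does not need to be a power of two. */
static uint32_t rx_queue_for_hash(const uint32_t *table, size_t table_size,
                                  uint32_t hash)
{
    assert(table != NULL && table_size > 0);
    return table[hash % table_size];
}

/* Fill the table round-robin over queues [first, first + count). Queues
 * outside that range receive no hashed flows and stay available for
 * flows that are explicitly filtered to them. */
static void fill_round_robin(uint32_t *table, size_t table_size,
                             uint32_t first, uint32_t count)
{
    assert(table != NULL && count > 0);
    for (size_t i = 0; i < table_size; i++)
        table[i] = first + (uint32_t)(i % count);
}
```

Weighting falls out of the same structure: listing a queue in more table slots sends it a proportionally larger share of flows.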
2010-06-30 | vmxnet3: Remove incorrect implementation of ethtool_ops::get_flags() | Ben Hutchings | 1 | -7/+1
Only some netdev feature flags correspond directly to ethtool feature flags. ethtool_op_get_flags() does the right thing. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: Bhavesh Davda <bhavesh@vmware.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-30 | netdev: Make ethtool_ops::set_flags() return -EINVAL for unsupported flags | Ben Hutchings | 4 | -4/+4
The documented error code for attempts to set unsupported flags (or to clear flags that cannot be disabled) is EINVAL, not EOPNOTSUPP. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Acked-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-30 | ethtool: Change ethtool_op_set_flags to validate flags | Ben Hutchings | 10 | -60/+32
ethtool_op_set_flags() does not check for unsupported flags, and has no way of doing so. This means it is not suitable for use as a default implementation of ethtool_ops::set_flags. Add a 'supported' parameter specifying the flags that the driver and hardware support, validate the requested flags against this, and change all current callers to pass this parameter. Change some other trivial implementations of ethtool_ops::set_flags to call ethtool_op_set_flags(). Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Reviewed-by: Stanislaw Gruszka <sgruszka@redhat.com> Acked-by: Jeff Garzik <jgarzik@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-30 | cxgb4vf: Use correct shift factor for extracting the SGE DMA Ingress Padding Boundary | Casey Leedom | 1 | -1/+1
Use correct shift factor for extracting the SGE DMA Ingress Padding Boundary. Was accidentally using the register field's shift which was close enough (4 instead of the proper value of 5) that it actually sort of worked for various packet sizes ... Signed-off-by: Casey Leedom <leedom@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-30 | cxgb4vf: Remove obsolete comment about the lack of a TX Timer Callback | Casey Leedom | 1 | -12/+1
Remove obsolete comment about the lack of a TX Timer Callback -- which we now _do_ have ... Signed-off-by: Casey Leedom <leedom@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-30 | bonding: check if clients MAC addr has changed | Flavio Leitner | 1 | -1/+2
When two systems using bonding devices in adaptive load balancing (ALB) mode communicate with each other, an endless ping-pong of ARP replies starts between these two systems. What happens? In ALB mode, the bonding driver keeps track of each connected client in a hash table, so it can do receive load balancing (RLB). This hash table is updated when an ARP reply is received: the driver scans for the client entry, updates its MAC address and flags it to be announced later. Two seconds later, the alb monitor runs and sends, for each updated client entry, two ARP replies updating that specific client. The same process happens on the receiving system, causing the endless ping-pong of ARP replies. See more information including the relevant functions below:

System 1                                System 2
bond0                                   bond0
ping <system2>
ARP request  --------->
             <--------- ARP reply
+->rlb_arp_recv  <---------------------+  <--- loop begins
|    rlb_update_entry_from_arp         |
|      client_info->ntt = 1;           |
|      bond_info->rx_ntt = 1;          |
|                                      |
|    <communication succeed>           |
|                                      |
|  bond_alb_monitor                    |
|    rlb_update_rx_clients             |
|      rlb_update_client               |
|        arp_create(ARPOP_REPLY)       |
|          send ARP reply -------------->  V
|          send ARP reply -------------->
|                                         rlb_arp_recv
|                                           rlb_update_entry_from_arp
|                                             client_info->ntt = 1;
|                                             bond_info->rx_ntt = 1;
|                                         < snipped, same as in system 1>
+------- <-------------- send ARP reply
         <-------------- send ARP reply

Besides the unneeded networking traffic, this loop breaks a cluster because a backup system can't take over the IP address. There is always one system sending an ARP reply poisoning the network. This patch fixes the problem by adding a check for the MAC address before updating it. Thus, if the MAC address didn't change, there is no need to update it or to announce it later. Signed-off-by: Flavio Leitner <fleitner@redhat.com> Signed-off-by: Jay Vosburgh <fubar@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
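The check that breaks the ARP loop can be sketched on its own. This is a minimal userspace model, assuming a client entry that stores the peer MAC and a "need to transmit" flag; rlb_client_sketch and update_client_mac are hypothetical names, not the driver's.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define MAC_LEN 6

/* Minimal stand-in for one receive load balancing client entry. */
struct rlb_client_sketch {
    uint8_t mac_dst[MAC_LEN];
    bool ntt;    /* set when an update must be announced later */
};

/* Update the stored MAC only if the ARP reply carries a different one.
 * Returns true when the entry changed and an announcement is pending. */
static bool update_client_mac(struct rlb_client_sketch *c,
                              const uint8_t *mac_src)
{
    assert(c != NULL && mac_src != NULL);
    if (memcmp(c->mac_dst, mac_src, MAC_LEN) == 0)
        return false;    /* unchanged: nothing to announce */
    memcpy(c->mac_dst, mac_src, MAC_LEN);
    c->ntt = true;
    return true;
}
```

Because an ARP reply that repeats the known MAC no longer sets the flag, the monitor has nothing to send and the exchange between the two systems stops.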
2010-06-30 | fragment: add fast path for in-order fragments | Changli Gao | 3 | -0/+24
Add a fast path for in-order fragments. As the fragments are sent in order in most OSes, such as Windows, Darwin and FreeBSD, it is likely that new fragments are at the end of the inet_frag_queue. In the fast path, we check if the skb at the end of the inet_frag_queue is the prev we expect. Signed-off-by: Changli Gao <xiaosuo@gmail.com>
----
 include/net/inet_frag.h |    1 +
 net/ipv4/ip_fragment.c  |   12 ++++++++++++
 net/ipv6/reassembly.c   |   11 +++++++++++
 3 files changed, 24 insertions(+)
Signed-off-by: David S. Miller <davem@davemloft.net>
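The fast path described in this commit can be sketched with a plain linked list. This is a simplified model under stated assumptions: frag_sketch, frag_queue_sketch and frag_insert are hypothetical names, fragments are ordered by starting offset only, and overlap handling is left out.

```c
#include <assert.h>
#include <stddef.h>

/* One queued fragment, identified only by its starting offset. */
struct frag_sketch {
    int offset;
    struct frag_sketch *next;
};

/* Fragment queue that remembers its tail for the fast path. */
struct frag_queue_sketch {
    struct frag_sketch *head;
    struct frag_sketch *tail;
};

/* Insert a fragment in ascending offset order. Returns 1 when the fast
 * path (append after the tail) was taken, 0 when the list was walked. */
static int frag_insert(struct frag_queue_sketch *q, struct frag_sketch *f)
{
    struct frag_sketch *prev = NULL, *cur;

    assert(q != NULL && f != NULL);
    /* Fast path: empty queue, or the fragment starts after the tail. */
    if (!q->tail || q->tail->offset < f->offset) {
        f->next = NULL;
        if (q->tail)
            q->tail->next = f;
        else
            q->head = f;
        q->tail = f;
        return 1;
    }
    /* Slow path: find the first fragment that starts at or after f. */
    for (cur = q->head; cur && cur->offset < f->offset; cur = cur->next)
        prev = cur;
    f->next = cur;
    if (prev)
        prev->next = f;
    else
        q->head = f;
    if (!cur)
        q->tail = f;
    return 0;
}
```

When a sender emits fragments in order, every insertion takes the constant-time branch and the list walk is needed only for reordered arrivals.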