linux-dev - Linux kernel development work

Age	Commit message (Collapse)	Author	Files	Lines
2011-11-30	ipv6 : mcast : Delete useless parameter in ip6_mc_add1_src()	Jun Zhao	1	-2/+2
	Need not to used 'delta' flag when add single-source to interface filter source list. Signed-off-by: Jun Zhao <mypopydev@gmail.com> Signed-off-by: David S. Miller <davem@drr.davemloft.net>
2011-11-30	ipv4 : igmp : Delete useless parameter in ip_mc_add1_src()	Jun Zhao	1	-2/+2
	Need not to used 'delta' flag when add single-source to interface filter source list. Signed-off-by: Jun Zhao <mypopydev@gmail.com> Signed-off-by: David S. Miller <davem@drr.davemloft.net>
2011-11-30	bonding: only use primary address for ARP	Henrik Saavedra Persson	1	-27/+6
	Only use the primary address of the bond device for master_ip. This will prevent changing the ARP source address in Active-Backup mode whenever a secondry address is added to the bond device. Signed-off-by: Henrik Saavedra Persson <henrik.e.persson@ericsson.com> Signed-off-by: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: David S. Miller <davem@drr.davemloft.net>
2011-11-30	ARM: 7182/1: ARM cpu topology: fix warning	Vincent Guittot	2	-2/+2
	kernel/sched.c:7354:2: warning: initialization from incompatible pointer type Align cpu_coregroup_mask prototype interface with sched_domain_mask_f typedef use int cpu instead of unsigned int cpu Cc: <stable@vger.kernel.org> Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2011-11-30	ARM: 7181/1: Restrict kprobes probing SWP instructions to ARMv5 and below	Jon Medhurst (Tixy)	2	-10/+19
	The SWP instruction is deprecated on ARMv6 and with ARMv7 it will be UNDEFINED when CONFIG_SWP_EMULATE is selected. In this case, probing a SWP instruction will cause an oops when the kprobes emulation code executes an undefined instruction. As the SWP instruction should be rare or non-existent in kernels for ARMv6 and later, we can simply avoid these problems by not allowing probing of these. Reported-by: Leif Lindholm <leif.lindholm@arm.com> Tested-by: Leif Lindholm <leif.lindholm@arm.com> Acked-by: Nicolas Pitre <nicolas.pitre@linaro.org> Signed-off-by: Jon Medhurst <tixy@yxit.co.uk> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2011-11-30	ARM: 7180/1: Change kprobes testcase with unpredictable STRD instruction	Jon Medhurst (Tixy)	1	-1/+1
	There is a kprobes testcase for the instruction "strd r2, [r3], r4". This has unpredictable behaviour as it uses r3 for register writeback addressing and also stores it to memory. On a cortex A9, this testcase would fail because the instruction writes the updated value of r3 to memory, whereas the kprobes emulation code writes the original value. Fix this by changing testcase to used r5 instead of r3. Reported-by: Leif Lindholm <leif.lindholm@arm.com> Tested-by: Leif Lindholm <leif.lindholm@arm.com> Acked-by: Nicolas Pitre <nicolas.pitre@linaro.org> Signed-off-by: Jon Medhurst <tixy@yxit.co.uk> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2011-11-30	atm: clip: Use device neigh support on top of "arp_tbl".	David Miller	4	-90/+16
	Instead of instantiating an entire new neigh_table instance just for ATM handling, use the neigh device private facility. Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-30	neigh: Add device constructor/destructor capability.	David Miller	2	-1/+16
	If the neigh entry has device private state, it will need constructor/destructor ops. Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-30	atm: clip: Convert over to neighbour_priv()	David Miller	2	-15/+15
	Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-30	neigh: Do not set tbl->entry_size in ipv4/ipv6 neigh tables.	David Miller	3	-3/+0
	Let the core self-size the neigh entry based upon the key length. Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-30	neigh: Add infrastructure for allocating device neigh privates.	David Miller	4	-3/+15
	netdev->neigh_priv_len records the private area length. This will trigger for neigh_table objects which set tbl->entry_size to zero, and the first instances of this will be forthcoming. Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-30	neigh: Get rid of neigh_table->kmem_cachep	David Miller	2	-17/+2
	We are going to alloc for device specific private areas for neighbour entries, and in order to do that we have to move away from the fixed allocation size enforced by using neigh_table->kmem_cachep As a nice side effect we can now use kfree_rcu(). Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-30	neigh: Create mechanism for generic neigh private areas.	David Miller	1	-0/+7
	The implementation private sits right after the primary_key memory. Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-30	ipv4: fix lockdep splat in rt_cache_seq_show	Eric Dumazet	1	-2/+6
	After commit f2c31e32b378 (fix NULL dereferences in check_peer_redir()), dst_get_neighbour() should be guarded by rcu_read_lock() / rcu_read_unlock() section. Reported-by: Miles Lane <miles.lane@gmail.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-30	sfc: fix race in efx_enqueue_skb_tso()	Eric Dumazet	1	-2/+2
	As soon as skb is pushed to hardware, it can be completed and freed, so we should not dereference skb anymore. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-30	sch_teql: fix lockdep splat	Eric Dumazet	1	-11/+20
	We need rcu_read_lock() protection before using dst_get_neighbour(), and we must cache its value (pass it to __teql_resolve()) teql_master_xmit() is called under rcu_read_lock_bh() protection, its not enough. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-30	bnx2: Support for byte queue limits	Eric Dumazet	1	-0/+6
	Changes to bnx2 to use byte queue limits. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> CC: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-30	net: fec: Select the FEC driver by default for i.MX SoCs	Fabio Estevam	1	-0/+1
	Since commit 230dec6 (net/fec: add imx6q enet support) the FEC driver is no longer built by default for i.MX SoCs. Let the FEC driver be built by default again. Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com> Suggested-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Acked-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de Acked-by: Shawn Guo <shawn.guo@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-30	tcp: inherit listener congestion control for passive cnx	Eric Dumazet	3	-1/+7
	Rick Jones reported that TCP_CONGESTION sockopt performed on a listener was ignored for its children sockets : right after accept() the congestion control for new socket is the system default one. This seems an oversight of the initial design (quoted from Stephen) Based on prior investigation and patch from Rick. Reported-by: Rick Jones <rick.jones2@hp.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> CC: Stephen Hemminger <shemminger@vyatta.com> CC: Yuchung Cheng <ycheng@google.com> Tested-by: Rick Jones <rick.jones2@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-30	can: Revert outdated cc770 driver patches.	David S. Miller	7	-1504/+0
	Newer versions have been floating about, and I applied to older variant unfortunately. Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-30	Btrfs: skip allocation attempt from empty cluster	Alexandre Oliva	1	-3/+3
	If we don't have a cluster, don't bother trying to allocate from it, jumping right away to the attempt to allocate a new cluster. Signed-off-by: Alexandre Oliva <oliva@lsd.ic.unicamp.br> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2011-11-30	Btrfs: skip block groups without enough space for a cluster	Alexandre Oliva	1	-1/+1
	We test whether a block group has enough free space to hold the requested block, but when we're doing clustered allocation, we can save some cycles by testing whether it has enough room for the cluster upfront, otherwise we end up attempting to set up a cluster and failing. Only in the NO_EMPTY_SIZE loop do we attempt an unclustered allocation, and by then we'll have zeroed the cluster size, so this patch won't stop us from using the block group as a last resort. Signed-off-by: Alexandre Oliva <oliva@lsd.ic.unicamp.br> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2011-11-30	Btrfs: start search for new cluster at the beginning	Alexandre Oliva	1	-4/+2
	Instead of starting at zero (offset is always zero), request a cluster starting at search_start, that denotes the beginning of the current block group. Signed-off-by: Alexandre Oliva <oliva@lsd.ic.unicamp.br> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2011-11-30	Btrfs: reset cluster's max_size when creating bitmap	Alexandre Oliva	1	-0/+1
	The field that indicates the size of the largest contiguous chunk of free space in the cluster is not initialized when setting up bitmaps, it's only increased when we find a larger contiguous chunk. We end up retaining a larger value than appropriate for highly-fragmented clusters, which may cause pointless searches for large contiguous groups, and even cause clusters that do not meet the density requirements to be set up. Signed-off-by: Alexandre Oliva <oliva@lsd.ic.unicamp.br> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2011-11-30	Btrfs: initialize new bitmaps' list	Alexandre Oliva	1	-0/+1
	We're failing to create clusters with bitmaps because setup_cluster_no_bitmap checks that the list is empty before inserting the bitmap entry in the list for setup_cluster_bitmap, but the list field is only initialized when it is restored from the on-disk free space cache, or when it is written out to disk. Besides a potential race condition due to the multiple use of the list field, filesystem performance severely degrades over time: as we use up all non-bitmap free extents, the try-to-set-up-cluster dance is done at every metadata block allocation. For every block group, we fail to set up a cluster, and after failing on them all up to twice, we fall back to the much slower unclustered allocation. To make matters worse, before the unclustered allocation, we try to create new block groups until we reach the 1% threshold, which introduces additional bitmaps and thus block groups that we'll iterate over at each metadata block request.
2011-11-30	Btrfs: fix oops when calling statfs on readonly device	Li Zefan	1	-3/+3
	To reproduce this bug: # dd if=/dev/zero of=img bs=1M count=256 # mkfs.btrfs img # losetup -r /dev/loop1 img # mount /dev/loop1 /mnt OOPS!! It triggered BUG_ON(!nr_devices) in btrfs_calc_avail_data_space(). To fix this, instead of checking write-only devices, we check all open deivces: # df -h /dev/loop1 Filesystem Size Used Avail Use% Mounted on /dev/loop1 250M 28K 238M 1% /mnt Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
2011-11-30	Btrfs: Don't error on resizing FS to same size	Mike Fleetwood	1	-1/+1
	It seems overly harsh to fail a resize of a btrfs file system to the same size when a shrink or grow would succeed. User app GParted trips over this error. Allow it by bypassing the shrink or grow operation. Signed-off-by: Mike Fleetwood <mike.fleetwood@googlemail.com>
2011-11-30	Btrfs: fix deadlock on metadata reservation when evicting a inode	Miao Xie	3	-5/+22
	When I ran the xfstests, I found the test tasks was blocked on meta-data reservation. By debugging, I found the reason of this bug: start transaction \| v reserve meta-data space \| v flush delay allocation -> iput inode -> evict inode ^ \| \| v wait for delay allocation flush <- reserve meta-data space And besides that, the flush on evicting inode will block the thread, which is reclaiming the memory, and make oom happen easily. Fix this bug by skipping the flush step when evicting inode. Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
2011-11-30	Fix URL of btrfs-progs git repository in docs	Arnd Hannemann	1	-2/+2
	The location of the btrfs-progs repository has been changed. This patch updates the documentation accordingly. Signed-off-by: Arnd Hannemann <arnd@arndnet.de>
2011-11-30	btrfs scrub: handle -ENOMEM from init_ipath()	Dan Carpenter	1	-0/+5
	init_ipath() can return an ERR_PTR(-ENOMEM). Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
2011-11-29	sky2: add bql support	stephen hemminger	1	-5/+13
	This adds support for byte queue limits and aggregates statistics update (suggestion from Eric). Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@drr.davemloft.net>
2011-11-29	bnx2x: handle iSCSI SD mode	Dmitry Kravkov	4	-9/+86
	in iSCSI SD mode to bnx2x device assigned single mac address which is supposted to be iscsi mac. If this mode is recognized bnx2x will disable LRO, decrease number of queues to 1 and rx ring size to the minumum allowed by FW, this in order minimize memory use. It will tranfer mac for iscsi usage and zero primary mac of the netdev. Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Reviewed-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-29	isdn: avoid copying too long drvid	Dan Carpenter	1	-0/+3
	"cfg->drvid" comes from the user so there is a possibility they didn't NUL terminate it properly. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-29	isdn: make sure strings are null terminated	Dan Carpenter	1	-0/+6
	These strings come from the user. We strcpy() them inside cf_command() so we should check that they are NULL terminated and return an error if not. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-29	can: cc770: legacy CC770 ISA bus driver	Wolfgang Grandegger	3	-0/+349
	This patch adds support for legacy Bosch CC770 and Intel AN82527 CAN controllers on the ISA or PC-104 bus. The I/O port or memory address and the IRQ number must be specified via module parameters: insmod cc770_isa.ko port=0x310,0x380 irq=7,11 for ISA devices using I/O ports or: insmod cc770_isa.ko mem=0xd1000,0xd1000 irq=7,11 for memory mapped ISA devices. Indirect access via address and data port is supported as well: insmod cc770_isa.ko port=0x310,0x380 indirect=1 irq=7,11 Furthermore, the following mode parameter can be defined: clk: External oscillator clock frequency (default=16000000 [16 MHz]) cir: CPU interface register (default=0x40 [CPU_DSC]) ocr, Bus configuration register (default=0x00) cor, Clockout register (default=0x00) Note: for clk, cir, bcr and cor, the first argument re-defines the default for all other devices, e.g.: insmod cc770_isa.ko mem=0xd1000,0xd1000 irq=7,11 clk=24000000 is equivalent to insmod cc770_isa.ko mem=0xd1000,0xd1000 irq=7,11 clk=24000000,24000000 Signed-off-by: Wolfgang Grandegger <wg@grandegger.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-29	can: cc770: add driver core for the Bosch CC770 and Intel AN82527	Wolfgang Grandegger	6	-0/+1155
	Signed-off-by: Wolfgang Grandegger <wg@grandegger.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-29	net/smsc911x: Add regulator support	Robert Marklund	1	-11/+100
	Add some basic regulator support for the power pins, as needed by the ST-Ericsson Snowball platform that powers up the SMSC911 chip using an external regulator. Platforms that use regulators and the smsc911x and have no defined regulator for the smsc911x and claim complete regulator constraints with no dummy regulators will need to provide it, for example using a fixed voltage regulator. It appears that this may affect (apart from Ux500 Snowball) possibly these archs/machines that from some grep:s appear to define both CONFIG_SMSC911X and CONFIG_REGULATOR: - ARM Freescale mx3 and OMAP 2 plus, Raumfeld machines - Blackfin - Super-H Cc: Paul Mundt <lethal@linux-sh.org> Cc: linux-sh@vger.kernel.org Cc: Sascha Hauer <s.hauer@pengutronix.de> Cc: Tony Lindgren <tony@atomide.com> Cc: linux-omap@vger.kernel.org Cc: Mike Frysinger <vapier@gentoo.org> Cc: uclinux-dist-devel@blackfin.uclinux.org Reviewed-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: Robert Marklund <robert.marklund@stericsson.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-29	can: sja1000_isa: convert to platform driver to support x86_64 systems	Wolfgang Grandegger	2	-30/+65
	This driver is currently not supported on x86_64 systems because the "isa_driver" interface is used (CONFIG_ISA=y). To overcome this limitation, the driver is converted to a platform driver, similar to the serial 8250 driver. Signed-off-by: Wolfgang Grandegger <wg@grandegger.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-29	ipv4: remove useless codes in ipmr_device_event()	RongQing.Li	1	-3/+1
	Commit 7dc00c82 added a 'notify' parameter for vif_delete() to distinguish whether to unregister the device. When notify=1 means we does not need to unregister the device, so calling unregister_netdevice_many is useless. Signed-off-by: RongQing.Li <roy.qing.li@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-29	can: sja1000_isa: fix "limited range" compiler warnings	Wolfgang Grandegger	1	-12/+12
	This patch fixes the compiler warnings: "comparison is always false due to limited range of data type" by using "0xff" instead of "-1" for unsigned values. Signed-off-by: Wolfgang Grandegger <wg@grandegger.com> Acked-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-29	net: Fix skb_update_prio RCU usage.	Igor Maravic	1	-1/+1
	Change function rcu_dereference to rcu_dereference_bh to avoid warning [ INFO: suspicious RCU usage. ] ------------------------------- net/core/dev.c:2459 suspicious rcu_dereference_check() usage! because we are locking with rcu_read_lock_bh(); in function dev_queue_xmit(struct sk_buff *skb) Signed-off-by: Igor Maravic <igorm@etf.rs> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-29	netlabel: Fix build problems when IPv6 is not enabled	Paul Moore	1	-8/+14
	A recent fix to the the NetLabel code caused build problem with configurations that did not have IPv6 enabled; see below: netlabel_kapi.c: In function 'netlbl_cfg_unlbl_map_add': netlabel_kapi.c:165:4: error: implicit declaration of function 'netlbl_af6list_add' This patch fixes this problem by making the IPv6 specific code conditional on the IPv6 configuration flags as we done in the rest of NetLabel and the network stack as a whole. We have to move some variable declarations around as a result so things may not be quite as pretty, but at least it builds cleanly now. Some additional IPv6 conditionals were added to the NetLabel code as well for the sake of consistency. Reported-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Paul Moore <pmoore@redhat.com> Acked-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-29	IB: Fix RCU lockdep splats	Eric Dumazet	6	-14/+35
	Commit f2c31e32b37 ("net: fix NULL dereferences in check_peer_redir()") forgot to take care of infiniband uses of dst neighbours. Many thanks to Marc Aurele who provided a nice bug report and feedback. Reported-by: Marc Aurele La France <tsi@ualberta.ca> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: David Miller <davem@davemloft.net> Cc: <stable@kernel.org> Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-11-29	IB/ipoib: Prevent hung task or softlockup processing multicast response	Mike Marciniszyn	3	-8/+14
	This following can occur with ipoib when processing a multicast reponse: BUG: soft lockup - CPU#0 stuck for 67s! [ib_mad1:982] Modules linked in: ... CPU 0: Modules linked in: ... Pid: 982, comm: ib_mad1 Not tainted 2.6.32-131.0.15.el6.x86_64 #1 ProLiant DL160 G5 RIP: 0010:[<ffffffff814ddb27>] [<ffffffff814ddb27>] _spin_unlock_irqrestore+0x17/0x20 RSP: 0018:ffff8802119ed860 EFLAGS: 00000246 0000000000000004 RBX: ffff8802119ed860 RCX: 000000000000a299 RDX: ffff88021086c700 RSI: 0000000000000246 RDI: 0000000000000246 RBP: ffffffff8100bc8e R08: ffff880210ac229c R09: 0000000000000000 R10: ffff88021278aab8 R11: 0000000000000000 R12: ffff8802119ed860 R13: ffffffff8100be6e R14: 0000000000000001 R15: 0000000000000003 FS: 0000000000000000(0000) GS:ffff880028200000(0000) knlGS:0000000000000000 CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b CR2: 00000000006d4840 CR3: 0000000209aa5000 CR4: 00000000000406f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Call Trace: [<ffffffffa032c247>] ? ipoib_mcast_send+0x157/0x480 [ib_ipoib] [<ffffffff8100bc8e>] ? apic_timer_interrupt+0xe/0x20 [<ffffffff8100bc8e>] ? apic_timer_interrupt+0xe/0x20 [<ffffffffa03283d4>] ? ipoib_path_lookup+0x124/0x2d0 [ib_ipoib] [<ffffffffa03286fc>] ? ipoib_start_xmit+0x17c/0x430 [ib_ipoib] [<ffffffff8141e758>] ? dev_hard_start_xmit+0x2c8/0x3f0 [<ffffffff81439d0a>] ? sch_direct_xmit+0x15a/0x1c0 [<ffffffff81423098>] ? dev_queue_xmit+0x388/0x4d0 [<ffffffffa032d6b7>] ? ipoib_mcast_join_finish+0x2c7/0x510 [ib_ipoib] [<ffffffffa032dab8>] ? ipoib_mcast_sendonly_join_complete+0x1b8/0x1f0 [ib_ipoib] [<ffffffffa02a0946>] ? mcast_work_handler+0x1a6/0x710 [ib_sa] [<ffffffffa015f01e>] ? ib_send_mad+0xfe/0x3c0 [ib_mad] [<ffffffffa00f6c93>] ? ib_get_cached_lmc+0xa3/0xb0 [ib_core] [<ffffffffa02a0f9b>] ? join_handler+0xeb/0x200 [ib_sa] [<ffffffffa029e4fc>] ? ib_sa_mcmember_rec_callback+0x5c/0xa0 [ib_sa] [<ffffffffa029e79c>] ? recv_handler+0x3c/0x70 [ib_sa] [<ffffffffa01603a4>] ? ib_mad_completion_handler+0x844/0x9d0 [ib_mad] [<ffffffffa015fb60>] ? ib_mad_completion_handler+0x0/0x9d0 [ib_mad] [<ffffffff81088830>] ? worker_thread+0x170/0x2a0 [<ffffffff8108e160>] ? autoremove_wake_function+0x0/0x40 [<ffffffff810886c0>] ? worker_thread+0x0/0x2a0 [<ffffffff8108ddf6>] ? kthread+0x96/0xa0 [<ffffffff8100c1ca>] ? child_rip+0xa/0x20 Coinciding with stack trace is the following message: ib0: ib_address_create failed The code below in ipoib_mcast_join_finish() will note the above failure in the address handle but otherwise continue: ah = ipoib_create_ah(dev, priv->pd, &av); if (!ah) { ipoib_warn(priv, "ib_address_create failed\n"); } else { The while loop at the bottom of ipoib_mcast_join_finish() will attempt to send queued multicast packets in mcast->pkt_queue and eventually end up in ipoib_mcast_send(): if (!mcast->ah) { if (skb_queue_len(&mcast->pkt_queue) < IPOIB_MAX_MCAST_QUEUE) skb_queue_tail(&mcast->pkt_queue, skb); else { ++dev->stats.tx_dropped; dev_kfree_skb_any(skb); } My read is that the code will requeue the packet and return to the ipoib_mcast_join_finish() while loop and the stage is set for the "hung" task diagnostic as the while loop never sees a non-NULL ah, and will do nothing to resolve. There are GFP_ATOMIC allocates in the provider routines, so this is possible and should be dealt with. The test that induced the failure is associated with a host SM on the same server during a shutdown. This patch causes ipoib_mcast_join_finish() to exit with an error which will flush the queued mcast packets. Nothing is done to unwind the QP attached state so that subsequent sends from above will retry the join. Reviewed-by: Ram Vepa <ram.vepa@qlogic.com> Reviewed-by: Gary Leshner <gary.leshner@qlogic.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@qlogic.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-11-29	sctp: better integer overflow check in sctp_auth_create_key()	Xi Wang	1	-1/+1
	The check from commit 30c2235c is incomplete and cannot prevent cases like key_len = 0x80000000 (INT_MAX + 1). In that case, the left-hand side of the check (INT_MAX - key_len), which is unsigned, becomes 0xffffffff (UINT_MAX) and bypasses the check. However this shouldn't be a security issue. The function is called from the following two code paths: 1) setsockopt() 2) sctp_auth_asoc_set_secret() In case (1), sca_keylength is never going to exceed 65535 since it's bounded by a u16 from the user API. As such, the key length will never overflow. In case (2), sca_keylength is computed based on the user key (1 short) and 2 * key_vector (3 shorts) for a total of 7 * USHRT_MAX, which still will not overflow. In other words, this overflow check is not really necessary. Just make it more correct. Signed-off-by: Xi Wang <xi.wang@gmail.com> Cc: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-29	sch_choke: use skb_flow_dissect()	Eric Dumazet	1	-89/+31
	Instead of using a custom flow dissector, use skb_flow_dissect() and benefit from tunnelling support. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-29	sch_sfq: use skb_flow_dissect()	Eric Dumazet	1	-55/+10
	Instead of using a custom flow dissector, use skb_flow_dissect() and benefit from tunnelling support. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-29	tcp: avoid frag allocation for small frames	Eric Dumazet	1	-3/+6
	tcp_sendmsg() uses select_size() helper to choose skb head size when a new skb must be allocated. If GSO is enabled for the socket, current strategy is to force all payload data to be outside of headroom, in PAGE fragments. This strategy is not welcome for small packets, wasting memory. Experiments show that best results are obtained when using 2048 bytes for skb head (This includes the skb overhead and various headers) This patch provides better len/truesize ratios for packets sent to loopback device, and reduce memory needs for in-flight loopback packets, particularly on arches with big pages. If a sender sends many 1-byte packets to an unresponsive application, receiver rmem_alloc will grow faster and will stop queuing these packets sooner, or will collapse its receive queue to free excess memory. netperf -t TCP_RR results are improved by ~4 %, and many workloads are improved as well (tbench, mysql...) Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-29	flow_dissector: use a 64bit load/store	Eric Dumazet	2	-2/+12
	Le lundi 28 novembre 2011 à 19:06 -0500, David Miller a écrit : > From: Dimitris Michailidis <dm@chelsio.com> > Date: Mon, 28 Nov 2011 08:25:39 -0800 > > >> +bool skb_flow_dissect(const struct sk_buff skb, struct flow_keys > >> flow) > >> +{ > >> + int poff, nhoff = skb_network_offset(skb); > >> + u8 ip_proto; > >> + u16 proto = skb->protocol; > > > > __be16 instead of u16 for proto? > > I'll take care of this when I apply these patches. ( CC trimmed ) Thanks David ! Here is a small patch to use one 64bit load/store on x86_64 instead of two 32bit load/stores. [PATCH net-next] flow_dissector: use a 64bit load/store gcc compiler is smart enough to use a single load/store if we memcpy(dptr, sptr, 8) on x86_64, regardless of CONFIG_CC_OPTIMIZE_FOR_SIZE In IP header, daddr immediately follows saddr, this wont change in the future. We only need to make sure our flow_keys (src,dst) fields wont break the rule. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-29	sfc: Support for byte queue limits	Tom Herbert	1	-6/+21
	Changes to sfc to use byte queue limits. Signed-off-by: Tom Herbert <therbert@google.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>