aboutsummaryrefslogtreecommitdiffstats
AgeCommit message (Collapse)AuthorFilesLines
2007-04-25[TCP]: Make snd_cwnd_clamp a u32.David S. Miller1-1/+1
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25[TCP]: Keep copied_seq, rcv_wup and rcv_next together.Eric Dumazet2-6/+4
I noticed in oprofile study a cache miss in tcp_rcv_established() to read copied_seq. ffffffff80400a80 <tcp_rcv_established>: /* tcp_rcv_established total: 4034293   2.0400 */  55493  0.0281 :ffffffff80400bc9:   mov    0x4c8(%r12),%eax copied_seq 543103  0.2746 :ffffffff80400bd1:   cmp    0x3e0(%r12),%eax   rcv_nxt     if (tp->copied_seq == tp->rcv_nxt &&         len - tcp_header_len <= tp->ucopy.len) { In this function, the cache line 0x4c0 -> 0x500 is used only for this reading 'copied_seq' field. rcv_wup and copied_seq should be next to rcv_nxt field, to lower number of active cache lines in hot paths. (tcp_rcv_established(), tcp_poll(), ...) As you suggested, I changed tcp_create_openreq_child() so that these fields are changed together, to avoid adding a new store buffer stall. Patch is 64bit friendly (no new hole because of alignment constraints) Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25[TCP]: struct *sock argument renamed: sp -> skIlpo Järvinen1-11/+11
In general, TCP code uses "sk" for struct sock pointer. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25[TCP]: Add RFC3742 Limited Slow-Start, controlled by variable sysctl_tcp_max_ssthresh.John Heffner4-9/+32
Signed-off-by: John Heffner <jheffner@psc.edu> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25[TCP] YeAH-TCP: algorithm implementationAngelo P. Castellani4-0/+437
YeAH-TCP is a sender-side high-speed enabled TCP congestion control algorithm, which uses a mixed loss/delay approach to compute the congestion window. It's design goals target high efficiency, internal, RTT and Reno fairness, resilience to link loss while keeping network elements load as low as possible. For further details look here: http://wil.cs.caltech.edu/pfldnet2007/paper/YeAH_TCP.pdf Signed-off-by: Angelo P. Castellani <angelo.castellani@gmail.con> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25[TCP] FRTO: Sysctl documentation for SACK enhanced versionIlpo Järvinen1-1/+4
The description is overly verbose to avoid ambiguity between "SACK enabled" and "SACK enhanced FRTO" Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25[TCP]: SACK enhanced FRTOIlpo Järvinen1-11/+65
Implements the SACK-enhanced FRTO given in RFC4138 using the variant given in Appendix B. RFC4138, Appendix B: "This means that in order to declare timeout spurious, the TCP sender must receive an acknowledgment for non-retransmitted segment between SND.UNA and RecoveryPoint in algorithm step 3. RecoveryPoint is defined in conservative SACK-recovery algorithm [RFC3517]" The basic version of the FRTO algorithm can still be used also when SACK is enabled. To enabled SACK-enhanced version, tcp_frto sysctl is set to 2. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25[TCP]: Prevent reordering adjustments during FRTOIlpo Järvinen1-1/+2
To be honest, I'm not too sure how the reord stuff works in the first place but this seems necessary. When FRTO has been active, the one and only retransmission could be unnecessary but the state and sending order might not be what the sacktag code expects it to be (to work correctly). Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25[TCP] FRTO: Fake cwnd for ssthresh callbackIlpo Järvinen1-1/+25
TCP without FRTO would be in Loss state with small cwnd. FRTO, however, leaves cwnd (typically) to a larger value which causes ssthresh to become too large in case RTO is triggered again compared to what conventional recovery would do. Because consecutive RTOs result in only a single ssthresh reduction, RTO+cumulative ACK+RTO pattern is required to trigger this event. A large comment is included for congestion control module writers trying to figure out what CA_EVENT_FRTO handler should do because there exists a remote possibility of incompatibility between FRTO and module defined ssthresh functions. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25[TCP] FRTO: Reverse RETRANS bit clearing logicIlpo Järvinen1-12/+23
Previously RETRANS bits were cleared on the entry to FRTO. We postpone that into tcp_enter_frto_loss, which is really the place were the clearing should be done anyway. This allows simplification of the logic from a clearing loop to the head skb clearing only. Besides, the other changes made in the previous patches to tcp_use_frto made it impossible for the non-SACKed FRTO to be entered if other than the head has been rexmitted. With SACK-enhanced FRTO (and Appendix B), however, there can be a number retransmissions in flight when RTO expires (same thing could happen before this patchset also with non-SACK FRTO). To not introduce any jumpiness into the packet counting during FRTO, instead of clearing RETRANS bits from skbs during entry, do it later on. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25[TCP] FRTO: Entry is allowed only during (New)Reno like recoveryIlpo Järvinen2-5/+22
This interpretation comes from RFC4138: "If the sender implements some loss recovery algorithm other than Reno or NewReno [FHG04], the F-RTO algorithm SHOULD NOT be entered when earlier fast recovery is underway." I think the RFC means to say (especially in the light of Appendix B) that ...recovery is underway (not just fast recovery) or was underway when it was interrupted by an earlier (F-)RTO that hasn't yet been resolved (snd_una has not advanced enough). Thus, my interpretation is that whenever TCP has ever retransmitted other than head, basic version cannot be used because then the order assumptions which are used as FRTO basis do not hold. NewReno has only the head segment retransmitted at a time. Therefore, walk up to the segment that has not been SACKed, if that segment is not retransmitted nor anything before it, we know for sure, that nothing after the non-SACKed segment should be either. This assumption is valid because TCPCB_EVER_RETRANS does not leave holes but each non-SACKed segment is rexmitted in-order. Check for retrans_out > 1 avoids more expensive walk through the skb list, as we can know the result beforehand: F-RTO will not be allowed. SACKed skb can turn into non-SACked only in the extremely rare case of SACK reneging, in this case we might fail to detect retransmissions if there were them for any other than head. To get rid of that feature, whole rexmit queue would have to be walked (always) or FRTO should be prevented when SACK reneging happens. Of course RTO should still trigger after reneging which makes this issue even less likely to show up. And as long as the response is as conservative as it's now, nothing bad happens even then. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25[TCP]: Prevent unrelated cwnd adjustment while using FRTOIlpo Järvinen1-7/+11
FRTO controls cwnd when it still processes the ACK input or it has just reverted back to conventional RTO recovery; the normal rules apply when FRTO has reverted to standard congestion control. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25[TCP] FRTO: frto_counter modulo-op converted to two assignmentsIlpo Järvinen1-2/+2
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25[TCP]: Don't enter to fast recovery while using FRTOIlpo Järvinen1-0/+4
Because TCP is not in Loss state during FRTO recovery, fast recovery could be triggered by accident. Non-SACK FRTO is more robust than not yet included SACK-enhanced version (that can receiver high number of duplicate ACKs with SACK blocks during FRTO), at least with unidirectional transfers, but under extraordinary patterns fast recovery can be incorrectly triggered, e.g., Data loss+ACK losses => cumulative ACK with enough SACK blocks to meet sacked_out >= dupthresh condition). Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25[TCP] FRTO: Response should reset also snd_cwnd_cntIlpo Järvinen1-0/+1
Since purpose is to reduce CWND, we prevent immediate growth. This is not a major issue nor is "the correct way" specified anywhere. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25[TCP] FRTO: fixes fallback to conventional recoveryIlpo Järvinen1-5/+9
The FRTO detection did not care how ACK pattern affects to cwnd calculation of the conventional recovery. This caused incorrect setting of cwnd when the fallback becames necessary. The knowledge tcp_process_frto() has about the incoming ACK is now passed on to tcp_enter_frto_loss() in allowed_segments parameter that gives the number of segments that must be added to packets-in-flight while calculating the new cwnd. Instead of snd_una we use FLAG_DATA_ACKED in duplicate ACK detection because RFC4138 states (in Section 2.2): If the first acknowledgment after the RTO retransmission does not acknowledge all of the data that was retransmitted in step 1, the TCP sender reverts to the conventional RTO recovery. Otherwise, a malicious receiver acknowledging partial segments could cause the sender to declare the timeout spurious in a case where data was lost. If the next ACK after RTO is duplicate, we do not retransmit anything, which is equal to what conservative conventional recovery does in such case. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25[TCP] FRTO: Ignore some uninteresting ACKsIlpo Järvinen1-3/+10
Handles RFC4138 shortcoming (in step 2); it should also have case c) which ignores ACKs that are not duplicates nor advance window (opposite dir data, winupdate). Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25[TCP] FRTO: Use Disorder state during operation instead of OpenIlpo Järvinen1-3/+3
Retransmission counter assumptions are to be changed. Forcing reason to do this exist: Using sysctl in check would be racy as soon as FRTO starts to ignore some ACKs (doing that in the following patches). Userspace may disable it at any moment giving nice oops if timing is right. frto_counter would be inaccessible from userspace, but with SACK enhanced FRTO retrans_out can include other than head, and possibly leaving it non-zero after spurious RTO, boom again. Luckily, solution seems rather simple: never go directly to Open state but use Disorder instead. This does not really change much, since TCP could anyway change its state to Disorder during FRTO using path tcp_fastretrans_alert -> tcp_try_to_open (e.g., when a SACK block makes ACK dubious). Besides, Disorder seems to be the state where TCP should be if not recovering (in Recovery or Loss state) while having some retransmissions in-flight (see tcp_try_to_open), which is exactly what happens with FRTO. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25[TCP] FRTO: Consecutive RTOs keep prior_ssthresh and ssthreshIlpo Järvinen1-6/+14
In case a latency spike causes more than one RTO, the later should not cause the already reduced ssthresh to propagate into the prior_ssthresh since FRTO declares all such RTOs spurious at once or none of them. In treating of ssthresh, we mimic what tcp_enter_loss() does. The previous state (in frto_counter) must be available until we have checked it in tcp_enter_frto(), and also ACK information flag in process_frto(). Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25[TCP] FRTO: Comment cleanup & improvementIlpo Järvinen1-17/+32
Moved comments out from the body of process_frto() to the head (preferred way; see Documentation/CodingStyle). Bonus: it's much easier to read in this compacted form. FRTO algorithm and implementation is described in greater detail. For interested reader, more information is available in RFC4138. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25[TCP] FRTO: Moved tcp_use_frto from tcp.h to tcp_input.cIlpo Järvinen2-13/+14
In addition, removed inline. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25[TCP] FRTO: Separated response from FRTO detection algorithmIlpo Järvinen1-6/+10
FRTO spurious RTO detection algorithm (RFC4138) does not include response to a detected spurious RTO but can use different response algorithms. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25[TCP] FRTO: Incorrectly clears TCPCB_EVER_RETRANS bitIlpo Järvinen1-1/+1
FRTO was slightly too brave... Should only clear TCPCB_SACKED_RETRANS bit. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25Linux 2.6.21Linus Torvalds1-1/+1
.. ok, enough waffling about it already. "Just do it!" Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-04-25Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6Linus Torvalds2-5/+8
* master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6: [PARPORT] SUNBPP: Fix OOPS when debugging is enabled. [SPARC] openprom: Switch to ref counting PCI API
2007-04-25Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds1-1/+7
* master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: [NETLINK]: Infinite recursion in netlink.
2007-04-25packet: fix error handlingAndrew Morton1-1/+2
The packet driver is assuming (reasonably) that the (undocumented) request.errors is an errno. But it is in fact some mysterious bitfield. When things go wrong we return weird positive numbers to the VFS as pointers and it goes oops. Thanks to William Heimbigner for reporting and diagnosis. (It doesn't oops, but this driver still doesn't work for William) Cc: William Heimbigner <icxcnika@mar.tar.cc> Cc: Peter Osterlund <petero2@telia.com> Cc: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-04-25[NETLINK]: Infinite recursion in netlink.Alexey Kuznetsov1-1/+7
Reply to NETLINK_FIB_LOOKUP messages were misrouted back to kernel, which resulted in infinite recursion and stack overflow. The bug is present in all kernel versions since the feature appeared. The patch also makes some minimal cleanup: 1. Return something consistent (-ENOENT) when fib table is missing 2. Do not crash when queue is empty (does not happen, but yet) 3. Put result of lookup Signed-off-by: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25cfq-iosched: fix alias + front merge bugJens Axboe1-6/+6
There's a really rare and obscure bug in CFQ, that causes a crash in cfq_dispatch_insert() due to rq == NULL. One example of the resulting oops is seen here: http://lkml.org/lkml/2007/4/15/41 Neil correctly diagnosed the situation for how this can happen: if two concurrent requests with the exact same sector number (due to direct IO or aliasing between MD and the raw device access), the alias handling will add the request to the sortlist, but next_rq remains NULL. Read the more complete analysis at: http://lkml.org/lkml/2007/4/25/57 This looks like it requires md to trigger, even though it should potentially be possible to due with O_DIRECT (at least if you edit the kernel and doctor some of the unplug calls). The fix is to move the ->next_rq update to when we add a request to the rbtree. Then we remove the possibility for a request to exist in the rbtree code, but not have ->next_rq correctly updated. Signed-off-by: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-04-24IPv6: fix Routing Header Type 0 handling thinkoYOSHIFUJI Hideaki1-1/+1
Oops, thinko. The test for accempting a RH0 was exatly the wrong way around. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-04-24Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds9-20/+79
* master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: [BNX2]: Fix occasional NETDEV WATCHDOG on 5709. [IPV6]: Disallow RH0 by default. [XFRM]: beet: fix pseudo header length value [TCP]: Congestion control initialization.
2007-04-24[BNX2]: Fix occasional NETDEV WATCHDOG on 5709.Michael Chan2-2/+6
Tweak a register setting to prevent the tx mailbox from halting. Update version to 1.5.8. Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-24[IPV6]: Disallow RH0 by default.YOSHIFUJI Hideaki5-6/+58
A security issue is emerging. Disallow Routing Header Type 0 by default as we have been doing for IPv4. Note: We allow RH2 by default because it is harmless. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-24[MIPS] Fix oprofile logic to physical counter remappingRalf Baechle1-1/+1
This did cause oprofile to fail on non-multithreaded systems with more than 2 processors such as the BCM1480. Reported by Manish Lachwani (mlachwani@mvista.com). Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2007-04-24Merge branch 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6Linus Torvalds5-33/+40
* 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6: drivers/net/hamradio/baycom_ser_fdx build fix usb-net/pegasus: fix pegasus carrier detection sis900: Allocate rx replacement buffer before rx operation [netdrvr] depca: handle platform_device_add() failure
2007-04-24drivers/net/hamradio/baycom_ser_fdx build fixAndrew Morton1-2/+4
sparc64: drivers/net/hamradio/baycom_ser_fdx.c: In function `ser12_open': drivers/net/hamradio/baycom_ser_fdx.c:417: error: `NR_IRQS' undeclared (first us e in this function) drivers/net/hamradio/baycom_ser_fdx.c:417: error: (Each undeclared identifier is reported only once drivers/net/hamradio/baycom_ser_fdx.c:417: error: for each function it appears i n.) Cc: Folkert van Heusden <folkert@vanheusden.com> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-04-24usb-net/pegasus: fix pegasus carrier detectionDan Williams2-6/+14
Broken by 4a1728a28a193aa388900714bbb1f375e08a6d8e which switched the return semantics of read_mii_word() but didn't fix usage of read_mii_word() to conform to the new semantics. Setting carrier to off based on the NO_CARRIER flag is also incorrect as that flag only triggers on TX failure and therefore isn't correct when no frames are being transmitted. Since there is already a 2*HZ MII carrier check going on, defer to that. Add a TRUST_LINK_STATUS feature flag for adapters where the LINK_STATUS flag is actually correct, and use that rather than the NO_CARRIER flag. Signed-off-by: Dan Williams <dcbw@redhat.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-04-24sis900: Allocate rx replacement buffer before rx operationNeil Horman1-24/+20
The sis900 driver appears to have a bug in which the receive routine passes the skbuff holding the received frame to the network stack before refilling the buffer in the rx ring. If a new skbuff cannot be allocated, the driver simply leaves a hole in the rx ring, which causes the driver to stop receiving frames and become non-recoverable without an rmmod/insmod according to reporters. This patch reverses that order, attempting to allocate a replacement buffer first, and receiving the new frame only if one can be allocated. If no skbuff can be allocated, the current skbuf in the rx ring is recycled, dropping the current frame, but keeping the NIC operational. Signed-off-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-04-24[netdrvr] depca: handle platform_device_add() failureAndrea Righi1-1/+2
The following patch fixes a kernel bug in depca_platform_probe(). We don't use a dynamic pointer for pldev->dev.platform_data, so it seems that the correct way to proceed if platform_device_add(pldev) fails is to explicitly set the pldev->dev.platform_data pointer to NULL, before calling the platform_device_put(pldev), or it will be kfree'ed by platform_device_release(). Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-04-24Merge branch 'for-linus' of git://one.firstfloor.org/home/andi/git/linux-2.6Linus Torvalds5-29/+12
* 'for-linus' of git://one.firstfloor.org/home/andi/git/linux-2.6: [PATCH] i386: Fix some warnings added by earlier patch [PATCH] x86-64: Always flush all pages in change_page_attr [PATCH] x86: Remove noreplacement option [PATCH] x86-64: make GART PTEs uncacheable
2007-04-24Merge master.kernel.org:/pub/scm/linux/kernel/git/bart/ide-2.6Linus Torvalds1-32/+13
* master.kernel.org:/pub/scm/linux/kernel/git/bart/ide-2.6: Revert "adjust legacy IDE resource setting (v2)"
2007-04-248250: fix possible deadlock between serial8250_handle_port() and serial8250_interrupt()Jiri Kosina1-2/+3
Commit 40b36daa introduced possibility that serial8250_backup_timeout() -> serial8250_handle_port() locks port.lock without disabling irqs, thus allowing deadlock against interrupt handler (port.lock is acquired in serial8250_interrupt()). Spotted by lockdep. Signed-off-by: Jiri Kosina <jkosina@suse.cz> Cc: Dave Jones <davej@codemonkey.org.uk> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Alex Williamson <alex.williamson@hp.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-04-24fault injection: add entry to MAINTAINERSAkinobu Mita1-0/+5
Add maintainer for fault injection support. Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-04-24Char: icom, mark __init as __devinitJiri Slaby1-2/+2
Two functions are called from __devinit context, but they are marked as __init. Fix this. Signed-off-by: Jiri Slaby <jirislaby@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-04-24reiserfs: fix xattr root locking/refcount bugJeff Mahoney1-68/+24
The listxattr() and getxattr() operations are only protected by a read lock. As a result, if either of these operations run in parallel, a race condition exists where the xattr_root will end up being cached twice, which results in the leaking of a reference and a BUG() on umount. This patch refactors get_xa_root(), __get_xa_root(), and create_xa_root(), into one get_xa_root() function that takes the appropriate locking around the entire critical section. Reported, diagnosed and tested by Andrea Righi <a.righi@cineca.it> Signed-off-by: Jeff Mahoney <jeffm@suse.com> Cc: Andrea Righi <a.righi@cineca.it> Cc: "Vladimir V. Saveliev" <vs@namesys.com> Cc: Edward Shishkin <edward@namesys.com> Cc: Alex Zarochentsev <zam@namesys.com> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-04-24hwmon/w83627ehf: Don't redefine REGION_OFFSETJean Delvare1-7/+7
On ia64, kernel headers define REGION_OFFSET so we can't use that. Reported by Andrew Morton. Signed-off-by: Jean Delvare <khali@linux-fr.org> Acked-by: David Hubbard <david.c.hubbard@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-04-24do not truncate irq number for icom adapterOlaf Hering2-4/+2
irq values are u32, not u8. Large irq numbers will be truncated, free_irq may free a different irq. Remove incorrectly sized struct member and use the one from pci_dev. Signed-off-by: Olaf Hering <olaf@aepfle.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-04-24Allow reading tainted flag as userBastian Blank1-1/+1
The commit 34f5a39899f3f3e815da64f48ddb72942d86c366 restricted reading of the tainted value. The attached patch changes this back to a write-only check and restores the read behaviour of older versions. Signed-off-by: Bastian Blank <bastian@waldi.eu.org> Cc: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-04-24acpi-thermal: fix mod_timer() intervalAndrew Morton1-1/+2
Use relative time, not absolute. Discovered by Jung-Ik (John) Lee <jilee@google.com>. Cc: Jung-Ik (John) Lee <jilee@google.com> Acked-by: Len Brown <lenb@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-04-24v9fs: don't use primary fid when removing fileLatchesar Ionkov1-1/+1
v9fs_insert uses v9fs_fid_lookup (which also locks the fid) to get the primary fid associated with the dentry and destroys the v9fs_fid struct after removing the file. If another process called v9fs_fid_lookup on the same dentry, it may wait undefinitely for the fid's lock (as the struct is freed). This patch changes v9fs_remove to use a cloned fid, so the primary fid is not locked and freed. Signed-off-by: Latchesar Ionkov <lucho@ionkov.net> Cc: Eric Van Hensbergen <ericvh@hera.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>