aboutsummaryrefslogtreecommitdiffstats
path: root/net/ipv6 (follow)
AgeCommit message (Collapse)AuthorFilesLines
2008-01-20[IPV6] ROUTE: Make sending algorithm more friendly with RFC 4861.YOSHIFUJI Hideaki1-3/+8
We omit (or delay) sending NSes for known-to-unreachable routers (in NUD_FAILED state) according to RFC 4191 (Default Router Preferences and More-Specific Routes). But this is not fully compatible with RFC 4861 (Neighbor Discovery Protocol for IPv6), which does not remember unreachability of neighbors. So, let's avoid mixing sending algorithm of RFC 4191 and that of RFC 4861, and make the algorithm more friendly with RFC 4861 if RFC 4191 is disabled. Issue was found by IPv6 Ready Logo Core Self_Test 1.5.0b2 (by TAHI Project), and has been tracked down by Mitsuru Chinen <mitch@linux.vnet.ibm.com>. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-20[IPV6]: Mischecked tw match in __inet6_check_established.Pavel Emelyanov1-1/+1
When looking for a conflicting connection the !sk->sk_bound_dev_if check is performed only for live sockets, but not for timewait-ed. This is not the case for ipv4, for __inet6_lookup_established in both ipv4 and ipv6 and for other places that check for tw-s. Was this missed accidentally? If so, then this patch fixes it and besides makes use if the dif variable declared in the function. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-10[NETFILTER]: ip6t_eui64: Fixes calculation of Universal/Local bitYasuyuki Kozakai1-1/+1
RFC2464 says that the next to lowerst order bit of the first octet of the Interface Identifier is formed by complementing the Universal/Local bit of the EUI-64. But ip6t_eui64 uses OR not XOR. Thanks Peter Ivancik for reporing this bug and posting a patch for it. Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-08[IPV6]: IPV6_MULTICAST_IF setting is ignored on link-local connect()Brian Haley1-3/+3
Signed-off-by: Brian Haley <brian.haley@hp.com> Acked-by: David L Stevens <dlstevens@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-12-20[IPV6]: Spelling fixesJoe Perches1-1/+1
Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-12-16[IPV6]: Fix the return value of ipv6_getsockoptWei Yongjun1-7/+5
If CONFIG_NETFILTER if not selected when compile the kernel source code, ipv6_getsockopt will returen an EINVAL error if optname is not supported by the kernel. But if CONFIG_NETFILTER is selected, ENOPROTOOPT error will be return. This patch fix to always return ENOPROTOOPT error if optname argument of ipv6_getsockopt is not supported by the kernel. Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-12-11[IPv6] ESP: Discard dummy packets introduced in rfc4303Thomas Graf1-0/+6
RFC4303 introduces dummy packets with a nexthdr value of 59 to implement traffic confidentiality. Such packets need to be dropped silently and the payload may not be attempted to be parsed as it consists of random chunk. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-12-11[IPV6] XFRM: Fix auditing rt6i_flags; use RTF_xxx flags instead of RTCF_xxx.YOSHIFUJI Hideaki1-1/+1
RTCF_xxx flags, defined in include/linux/in_route.h) are available for IPv4 route (rtable) entries only. Use RTF_xxx flags instead, defined in include/linux/ipv6_route.h, for IPv6 route entries (rt6_info). Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-12-07[IPv6] SNMP: Increment OutNoRoutes when connecting to unreachable networkMitsuru Chinen1-0/+2
IPv6 stack doesn't increment OutNoRoutes counter when IP datagrams is being discarded because no route could be found to transmit them to their destination. IPv6 stack should increment the counter. Incidentally, IPv4 stack increments that counter in such situation. Signed-off-by: Mitsuru Chinen <mitch@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-30[IPV6]: Restore IPv6 when MTU is big enoughEvgeniy Polyakov1-1/+10
Avaid provided test application, so bug got fixed. IPv6 addrconf removes ipv6 inner device from netdev each time cmu changes and new value is less than IPV6_MIN_MTU (1280 bytes). When mtu is changed and new value is greater than IPV6_MIN_MTU, it does not add ipv6 addresses and inner device bac. This patch fixes that. Tested with Avaid's application, which works ok now. Signed-off-by: Evgeniy Polyakov <johnpol@2ka.mipt.ru> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2007-11-20[IPV6] TCPMD5: Fix deleting key operation.YOSHIFUJI Hideaki1-4/+2
Due to the bug, refcnt for md5sig pool was leaked when an user try to delete a key if we have more than one key. In addition to the leakage, we returned incorrect return result value for userspace. This fix should close Bug #9418, reported by <ming-baini@163.com>. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-20[IPV6] TCPMD5: Check return value of tcp_alloc_md5sig_pool().YOSHIFUJI Hideaki1-1/+4
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-19[IPV6]: Add missing "space"Joe Perches1-1/+1
Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-12[IPV6]: Add ifindex field to ND user option messages.Pierre Ynard1-0/+1
Userland neighbor discovery options are typically heavily involved with the interface on which thay are received: add a missing ifindex field to the original struct. Thanks to Rémi Denis-Courmont. Signed-off-by: Pierre Ynard <linkfanel@yahoo.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-10[INET]: Small possible memory leak in FIB rulesDenis V. Lunev1-22/+15
This patch fixes a small memory leak. Default fib rules can be deleted by the user if the rule does not carry FIB_RULE_PERMANENT flag, f.e. by ip rule flush Such a rule will not be freed as the ref-counter has 2 on start and becomes clearly unreachable after removal. Signed-off-by: Denis V. Lunev <den@openvz.org> Acked-by: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-10[NET]: Make helper to get dst entry and "use" itPavel Emelyanov1-5/+1
There are many places that get the dst entry, increase the __use counter and set the "lastuse" time stamp. Make a helper for this. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-07[INET]: Remove per bucket rwlock in tcp/dccp ehash table.Eric Dumazet1-9/+10
As done two years ago on IP route cache table (commit 22c047ccbc68fa8f3fa57f0e8f906479a062c426) , we can avoid using one lock per hash bucket for the huge TCP/DCCP hash tables. On a typical x86_64 platform, this saves about 2MB or 4MB of ram, for litle performance differences. (we hit a different cache line for the rwlock, but then the bucket cache line have a better sharing factor among cpus, since we dirty it less often). For netstat or ss commands that want a full scan of hash table, we perform fewer memory accesses. Using a 'small' table of hashed rwlocks should be more than enough to provide correct SMP concurrency between different buckets, without using too much memory. Sizing of this table depends on num_possible_cpus() and various CONFIG settings. This patch provides some locking abstraction that may ease a future work using a different model for TCP/DCCP table. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-07[IPSEC]: Fix crypto_alloc_comp error checkingHerbert Xu1-1/+2
The function crypto_alloc_comp returns an errno instead of NULL to indicate error. So it needs to be tested with IS_ERR. This is based on a patch by Vicenç Beltran Querol. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-07[IPV6]: Convert /proc/net/ipv6_route to seq_file interfaceAlexey Dobriyan1-62/+29
This removes last proc_net_create() user. Kudos to Benjamin Thery and Stephen Hemminger for comments on previous version. Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-07[IPV6]: Use the {DEFINE|REF}_PROTO_INUSE infrastructureEric Dumazet4-0/+12
Trivial patch to make "tcpv6,udpv6,udplitev6,rawv6" protocols uses the fast "inuse sockets" infrastructure Each protocol use then a static percpu var, instead of a dynamic one. This saves some ram and some cpu cycles Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-07[NET]: Define infrastructure to keep 'inuse' changes in an efficent SMP/NUMA way.Eric Dumazet1-15/+4
"struct proto" currently uses an array stats[NR_CPUS] to track change on 'inuse' sockets per protocol. If NR_CPUS is big, this means we use a big memory area for this. Moreover, all this memory area is located on a single node on NUMA machines, increasing memory pressure on the boot node. In this patch, I tried to : - Keep a fast !CONFIG_SMP implementation - Keep a fast CONFIG_SMP implementation for often used protocols (tcp,udp,raw,...) - Introduce a NUMA efficient implementation Some helper macros are defined in include/net/sock.h These macros take into account CONFIG_SMP If a "struct proto" is declared without using DEFINE_PROTO_INUSE / REF_PROTO_INUSE macros, it will automatically use a default implementation, using a dynamically allocated percpu zone. This default implementation will be NUMA efficient, but might use 32/64 bytes per possible cpu because of current alloc_percpu() implementation. However it still should be better than previous implementation based on stats[NR_CPUS] field. When a "struct proto" is changed to use the new macros, we use a single static "int" percpu variable, lowering the memory and cpu costs, still preserving NUMA efficiency. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-07[IPv6] SNMP: Restore Udp6InErrors incrementationMitsuru Chinen1-3/+2
As the checksum verification is postponed till user calls recv or poll, the inrementation of Udp6InErrors counter should be also postponed. Currently, it is postponed in non-blocking operation case. However it should be postponed in all case like the IPv4 code. Signed-off-by: Mitsuru Chinen <mitch@linux.vnet.ibm.com> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-07[IPV6]: Consolidate the ip cork destruction in ip6_output.cPavel Emelyanov1-21/+15
The ip6_push_pending_frames and ip6_flush_pending_frames do the same things to flush the sock's cork. Move this into a separate function and save ~100 bytes from the .text Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-07[NETFILTER]: Clean up MakefileJan Engelhardt1-12/+16
Sort matches and targets in the NF makefiles. Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-07[NETFILTER]: ip{,6}_queue: convert to seq_file interfaceAlexey Dobriyan1-17/+20
I plan to kill ->get_info which means killing proc_net_create(). Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-02[SG] Get rid of __sg_mark_end()Jens Axboe1-1/+1
sg_mark_end() overwrites the page_link information, but all users want __sg_mark_end() behaviour where we just set the end bit. That is the most natural way to use the sg list, since you'll fill it in and then mark the end point. So change sg_mark_end() to only set the termination bit. Add a sg_magic debug check as well, and clear a chain pointer if it is set. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-11-02cleanup asm/scatterlist.h includesAdrian Bunk2-2/+0
Not architecture specific code should not #include <asm/scatterlist.h>. This patch therefore either replaces them with #include <linux/scatterlist.h> or simply removes them if they were unused. Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-11-01[NET]: Forget the zero_it argument of sk_alloc()Pavel Emelyanov1-1/+1
Finally, the zero_it argument can be completely removed from the callers and from the function prototype. Besides, fix the checkpatch.pl warnings about using the assignments inside if-s. This patch is rather big, and it is a part of the previous one. I splitted it wishing to make the patches more readable. Hope this particular split helped. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-30[NET]: Fix incorrect sg_mark_end() calls.David S. Miller2-7/+8
This fixes scatterlist corruptions added by commit 68e3f5dd4db62619fdbe520d36c9ebf62e672256 [CRYPTO] users: Fix up scatterlist conversion errors The issue is that the code calls sg_mark_end() which clobbers the sg_page() pointer of the final scatterlist entry. The first part fo the fix makes skb_to_sgvec() do __sg_mark_end(). After considering all skb_to_sgvec() call sites the most correct solution is to call __sg_mark_end() in skb_to_sgvec() since that is what all of the callers would end up doing anyways. I suspect this might have fixed some problems in virtio_net which is the sole non-crypto user of skb_to_sgvec(). Other similar sg_mark_end() cases were converted over to __sg_mark_end() as well. Arguably sg_mark_end() is a poorly named function because it doesn't just "mark", it clears out the page pointer as a side effect, which is what led to these bugs in the first place. The one remaining plain sg_mark_end() call is in scsi_alloc_sgtable() and arguably it could be converted to __sg_mark_end() if only so that we can delete this confusing interface from linux/scatterlist.h Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-30[IPV6]: remove duplicate call to proc_net_removeDaniel Lezcano1-4/+0
The file /proc/net/if_inet6 is removed twice. First time in: inet6_exit ->addrconf_cleanup And followed a few lines after by: inet6_exit -> if6_proc_exit Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-29[TCP] MD5: Remove some more unnecessary casting.Matthias M. Dellweg1-5/+5
while reviewing the tcp_md5-related code further i came across with another two of these casts which you probably have missed. I don't actually think that they impose a problem by now, but as you said we should remove them. Signed-off-by: Matthias M. Dellweg <2500@gmx.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-29[IPV6] NDISC: Fix setting base_reachable_time_ms variable.YOSHIFUJI Hideaki1-1/+1
This bug was introduced by the commit d12af679bcf8995a237560bdf7a4d734f8df5dbb (sysctl: fix neighbour table sysctls). Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-27[CRYPTO] users: Fix up scatterlist conversion errorsHerbert Xu1-2/+6
This patch fixes the errors made in the users of the crypto layer during the sg_init_table conversion. It also adds a few conversions that were missing altogether. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-26[INET] ESP: Must #include <linux/scatterlist.h>Adrian Bunk1-1/+1
This patch fixes the following compile errors in some configurations: <-- snip --> ... CC net/ipv4/esp4.o /home/bunk/linux/kernel-2.6/git/linux-2.6/net/ipv4/esp4.c: In function 'esp_output': /home/bunk/linux/kernel-2.6/git/linux-2.6/net/ipv4/esp4.c:113: error: implicit declaration of function 'sg_init_table' make[3]: *** [net/ipv4/esp4.o] Error 1 ... /home/bunk/linux/kernel-2.6/git/linux-2.6/net/ipv6/esp6.c: In function 'esp6_output': /home/bunk/linux/kernel-2.6/git/linux-2.6/net/ipv6/esp6.c:112: error: implicit declaration of function 'sg_init_table' make[3]: *** [net/ipv6/esp6.o] Error 1 <-- snip --> Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-26[TCP] IPV6: fix softnet build breakageJeff Garzik1-0/+1
net/ipv6/tcp_ipv6.c: In function 'tcp_v6_rcv': net/ipv6/tcp_ipv6.c:1736: error: implicit declaration of function 'get_softnet_dma' net/ipv6/tcp_ipv6.c:1736: warning: assignment makes pointer from integer without a cast Signed-off-by: Jeff Garzik <jgarzik@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-26[TCP]: Add missing I/O AT code to ipv6 side.David S. Miller1-0/+2
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-26[TCP]: Fix scatterlist handling in MD5 signature support.David S. Miller1-0/+4
Use sg_init_table() and sg_mark_end() as needed. Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-26[IPSEC]: Add missing sg_init_table() calls to ESP.David S. Miller1-0/+2
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-23[NET]: Treat the sign of the result of skb_headroom() consistentlyChuck Lever3-3/+3
In some places, the result of skb_headroom() is compared to an unsigned integer, and in others, the result is compared to a signed integer. Make the comparisons consistent and correct. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-22[IPSEC] IPV6: Fix to add tunnel mode SA correctly.Masahide NAKAMURA2-0/+2
Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-18[INET]: Justification for local port range robustness.Anton Arapov1-1/+1
There is a justifying patch for Stephen's patches. Stephen's patches disallows using a port range of one single port and brakes the meaning of the 'remaining' variable, in some places it has different meaning. My patch gives back the sense of 'remaining' variable. It should mean how many ports are remaining and nothing else. Also my patch allows using a single port. I sure we must be able to use mentioned port range, this does not restricted by documentation and does not brake current behavior. usefull links: Patches posted by Stephen Hemminger http://marc.info/?l=linux-netdev&m=119206106218187&w=2 http://marc.info/?l=linux-netdev&m=119206109918235&w=2 Andrew Morton's comment http://marc.info/?l=linux-kernel&m=119248225007737&w=2 1. Allows using a port range of one single port. 2. Gives back sense of 'remaining' variable. Signed-off-by: Anton Arapov <aarapov@redhat.com> Acked-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-18Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds16-249/+146
* 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: (51 commits) [IPV6]: Fix again the fl6_sock_lookup() fixed locking [NETFILTER]: nf_conntrack_tcp: fix connection reopening fix [IPV6]: Fix race in ipv6_flowlabel_opt() when inserting two labels [IPV6]: Lost locking in fl6_sock_lookup [IPV6]: Lost locking when inserting a flowlabel in ipv6_fl_list [NETFILTER]: xt_sctp: fix mistake to pass a pointer where array is required [NET]: Fix OOPS due to missing check in dev_parse_header(). [TCP]: Remove lost_retrans zero seqno special cases [NET]: fix carrier-on bug? [NET]: Fix uninitialised variable in ip_frag_reasm() [IPSEC]: Rename mode to outer_mode and add inner_mode [IPSEC]: Disallow combinations of RO and AH/ESP/IPCOMP [IPSEC]: Use the top IPv4 route's peer instead of the bottom [IPSEC]: Store afinfo pointer in xfrm_mode [IPSEC]: Add missing BEET checks [IPSEC]: Move type and mode map into xfrm_state.c [IPSEC]: Fix length check in xfrm_parse_spi [IPSEC]: Move ip_summed zapping out of xfrm6_rcv_spi [IPSEC]: Get nexthdr from caller in xfrm6_rcv_spi [IPSEC]: Move tunnel parsing for IPv4 out of xfrm4_input ...
2007-10-18sysctl: remove broken netfilter binary sysctlsEric W. Biederman2-2/+0
No one has bothered to set strategy routine for the the netfilter sysctls that return jiffies to be sysctl_jiffies. So it appears the sys_sysctl path is unused and untested, so this patch removes the binary sysctl numbers. Which fixes the netfilter oops in 2.6.23-rc2-mm2 for me. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Patrick McHardy <kaber@trash.net> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-18sysctl: ipv6 route flushing (kill binary path)Eric W. Biederman1-1/+0
We don't preoperly support the sysctl binary path for flushing the ipv6 routes. So remove support for a binary path. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Alexey Dobriyan <adobriyan@sw.ru> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-18sysctl: fix neighbour table sysctls.Eric W. Biederman1-14/+10
- In ipv6 ndisc_ifinfo_syctl_change so it doesn't depend on binary sysctl names for a function that works with proc. - In neighbour.c reorder the table to put the possibly unused entries at the end so we can remove them by terminating the table early. - In neighbour.c kill the entries with questionable binary sysctl handling behavior. - In neighbour.c if we don't have a strategy routine remove the binary path. So we don't the default sysctl strategy routine on data that is not ready for it. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Alexey Dobriyan <adobriyan@sw.ru> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-18[IPV6]: Fix again the fl6_sock_lookup() fixed lockingPavel Emelyanov1-1/+1
YOSHIFUJI fairly pointed out, that the users increment should be done under the ip6_sk_fl_lock not to give IPV6_FL_A_PUT a chance to put this count to zero and release the flowlabel. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-18[IPV6]: Fix race in ipv6_flowlabel_opt() when inserting two labelsPavel Emelyanov1-9/+25
In the IPV6_FL_A_GET case the hash is checked for flowlabels with the given label. If it is not found, the lock, protecting the hash, is dropped to be re-get for writing. After this a newly allocated entry is inserted, but no checks are performed to catch a classical SMP race, when the conflicting label may be inserted on another cpu. Use the (currently unused) return value from fl_intern() to return the conflicting entry (if found) and re-check, whether we can reuse it (IPV6_FL_F_EXCL) or return -EEXISTS. Also add the comment, about why not re-lookup the current sock for conflicting flowlabel entry. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-18[IPV6]: Lost locking in fl6_sock_lookupPavel Emelyanov1-0/+3
This routine scans the ipv6_fl_list whose update is protected with the socket lock and the ip6_sk_fl_lock. Since the socket lock is not taken in the lookup, use the other one. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-18[IPV6]: Lost locking when inserting a flowlabel in ipv6_fl_listPavel Emelyanov1-8/+12
The new flowlabels should be inserted into the sock list under the ip6_sk_fl_lock. This was lost in one place. This list is naturally protected with the socket lock, but the fl6_sock_lookup() is called without it, so another protection is required. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-17[IPSEC]: Rename mode to outer_mode and add inner_modeHerbert Xu3-4/+4
This patch adds a new field to xfrm states called inner_mode. The existing mode object is renamed to outer_mode. This is the first part of an attempt to fix inter-family transforms. As it is we always use the outer family when determining which mode to use. As a result we may end up shoving IPv4 packets into netfilter6 and vice versa. What we really want is to use the inner family for the first part of outbound processing and the outer family for the second part. For inbound processing we'd use the opposite pairing. I've also added a check to prevent silly combinations such as transport mode with inter-family transforms. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>