linux-dev - Linux kernel development work

Age	Commit message (Collapse)	Author	Files	Lines
2011-02-01	IPVS: use z modifier for sizeof() argument	Simon Horman	1	-1/+1
	Reported-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Simon Horman <horms@verge.net.au> Acked-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Hans Schillstrom <hans@schillstrom.com> Tested-by: Hans Schillstrom <hans@schillstrom.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-02-01	netfilter: ctnetlink: fix ctnetlink_parse_tuple() warning	Patrick McHardy	1	-1/+1
	net/netfilter/nf_conntrack_netlink.c: In function 'ctnetlink_parse_tuple': net/netfilter/nf_conntrack_netlink.c:832:11: warning: comparison between 'enum ctattr_tuple' and 'enum ctattr_type' Use ctattr_type for the 'type' parameter since that's the type of all attributes passed to this function. Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-02-01	netfilter: ipset: remove unnecessary includes	Patrick McHardy	9	-27/+0
	None of the set types need uaccess.h since this is handled centrally in ip_set_core. Most set types additionally don't need bitops.h and spinlock.h since they use neither. tcp.h is only needed by those using before(), udp.h is not needed at all. Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-02-01	netfilter: ipset: use nla_parse_nested()	Patrick McHardy	1	-26/+16
	Replace calls of the form: nla_parse(tb, ATTR_MAX, nla_data(attr), nla_len(attr), policy) by: nla_parse_nested(tb, ATTR_MAX, attr, policy) Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-02-01	netfilter: xtables: "set" match and "SET" target support	Jozsef Kadlecsik	4	-0/+427
	The patch adds the combined module of the "SET" target and "set" match to netfilter. Both the previous and the current revisions are supported. Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-02-01	netfilter: ipset: list:set set type support	Jozsef Kadlecsik	4	-0/+624
	The module implements the list:set type support in two flavours: without and with timeout. The sets has two sides: for the userspace, they store the names of other (non list:set type of) sets: one can add, delete and test set names. For the kernel, it forms an ordered union of the member sets: the members sets are tried in order when elements are added, deleted and tested and the process stops at the first success. Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-02-01	netfilter: ipset: hash:net,port set type support	Jozsef Kadlecsik	3	-0/+592
	The module implements the hash:net,port type support in four flavours: for IPv4 and IPv6, both without and with timeout support. The elements are two dimensional: IPv4/IPv6 network address/prefix and protocol/port pairs. Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-02-01	netfilter: ipset: hash:net set type support	Jozsef Kadlecsik	3	-0/+471
	The module implements the hash:net type support in four flavours: for IPv4 and IPv6, both without and with timeout support. The elements are one dimensional: IPv4/IPv6 network address/prefixes. Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-02-01	netfilter: ipset: hash:ip,port,net set type support	Jozsef Kadlecsik	3	-0/+642
	The module implements the hash:ip,port,net type support in four flavours: for IPv4 and IPv6, both without and with timeout support. The elements are three dimensional: IPv4/IPv6 address, protocol/port and IPv4/IPv6 network address/prefix triples. The different prefixes are searched/matched from the longest prefix to the shortes one (most specific to least). In other words the processing time linearly grows with the number of different prefixes in the set. Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-02-01	netfilter: ipset: hash:ip,port,ip set type support	Jozsef Kadlecsik	3	-0/+576
	The module implements the hash:ip,port,ip type support in four flavours: for IPv4 and IPv6, both without and with timeout support. The elements are three dimensional: IPv4/IPv6 address, protocol/port and IPv4/IPv6 address triples. Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-02-01	netfilter: ipset: hash:ip,port set type support	Jozsef Kadlecsik	3	-0/+557
	The module implements the hash:ip,port type support in four flavours: for IPv4 and IPv6, both without and with timeout support. The elements are two dimensional: IPv4/IPv6 address and protocol/port pairs. The port is interpeted for TCP, UPD, ICMP and ICMPv6 (at the latters as type/code of course). Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-02-01	netfilter: ipset: hash:ip set type support	Jozsef Kadlecsik	5	-0/+1580
	The module implements the hash:ip type support in four flavours: for IPv4 or IPv6, both without and with timeout support. All the hash types are based on the "array hash" or ahash structure and functions as a good compromise between minimal memory footprint and speed. The hashing uses arrays to resolve clashes. The hash table is resized (doubled) when searching becomes too long. Resizing can be triggered by userspace add commands only and those are serialized by the nfnl mutex. During resizing the set is read-locked, so the only possible concurrent operations are the kernel side readers. Those are protected by RCU locking. Because of the four flavours and the other hash types, the functions are implemented in general forms in the ip_set_ahash.h header file and the real functions are generated before compiling by macro expansion. Thus the dereferencing of low-level functions and void pointer arguments could be avoided: the low-level functions are inlined, the function arguments are pointers of type-specific structures. Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-02-01	netfilter: ipset; bitmap:port set type support	Jozsef Kadlecsik	3	-0/+530
	The module implements the bitmap:port type in two flavours, without and with timeout support to store TCP/UDP ports from a range. Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-02-01	netfilter: ipset: bitmap:ip,mac type support	Jozsef Kadlecsik	3	-0/+665
	The module implements the bitmap:ip,mac set type in two flavours, without and with timeout support. In this kind of set one can store IPv4 address and (source) MAC address pairs. The type supports elements added without the MAC part filled out: when the first matching from kernel happens, the MAC part is automatically filled out. The timing out of the elements stars when an element is complete in the IP,MAC pair. Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-02-01	netfilter: ipset: bitmap:ip set type support	Jozsef Kadlecsik	5	-0/+758
	The module implements the bitmap:ip set type in two flavours, without and with timeout support. In this kind of set one can store IPv4 addresses (or network addresses) from a given range. In order not to waste memory, the timeout version does not rely on the kernel timer for every element to be timed out but on garbage collection. All set types use this mechanism. Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-02-01	netfilter: ipset: IP set core support	Jozsef Kadlecsik	10	-0/+2626
	The patch adds the IP set core support to the kernel. The IP set core implements a netlink (nfnetlink) based protocol by which one can create, destroy, flush, rename, swap, list, save, restore sets, and add, delete, test elements from userspace. For simplicity (and backward compatibilty and for not to force ip(6)tables to be linked with a netlink library) reasons a small getsockopt-based protocol is also kept in order to communicate with the ip(6)tables match and target. The netlink protocol passes all u16, etc values in network order with NLA_F_NET_BYTEORDER flag. The protocol enforces the proper use of the NLA_F_NESTED and NLA_F_NET_BYTEORDER flags. For other kernel subsystems (netfilter match and target) the API contains the functions to add, delete and test elements in sets and the required calls to get/put refereces to the sets before those operations can be performed. The set types (which are implemented in independent modules) are stored in a simple RCU protected list. A set type may have variants: for example without timeout or with timeout support, for IPv4 or for IPv6. The sets (i.e. the pointers to the sets) are stored in an array. The sets are identified by their index in the array, which makes possible easy and fast swapping of sets. The array is protected indirectly by the nfnl mutex from nfnetlink. The content of the sets are protected by the rwlock of the set. There are functional differences between the add/del/test functions for the kernel and userspace: - kernel add/del/test: works on the current packet (i.e. one element) - kernel test: may trigger an "add" operation in order to fill out unspecified parts of the element from the packet (like MAC address) - userspace add/del: works on the netlink message and thus possibly on multiple elements from the IPSET_ATTR_ADT container attribute. - userspace add: may trigger resizing of a set Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-02-01	netfilter: NFNL_SUBSYS_IPSET id and NLA_PUT_NET* macros	Jozsef Kadlecsik	2	-1/+11
	The patch adds the NFNL_SUBSYS_IPSET id and NLA_PUT_NET* macros to the vanilla kernel. Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-01-28	netfilter: xt_iprange: add IPv6 match debug print code	Thomas Jacob	1	-2/+14
	Signed-off-by: Thomas Jacob <jacob@internet24.de> Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-01-27	netfilter: xt_iprange: typo in IPv4 match debug print code	Thomas Jacob	1	-1/+1
	Signed-off-by: Thomas Jacob <jacob@internet24.de> Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-01-26	netfilter: xt_connlimit: pick right dstaddr in NAT scenario	Jan Engelhardt	1	-4/+8
	xt_connlimit normally records the "original" tuples in a hashlist (such as "1.2.3.4 -> 5.6.7.8"), and looks in this list for iph->daddr when counting. When the user however uses DNAT in PREROUTING, looking for iph->daddr -- which is now 192.168.9.10 -- will not match. Thus in daddr mode, we need to record the reverse direction tuple ("192.168.9.10 -> 1.2.3.4") instead. In the reverse tuple, the dst addr is on the src side, which is convenient, as count_them still uses &conn->tuple.src.u3. Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
2011-01-25	netfilter: ipvs: fix compiler warnings	Changli Gao	1	-3/+1
	Fix compiler warnings when IP_VS_DBG() isn't defined. Signed-off-by: Changli Gao <xiaosuo@gmail.com> Acked-by: Hans Schillstrom <hans.schillstrom@ericsson.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2011-01-25	IPVS netns BUG, register sysctl for root ns	Hans Schillstrom	1	-1/+1
	The newly created table was not used when register sysctl for a new namespace. I.e. sysctl doesn't work for other than root namespace (init_net) Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2011-01-22	IPVS: Change sock_create_kernel() to __sock_create()	Simon Horman	1	-2/+2
	The recent netns changes omitted to change sock_create_kernel() to __sock_create() in ip_vs_sync.c The effect of this is that the interface will be selected in the root-namespace, from my point of view it's a major bug. Reported-by: Hans Schillstrom <hans@schillstrom.com> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>
2011-01-22	netfilter: ipvs: fix compiler warnings	Changli Gao	2	-0/+8
	Fix compiler warnings when no transport protocol load balancing support is configured. [horms@verge.net.au: removed suprious __ip_vs_cleanup() clean-up hunk] Signed-off-by: Changli Gao <xiaosuo@gmail.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2011-01-20	netfilter: add a missing include in nf_conntrack_reasm.c	Eric Dumazet	1	-0/+1
	After commit ae90bdeaeac6b (netfilter: fix compilation when conntrack is disabled but tproxy is enabled) we have following warnings : net/ipv6/netfilter/nf_conntrack_reasm.c:520:16: warning: symbol 'nf_ct_frag6_gather' was not declared. Should it be static? net/ipv6/netfilter/nf_conntrack_reasm.c:591:6: warning: symbol 'nf_ct_frag6_output' was not declared. Should it be static? net/ipv6/netfilter/nf_conntrack_reasm.c:612:5: warning: symbol 'nf_ct_frag6_init' was not declared. Should it be static? net/ipv6/netfilter/nf_conntrack_reasm.c:640:6: warning: symbol 'nf_ct_frag6_cleanup' was not declared. Should it be static? Fix this including net/netfilter/ipv6/nf_defrag_ipv6.h Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> CC: KOVACS Krisztian <hidden@balabit.hu> Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-01-20	netfilter: nf_conntrack: fix linker error with NF_CONNTRACK_TIMESTAMP=n	Patrick McHardy	1	-0/+12
	net/built-in.o: In function `nf_conntrack_init_net': net/netfilter/nf_conntrack_core.c:1521: undefined reference to `nf_conntrack_tstamp_init' net/netfilter/nf_conntrack_core.c:1531: undefined reference to `nf_conntrack_tstamp_fini' Add dummy inline functions for the =n case to fix this. Reported-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-01-20	netfilter: xtables: add missing header inclusions for headers_check	Jan Engelhardt	39	-0/+77
	Resolve these warnings on `make headers_check`: usr/include/linux/netfilter/xt_CT.h:7: found __[us]{8,16,32,64} type without #include <linux/types.h> ... Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
2011-01-20	netfilter: nf_nat: place conntrack in source hash after SNAT is done	Changli Gao	1	-7/+11
	If SNAT isn't done, the wrong info maybe got by the other cts. As the filter table is after DNAT table, the packets dropped in filter table also bother bysource hash table. Signed-off-by: Changli Gao <xiaosuo@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-01-20	netfilter: xtables: remove duplicate member	Jan Engelhardt	1	-1/+1
	Accidentally missed removing the old out-of-union "inverse" member, which caused the struct size to change which then gives size mismatch warnings when using an old iptables. It is interesting to see that gcc did not warn about this before. (Filed http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47376 ) Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
2011-01-20	netfilter: do not omit re-route check on NF_QUEUE verdict	Florian Westphal	1	-1/+1
	ret != NF_QUEUE only works in the "--queue-num 0" case; for queues > 0 the test should be '(ret & NF_VERDICT_MASK) != NF_QUEUE'. However, NF_QUEUE no longer DROPs the skb unconditionally if queueing fails (due to NF_VERDICT_FLAG_QUEUE_BYPASS verdict flag), so the re-route test should also be performed if this flag is set in the verdict. The full test would then look something like && ((ret & NF_VERDICT_MASK) == NF_QUEUE && (ret & NF_VERDICT_FLAG_QUEUE_BYPASS)) This is rather ugly, so just remove the NF_QUEUE test altogether. The only effect is that we might perform an unnecessary route lookup in the NF_QUEUE case. ip6table_mangle did not have such a check. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-01-20	netfilter: xtables: remove extraneous header that slipped in	Jan Engelhardt	1	-1/+0
	Commit 0b8ad87 (netfilter: xtables: add missing header files to export list) erroneously added this. Signed-off-by: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-01-19	net_sched: cleanups	Eric Dumazet	41	-801/+842
	Cleanup net/sched code to current CodingStyle and practices. Reduce inline abuse Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-19	af_unix: coding style: remove one level of indentation in unix_shutdown()	Alban Crequy	1	-29/+31
	Signed-off-by: Alban Crequy <alban.crequy@collabora.co.uk> Reviewed-by: Ian Molton <ian.molton@collabora.co.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-19	net_sched: implement a root container qdisc sch_mqprio	John Fastabend	5	-0/+446
	This implements a mqprio queueing discipline that by default creates a pfifo_fast qdisc per tx queue and provides the needed configuration interface. Using the mqprio qdisc the number of tcs currently in use along with the range of queues alloted to each class can be configured. By default skbs are mapped to traffic classes using the skb priority. This mapping is configurable. Configurable parameters, struct tc_mqprio_qopt { __u8 num_tc; __u8 prio_tc_map[TC_BITMASK + 1]; __u8 hw; __u16 count[TC_MAX_QUEUE]; __u16 offset[TC_MAX_QUEUE]; }; Here the count/offset pairing give the queue alignment and the prio_tc_map gives the mapping from skb->priority to tc. The hw bit determines if the hardware should configure the count and offset values. If the hardware bit is set then the operation will fail if the hardware does not implement the ndo_setup_tc operation. This is to avoid undetermined states where the hardware may or may not control the queue mapping. Also minimal bounds checking is done on the count/offset to verify a queue does not exceed num_tx_queues and that queue ranges do not overlap. Otherwise it is left to user policy or hardware configuration to create useful mappings. It is expected that hardware QOS schemes can be implemented by creating appropriate mappings of queues in ndo_tc_setup(). One expected use case is drivers will use the ndo_setup_tc to map queue ranges onto 802.1Q traffic classes. This provides a generic mechanism to map network traffic onto these traffic classes and removes the need for lower layer drivers to know specifics about traffic types. Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-19	net: implement mechanism for HW based QOS	John Fastabend	2	-1/+122
	This patch provides a mechanism for lower layer devices to steer traffic using skb->priority to tx queues. This allows for hardware based QOS schemes to use the default qdisc without incurring the penalties related to global state and the qdisc lock. While reliably receiving skbs on the correct tx ring to avoid head of line blocking resulting from shuffling in the LLD. Finally, all the goodness from txq caching and xps/rps can still be leveraged. Many drivers and hardware exist with the ability to implement QOS schemes in the hardware but currently these drivers tend to rely on firmware to reroute specific traffic, a driver specific select_queue or the queue_mapping action in the qdisc. By using select_queue for this drivers need to be updated for each and every traffic type and we lose the goodness of much of the upstream work. Firmware solutions are inherently inflexible. And finally if admins are expected to build a qdisc and filter rules to steer traffic this requires knowledge of how the hardware is currently configured. The number of tx queues and the queue offsets may change depending on resources. Also this approach incurs all the overhead of a qdisc with filters. With the mechanism in this patch users can set skb priority using expected methods ie setsockopt() or the stack can set the priority directly. Then the skb will be steered to the correct tx queues aligned with hardware QOS traffic classes. In the normal case with single traffic class and all queues in this class everything works as is until the LLD enables multiple tcs. To steer the skb we mask out the lower 4 bits of the priority and allow the hardware to configure upto 15 distinct classes of traffic. This is expected to be sufficient for most applications at any rate it is more then the 8021Q spec designates and is equal to the number of prio bands currently implemented in the default qdisc. This in conjunction with a userspace application such as lldpad can be used to implement 8021Q transmission selection algorithms one of these algorithms being the extended transmission selection algorithm currently being used for DCB. Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-19	netlink: support setting devgroup parameters	Vlad Dogaru	1	-4/+28
	If a rtnetlink request specifies a negative or zero ifindex and has no interface name attribute, but has a group attribute, then the chenges are made to all the interfaces belonging to the specified group. Signed-off-by: Vlad Dogaru <ddvlad@rosedu.org> Acked-by: Jamal Hadi Salim <hadi@cyberus.ca> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-19	net_device: add support for network device groups	Vlad Dogaru	4	-0/+26
	Net devices can now be grouped, enabling simpler manipulation from userspace. This patch adds a group field to the net_device structure, as well as rtnetlink support to query and modify it. Signed-off-by: Vlad Dogaru <ddvlad@rosedu.org> Acked-by: Jamal Hadi Salim <hadi@cyberus.ca> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-19	net: cleanup unused macros in net directory	Shan Wei	10	-12/+2
	Clean up some unused macros in net/*. 1. be left for code change. e.g. PGV_FROM_VMALLOC, PGV_FROM_VMALLOC, KMEM_SAFETYZONE. 2. never be used since introduced to kernel. e.g. P9_RDMA_MAX_SGE, UTIL_CTRL_PKT_SIZE. Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com> Acked-by: Sjur Braendeland <sjur.brandeland@stericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-19	vxge: update driver version	Jon Mason	1	-2/+2
	Update vxge driver version to 2.5.2 Signed-off-by: Jon Mason <jon.mason@exar.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-19	vxge: MSIX one shot mode	Jon Mason	6	-50/+302
	To reduce the possibility of losing an interrupt in the handler due to a race between an interrupt processing and disable/enable of interrupts, enable MSIX one shot. Also, add support for adaptive interrupt coalesing Signed-off-by: Jon Mason <jon.mason@exar.com> Signed-off-by: Masroor Vettuparambil <masroor.vettuparambil@exar.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-19	vxge: correct eprom version detection	Jon Mason	2	-2/+2
	The firmware PXE EPROM version detection is failing due to passing the wrong parameter into firmware query function. Also, the version printing function has an extraneous newline. Signed-off-by: Jon Mason <jon.mason@exar.com> Signed-off-by: Sivakumar Subramani <sivakumar.subramani@exar.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-19	vxge: cleanup probe error paths	Jon Mason	1	-30/+25
	Reorder the commands to be in the inverse order of their allocations (instead of the random order they appear to be in), propagate return code on errors from pci_request_region and register_netdev, reduce the config_dev_cnt and total_dev_cnt counters on remove, and return the correct error code for vdev->vpaths kzalloc failures. Also, prevent leaking of vdev->vpaths memory and netdev in vxge_probe error path due to freeing for these not occurring in vxge_device_unregister. Signed-off-by: Jon Mason <jon.mason@exar.com> Signed-off-by: Sivakumar Subramani <sivakumar.subramani@exar.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-19	netfilter: nf_conntrack: fix lifetime display for disabled connections	Patrick McHardy	1	-17/+12
	When no tstamp extension exists, ct_delta_time() returns -1, which is then assigned to an u64 and tested for negative values to decide whether to display the lifetime. This obviously doesn't work, use a s64 and merge the two minor functions into one. Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-01-19	netfilter: xtables: connlimit revision 1	Jan Engelhardt	3	-14/+49
	This adds destination address-based selection. The old "inverse" member is overloaded (memory-wise) with a new "flags" variable, similar to how J.Park did it with xt_string rev 1. Since revision 0 userspace only sets flag 0x1, no great changes are made to explicitly test for different revisions. Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
2011-01-19	netfilter: nf_conntrack_tstamp: add flow-based timestamp extension	Pablo Neira Ayuso	10	-1/+312
	This patch adds flow-based timestamping for conntracks. This conntrack extension is disabled by default. Basically, we use two 64-bits variables to store the creation timestamp once the conntrack has been confirmed and the other to store the deletion time. This extension is disabled by default, to enable it, you have to: echo 1 > /proc/sys/net/netfilter/nf_conntrack_timestamp This patch allows to save memory for user-space flow-based loogers such as ulogd2. In short, ulogd2 does not need to keep a hashtable with the conntrack in user-space to know when they were created and destroyed, instead we use the kernel timestamp. If we want to have a sane IPFIX implementation in user-space, this nanosecs resolution timestamps are also useful. Other custom user-space applications can benefit from this via libnetfilter_conntrack. This patch modifies the /proc output to display the delta time in seconds since the flow start. You can also obtain the flow-start date by means of the conntrack-tools. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-01-18	net: filter: dont block softirqs in sk_run_filter()	Eric Dumazet	3	-7/+7
	Packet filter (BPF) doesnt need to disable softirqs, being fully re-entrant and lock-less. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-18	af_unix: implement socket filter	Alban Crequy	1	-0/+6
	Linux Socket Filters can already be successfully attached and detached on unix sockets with setsockopt(sockfd, SOL_SOCKET, SO_{ATTACH,DETACH}_FILTER, ...). See: Documentation/networking/filter.txt But the filter was never used in the unix socket code so it did not work. This patch uses sk_filter() to filter buffers before delivery. This short program demonstrates the problem on SOCK_DGRAM. int main(void) { int i, j, ret; int sv[2]; struct pollfd fds[2]; char message = "Hello world!"; char buffer[64]; struct sock_filter ins[32] = {{0,},}; struct sock_fprog filter; socketpair(AF_UNIX, SOCK_DGRAM, 0, sv); for (i = 0 ; i < 2 ; i++) { fds[i].fd = sv[i]; fds[i].events = POLLIN; fds[i].revents = 0; } for(j = 1 ; j < 13 ; j++) { / Set a socket filter to truncate the message / memset(ins, 0, sizeof(ins)); ins[0].code = BPF_RET\|BPF_K; ins[0].k = j; filter.len = 1; filter.filter = ins; setsockopt(sv[1], SOL_SOCKET, SO_ATTACH_FILTER, &filter, sizeof(filter)); / send a message / send(sv[0], message, strlen(message) + 1, 0); / The filter should let the message pass but truncated. / poll(fds, 2, 0); / Receive the truncated message*/ ret = recv(sv[1], buffer, 64, 0); printf("received %d bytes, expected %d\n", ret, j); } for (i = 0 ; i < 2 ; i++) close(sv[i]); return 0; } Signed-off-by: Alban Crequy <alban.crequy@collabora.co.uk> Reviewed-by: Ian Molton <ian.molton@collabora.co.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-18	gianfar: Fix misleading indentation in startup_gfar()	Anton Vorontsov	1	-1/+1
	Just stumbled upon the issue while looking for another bug. The code looks correct, the indentation is not. Signed-off-by: Anton Vorontsov <cbouatmailru@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-18	net/irda/sh_irda: return to RX mode when TX error	Kuninori Morimoto	1	-2/+12
	sh_irda can not use RX/TX in same time, but this driver didn't return to RX mode when TX error occurred. This patch care xmit error case to solve this issue. Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-18	net offloading: Do not mask out NETIF_F_HW_VLAN_TX for vlan.	Jesse Gross	1	-2/+2
	In netif_skb_features() we return only the features that are valid for vlans if we have a vlan packet. However, we should not mask out NETIF_F_HW_VLAN_TX since it enables transmission of vlan tags and is obviously valid. Reported-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Jesse Gross <jesse@nicira.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>