linux-dev - Linux kernel development work

Age	Commit message (Collapse)	Author	Files	Lines
2010-03-25	netfilter: xt_recent: allow changing ip_list_[ug]id at runtime	Jan Engelhardt	1	-4/+4
	Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
2010-03-25	netfilter: xtables: consolidate code into xt_request_find_match	Jan Engelhardt	5	-24/+30
	Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
2010-03-25	netfilter: xtables: make use of xt_request_find_target	Jan Engelhardt	6	-52/+29
	Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
2010-03-25	netfilter: xt extensions: use pr_<level> (2)	Jan Engelhardt	31	-185/+151
	Supplement to 1159683ef48469de71dc26f0ee1a9c30d131cf89. Downgrade the log level to INFO for most checkentry messages as they are, IMO, just extra information accompanying the -EINVAL code that is returned as part of a parameter "constraint violation". Leave errors to real errors, such as being unable to create a LED trigger. Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
2010-03-25	netfilter: xtables: make use of caller family rather than target family	Jan Engelhardt	2	-5/+5
	Supplement to aa5fa3185791aac71c9172d4fda3e8729164b5d1. The semantic patch for this change is: // <smpl> @@ struct xt_target_param *par; @@ -par->target->family +par->family @@ struct xt_tgchk_param *par; @@ -par->target->family +par->family @@ struct xt_tgdtor_param *par; @@ -par->target->family +par->family // </smpl> Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
2010-03-19	netfilter: remove unused headers in net/ipv4/netfilter/nf_nat_h323.c	Zhitong Wang	1	-1/+0
	Remove unused headers in net/ipv4/netfilter/nf_nat_h323.c Signed-off-by: Zhitong Wang <zhitong.wangzt@alibaba-inc.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-03-19	netfilter: remove unused headers in net/ipv6/netfilter/ip6t_LOG.c	Zhitong Wang	1	-1/+0
	Remove unused headers in net/ipv6/netfilter/ip6t_LOG.c Signed-off-by: Zhitong Wang <zhitong.wangzt@alibaba-inc.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-03-18	netfilter: xt extensions: use pr_<level>	Jan Engelhardt	21	-107/+88
	Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
2010-03-18	netfilter: xtables: replace custom duprintf with pr_debug	Jan Engelhardt	7	-79/+41
	Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
2010-03-18	netfilter: xtables: do not print any messages on ENOMEM	Jan Engelhardt	4	-17/+6
	ENOMEM is a very obvious error code (cf. EINVAL), so I think we do not really need a warning message. Not to mention that if the allocation fails, the user is most likely going to get a stack trace from slab already. Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
2010-03-18	netfilter: xtables: reduce holes in struct xt_target	Jan Engelhardt	1	-1/+1
	This will save one full padding chunk (8 bytes on x86_64) per target. Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
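The padding saving described above can be illustrated with a small sketch. The two structs below are hypothetical and are not the real struct xt_target; they only show how moving a small member next to another small member removes an alignment hole.

```c
#include <stddef.h>

/* Hypothetical layout with a hole: the pointer must be aligned, so the
 * compiler pads after flag_a and again after flag_b. */
struct with_hole {
	char  flag_a;
	void *ptr;
	char  flag_b;
};

/* Same members, reordered so the two small fields share one padded slot. */
struct reordered {
	void *ptr;
	char  flag_a;
	char  flag_b;
};

/* Number of bytes saved per object by the reordering on this ABI. */
static size_t bytes_saved(void)
{
	return sizeof(struct with_hole) - sizeof(struct reordered);
}
```

On a typical x86_64 ABI the first struct occupies 24 bytes and the second 16, so bytes_saved() would report one 8-byte padding chunk; other ABIs may differ.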
2010-03-18	netfilter: xtables: remove almost-unused xt_match_param.data member	Jan Engelhardt	2	-6/+6
	This member is taking up a "long" per match, yet is only used by one module out of the roughly 90 modules, ip6t_hbh. ip6t_hbh can be restructured a little to accommodate the lack of the .data member. This variant checks the par->match address, which should avoid having to add two extra functions, including calls, i.e. (hbh_mt6: call hbhdst_mt6(skb, par, NEXTHDR_OPT), dst_mt6: call hbhdst_mt6(skb, par, NEXTHDR_DEST)) Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
2010-03-18	netfilter: update documentation fields of x_tables.h	Jan Engelhardt	1	-2/+8
	Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
2010-03-18	netfilter: xtables: make use of caller family rather than match family	Jan Engelhardt	5	-14/+14
	The matches can have .family = NFPROTO_UNSPEC, and though that is not the case for the touched modules, it seems better to just use the nfproto from the caller. Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
2010-03-18	netfilter: xtables: resort osf kconfig text	Jan Engelhardt	1	-13/+13
	Restore alphabetical ordering of the list and put the xt_osf option into its 'right' place again. Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
2010-03-18	netfilter: xtables: limit xt_mac to ethernet devices	Jan Engelhardt	1	-0/+3
	I do not see a point of allowing the MAC module to work with devices that don't possibly have one, e.g. various tunnel interfaces such as tun and sit. Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
2010-03-18	netfilter: xtables: clean up xt_mac match routine	Jan Engelhardt	1	-8/+10
	Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
2010-03-18	netfilter: xtables: do without explicit XT_ALIGN	Jan Engelhardt	2	-2/+2
	XT_ALIGN is already applied on matchsize/targetsize in x_tables.c, so it is not strictly needed in the extensions. Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
2010-03-17	netfilter: remove unused headers in net/netfilter/nfnetlink.c	Zhitong Wang	1	-3/+0
	Remove unused headers in net/netfilter/nfnetlink.c Signed-off-by: Zhitong Wang <zhitong.wangzt@alibaba-inc.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-03-17	netfilter: xt_recent: check for unsupported user space flags	Tim Gardner	2	-0/+8
	Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-03-17	netfilter: xt_recent: add an entry reaper	Tim Gardner	2	-1/+31
	One of the problems with the way xt_recent is implemented is that there is no efficient way to remove expired entries. Of course, one can write a rule '-m recent --remove', but you have to know beforehand which entry to delete. This commit adds reaper logic which checks the head of the LRU list when a rule is invoked that has a '--seconds' value and XT_RECENT_REAP set. If an entry ceases to accumulate time stamps, then it will eventually bubble to the top of the LRU list where it is then reaped. Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
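The reaper idea described above can be sketched in plain C. This is a minimal illustration under stated assumptions, not the xt_recent code: struct entry, struct lru, lru_touch() and lru_reap() are hypothetical names, and time is a plain counter in seconds rather than jiffies.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for a table entry. */
struct entry {
	struct entry *prev, *next;	/* LRU list links */
	unsigned long last_seen;	/* most recent time stamp, in seconds */
};

/* LRU list: head is the entry that has gone longest without an update. */
struct lru {
	struct entry *head, *tail;
};

static void lru_unlink(struct lru *l, struct entry *e)
{
	if (e->prev)
		e->prev->next = e->next;
	else
		l->head = e->next;
	if (e->next)
		e->next->prev = e->prev;
	else
		l->tail = e->prev;
	e->prev = e->next = NULL;
}

/* Record a hit: refresh the time stamp and move the entry to the tail. */
static void lru_touch(struct lru *l, struct entry *e, unsigned long now)
{
	if (l->head == e || e->prev)	/* entry is currently linked */
		lru_unlink(l, e);
	e->last_seen = now;
	e->prev = l->tail;
	e->next = NULL;
	if (l->tail)
		l->tail->next = e;
	else
		l->head = e;
	l->tail = e;
}

/* Check only the head of the list; unlink and return it if it is older
 * than 'seconds', otherwise return NULL. At most one entry per call. */
static struct entry *lru_reap(struct lru *l, unsigned long now,
			      unsigned long seconds)
{
	struct entry *e = l->head;

	if (e == NULL || now - e->last_seen <= seconds)
		return NULL;
	lru_unlink(l, e);
	return e;
}
```

Because every hit moves its entry to the tail, an entry that stops accumulating time stamps drifts toward the head, so inspecting the head alone is enough to find a candidate for removal.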
2010-03-17	netfilter: xt_recent: remove old proc directory	Jan Engelhardt	3	-122/+0
	The compat option was introduced in October 2008. Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
2010-03-17	netfilter: xt_recent: update description	Jan Engelhardt	1	-1/+1
	It had IPv6 for quite a while already :-) Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
2010-03-17	netfilter: ebt_ip6: add principal maintainer in a MODULE_AUTHOR tag	Jan Engelhardt	1	-0/+1
	Cc: Kuo-Lang Tseng <kuo-lang.tseng@intel.com> Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
2010-03-17	netfilter: update my email address	Jan Engelhardt	9	-12/+8
	Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
2010-03-17	netfilter: xtables: schedule xt_NOTRACK for removal	Jan Engelhardt	1	-0/+8
	It is being superseded by xt_CT (-j CT --notrack). Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
2010-03-17	netfilter: xtables: merge xt_CONNMARK into xt_connmark	Jan Engelhardt	6	-156/+116
	Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
2010-03-17	netfilter: xtables: merge xt_MARK into xt_mark	Jan Engelhardt	6	-82/+70
	Two arguments for combining the two: - xt_mark is pretty useless without xt_MARK - the actual code is so small anyway that the kmod metadata and the module in its loaded state totally outweighs the combined actual code size. i586-before: -rw-r--r-- 1 jengelh users 3821 Feb 10 01:01 xt_MARK.ko -rw-r--r-- 1 jengelh users 2592 Feb 10 00:04 xt_MARK.o -rw-r--r-- 1 jengelh users 3274 Feb 10 01:01 xt_mark.ko -rw-r--r-- 1 jengelh users 2108 Feb 10 00:05 xt_mark.o text data bss dec hex filename 354 264 0 618 26a xt_MARK.o 223 176 0 399 18f xt_mark.o And the runtime size is like 14 KB. i586-after: -rw-r--r-- 1 jengelh users 3264 Feb 18 17:28 xt_mark.o Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
2010-03-17	netfilter: xtables: add comment markers to Xtables Kconfig	Jan Engelhardt	1	-0/+6
	Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
2010-03-17	netfilter: xt_NFQUEUE: consolidate v4/v6 targets into one	Jan Engelhardt	1	-28/+12
	Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
2010-03-17	netfilter: xt_CT: par->family is an nfproto	Jan Engelhardt	1	-2/+2
	Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
2010-03-16	e1000e: Fix build with CONFIG_PM disabled.	David S. Miller	1	-0/+2
	Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-16	drivers/net/e100.c: Use pr_<level> and netif_<level>	Joe Perches	1	-85/+98
	Convert DPRINTK, commonly used for debugging, to netif_<level> Remove #define PFX Use #define pr_fmt Consistently use no periods for non-sentence logging messages Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-16	NET: Support clause 45 MDIO commands at the MDIO bus level	Jason Gunthorpe	3	-15/+61
	IEEE 802.3ae clause 45 specifies a somewhat modified MDIO protocol for use by 10GIGE phys. The main change is a 21 bit address split into a 5 bit device ID and a 16 bit register offset. The definition is designed so that normal and extended devices can run on the same MDIO bus. Extend mdio-bitbang to do the new protocol. At the MDIO bus level the protocol is requested by or'ing MII_ADDR_C45 into the register offset. Make phy_read/phy_write/etc pass a full 32 bit register offset. This does not attempt to make the phy layer support C45 style PHYs, just to provide the MDIO bus support. Tested against a Broadcom 10GE phy with ID 0x206034, and several Broadcom 10/100/1000 Phys in normal mode. Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: David S. Miller <davem@davemloft.net>
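The address packing described above can be sketched as follows. The macro and function names are hypothetical, and the flag bit position (bit 30) and the 16-bit shift are assumptions modelled on the kernel's MII_ADDR_C45 convention; check the MDIO/PHY headers of the tree in question before relying on them.

```c
#include <stdint.h>

/* Assumed encoding: a flag bit marks a clause 45 access, followed by a
 * 5-bit device ID and a 16-bit register offset (21 address bits total). */
#define C45_FLAG	(UINT32_C(1) << 30)
#define C45_DEV_SHIFT	16
#define C45_DEV_MASK	UINT32_C(0x1f)
#define C45_REG_MASK	UINT32_C(0xffff)

/* Build the 32-bit register number that is passed down to the MDIO bus. */
static uint32_t c45_regnum(uint32_t devad, uint32_t reg)
{
	return C45_FLAG | ((devad & C45_DEV_MASK) << C45_DEV_SHIFT) |
	       (reg & C45_REG_MASK);
}

/* A bus driver can test the flag to choose between the two protocols. */
static int is_c45(uint32_t regnum)
{
	return (regnum & C45_FLAG) != 0;
}

static uint32_t c45_devad(uint32_t regnum)
{
	return (regnum >> C45_DEV_SHIFT) & C45_DEV_MASK;
}

static uint32_t c45_reg(uint32_t regnum)
{
	return regnum & C45_REG_MASK;
}
```

A plain clause 22 register number such as 0x1f never has the flag bit set, which is what would let normal and extended devices share one bus-level read/write interface.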
2010-03-16	e1000e / PCI / PM: Add basic runtime PM support (rev. 4)	Rafael J. Wysocki	2	-27/+138
	Use the PCI runtime power management framework to add basic PCI runtime PM support to the e1000e driver. Namely, make the driver suspend the device when the link is off and set it up for generating a wakeup event after the link has been detected again. [This feature is disabled until the user space enables it with the help of the /sys/devices/.../power/control device attribute.] Based on a patch from Matthew Garrett. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-16	r8169 / PCI / PM: Add simplified runtime PM support (rev. 3)	Rafael J. Wysocki	1	-27/+125
	Use the PCI runtime power management framework to add basic PCI runtime PM support to the r8169 driver. Namely, make the driver suspend the device when the link is not present and set it up for generating a wakeup event after the link has been detected again. [This feature is disabled until the user space enables it with the help of the /sys/devices/.../power/control device attribute.] Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-16	net: convert multiple drivers to use netdev_for_each_mc_addr, part7	Jiri Pirko	5	-41/+29
	In mlx4, using char * to store mc address in private structure instead. Signed-off-by: Jiri Pirko <jpirko@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-16	drivers/net/ks*: Use netdev_<level>, netif_<level> and pr_<level>	Joe Perches	4	-120/+91
	I'm not sure this is correct. It changes logging macros from: dev_<level>(&ks->spidev->dev, to netdev_<level>(ks->netdev, Comments? Use netdev_<level> Use netif_<level> Use pr_<level> Add #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt Add missing line to message in ks8851_remove Change kmalloc/memset(,0) to kzalloc Remove ks_<level> macros Consolidate code into set_media_state Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-16	tipc: Allow retransmission of cloned buffers	Neil Horman	1	-5/+6
	Forward port commit fc477e160af086f6e30c3d4fdf5f5c000d29beb5 from git://tipc.cslab.ericsson.net/pub/git/people/allan/tipc.git Original commit message: Allow retransmission of cloned buffers This patch fixes an issue with TIPC's message retransmission logic that prevented retransmission of cloned sk_buffs. Originally intended as a means of avoiding wasted work in retransmitting messages that were still on the driver's outbound queue, it also prevented TIPC from retransmitting messages through other means -- such as the secondary bearer of the broadcast link, or another interface in a set of bonded interfaces. This fix removes existing checks for cloned sk_buffs that prevented such retransmission. Originally-Signed-off-by: Allan Stephens <allan.stephens@windriver.com> Signed-off-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-16	tipc: Increase frequency of load distribution over broadcast link	Neil Horman	1	-21/+14
	Forward port commit 29eb572941501c40ac6e62dbc5043bf9ee76ee56 from git://tipc.cslab.ericsson.net/pub/git/people/allan/tipc.git Original commit message: Increase frequency of load distribution over broadcast link This patch enhances the behavior of TIPC's broadcast link so that it alternates between redundant bearers (if available) after every message sent, rather than after every 10 messages. This change helps to speed up delivery of retransmitted messages by ensuring that they are not sent repeatedly over a bearer that is no longer working, but not yet recognized as failed. Tested by myself in the latest net-2.6 tree using the tipc sanity test suite Originally-signed-off-by: Allan Stephens <allan.stephens@windriver.com> Signed-off-by: Neil Horman <nhorman@tuxdriver.com> bcast.c | 35 ++++++++++++++--------------------- 1 file changed, 14 insertions(+), 21 deletions(-) Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-16	net: core: add IFLA_STATS64 support	Jan Engelhardt	2	-1/+74
	`ip -s link` shows interface counters truncated to 32 bit. This is because interface statistics are transported only in 32-bit quantity to userspace. This commit adds a new IFLA_STATS64 attribute that exports them in full 64 bit. References: http://lkml.indiana.edu/hypermail/linux/kernel/0307.3/0215.html Signed-off-by: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: David S. Miller <davem@davemloft.net>
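The truncation that motivates the new attribute can be shown in a few lines. This is only an illustration of the arithmetic; export_stat32() is a hypothetical helper, not a kernel function.

```c
#include <stdint.h>

/* Export a 64-bit counter through a 32-bit field: the upper 32 bits are
 * dropped, so the value userspace sees wraps every 2^32 units. */
static uint32_t export_stat32(uint64_t counter)
{
	return (uint32_t)counter;
}
```

For example, a byte counter at 2^32 + 5 would be reported as 5 through the 32-bit path, whereas a 64-bit attribute carries the full value.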
2010-03-16	net: tcp: make veno selectable as default congestion module	Jan Engelhardt	1	-0/+4
	Signed-off-by: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-16	net: tcp: make hybla selectable as default congestion module	Jan Engelhardt	1	-0/+4
	Signed-off-by: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-16	net: remove rcu locking from fib_rules_event()	Eric Dumazet	1	-8/+2
	We hold RTNL at this point and don't use RCU variants of list traversals, so we don't need rcu_read_lock()/rcu_read_unlock() Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-16	bridge: per-cpu packet statistics (v3)	stephen hemminger	4	-6/+57
	The shared packet statistics are a potential source of slow down on bridged traffic. Convert to per-cpu array, but only keep those statistics which change per-packet. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-16	rps: Receive Packet Steering	Tom Herbert	5	-59/+538
	This patch implements software receive side packet steering (RPS). RPS distributes the load of received packet processing across multiple CPUs. Problem statement: Protocol processing done in the NAPI context for received packets is serialized per device queue and becomes a bottleneck under high packet load. This substantially limits pps that can be achieved on a single queue NIC and provides no scaling with multiple cores. This solution queues packets early on in the receive path on the backlog queues of other CPUs. This allows protocol processing (e.g. IP and TCP) to be performed on packets in parallel. For each device (or each receive queue in a multi-queue device) a mask of CPUs is set to indicate the CPUs that can process packets. A CPU is selected on a per packet basis by hashing contents of the packet header (e.g. the TCP or UDP 4-tuple) and using the result to index into the CPU mask. The IPI mechanism is used to raise networking receive softirqs between CPUs. This effectively emulates in software what a multi-queue NIC can provide, but is generic, requiring no device support. Many devices now provide a hash over the 4-tuple on a per packet basis (e.g. the Toeplitz hash). This patch allows drivers to set the HW reported hash in an skb field, and that value in turn is used to index into the RPS maps. Using the HW generated hash can avoid cache misses on the packet when steering it to a remote CPU. The CPU mask is set on a per device and per queue basis in the sysfs variable /sys/class/net/<device>/queues/rx-<n>/rps_cpus. This is a set of canonical bit maps for receive queues in the device (numbered by <n>). If a device does not support multi-queue, a single variable is used for the device (rx-0). Generally, we have found this technique increases pps capabilities of a single queue device with good CPU utilization. Optimal settings for the CPU mask seem to depend on architectures and cache hierarchy.
Below are some results running 500 instances of netperf TCP_RR test with 1 byte req. and resp. Results show cumulative transaction rate and system CPU utilization. e1000e on 8 core Intel Without RPS: 108K tps at 33% CPU With RPS: 311K tps at 64% CPU forcedeth on 16 core AMD Without RPS: 156K tps at 15% CPU With RPS: 404K tps at 49% CPU bnx2x on 16 core AMD Without RPS: 567K tps at 61% CPU (4 HW RX queues) Without RPS: 738K tps at 96% CPU (8 HW RX queues) With RPS: 854K tps at 76% CPU (4 HW RX queues) Caveats: - The benefits of this patch are dependent on architecture and cache hierarchy. Tuning the masks to get best performance is probably necessary. - This patch adds overhead in the path for processing a single packet. In a lightly loaded server this overhead may eliminate the advantages of increased parallelism, and possibly cause some relative performance degradation. We have found that masks that are cache aware (share same caches with the interrupting CPU) mitigate much of this. - The RPS masks can be changed dynamically, however whenever the mask is changed this introduces the possibility of generating out of order packets. It's probably best not to change the masks too frequently. Signed-off-by: Tom Herbert <therbert@google.com> include/linux/netdevice.h | 32 ++++- include/linux/skbuff.h | 3 + net/core/dev.c | 335 +++++++++++++++++++++++++++++++++++++-------- net/core/net-sysfs.c | 225 ++++++++++++++++++++++++++++++- net/core/skbuff.c | 2 + 5 files changed, 538 insertions(+), 59 deletions(-) Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
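The CPU selection step described in the RPS message (hash the flow, then index into the set of allowed CPUs) can be sketched as below. All names are hypothetical, the hash is a generic integer mixer standing in for whatever the stack or the NIC computes (it is not the Toeplitz hash), and the multiply-and-shift indexing is one common way to map a 32-bit hash onto a small table, assumed here for illustration.

```c
#include <stdint.h>

#define MAX_CPUS 64

/* Hypothetical map: the list of CPUs whose bit is set in the mask. */
struct rps_map_sketch {
	unsigned int len;
	uint8_t cpus[MAX_CPUS];
};

/* Expand a CPU bit mask (as written to rps_cpus) into an index table. */
static void map_from_mask(struct rps_map_sketch *m, uint64_t mask)
{
	unsigned int cpu;

	m->len = 0;
	for (cpu = 0; cpu < MAX_CPUS; cpu++)
		if (mask & (UINT64_C(1) << cpu))
			m->cpus[m->len++] = (uint8_t)cpu;
}

/* Stand-in flow hash over the 4-tuple; any well-mixed hash would do. */
static uint32_t flow_hash(uint32_t saddr, uint32_t daddr,
			  uint16_t sport, uint16_t dport)
{
	uint32_t h = saddr ^ (daddr * UINT32_C(0x9e3779b1)) ^
		     (((uint32_t)sport << 16) | dport);

	h ^= h >> 16;
	h *= UINT32_C(0x85ebca6b);
	h ^= h >> 13;
	h *= UINT32_C(0xc2b2ae35);
	h ^= h >> 16;
	return h;
}

/* Scale the 32-bit hash onto [0, len) and look up the target CPU.
 * Returns -1 when no CPU is enabled, i.e. steering is off. */
static int select_cpu(const struct rps_map_sketch *m, uint32_t hash)
{
	if (m->len == 0)
		return -1;
	return m->cpus[((uint64_t)hash * m->len) >> 32];
}
```

Since the hash depends only on the flow's 4-tuple, all packets of one flow map to the same CPU for as long as the mask is unchanged, which is why changing the mask can reorder packets.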
2010-03-16	RDS: Enable per-cpu workqueue threads	Tina Yang	1	-1/+1
	Create per-cpu workqueue threads instead of a single krdsd thread. This is a step towards better scalability. Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-16	RDS: Do not call set_page_dirty() with irqs off	Andy Grover	3	-7/+12
	set_page_dirty() unconditionally re-enables interrupts, so if we call it with irqs off, they will be on after the call, and that's bad. This patch moves the call after we've re-enabled interrupts in send_drop_to(), so it's safe. Also, add BUG_ONs to let us know if we ever do call set_page_dirty with interrupts off. Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-16	RDS: Properly unmap when getting a remote access error	Sherman Pun	1	-1/+5
	If the RDMA op has aborted with a remote access error, in addition to what we already do (tell userspace it has completed with an error) also unmap it and put() the rm. Otherwise, hangs may occur on arches that track maps and will not exit without proper cleanup. Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-16	RDS: only put sockets that have seen congestion on the poll_waitq	Andy Grover	3	-2/+11
	rds_poll_waitq's listeners will be awoken if we receive a congestion notification. Bad performance may result because all polled sockets contend for this single lock. However, it should not be necessary to wake pollers when a congestion update arrives if they have never experienced congestion, and not putting these on the waitq will hopefully greatly reduce contention. Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>