aboutsummaryrefslogtreecommitdiffstats
path: root/net/netfilter (follow)
AgeCommit message (Collapse)AuthorFilesLines
2012-04-30netfilter: xt_CT: fix wrong checking in the timeout assignment pathPablo Neira Ayuso1-1/+1
The current checking always succeeded. We have to check the first character of the string to check that it's empty, thus, skipping the timeout path. This fixes the use of the CT target without the timeout option. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-04-30ipvs: kernel oops - do_ip_vs_get_ctlHans Schillstrom2-22/+39
Change order of init so netns init is ready when register ioctl and netlink. Ver2 Whitespace fixes and __init added. Reported-by: "Ryan O'Hara" <rohara@redhat.com> Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2012-04-30ipvs: take care of return value from protocol init_netnsHans Schillstrom4-5/+21
ip_vs_create_timeout_table() can return NULL All functions protocol init_netns is affected of this patch. Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>
2012-04-30ipvs: null check of net->ipvs in lblc(r) shedulersHans Schillstrom2-0/+6
Avoid crash when registering shedulers after the IPVS core initialization for netns fails. Do this by checking for present core (net->ipvs). Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>
2012-04-26ipvs: reset ipvs pointer in netnsJulian Anastasov1-0/+2
Make sure net->ipvs is reset on netns cleanup or failed initialization. It is needed for IPVS applications to know that IPVS core is not loaded in netns. Signed-off-by: Julian Anastasov <ja@ssi.bg> Acked-by: Hans Schillstrom <hans.schillstrom@ericsson.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2012-04-26ipvs: add check in ftp for initialized coreJulian Anastasov1-0/+2
Avoid crash when registering ip_vs_ftp after the IPVS core initialization for netns fails. Do this by checking for present core (net->ipvs). Signed-off-by: Julian Anastasov <ja@ssi.bg> Acked-by: Hans Schillstrom <hans.schillstrom@ericsson.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2012-04-25ipvs: fix crash in ip_vs_control_net_cleanup on unloadJulian Anastasov1-2/+2
commit 14e405461e664b777e2a5636e10b2ebf36a686ec (2.6.39) ("Add __ip_vs_control_{init,cleanup}_sysctl()") introduced regression due to wrong __net_init for __ip_vs_control_cleanup_sysctl. This leads to crash when the ip_vs module is unloaded. Fix it by changing __net_init to __net_exit for the function that is already renamed to ip_vs_control_net_cleanup_sysctl. Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Hans Schillstrom <hans@schillstrom.com> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-04-25ipvs: Verify that IP_VS protocol has been registeredSasha Levin1-9/+18
The registration of a protocol might fail, there were no checks and all registrations were assumed to be correct. This lead to NULL ptr dereferences when apps tried registering. For example: [ 1293.226051] BUG: unable to handle kernel NULL pointer dereference at 0000000000000018 [ 1293.227038] IP: [<ffffffff822aacb0>] tcp_register_app+0x60/0xb0 [ 1293.227038] PGD 391de067 PUD 6c20b067 PMD 0 [ 1293.227038] Oops: 0000 [#1] PREEMPT SMP [ 1293.227038] CPU 1 [ 1293.227038] Pid: 19609, comm: trinity Tainted: G W 3.4.0-rc1-next-20120405-sasha-dirty #57 [ 1293.227038] RIP: 0010:[<ffffffff822aacb0>] [<ffffffff822aacb0>] tcp_register_app+0x60/0xb0 [ 1293.227038] RSP: 0018:ffff880038c1dd18 EFLAGS: 00010286 [ 1293.227038] RAX: ffffffffffffffc0 RBX: 0000000000001500 RCX: 0000000000010000 [ 1293.227038] RDX: 0000000000000000 RSI: ffff88003a2d5888 RDI: 0000000000000282 [ 1293.227038] RBP: ffff880038c1dd48 R08: 0000000000000000 R09: 0000000000000000 [ 1293.227038] R10: 0000000000000000 R11: 0000000000000000 R12: ffff88003a2d5668 [ 1293.227038] R13: ffff88003a2d5988 R14: ffff8800696a8ff8 R15: 0000000000000000 [ 1293.227038] FS: 00007f01930d9700(0000) GS:ffff88007ce00000(0000) knlGS:0000000000000000 [ 1293.227038] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [ 1293.227038] CR2: 0000000000000018 CR3: 0000000065dfc000 CR4: 00000000000406e0 [ 1293.227038] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 1293.227038] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [ 1293.227038] Process trinity (pid: 19609, threadinfo ffff880038c1c000, task ffff88002dc73000) [ 1293.227038] Stack: [ 1293.227038] ffff880038c1dd48 00000000fffffff4 ffff8800696aada0 ffff8800694f5580 [ 1293.227038] ffffffff8369f1e0 0000000000001500 ffff880038c1dd98 ffffffff822a716b [ 1293.227038] 0000000000000000 ffff8800696a8ff8 0000000000000015 ffff8800694f5580 [ 1293.227038] Call Trace: [ 1293.227038] [<ffffffff822a716b>] ip_vs_app_inc_new+0xdb/0x180 [ 1293.227038] [<ffffffff822a7258>] register_ip_vs_app_inc+0x48/0x70 [ 1293.227038] [<ffffffff822b2fea>] __ip_vs_ftp_init+0xba/0x140 [ 1293.227038] [<ffffffff821c9060>] ops_init+0x80/0x90 [ 1293.227038] [<ffffffff821c90cb>] setup_net+0x5b/0xe0 [ 1293.227038] [<ffffffff821c9416>] copy_net_ns+0x76/0x100 [ 1293.227038] [<ffffffff810dc92b>] create_new_namespaces+0xfb/0x190 [ 1293.227038] [<ffffffff810dca21>] unshare_nsproxy_namespaces+0x61/0x80 [ 1293.227038] [<ffffffff810afd1f>] sys_unshare+0xff/0x290 [ 1293.227038] [<ffffffff8187622e>] ? trace_hardirqs_on_thunk+0x3a/0x3f [ 1293.227038] [<ffffffff82665539>] system_call_fastpath+0x16/0x1b [ 1293.227038] Code: 89 c7 e8 34 91 3b 00 89 de 66 c1 ee 04 31 de 83 e6 0f 48 83 c6 22 48 c1 e6 04 4a 8b 14 26 49 8d 34 34 48 8d 42 c0 48 39 d6 74 13 <66> 39 58 58 74 22 48 8b 48 40 48 8d 41 c0 48 39 ce 75 ed 49 8d [ 1293.227038] RIP [<ffffffff822aacb0>] tcp_register_app+0x60/0xb0 [ 1293.227038] RSP <ffff880038c1dd18> [ 1293.227038] CR2: 0000000000000018 [ 1293.379284] ---[ end trace 364ab40c7011a009 ]--- [ 1293.381182] Kernel panic - not syncing: Fatal exception in interrupt Signed-off-by: Sasha Levin <levinsasha928@gmail.com> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-04-10netfilter: nf_conntrack: fix incorrect logic in nf_conntrack_init_netGao feng1-1/+1
in function nf_conntrack_init_net,when nf_conntrack_timeout_init falied, we should call nf_conntrack_ecache_fini to do rollback. but the current code calls nf_conntrack_timeout_fini. Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-04-09netfilter: nf_ct_tcp: don't scale the size of the window up twiceChangli Gao1-2/+2
For a picked up connection, the window win is scaled twice: one is by the initialization code, and the other is by the sender updating code. I use the temporary variable swin instead of modifying the variable win. Signed-off-by: Changli Gao <xiaosuo@gmail.com> Acked-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-04-03netfilter: nf_conntrack: fix count leak in error path of __nf_conntrack_allocPablo Neira Ayuso1-0/+1
We have to decrement the conntrack counter if we fail to access the zone extension. Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-03netfilter: xt_CT: fix missing put timeout object in error pathPablo Neira Ayuso1-5/+19
The error path misses putting the timeout object. This patch adds new function xt_ct_tg_timeout_put() to put the timeout object. Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-03netfilter: xt_CT: allocation has to be GFP_ATOMIC under rcu_read_lock sectionPablo Neira Ayuso1-1/+1
Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-03Merge branch 'master' of git://1984.lsi.us.es/netDavid S. Miller1-0/+2
2012-04-03netfilter: xt_CT: remove a compile warningPablo Neira Ayuso1-0/+2
If CONFIG_NF_CONNTRACK_TIMEOUT=n we have following warning : CC [M] net/netfilter/xt_CT.o net/netfilter/xt_CT.c: In function ‘xt_ct_tg_check_v1’: net/netfilter/xt_CT.c:284: warning: label ‘err4’ defined but not used Reported-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-04-02Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds1-1/+1
Pull networking fixes from David Miller: 1) Provide device string properly for USB i2400m wimax devices, also don't OOPS when providing firmware string. From Phil Sutter. 2) Add support for sh_eth SH7734 chips, from Nobuhiro Iwamatsu. 3) Add another device ID to USB zaurus driver, from Guan Xin. 4) Loop index start in pool vector iterator is wrong causing MAC to not get configured in bnx2x driver, fix from Dmitry Kravkov. 5) EQL driver assumes HZ=100, fix from Eric Dumazet. 6) Now that skb_add_rx_frag() can specify the truesize increment separately, do so in f_phonet and cdc_phonet, also from Eric Dumazet. 7) virtio_net accidently uses net_ratelimit() not only on the kernel warning but also the statistic bump, fix from Rick Jones. 8) ip_route_input_mc() uses fixed init_net namespace, oops, use dev_net(dev) instead. Fix from Benjamin LaHaise. 9) dev_forward_skb() needs to clear the incoming interface index of the SKB so that it looks like a new incoming packet, also from Benjamin LaHaise. 10) iwlwifi mistakenly initializes a channel entry as 2GHZ instead of 5GHZ, fix from Stanislav Yakovlev. 11) Missing kmalloc() return value checks in orinoco, from Santosh Nayak. 12) ath9k doesn't check for HT capabilities in the right way, it is checking ht_supported instead of the ATH9K_HW_CAP_HT flag. Fix from Sujith Manoharan. 13) Fix x86 BPF JIT emission of 16-bit immediate field of AND instructions, from Feiran Zhuang. 14) Avoid infinite loop in GARP code when registering sysfs entries. From David Ward. 15) rose protocol uses memcpy instead of memcmp in a device address comparison, oops. Fix from Daniel Borkmann. 16) Fix build of lpc_eth due to dev_hw_addr_rancom() interface being renamed to eth_hw_addr_random(). From Roland Stigge. 17) Make ipv6 RTM_GETROUTE interpret RTA_IIF attribute the same way that ipv4 does. Fix from Shmulik Ladkani. 18) via-rhine has an inverted bit test, causing suspend/resume regressions. Fix from Andreas Mohr. 19) RIONET assumes 4K page size, fix from Akinobu Mita. 20) Initialization of imask register in sky2 is buggy, because bits are "or'd" into an uninitialized local variable. Fix from Lino Sanfilippo. 21) Fix FCOE checksum offload handling, from Yi Zou. 22) Fix VLAN processing regression in e1000, from Jiri Pirko. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (52 commits) sky2: dont overwrite settings for PHY Quick link tg3: Fix 5717 serdes powerdown problem net: usb: cdc_eem: fix mtu net: sh_eth: fix endian check for architecture independent usb/rtl8150 : Remove duplicated definitions rionet: fix page allocation order of rionet_active via-rhine: fix wait-bit inversion. ipv6: Fix RTM_GETROUTE's interpretation of RTA_IIF to be consistent with ipv4 net: lpc_eth: Fix rename of dev_hw_addr_random net/netfilter/nfnetlink_acct.c: use linux/atomic.h rose_dev: fix memcpy-bug in rose_set_mac_address Fix non TBI PHY access; a bad merge undid bug fix in a previous commit. net/garp: avoid infinite loop if attribute already exists x86 bpf_jit: fix a bug in emitting the 16-bit immediate operand of AND bonding: emit event when bonding changes MAC mac80211: fix oper channel timestamp updation ath9k: Use HW HT capabilites properly MAINTAINERS: adding maintainer for ipw2x00 net: orinoco: add error handling for failed kmalloc(). net/wireless: ipw2x00: fix a typo in wiphy struct initilization ...
2012-04-01net/netfilter/nfnetlink_acct.c: use linux/atomic.hAndrew Morton1-1/+1
There's no known problem here, but this is one of only two non-arch files in the kernel which use asm/atomic.h instead of linux/atomic.h. Acked-by: Pablo Neira Ayuso <pablo@netfilter.org> Cc: Patrick McHardy <kaber@trash.net> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-03-28Merge tag 'split-asm_system_h-for-linus-20120328' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-asm_systemLinus Torvalds3-3/+0
Pull "Disintegrate and delete asm/system.h" from David Howells: "Here are a bunch of patches to disintegrate asm/system.h into a set of separate bits to relieve the problem of circular inclusion dependencies. I've built all the working defconfigs from all the arches that I can and made sure that they don't break. The reason for these patches is that I recently encountered a circular dependency problem that came about when I produced some patches to optimise get_order() by rewriting it to use ilog2(). This uses bitops - and on the SH arch asm/bitops.h drags in asm-generic/get_order.h by a circuituous route involving asm/system.h. The main difficulty seems to be asm/system.h. It holds a number of low level bits with no/few dependencies that are commonly used (eg. memory barriers) and a number of bits with more dependencies that aren't used in many places (eg. switch_to()). These patches break asm/system.h up into the following core pieces: (1) asm/barrier.h Move memory barriers here. This already done for MIPS and Alpha. (2) asm/switch_to.h Move switch_to() and related stuff here. (3) asm/exec.h Move arch_align_stack() here. Other process execution related bits could perhaps go here from asm/processor.h. (4) asm/cmpxchg.h Move xchg() and cmpxchg() here as they're full word atomic ops and frequently used by atomic_xchg() and atomic_cmpxchg(). (5) asm/bug.h Move die() and related bits. (6) asm/auxvec.h Move AT_VECTOR_SIZE_ARCH here. Other arch headers are created as needed on a per-arch basis." Fixed up some conflicts from other header file cleanups and moving code around that has happened in the meantime, so David's testing is somewhat weakened by that. We'll find out anything that got broken and fix it.. * tag 'split-asm_system_h-for-linus-20120328' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-asm_system: (38 commits) Delete all instances of asm/system.h Remove all #inclusions of asm/system.h Add #includes needed to permit the removal of asm/system.h Move all declarations of free_initmem() to linux/mm.h Disintegrate asm/system.h for OpenRISC Split arch_align_stack() out from asm-generic/system.h Split the switch_to() wrapper out of asm-generic/system.h Move the asm-generic/system.h xchg() implementation to asm-generic/cmpxchg.h Create asm-generic/barrier.h Make asm-generic/cmpxchg.h #include asm-generic/cmpxchg-local.h Disintegrate asm/system.h for Xtensa Disintegrate asm/system.h for Unicore32 [based on ver #3, changed by gxt] Disintegrate asm/system.h for Tile Disintegrate asm/system.h for Sparc Disintegrate asm/system.h for SH Disintegrate asm/system.h for Score Disintegrate asm/system.h for S390 Disintegrate asm/system.h for PowerPC Disintegrate asm/system.h for PA-RISC Disintegrate asm/system.h for MN10300 ...
2012-03-28Remove all #inclusions of asm/system.hDavid Howells3-3/+0
Remove all #inclusions of asm/system.h preparatory to splitting and killing it. Performed with the following command: perl -p -i -e 's!^#\s*include\s*<asm/system[.]h>.*\n!!' `grep -Irl '^#\s*include\s*<asm/system[.]h>' *` Signed-off-by: David Howells <dhowells@redhat.com>
2012-03-23netfilter: nf_conntrack: permanently attach timeout policy to conntrackPablo Neira Ayuso1-17/+22
We need to permanently attach the timeout policy to the conntrack, otherwise we may apply the custom timeout policy inconsistently. Without this patch, the following example: nfct timeout add test inet icmp timeout 100 iptables -I PREROUTING -t raw -p icmp -s 1.1.1.1 -j CT --timeout test Will only apply the custom timeout policy to outgoing packets from 1.1.1.1, but not to reply packets from 2.2.2.2 going to 1.1.1.1. To fix this issue, this patch modifies the current logic to attach the timeout policy when the first packet is seen (which is when the conntrack entry is created). Then, we keep using the attached timeout policy until the conntrack entry is destroyed. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-03-23netfilter: xt_CT: fix assignation of the generic protocol trackerPablo Neira Ayuso1-1/+8
`iptables -p all' uses 0 to match all protocols, while the conntrack subsystem uses 255. We still need `-p all' to attach the custom timeout policies for the generic protocol tracker. Moreover, we may use `iptables -p sctp' while the SCTP tracker is not loaded. In that case, we have to default on the generic protocol tracker. Another possibility is `iptables -p ip' that should be supported as well. This patch makes sure we validate all possible scenarios. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-03-23netfilter: xt_CT: missing rcu_read_lock section in timeout assignmentPablo Neira Ayuso1-6/+12
Fix a dereference to pointer without rcu_read_lock held. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-03-23netfilter: cttimeout: fix dependency with l4protocol conntrack modulePablo Neira Ayuso3-24/+48
This patch introduces nf_conntrack_l4proto_find_get() and nf_conntrack_l4proto_put() to fix module dependencies between timeout objects and l4-protocol conntrack modules. Thus, we make sure that the module cannot be removed if it is used by any of the cttimeout objects. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-03-22netfilter: xt_LOG: use CONFIG_IP6_NF_IPTABLES instead of CONFIG_IPV6Pablo Neira Ayuso1-6/+6
This fixes the following linking error: xt_LOG.c:(.text+0x789b1): undefined reference to `ip6t_ext_hdr' ifdefs have to use CONFIG_IP6_NF_IPTABLES instead of CONFIG_IPV6. Acked-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-03-20Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-nextLinus Torvalds31-295/+2996
Pull networking merge from David Miller: "1) Move ixgbe driver over to purely page based buffering on receive. From Alexander Duyck. 2) Add receive packet steering support to e1000e, from Bruce Allan. 3) Convert TCP MD5 support over to RCU, from Eric Dumazet. 4) Reduce cpu usage in handling out-of-order TCP packets on modern systems, also from Eric Dumazet. 5) Support the IP{,V6}_UNICAST_IF socket options, making the wine folks happy, from Erich Hoover. 6) Support VLAN trunking from guests in hyperv driver, from Haiyang Zhang. 7) Support byte-queue-limtis in r8169, from Igor Maravic. 8) Outline code intended for IP_RECVTOS in IP_PKTOPTIONS existed but was never properly implemented, Jiri Benc fixed that. 9) 64-bit statistics support in r8169 and 8139too, from Junchang Wang. 10) Support kernel side dump filtering by ctmark in netfilter ctnetlink, from Pablo Neira Ayuso. 11) Support byte-queue-limits in gianfar driver, from Paul Gortmaker. 12) Add new peek socket options to assist with socket migration, from Pavel Emelyanov. 13) Add sch_plug packet scheduler whose queue is controlled by userland daemons using explicit freeze and release commands. From Shriram Rajagopalan. 14) Fix FCOE checksum offload handling on transmit, from Yi Zou." * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1846 commits) Fix pppol2tp getsockname() Remove printk from rds_sendmsg ipv6: fix incorrent ipv6 ipsec packet fragment cpsw: Hook up default ndo_change_mtu. net: qmi_wwan: fix build error due to cdc-wdm dependecy netdev: driver: ethernet: Add TI CPSW driver netdev: driver: ethernet: add cpsw address lookup engine support phy: add am79c874 PHY support mlx4_core: fix race on comm channel bonding: send igmp report for its master fs_enet: Add MPC5125 FEC support and PHY interface selection net: bpf_jit: fix BPF_S_LDX_B_MSH compilation net: update the usage of CHECKSUM_UNNECESSARY fcoe: use CHECKSUM_UNNECESSARY instead of CHECKSUM_PARTIAL on tx net: do not do gso for CHECKSUM_UNNECESSARY in netif_needs_gso ixgbe: Fix issues with SR-IOV loopback when flow control is disabled net/hyperv: Fix the code handling tx busy ixgbe: fix namespace issues when FCoE/DCB is not enabled rtlwifi: Remove unused ETH_ADDR_LEN defines igbvf: Use ETH_ALEN ... Fix up fairly trivial conflicts in drivers/isdn/gigaset/interface.c and drivers/net/usb/{Kconfig,qmi_wwan.c} as per David.
2012-03-20Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds1-3/+3
Pull perf events changes for v3.4 from Ingo Molnar: - New "hardware based branch profiling" feature both on the kernel and the tooling side, on CPUs that support it. (modern x86 Intel CPUs with the 'LBR' hardware feature currently.) This new feature is basically a sophisticated 'magnifying glass' for branch execution - something that is pretty difficult to extract from regular, function histogram centric profiles. The simplest mode is activated via 'perf record -b', and the result looks like this in perf report: $ perf record -b any_call,u -e cycles:u branchy $ perf report -b --sort=symbol 52.34% [.] main [.] f1 24.04% [.] f1 [.] f3 23.60% [.] f1 [.] f2 0.01% [k] _IO_new_file_xsputn [k] _IO_file_overflow 0.01% [k] _IO_vfprintf_internal [k] _IO_new_file_xsputn 0.01% [k] _IO_vfprintf_internal [k] strchrnul 0.01% [k] __printf [k] _IO_vfprintf_internal 0.01% [k] main [k] __printf This output shows from/to branch columns and shows the highest percentage (from,to) jump combinations - i.e. the most likely taken branches in the system. "branches" can also include function calls and any other synchronous and asynchronous transitions of the instruction pointer that are not 'next instruction' - such as system calls, traps, interrupts, etc. This feature comes with (hopefully intuitive) flat ascii and TUI support in perf report. - Various 'perf annotate' visual improvements for us assembly junkies. It will now recognize function calls in the TUI and by hitting enter you can follow the call (recursively) and back, amongst other improvements. - Multiple threads/processes recording support in perf record, perf stat, perf top - which is activated via a comma-list of PIDs: perf top -p 21483,21485 perf stat -p 21483,21485 -ddd perf record -p 21483,21485 - Support for per UID views, via the --uid paramter to perf top, perf report, etc. For example 'perf top --uid mingo' will only show the tasks that I am running, excluding other users, root, etc. - Jump label restructurings and improvements - this includes the factoring out of the (hopefully much clearer) include/linux/static_key.h generic facility: struct static_key key = STATIC_KEY_INIT_FALSE; ... if (static_key_false(&key)) do unlikely code else do likely code ... static_key_slow_inc(); ... static_key_slow_inc(); ... The static_key_false() branch will be generated into the code with as little impact to the likely code path as possible. the static_key_slow_*() APIs flip the branch via live kernel code patching. This facility can now be used more widely within the kernel to micro-optimize hot branches whose likelihood matches the static-key usage and fast/slow cost patterns. - SW function tracer improvements: perf support and filtering support. - Various hardenings of the perf.data ABI, to make older perf.data's smoother on newer tool versions, to make new features integrate more smoothly, to support cross-endian recording/analyzing workflows better, etc. - Restructuring of the kprobes code, the splitting out of 'optprobes', and a corner case bugfix. - Allow the tracing of kernel console output (printk). - Improvements/fixes to user-space RDPMC support, allowing user-space self-profiling code to extract PMU counts without performing any system calls, while playing nice with the kernel side. - 'perf bench' improvements - ... and lots of internal restructurings, cleanups and fixes that made these features possible. And, as usual this list is incomplete as there were also lots of other improvements * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (120 commits) perf report: Fix annotate double quit issue in branch view mode perf report: Remove duplicate annotate choice in branch view mode perf/x86: Prettify pmu config literals perf report: Enable TUI in branch view mode perf report: Auto-detect branch stack sampling mode perf record: Add HEADER_BRANCH_STACK tag perf record: Provide default branch stack sampling mode option perf tools: Make perf able to read files from older ABIs perf tools: Fix ABI compatibility bug in print_event_desc() perf tools: Enable reading of perf.data files from different ABI rev perf: Add ABI reference sizes perf report: Add support for taken branch sampling perf record: Add support for sampling taken branch perf tools: Add code to support PERF_SAMPLE_BRANCH_STACK x86/kprobes: Split out optprobe related code to kprobes-opt.c x86/kprobes: Fix a bug which can modify kernel code permanently x86/kprobes: Fix instruction recovery on optimized path perf: Add callback to flush branch_stack on context switch perf: Disable PERF_SAMPLE_BRANCH_* when not supported perf/x86: Add LBR software filter support for Intel CPUs ...
2012-03-18Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller1-11/+12
2012-03-17netfilter: ctnetlink: fix race between delete and timeout expirationPablo Neira Ayuso1-11/+12
Kerin Millar reported hardlockups while running `conntrackd -c' in a busy firewall. That system (with several processors) was acting as backup in a primary-backup setup. After several tries, I found a race condition between the deletion operation of ctnetlink and timeout expiration. This patch fixes this problem. Tested-by: Kerin Millar <kerframil@gmail.com> Reported-by: Kerin Millar <kerframil@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-03-12Merge branch 'perf/urgent' into perf/coreIngo Molnar2-5/+6
Merge reason: We are going to queue up a dependent patch. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2012-03-09Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2-5/+6
2012-03-07netfilter: xt_CT: allow to attach timeout policy + glue codePablo Neira Ayuso3-16/+256
This patch allows you to attach the timeout policy via the CT target, it adds a new revision of the target to ensure backward compatibility. Moreover, it also contains the glue code to stick the timeout object defined via nfnetlink_cttimeout to the given flow. Example usage (it requires installing the nfct tool and libnetfilter_cttimeout): 1) create the timeout policy: nfct timeout add tcp-policy0 inet tcp \ established 1000 close 10 time_wait 10 last_ack 10 2) attach the timeout policy to the packet: iptables -I PREROUTING -t raw -p tcp -j CT --timeout tcp-policy0 You have to install the following user-space software: a) libnetfilter_cttimeout: git://git.netfilter.org/libnetfilter_cttimeout b) nfct: git://git.netfilter.org/nfct You also have to get iptables with -j CT --timeout support. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-03-07netfilter: nf_ct_ext: add timeout extensionPablo Neira Ayuso5-10/+78
This patch adds the timeout extension, which allows you to attach specific timeout policies to flows. This extension is only used by the template conntrack. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-03-07netfilter: add cttimeout infrastructure for fine timeout tuningPablo Neira Ayuso10-0/+909
This patch adds the infrastructure to add fine timeout tuning over nfnetlink. Now you can use the NFNL_SUBSYS_CTNETLINK_TIMEOUT subsystem to create/delete/dump timeout objects that contain some specific timeout policy for one flow. The follow up patches will allow you attach timeout policy object to conntrack via the CT target and the conntrack extension infrastructure. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-03-07netfilter: nf_conntrack: pass timeout array to l4->new and l4->packetPablo Neira Ayuso8-44/+102
This patch defines a new interface for l4 protocol trackers: unsigned int *(*get_timeouts)(struct net *net); that is used to return the array of unsigned int that contains the timeouts that will be applied for this flow. This is passed to the l4proto->new(...) and l4proto->packet(...) functions to specify the timeout policy. This interface allows per-net global timeout configuration (although only DCCP supports this by now) and it will allow custom custom timeout configuration by means of follow-up patches. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-03-07netfilter: nf_ct_gre: add unsigned int array to define timeoutsPablo Neira Ayuso1-4/+12
This patch adds an array to define the default GRE timeouts. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-03-07netfilter: nf_ct_tcp: move retransmission and unacknowledged timeout to arrayPablo Neira Ayuso1-14/+13
This patch moves the retransmission and unacknowledged timeouts to the tcp_timeouts array. This change is required by follow-up patches. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-03-07netfilter: nf_ct_udp[lite]: convert UDP[lite] timeouts to arrayPablo Neira Ayuso2-18/+37
Use one array to store the UDP timeouts instead of two variables. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-03-07netfilter: ctnetlink: fix lockep splatsHans Schillstrom1-16/+24
net/netfilter/nf_conntrack_proto.c:70 suspicious rcu_dereference_check() usage! other info that might help us debug this: rcu_scheduler_active = 1, debug_locks = 0 3 locks held by conntrack/3235: nfnl_lock+0x17/0x20 netlink_dump+0x32/0x240 ctnetlink_dump_table+0x3e/0x170 [nf_conntrack_netlink] stack backtrace: Pid: 3235, comm: conntrack Tainted: G W 3.2.0+ #511 Call Trace: [<ffffffff8108ce45>] lockdep_rcu_suspicious+0xe5/0x100 [<ffffffffa00ec6e1>] __nf_ct_l4proto_find+0x81/0xb0 [nf_conntrack] [<ffffffffa0115675>] ctnetlink_fill_info+0x215/0x5f0 [nf_conntrack_netlink] [<ffffffffa0115dc1>] ctnetlink_dump_table+0xd1/0x170 [nf_conntrack_netlink] [<ffffffff815fbdbf>] netlink_dump+0x7f/0x240 [<ffffffff81090f9d>] ? trace_hardirqs_on+0xd/0x10 [<ffffffff815fd34f>] netlink_dump_start+0xdf/0x190 [<ffffffffa0111490>] ? ctnetlink_change_nat_seq_adj+0x160/0x160 [nf_conntrack_netlink] [<ffffffffa0115cf0>] ? ctnetlink_get_conntrack+0x2a0/0x2a0 [nf_conntrack_netlink] [<ffffffffa0115ad9>] ctnetlink_get_conntrack+0x89/0x2a0 [nf_conntrack_netlink] [<ffffffff81603a47>] nfnetlink_rcv_msg+0x467/0x5f0 [<ffffffff81603a7c>] ? nfnetlink_rcv_msg+0x49c/0x5f0 [<ffffffff81603922>] ? nfnetlink_rcv_msg+0x342/0x5f0 [<ffffffff81071b21>] ? get_parent_ip+0x11/0x50 [<ffffffff816035e0>] ? nfnetlink_subsys_register+0x60/0x60 [<ffffffff815fed49>] netlink_rcv_skb+0xa9/0xd0 [<ffffffff81603475>] nfnetlink_rcv+0x15/0x20 [<ffffffff815fe70e>] netlink_unicast+0x1ae/0x1f0 [<ffffffff815fea16>] netlink_sendmsg+0x2c6/0x320 [<ffffffff815b2a87>] sock_sendmsg+0x117/0x130 [<ffffffff81125093>] ? might_fault+0x53/0xb0 [<ffffffff811250dc>] ? might_fault+0x9c/0xb0 [<ffffffff81125093>] ? might_fault+0x53/0xb0 [<ffffffff815b5991>] ? move_addr_to_kernel+0x71/0x80 [<ffffffff815b644e>] sys_sendto+0xfe/0x130 [<ffffffff815b5c94>] ? sys_bind+0xb4/0xd0 [<ffffffff817a8a0e>] ? retint_swapgs+0xe/0x13 [<ffffffff817afcd2>] system_call_fastpath+0x16/0x1b Reported-by: Hans Schillstrom <hans.schillstrom@ericsson.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com>
2012-03-07netfilter: xt_LOG: fix bogus extra layer-4 logging informationRichard Weinberger1-0/+4
In 16059b5 netfilter: merge ipt_LOG and ip6_LOG into xt_LOG, we have merged ipt_LOG and ip6t_LOG. However: IN=wlan0 OUT= MAC=xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx SRC=213.150.61.61 DST=192.168.1.133 LEN=40 TOS=0x00 PREC=0x00 TTL=117 ID=10539 DF PROTO=TCP SPT=80 DPT=49013 WINDOW=0 RES=0x00 ACK RST URGP=0 PROTO=UDPLITE SPT=80 DPT=49013 LEN=45843 PROTO=ICMP TYPE=0 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Several missing break in the code led to including bogus layer-4 information. This patch fixes this problem. Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-03-07netfilter: nf_ct_ecache: refactor nf_ct_deliver_cached_eventsTony Zelenoff1-26/+29
* identation lowered * some CPU cycles saved at delayed item variable initialization Signed-off-by: Tony Zelenoff <antonz@parallels.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-03-07netfilter: nf_ct_ecache: trailing whitespace removedTony Zelenoff1-1/+1
Signed-off-by: Tony Zelenoff <antonz@parallels.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-03-07netfilter: merge ipt_LOG and ip6_LOG into xt_LOGRichard Weinberger3-0/+931
ipt_LOG and ip6_LOG have a lot of common code, merge them to reduce duplicate code. Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-03-07netfilter: ctnetlink: allow to set expectfn for expectationsPablo Neira Ayuso2-1/+72
This patch allows you to set expectfn which is specifically used by the NAT side of most of the existing conntrack helpers. I have added a symbol map that uses a string as key to look up for the function that is attached to the expectation object. This is the best solution I came out with to solve this issue. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-03-07netfilter: ctnetlink: add NAT support for expectationsPablo Neira Ayuso1-2/+66
This patch adds the missing bits to create expectations that are created in NAT setups.
2012-03-07netfilter: ctnetlink: allow to set expectation classPablo Neira Ayuso1-1/+11
This patch allows you to set the expectation class. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-03-07netfilter: ctnetlink: allow to set helper for new expectationsPablo Neira Ayuso1-1/+29
This patch allow you to set the helper for newly created expectations based of the CTA_EXPECT_HELP_NAME attribute. Before this, the helper set was NULL. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-03-07netfilter: ipset: Exceptions support added to hash:*net* typesJozsef Kadlecsik4-93/+329
The "nomatch" keyword and option is added to the hash:*net* types, by which one can add exception entries to sets. Example: ipset create test hash:net ipset add test 192.168.0/24 ipset add test 192.168.0/30 nomatch In this case the IP addresses from 192.168.0/24 except 192.168.0/30 match the elements of the set. Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-03-07netfilter: ipset: use NFPROTO_ constantsJan Engelhardt13-60/+60
ipset is actually using NFPROTO values rather than AF (xt_set passes that along). Signed-off-by: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-03-06netfilter: nf_conntrack: fix early_drop with reliable event deliveryPablo Neira Ayuso1-2/+6
If reliable event delivery is enabled and ctnetlink fails to deliver the destroy event in early_drop, the conntrack subsystem cannot drop any the candidate flow that was planned to be evicted. Reported-by: Kerin Millar <kerframil@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-03-06netfilter: ctnetlink: remove incorrect spin_[un]lock_bh on NAT module autoloadPablo Neira Ayuso1-3/+0
Since 7d367e0, ctnetlink_new_conntrack is called without holding the nf_conntrack_lock spinlock. Thus, ctnetlink_parse_nat_setup does not require to release that spinlock anymore in the NAT module autoload case. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>