aboutsummaryrefslogtreecommitdiffstats
path: root/net (follow)
AgeCommit message (Collapse)AuthorFilesLines
2014-01-03{pktgen, xfrm} Show spi value properly when ipsec turned onFan Du1-1/+4
If user run pktgen plus ipsec by using spi, show spi value properly when cat /proc/net/pktgen/ethX Signed-off-by: Fan Du <fan.du@windriver.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2014-01-03{pktgen, xfrm} Introduce xfrm_state_lookup_byspi for pktgenFan Du2-7/+37
Introduce xfrm_state_lookup_byspi to find user specified by custom from "pgset spi xxx". Using this scheme, any flow regardless its saddr/daddr could be transform by SA specified with configurable spi. Signed-off-by: Fan Du <fan.du@windriver.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2014-01-03{pktgen, xfrm} Construct skb dst for tunnel mode transformationFan Du1-1/+27
IPsec tunnel mode encapuslation needs to set outter ip header with right protocol/ttl/id value with regard to skb->dst->child. Looking up a rt in a standard way is absolutely wrong for every packet transmission. In a simple way, construct a dst by setting neccessary information to make tunnel mode encapuslation working. Signed-off-by: Fan Du <fan.du@windriver.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2014-01-03{pktgen, xfrm} Using "pgset spi xxx" to spedifiy SA for a given flowFan Du1-0/+12
User could set specific SPI value to arm pktgen flow with IPsec transformation, instead of looking up SA by sadr/daddr. The reaseon to do so is because current state lookup scheme is both slow and, most important of all, in fact pktgen doesn't need to match any SA state addresses information, all it needs is the SA transfromation shell to do the encapuslation. And this option also provide user an alternative to using pktgen test existing SA without creating new ones. Signed-off-by: Fan Du <fan.du@windriver.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2014-01-03{pktgen, xfrm} Correct xfrm_state_lock usage in xfrm_stateonly_findFan Du1-2/+2
Acquiring xfrm_state_lock in process context is expected to turn BH off, as this lock is also used in BH context, namely xfrm state timer handler. Otherwise it surprises LOCKDEP with below messages. [ 81.422781] pktgen: Packet Generator for packet performance testing. Version: 2.74 [ 81.725194] [ 81.725211] ========================================================= [ 81.725212] [ INFO: possible irq lock inversion dependency detected ] [ 81.725215] 3.13.0-rc2+ #92 Not tainted [ 81.725216] --------------------------------------------------------- [ 81.725218] kpktgend_0/2780 just changed the state of lock: [ 81.725220] (xfrm_state_lock){+.+...}, at: [<ffffffff816dd751>] xfrm_stateonly_find+0x41/0x1f0 [ 81.725231] but this lock was taken by another, SOFTIRQ-safe lock in the past: [ 81.725232] (&(&x->lock)->rlock){+.-...} [ 81.725232] [ 81.725232] and interrupts could create inverse lock ordering between them. [ 81.725232] [ 81.725235] [ 81.725235] other info that might help us debug this: [ 81.725237] Possible interrupt unsafe locking scenario: [ 81.725237] [ 81.725238] CPU0 CPU1 [ 81.725240] ---- ---- [ 81.725241] lock(xfrm_state_lock); [ 81.725243] local_irq_disable(); [ 81.725244] lock(&(&x->lock)->rlock); [ 81.725246] lock(xfrm_state_lock); [ 81.725248] <Interrupt> [ 81.725249] lock(&(&x->lock)->rlock); [ 81.725251] [ 81.725251] *** DEADLOCK *** [ 81.725251] [ 81.725254] no locks held by kpktgend_0/2780. [ 81.725255] [ 81.725255] the shortest dependencies between 2nd lock and 1st lock: [ 81.725269] -> (&(&x->lock)->rlock){+.-...} ops: 8 { [ 81.725274] HARDIRQ-ON-W at: [ 81.725276] [<ffffffff8109a64b>] __lock_acquire+0x65b/0x1d70 [ 81.725282] [<ffffffff8109c3c7>] lock_acquire+0x97/0x130 [ 81.725284] [<ffffffff81774af6>] _raw_spin_lock+0x36/0x70 [ 81.725289] [<ffffffff816dc3a3>] xfrm_timer_handler+0x43/0x290 [ 81.725292] [<ffffffff81059437>] __tasklet_hrtimer_trampoline+0x17/0x40 [ 81.725300] [<ffffffff8105a1b7>] tasklet_hi_action+0xd7/0xf0 [ 81.725303] [<ffffffff81059ac6>] __do_softirq+0xe6/0x2d0 [ 81.725305] [<ffffffff8105a026>] irq_exit+0x96/0xc0 [ 81.725308] [<ffffffff8177fd0a>] smp_apic_timer_interrupt+0x4a/0x60 [ 81.725313] [<ffffffff8177e96f>] apic_timer_interrupt+0x6f/0x80 [ 81.725316] [<ffffffff8100b7c6>] arch_cpu_idle+0x26/0x30 [ 81.725329] [<ffffffff810ace28>] cpu_startup_entry+0x88/0x2b0 [ 81.725333] [<ffffffff8102e5b0>] start_secondary+0x190/0x1f0 [ 81.725338] IN-SOFTIRQ-W at: [ 81.725340] [<ffffffff8109a61d>] __lock_acquire+0x62d/0x1d70 [ 81.725342] [<ffffffff8109c3c7>] lock_acquire+0x97/0x130 [ 81.725344] [<ffffffff81774af6>] _raw_spin_lock+0x36/0x70 [ 81.725347] [<ffffffff816dc3a3>] xfrm_timer_handler+0x43/0x290 [ 81.725349] [<ffffffff81059437>] __tasklet_hrtimer_trampoline+0x17/0x40 [ 81.725352] [<ffffffff8105a1b7>] tasklet_hi_action+0xd7/0xf0 [ 81.725355] [<ffffffff81059ac6>] __do_softirq+0xe6/0x2d0 [ 81.725358] [<ffffffff8105a026>] irq_exit+0x96/0xc0 [ 81.725360] [<ffffffff8177fd0a>] smp_apic_timer_interrupt+0x4a/0x60 [ 81.725363] [<ffffffff8177e96f>] apic_timer_interrupt+0x6f/0x80 [ 81.725365] [<ffffffff8100b7c6>] arch_cpu_idle+0x26/0x30 [ 81.725368] [<ffffffff810ace28>] cpu_startup_entry+0x88/0x2b0 [ 81.725370] [<ffffffff8102e5b0>] start_secondary+0x190/0x1f0 [ 81.725373] INITIAL USE at: [ 81.725375] [<ffffffff8109a31a>] __lock_acquire+0x32a/0x1d70 [ 81.725385] [<ffffffff8109c3c7>] lock_acquire+0x97/0x130 [ 81.725388] [<ffffffff81774af6>] _raw_spin_lock+0x36/0x70 [ 81.725390] [<ffffffff816dc3a3>] xfrm_timer_handler+0x43/0x290 [ 81.725394] [<ffffffff81059437>] __tasklet_hrtimer_trampoline+0x17/0x40 [ 81.725398] [<ffffffff8105a1b7>] tasklet_hi_action+0xd7/0xf0 [ 81.725401] [<ffffffff81059ac6>] __do_softirq+0xe6/0x2d0 [ 81.725404] [<ffffffff8105a026>] irq_exit+0x96/0xc0 [ 81.725407] [<ffffffff8177fd0a>] smp_apic_timer_interrupt+0x4a/0x60 [ 81.725409] [<ffffffff8177e96f>] apic_timer_interrupt+0x6f/0x80 [ 81.725412] [<ffffffff8100b7c6>] arch_cpu_idle+0x26/0x30 [ 81.725415] [<ffffffff810ace28>] cpu_startup_entry+0x88/0x2b0 [ 81.725417] [<ffffffff8102e5b0>] start_secondary+0x190/0x1f0 [ 81.725420] } [ 81.725421] ... key at: [<ffffffff8295b9c8>] __key.46349+0x0/0x8 [ 81.725445] ... acquired at: [ 81.725446] [<ffffffff8109c3c7>] lock_acquire+0x97/0x130 [ 81.725449] [<ffffffff81774af6>] _raw_spin_lock+0x36/0x70 [ 81.725452] [<ffffffff816dc057>] __xfrm_state_delete+0x37/0x140 [ 81.725454] [<ffffffff816dc18c>] xfrm_state_delete+0x2c/0x50 [ 81.725456] [<ffffffff816dc277>] xfrm_state_flush+0xc7/0x1b0 [ 81.725458] [<ffffffffa005f6cc>] pfkey_flush+0x7c/0x100 [af_key] [ 81.725465] [<ffffffffa005efb7>] pfkey_process+0x1c7/0x1f0 [af_key] [ 81.725468] [<ffffffffa005f139>] pfkey_sendmsg+0x159/0x260 [af_key] [ 81.725471] [<ffffffff8162c16f>] sock_sendmsg+0xaf/0xc0 [ 81.725476] [<ffffffff8162c99c>] SYSC_sendto+0xfc/0x130 [ 81.725479] [<ffffffff8162cf3e>] SyS_sendto+0xe/0x10 [ 81.725482] [<ffffffff8177dd12>] system_call_fastpath+0x16/0x1b [ 81.725484] [ 81.725486] -> (xfrm_state_lock){+.+...} ops: 11 { [ 81.725490] HARDIRQ-ON-W at: [ 81.725493] [<ffffffff8109a64b>] __lock_acquire+0x65b/0x1d70 [ 81.725504] [<ffffffff8109c3c7>] lock_acquire+0x97/0x130 [ 81.725507] [<ffffffff81774e4b>] _raw_spin_lock_bh+0x3b/0x70 [ 81.725510] [<ffffffff816dc1df>] xfrm_state_flush+0x2f/0x1b0 [ 81.725513] [<ffffffffa005f6cc>] pfkey_flush+0x7c/0x100 [af_key] [ 81.725516] [<ffffffffa005efb7>] pfkey_process+0x1c7/0x1f0 [af_key] [ 81.725519] [<ffffffffa005f139>] pfkey_sendmsg+0x159/0x260 [af_key] [ 81.725522] [<ffffffff8162c16f>] sock_sendmsg+0xaf/0xc0 [ 81.725525] [<ffffffff8162c99c>] SYSC_sendto+0xfc/0x130 [ 81.725527] [<ffffffff8162cf3e>] SyS_sendto+0xe/0x10 [ 81.725530] [<ffffffff8177dd12>] system_call_fastpath+0x16/0x1b [ 81.725533] SOFTIRQ-ON-W at: [ 81.725534] [<ffffffff8109a67a>] __lock_acquire+0x68a/0x1d70 [ 81.725537] [<ffffffff8109c3c7>] lock_acquire+0x97/0x130 [ 81.725539] [<ffffffff81774af6>] _raw_spin_lock+0x36/0x70 [ 81.725541] [<ffffffff816dd751>] xfrm_stateonly_find+0x41/0x1f0 [ 81.725544] [<ffffffffa008af03>] mod_cur_headers+0x793/0x7f0 [pktgen] [ 81.725547] [<ffffffffa008bca2>] pktgen_thread_worker+0xd42/0x1880 [pktgen] [ 81.725550] [<ffffffff81078f84>] kthread+0xe4/0x100 [ 81.725555] [<ffffffff8177dc6c>] ret_from_fork+0x7c/0xb0 [ 81.725565] INITIAL USE at: [ 81.725567] [<ffffffff8109a31a>] __lock_acquire+0x32a/0x1d70 [ 81.725569] [<ffffffff8109c3c7>] lock_acquire+0x97/0x130 [ 81.725572] [<ffffffff81774e4b>] _raw_spin_lock_bh+0x3b/0x70 [ 81.725574] [<ffffffff816dc1df>] xfrm_state_flush+0x2f/0x1b0 [ 81.725576] [<ffffffffa005f6cc>] pfkey_flush+0x7c/0x100 [af_key] [ 81.725580] [<ffffffffa005efb7>] pfkey_process+0x1c7/0x1f0 [af_key] [ 81.725583] [<ffffffffa005f139>] pfkey_sendmsg+0x159/0x260 [af_key] [ 81.725586] [<ffffffff8162c16f>] sock_sendmsg+0xaf/0xc0 [ 81.725589] [<ffffffff8162c99c>] SYSC_sendto+0xfc/0x130 [ 81.725594] [<ffffffff8162cf3e>] SyS_sendto+0xe/0x10 [ 81.725597] [<ffffffff8177dd12>] system_call_fastpath+0x16/0x1b [ 81.725599] } [ 81.725600] ... key at: [<ffffffff81cadef8>] xfrm_state_lock+0x18/0x50 [ 81.725606] ... acquired at: [ 81.725607] [<ffffffff810995c0>] check_usage_backwards+0x110/0x150 [ 81.725609] [<ffffffff81099e96>] mark_lock+0x196/0x2f0 [ 81.725611] [<ffffffff8109a67a>] __lock_acquire+0x68a/0x1d70 [ 81.725614] [<ffffffff8109c3c7>] lock_acquire+0x97/0x130 [ 81.725616] [<ffffffff81774af6>] _raw_spin_lock+0x36/0x70 [ 81.725627] [<ffffffff816dd751>] xfrm_stateonly_find+0x41/0x1f0 [ 81.725629] [<ffffffffa008af03>] mod_cur_headers+0x793/0x7f0 [pktgen] [ 81.725632] [<ffffffffa008bca2>] pktgen_thread_worker+0xd42/0x1880 [pktgen] [ 81.725635] [<ffffffff81078f84>] kthread+0xe4/0x100 [ 81.725637] [<ffffffff8177dc6c>] ret_from_fork+0x7c/0xb0 [ 81.725640] [ 81.725641] [ 81.725641] stack backtrace: [ 81.725645] CPU: 0 PID: 2780 Comm: kpktgend_0 Not tainted 3.13.0-rc2+ #92 [ 81.725647] Hardware name: innotek GmbH VirtualBox, BIOS VirtualBox 12/01/2006 [ 81.725649] ffffffff82537b80 ffff880018199988 ffffffff8176af37 0000000000000007 [ 81.725652] ffff8800181999f0 ffff8800181999d8 ffffffff81099358 ffffffff82537b80 [ 81.725655] ffffffff81a32def ffff8800181999f4 0000000000000000 ffff880002cbeaa8 [ 81.725659] Call Trace: [ 81.725664] [<ffffffff8176af37>] dump_stack+0x46/0x58 [ 81.725667] [<ffffffff81099358>] print_irq_inversion_bug.part.42+0x1e8/0x1f0 [ 81.725670] [<ffffffff810995c0>] check_usage_backwards+0x110/0x150 [ 81.725672] [<ffffffff81099e96>] mark_lock+0x196/0x2f0 [ 81.725675] [<ffffffff810994b0>] ? check_usage_forwards+0x150/0x150 [ 81.725685] [<ffffffff8109a67a>] __lock_acquire+0x68a/0x1d70 [ 81.725691] [<ffffffff810899a5>] ? sched_clock_local+0x25/0x90 [ 81.725694] [<ffffffff81089b38>] ? sched_clock_cpu+0xa8/0x120 [ 81.725697] [<ffffffff8109a31a>] ? __lock_acquire+0x32a/0x1d70 [ 81.725699] [<ffffffff816dd751>] ? xfrm_stateonly_find+0x41/0x1f0 [ 81.725702] [<ffffffff8109c3c7>] lock_acquire+0x97/0x130 [ 81.725704] [<ffffffff816dd751>] ? xfrm_stateonly_find+0x41/0x1f0 [ 81.725707] [<ffffffff810899a5>] ? sched_clock_local+0x25/0x90 [ 81.725710] [<ffffffff81774af6>] _raw_spin_lock+0x36/0x70 [ 81.725712] [<ffffffff816dd751>] ? xfrm_stateonly_find+0x41/0x1f0 [ 81.725715] [<ffffffff810971ec>] ? lock_release_holdtime.part.26+0x1c/0x1a0 [ 81.725717] [<ffffffff816dd751>] xfrm_stateonly_find+0x41/0x1f0 [ 81.725721] [<ffffffffa008af03>] mod_cur_headers+0x793/0x7f0 [pktgen] [ 81.725724] [<ffffffffa008bca2>] pktgen_thread_worker+0xd42/0x1880 [pktgen] [ 81.725727] [<ffffffffa008ba71>] ? pktgen_thread_worker+0xb11/0x1880 [pktgen] [ 81.725729] [<ffffffff8109cf9d>] ? trace_hardirqs_on+0xd/0x10 [ 81.725733] [<ffffffff81775410>] ? _raw_spin_unlock_irq+0x30/0x40 [ 81.725745] [<ffffffff8151faa0>] ? e1000_clean+0x9d0/0x9d0 [ 81.725751] [<ffffffff81094310>] ? __init_waitqueue_head+0x60/0x60 [ 81.725753] [<ffffffff81094310>] ? __init_waitqueue_head+0x60/0x60 [ 81.725757] [<ffffffffa008af60>] ? mod_cur_headers+0x7f0/0x7f0 [pktgen] [ 81.725759] [<ffffffff81078f84>] kthread+0xe4/0x100 [ 81.725762] [<ffffffff81078ea0>] ? flush_kthread_worker+0x170/0x170 [ 81.725765] [<ffffffff8177dc6c>] ret_from_fork+0x7c/0xb0 [ 81.725768] [<ffffffff81078ea0>] ? flush_kthread_worker+0x170/0x170 Signed-off-by: Fan Du <fan.du@windriver.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2014-01-03{pktgen, xfrm} Add statistics counting when transformingFan Du1-3/+7
so /proc/net/xfrm_stat could give user clue about what's wrong in this process. Signed-off-by: Fan Du <fan.du@windriver.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2014-01-03{pktgen, xfrm} Correct xfrm state lock usage when transformingFan Du1-3/+2
xfrm_state lock protects its state, i.e., VALID/DEAD and statistics, not the transforming procedure, as both mode/type output functions are reentrant. Another issue is state lock can be used in BH context when state timer alarmed, after transformation in pktgen, update state statistics acquiring state lock should disabled BH context for a moment. Otherwise LOCKDEP critisize this: [ 62.354339] pktgen: Packet Generator for packet performance testing. Version: 2.74 [ 62.655444] [ 62.655448] ================================= [ 62.655451] [ INFO: inconsistent lock state ] [ 62.655455] 3.13.0-rc2+ #70 Not tainted [ 62.655457] --------------------------------- [ 62.655459] inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage. [ 62.655463] kpktgend_0/2764 [HC0[0]:SC0[0]:HE1:SE1] takes: [ 62.655466] (&(&x->lock)->rlock){+.?...}, at: [<ffffffffa00886f6>] pktgen_thread_worker+0x1796/0x1860 [pktgen] [ 62.655479] {IN-SOFTIRQ-W} state was registered at: [ 62.655484] [<ffffffff8109a61d>] __lock_acquire+0x62d/0x1d70 [ 62.655492] [<ffffffff8109c3c7>] lock_acquire+0x97/0x130 [ 62.655498] [<ffffffff81774af6>] _raw_spin_lock+0x36/0x70 [ 62.655505] [<ffffffff816dc3a3>] xfrm_timer_handler+0x43/0x290 [ 62.655511] [<ffffffff81059437>] __tasklet_hrtimer_trampoline+0x17/0x40 [ 62.655519] [<ffffffff8105a1b7>] tasklet_hi_action+0xd7/0xf0 [ 62.655523] [<ffffffff81059ac6>] __do_softirq+0xe6/0x2d0 [ 62.655526] [<ffffffff8105a026>] irq_exit+0x96/0xc0 [ 62.655530] [<ffffffff8177fd0a>] smp_apic_timer_interrupt+0x4a/0x60 [ 62.655537] [<ffffffff8177e96f>] apic_timer_interrupt+0x6f/0x80 [ 62.655541] [<ffffffff8100b7c6>] arch_cpu_idle+0x26/0x30 [ 62.655547] [<ffffffff810ace28>] cpu_startup_entry+0x88/0x2b0 [ 62.655552] [<ffffffff81761c3c>] rest_init+0xbc/0xd0 [ 62.655557] [<ffffffff81ea5e5e>] start_kernel+0x3c4/0x3d1 [ 62.655583] [<ffffffff81ea55a8>] x86_64_start_reservations+0x2a/0x2c [ 62.655588] [<ffffffff81ea569f>] x86_64_start_kernel+0xf5/0xfc [ 62.655592] irq event stamp: 77 [ 62.655594] hardirqs last enabled at (77): [<ffffffff810ab7f2>] vprintk_emit+0x1b2/0x520 [ 62.655597] hardirqs last disabled at (76): [<ffffffff810ab684>] vprintk_emit+0x44/0x520 [ 62.655601] softirqs last enabled at (22): [<ffffffff81059b57>] __do_softirq+0x177/0x2d0 [ 62.655605] softirqs last disabled at (15): [<ffffffff8105a026>] irq_exit+0x96/0xc0 [ 62.655609] [ 62.655609] other info that might help us debug this: [ 62.655613] Possible unsafe locking scenario: [ 62.655613] [ 62.655616] CPU0 [ 62.655617] ---- [ 62.655618] lock(&(&x->lock)->rlock); [ 62.655622] <Interrupt> [ 62.655623] lock(&(&x->lock)->rlock); [ 62.655626] [ 62.655626] *** DEADLOCK *** [ 62.655626] [ 62.655629] no locks held by kpktgend_0/2764. [ 62.655631] [ 62.655631] stack backtrace: [ 62.655636] CPU: 0 PID: 2764 Comm: kpktgend_0 Not tainted 3.13.0-rc2+ #70 [ 62.655638] Hardware name: innotek GmbH VirtualBox, BIOS VirtualBox 12/01/2006 [ 62.655642] ffffffff8216b7b0 ffff88001be43ab8 ffffffff8176af37 0000000000000007 [ 62.655652] ffff88001c8d4fc0 ffff88001be43b18 ffffffff81766d78 0000000000000000 [ 62.655663] ffff880000000001 ffff880000000001 ffffffff8101025f ffff88001be43b18 [ 62.655671] Call Trace: [ 62.655680] [<ffffffff8176af37>] dump_stack+0x46/0x58 [ 62.655685] [<ffffffff81766d78>] print_usage_bug+0x1f1/0x202 [ 62.655691] [<ffffffff8101025f>] ? save_stack_trace+0x2f/0x50 [ 62.655696] [<ffffffff81099f8c>] mark_lock+0x28c/0x2f0 [ 62.655700] [<ffffffff810994b0>] ? check_usage_forwards+0x150/0x150 [ 62.655704] [<ffffffff8109a67a>] __lock_acquire+0x68a/0x1d70 [ 62.655712] [<ffffffff81115b09>] ? irq_work_queue+0x69/0xb0 [ 62.655717] [<ffffffff810ab7f2>] ? vprintk_emit+0x1b2/0x520 [ 62.655722] [<ffffffff8109cec5>] ? trace_hardirqs_on_caller+0x105/0x1d0 [ 62.655730] [<ffffffffa00886f6>] ? pktgen_thread_worker+0x1796/0x1860 [pktgen] [ 62.655734] [<ffffffff8109c3c7>] lock_acquire+0x97/0x130 [ 62.655741] [<ffffffffa00886f6>] ? pktgen_thread_worker+0x1796/0x1860 [pktgen] [ 62.655745] [<ffffffff81774af6>] _raw_spin_lock+0x36/0x70 [ 62.655752] [<ffffffffa00886f6>] ? pktgen_thread_worker+0x1796/0x1860 [pktgen] [ 62.655758] [<ffffffffa00886f6>] pktgen_thread_worker+0x1796/0x1860 [pktgen] [ 62.655766] [<ffffffffa0087a79>] ? pktgen_thread_worker+0xb19/0x1860 [pktgen] [ 62.655771] [<ffffffff8109cf9d>] ? trace_hardirqs_on+0xd/0x10 [ 62.655777] [<ffffffff81775410>] ? _raw_spin_unlock_irq+0x30/0x40 [ 62.655785] [<ffffffff8151faa0>] ? e1000_clean+0x9d0/0x9d0 [ 62.655791] [<ffffffff81094310>] ? __init_waitqueue_head+0x60/0x60 [ 62.655795] [<ffffffff81094310>] ? __init_waitqueue_head+0x60/0x60 [ 62.655800] [<ffffffffa0086f60>] ? mod_cur_headers+0x7f0/0x7f0 [pktgen] [ 62.655806] [<ffffffff81078f84>] kthread+0xe4/0x100 [ 62.655813] [<ffffffff81078ea0>] ? flush_kthread_worker+0x170/0x170 [ 62.655819] [<ffffffff8177dc6c>] ret_from_fork+0x7c/0xb0 [ 62.655824] [<ffffffff81078ea0>] ? flush_kthread_worker+0x170/0x170 Signed-off-by: Fan Du <fan.du@windriver.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2014-01-02xfrm: checkpatch erros with inline keyword positionWeilong Chen1-3/+3
Signed-off-by: Weilong Chen <chenweilong@huawei.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2014-01-02xfrm: fix checkpatch errorWeilong Chen1-2/+2
Fix that "else should follow close brace '}'". Signed-off-by: Weilong Chen <chenweilong@huawei.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2014-01-02xfrm: checkpatch erros with space prohibitedWeilong Chen2-4/+4
Fix checkpatch error "space prohibited xxx". Signed-off-by: Weilong Chen <chenweilong@huawei.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2014-01-02xfrm: checkpatch errors with foo * barWeilong Chen3-13/+13
This patch clean up some checkpatch errors like this: ERROR: "foo * bar" should be "foo *bar" ERROR: "(foo*)" should be "(foo *)" Signed-off-by: Weilong Chen <chenweilong@huawei.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2014-01-02xfrm: checkpatch errors with spaceWeilong Chen4-13/+13
This patch cleanup some space errors. Signed-off-by: Weilong Chen <chenweilong@huawei.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2013-12-19bridge: change the position of '{' to the pre linetanxiaojun4-18/+9
That open brace { should be on the previous line. Signed-off-by: Tan Xiaojun <tanxiaojun@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-19bridge: change "foo* bar" to "foo *bar"tanxiaojun2-10/+10
"foo * bar" should be "foo *bar". Signed-off-by: Tan Xiaojun <tanxiaojun@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-19bridge: add space before '(/{', after ',', etc.tanxiaojun7-12/+12
Spaces required before the open parenthesis '(', before the open brace '{', after that ',' and around that '?/:'. Signed-off-by: Tan Xiaojun <tanxiaojun@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-19bridge: remove unnecessary parenthesestanxiaojun2-3/+3
Return is not a function, parentheses are not required. Signed-off-by: Tan Xiaojun <tanxiaojun@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-19bridge: remove unnecessary condition judgmenttanxiaojun2-4/+2
Because err is always negative, remove unnecessary condition judgment. Signed-off-by: Tan Xiaojun <tanxiaojun@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-19rps: NUMA flow limit allocationsEric Dumazet1-1/+2
Given we allocate memory for each cpu, we can do this using NUMA affinities, instead of using NUMA policies of the process changing flow_limit_cpu_bitmap value. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-19Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-nextDavid S. Miller23-223/+222
Steffen Klassert says: ==================== pull request (net-next): ipsec-next 2013-12-19 1) Use the user supplied policy index instead of a generated one if present. From Fan Du. 2) Make xfrm migration namespace aware. From Fan Du. 3) Make the xfrm state and policy locks namespace aware. From Fan Du. 4) Remove ancient sleeping when the SA is in acquire state, we now queue packets to the policy instead. This replaces the sleeping code. 5) Remove FLOWI_FLAG_CAN_SLEEP. This was used to notify xfrm about the posibility to sleep. The sleeping code is gone, so remove it. 6) Check user specified spi for IPComp. Thr spi for IPcomp is only 16 bit wide, so check for a valid value. From Fan Du. 7) Export verify_userspi_info to check for valid user supplied spi ranges with pfkey and netlink. From Fan Du. 8) RFC3173 states that if the total size of a compressed payload and the IPComp header is not smaller than the size of the original payload, the IP datagram must be sent in the original non-compressed form. These packets are dropped by the inbound policy check because they are not transformed. Document the need to set 'level use' for IPcomp to receive such packets anyway. From Fan Du. Please pull or let me know if there are problems. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-19sch_cbq: remove unnecessary null pointer checkYang Yingliang1-2/+1
It already has a NULL pointer check of rtab in qdisc_put_rtab(). Remove the check outside of qdisc_put_rtab(). Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-19act_police: remove unnecessary null pointer checkYang Yingliang1-4/+2
It already has a NULL pointer check of rtab in qdisc_put_rtab(). Remove the check outside of qdisc_put_rtab(). Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-19net-qdisc-hhf: Heavy-Hitter Filter (HHF) qdiscTerry Lam3-0/+756
This patch implements the first size-based qdisc that attempts to differentiate between small flows and heavy-hitters. The goal is to catch the heavy-hitters and move them to a separate queue with less priority so that bulk traffic does not affect the latency of critical traffic. Currently "less priority" means less weight (2:1 in particular) in a Weighted Deficit Round Robin (WDRR) scheduler. In essence, this patch addresses the "delay-bloat" problem due to bloated buffers. In some systems, large queues may be necessary for obtaining CPU efficiency, or due to the presence of unresponsive traffic like UDP, or just a large number of connections with each having a small amount of outstanding traffic. In these circumstances, HHF aims to reduce the HoL blocking for latency sensitive traffic, while not impacting the queues built up by bulk traffic. HHF can also be used in conjunction with other AQM mechanisms such as CoDel. To capture heavy-hitters, we implement the "multi-stage filter" design in the following paper: C. Estan and G. Varghese, "New Directions in Traffic Measurement and Accounting", in ACM SIGCOMM, 2002. Some configurable qdisc settings through 'tc': - hhf_reset_timeout: period to reset counter values in the multi-stage filter (default 40ms) - hhf_admit_bytes: threshold to classify heavy-hitters (default 128KB) - hhf_evict_timeout: threshold to evict idle heavy-hitters (default 1s) - hhf_non_hh_weight: Weighted Deficit Round Robin (WDRR) weight for non-heavy-hitters (default 2) - hh_flows_limit: max number of heavy-hitter flow entries (default 2048) Note that the ratio between hhf_admit_bytes and hhf_reset_timeout reflects the bandwidth of heavy-hitters that we attempt to capture (25Mbps with the above default settings). The false negative rate (heavy-hitter flows getting away unclassified) is zero by the design of the multi-stage filter algorithm. With 100 heavy-hitter flows, using four hashes and 4000 counters yields a false positive rate (non-heavy-hitters mistakenly classified as heavy-hitters) of less than 1e-4. Signed-off-by: Terry Lam <vtlam@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-18bridge: spelling fixestanxiaojun3-3/+3
Fix spelling errors in bridge driver. Signed-off-by: Tan Xiaojun <tanxiaojun@huawei.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-18ipv6: support IPV6_PMTU_INTERFACE on socketsHannes Frederic Sowa6-5/+17
IPV6_PMTU_INTERFACE is the same as IPV6_PMTU_PROBE for ipv6. Add it nontheless for symmetry with IPv4 sockets. Also drop incoming MTU information if this mode is enabled. The additional bit in ipv6_pinfo just eats in the padding behind the bitfield. There are no changes to the layout of the struct at all. Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-18ipv4: new ip_no_pmtu_disc mode to always discard incoming frag needed msgsHannes Frederic Sowa1-1/+3
This new mode discards all incoming fragmentation-needed notifications as I guess was originally intended with this knob. To not break backward compatibility too much, I only added a special case for mode 2 in the receiving path. Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-18inet: make no_pmtu_disc per namespace and kill ipv4_configHannes Frederic Sowa5-14/+11
The other field in ipv4_config, log_martians, was converted to a per-interface setting, so we can just remove the whole structure. Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-18Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller31-157/+311
Conflicts: drivers/net/ethernet/intel/i40e/i40e_main.c drivers/net/macvtap.c Both minor merge hassles, simple overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-18net_sched: convert tcf_proto_ops to use struct list_headWANG Cong1-10/+8
We don't need to maintain our own singly linked list code. Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-18net_sched: convert tc_action_ops to use struct list_headWANG Cong1-11/+9
We don't need to maintain our own singly linked list code. Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-18net_sched: convert tcf_hashinfo to hlist and use spinlockWANG Cong2-66/+48
So that we don't need to play with singly linked list, and since the code is not on hot path, we can use spinlock instead of rwlock. Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-18net_sched: init struct tcf_hashinfo at register timeWANG Cong10-94/+80
It looks weird to store the lock out of the struct but still points to a static variable. Just move them into the struct. Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-18net_sched: cls: refactor out struct tcf_ext_mapWANG Cong10-101/+59
These information can be saved in tcf_exts, and this will simplify the code. Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-18net_sched: act: use standard struct list_headWANG Cong11-89/+83
Currently actions are chained by a singly linked list, therefore it is a bit hard to add and remove a specific entry. Convert it to struct list_head so that in the latter patch we can remove an action without finding its head. Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-18net_sched: remove get_stats from tc_action_opsWANG Cong1-4/+0
It is not used. Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-18packet: deliver VLAN TPID to userspaceAtzm Watanabe1-4/+10
This enables userspace to get VLAN TPID as well as the VLAN TCI. Signed-off-by: Atzm Watanabe <atzm@stratosphere.co.jp> Acked-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-18packet: fill the gap of TPACKET_ALIGNMENT with zerosAtzm Watanabe1-1/+2
struct tpacket{2,3}_hdr is aligned to a multiple of TPACKET_ALIGNMENT. Explicitly defining and zeroing the gap of this makes additional changes easier. Signed-off-by: Atzm Watanabe <atzm@stratosphere.co.jp> Acked-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-18packet: make aligned size of struct tpacket{2,3}_hdr clearAtzm Watanabe1-0/+7
struct tpacket{2,3}_hdr is aligned to a multiple of TPACKET_ALIGNMENT. We may add members to them until current aligned size without forcing userspace to call getsockopt(..., PACKET_HDRLEN, ...). Signed-off-by: Atzm Watanabe <atzm@stratosphere.co.jp> Acked-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-17net: Add utility function to copy skb hashTom Herbert1-2/+1
Adds skb_copy_hash to copy rxhash and l4_rxhash from one skb to another. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-17net: Add utility functions to clear rxhashTom Herbert3-9/+8
In several places 'skb->rxhash = 0' is being done to clear the rxhash value in an skb. This does not clear l4_rxhash which could still be set so that the rxhash wouldn't be recalculated on subsequent call to skb_get_rxhash. This patch adds an explict function to clear all the rxhash related information in the skb properly. skb_clear_hash_if_not_l4 clears the rxhash only if it is not marked as l4_rxhash. Fixed up places where 'skb->rxhash = 0' was being called. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-17net: Change skb_get_rxhash to skb_get_hashTom Herbert6-10/+10
Changing name of function as part of making the hash in skbuff to be generic property, not just for receive path. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-17net/hsr: using kfree_rcu() to simplify the codeWei Yongjun1-9/+4
The callback function of call_rcu() just calls a kfree(), so we can use kfree_rcu() instead of call_rcu() + callback function. Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Acked-by: Arvid Brodin <arvid.brodin@alten.se> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-17neigh: Netlink notification for administrative NUD state changeBob Gilligan1-0/+1
The neighbour code sends up an RTM_NEWNEIGH netlink notification if the NUD state of a neighbour cache entry is changed by a timer (e.g. from REACHABLE to STALE), even if the lladdr of the entry has not changed. But an administrative change to the the NUD state of a neighbour cache entry that does not change the lladdr (e.g. via "ip -4 neigh change ... nud ...") does not trigger a netlink notification. This means that netlink listeners will not hear about administrative NUD state changes such as from a resolved state to PERMANENT. This patch changes the neighbor code to generate an RTM_NEWNEIGH message when the NUD state of an entry is changed administratively. Signed-off-by: Bob Gilligan <gilligan@aristanetworks.com> Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-17pkt_sched: fq: more robust memory allocationEric Dumazet1-6/+28
This patch brings NUMA support and automatic fallback to vmalloc() in case kmalloc() failed to allocate FQ hash table. NUMA support depends on XPS being setup for the device before qdisc allocation. After a XPS change, it might be worth creating qdisc hierarchy again. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-17tcp: refine TSO splitsEric Dumazet1-39/+54
While investigating performance problems on small RPC workloads, I noticed linux TCP stack was always splitting the last TSO skb into two parts (skbs). One being a multiple of MSS, and a small one with the Push flag. This split is done even if TCP_NODELAY is set, or if no small packet is in flight. Example with request/response of 4K/4K IP A > B: . ack 68432 win 2783 <nop,nop,timestamp 6524593 6525001> IP A > B: . 65537:68433(2896) ack 69632 win 2783 <nop,nop,timestamp 6524593 6525001> IP A > B: P 68433:69633(1200) ack 69632 win 2783 <nop,nop,timestamp 6524593 6525001> IP B > A: . ack 68433 win 2768 <nop,nop,timestamp 6525001 6524593> IP B > A: . 69632:72528(2896) ack 69633 win 2768 <nop,nop,timestamp 6525001 6524593> IP B > A: P 72528:73728(1200) ack 69633 win 2768 <nop,nop,timestamp 6525001 6524593> IP A > B: . ack 72528 win 2783 <nop,nop,timestamp 6524593 6525001> IP A > B: . 69633:72529(2896) ack 73728 win 2783 <nop,nop,timestamp 6524593 6525001> IP A > B: P 72529:73729(1200) ack 73728 win 2783 <nop,nop,timestamp 6524593 6525001> We can avoid this split by including the Nagle tests at the right place. Note : If some NIC had trouble sending TSO packets with a partial last segment, we would have hit the problem in GRO/forwarding workload already. tcp_minshall_update() is moved to tcp_output.c and is updated as we might feed a TSO packet with a partial last segment. This patch tremendously improves performance, as the traffic now looks like : IP A > B: . ack 98304 win 2783 <nop,nop,timestamp 6834277 6834685> IP A > B: P 94209:98305(4096) ack 98304 win 2783 <nop,nop,timestamp 6834277 6834685> IP B > A: . ack 98305 win 2768 <nop,nop,timestamp 6834686 6834277> IP B > A: P 98304:102400(4096) ack 98305 win 2768 <nop,nop,timestamp 6834686 6834277> IP A > B: . ack 102400 win 2783 <nop,nop,timestamp 6834279 6834686> IP A > B: P 98305:102401(4096) ack 102400 win 2783 <nop,nop,timestamp 6834279 6834686> IP B > A: . ack 102401 win 2768 <nop,nop,timestamp 6834687 6834279> IP B > A: P 102400:106496(4096) ack 102401 win 2768 <nop,nop,timestamp 6834687 6834279> IP A > B: . ack 106496 win 2783 <nop,nop,timestamp 6834280 6834687> IP A > B: P 102401:106497(4096) ack 106496 win 2783 <nop,nop,timestamp 6834280 6834687> IP B > A: . ack 106497 win 2768 <nop,nop,timestamp 6834688 6834280> IP B > A: P 106496:110592(4096) ack 106497 win 2768 <nop,nop,timestamp 6834688 6834280> Before : lpq83:~# nstat >/dev/null;perf stat ./super_netperf 200 -t TCP_RR -H lpq84 -l 20 -- -r 4K,4K 280774 Performance counter stats for './super_netperf 200 -t TCP_RR -H lpq84 -l 20 -- -r 4K,4K': 205719.049006 task-clock # 9.278 CPUs utilized 8,449,968 context-switches # 0.041 M/sec 1,935,997 CPU-migrations # 0.009 M/sec 160,541 page-faults # 0.780 K/sec 548,478,722,290 cycles # 2.666 GHz [83.20%] 455,240,670,857 stalled-cycles-frontend # 83.00% frontend cycles idle [83.48%] 272,881,454,275 stalled-cycles-backend # 49.75% backend cycles idle [66.73%] 166,091,460,030 instructions # 0.30 insns per cycle # 2.74 stalled cycles per insn [83.39%] 29,150,229,399 branches # 141.699 M/sec [83.30%] 1,943,814,026 branch-misses # 6.67% of all branches [83.32%] 22.173517844 seconds time elapsed lpq83:~# nstat | egrep "IpOutRequests|IpExtOutOctets" IpOutRequests 16851063 0.0 IpExtOutOctets 23878580777 0.0 After patch : lpq83:~# nstat >/dev/null;perf stat ./super_netperf 200 -t TCP_RR -H lpq84 -l 20 -- -r 4K,4K 280877 Performance counter stats for './super_netperf 200 -t TCP_RR -H lpq84 -l 20 -- -r 4K,4K': 107496.071918 task-clock # 4.847 CPUs utilized 5,635,458 context-switches # 0.052 M/sec 1,374,707 CPU-migrations # 0.013 M/sec 160,920 page-faults # 0.001 M/sec 281,500,010,924 cycles # 2.619 GHz [83.28%] 228,865,069,307 stalled-cycles-frontend # 81.30% frontend cycles idle [83.38%] 142,462,742,658 stalled-cycles-backend # 50.61% backend cycles idle [66.81%] 95,227,712,566 instructions # 0.34 insns per cycle # 2.40 stalled cycles per insn [83.43%] 16,209,868,171 branches # 150.795 M/sec [83.20%] 874,252,952 branch-misses # 5.39% of all branches [83.37%] 22.175821286 seconds time elapsed lpq83:~# nstat | egrep "IpOutRequests|IpExtOutOctets" IpOutRequests 11239428 0.0 IpExtOutOctets 23595191035 0.0 Indeed, the occupancy of tx skbs (IpExtOutOctets/IpOutRequests) is higher : 2099 instead of 1417, thus helping GRO to be more efficient when using FQ packet scheduler. Many thanks to Neal for review and ideas. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Yuchung Cheng <ycheng@google.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: Nandita Dukkipati <nanditad@google.com> Cc: Van Jacobson <vanj@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Tested-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-17net: remove dead code for add/del multiplestephen hemminger1-96/+1
These function to manipulate multiple addresses are not used anywhere in current net-next tree. Some out of tree code maybe using these but too bad; they should submit their code upstream.. Also, make __hw_addr_flush local since only used by dev_addr_lists.c Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-17Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-nextDavid S. Miller31-216/+970
John W. Linville says: ==================== Please pull this batch of updates for the 3.14 stream... For the Bluetooth bits, Gustavo says: "This is the first batch of patches intended for 3.14. There is nothing big here. Most of the code are refactors, clean up, small fixes, plus some new device id support." And... "More patches to 3.14. Here we have the support for Low Energy Connection Oriented Channels (LE CoC). Basically, as the name says, this adds supports for connection oriented channels in the same way we already have them for BR/EDR connections so profiles/protocols that work on top of BR/EDR can now work on LE plus a plenty of new possibilities for LE." For the ath10k bits, Kalle says: "Janusz and Marek implemented DFS support to ath10k, but the code is not enabled yet due to missing cfg80211/mac80211 patches (it will be enabled in the next pull request). Michal did some device reset fixes and made it possible for ath10k to share an interrupt with another device. And lots of smaller fixes from different people." For the iwlwifi bits, Emmanuel says: "I have here a big rework of the rate control by Eyal. This is obviously the biggest part of this batch. I also have enhancement of protection flags by Avri and a few bits for WoWLAN by Eliad and Luca. Johannes cleans up the debugfs plus a few fixes. I provided a few things for Bluetooth coexistence. Besides this we have an implementation for low priority scan." Along with all that, there are big batches of updates to mwifiex and ath9k, Jeff Kirsher's FSF address fix patches, and a handful of other bits here and there. Please let me know if there are problems! ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-17Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nfDavid S. Miller3-1/+3
Pablo Neira Ayuso says: ==================== The following patchset contains two Netfilter fixes for your net tree, they are: * Fix endianness in nft_reject, the NFTA_REJECT_TYPE netlink attributes was not converted to network byte order as needed by all nfnetlink subsystems, from Eric Leblond. * Restrict SYNPROXY target to INPUT and FORWARD chains, this avoid a possible crash due to misconfigurations, from Patrick McHardy. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-17net: unix: allow bind to fail on mutex lockSasha Levin1-2/+6
This is similar to the set_peek_off patch where calling bind while the socket is stuck in unix_dgram_recvmsg() will block and cause a hung task spew after a while. This is also the last place that did a straightforward mutex_lock(), so there shouldn't be any more of these patches. Signed-off-by: Sasha Levin <sasha.levin@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-17udp: ipv4: do not use sk_dst_lock from softirq contextEric Dumazet1-9/+4
Using sk_dst_lock from softirq context is not supported right now. Instead of adding BH protection everywhere, udp_sk_rx_dst_set() can instead use xchg(), as suggested by David. Reported-by: Fengguang Wu <fengguang.wu@intel.com> Fixes: 975022310233 ("udp: ipv4: must add synchronization in udp_sk_rx_dst_set()") Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-17net: ovs: use CRC32 accelerated flow hash if availableFrancesco Fusco1-2/+2
Currently OVS uses jhash2() for calculating flow hashes in its internal flow_hash() function. The performance of the flow_hash() function is critical, as the input data can be hundreds of bytes long. OVS is largely deployed in x86_64 based datacenters. Therefore, we argue that the performance critical fast path of OVS should exploit underlying CPU features in order to reduce the per packet processing costs. We replace jhash2 with the hash implementation provided by the kernel hash lib, which exploits the crc32l instruction to achieve high performance Our patch greatly reduces the hash footprint from ~200 cycles of jhash2() to around ~90 cycles in case of ovs_flow_hash_crc() (measured with rdtsc over maximum length flow keys on an i7 Intel CPU). Additionally, we wrote a microbenchmark to stress the flow table performance. The benchmark inserts random flows into the flow hash and then performs lookups. Our hash deployed on a CRC32 capable CPU reduces the lookup for 1000 flows, 100 masks from ~10,100us to ~6,700us, for example. Thus, simply use the newly introduced arch_fast_hash2() as a drop-in replacement. Signed-off-by: Francesco Fusco <ffusco@redhat.com> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: Thomas Graf <tgraf@redhat.com> Acked-by: Jesse Gross <jesse@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>