aboutsummaryrefslogtreecommitdiffstats
path: root/net/core (follow)
AgeCommit message (Collapse)AuthorFilesLines
2011-02-25net: handle addr_type of 0 properlyHagen Paul Pfeifer1-1/+1
addr_type of 0 means that the type should be adopted from from_dev and not from __hw_addr_del_multiple(). Unfortunately it isn't so and addr_type will always be considered. Fix this by implementing the considered and documented behavior. Signed-off-by: Hagen Paul Pfeifer <hagen@jauu.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-18net: deinit automatic LIST_HEADEric Dumazet1-0/+2
commit 9b5e383c11b08784 (net: Introduce unregister_netdevice_many()) left an active LIST_HEAD() in rollback_registered(), with possible memory corruption. Even if device is freed without touching its unreg_list (and therefore touching the previous memory location holding LISTE_HEAD(single), better close the bug for good, since its really subtle. (Same fix for default_device_exit_batch() for completeness) Reported-by: Michal Hocko <mhocko@suse.cz> Tested-by: Michal Hocko <mhocko@suse.cz> Reported-by: Eric W. Biderman <ebiderman@xmission.com> Tested-by: Eric W. Biderman <ebiderman@xmission.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> CC: Ingo Molnar <mingo@elte.hu> CC: Octavian Purdila <opurdila@ixiacom.com> CC: stable <stable@kernel.org> [.33+] Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-18net: dont leave active on stack LIST_HEADLinus Torvalds1-2/+5
Eric W. Biderman and Michal Hocko reported various memory corruptions that we suspected to be related to a LIST head located on stack, that was manipulated after thread left function frame (and eventually exited, so its stack was freed and reused). Eric Dumazet suggested the problem was probably coming from commit 443457242beb (net: factorize sync-rcu call in unregister_netdevice_many) This patch fixes __dev_close() and dev_close() to properly deinit their respective LIST_HEAD(single) before exiting. References: https://lkml.org/lkml/2011/2/16/304 References: https://lkml.org/lkml/2011/2/14/223 Reported-by: Michal Hocko <mhocko@suse.cz> Tested-by: Michal Hocko <mhocko@suse.cz> Reported-by: Eric W. Biderman <ebiderman@xmission.com> Tested-by: Eric W. Biderman <ebiderman@xmission.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> CC: Ingo Molnar <mingo@elte.hu> CC: Octavian Purdila <opurdila@ixiacom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-08net: Fix lockdep regression caused by initializing netdev queues too early.David S. Miller1-11/+16
In commit aa9421041128abb4d269ee1dc502ff65fb3b7d69 ("net: init ingress queue") we moved the allocation and lock initialization of the queues into alloc_netdev_mq() since register_netdevice() is way too late. The problem is that dev->type is not setup until the setup() callback is invoked by alloc_netdev_mq(), and the dev->type is what determines the lockdep class to use for the locks in the queues. Fix this by doing the queue allocation after the setup() callback runs. This is safe because the setup() callback is not allowed to make any state changes that need to be undone on error (memory allocations, etc.). It may, however, make state changes that are undone by free_netdev() (such as netif_napi_add(), which is done by the ipoib driver's setup routine). The previous code also leaked a reference to the &init_net namespace object on RX/TX queue allocation failures. Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-02gro: reset skb_iif on reuseAndy Gospodarek1-0/+1
Like Herbert's change from a few days ago: 66c46d741e2e60f0e8b625b80edb0ab820c46d7a gro: Reset dev pointer on reuse this may not be necessary at this point, but we should still clean up the skb->skb_iif. If not we may end up with an invalid valid for skb->skb_iif when the skb is reused and the check is done in __netif_receive_skb. Signed-off-by: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-31net: Check rps_flow_table when RPS map length is 1Tom Herbert1-1/+2
In get_rps_cpu, add check that the rps_flow_table for the device is NULL when trying to take fast path when RPS map length is one. Without this, RFS is effectively disabled if map length is one which is not correct. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-30net: Fix ip link add netns oopsEric W. Biederman1-0/+3
Ed Swierk <eswierk@bigswitch.com> writes: > On 2.6.35.7 > ip link add link eth0 netns 9999 type macvlan > where 9999 is a nonexistent PID triggers an oops and causes all network functions to hang: > [10663.821898] BUG: unable to handle kernel NULL pointer dereference at 000000000000006d > [10663.821917] IP: [<ffffffff8149c2fa>] __dev_alloc_name+0x9a/0x170 > [10663.821933] PGD 1d3927067 PUD 22f5c5067 PMD 0 > [10663.821944] Oops: 0000 [#1] SMP > [10663.821953] last sysfs file: /sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq > [10663.821959] CPU 3 > [10663.821963] Modules linked in: macvlan ip6table_filter ip6_tables rfcomm ipt_MASQUERADE binfmt_misc iptable_nat nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack sco ipt_REJECT bnep l2cap xt_tcpudp iptable_filter ip_tables x_tables bridge stp vboxnetadp vboxnetflt vboxdrv kvm_intel kvm parport_pc ppdev snd_hda_codec_intelhdmi snd_hda_codec_conexant arc4 iwlagn iwlcore mac80211 snd_hda_intel snd_hda_codec snd_hwdep snd_pcm snd_seq_midi snd_rawmidi i915 snd_seq_midi_event snd_seq thinkpad_acpi drm_kms_helper btusb tpm_tis nvram uvcvideo snd_timer snd_seq_device bluetooth videodev v4l1_compat v4l2_compat_ioctl32 tpm drm tpm_bios snd cfg80211 psmouse serio_raw intel_ips soundcore snd_page_alloc intel_agp i2c_algo_bit video output netconsole configfs lp parport usbhid hid e1000e sdhci_pci ahci libahci sdhci led_class > [10663.822155] > [10663.822161] Pid: 6000, comm: ip Not tainted 2.6.35-23-generic #41-Ubuntu 2901CTO/2901CTO > [10663.822167] RIP: 0010:[<ffffffff8149c2fa>] [<ffffffff8149c2fa>] __dev_alloc_name+0x9a/0x170 > [10663.822177] RSP: 0018:ffff88014aebf7b8 EFLAGS: 00010286 > [10663.822182] RAX: 00000000fffffff4 RBX: ffff8801ad900800 RCX: 0000000000000000 > [10663.822187] RDX: ffff880000000000 RSI: 0000000000000000 RDI: ffff88014ad63000 > [10663.822191] RBP: ffff88014aebf808 R08: 0000000000000041 R09: 0000000000000041 > [10663.822196] R10: 0000000000000000 R11: dead000000200200 R12: ffff88014aebf818 > [10663.822201] R13: fffffffffffffffd R14: ffff88014aebf918 R15: ffff88014ad62000 > [10663.822207] FS: 00007f00c487f700(0000) GS:ffff880001f80000(0000) knlGS:0000000000000000 > [10663.822212] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > [10663.822216] CR2: 000000000000006d CR3: 0000000231f19000 CR4: 00000000000026e0 > [10663.822221] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > [10663.822226] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 > [10663.822231] Process ip (pid: 6000, threadinfo ffff88014aebe000, task ffff88014afb16e0) > [10663.822236] Stack: > [10663.822240] ffff88014aebf808 ffffffff814a2bb5 ffff88014aebf7e8 00000000a00ee8d6 > [10663.822251] <0> 0000000000000000 ffffffffa00ef940 ffff8801ad900800 ffff88014aebf818 > [10663.822265] <0> ffff88014aebf918 ffff8801ad900800 ffff88014aebf858 ffffffff8149c413 > [10663.822281] Call Trace: > [10663.822290] [<ffffffff814a2bb5>] ? dev_addr_init+0x75/0xb0 > [10663.822298] [<ffffffff8149c413>] dev_alloc_name+0x43/0x90 > [10663.822307] [<ffffffff814a85ee>] rtnl_create_link+0xbe/0x1b0 > [10663.822314] [<ffffffff814ab2aa>] rtnl_newlink+0x48a/0x570 > [10663.822321] [<ffffffff814aafcc>] ? rtnl_newlink+0x1ac/0x570 > [10663.822332] [<ffffffff81030064>] ? native_x2apic_icr_read+0x4/0x20 > [10663.822339] [<ffffffff814a8c17>] rtnetlink_rcv_msg+0x177/0x290 > [10663.822346] [<ffffffff814a8aa0>] ? rtnetlink_rcv_msg+0x0/0x290 > [10663.822354] [<ffffffff814c25d9>] netlink_rcv_skb+0xa9/0xd0 > [10663.822360] [<ffffffff814a8a85>] rtnetlink_rcv+0x25/0x40 > [10663.822367] [<ffffffff814c223e>] netlink_unicast+0x2de/0x2f0 > [10663.822374] [<ffffffff814c303e>] netlink_sendmsg+0x1fe/0x2e0 > [10663.822383] [<ffffffff81488533>] sock_sendmsg+0xf3/0x120 > [10663.822391] [<ffffffff815899fe>] ? _raw_spin_lock+0xe/0x20 > [10663.822400] [<ffffffff81168656>] ? __d_lookup+0x136/0x150 > [10663.822406] [<ffffffff815899fe>] ? _raw_spin_lock+0xe/0x20 > [10663.822414] [<ffffffff812b7a0d>] ? _atomic_dec_and_lock+0x4d/0x80 > [10663.822422] [<ffffffff8116ea90>] ? mntput_no_expire+0x30/0x110 > [10663.822429] [<ffffffff81486ff5>] ? move_addr_to_kernel+0x65/0x70 > [10663.822435] [<ffffffff81493308>] ? verify_iovec+0x88/0xe0 > [10663.822442] [<ffffffff81489020>] sys_sendmsg+0x240/0x3a0 > [10663.822450] [<ffffffff8111e2a9>] ? __do_fault+0x479/0x560 > [10663.822457] [<ffffffff815899fe>] ? _raw_spin_lock+0xe/0x20 > [10663.822465] [<ffffffff8116cf4a>] ? alloc_fd+0x10a/0x150 > [10663.822473] [<ffffffff8158d76e>] ? do_page_fault+0x15e/0x350 > [10663.822482] [<ffffffff8100a0f2>] system_call_fastpath+0x16/0x1b > [10663.822487] Code: 90 48 8d 78 02 be 25 00 00 00 e8 92 1d e2 ff 48 85 c0 75 cf bf 20 00 00 00 e8 c3 b1 c6 ff 49 89 c7 b8 f4 ff ff ff 4d 85 ff 74 bd <4d> 8b 75 70 49 8d 45 70 48 89 45 b8 49 83 ee 58 eb 28 48 8d 55 > [10663.822618] RIP [<ffffffff8149c2fa>] __dev_alloc_name+0x9a/0x170 > [10663.822627] RSP <ffff88014aebf7b8> > [10663.822631] CR2: 000000000000006d > [10663.822636] ---[ end trace 3dfd6c3ad5327ca7 ]--- This bug was introduced in: commit 81adee47dfb608df3ad0b91d230fb3cef75f0060 Author: Eric W. Biederman <ebiederm@aristanetworks.com> Date: Sun Nov 8 00:53:51 2009 -0800 net: Support specifying the network namespace upon device creation. There is no good reason to not support userspace specifying the network namespace during device creation, and it makes it easier to create a network device and pass it to a child network namespace with a well known name. We have to be careful to ensure that the target network namespace for the new device exists through the life of the call. To keep that logic clear I have factored out the network namespace grabbing logic into rtnl_link_get_net. In addtion we need to continue to pass the source network namespace to the rtnl_link_ops.newlink method so that we can find the base device source network namespace. Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Where apparently I forgot to add error handling to the path where we create a new network device in a new network namespace, and pass in an invalid pid. Cc: stable@kernel.org Reported-by: Ed Swierk <eswierk@bigswitch.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-29gro: Reset dev pointer on reuseHerbert Xu1-0/+1
On older kernels the VLAN code may zero skb->dev before dropping it and causing it to be reused by GRO. Unfortunately we didn't reset skb->dev in that case which causes the next GRO user to get a bogus skb->dev pointer. This particular problem no longer happens with the current upstream kernel due to changes in VLAN processing. However, for correctness we should still reset the skb->dev pointer in the GRO reuse function in case a future user does the same thing. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-27net: add kmemcheck annotation in __alloc_skb()Eric Dumazet1-0/+1
pskb_expand_head() triggers a kmemcheck warning when copy of skb_shared_info is done in pskb_expand_head() This is because destructor_arg field is not necessarily initialized at this point. Add kmemcheck_annotate_variable() call in __alloc_skb() to instruct kmemcheck this is a normal situation. Resolves bugzilla.kernel.org 27212 Reference: https://bugzilla.kernel.org/show_bug.cgi?id=27212 Reported-by: Christian Casteyde <casteyde.christian@free.fr> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> CC: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-27net: fix validate_link_af in rtnetlink coreKurt Van Dijck1-2/+1
I'm testing an API that uses IFLA_AF_SPEC attribute. In the rtnetlink core , the set_link_af() member of the rtnl_af_ops struct receives the nested attribute (as I expected), but the validate_link_af() member receives the parent attribute. IMO, this patch fixes this. Signed-off-by: Kurt Van Dijck <kurt.van.dijck@eia.be> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-24net: clear heap allocation for ethtool_get_regs()Eugene Teo1-1/+1
There is a conflict between commit b00916b1 and a77f5db3. This patch resolves the conflict by clearing the heap allocation in ethtool_get_regs(). Cc: stable@kernel.org Signed-off-by: Eugene Teo <eugeneteo@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-24GRO: fix merging a paged skb after non-paged skbsMichal Schmidt1-2/+6
Suppose that several linear skbs of the same flow were received by GRO. They were thus merged into one skb with a frag_list. Then a new skb of the same flow arrives, but it is a paged skb with data starting in its frags[]. Before adding the skb to the frag_list skb_gro_receive() will of course adjust the skb to throw away the headers. It correctly modifies the page_offset and size of the frag, but it leaves incorrect information in the skb: ->data_len is not decreased at all. ->len is decreased only by headlen, as if no change were done to the frag. Later in a receiving process this causes skb_copy_datagram_iovec() to return -EFAULT and this is seen in userspace as the result of the recv() syscall. In practice the bug can be reproduced with the sfc driver. By default the driver uses an adaptive scheme when it switches between using napi_gro_receive() (with skbs) and napi_gro_frags() (with pages). The bug is reproduced when under rx load with enough successful GRO merging the driver decides to switch from the former to the latter. Manual control is also possible, so reproducing this is easy with netcat: - on machine1 (with sfc): nc -l 12345 > /dev/null - on machine2: nc machine1 12345 < /dev/zero - on machine1: echo 1 > /sys/module/sfc/parameters/rx_alloc_method # use skbs echo 2 > /sys/module/sfc/parameters/rx_alloc_method # use pages - See that nc has quit suddenly. [v2: Modified by Eric Dumazet to avoid advancing skb->data past the end and to use a temporary variable.] Signed-off-by: Michal Schmidt <mschmidt@redhat.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-24Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/torvalds/linux-2.6David S. Miller2-4/+4
2011-01-24net: arp_ioctl() must hold RTNLEric Dumazet1-1/+2
Commit 941666c2e3e0 "net: RCU conversion of dev_getbyhwaddr() and arp_ioctl()" introduced a regression, reported by Jamie Heilman. "arp -Ds 192.168.2.41 eth0 pub" triggered the ASSERT_RTNL() assert in pneigh_lookup() Removing RTNL requirement from arp_ioctl() was a mistake, just revert that part. Reported-by: Jamie Heilman <jamie@audible.transient.net> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-19Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds2-4/+4
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (41 commits) sctp: user perfect name for Delayed SACK Timer option net: fix can_checksum_protocol() arguments swap Revert "netlink: test for all flags of the NLM_F_DUMP composite" gianfar: Fix misleading indentation in startup_gfar() net/irda/sh_irda: return to RX mode when TX error net offloading: Do not mask out NETIF_F_HW_VLAN_TX for vlan. USB CDC NCM: tx_fixup() race condition fix ns83820: Avoid bad pointer deref in ns83820_init_one(). ipv6: Silence privacy extensions initialization bnx2x: Update bnx2x version to 1.62.00-4 bnx2x: Fix AER setting for BCM57712 bnx2x: Fix BCM84823 LED behavior bnx2x: Mark full duplex on some external PHYs bnx2x: Fix BCM8073/BCM8727 microcode loading bnx2x: LED fix for BCM8727 over BCM57712 bnx2x: Common init will be executed only once after POR bnx2x: Swap BCM8073 PHY polarity if required iwlwifi: fix valid chain reading from EEPROM ath5k: fix locking in tx_complete_poll_work ath9k_hw: do PA offset calibration only on longcal interval ...
2011-01-19net: fix can_checksum_protocol() arguments swapEric Dumazet1-1/+1
commit 0363466866d901fbc (net offloading: Convert checksums to use centrally computed features.) mistakenly swapped can_checksum_protocol() arguments. This broke IPv6 on bnx2 for instance, on NIC without TCPv6 checksum offloads. Reported-by: Hans de Bruin <jmdebruin@xmsnet.nl> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: Jesse Gross <jesse@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-19Revert "netlink: test for all flags of the NLM_F_DUMP composite"David S. Miller1-1/+1
This reverts commit 0ab03c2b1478f2438d2c80204f7fef65b1bca9cf. It breaks several things including the avahi daemon. Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-18net offloading: Do not mask out NETIF_F_HW_VLAN_TX for vlan.Jesse Gross1-2/+2
In netif_skb_features() we return only the features that are valid for vlans if we have a vlan packet. However, we should not mask out NETIF_F_HW_VLAN_TX since it enables transmission of vlan tags and is obviously valid. Reported-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Jesse Gross <jesse@nicira.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-14Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds2-29/+2
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (47 commits) GRETH: resolve SMP issues and other problems GRETH: handle frame error interrupts GRETH: avoid writing bad speed/duplex when setting transfer mode GRETH: fixed skb buffer memory leak on frame errors GRETH: GBit transmit descriptor handling optimization GRETH: fix opening/closing GRETH: added raw AMBA vendor/device number to match against. cassini: Fix build bustage on x86. e1000e: consistent use of Rx/Tx vs. RX/TX/rx/tx in comments/logs e1000e: update Copyright for 2011 e1000: Avoid unhandled IRQ r8169: keep firmware in memory. netdev: tilepro: Use is_unicast_ether_addr helper etherdevice.h: Add is_unicast_ether_addr function ks8695net: Use default implementation of ethtool_ops::get_link ks8695net: Disable non-working ethtool operations USB CDC NCM: Don't deref NULL in cdc_ncm_rx_fixup() and don't use uninitialized variable. vxge: Remember to release firmware after upgrading firmware netdev: bfin_mac: Remove is_multicast_ether_addr use in netdev_for_each_mc_addr ipsec: update MAX_AH_AUTH_LEN to support sha512 ...
2011-01-13net: remove dev_txq_stats_fold()Eric Dumazet1-29/+0
After recent changes, (percpu stats on vlan/tunnels...), we dont need anymore per struct netdev_queue tx_bytes/tx_packets/tx_dropped counters. Only remaining users are ixgbe, sch_teql, gianfar & macvlan : 1) ixgbe can be converted to use existing tx_ring counters. 2) macvlan incremented txq->tx_dropped, it can use the dev->stats.tx_dropped counter. 3) sch_teql : almost revert ab35cd4b8f42 (Use net_device internal stats) Now we have ndo_get_stats64(), use it, even for "unsigned long" fields (No need to bring back a struct net_device_stats) 4) gianfar adds a stats structure per tx queue to hold tx_bytes/tx_packets This removes a lockdep warning (and possible lockup) in rndis gadget, calling dev_get_stats() from hard IRQ context. Ref: http://www.spinics.net/lists/netdev/msg149202.html Reported-by: Neil Jones <neiljay@gmail.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> CC: Jarek Poplawski <jarkao2@gmail.com> CC: Alexander Duyck <alexander.h.duyck@intel.com> CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com> CC: Sandeep Gopalpet <sandeep.kumar@freescale.com> CC: Michal Nazarewicz <mina86@mina86.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-13Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6Linus Torvalds1-3/+3
* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (46 commits) hwrng: via_rng - Fix memory scribbling on some CPUs crypto: padlock - Move padlock.h into include/crypto hwrng: via_rng - Fix asm constraints crypto: n2 - use __devexit not __exit in n2_unregister_algs crypto: mark crypto workqueues CPU_INTENSIVE crypto: mv_cesa - dont return PTR_ERR() of wrong pointer crypto: ripemd - Set module author and update email address crypto: omap-sham - backlog handling fix crypto: gf128mul - Remove experimental tag crypto: af_alg - fix af_alg memory_allocated data type crypto: aesni-intel - Fixed build with binutils 2.16 crypto: af_alg - Make sure sk_security is initialized on accept()ed sockets net: Add missing lockdep class names for af_alg include: Install linux/if_alg.h for user-space crypto API crypto: omap-aes - checkpatch --file warning fixes crypto: omap-aes - initialize aes module once per request crypto: omap-aes - unnecessary code removed crypto: omap-aes - error handling implementation improved crypto: omap-aes - redundant locking is removed crypto: omap-aes - DMA initialization fixes for OMAP off mode ...
2011-01-13Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivialLinus Torvalds1-1/+1
* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (43 commits) Documentation/trace/events.txt: Remove obsolete sched_signal_send. writeback: fix global_dirty_limits comment runtime -> real-time ppc: fix comment typo singal -> signal drivers: fix comment typo diable -> disable. m68k: fix comment typo diable -> disable. wireless: comment typo fix diable -> disable. media: comment typo fix diable -> disable. remove doc for obsolete dynamic-printk kernel-parameter remove extraneous 'is' from Documentation/iostats.txt Fix spelling milisec -> ms in snd_ps3 module parameter description Fix spelling mistakes in comments Revert conflicting V4L changes i7core_edac: fix typos in comments mm/rmap.c: fix comment sound, ca0106: Fix assignment to 'channel'. hrtimer: fix a typo in comment init/Kconfig: fix typo anon_inodes: fix wrong function name in comment fix comment typos concerning "consistent" poll: fix a typo in comment ... Fix up trivial conflicts in: - drivers/net/wireless/iwlwifi/iwl-core.c (moved to iwl-legacy.c) - fs/ext4/ext4.h Also fix missed 'diabled' typo in drivers/net/bnx2x/bnx2x.h while at it.
2011-01-12Merge branch 'master' of git://1984.lsi.us.es/net-2.6David S. Miller1-0/+2
2011-01-12netfilter: fix compilation when conntrack is disabled but tproxy is enabledKOVACS Krisztian1-0/+2
The IPv6 tproxy patches split IPv6 defragmentation off of conntrack, but failed to update the #ifdef stanzas guarding the defragmentation related fields and code in skbuff and conntrack related code in nf_defrag_ipv6.c. This patch adds the required #ifdefs so that IPv6 tproxy can truly be used without connection tracking. Original report: http://marc.info/?l=linux-netdev&m=129010118516341&w=2 Reported-by: Randy Dunlap <randy.dunlap@oracle.com> Acked-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: KOVACS Krisztian <hidden@balabit.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2011-01-10net_sched: factorize qdisc stats handlingEric Dumazet1-1/+4
HTB takes into account skb is segmented in stats updates. Generalize this to all schedulers. They should use qdisc_bstats_update() helper instead of manipulating bstats.bytes and bstats.packets Add bstats_update() helper too for classes that use gnet_stats_basic_packed fields. Note : Right now, TCQ_F_CAN_BYPASS shortcurt can be taken only if no stab is setup on qdisc. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-10net: Add alloc_netdev_mqs functionTom Herbert1-11/+21
Added alloc_netdev_mqs function which allows the number of transmit and receive queues to be specified independenty. alloc_netdev_mq was changed to a macro to call the new function. Also added alloc_etherdev_mqs with same purpose. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-09net offloading: Convert checksums to use centrally computed features.Jesse Gross1-28/+12
In order to compute the features for other offloads (primarily scatter/gather), we need to first check the ability of the NIC to offload the checksum for the packet. Since we have already computed this, we can directly use the result instead of figuring it out again. Signed-off-by: Jesse Gross <jesse@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-09net offloading: Convert skb_need_linearize() to use precomputed features.Jesse Gross1-15/+6
This switches skb_need_linearize() to use the features that have been centrally computed. In doing so, this fixes a problem where scatter/gather should not be used because the card does not support checksum offloading on that type of packet. On device registration we only check that some form of checksum offloading is available if scatter/gatther is enabled but we must also check at transmission time. Examples of this include IPv6 or vlan packets on a NIC that only supports IPv4 offloading. Signed-off-by: Jesse Gross <jesse@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-09net offloading: Convert dev_gso_segment() to use precomputed features.Jesse Gross1-5/+3
This switches dev_gso_segment() to use the device features computed by the centralized routine. In doing so, it fixes a problem where it would always use dev->features, instead of those appropriate to the number of vlan tags if any are present. Signed-off-by: Jesse Gross <jesse@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-09net offloading: Pass features into netif_needs_gso().Jesse Gross1-2/+6
Now that there is a single function that can compute the device features relevant to a packet, we don't want to run it for each offload. This converts netif_needs_gso() to take the features of the device, rather than computing them itself. Signed-off-by: Jesse Gross <jesse@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-09net offloading: Generalize netif_get_vlan_features().Jesse Gross1-8/+27
netif_get_vlan_features() is currently only used by netif_needs_gso(), so it only concerns itself with GSO features. However, several other places also should take into account the contents of the packet when deciding whether to offload to hardware. This generalizes the function to return features about all of the various forms of offloading. Since offloads tend to be linked together, this avoids duplicating the logic in each location (i.e. the scatter/gather code also needs the checksum logic). Suggested-by: Michał Mirosław <mirqus@gmail.com> Signed-off-by: Jesse Gross <jesse@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-09net offloading: Accept NETIF_F_HW_CSUM for all protocols.Jesse Gross1-1/+1
We currently only have software fallback for one type of checksum: the TCP/UDP one's complement. This means that a protocol that uses hardware offloading for a different type of checksum (FCoE, SCTP) must directly check the device's features and do the right thing ahead of time. By the time we get to dev_can_checksum(), we're only deciding whether to apply the one algorithm in software or hardware. NETIF_F_HW_CSUM has the same capabilities as the software version, so we should always use it if present. The primary advantage of this is multiply tagged vlans can use hardware checksumming. Signed-off-by: Jesse Gross <jesse@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-09net: fix kernel-doc warning in core/filter.cRandy Dunlap1-1/+1
Fix new kernel-doc notation warning in net/core/filter.c: Warning(net/core/filter.c:172): No description found for parameter 'fentry' Warning(net/core/filter.c:172): Excess function parameter 'filter' description in 'sk_run_filter' Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-09netlink: test for all flags of the NLM_F_DUMP compositeJan Engelhardt1-1/+1
Due to NLM_F_DUMP is composed of two bits, NLM_F_ROOT | NLM_F_MATCH, when doing "if (x & NLM_F_DUMP)", it tests for _either_ of the bits being set. Because NLM_F_MATCH's value overlaps with NLM_F_EXCL, non-dump requests with NLM_F_EXCL set are mistaken as dump requests. Substitute the condition to test for _all_ bits being set. Signed-off-by: Jan Engelhardt <jengelh@medozas.de> Acked-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-07Merge branch 'for-2.6.38' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wqLinus Torvalds1-1/+1
* 'for-2.6.38' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: (33 commits) usb: don't use flush_scheduled_work() speedtch: don't abuse struct delayed_work media/video: don't use flush_scheduled_work() media/video: explicitly flush request_module work ioc4: use static work_struct for ioc4_load_modules() init: don't call flush_scheduled_work() from do_initcalls() s390: don't use flush_scheduled_work() rtc: don't use flush_scheduled_work() mmc: update workqueue usages mfd: update workqueue usages dvb: don't use flush_scheduled_work() leds-wm8350: don't use flush_scheduled_work() mISDN: don't use flush_scheduled_work() macintosh/ams: don't use flush_scheduled_work() vmwgfx: don't use flush_scheduled_work() tpm: don't use flush_scheduled_work() sonypi: don't use flush_scheduled_work() hvsi: don't use flush_scheduled_work() xen: don't use flush_scheduled_work() gdrom: don't use flush_scheduled_work() ... Fixed up trivial conflict in drivers/media/video/bt8xx/bttv-input.c as per Tejun.
2011-01-06net: add POLLPRI to sock_def_readable()Eric Dumazet1-1/+1
Leonardo Chiquitto found poll() could block forever on tcp sockets and Urgent data was received, if the event flag only contains POLLPRI. He did a bisection and found commit 4938d7e0233 (poll: avoid extra wakeups in select/poll) was the source of the problem. Problem is TCP sockets use standard sock_def_readable() function for their sk_data_ready() handler, and sock_def_readable() doesnt signal POLLPRI. Only TCP is affected by the problem. Adding POLLPRI to the list of flags might trigger unnecessary schedules, but URGENT handling is such a seldom used feature this seems a good compromise. Thanks a lot to Leonardo for providing the bisection result and a test program as well. Reference : http://www.spinics.net/lists/netdev/msg151793.html Reported-and-bisected-by: Leonardo Chiquitto <leonardo.lists@gmail.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Tested-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-12-26Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6David S. Miller1-2/+1
Conflicts: net/ipv4/fib_frontend.c
2010-12-23Revert "ipv4: Allow configuring subnets as local addresses"David S. Miller1-2/+1
This reverts commit 4465b469008bc03b98a1b8df4e9ae501b6c69d4b. Conflicts: net/ipv4/fib_frontend.c As reported by Ben Greear, this causes regressions: > Change 4465b469008bc03b98a1b8df4e9ae501b6c69d4b caused rules > to stop matching the input device properly because the > FLOWI_FLAG_MATCH_ANY_IIF is always defined in ip_dev_find(). > > This breaks rules such as: > > ip rule add pref 512 lookup local > ip rule del pref 0 lookup local > ip link set eth2 up > ip -4 addr add 172.16.0.102/24 broadcast 172.16.0.255 dev eth2 > ip rule add to 172.16.0.102 iif eth2 lookup local pref 10 > ip rule add iif eth2 lookup 10001 pref 20 > ip route add 172.16.0.0/24 dev eth2 table 10001 > ip route add unreachable 0/0 table 10001 > > If you had a second interface 'eth0' that was on a different > subnet, pinging a system on that interface would fail: > > [root@ct503-60 ~]# ping 192.168.100.1 > connect: Invalid argument Reported-by: Ben Greear <greearb@candelatech.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-12-22Merge branch 'master' into for-nextJiri Kosina9-64/+71
Conflicts: MAINTAINERS arch/arm/mach-omap2/pm24xx.c drivers/scsi/bfa/bfa_fcpim.c Needed to update to apply fixes for which the old branch was too outdated.
2010-12-21filter: optimize accesses to ancillary dataEric Dumazet1-28/+44
We can translate pseudo load instructions at filter check time to dedicated instructions to speed up filtering and avoid one switch(). libpcap currently uses SKF_AD_PROTOCOL, but custom filters probably use other ancillary accesses. Note : I made the assertion that ancillary data was always accessed with BPF_LD|BPF_?|BPF_ABS instructions, not with BPF_LD|BPF_?|BPF_IND ones (offset given by K constant, not by K + X register) On x86_64, this saves a few bytes of text : # size net/core/filter.o.* text data bss dec hex filename 4864 0 0 4864 1300 net/core/filter.o.new 4944 0 0 4944 1350 net/core/filter.o.old Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-12-21net: timestamp cloned packet in dev_queue_xmit_nitEric Dumazet1-7/+2
Le vendredi 17 décembre 2010 à 10:26 +0100, Eric Dumazet a écrit : > > I think we can add this after latest Changli patch : > > He does one skb_clone() before calling the sniffers. > We could set timestamp on this clone, instead of original skb. > > Problem solved. > [PATCH net-next-2.6] net: timestamp cloned packet in dev_queue_xmit_nit Now we do one clone of skb if at least one sniffer might take packet, we also can do the skb timestamping on the clone and let original packet unchanged. This is a generalization of commit 8caf153974f2 (net: sch_netem: Fix an inconsistency in ingress netem timestamps.) This way, we can have a good idea when packets are delivered to our stack (tcpdump -i ifb0), while a tcpdump on original device gives timestamps right before ingressing. This also speedup our stack, avoiding taking timestamps if not needed. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Changli Gao <xiaosuo@gmail.com> Cc: Patrick McHardy <kaber@trash.net> Cc: Jarek Poplawski <jarkao2@gmail.com> Acked-by: Changli Gao <xiaosuo@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-12-20pktgen: Remove unnecessary prefix from pr_<level>Joe Perches1-2/+1
Remove "pktgen: " prefix string from one pr_info. pr_fmt adds it, so this is a duplicate. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-12-19net: kill unused macrosShan Wei2-2/+0
These macros never be used, so remove them. Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-12-19net: increase skb->users instead of skb_clone()Changli Gao1-10/+20
In dev_queue_xmit_nit(), we have to clone skbs as we need to mangle skbs, however, we don't need to clone skbs for all the packet_types. Except for the first packet_type, we increase skb->users instead of skb_clone(). Signed-off-by: Changli Gao <xiaosuo@gmail.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-12-17Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6David S. Miller2-14/+39
Conflicts: drivers/net/bnx2x/bnx2x.h drivers/net/wireless/iwlwifi/iwl-1000.c drivers/net/wireless/iwlwifi/iwl-6000.c drivers/net/wireless/iwlwifi/iwl-core.h drivers/vhost/vhost.c
2010-12-16net: Use skb_checksum_start_offset()Michał Mirosław2-4/+4
Replace skb->csum_start - skb_headroom(skb) with skb_checksum_start_offset(). Note for usb/smsc95xx: skb->data - skb->head == skb_headroom(skb). Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-12-16net: fix nulls list corruptions in sk_prot_allocOctavian Purdila1-12/+35
Special care is taken inside sk_port_alloc to avoid overwriting skc_node/skc_nulls_node. We should also avoid overwriting skc_bind_node/skc_portaddr_node. The patch fixes the following crash: BUG: unable to handle kernel paging request at fffffffffffffff0 IP: [<ffffffff812ec6dd>] udp4_lib_lookup2+0xad/0x370 [<ffffffff812ecc22>] __udp4_lib_lookup+0x282/0x360 [<ffffffff812ed63e>] __udp4_lib_rcv+0x31e/0x700 [<ffffffff812bba45>] ? ip_local_deliver_finish+0x65/0x190 [<ffffffff812bbbf8>] ? ip_local_deliver+0x88/0xa0 [<ffffffff812eda35>] udp_rcv+0x15/0x20 [<ffffffff812bba45>] ip_local_deliver_finish+0x65/0x190 [<ffffffff812bbbf8>] ip_local_deliver+0x88/0xa0 [<ffffffff812bb2cd>] ip_rcv_finish+0x32d/0x6f0 [<ffffffff8128c14c>] ? netif_receive_skb+0x99c/0x11c0 [<ffffffff812bb94b>] ip_rcv+0x2bb/0x350 [<ffffffff8128c14c>] netif_receive_skb+0x99c/0x11c0 Signed-off-by: Leonard Crestez <lcrestez@ixiacom.com> Signed-off-by: Octavian Purdila <opurdila@ixiacom.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-12-16net: factorize sync-rcu call in unregister_netdevice_manyOctavian Purdila1-42/+76
Add dev_close_many and dev_deactivate_many to factorize another sync-rcu operation on the netdevice unregister path. $ modprobe dummy numdummies=10000 $ ip link set dev dummy* up $ time rmmod dummy Without the patch With the patch real 0m 24.63s real 0m 5.15s user 0m 0.00s user 0m 0.00s sys 0m 6.05s sys 0m 5.14s Signed-off-by: Octavian Purdila <opurdila@ixiacom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-12-16net: use NUMA_NO_NODE instead of the magic number -1Changli Gao2-2/+3
Signed-off-by: Changli Gao <xiaosuo@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-12-16bnx2x: Take the distribution range definition out of skb_tx_hash()Vladislav Zolotarov1-5/+10
Move the calcualation of the Tx hash for a given hash range into a separate function and define the skb_tx_hash(), which calculates a Tx hash for a [0; dev->real_num_tx_queues - 1] hash values range, using this function (__skb_tx_hash()). Signed-off-by: Vladislav Zolotarov <vladz@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>