aboutsummaryrefslogtreecommitdiffstats
path: root/net/core (follow)
AgeCommit message (Collapse)AuthorFilesLines
2011-03-07net: allow handlers to be processed for orig_devJiri Pirko1-1/+2
This was there before, I forgot about this. Allows deliveries to ptype_base handlers registered for orig_dev. I presume this is still desired. Signed-off-by: Jiri Pirko <jpirko@redhat.com> Reviewed-by: Nicolas de Pesloüan <nicolas.2p.debian@free.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-03Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6David S. Miller1-1/+1
Conflicts: drivers/net/bnx2x/bnx2x.h
2011-03-01inet: Remove unused sk_sndmsg_* from UFOHerbert Xu1-3/+0
UFO doesn't really use the sk_sndmsg_* parameters so touching them is pointless. It can't use them anyway since the whole point of UFO is to use the original pages without copying. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-28net: Forgot to commit net/core/dev.c part of Jiri's ->rx_handler patch.David S. Miller1-88/+31
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-27bond: service netpoll arp queue on master deviceAmerigo Wang1-0/+11
Neil pointed out that we can't send ARP reply on behalf of slaves, we need to move the arp queue to their bond device. Signed-off-by: WANG Cong <amwang@redhat.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-27netpoll: remove IFF_IN_NETPOLL flagAmerigo Wang1-2/+0
V4: rebase to net-next-2.6 This patch removes the flag IFF_IN_NETPOLL, we don't need it any more since we have netpoll_tx_running() now. Signed-off-by: WANG Cong <amwang@redhat.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-25net: handle addr_type of 0 properlyHagen Paul Pfeifer1-1/+1
addr_type of 0 means that the type should be adopted from from_dev and not from __hw_addr_del_multiple(). Unfortunately it isn't so and addr_type will always be considered. Fix this by implementing the considered and documented behavior. Signed-off-by: Hagen Paul Pfeifer <hagen@jauu.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-23net: Implement SFEATURES compatibility for not updated driversMichał Mirosław1-0/+61
Use discrete setting ops for not updated drivers. This will not make them conform to full G/SFEATURES semantics, though. Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-23net: Fix ETHTOOL_GFEATURES compatibilityMichał Mirosław1-0/+14
Implement getting rx checksum state for not updated drivers. Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-23net: avoid initial "Features changed" messageMichał Mirosław1-3/+5
Avoid "Features changed" message and ndo_set_features call on device registration caused by automatic enabling of GSO and GRO. Driver should have enabled hardware offloads it set in features, so the ndo_set_features() is not needed at registration time. Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-23Fix "(unregistered net_device): Features changed" messageMichał Mirosław1-2/+2
Fix netdev_update_features() messages on register time by moving the call further in register_netdevice(). When netdev->reg_state != NETREG_REGISTERED, netdev_name() returns "(unregistered netdevice)" even if the dev's name is already filled. Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-22net: Make flow cache paths use a const struct flowi.David S. Miller1-7/+7
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-22Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/bwh/sfc-next-2.6David S. Miller1-1/+2
2011-02-19Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6David S. Miller1-2/+7
Conflicts: Documentation/feature-removal-schedule.txt drivers/net/e1000e/netdev.c net/xfrm/xfrm_policy.c
2011-02-18net: deinit automatic LIST_HEADEric Dumazet1-0/+2
commit 9b5e383c11b08784 (net: Introduce unregister_netdevice_many()) left an active LIST_HEAD() in rollback_registered(), with possible memory corruption. Even if device is freed without touching its unreg_list (and therefore touching the previous memory location holding LISTE_HEAD(single), better close the bug for good, since its really subtle. (Same fix for default_device_exit_batch() for completeness) Reported-by: Michal Hocko <mhocko@suse.cz> Tested-by: Michal Hocko <mhocko@suse.cz> Reported-by: Eric W. Biderman <ebiderman@xmission.com> Tested-by: Eric W. Biderman <ebiderman@xmission.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> CC: Ingo Molnar <mingo@elte.hu> CC: Octavian Purdila <opurdila@ixiacom.com> CC: stable <stable@kernel.org> [.33+] Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-18net: dont leave active on stack LIST_HEADLinus Torvalds1-2/+5
Eric W. Biderman and Michal Hocko reported various memory corruptions that we suspected to be related to a LIST head located on stack, that was manipulated after thread left function frame (and eventually exited, so its stack was freed and reused). Eric Dumazet suggested the problem was probably coming from commit 443457242beb (net: factorize sync-rcu call in unregister_netdevice_many) This patch fixes __dev_close() and dev_close() to properly deinit their respective LIST_HEAD(single) before exiting. References: https://lkml.org/lkml/2011/2/16/304 References: https://lkml.org/lkml/2011/2/14/223 Reported-by: Michal Hocko <mhocko@suse.cz> Tested-by: Michal Hocko <mhocko@suse.cz> Reported-by: Eric W. Biderman <ebiderman@xmission.com> Tested-by: Eric W. Biderman <ebiderman@xmission.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> CC: Ingo Molnar <mingo@elte.hu> CC: Octavian Purdila <opurdila@ixiacom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-17net: Add initial_ref arg to dst_alloc().David S. Miller1-2/+2
This allows avoiding multiple writes to the initial __refcnt. The most simplest cases of wanting an initial reference of "1" in ipv4 and ipv6 have been converted, the rest have been left along and kept at the existing "0". Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-17net: introduce NETIF_F_RXCSUMMichał Mirosław1-24/+23
Introduce NETIF_F_RXCSUM to replace device-private flags for RX checksum offload. Integrate it with ndo_fix_features. ethtool_op_get_rx_csum() is removed altogether as nothing in-tree uses it. Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-17net: use ndo_fix_features for ethtool_ops->set_flagsMichał Mirosław1-2/+29
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-17net: ethtool: use ndo_fix_features for offload settingMichał Mirosław1-13/+32
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-17net: Introduce new feature setting opsMichał Mirosław2-8/+163
This introduces a new framework to handle device features setting. It consists of: - new fields in struct net_device: + hw_features - features that hw/driver supports toggling + wanted_features - features that user wants enabled, when possible - new netdev_ops: + feat = ndo_fix_features(dev, feat) - API checking constraints for enabling features or their combinations + ndo_set_features(dev) - API updating hardware state to match changed dev->features - new ethtool commands: + ETHTOOL_GFEATURES/ETHTOOL_SFEATURES: get/set dev->wanted_features and trigger device reconfiguration if resulting dev->features changed + ETHTOOL_GSTRINGS(ETH_SS_FEATURES): get feature bits names (meaning) Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-17ethtool: factorize get/set_one_featureMichał Mirosław1-142/+132
This allows to enable GRO even if RX csum is disabled. GRO will not be used for packets without hardware checksum anyway. Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-17ethtool: factorize ethtool_get_strings() and ethtool_get_sset_count()Michał Mirosław1-12/+23
This is needed for unified offloads patch. Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-17ethtool: enable GSO and GRO by defaultMichał Mirosław1-4/+14
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-17ethtool: move EXPORT_SYMBOL(ethtool_op_set_tx_csum) to correct placeMichał Mirosław1-1/+1
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-15net: RPS: Make hardware-accelerated RFS conditional on NETIF_F_NTUPLEBen Hutchings1-1/+2
For testing and debugging purposes it is useful to be able to disable hardware acceleration of RFS without disabling RFS altogether. Since this is a similar feature to 'n-tuple' flow steering through the ethtool API, test the same feature flag that controls that. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
2011-02-15Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/bwh/sfc-next-2.6David S. Miller1-1/+2
2011-02-15net: Adjust TX queue kobjects if number of queues changes during unregisterBen Hutchings1-1/+2
If the root qdisc for a net device is mqprio, and the driver's ndo_setup_tc() operation dynamically adds and remvoes TX queues, netif_set_real_num_tx_queues() will be called during device unregistration to remove the extra TX queues when the qdisc is destroyed. Currently this causes the corresponding kobjects to be leaked, and the device's reference count never drops to 0. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
2011-02-13rtnetlink: implement setting of master deviceJiri Pirko1-0/+43
This patch allows userspace to enslave/release slave devices via netlink interface using IFLA_MASTER. This introduces generic way to add/remove underling devices. Signed-off-by: Jiri Pirko <jpirko@redhat.com> Acked-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-13net: make dev->master generalJiri Pirko1-12/+37
dev->master is now tightly connected to bonding driver. This patch makes this pointer more general and ready to be used by others. - netdev_set_master() - bond specifics moved to new function netdev_set_bond_master() - introduced netif_is_bond_slave() to check if device is a bonding slave Signed-off-by: Jiri Pirko <jpirko@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-13net: remove the unnecessary dance around skb_bond_should_dropJiri Pirko1-3/+3
No need to check (master) twice and to drive in and out the header file. Signed-off-by: Jiri Pirko <jpirko@redhat.com> Reviewed-by: Nicolas de Pesloüan <nicolas.2p.debian@free.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-09net: rename group sysfs entry to netdev_groupXiaotian Feng1-1/+1
commit a512b92 adds sysfs entry for net device group, but before this commit, tun also uses group sysfs, so after this commit checkin, kernel warns like this: sysfs: cannot create duplicate filename '/devices/virtual/net/vnet0/group' Since tun has used this for years, rename sysfs under tun might break existing userspace, so rename group sysfs entry for net device group is a better choice. Signed-off-by: Xiaotian Feng <dfeng@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-08Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6David S. Miller1-11/+16
Conflicts: drivers/net/e1000e/netdev.c
2011-02-08net: Fix lockdep regression caused by initializing netdev queues too early.David S. Miller1-11/+16
In commit aa9421041128abb4d269ee1dc502ff65fb3b7d69 ("net: init ingress queue") we moved the allocation and lock initialization of the queues into alloc_netdev_mq() since register_netdevice() is way too late. The problem is that dev->type is not setup until the setup() callback is invoked by alloc_netdev_mq(), and the dev->type is what determines the lockdep class to use for the locks in the queues. Fix this by doing the queue allocation after the setup() callback runs. This is safe because the setup() callback is not allowed to make any state changes that need to be undone on error (memory allocations, etc.). It may, however, make state changes that are undone by free_netdev() (such as netif_napi_add(), which is done by the ipoib driver's setup routine). The previous code also leaked a reference to the &init_net namespace object on RX/TX queue allocation failures. Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-04Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6David S. Miller1-1/+3
2011-02-02gro: reset skb_iif on reuseAndy Gospodarek1-0/+1
Like Herbert's change from a few days ago: 66c46d741e2e60f0e8b625b80edb0ab820c46d7a gro: Reset dev pointer on reuse this may not be necessary at this point, but we should still clean up the skb->skb_iif. If not we may end up with an invalid valid for skb->skb_iif when the skb is reused and the check is done in __netif_receive_skb. Signed-off-by: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-31net: Check rps_flow_table when RPS map length is 1Tom Herbert1-1/+2
In get_rps_cpu, add check that the rps_flow_table for the device is NULL when trying to take fast path when RPS map length is one. Without this, RFS is effectively disabled if map length is one which is not correct. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-31Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6David S. Miller2-0/+4
2011-01-30net: Fix ip link add netns oopsEric W. Biederman1-0/+3
Ed Swierk <eswierk@bigswitch.com> writes: > On 2.6.35.7 > ip link add link eth0 netns 9999 type macvlan > where 9999 is a nonexistent PID triggers an oops and causes all network functions to hang: > [10663.821898] BUG: unable to handle kernel NULL pointer dereference at 000000000000006d > [10663.821917] IP: [<ffffffff8149c2fa>] __dev_alloc_name+0x9a/0x170 > [10663.821933] PGD 1d3927067 PUD 22f5c5067 PMD 0 > [10663.821944] Oops: 0000 [#1] SMP > [10663.821953] last sysfs file: /sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq > [10663.821959] CPU 3 > [10663.821963] Modules linked in: macvlan ip6table_filter ip6_tables rfcomm ipt_MASQUERADE binfmt_misc iptable_nat nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack sco ipt_REJECT bnep l2cap xt_tcpudp iptable_filter ip_tables x_tables bridge stp vboxnetadp vboxnetflt vboxdrv kvm_intel kvm parport_pc ppdev snd_hda_codec_intelhdmi snd_hda_codec_conexant arc4 iwlagn iwlcore mac80211 snd_hda_intel snd_hda_codec snd_hwdep snd_pcm snd_seq_midi snd_rawmidi i915 snd_seq_midi_event snd_seq thinkpad_acpi drm_kms_helper btusb tpm_tis nvram uvcvideo snd_timer snd_seq_device bluetooth videodev v4l1_compat v4l2_compat_ioctl32 tpm drm tpm_bios snd cfg80211 psmouse serio_raw intel_ips soundcore snd_page_alloc intel_agp i2c_algo_bit video output netconsole configfs lp parport usbhid hid e1000e sdhci_pci ahci libahci sdhci led_class > [10663.822155] > [10663.822161] Pid: 6000, comm: ip Not tainted 2.6.35-23-generic #41-Ubuntu 2901CTO/2901CTO > [10663.822167] RIP: 0010:[<ffffffff8149c2fa>] [<ffffffff8149c2fa>] __dev_alloc_name+0x9a/0x170 > [10663.822177] RSP: 0018:ffff88014aebf7b8 EFLAGS: 00010286 > [10663.822182] RAX: 00000000fffffff4 RBX: ffff8801ad900800 RCX: 0000000000000000 > [10663.822187] RDX: ffff880000000000 RSI: 0000000000000000 RDI: ffff88014ad63000 > [10663.822191] RBP: ffff88014aebf808 R08: 0000000000000041 R09: 0000000000000041 > [10663.822196] R10: 0000000000000000 R11: dead000000200200 R12: ffff88014aebf818 > [10663.822201] R13: fffffffffffffffd R14: ffff88014aebf918 R15: ffff88014ad62000 > [10663.822207] FS: 00007f00c487f700(0000) GS:ffff880001f80000(0000) knlGS:0000000000000000 > [10663.822212] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > [10663.822216] CR2: 000000000000006d CR3: 0000000231f19000 CR4: 00000000000026e0 > [10663.822221] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > [10663.822226] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 > [10663.822231] Process ip (pid: 6000, threadinfo ffff88014aebe000, task ffff88014afb16e0) > [10663.822236] Stack: > [10663.822240] ffff88014aebf808 ffffffff814a2bb5 ffff88014aebf7e8 00000000a00ee8d6 > [10663.822251] <0> 0000000000000000 ffffffffa00ef940 ffff8801ad900800 ffff88014aebf818 > [10663.822265] <0> ffff88014aebf918 ffff8801ad900800 ffff88014aebf858 ffffffff8149c413 > [10663.822281] Call Trace: > [10663.822290] [<ffffffff814a2bb5>] ? dev_addr_init+0x75/0xb0 > [10663.822298] [<ffffffff8149c413>] dev_alloc_name+0x43/0x90 > [10663.822307] [<ffffffff814a85ee>] rtnl_create_link+0xbe/0x1b0 > [10663.822314] [<ffffffff814ab2aa>] rtnl_newlink+0x48a/0x570 > [10663.822321] [<ffffffff814aafcc>] ? rtnl_newlink+0x1ac/0x570 > [10663.822332] [<ffffffff81030064>] ? native_x2apic_icr_read+0x4/0x20 > [10663.822339] [<ffffffff814a8c17>] rtnetlink_rcv_msg+0x177/0x290 > [10663.822346] [<ffffffff814a8aa0>] ? rtnetlink_rcv_msg+0x0/0x290 > [10663.822354] [<ffffffff814c25d9>] netlink_rcv_skb+0xa9/0xd0 > [10663.822360] [<ffffffff814a8a85>] rtnetlink_rcv+0x25/0x40 > [10663.822367] [<ffffffff814c223e>] netlink_unicast+0x2de/0x2f0 > [10663.822374] [<ffffffff814c303e>] netlink_sendmsg+0x1fe/0x2e0 > [10663.822383] [<ffffffff81488533>] sock_sendmsg+0xf3/0x120 > [10663.822391] [<ffffffff815899fe>] ? _raw_spin_lock+0xe/0x20 > [10663.822400] [<ffffffff81168656>] ? __d_lookup+0x136/0x150 > [10663.822406] [<ffffffff815899fe>] ? _raw_spin_lock+0xe/0x20 > [10663.822414] [<ffffffff812b7a0d>] ? _atomic_dec_and_lock+0x4d/0x80 > [10663.822422] [<ffffffff8116ea90>] ? mntput_no_expire+0x30/0x110 > [10663.822429] [<ffffffff81486ff5>] ? move_addr_to_kernel+0x65/0x70 > [10663.822435] [<ffffffff81493308>] ? verify_iovec+0x88/0xe0 > [10663.822442] [<ffffffff81489020>] sys_sendmsg+0x240/0x3a0 > [10663.822450] [<ffffffff8111e2a9>] ? __do_fault+0x479/0x560 > [10663.822457] [<ffffffff815899fe>] ? _raw_spin_lock+0xe/0x20 > [10663.822465] [<ffffffff8116cf4a>] ? alloc_fd+0x10a/0x150 > [10663.822473] [<ffffffff8158d76e>] ? do_page_fault+0x15e/0x350 > [10663.822482] [<ffffffff8100a0f2>] system_call_fastpath+0x16/0x1b > [10663.822487] Code: 90 48 8d 78 02 be 25 00 00 00 e8 92 1d e2 ff 48 85 c0 75 cf bf 20 00 00 00 e8 c3 b1 c6 ff 49 89 c7 b8 f4 ff ff ff 4d 85 ff 74 bd <4d> 8b 75 70 49 8d 45 70 48 89 45 b8 49 83 ee 58 eb 28 48 8d 55 > [10663.822618] RIP [<ffffffff8149c2fa>] __dev_alloc_name+0x9a/0x170 > [10663.822627] RSP <ffff88014aebf7b8> > [10663.822631] CR2: 000000000000006d > [10663.822636] ---[ end trace 3dfd6c3ad5327ca7 ]--- This bug was introduced in: commit 81adee47dfb608df3ad0b91d230fb3cef75f0060 Author: Eric W. Biederman <ebiederm@aristanetworks.com> Date: Sun Nov 8 00:53:51 2009 -0800 net: Support specifying the network namespace upon device creation. There is no good reason to not support userspace specifying the network namespace during device creation, and it makes it easier to create a network device and pass it to a child network namespace with a well known name. We have to be careful to ensure that the target network namespace for the new device exists through the life of the call. To keep that logic clear I have factored out the network namespace grabbing logic into rtnl_link_get_net. In addtion we need to continue to pass the source network namespace to the rtnl_link_ops.newlink method so that we can find the base device source network namespace. Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Where apparently I forgot to add error handling to the path where we create a new network device in a new network namespace, and pass in an invalid pid. Cc: stable@kernel.org Reported-by: Ed Swierk <eswierk@bigswitch.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-29gro: Reset dev pointer on reuseHerbert Xu1-0/+1
On older kernels the VLAN code may zero skb->dev before dropping it and causing it to be reused by GRO. Unfortunately we didn't reset skb->dev in that case which causes the next GRO user to get a bogus skb->dev pointer. This particular problem no longer happens with the current upstream kernel due to changes in VLAN processing. However, for correctness we should still reset the skb->dev pointer in the GRO reuse function in case a future user does the same thing. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-28ipv4: Attach FIB info to dst_default_metrics when possibleDavid S. Miller1-1/+1
If there are no explicit metrics attached to a route, hook fi->fib_info up to dst_default_metrics. Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-27net: fix dev_seq_next()Eric Dumazet1-4/+7
Commit c6d14c84566d (net: Introduce for_each_netdev_rcu() iterator) added a race in dev_seq_next(). The rcu_dereference() call should be done _before_ testing the end of list, or we might return a wrong net_device if a concurrent thread changes net_device list under us. Note : discovered thanks to a sparse warning : net/core/dev.c:3919:9: error: incompatible types in comparison expression (different address spaces) Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> CC: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-27Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6David S. Miller2-2/+2
2011-01-27net: add kmemcheck annotation in __alloc_skb()Eric Dumazet1-0/+1
pskb_expand_head() triggers a kmemcheck warning when copy of skb_shared_info is done in pskb_expand_head() This is because destructor_arg field is not necessarily initialized at this point. Add kmemcheck_annotate_variable() call in __alloc_skb() to instruct kmemcheck this is a normal situation. Resolves bugzilla.kernel.org 27212 Reference: https://bugzilla.kernel.org/show_bug.cgi?id=27212 Reported-by: Christian Casteyde <casteyde.christian@free.fr> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> CC: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-27net: fix validate_link_af in rtnetlink coreKurt Van Dijck1-2/+1
I'm testing an API that uses IFLA_AF_SPEC attribute. In the rtnetlink core , the set_link_af() member of the rtnl_af_ops struct receives the nested attribute (as I expected), but the validate_link_af() member receives the parent attribute. IMO, this patch fixes this. Signed-off-by: Kurt Van Dijck <kurt.van.dijck@eia.be> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-26net: Implement read-only protection and COW'ing of metrics.David S. Miller1-0/+39
Routing metrics are now copy-on-write. Initially a route entry points it's metrics at a read-only location. If a routing table entry exists, it will point there. Else it will point at the all zero metric place-holder called 'dst_default_metrics'. The writeability state of the metrics is stored in the low bits of the metrics pointer, we have two bits left to spare if we want to store more states. For the initial implementation, COW is implemented simply via kmalloc. However future enhancements will change this to place the writable metrics somewhere else, in order to increase sharing. Very likely this "somewhere else" will be the inetpeer cache. Note also that this means that metrics updates may transiently fail if we cannot COW the metrics successfully. But even by itself, this patch should decrease memory usage and increase cache locality especially for routing workloads. In those cases the read-only metric copies stay in place and never get written to. TCP workloads where metrics get updated, and those rare cases where PMTU triggers occur, will take a very slight performance hit. But that hit will be alleviated when the long-term writable metrics move to a more sharable location. Since the metrics storage went from a u32 array of RTAX_MAX entries to what is essentially a pointer, some retooling of the dst_entry layout was necessary. Most importantly, we need to preserve the alignment of the reference count so that it doesn't share cache lines with the read-mostly state, as per Eric Dumazet's alignment assertion checks. The only non-trivial bit here is the move of the 'flags' member into the writeable cacheline. This is OK since we are always accessing the flags around the same moment when we made a modification to the reference count. Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-26Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6David S. Miller2-3/+7
2011-01-25pktgen: speedup fragmented skbsEric Dumazet1-141/+93
We spend lot of time clearing pages in pktgen. (Or not clearing them on ipv6 and leaking kernel memory) Since we dont modify them, we can use one zeroed page, and get references on it. This page can use NUMA affinity as well. Define pktgen_finalize_skb() helper, used both in ipv4 and ipv6 Results using skbs with one frag : Before patch : Result: OK: 608980458(c608978520+d1938) nsec, 1000000000 (100byte,1frags) 1642088pps 1313Mb/sec (1313670400bps) errors: 0 After patch : Result: OK: 345285014(c345283891+d1123) nsec, 1000000000 (100byte,1frags) 2896158pps 2316Mb/sec (2316926400bps) errors: 0 Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-24net: add sysfs entry for device groupVlad Dogaru1-0/+15
The group of a network device can be queried or changed from userspace using sysfs. For example, considering sysfs mounted in /sys, one can change the group that interface lo belongs to: echo 1 > /sys/class/net/lo/group Signed-off-by: Vlad Dogaru <ddvlad@rosedu.org> Acked-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-24net: clear heap allocation for ethtool_get_regs()Eugene Teo1-1/+1
There is a conflict between commit b00916b1 and a77f5db3. This patch resolves the conflict by clearing the heap allocation in ethtool_get_regs(). Cc: stable@kernel.org Signed-off-by: Eugene Teo <eugeneteo@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>