path: root/drivers/net/caif

2015-02-05  cxgb4: Delete an unnecessary check before the function call "release_firmware"  (Markus Elfring, 1 file, -2/+1)
The release_firmware() function tests whether its argument is NULL and then returns immediately. Thus the test around the call is not needed. This issue was detected by using the Coccinelle software.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
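A minimal sketch of the pattern this cleanup targets (the driver struct here is hypothetical; only the NULL-tolerance of release_firmware() is from the commit):

    #include <linux/firmware.h>

    struct my_adapter {                     /* hypothetical driver state */
            const struct firmware *fw;
    };

    /* Before: the if duplicates the NULL check inside release_firmware(). */
    static void fw_cleanup_before(struct my_adapter *adap)
    {
            if (adap->fw)
                    release_firmware(adap->fw);
    }

    /* After: release_firmware(NULL) is a no-op, so call it unconditionally. */
    static void fw_cleanup_after(struct my_adapter *adap)
    {
            release_firmware(adap->fw);
    }
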
2015-02-04  cxgb4: Add low latency socket busy_poll support  (Hariprasad Shenai, 4 files, -3/+174)
cxgb_busy_poll, corresponding to ndo_busy_poll, gets called by the socket waiting for data. With busy_poll enabled, an improvement is seen in latency, as observed by collecting netperf TCP_RR numbers.
Below are the latency numbers, with and without busy-poll, in a switched environment for a particular message size:
netperf command: netperf -4 -H <ip> -l 30 -t TCP_RR -- -r1,1
Latency without busy-poll: ~16.25 us
Latency with busy-poll:    ~8.79 us
Based on original work by Kumar Sanghvi <kumaras@chelsio.com>
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
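How such a hook is typically wired into the driver's net_device_ops (a sketch; only the cxgb_busy_poll/ndo_busy_poll pairing comes from the commit, the handler body is schematic):

    #include <linux/netdevice.h>

    #ifdef CONFIG_NET_RX_BUSY_POLL
    /* Called from the socket's context to poll the RX queue directly,
     * bypassing the interrupt + NAPI wakeup latency. */
    static int cxgb_busy_poll(struct napi_struct *napi)
    {
            /* ... poll the hardware RX ring, return packets processed ... */
            return 0;
    }
    #endif

    static const struct net_device_ops cxgb4_netdev_ops = {
            /* ... other handlers elided ... */
    #ifdef CONFIG_NET_RX_BUSY_POLL
            .ndo_busy_poll = cxgb_busy_poll,
    #endif
    };
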
2015-02-04  Revert "bridge: Let bridge not age 'externally' learnt FDB entries, they are removed when 'external' entity notifies the aging"  (David S. Miller, 1 file, -1/+1)
This reverts commit 9a05dde59a35eee5643366d3d1e1f43fc9069adb.
Requested by Scott Feldman.
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-04  pkt_sched: fq: better control of DDoS traffic  (Eric Dumazet, 2 files, -2/+19)
FQ has a fast path for skbs attached to a socket, as it does not have to compute a flow hash. But for other packets, FQ being non-stochastic means that hosts exposed to random Internet traffic can allocate millions of flow structures (104 bytes each) pretty easily. Not only can the host OOM, but lookups in the RB trees can also take too much CPU and memory.
This patch adds a new attribute, orphan_mask, which adds the possibility of using a stochastic hash for orphaned skbs. Its default value is 1024 slots, to mimic SFQ behavior.
Note: this does not apply to locally generated TCP traffic, and no locally generated traffic will share a flow structure with another perfect or stochastic flow.
This patch also handles the specific case of SYNACK messages: they are attached to the listener socket, and therefore all map to a single hash bucket. If the listener has set SO_MAX_PACING_RATE, hoping to have newly accepted sockets inherit this rate, SYNACKs might be paced and even dropped.
This is very similar to an internal patch Google has used for more than one year.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
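A sketch of the fallback the orphan_mask attribute enables (field and array names are illustrative, not copied from sch_fq.c):

    #include <linux/skbuff.h>
    #include <linux/types.h>

    struct fq_flow_sketch {
            int credit;                             /* placeholder flow state */
    };

    struct fq_sched_sketch {
            u32 orphan_mask;                        /* new attribute; 1023 by default */
            struct fq_flow_sketch stochastic[1024]; /* shared buckets, may collide */
    };

    /* skbs with no socket get a bounded, SFQ-like bucket instead of a
     * dedicated flow structure per 5-tuple. */
    static struct fq_flow_sketch *orphan_bucket(struct fq_sched_sketch *q,
                                                struct sk_buff *skb)
    {
            return &q->stochastic[skb_get_hash(skb) & q->orphan_mask];
    }
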
2015-02-04tcp: do not pace pure ack packetsEric Dumazet3-3/+32
When we added pacing to TCP, we decided to let sch_fq take care of actual pacing. All TCP had to do was to compute sk->pacing_rate using simple formula: sk->pacing_rate = 2 * cwnd * mss / rtt It works well for senders (bulk flows), but not very well for receivers or even RPC : cwnd on the receiver can be less than 10, rtt can be around 100ms, so we can end up pacing ACK packets, slowing down the sender. Really, only the sender should pace, according to its own logic. Instead of adding a new bit in skb, or call yet another flow dissection, we tweak skb->truesize to a small value (2), and we instruct sch_fq to use new helper and not pace pure ack. Note this also helps TCP small queue, as ack packets present in qdisc/NIC do not prevent sending a data packet (RPC workload) This helps to reduce tx completion overhead, ack packets can use regular sock_wfree() instead of tcp_wfree() which is a bit more expensive. This has no impact in the case packets are sent to loopback interface, as we do not coalesce ack packets (were we would detect skb->truesize lie) In case netem (with a delay) is used, skb_orphan_partial() also sets skb->truesize to 1. This patch is a combination of two patches we used for about one year at Google. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
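The helper side of the truesize trick can be sketched like this (the helper name is an assumption extrapolated from the description; only the truesize == 2 convention is stated above):

    #include <linux/skbuff.h>

    /* TCP marks pure ACKs by setting skb->truesize to 2, a value no
     * real packet can have, so no extra skb bit is needed. */
    static inline bool skb_is_tcp_pure_ack(const struct sk_buff *skb)
    {
            return skb->truesize == 2;
    }

    /* sch_fq then skips pacing for such skbs: */
    static bool fq_should_pace(const struct sk_buff *skb)
    {
            return !skb_is_tcp_pure_ack(skb);
    }
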
2015-02-04  netfilter: Use rhashtable walk iterator  (Herbert Xu, 1 file, -17/+36)
This patch gets rid of the manual rhashtable walk in nft_hash which touches rhashtable internals that should not be exposed. It does so by using the rhashtable iterator primitives.
Note that I'm leaving nft_hash_destroy alone since it's only invoked on shutdown and it shouldn't be affected by changes to rhashtable internals (or at least not what I'm planning to change).
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-04  netlink: Use rhashtable walk iterator  (Herbert Xu, 1 file, -66/+64)
This patch gets rid of the manual rhashtable walk in netlink which touches rhashtable internals that should not be exposed. It does so by using the rhashtable iterator primitives.
In fact the existing code was very buggy. Some sockets weren't shown at all while others were shown more than once.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-04  rhashtable: Introduce rhashtable_walk_*  (Herbert Xu, 2 files, -0/+198)
Some existing rhashtable users get too intimate with it by walking the buckets directly. This prevents us from easily changing the internals of rhashtable.
This patch adds the helpers rhashtable_walk_init/exit/start/next/stop, which will replace these custom walkers. They are meant to be usable both for procfs seq_file walks and for walking by a netlink dump. The iterator structure should fit inside a netlink dump cb structure, with at least one element to spare.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
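A hedged usage sketch of the new iterator (the five helper names are from the commit; the -EAGAIN restart convention is an assumption to verify against the actual API):

    #include <linux/rhashtable.h>
    #include <linux/err.h>
    #include <linux/printk.h>

    static void dump_table(struct rhashtable *ht)
    {
            struct rhashtable_iter iter;
            void *obj;

            if (rhashtable_walk_init(ht, &iter))
                    return;

            /* A resize during the walk may cause entries to be seen
             * twice; a dumper can usually tolerate that. */
            rhashtable_walk_start(&iter);

            while ((obj = rhashtable_walk_next(&iter)) != NULL) {
                    if (IS_ERR(obj))
                            continue;       /* resize hit mid-walk, keep going */
                    pr_info("entry at %p\n", obj);
            }

            rhashtable_walk_stop(&iter);
            rhashtable_walk_exit(&iter);
    }
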
2015-02-04  rhashtable: Fix potential crash on destroy in rhashtable_shrink  (Herbert Xu, 1 file, -0/+4)
The current being_destroyed check in rhashtable_expand is not enough, since if we start a shrinking process after freeing all elements in the table, that's also going to crash.
This patch adds a being_destroyed check to the deferred worker thread so that we bail out as soon as we take the lock.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-04  NetCP: Deletion of unnecessary checks before two function calls  (Markus Elfring, 1 file, -4/+2)
The functions cpsw_ale_destroy() and of_dev_put() test whether their argument is NULL and then return immediately. Thus the test around the call is not needed. This issue was detected by using the Coccinelle software.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-04  IBM-EMAC: Delete an unnecessary check before the function call "of_dev_put"  (Markus Elfring, 1 file, -1/+1)
The of_dev_put() function tests whether its argument is NULL and then returns immediately. Thus the test around the call is not needed. This issue was detected by using the Coccinelle software.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-04  net/mlx4_en: Notify TX Vlan offload change  (Ido Shamay, 1 file, -0/+4)
Notify users when the TX VLAN offload feature is changed via ethtool. Relevant command: ethtool -K <eth> txvlan on/off.
Signed-off-by: Ido Shamay <idos@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-04  net/mlx4_en: Adjust RX frag strides to frag sizes  (Ido Shamay, 1 file, -2/+3)
This patch improves memory utilization, and therefore the packet rate, for certain MTUs. Instead of setting the frag_stride to the maximal hard-coded frag_size, use the actual frag_size that is set according to the MTU when setting the stride of the last frag. So, for example, for MTU 1600, where the frag_size of the 2nd frag is 86, the stride is set to 128 instead of 4096. See below:
Before:
frag:0 - size:1536  prefix:0     stride:1536
frag:1 - size:86    prefix:1536  stride:4096
frag 0 allocator: size:32768 frags:21
frag 1 allocator: size:32768 frags:8
After:
frag:0 - size:1536  prefix:0     stride:1536
frag:1 - size:86    prefix:1536  stride:128
frag 0 allocator: size:32768 frags:21
frag 1 allocator: size:32768 frags:256
Signed-off-by: Ido Shamay <idos@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
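The numbers follow from rounding the last fragment's stride up to a power of two instead of pinning it at the 4096-byte maximum (the rounding helper is a guess at the mechanism; the arithmetic is from the example above):

    #include <linux/log2.h>
    #include <linux/types.h>

    static u32 last_frag_stride(u32 frag_size)
    {
            /* MTU 1600 leaves 86 bytes for the last frag:
             * roundup_pow_of_two(86) = 128, instead of the fixed 4096.
             * A 32 KB allocator then yields 32768/128 = 256 frags
             * rather than 32768/4096 = 8. */
            return roundup_pow_of_two(frag_size);
    }
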
2015-02-04  net/mlx4_en: Print page allocator information  (Ido Shamay, 1 file, -0/+4)
After initialization of page_alloc, print the actual allocated page size and the number of frags it contains. Printing is done only when the drv message level is set on the interface.
Signed-off-by: Ido Shamay <idos@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-04  net/mlx5_core: Move to use hex PCI device IDs  (Or Gerlitz, 1 file, -6/+6)
Align the IDs in the code with the output of modinfo, lspci -n, and similar tools.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-04  net/mlx4_core: Fix misleading debug print on CQE stride support  (Or Gerlitz, 1 file, -1/+2)
We do support cache line sizes of 32 and 64 bytes without activating the CQE stride feature. Fix a misleading print saying that these cache line sizes aren't supported.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-04  net/mlx4: mlx4_config_dev_retrieval() - Initialize struct config_dev before using  (Maor Gottlieb, 1 file, -1/+1)
Add initialization of struct config_dev before filling and using it. This fixes the warning:
warning: config_dev.rx_checksum_val may be used uninitialized in this function
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-04  net/mlx4_core: Fix mpt_entry initialization in mlx4_mr_rereg_mem_write()  (Maor Gottlieb, 2 files, -9/+6)
a) Previously, mlx4_mr_rereg_write filled the MPT's start and length with the old MPT's values. Fix the initialization to take the new start and length.
b) In addition, the access flags in mpt_status were initialized instead of status, due to a bad boolean operation. Fix the operation.
c) The initialization of pd_slave caused a protection error. Fix: remove this initialization.
d) In resource_tracker.c: fix the VF encoding to be one-based.
Fixes: e630664c ('mlx4_core: Add helper functions to support MR re-registration')
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-04  IB/mlx4: Load balance ports in port aggregation mode  (Moni Shoua, 4 files, -0/+29)
When the mlx4 IB (RoCE) device works in link aggregation mode, it exposes a single port to upper layers. Therefore, applications always set '1' in the port_num attribute when modifying a QP or creating an address handle. To make sure that a node uses all available ports, the mlx4 driver will override the port_num attribute with a round-robin policy.
Signed-off-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
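A schematic of that round-robin override (all names here are illustrative, not the driver's):

    #include <linux/atomic.h>
    #include <linux/types.h>

    static atomic_t rr_counter = ATOMIC_INIT(0);

    /* Upper layers always pass port_num == 1 in aggregation mode;
     * spread the actual assignment across the physical ports. */
    static u8 pick_physical_port(int num_phys_ports)
    {
            return (atomic_inc_return(&rr_counter) % num_phys_ports) + 1;
    }

    /* e.g. in the modify-QP path: attr->port_num = pick_physical_port(2); */
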
2015-02-04  IB/mlx4: Create mirror flows in port aggregation mode  (Moni Shoua, 2 files, -13/+80)
In port aggregation mode, flows for port #1 (the only port) should be mirrored on port #2, because packets can arrive on either physical port.
Signed-off-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-04  IB/mlx4: Add port aggregation support  (Moni Shoua, 1 file, -6/+70)
Register the interface with the mlx4 core driver with port aggregation support and check for port aggregation mode when the 'add' function is called. In this mode, only one physical port is reported to the upper layer (RoCE/IB core stack and ULPs).
Signed-off-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-04  IB/mlx4: Reuse mlx4_mac_to_u64()  (Moni Shoua, 1 file, -11/+1)
This function is implemented twice... get rid of one copy.
Signed-off-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-04  net/mlx4_en: Port aggregation configuration  (Moni Shoua, 3 files, -0/+189)
Capture NETDEV events generated by the bonding driver and, based on them, decide how to configure port aggregation in the mlx4 core driver. This includes setting the V2P port table and re-creating the interested interfaces in bonded/non-bonded mode.
Signed-off-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-04  net/mlx4_core: Port aggregation upper layer interface  (Moni Shoua, 8 files, -2/+177)
Supply interface functions to bond and unbond the ports of mlx4 internal interfaces. An example of such an interface is the one registered by the mlx4 IB driver under RoCE. There are:
1. Functions to go in/out of bonded mode
2. A function to remap virtual ports to physical ports
The bond_mutex prevents simultaneous access to data that keeps the status of the device in bonded mode.
An upper mlx4 interface marks to the mlx4 core module that it wants to be subject to such bonding by setting the MLX4_INTFF_BONDING flag; see the sketch below. An interface that goes to/from bonded mode is re-created. The mlx4 Ethernet driver does not set this flag when registering the interface; the IB driver does.
Signed-off-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
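A hedged sketch of how an upper driver might opt in and drive the remap (the mlx4_port_map layout and the mlx4_port_map_set() call are assumptions extrapolated from the description; only the MLX4_INTFF_BONDING flag is named above):

    #include <linux/mlx4/driver.h>

    static void *my_add(struct mlx4_dev *dev);              /* hypothetical */
    static void my_remove(struct mlx4_dev *dev, void *ctx); /* callbacks    */

    /* Opt in: an interface registered with this flag is re-created
     * whenever the device enters or leaves bonded mode. */
    static struct mlx4_interface my_mlx4_if = {
            .add    = my_add,
            .remove = my_remove,
            .flags  = MLX4_INTFF_BONDING,
    };

    /* Remap both virtual ports onto physical port 1: */
    static void remap_to_port1(struct mlx4_dev *dev)
    {
            struct mlx4_port_map v2p = { .port1 = 1, .port2 = 1 };

            mlx4_port_map_set(dev, &v2p);
    }
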
2015-02-04  net/mlx4_core: Port aggregation low level interface  (Moni Shoua, 5 files, -6/+77)
Implement the hardware interface required for port aggregation:
1. Disable RX port check on receive - don't perform a validity check that matches the QP's port to the port where the packet was received.
2. Virtual to physical port remap - configure the virtual to physical port mapping (port remap capability for virtual functions).
Signed-off-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-04  net/bonding: Notify state change on slaves  (Moni Shoua, 2 files, -0/+54)
Use a notifier chain to dispatch an event upon a change in slave state. The event is dispatched with slave-specific info.
Signed-off-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-04  net/bonding: Move slave state changes to a helper function  (Moni Shoua, 2 files, -26/+43)
Move slave state changes to a helper function; this is a pre-step for adding the functionality of dispatching an event when this helper is called. This commit doesn't add new functionality.
Signed-off-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-04  net/core: Add event for a change in slave state  (Moni Shoua, 3 files, -0/+36)
Add an event which provides an indication of a change in the state of a bonding slave. The event handler should cast the pointer to the appropriate type (struct netdev_bonding_info) in order to get the full info about the slave.
Signed-off-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
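A consumer-side sketch (the NETDEV_BONDING_INFO event name and the notifier-info wrapper are assumptions based on this series; the cast follows the description above):

    #include <linux/netdevice.h>

    /* hypothetical consumer of the slave state */
    static void use_bonding_info(struct net_device *slave,
                                 struct netdev_bonding_info *info);

    static int my_netdev_event(struct notifier_block *nb,
                               unsigned long event, void *ptr)
    {
            struct net_device *dev = netdev_notifier_info_to_dev(ptr);

            if (event == NETDEV_BONDING_INFO) {
                    struct netdev_notifier_bonding_info *nbi = ptr;

                    use_bonding_info(dev, &nbi->bonding_info);
            }
            return NOTIFY_DONE;
    }

    static struct notifier_block my_nb = { .notifier_call = my_netdev_event };
    /* registered with: register_netdevice_notifier(&my_nb); */
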
2015-02-04  tipc: separate link starting event from link timeout event  (Jon Paul Maloy, 1 file, -1/+3)
When a new link instance is created, it is triggered to start by sending it a TIPC_STARTING_EVT, whereafter a regular link reset is applied to it. The starting event is codewise treated as a timeout event, and prompts a link RESET message to be sent to the peer node, carrying a link session identifier. The later link_reset() call nudges this session identifier, whereafter all subsequent RESET messages will be sent out with the new identifier. The latter session number overrides the former, causing the peer to unconditionally accept it irrespective of its current working state.
We don't think that this causes any problem, but it is not in accordance with the protocol spec, and may cause confusion when debugging TIPC sessions. To avoid this, we make the starting event distinct from the subsequent timeout events, by not allowing the former to send out any RESET message. This eliminates the described problem.
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-04  tipc: eliminate race during node creation  (Jon Paul Maloy, 2 files, -13/+7)
Instances of struct node are created in the function tipc_disc_rcv() under the assumption that there is no race between received discovery messages arriving from the same node. This assumption is wrong. When we use more than one bearer, it is possible that discovery messages from the same node arrive at the same moment, resulting in the creation of two instances of struct tipc_node. This may later cause confusion during link establishment, and may result in one of the links never becoming activated.
We fix this by making the lookup and potential creation of nodes atomic. Instead of first looking up the node and, in case of failure, creating it, we now start with looking up the node inside node_link_create(), and return a reference to that one if found. Otherwise, we go ahead and create the node as we did before.
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
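The lookup-or-create pattern being made atomic, in schematic form (the lock and helper names are illustrative, not tipc's):

    #include <linux/spinlock.h>
    #include <linux/types.h>

    struct tipc_node;                               /* opaque here */
    static struct tipc_node *node_find(u32 addr);   /* hypothetical helpers */
    static struct tipc_node *node_create_locked(u32 addr);

    static DEFINE_SPINLOCK(node_list_lock);

    /* Lookup and create now happen under one lock, so two bearers
     * delivering discovery messages at the same moment can no longer
     * both miss the lookup and both create a node. */
    static struct tipc_node *node_find_or_create(u32 addr)
    {
            struct tipc_node *n;

            spin_lock_bh(&node_list_lock);
            n = node_find(addr);
            if (!n)
                    n = node_create_locked(addr);
            spin_unlock_bh(&node_list_lock);
            return n;
    }
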
2015-02-04  tipc: avoid stale link after aborted failover  (Jon Paul Maloy, 2 files, -0/+5)
During link failover it may happen that the remaining link goes down while it is still in the process of taking over traffic from a previously failed link. When this happens, we currently abort the failover procedure and reset the first failed link to non-failover mode, so that it will be ready to re-establish contact with its peer when it becomes available.
However, if the first link went down because its bearer was manually disabled, it is not enough to reset it; it must also be deleted, which is supposed to happen when the failover procedure is finished. Otherwise it will remain a zombie link: attached to the owner node structure, in mode LINK_STOPPED, and permanently blocking any re-establishment of the link to the peer via the interface in question.
We fix this by amending the failover abort procedure. Apart from resetting the link to non-failover state, we test if the link is also in LINK_STOPPED mode. If so, we delete it, using the conditional tipc_link_delete() function introduced in the previous commit.
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-04  tipc: add reference count to struct tipc_link  (Jon Paul Maloy, 2 files, -31/+50)
When a bearer is disabled, all pertaining links will be reset and deleted. However, if there is a second active link towards a killed link's destination, the delete has to be postponed until the failover is finished. During this interval, we currently put the link in zombie mode, i.e., we take it out of traffic and delete its timer, but leave it attached to the owner node structure until all missing packets have been received. When this is done, we detach the link from its node and delete it, assuming that the synchronous timer deletion that was initiated earlier in a different thread has finished.
This is unsafe, as the failover may finish before del_timer_sync() has returned in the other thread.
We fix this by adding an atomic reference counter of type kref in struct tipc_link. The counter keeps track of the references kept to the link by the owner node and the timer. We then do a conditional delete, based on the reference counter, both after the failover has been finished and when the timer expires, if applicable. Whoever comes last will actually delete the link. This approach also implies that we can make the deletion of the timer asynchronous.
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
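The conditional delete maps onto the standard kref pattern (a sketch under assumed field names, not the actual tipc code; kref_init() at creation time is elided):

    #include <linux/kref.h>
    #include <linux/slab.h>

    struct link_sketch {
            struct kref ref;   /* one ref held by the node, one by the timer */
            /* ... link state ... */
    };

    static void link_release(struct kref *ref)
    {
            kfree(container_of(ref, struct link_sketch, ref));
    }

    /* Failover completion and timer expiry each drop their reference;
     * whoever comes last frees the link. */
    static void link_put(struct link_sketch *l)
    {
            kref_put(&l->ref, link_release);
    }
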
2015-02-04  csiostor: Use firmware version from cxgb4/t4fw_version.h  (Praveen Madhavan, 3 files, -8/+7)
This patch uses the firmware version macros from t4fw_version.h and also enables the 40G T5 adapter.
Signed-off-by: Praveen Madhavan <praveenm@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-04  tlan: msecs_to_jiffies conversion  (Nicholas Mc Guire, 1 file, -1/+1)
This is only an API consolidation and should make things more readable; it replaces var * HZ / 1000 with msecs_to_jiffies(var). As there is a discrepancy between the code and the comments, this is in a separate patch.
Signed-off-by: Nicholas Mc Guire <hofrat@osadl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-04  tlan: use msecs_to_jiffies for conversion  (Nicholas Mc Guire, 1 file, -6/+6)
This is only an API consolidation and should make things more readable; it replaces var * HZ / 1000 with msecs_to_jiffies(var).
Signed-off-by: Nicholas Mc Guire <hofrat@osadl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
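The conversion itself is mechanical (before/after sketch):

    #include <linux/jiffies.h>

    static unsigned long example_timeout(unsigned int ms)
    {
            /* Before: open-coded arithmetic; truncates when HZ < 1000
             * and obscures the unit being converted. */
            /* return ms * HZ / 1000; */

            /* After: same intent, correct rounding for any HZ. */
            return msecs_to_jiffies(ms);
    }
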
2015-02-04  net: add skb functions to process remote checksum offload  (Tom Herbert, 4 files, -32/+40)
This patch adds skb_remcsum_process and skb_gro_remcsum_process to perform the appropriate adjustments to the skb when receiving remote checksum offload. Updated vxlan and gue to use these functions.
Tested: Ran TCP_RR and TCP_STREAM netperf for VXLAN and GUE, did not see any change in performance.
Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-04  bridge: Let bridge not age 'externally' learnt FDB entries, they are removed when 'external' entity notifies the aging  (Siva Mannem, 1 file, -1/+1)
When the 'learned_sync' flag is turned on, the offloaded switch port syncs learned MAC addresses to the bridge's FDB via the switchdev notifier (NETDEV_SWITCH_FDB_ADD). Currently, FDB entries learnt via this mechanism are wrongly being deleted by the bridge aging logic.
This patch ensures that FDB entries synced from offloaded switch ports are not deleted by the bridging logic. Such entries can only be deleted via the switchdev notifier (NETDEV_SWITCH_FDB_DEL).
Signed-off-by: Siva Mannem <siva.mannem.lnx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-04  net: fs_enet: Implement NETIF_F_SG feature  (LEROY Christophe, 2 files, -30/+66)
Freescale ethernet controllers have the capability to re-assemble fragmented data into a single ethernet frame. This patch uses this capability and implements the NETIF_F_SG feature in the fs_enet ethernet driver.
On an MPC885, I get a 53% performance improvement on an FTP transfer of a 15Mb file:
* Without the patch: 2.8 Mbps
* With the patch: 4.3 Mbps
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-04  xps: fix xps for stacked devices  (Eric Dumazet, 3 files, -3/+15)
A typical qdisc setup is the following:
bond0 : bonding device, using HTB hierarchy
eth1/eth2 : slaves, multiqueue NIC, using MQ + FQ qdisc
XPS allows spreading packets on specific TX queues, based on the CPU doing the send. The problem is that dequeues from the bond0 qdisc can happen on random CPUs, due to the fact that qdisc_run() can dequeue a batch of packets:
CPUA -> queue packet P1 on bond0 qdisc, P1->ooo_okay=1
CPUA -> queue packet P2 on bond0 qdisc, P2->ooo_okay=0
CPUB -> dequeue packet P1 from bond0, enqueue packet on eth1/eth2
CPUC -> dequeue packet P2 from bond0, enqueue packet on eth1/eth2 using sk cache (ooo_okay is 0)
get_xps_queue() then might select the wrong queue for P1, since the current CPU might be different from CPUA. P2 might be sent on the old queue (stored in sk->sk_tx_queue_mapping) if CPUC runs a bit faster (or CPUB spins a bit on the qdisc lock).
The effect of this bug is TCP reordering, and more generally suboptimal TX queue placement. (A victim bulk flow can be migrated to the wrong TX queue for a while.)
To fix this, we have to record the sender CPU number the first time dev_queue_xmit() is called for one TX skb. We can union napi_id (used on the receive path) and sender_cpu, granted we clear sender_cpu in skb_scrub_packet() (credit to Willem for this union idea).
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Cc: Nandita Dukkipati <nanditad@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
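The union described in the last paragraph, in schematic form (a stand-in struct, not the actual sk_buff layout; the +1 encoding is an assumption so that 0 can mean "unset"):

    #include <linux/smp.h>

    struct skb_ids {                         /* schematic slice of sk_buff */
            union {
                    unsigned int napi_id;    /* receive path: busy-poll id */
                    unsigned int sender_cpu; /* transmit path: recorded below */
            };
    };

    /* The first dev_queue_xmit() for the skb records the sender CPU; the
     * field is cleared again in skb_scrub_packet(). */
    static void record_sender_cpu(struct skb_ids *ids)
    {
            if (!ids->sender_cpu)
                    ids->sender_cpu = raw_smp_processor_id() + 1;
    }
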
2015-02-04  vhost: vhost_scsi_handle_vq() should just use copy_from_user()  (Al Viro, 4 files, -40/+2)
It has just verified that it asks for no more than the length of the first segment of the iovec. And with that, the last user of the stuff in lib/iovec.c is gone. RIP.
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Nicholas A. Bellinger <nab@linux-iscsi.org>
Cc: kvm@vger.kernel.org
Cc: virtualization@lists.linux-foundation.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-02-04  vhost: don't bother copying iovecs in handle_rx(), kill memcpy_toiovecend()  (Al Viro, 3 files, -88/+23)
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: kvm@vger.kernel.org
Cc: virtualization@lists.linux-foundation.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-02-04  vhost: don't bother with copying iovec in handle_tx()  (Al Viro, 1 file, -4/+5)
Just advance msg.msg_iter and be done with it.
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: kvm@vger.kernel.org
Cc: virtualization@lists.linux-foundation.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-02-04  vhost: switch vhost get_indirect() to iov_iter, kill memcpy_fromiovec()  (Al Viro, 3 files, -28/+4)
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: kvm@vger.kernel.org
Cc: virtualization@lists.linux-foundation.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-02-04  net: switch sockets to ->read_iter/->write_iter  (Al Viro, 1 file, -29/+27)
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-02-04  net/socket.c: fold do_sock_{read,write} into callers  (Al Viro, 1 file, -35/+21)
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-02-04  crypto: switch af_alg_make_sg() to iov_iter  (Al Viro, 4 files, -100/+62)
With that, all ->sendmsg() instances are converted to iov_iter primitives and are agnostic wrt the kind of iov_iter they are working with. So is the last remaining ->recvmsg() instance that wasn't kind-agnostic yet.
All ->sendmsg() and ->recvmsg() instances advance ->msg_iter by the amount actually copied, and none of them modifies the underlying iovec, etc.
Cc: linux-crypto@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-02-04  net: bury net/core/iovec.c - nothing in there is used anymore  (Al Viro, 3 files, -145/+1)
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-02-04  tipc: tipc ->sendmsg() conversion  (Al Viro, 2 files, -7/+14)
This one needs to copy the same data from user potentially more than once. Sadly, MTU changes can trigger that ;-/
Cc: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-02-04  net: switch memcpy_fromiovec()/memcpy_fromiovecend() users to copy_from_iter()  (Al Viro, 7 files, -18/+14)
That takes care of the majority of ->sendmsg() instances - most of them via memcpy_to_msg() or assorted getfrag() callbacks. One place where we still keep memcpy_fromiovecend() is tipc - there we potentially read the same data over and over; that's a separate patch.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
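The shape of the conversion at a typical call site (a sketch of the memcpy_from_msg()-style wrapper these users move to; check the exact helper in the tree):

    #include <linux/uio.h>
    #include <linux/socket.h>
    #include <linux/errno.h>

    /* Copy len bytes out of the message's iterator, advancing it; the
     * iterator abstracts over iovec, kvec, bvec, and friends. */
    static inline int memcpy_from_msg_sketch(void *data, struct msghdr *msg,
                                             int len)
    {
            return copy_from_iter(data, len, &msg->msg_iter) == len ?
                   0 : -EFAULT;
    }
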
2015-02-04  ip: convert tcp_sendmsg() to iov_iter primitives  (Al Viro, 4 files, -141/+123)
The patch is actually smaller than it seems to be - most of it is unindenting the inner loop body in tcp_sendmsg() itself. The bit in tcp_input.c is going to get reverted very soon - that's what memcpy_from_msg() will become, but not in this commit; let's keep it reasonably contained.
There's one potentially subtle change here: in case of a short copy from userland, mainline tcp_send_syn_data() discards the skb it has allocated and falls back to the normal path, where we'll send as much as possible after rereading the same data. This patch trims the SYN+data skb instead - that way we don't need to copy from the same place twice.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>