linux-dev - Linux kernel development work

Age	Commit message (Collapse)	Author	Files	Lines
2013-01-02	bnx2x: Add VF device ids and enable feature	Ariel Elior	3	-10/+99
	Add the various VF device ids (of all supported hardware) Add the calls to enable_sriov and disable_sriov to enable the SR-IOV feature. This patch also advances the version and release date of the bnx2x module. Signed-off-by: Ariel Elior <ariele@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-02	bnx2x: Support PF <-> VF Bulletin Board	Ariel Elior	8	-2/+320
	The PF <-> VF Bulletin Board is a simple interface between the PF and the VF. The main reason for the Bulletin Board is to allow the PF to be the initiator. The VF publishes at 'acquire' stage the GPA of a Bulletin Board structure it has allocated. The PF notes this GPA in the VF database. The VF samples the Bulletin Board periodically for new messages. The latest version of the BB is always used. Signed-off-by: Ariel Elior <ariele@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-02	bnx2x: Support VF FLR	Ariel Elior	6	-7/+321
	The FLR indication arrives as an attention from the management processor. Upon VF flr all FLRed function in the indication have already been released by Firmware and now we basically need to free the resources allocated to those VFs, and clean any remainders from the device (FLR final cleanup). Signed-off-by: Ariel Elior <ariele@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-02	bnx2x: Support of PF driver of a VF release request	Ariel Elior	5	-1/+170
	The 'release' request is the opposite of the 'acquire' request. At release, all the resources allocated to the VF are reclaimed. The release flow applies the close flow if applicable. Note that there are actually two types of release: 1. The VF has been removed, and so issued a 'release' request over the VF <-> PF Channel. 2. The PF is going down and so has to release all of it's VFs. Signed-off-by: Ariel Elior <ariele@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-02	bnx2x: Support of PF driver of a VF close request	Ariel Elior	3	-0/+119
	The 'close' command is the opposite of an init request. Here the queues of the VF are closed (if any are opened) and released. This flow applies the 'q_teardown' flow on all the queues. The VF state is changed by this request. Interrupts are disabled for the VF when closed. Signed-off-by: Ariel Elior <ariele@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-02	bnx2x: Support of PF driver of a VF q_teardown request	Ariel Elior	3	-0/+295
	The 'q_teardown' request is basically the opposite of the 'q_setup'. Here the PF driver removes from the device the queue it opened against the VF fastpath ring at 'setup_q' stage, along with all related rx_mode info. Signed-off-by: Ariel Elior <ariele@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-02	bnx2x: Support of PF driver of a VF q_filters request	Ariel Elior	3	-0/+589
	The VF driver uses the 'q_filters' message on the VF <-> PF channel for configuring an open queue, for example when the rxmode changes. Signed-off-by: Ariel Elior <ariele@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-02	bnx2x: Support of PF driver of a VF setup_q request	Ariel Elior	5	-177/+1071
	Upon receiving a 'setup_q' request from the VF over the VF <-> PF channel the PF driver will open a corresponding queue in the device. The PF driver configures the queue with appropriate mac address, vlan configuration, etc from the VF. Signed-off-by: Ariel Elior <ariele@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-02	bnx2x: Support statistics collection for VFs by the PF	Ariel Elior	6	-93/+217
	Statistics are collected by the PF driver. The collection is performed via a query sent to the device which is basically an array of 3-tuples of the form (statistics client, function, DMAE address). In this patch the PF driver adds to the query, on top of the statistics clients it is maintaining for itself (rss queues, storage, etc), the 3-tuples for the VFs it is maintaining. The addresses used are the GPAs of the statistics buffers supplied by the VF in the init message on the VF <-> PF channel. The function parameter ensures that the iommu will translate the GPA to the correct physical address. Signed-off-by: Ariel Elior <ariele@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-02	bnx2x: Support of PF driver of a VF init request	Ariel Elior	6	-3/+190
	The VF driver will send an 'init' request as part of its nic load flow. This message is used by the VF to publish the GPA's of its status blocks, slow path ring and statistics buffer. The PF driver notes all this down in the VF database, and also uses this message to transfer the VF to VF_INIT state internally. Signed-off-by: Ariel Elior <ariele@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-02	bnx2x: Support of PF driver of a VF acquire request	Ariel Elior	6	-11/+476
	When a VF is probed by the VF driver, the VF driver sends an 'acquire' request over the VF <-> PF channel for the resources it needs to operate (interrupts, queues, etc). The PF driver either ratifies the request and allocates the resources, responds with the maximum values it will allow the VF to acquire, or fails the request entirely if there is a problem. Signed-off-by: Ariel Elior <ariele@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-02	bnx2x: Infrastructure for VF <-> PF request on PF side	Ariel Elior	5	-45/+664
	Support interrupt from device which indicates VF has placed A request on the VF <-> PF channel. The PF driver issues a DMAE to retrieve the request from the VM memory (the Ghost Physical Address of the request is contained in the interrupt. The PF driver uses the GPA in the DMAE request, which is translated by the IOMMU to the correct physical address). The request which arrives is examined to recognize the sending VF. The PF driver allocates a workitem to handle the VF Operation (vfop). Signed-off-by: Ariel Elior <ariele@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-02	bnx2x: Prepare device and initialize VF database	Ariel Elior	8	-51/+535
	At nic load of the PF, if VFs may be present, prepare the device for the VFs. Initialize the VF database in preparation of VF arrival. Signed-off-by: Ariel Elior <ariele@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-02	bnx2x: Allocate VF database in PF when VFs are present	Ariel Elior	6	-1/+570
	When A PF determines that it may have to manage SRIOV VFs it allocates a database for this purpose. The database is intended to keep track of the VF state, the resources allocated for each VF (queues, interrupt vectors, etc), the state of the VF's queues. When the VF loads the database is updated accordingly. When A VF closes the database is consulted to determine which resources need to be released (close queues against device, reclaim interrupt vectors, etc). Signed-off-by: Ariel Elior <ariele@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-02	bnx2x: VF fastpath	Ariel Elior	3	-51/+50
	When VF driver is transmitting it must supply the correct mac address in the parsing BD. This is used for firmware validation and enforcement and also for tx-switching. Refactor interrupt ack flow to allow for different BAR addresses of the hardware in the PF BAR vs the VF BAR. Signed-off-by: Ariel Elior <ariele@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-02	bnx2x: Support ndo_set_rxmode in VF driver	Ariel Elior	3	-6/+179
	The VF driver uses the 'q_filter' request in the VF <-> PF channel to have the PF configure the requested rxmode to device. ndo_set_rxmode is called under bottom half lock, so sleeping until the response arrives over the VF <-> PF channel is out of the question. For this reason the VF driver returns from the ndo after scheduling a work item, which in turn processes the rx mode request and adds the classification information through the VF <-> PF channel accordingly. Signed-off-by: Ariel Elior <ariele@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-02	bnx2x: Add teardown_q and close to VF <-> PF channel	Ariel Elior	4	-2/+112
	When a VF is being closed its queues are released via the 'teardown_q' and the VF itself is closed with 'close'. These are essentially the unload counterparts of 'init' and 'setup_q' from the load flow. Signed-off-by: Ariel Elior <ariele@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-02	bnx2x: Add init, setup_q, set_mac to VF <-> PF channel	Ariel Elior	4	-0/+310
	'init' - init an acquired VF. Supply allocation GPAs to PF. 'setup_q' - PF to allocate a queue in device on behalf of the VF. 'set_mac' - PF to configure a mac in device on behalf of the VF. VF driver uses these requests in the VF <-> PF channel in nic_load flow. Signed-off-by: Ariel Elior <ariele@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-02	bnx2x: Separate VF and PF logic	Ariel Elior	6	-266/+525
	Generally, the VF driver cannot access the chip, except by the narrow window its BAR allows. Care had to be taken so the VF driver will not reach code which accesses the chip elsewhere. Refactor the nic_load flow into parts so it would be easier to separate the VF-only logic from the PF-only logic. Signed-off-by: Ariel Elior <ariele@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-02	bnx2x: Add to VF <-> PF channel the release request	Ariel Elior	3	-0/+59
	VF driver uses this request when removed. The PF driver reclaims all resources allocated for that VF at this time. Signed-off-by: Ariel Elior <ariele@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-02	bnx2x: VF <-> PF channel 'acquire' at vf probe	Ariel Elior	7	-1/+407
	Add the 'acquire' request to VF <-> PF channel and use it at VF probe. In the acquire request the VF driver lists the resources it would like to have. In the response the PF either ratifies the request, or denies it and supplies the maximum values supported. The VF may then attempt another acquire request. This patch adds the bnx2x_vfpf.c file which contains the implementation of the VF to PF hardware channel. Signed-off-by: Ariel Elior <ariele@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-02	bnx2x: Support probing and removing of VF device	Ariel Elior	7	-158/+353
	To support probing and removing of a bnx2x virtual function the following were added: 1. add bnx2x_vfpf.h: defines the VF to PF channel 2. add bnx2x_sriov.h: header for bnx2x SR-IOV functionality 3. enumerate VF hw types (identify VFs) 4. if driving a VF, map VF bar 5. if driving a VF, allocate Vf to PF channel 6. refactor interrupt flows to include VF Signed-off-by: Ariel Elior <ariele@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-30	team: add ethtool support	Flavio Leitner	1	-0/+17
	This patch adds few ethtool operations to team driver. Signed-off-by: Flavio Leitner <fbl@redhat.com> Acked-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-30	veth: extend device features	Eric Dumazet	1	-1/+6
	veth is lacking most modern facilities, like SG, checksums, TSO. It makes sense to extend dev->features to get them, or GRO aggregation is defeated by a forced segmentation. Reported-by: Andrew Vagin <avagin@parallels.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Michał Mirosław <mirq-linux@rere.qmqm.pl> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-30	veth: reduce stat overhead	Eric Dumazet	2	-68/+48
	veth stats are a bit bloated. There is no need to account transmit and receive stats, since they are absolutely symmetric. Also use a per device atomic64_t for the dropped counter, as it should never be used in fast path. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-30	team: implement carrier change	Flavio Leitner	1	-0/+10
	The user space teamd daemon may need to control the master's carrier state depending on the selected mode. Signed-off-by: Flavio Leitner <fbl@redhat.com> Acked-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-30	bridge: respect RFC2863 operational state	stephen hemminger	4	-6/+9
	The bridge link detection should follow the operational state of the lower device, rather than the carrier bit. This allows devices like tunnels that are controlled by userspace control plane to work with bridge STP link management. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Reviewed-by: Flavio Leitner <fbl@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-30	net: filter: return -EINVAL if BPF_S_ANC* operation is not supported	Daniel Borkmann	1	-0/+7
	Currently, we return -EINVAL for malformed or wrong BPF filters. However, this is not done for BPF_S_ANC* operations, which makes it more difficult to detect if it's actually supported or not by the BPF machine. Therefore, we should also return -EINVAL if K is within the SKF_AD_OFF universe and the ancillary operation did not match. Why exactly is it needed? If tools such as libpcap/tcpdump want to make use of new ancillary operations (like filtering VLAN in kernel space), there is currently no sane way to test if this feature / BPF_S_ANC* op is present or not, since no error is returned. This patch will make life easier for that and allow for a proper usage for user space applications. There was concern, if this patch will break userland. Short answer: Yes and no. Long answer: It will "break" only for code that calls ... { BPF_LD \| BPF_(W\|H\|B) \| BPF_ABS, 0, 0, <K> }, ... where <K> is in [0xfffff000, 0xffffffff] _and_ <K> is not an ancillary. And here comes the BUT: assuming some old code will have such an instruction where <K> is between [0xfffff000, 0xffffffff] and it doesn't know ancillary operations, then this will give a non-expected / unwanted behavior as well (since we do not return the BPF machine with 0 after a failed load_pointer(), which was the case before introducing ancillary operations, but load sth. into the accumulator instead, and continue with the next instruction, for instance). Thus, user space code would already have been broken by introducing ancillary operations into the BPF machine per se. Code that does such a direct load, e.g. "load word at packet offset 0xffffffff into accumulator" ("ld [0xffffffff]") is quite broken, isn't it? The whole assumption of ancillary operations is that no-one intentionally calls things like "ld [0xffffffff]" and expect this word to be loaded from such a packet offset. Hence, we can also safely make use of this feature testing patch and facilitate application development. Therefore, at least from this patch onwards, we have for sure a check whether current or in future implemented BPF_S_ANC* ops are supported in the kernel. Patch was tested on x86_64. (Thanks to Eric for the previous review.) Cc: Eric Dumazet <eric.dumazet@gmail.com> Reported-by: Ani Sinha <ani@aristanetworks.com> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-28	skbuff: make __kmalloc_reserve static	stephen hemminger	1	-2/+3
	Sparse detected case where this local function should be static. It may even allow some compiler optimizations. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-28	tcp: make proc_tcp_fastopen_key static	stephen hemminger	1	-2/+2
	Detected by sparse. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-28	sctp: make sctp_addr_wq_timeout_handler static	stephen hemminger	1	-1/+1
	Fix sparse warning about local function that should be static. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-28	net: use per task frag allocator in skb_append_datato_frags	Eric Dumazet	1	-27/+16
	Use the new per task frag allocator in skb_append_datato_frags(), to reduce number of frags and page allocator overhead. Tested: ifconfig lo mtu 16436 perf record netperf -t UDP_STREAM ; perf report before : Throughput: 32928 Mbit/s 51.79% netperf [kernel.kallsyms] [k] copy_user_generic_string 5.98% netperf [kernel.kallsyms] [k] __alloc_pages_nodemask 5.58% netperf [kernel.kallsyms] [k] get_page_from_freelist 5.01% netperf [kernel.kallsyms] [k] __rmqueue 3.74% netperf [kernel.kallsyms] [k] skb_append_datato_frags 1.87% netperf [kernel.kallsyms] [k] prep_new_page 1.42% netperf [kernel.kallsyms] [k] next_zones_zonelist 1.28% netperf [kernel.kallsyms] [k] __inc_zone_state 1.26% netperf [kernel.kallsyms] [k] alloc_pages_current 0.78% netperf [kernel.kallsyms] [k] sock_alloc_send_pskb 0.74% netperf [kernel.kallsyms] [k] udp_sendmsg 0.72% netperf [kernel.kallsyms] [k] zone_watermark_ok 0.68% netperf [kernel.kallsyms] [k] __cpuset_node_allowed_softwall 0.67% netperf [kernel.kallsyms] [k] fib_table_lookup 0.60% netperf [kernel.kallsyms] [k] memcpy_fromiovecend 0.55% netperf [kernel.kallsyms] [k] __udp4_lib_lookup after: Throughput: 47185 Mbit/s 61.74% netperf [kernel.kallsyms] [k] copy_user_generic_string 2.07% netperf [kernel.kallsyms] [k] prep_new_page 1.98% netperf [kernel.kallsyms] [k] skb_append_datato_frags 1.02% netperf [kernel.kallsyms] [k] sock_alloc_send_pskb 0.97% netperf [kernel.kallsyms] [k] enqueue_task_fair 0.97% netperf [kernel.kallsyms] [k] udp_sendmsg 0.91% netperf [kernel.kallsyms] [k] __ip_route_output_key 0.88% netperf [kernel.kallsyms] [k] __netif_receive_skb 0.87% netperf [kernel.kallsyms] [k] fib_table_lookup 0.85% netperf [kernel.kallsyms] [k] resched_task 0.78% netperf [kernel.kallsyms] [k] __udp4_lib_lookup 0.77% netperf [kernel.kallsyms] [k] _raw_spin_lock_irqsave Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-28	dummy: implement carrier change	Jiri Pirko	1	-0/+10
	Signed-off-by: Jiri Pirko <jiri@resnulli.us> Acked-by: Flavio Leitner <fbl@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-28	rtnl: expose carrier value with possibility to set it	Jiri Pirko	3	-0/+15
	Signed-off-by: Jiri Pirko <jiri@resnulli.us> Acked-by: Flavio Leitner <fbl@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-28	net: allow to change carrier via sysfs	Jiri Pirko	1	-1/+14
	Make carrier writable Signed-off-by: Jiri Pirko <jiri@resnulli.us> Acked-by: Flavio Leitner <fbl@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-28	net: add change_carrier netdev op	Jiri Pirko	2	-0/+31
	This allows a driver to register change_carrier callback which will be called whenever user will like to change carrier state. This is useful for devices like dummy, gre, team and so on. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Acked-by: Flavio Leitner <fbl@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-27	bnx2x: use ARRAY_SIZE where possible	Sasha Levin	1	-7/+7
	Signed-off-by: Sasha Levin <sasha.levin@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-26	ipv6/ip6_gre: set transport header correctly	Isaku Yamahata	1	-2/+1
	ip6gre_xmit2() incorrectly sets transport header to inner payload instead of GRE header. It seems copy-and-pasted from ipip.c. Set transport header to gre header. (In ipip case the transport header is the inner ip header, so that's correct.) Found by inspection. In practice the incorrect transport header doesn't matter because the skb usually is sent to another net_device or socket, so the transport header isn't referenced. Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-26	ipv4/ip_gre: set transport header correctly to gre header	Isaku Yamahata	1	-1/+1
	ipgre_tunnel_xmit() incorrectly sets transport header to inner payload instead of GRE header. It seems copy-and-pasted from ipip.c. So set transport header to gre header. (In ipip case the transport header is the inner ip header, so that's correct.) Found by inspection. In practice the incorrect transport header doesn't matter because the skb usually is sent to another net_device or socket, so the transport header isn't referenced. Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-26	IB/rds: suppress incompatible protocol when version is known	Marciniszyn, Mike	1	-6/+5
	Add an else to only print the incompatible protocol message when version hasn't been established. Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-26	IB/rds: Correct ib_api use with gs_dma_address/sg_dma_len	Marciniszyn, Mike	1	-3/+6
	0b088e00 ("RDS: Use page_remainder_alloc() for recv bufs") added uses of sg_dma_len() and sg_dma_address(). This makes RDS DOA with the qib driver. IB ulps should use ib_sg_dma_len() and ib_sg_dma_address respectively since some HCAs overload ib_sg_dma* operations. Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-26	net/vxlan: Use the underlying device index when joining/leaving multicast groups	Yan Burman	1	-2/+4
	The socket calls from vxlan to join/leave multicast group aren't using the index of the underlying device, as a result the stack uses the first interface that is up. This results in vxlan being non functional over a device which isn't the 1st to be up. Fix this by providing the iflink field to the vxlan instance to the multicast calls. Signed-off-by: Yan Burman <yanb@mellanox.com> Acked-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-26	tcp: should drop incoming frames without ACK flag set	Eric Dumazet	1	-4/+10
	In commit 96e0bf4b5193d (tcp: Discard segments that ack data not yet sent) John Dykstra enforced a check against ack sequences. In commit 354e4aa391ed5 (tcp: RFC 5961 5.2 Blind Data Injection Attack Mitigation) I added more safety tests. But we missed fact that these tests are not performed if ACK bit is not set. RFC 793 3.9 mandates TCP should drop a frame without ACK flag set. " fifth check the ACK field, if the ACK bit is off drop the segment and return" Not doing so permits an attacker to only guess an acceptable sequence number, evading stronger checks. Many thanks to Zhiyun Qian for bringing this issue to our attention. See : http://web.eecs.umich.edu/~zhiyunq/pub/ccs12_TCP_sequence_number_inference.pdf Reported-by: Zhiyun Qian <zhiyunq@umich.edu> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Nandita Dukkipati <nanditad@google.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: John Dykstra <john.dykstra1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-26	mm: Fix PageHead when !CONFIG_PAGEFLAGS_EXTENDED	Christoffer Dall	1	-1/+7
	Unfortunately with !CONFIG_PAGEFLAGS_EXTENDED, (!PageHead) is false, and (PageHead) is true, for tail pages. If this is indeed the intended behavior, which I doubt because it breaks cache cleaning on some ARM systems, then the nomenclature is highly problematic. This patch makes sure PageHead is only true for head pages and PageTail is only true for tail pages, and neither is true for non-compound pages. [ This buglet seems ancient - seems to have been introduced back in Apr 2008 in commit 6a1e7f777f61: "pageflags: convert to the use of new macros". And the reason nobody noticed is because the PageHead() tests are almost all about just sanity-checking, and only used on pages that are actual page heads. The fact that the old code returned true for tail pages too was thus not really noticeable. - Linus ] Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu> Acked-by: Andrea Arcangeli <aarcange@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Will Deacon <Will.Deacon@arm.com> Cc: Steve Capper <Steve.Capper@arm.com> Cc: Christoph Lameter <cl@linux.com> Cc: stable@kernel.org # 2.6.26+ Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-26	netprio_cgroup: define sk_cgrp_prioidx only if NETPRIO_CGROUP is enabled	Li Zefan	1	-1/+1
	sock->sk_cgrp_prioidx won't be used at all if CONFIG_NETPRIO_CGROUP=n. Signed-off-by: Li Zefan <lizefan@huawei.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-26	cpts: fix a run time warn_on.	Richard Cochran	1	-1/+1
	This patch fixes a warning in clk_enable by calling clk_prepare_enable instead. Signed-off-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-26	cpts: fix build error by removing useless code.	Richard Cochran	2	-2/+0
	The cpts driver tries to obtain the input clock frequency by calling the clock's internal 'recalc' method. Since <plat/clock.h> has been removed, this code can no longer compile. However, the driver never makes use of the frequency value, so this patch fixes the issue by removing the offending code altogether. Signed-off-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-26	batman-adv: fix random jitter calculation	Akinobu Mita	1	-1/+1
	batadv_iv_ogm_emit_send_time() attempts to calculates a random integer in the range of 'orig_interval +- BATADV_JITTER' by the below lines. msecs = atomic_read(&bat_priv->orig_interval) - BATADV_JITTER; msecs += (random32() % 2 * BATADV_JITTER); But it actually gets 'orig_interval' or 'orig_interval - BATADV_JITTER' because '%' and '*' have same precedence and associativity is left-to-right. This adds the parentheses at the appropriate position so that it matches original intension. Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Acked-by: Antonio Quartulli <ordex@autistici.org> Cc: Marek Lindner <lindner_marek@yahoo.de> Cc: Simon Wunderlich <siwu@hrz.tu-chemnitz.de> Cc: Antonio Quartulli <ordex@autistici.org> Cc: b.a.t.m.a.n@lists.open-mesh.org Cc: "David S. Miller" <davem@davemloft.net> Cc: netdev@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-25	f2fs: Don't assign e_id in f2fs_acl_from_disk	Eric W. Biederman	1	-1/+0
	With user namespaces enabled building f2fs fails with: CC fs/f2fs/acl.o fs/f2fs/acl.c: In function ‘f2fs_acl_from_disk’: fs/f2fs/acl.c:85:21: error: ‘struct posix_acl_entry’ has no member named ‘e_id’ make[2]: *** [fs/f2fs/acl.o] Error 1 make[2]: Target `__build' not remade because of errors. e_id is a backwards compatibility field only used for file systems that haven't been converted to use kuids and kgids. When the posix acl tag field is neither ACL_USER nor ACL_GROUP assigning e_id is unnecessary. Remove the assignment so f2fs will build with user namespaces enabled. Cc: Namjae Jeon <namjae.jeon@samsung.com> Cc: Amit Sahrawat <a.sahrawat@samsung.com> Acked-by: Jaegeuk Kim <jaegeuk.kim@samsung.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2012-12-25	proc: Allow proc_free_inum to be called from any context	Eric W. Biederman	1	-6/+7
	While testing the pid namespace code I hit this nasty warning. [ 176.262617] ------------[ cut here ]------------ [ 176.263388] WARNING: at /home/eric/projects/linux/linux-userns-devel/kernel/softirq.c:160 local_bh_enable_ip+0x7a/0xa0() [ 176.265145] Hardware name: Bochs [ 176.265677] Modules linked in: [ 176.266341] Pid: 742, comm: bash Not tainted 3.7.0userns+ #18 [ 176.266564] Call Trace: [ 176.266564] [<ffffffff810a539f>] warn_slowpath_common+0x7f/0xc0 [ 176.266564] [<ffffffff810a53fa>] warn_slowpath_null+0x1a/0x20 [ 176.266564] [<ffffffff810ad9ea>] local_bh_enable_ip+0x7a/0xa0 [ 176.266564] [<ffffffff819308c9>] _raw_spin_unlock_bh+0x19/0x20 [ 176.266564] [<ffffffff8123dbda>] proc_free_inum+0x3a/0x50 [ 176.266564] [<ffffffff8111d0dc>] free_pid_ns+0x1c/0x80 [ 176.266564] [<ffffffff8111d195>] put_pid_ns+0x35/0x50 [ 176.266564] [<ffffffff810c608a>] put_pid+0x4a/0x60 [ 176.266564] [<ffffffff8146b177>] tty_ioctl+0x717/0xc10 [ 176.266564] [<ffffffff810aa4d5>] ? wait_consider_task+0x855/0xb90 [ 176.266564] [<ffffffff81086bf9>] ? default_spin_lock_flags+0x9/0x10 [ 176.266564] [<ffffffff810cab0a>] ? remove_wait_queue+0x5a/0x70 [ 176.266564] [<ffffffff811e37e8>] do_vfs_ioctl+0x98/0x550 [ 176.266564] [<ffffffff810b8a0f>] ? recalc_sigpending+0x1f/0x60 [ 176.266564] [<ffffffff810b9127>] ? __set_task_blocked+0x37/0x80 [ 176.266564] [<ffffffff810ab95b>] ? sys_wait4+0xab/0xf0 [ 176.266564] [<ffffffff811e3d31>] sys_ioctl+0x91/0xb0 [ 176.266564] [<ffffffff810a95f0>] ? task_stopped_code+0x50/0x50 [ 176.266564] [<ffffffff81939199>] system_call_fastpath+0x16/0x1b [ 176.266564] ---[ end trace 387af88219ad6143 ]--- It turns out that spin_unlock_bh(proc_inum_lock) is not safe when put_pid is called with another spinlock held and irqs disabled. For now take the easy path and use spin_lock_irqsave(proc_inum_lock) in proc_free_inum and spin_loc_irq in proc_alloc_inum(proc_inum_lock). Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>