Age | Commit message | Author | Files | Lines
2013-11-04 | tcp: enable sockets to use MSG_FASTOPEN by default | Yuchung Cheng | 2 | -3/+3
Applications have started to use Fast Open (e.g., Chrome browser has such an optional flag) and the feature has gone through several generations of kernels since 3.7 with many real network tests. It's time to enable this flag by default for applications to test more conveniently and extensively. Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-04 | netfilter: nft_compat: use _safe version of list_for_each | Dan Carpenter | 1 | -4/+4
We need to use the _safe version of list_for_each_entry() here otherwise we have a use after free bug. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-11-04 | net/mlx4_core: Implement resource quota enforcement | Jack Morgenstein | 2 | -13/+173
Implements resource quota grant decision when resources are requested, for the following resources: QPs, CQs, SRQs, MPTs, MTTs, vlans, MACs, and Counters. When granting a resource, the quota system increases the allocated-count for that slave. When the slave later frees the resource, its allocated-count is reduced. A spinlock is used to protect the integrity of each resource's free-pool counter. (One slave may be in the process of being granted a resource while another slave has crashed, initiating cleanup of that slave's resource quotas). Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
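The grant and release bookkeeping described above can be modelled in a few lines. This is a minimal sketch, not the driver's code: the class name `ResourceQuota`, its fields, and the use of a `threading.Lock` in place of the kernel spinlock are all assumptions made for illustration. The rule that a function draws on its guaranteed share before touching the free pool is taken from the description of the companion "Structures and init/teardown" commit.

```python
import threading


class ResourceQuota:
    """Toy model of one resource type (e.g. QPs) shared by several slaves.

    All names here are hypothetical; they do not mirror mlx4 structures.
    """

    def __init__(self, free_pool, guaranteed, quota, num_slaves):
        self.lock = threading.Lock()       # stands in for the spinlock
        self.free_pool = free_pool         # shared free-pool counter
        self.guaranteed = guaranteed       # per-slave guaranteed minimum
        self.quota = quota                 # per-slave maximum
        self.allocated = [0] * num_slaves  # per-slave allocated-count

    def grant(self, slave, count=1):
        """Return True and account for the resources if the grant is allowed."""
        with self.lock:
            have = self.allocated[slave]
            if have + count > self.quota:
                return False
            # Units still covered by the slave's guaranteed share.
            from_guaranteed = max(0, min(count, self.guaranteed - have))
            from_free = count - from_guaranteed
            if from_free > self.free_pool:
                return False
            self.free_pool -= from_free
            self.allocated[slave] = have + count
            return True

    def release(self, slave, count=1):
        """Reduce the slave's allocated-count and refill the free pool."""
        with self.lock:
            have = self.allocated[slave]
            if count > have:
                raise ValueError("releasing more than was allocated")
            # Only units held above the guaranteed share came from the free pool.
            above_guarantee = max(0, have - self.guaranteed)
            self.free_pool += min(count, above_guarantee)
            self.allocated[slave] = have - count
```

Holding one lock around both the free-pool counter and the per-slave count keeps a grant to one slave consistent with a concurrent cleanup of another, which is the situation the commit message calls out.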
2013-11-04 | net/mlx4_core: Fix quota handling in the QUERY_FUNC_CAP wrapper | Jack Morgenstein | 1 | -23/+65
In current kernels, the mlx4 driver running on a VM does not differentiate between max resource numbers for the HCA and max quotas -- it simply takes the quota values passed to it as max-resource values. However, the driver actually requires the VFs to be aware of the actual number of resources that the HCA was initialized with, for QPs, CQs, SRQs and MPTs. For QPs, CQs and SRQs, the reason is that in completion handling the driver must know which of the 24 bits are the actual resource number, and which are "padding" bits. For MPTs, also, the driver assumes knowledge of the number of MPTs in the system. The previous commit fixes the quota logic on the VM for the quota values passed to it by QUERY_FUNC_CAPS. For QPs, CQs, SRQs, and MPTs, it takes the max resource numbers from QUERY_HCA (and not QUERY_FUNC_CAPS). The quotas passed in QUERY_FUNC_CAPS are used to report max resource number values in the response to ib_query_device. However, the Hypervisor driver must consider that VMs may be running previous kernels, and compatibility must be preserved. To resolve the incompatibility with previous kernels running on VMs, we deprecated the quota fields in mlx4_QUERY_FUNC_CAP. In the deprecated fields, we pass the max-resource values from INIT_HCA. The quota fields are moved to a new location, and the current kernel driver takes the proper values from that location. There is also a new flag in dword 0, bit 28 of the mlx4_QUERY_FUNC_CAP mailbox; if this flag is set, the (VM) driver takes the quota values from the new location. VMs running previous kernels will work properly, except that the max resource numbers reported in ib_query_device for these resources will be too high. The Hypervisor driver will, however, enforce the quotas for these VMs. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-04 | mlx4: Structures and init/teardown for VF resource quotas | Jack Morgenstein | 7 | -23/+222
This is step #1 for implementing SRIOV resource quotas for VFs.

Quotas are implemented per resource type for VFs and the PF, to prevent any entity from simply grabbing all the resources for itself and leaving the other entities unable to obtain such resources.

Resources which are allocated using quotas: QPs, CQs, SRQs, MPTs, MTTs, MAC, VLAN, and Counters.

The quota system works as follows: Each entity (VF or PF) is given a max number of a given resource (its quota), and a guaranteed minimum number for each resource (starvation prevention).

For QPs, CQs, SRQs, MPTs and MTTs: 50% of the available quantity for the resource is divided equally among the PF and all the active VFs (i.e., the number of VFs in the mlx4_core module parameter "num_vfs"). This 50% represents the "guaranteed minimum" pool. The other 50% is the "free pool", allocated on a first-come-first-serve basis. For each VF/PF, resources are first allocated from its "guaranteed-minimum" pool. When that pool is exhausted, the driver attempts to allocate from the resource "free-pool". The quota (i.e., max) for the VFs and the PF is: The free-pool amount (50% of the real max) + the guaranteed minimum.

For MACs: Guarantee 2 MACs per VF/PF per port. As a result, since we have only 128 MACs per port, reduce the allowable number of VFs from 64 to 63. Any remaining MACs are put into a free pool.

For VLANs: For the PF, the per-port quota is 128 and guarantee is 64 (to allow the PF to register at least a VLAN per VF in VST mode). For the VFs, the per-port quota is 64 and the guarantee is 0. We assume that VGT VFs are trusted not to abuse the VLAN resource.

For Counters: For all functions (PF and VFs), the quota is 128 and the guarantee is 0.

In this patch, we define the needed structures, which are added to the resource-tracker struct. In addition, we do initialization for the resource quota, and adjust the query_device response to use quotas rather than resource maxima.
As part of the implementation, we introduce a new field in mlx4_dev: quotas. This field holds the resource quotas used to report maxima to the upper layers (ib_core, via query_device). The HCA maxima of these values are passed to the VFs (via QUERY_HCA) so that they may continue to use these in handling QPs, CQs, SRQs and MPTs. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
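The arithmetic in the quota description can be checked with a short sketch. The function names and the rounding (integer division, remainder left unassigned) are assumptions for illustration; only the 50/50 split, the "PF plus all VFs" divisor, the 2-MACs-per-function guarantee and the 128-MACs-per-port figure come from the commit message.

```python
MACS_PER_PORT = 128   # from the commit message
MAC_GUARANTEE = 2     # guaranteed MACs per VF/PF per port


def split_resource(total, num_vfs):
    """Return (guaranteed, free_pool, quota) for QPs, CQs, SRQs, MPTs, MTTs.

    Hypothetical helper: rounding down is an assumption of this sketch.
    """
    functions = num_vfs + 1           # all VFs plus the PF
    half = total // 2
    guaranteed = half // functions    # equal share of the guaranteed half
    free_pool = half                  # the other half, first come first served
    quota = free_pool + guaranteed    # per-function maximum
    return guaranteed, free_pool, quota


def mac_free_pool(num_vfs):
    """MACs left in the per-port free pool; negative if guarantees cannot be met."""
    functions = num_vfs + 1
    return MACS_PER_PORT - MAC_GUARANTEE * functions
```

With 1024 QPs and 3 VFs, `split_resource(1024, 3)` gives a guarantee of 128 per function, a free pool of 512 and a quota of 640. `mac_free_pool(64)` is negative while `mac_free_pool(63)` is exactly zero, which is consistent with the reduction of the VF limit from 64 to 63.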
2013-11-04 | net/mlx4_core: Fix checking order in MR table init | Jack Morgenstein | 1 | -3/+3
In procedure mlx4_init_mr_table(), slaves should do no processing, but should return success. This initialization is hypervisor-only. However, the check for num_mpts being a power-of-2 was performed before the check to return immediately if the driver is for a slave. This resulted in spurious failures. The order of performing the checks is reversed, so that if the driver is for a slave, no processing is done and success is returned. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-04 | net/mlx4_core: Don't fail reg/unreg vlan for older guests | Jack Morgenstein | 3 | -1/+18
In upstream kernels under SRIOV, the vlan register/unregister calls were NOPs (doing nothing and returning OK). We detect these old calls from guests (via the comm channel), since previously the port number in mlx4_register_vlan was passed (improperly) in the out_param. This has been corrected so that the port number is now passed in bits 8..15 of the in_modifier field. For old calls, these bits will be zero, so if the passed port number is zero, we can still look at the out_param field to see if it contains a valid port number. If yes, the VM is running an old driver. Since for old drivers, the register/unregister_vlan wrappers were NOPs, we continue this policy -- the reason being that upstream had an additional bug in the eth driver running on guests, where procedure mlx4_en_vlan_rx_kill_vid() had the following code:

    if (!mlx4_find_cached_vlan(mdev->dev, priv->port, vid, &idx))
        mlx4_unregister_vlan(mdev->dev, priv->port, idx);
    else
        en_err(priv, "could not find vid %d in cache\n", vid);

On a VM, mlx4_find_cached_vlan() will always fail, since the vlan cache is located on the Hypervisor; on guests it is empty. Therefore, if we allow upstream guests to register vlans, we will have vlan leakage since the unregister will never be performed. Leaving vlan reg/unreg for old guest drivers as a NOP is not a feature regression, since in upstream the register/unregister vlan wrapper is a NOP. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
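The old-guest detection described in this commit message comes down to a small decision on two fields. The sketch below is a model only: `decode_vlan_port`, the `num_ports` default of 2, and the treatment of out_param as a plain integer holding the port are assumptions, not the driver's wrapper code.

```python
def decode_vlan_port(in_modifier, out_param, num_ports=2):
    """Return (port, is_old_guest), or None if no valid port can be found.

    Hypothetical helper: a port is considered valid when 1 <= port <= num_ports.
    """
    port = (in_modifier >> 8) & 0xFF       # new convention: bits 8..15
    if port != 0:
        if port <= num_ports:
            return port, False
        return None
    # Bits 8..15 are zero: possibly an old guest that put the port in out_param.
    if 1 <= out_param <= num_ports:
        return out_param, True              # old driver: treat reg/unreg as a NOP
    return None
```

A caller would skip the real register/unregister work whenever `is_old_guest` is true, matching the policy of keeping these calls as NOPs for old guest drivers.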
2013-11-04 | net/mlx4_core: Resource tracker for reg/unreg vlans | Jack Morgenstein | 1 | -6/+121
Add resource tracker support for reg/unreg vlans calls done by VFs. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-04 | net/mlx4_en: Use vlan id instead of vlan index for unregistration | Jack Morgenstein | 6 | -21/+20
Use of vlan_index created problems unregistering vlans on guests. In addition, tools delete vlan by tag, not by index, let's follow that. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-04 | net/mlx4_core: Fix reg/unreg vlan/mac to conform to the firmware spec | Jack Morgenstein | 3 | -25/+49
The functions mlx4_register_vlan, mlx4_unregister_vlan, mlx4_register_mac, mlx4_unregister_mac all made illegal use of the out_param in multifunc mode to pass the port number. The firmware spec specifies that the port number should be passed in bits 8..15 of the input-modifier field for ALLOC_RES and FREE_RES (sections 20.15.1 and 20.15.2). For MAC register/unregister, this patch contains workarounds so that guests running previous kernels continue to work on a new Hypervisor, and guests running the new kernel will continue to work on old hypervisors. Vlan registration capability is still not operational in multifunction mode, since the vlan wrapper functions are not implemented in this patch. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-04 | net/mlx4_core: Fix register/unreg vlan flow | Jack Morgenstein | 1 | -11/+10
The reg/unreg vlan code was broken: 1. a wrapped function called another wrapped function, causing a deadlock. 2. unregister_vlan called cmd_box instead of cmd_box_imm, leading to incorrectly passed parameters. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-04 | sh_eth: check platform data pointer | Sergei Shtylyov | 1 | -0/+6
Check the platform data pointer before dereferencing it and error out of the probe() method if it's NULL. This has the additional effect of preventing a kernel oops with outdated platform data containing zero PHY address instead (such as on SolutionEngine7710). Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Acked-by: Simon Horman <horms+renesas@verge.net.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-04 | net: cdc_mbim: fixup error return value | Bjørn Mork | 1 | -4/+2
Reported-by: Oliver Neukum <oneukum@suse.de> Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-04 | net: cdc_mbim: no need to check for resume if suspend exists | Bjørn Mork | 1 | -1/+1
Reported-by: Oliver Neukum <oneukum@suse.de> Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-04 | net: qmi_wwan: no need to check for resume if suspend exists | Bjørn Mork | 1 | -1/+1
Reported-by: Oliver Neukum <oneukum@suse.de> Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-04 | net: qmi_wwan: manage_power should always set needs_remote_wakeup | Bjørn Mork | 1 | -6/+4
Reported-by: Oliver Neukum <oneukum@suse.de> Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-04 | net: cdc_mbim: manage_power should always set needs_remote_wakeup | Bjørn Mork | 1 | -5/+3
Reported-by: Oliver Neukum <oneukum@suse.de> Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-04 | qlcnic: update version to 5.3.52 | Himanshu Madhani | 1 | -2/+2
Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-04 | qlcnic: Enable multiple Tx queue support for 83xx/84xx Series adapters. | Himanshu Madhani | 5 | -17/+18
o 83xx and 84xx firmware is capable of multiple Tx queues. This patch will enable multiple Tx queues for 83xx/84xx series adapters. Max number of Tx queues supported will be 8. Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-04 | qlcnic: refactor Tx/SDS ring calculation and validation in driver. | Himanshu Madhani | 12 | -311/+404
o Current driver has duplicate code for validating user input for changing Tx/SDS rings using set_channel ethtool interface. This patch removes duplicate code and refactors Tx/SDS ring validation for 82xx/83xx/84xx series adapters.
o Refactored code now calculates the maximum number of Tx/Rx rings the driver can support based on Default, NPAR and SRIOV PF/VF mode of driver.
Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-04 | qlcnic: Enhance ethtool Statistics for Multiple Tx queue. | Himanshu Madhani | 4 | -62/+94
o Enhance ethtool statistics to display multiple Tx queue stats for all supported adapters. Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-04 | qlcnic: Register netdev in FAILED state for 83xx/84xx | Sucheta Chakraborty | 6 | -33/+67
o Without failing probe, register netdev when device is in FAILED state. o Device will come up with minimum functionality and allow diagnostics and repair of the adapter. Signed-off-by: Sucheta Chakraborty <sucheta.chakraborty@qlogic.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-04 | lib: crc32: reduce number of cases for crc32{, c}_combine | Daniel Borkmann | 1 | -2/+2
We can safely reduce the number of test cases by a tenth. There is no particular need to run as many as we're running now for crc32{,c}_combine, that gives us still ~8000 tests we're doing if people run kernels with crc selftests enabled which is perfectly fine. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-04 | lib: crc32: conditionally resched when running testcases | Daniel Borkmann | 1 | -0/+3
Fengguang reports that when crc32 selftests are running on startup, on some e.g. 32bit systems, we can get a CPU stall like "INFO: rcu_sched self-detected stall on CPU { 0} (t=2101 jiffies g=4294967081 c=4294967080 q=41)". As this is not intended, add a cond_resched() at the end of a test case to fix it. Introduced by efba721f63 ("lib: crc32: add test cases for crc32{, c}_combine routines"). Reported-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-04 | net: checksum: fix warning in skb_checksum | Daniel Borkmann | 2 | -1/+6
This patch fixes a build warning in skb_checksum() by wrapping the csum_partial() usage in skb_checksum(). The problem is that on a few architectures, csum_partial is used with prefix asmlinkage whereas on most architectures it's not. So fix this up generically as we did with csum_block_add_ext() to match the signature. Introduced by 2817a336d4d ("net: skb_checksum: allow custom update/combine for walking skb"). Reported-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-04 | net/mlx4_core: Fix call to __mlx4_unregister_mac | Jack Morgenstein | 1 | -1/+1
In function mlx4_master_deactivate_admin_state() __mlx4_unregister_mac was called using the MAC index. It should be called with the value of the MAC itself. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-04 | net: sctp: do not trigger BUG_ON in sctp_cmd_delete_tcb | Daniel Borkmann | 1 | -1/+0
Introduced in f9e42b853523 ("net: sctp: sideeffect: throw BUG if primary_path is NULL"), we intended to find a buggy assoc that's part of the assoc hash table with a primary_path that is NULL. However, we better remove the BUG_ON for now and find a more suitable place to assert for these things as Mark reports that this also triggers the bug when duplication cookie processing happens, and the assoc is not part of the hash table (so all good in this case). Such a situation can for example easily be reproduced by:

    tc qdisc add dev eth0 root handle 1: prio bands 2 priomap 1 1 1 1 1 1
    tc qdisc add dev eth0 parent 1:2 handle 20: netem loss 20%
    tc filter add dev eth0 protocol ip parent 1: prio 2 u32 match ip \
        protocol 132 0xff match u8 0x0b 0xff at 32 flowid 1:2

This drops 20% of COOKIE-ACK packets. After some follow-up discussion with Vlad we came to the conclusion that for now we should still better remove this BUG_ON() assertion, and come up with two follow-ups later on, that is, i) find a more suitable place for this assertion, and possibly ii) have a special allocator/initializer for such kind of temporary assocs. Reported-by: Mark Thomas <Mark.Thomas@metaswitch.com> Signed-off-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-03 | net/hsr: Add support for the High-availability Seamless Redundancy protocol (HSRv0) | Arvid Brodin | 15 | -0/+2403
High-availability Seamless Redundancy ("HSR") provides instant failover redundancy for Ethernet networks. It requires a special network topology where all nodes are connected in a ring (each node having two physical network interfaces). It is suited for applications that demand high availability and very short reaction time. HSR acts on the Ethernet layer, using a registered Ethernet protocol type to send special HSR frames in both directions over the ring. The driver creates virtual network interfaces that can be used just like any ordinary Linux network interface, for IP/TCP/UDP traffic etc. All nodes in the network ring must be HSR capable. This code is a "best effort" to comply with the HSR standard as described in IEC 62439-3:2010 (HSRv0). Signed-off-by: Arvid Brodin <arvid.brodin@xdin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-03 | net: extend net_device allocation to vmalloc() | Eric Dumazet | 4 | -11/+24
Joby Poriyath provided a xen-netback patch to reduce the size of xenvif structure as some netdev allocation could fail under memory pressure/fragmentation. This patch is handling the problem at the core level, allowing any netdev structures to use vmalloc() if kmalloc() failed. As vmalloc() adds overhead on a critical network path, add __GFP_REPEAT to kzalloc() flags to do this fallback only when really needed. Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Joby Poriyath <joby.poriyath@citrix.com> Cc: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-03 | net: sctp: fix and consolidate SCTP checksumming code | Daniel Borkmann | 2 | -45/+20
This fixes an outstanding bug found through IPVS, where SCTP packets with skb->data_len > 0 (non-linearized) and empty frag_list, but data accumulated in frags[] member, are forwarded with incorrect checksum letting SCTP initial handshake fail on some systems. Linearizing each SCTP skb in IPVS to prevent that would not be a good solution as this leads to an additional and unnecessary performance penalty on the load-balancer itself for no good reason (as we actually only want to update the checksum, and can do that in a different/better way presented here). The actual problem is elsewhere, namely, that SCTP's checksumming in sctp_compute_cksum() does not take frags[] into account like skb_checksum() does. So while we are fixing this up, we better reuse the existing code that we have anyway in __skb_checksum() and use it for walking through the data doing checksumming. This will not only fix this issue, but also consolidates some SCTP code with core sk_buff code, bringing it closer together and removing respectively avoiding reimplementation of skb_checksum() for no good reason. As crc32c() can use hardware implementation within the crypto layer, we leave that intact (it wraps around / falls back to e.g. slice-by-8 algorithm in __crc32c_le() otherwise); plus use the __crc32c_le_combine() combinator for crc32c blocks. Also, we remove all other SCTP checksumming code, so that we only have to use sctp_compute_cksum() from now on; for doing that, we need to transform SCTP checksumming in output path slightly, and can leave the rest intact. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-03 | net: skb_checksum: allow custom update/combine for walking skb | Daniel Borkmann | 3 | -13/+37
Currently, skb_checksum walks over 1) linearized, 2) frags[], and 3) frag_list data and calculates the one's complement, a 32 bit result suitable for feeding into itself or csum_tcpudp_magic(), but unsuitable for SCTP as we're calculating CRC32c there. Hence, in order to not re-implement the very same function in SCTP (and maybe other protocols) over and over again, use an update() + combine() callback internally to allow for walking over the skb with different algorithms. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
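The update() + combine() split can be illustrated with a small walker over a list of byte chunks, standing in for the linear data, frags[] and frag_list of an skb. This is a sketch under stated assumptions: `walk_checksum`, `csum_update` and `csum_combine` are hypothetical names, the sum is folded to 16 bits for simplicity (the kernel carries a 32-bit partial sum), and the callbacks shown implement the ones' complement sum only.

```python
def fold16(value):
    """Fold carries back in (end-around carry) until the value fits 16 bits."""
    while value >> 16:
        value = (value & 0xFFFF) + (value >> 16)
    return value


def csum_update(data, seed=0):
    """Ones' complement sum of data as big-endian 16-bit words."""
    if len(data) % 2:
        data = data + b"\x00"          # pad an odd tail with a zero byte
    total = seed
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    return fold16(total)


def csum_combine(csum, csum2, offset, length):
    """Add a block checksum that starts at byte `offset` of the whole buffer."""
    if offset & 1:
        # A block at an odd offset has its bytes in the opposite word halves.
        csum2 = ((csum2 & 0xFF) << 8) | (csum2 >> 8)
    return fold16(csum + csum2)


def walk_checksum(chunks, update, combine, csum=0):
    """Checksum a fragmented buffer with pluggable update/combine callbacks."""
    pos = 0
    for chunk in chunks:
        csum2 = update(chunk, 0)                     # checksum of this block alone
        csum = combine(csum, csum2, pos, len(chunk))
        pos += len(chunk)
    return csum
```

Because the walker never looks inside the checksum value, a different algorithm only needs its own pair of callbacks; for a CRC, the combine step would use the length argument that the ones' complement version ignores.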
2013-11-03 | lib: crc32: add test cases for crc32{, c}_combine routines | Daniel Borkmann | 1 | -0/+72
We already have 100 test cases for crcs itself, so split the test buffer with a priori known checksums, and test crc of two blocks against crc of the whole block for the same results. Output/result with CONFIG_CRC32_SELFTEST=y:

    [ 2.687095] crc32: CRC_LE_BITS = 64, CRC_BE BITS = 64
    [ 2.687097] crc32: self tests passed, processed 225944 bytes in 278177 nsec
    [ 2.687383] crc32c: CRC_LE_BITS = 64
    [ 2.687385] crc32c: self tests passed, processed 225944 bytes in 141708 nsec
    [ 7.336771] crc32_combine: 113072 self tests passed
    [ 12.050479] crc32c_combine: 113072 self tests passed
    [ 17.633089] alg: No test for crc32 (crc32-pclmul)

Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Cc: linux-kernel@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-03 | lib: crc32: add functionality to combine two crc32{, c}s in GF(2) | Daniel Borkmann | 2 | -0/+121
This patch adds a combinator to merge two or more crc32{,c}s into a new one. This is useful for checksum computations of fragmented skbs that use crc32/crc32c as checksums. The arithmetics for combining both in the GF(2) was taken and slightly modified from zlib. Only passing two crcs is insufficient as two crcs and the length of the second piece is needed for merging. The code is made generic, so that only polynomials need to be passed for crc32_le resp. crc32c_le. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Cc: linux-kernel@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
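For readers who want to see the GF(2) combination at work, the following is a Python port of the zlib-style algorithm the commit message refers to. It is a sketch, not the kernel code (which the message says was slightly modified from zlib): the function names are chosen here, and Python's `zlib.crc32` serves as the reference CRC-32. The operator matrices represent "append N zero bits" and are squared repeatedly so that the cost grows with the logarithm of the second block's length.

```python
import zlib

CRC32_POLY_LE = 0xEDB88320  # reflected CRC-32 polynomial, as used by zlib


def gf2_matrix_times(mat, vec):
    """Multiply a 32x32 GF(2) matrix (list of column words) by a bit vector."""
    total = 0
    i = 0
    while vec:
        if vec & 1:
            total ^= mat[i]
        vec >>= 1
        i += 1
    return total


def gf2_matrix_square(mat):
    """Return mat * mat over GF(2)."""
    return [gf2_matrix_times(mat, column) for column in mat]


def crc32_combine(crc1, crc2, len2, poly=CRC32_POLY_LE):
    """CRC of A+B given crc1 = CRC(A), crc2 = CRC(B) and len2 = len(B)."""
    if len2 <= 0:
        return crc1
    odd = [poly] + [1 << n for n in range(31)]  # operator for one zero bit
    even = gf2_matrix_square(odd)               # two zero bits
    odd = gf2_matrix_square(even)               # four zero bits
    # Apply len2 zero bytes to crc1; the first squaring below yields the
    # operator for one zero byte (eight zero bits).
    while True:
        even = gf2_matrix_square(odd)
        if len2 & 1:
            crc1 = gf2_matrix_times(even, crc1)
        len2 >>= 1
        if len2 == 0:
            break
        odd = gf2_matrix_square(even)
        if len2 & 1:
            crc1 = gf2_matrix_times(odd, crc1)
        len2 >>= 1
        if len2 == 0:
            break
    return crc1 ^ crc2
```

For example, `crc32_combine(zlib.crc32(a), zlib.crc32(b), len(b))` should equal `zlib.crc32(a + b)` for any byte strings `a` and `b`. The length argument is what the commit message means when it says that two CRCs alone are insufficient.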
2013-11-03 | lib: crc32: clean up spacing in test cases | Daniel Borkmann | 1 | -200/+100
This is nothing more than a whitespace cleanup, as 80 chars is not a hard but soft limit, and otherwise makes the test cases array really look ugly. So fix it up. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Cc: linux-kernel@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-03 | Linux 3.12 | Linus Torvalds | 1 | -1/+1
2013-11-03 | netfilter: nf_tables: remove duplicated include from nf_tables_ipv4.c | Wei Yongjun | 1 | -1/+0
Remove duplicated include. Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-11-03 | netfilter: ctnetlink: account both directions in one step | Holger Eitzenberger | 1 | -25/+24
With the intent to dump other accounting data later. This patch is a cleanup. Signed-off-by: Holger Eitzenberger <holger@eitzenberger.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-11-03 | netfilter: introduce nf_conn_acct structure | Holger Eitzenberger | 6 | -24/+38
Encapsulate counters for both directions into nf_conn_acct. During that process also consistently name pointers to the extend 'acct', not 'counters'. This patch is a cleanup. Signed-off-by: Holger Eitzenberger <holger@eitzenberger.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-11-03 | ipc, msg: forbid negative values for "msg{max,mnb,mni}" | Mathias Krause | 2 | -11/+15
Negative message lengths make no sense -- so don't do negative queue lengths or identifier counts. Prevent them from getting negative. Also change the underlying data types to be unsigned to avoid hairy surprises with sign extensions in cases where those variables get evaluated in unsigned expressions with bigger data types, e.g. size_t. In case a user still wants to have "unlimited" sizes she could just use INT_MAX instead. Signed-off-by: Mathias Krause <minipli@googlemail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
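The sign-extension surprise mentioned above can be reproduced with plain integer arithmetic. The sketch below assumes a 64-bit size_t and models the C conversion of a negative int to an unsigned type with a bit mask; `as_size_t` and `message_too_big` are hypothetical names for illustration, not functions from ipc/msg.c.

```python
SIZE_T_BITS = 64                      # assumption: 64-bit size_t
SIZE_T_MASK = (1 << SIZE_T_BITS) - 1
INT_MAX = 2**31 - 1


def as_size_t(value):
    """Model C's conversion of a signed int to size_t (modulo 2**64)."""
    return value & SIZE_T_MASK


def message_too_big(msgsz, msgmax):
    """Size check as it behaves when msgmax is promoted to size_t."""
    return msgsz > as_size_t(msgmax)
```

With `msgmax = -1` the limit becomes `2**64 - 1`, so `message_too_big` never fires and the negative setting silently acts as "no limit". Setting `msgmax = INT_MAX` expresses the same intent explicitly, as the commit message suggests.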
2013-11-02 | ARC: Incorrect mm reference used in vmalloc fault handler | Vineet Gupta | 1 | -3/+3
A vmalloc fault needs to sync up PGD/PTE entry from init_mm to current task's "active_mm". ARC vmalloc fault handler however was using mm. A vmalloc fault for non user task context (actually pre-userland, from init thread's open for /dev/console) caused the handler to deref NULL mm (for mm->pgd).

The reasons it worked so far are amazing:
1. By default (!SMP), vmalloc fault handler uses a cached value of PGD. In SMP that MMU register is repurposed hence need for mm pointer deref.
2. In pre-3.12 SMP kernel, the problem triggering vmalloc didn't exist in pre-userland code path - it was introduced with commit 20bafb3d23d108bc "n_tty: Move buffers into n_tty_data"

Signed-off-by: Vineet Gupta <vgupta@synopsys.com> Cc: Gilad Ben-Yossef <gilad@benyossef.com> Cc: Noam Camus <noamc@ezchip.com> Cc: stable@vger.kernel.org #3.10 and 3.11 Cc: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-11-02 | net: flow_dissector: fail on evil iph->ihl | Jason Wang | 1 | -1/+1
We don't validate iph->ihl, which may lead to an endless loop if we meet an IPIP skb whose iph->ihl is zero. Fix this by failing immediately when iph->ihl is evil (less than 5). This issue was introduced by commit ec5efe7946280d1e84603389a1030ccec0a767ae (rps: support IPIP encapsulation). Cc: Eric Dumazet <edumazet@google.com> Cc: Petr Matousek <pmatouse@redhat.com> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
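Why a zero ihl is dangerous becomes clear in a toy dissector that peels IPv4-in-IPv4 headers. This is a minimal sketch, not the flow dissector itself: `dissect_ipip` and `ipv4_header` are hypothetical helpers, and only three IPv4 facts are relied on (ihl is the low nibble of the first byte and counts 32-bit words, the protocol field is byte 9, and protocol 4 means IPIP).

```python
IPPROTO_IPIP = 4   # IPv4-in-IPv4 encapsulation
MIN_IHL = 5        # a valid IPv4 header is at least 5 words (20 bytes)


def ipv4_header(ihl, protocol):
    """Build a mostly-zero 20-byte IPv4 header for experiments (hypothetical)."""
    header = bytearray(20)
    header[0] = 0x40 | (ihl & 0x0F)   # version 4 in the high nibble
    header[9] = protocol
    return bytes(header)


def dissect_ipip(packet):
    """Return (payload_offset, protocol) after peeling IPIP, or None if malformed."""
    offset = 0
    while True:
        if offset + 20 > len(packet):
            return None
        ihl = packet[offset] & 0x0F
        if ihl < MIN_IHL:
            # Without this check, ihl == 0 would leave offset unchanged and
            # the loop would re-read the same header forever.
            return None
        protocol = packet[offset + 9]
        offset += ihl * 4
        if protocol != IPPROTO_IPIP:
            return offset, protocol
```

A packet built from `ipv4_header(5, 4) + ipv4_header(0, 4)` is rejected at the inner header, whereas the same loop without the `ihl < MIN_IHL` test would never terminate on it.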
2013-11-02 | bonding: bond_get_size() returns wrong size | Dan Carpenter | 1 | -2/+2
There is an extra semi-colon so bond_get_size() doesn't return the correct value. Fixes: ec76aa49855f ('bonding: add Netlink support active_slave option') Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Veaceslav Falico <vfalico@redhat.com> Reviewed-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-02 | net: cdc_ncm: do not set tx_max higher than the device supports | Bjørn Mork | 1 | -2/+1
There are MBIM devices out there reporting dwNtbInMaxSize=2048 dwNtbOutMaxSize=2048 and since the spec requires a datagram max size of at least 2048, this means that a full sized datagram will never fit. Still, sending larger NTBs than the device supports is not going to help. We do not have any other options than either a) refusing to bind, or b) respecting the insanely low value. Alternative b will at least make these devices work, so go for it. Cc: Alexey Orishko <alexey.orishko@gmail.com> Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-02 | net: cdc_ncm: improve bind error debug messages | Bjørn Mork | 1 | -10/+26
Make it a bit easier for users to figure out what goes wrong when bind fails. Cc: Alexey Orishko <alexey.orishko@gmail.com> Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-02 | net: cdc_ncm: return proper error if setup fails | Bjørn Mork | 1 | -2/+2
Most setup errors are ignored to ensure maximum firmware compatibility. But GET_NTB_PARAMETERS and the functional descriptors are required. Use proper error codes and log level if these fail. Cc: Alexey Orishko <alexey.orishko@gmail.com> Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-02 | net: cdc_ncm: refactoring cdc_ncm_setup | Bjørn Mork | 1 | -64/+44
Rewriting the "set max datagram" part of cdc_ncm_setup to separate the selection and validation of the size from the code which optionally informs the device of this value. This ensures that we use the correct value regardless of device support for the get and set commands. Removing some of the many indent levels while doing this to make the code more readable. Cc: Alexey Orishko <alexey.orishko@gmail.com> Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-02 | net: cdc_ncm: drop "extern" from header declarations | Bjørn Mork | 1 | -6/+6
Cc: Alexey Orishko <alexey.orishko@gmail.com> Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-02 | net: cdc_ncm: endian convert constants instead of variables | Bjørn Mork | 1 | -2/+2
Converting the constants used in these comparisons at build time instead of converting the variables for every received frame at run time. Cc: Alexey Orishko <alexey.orishko@gmail.com> Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-02 | net: cdc_ncm: log signatures in hex | Bjørn Mork | 1 | -4/+6
These signatures are well known bit patterns, mostly made up of ascii characters. Mentally parsing works best if they are printed in hex. Cc: Alexey Orishko <alexey.orishko@gmail.com> Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-02 | net: cdc_ncm: use netif_* and dev_* instead of pr_* | Bjørn Mork | 1 | -50/+48
Take advantage of standard device name prefixing and netdevice msglvl control where possible. Cc: Alexey Orishko <alexey.orishko@gmail.com> Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net>