linux-dev - Linux kernel development work

Age	Commit message (Collapse)	Author	Files	Lines
2016-01-04	tilepro: use to_delayed_work	Geliang Tang	1	-2/+1
	Use to_delayed_work() instead of open-coding it. Signed-off-by: Geliang Tang <geliangtang@163.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-04	bnxt_en: Modify ethtool -l\|-L to support combined or rx/tx rings.	Michael Chan	1	-12/+45
	The driver can support either all combined or all rx/tx rings. The default is combined, but the user can now select rx/tx rings. Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-04	bnxt_en: Modify init sequence to support shared or non shared rings.	Michael Chan	1	-10/+32
	Modify ring memory allocation and MSIX setup to support shared or non shared rings and do the proper mapping. Default is still to use shared rings. Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-04	bnxt_en: Modify bnxt_get_max_rings() to support shared or non shared rings.	Michael Chan	3	-25/+79
	Add logic to calculate how many shared or non shared rings can be supported. Default is to use shared rings. Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-04	bnxt_en: Re-structure ring indexing and mapping.	Michael Chan	2	-38/+55
	In order to support dedicated or shared completion rings, the ring indexing and mapping are re-structured as below: 1. bp->grp_info[] array index is 1:1 with bp->bnapi[] array index and completion ring index. 2. rx rings 0 to n will be mapped to completion rings 0 to n. 3. If tx and rx rings share completion rings, then tx rings 0 to m will be mapped to completion rings 0 to m. 4. If tx and rx rings use dedicated completion rings, then tx rings 0 to m will be mapped to completion rings n + 1 to n + m. 5. Each tx or rx ring will use the corresponding completion ring index for doorbell mapping and MSIX mapping. Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-04	bnxt_en: Check for NULL rx or tx ring.	Michael Chan	1	-5/+22
	Each bnxt_napi structure may no longer be having both an rx ring and a tx ring. Check for a valid ring before using it. Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-04	bnxt_en: Separate bnxt_{rx\|tx}_ring_info structs from bnxt_napi struct.	Michael Chan	2	-86/+70
	Currently, an rx and a tx ring are always paired with a completion ring. We want to restructure it so that it is possible to have a dedicated completion ring for tx or rx only. The bnxt hardware uses a completion ring for rx and tx events. The driver has to process the completion ring entries sequentially for the rx and tx events. Using a dedicated completion ring for rx only or tx only has these benefits: 1. A burst of rx packets can cause delay in processing tx events if the completion ring is shared. If tx queue is stopped by BQL, this can cause delay in re-starting the tx queue. 2. A completion ring is sized according to the rx and tx ring size rounded up to the nearest power of 2. When the completion ring is shared, it is sized by adding the rx and tx ring sizes and then rounded to the next power of 2, often with a lot of wasted space. 3. Using dedicated completion ring, we can adjust the tx and rx coalescing parameters independently for rx and tx. The first step is to separate the rx and tx ring structures from the bnxt_napi struct. In this patch, an rx ring and a tx ring will point to the same bnxt_napi struct to share the same completion ring. No change in ring assignment and mapping yet. Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-04	bnxt_en: Refactor bnxt_dbg_dump_states().	Michael Chan	1	-17/+33
	By adding 3 separate functions to dump the different ring states. Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-31	sparc: Wire up mlock2 system call.	David S. Miller	3	-4/+5
	Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-31	sparc: Add all necessary direct socket system calls.	David S. Miller	3	-18/+24
	The GLIBC folks would like to eliminate socketcall support eventually, and this makes sense regardless so wire them all up. Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-31	phy: micrel: Add ethtool statistics counters	Andrew Lunn	1	-0/+96
	The PHY counters receiver errors and errors while idle. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-31	phy: marvell: Add ethtool statistics counters	Andrew Lunn	1	-0/+135
	The PHY counters receiver errors and errors while idle. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-31	ethtool: Add phy statistics	Andrew Lunn	3	-1/+89
	Ethernet PHYs can maintain statistics, for example errors while idle and receive errors. Add an ethtool mechanism to retrieve these statistics, using the same model as MAC statistics. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-30	sctp: sctp should release assoc when sctp_make_abort_user return NULL in sctp_close	Xin Long	2	-4/+5
	In sctp_close, sctp_make_abort_user may return NULL because of memory allocation failure. If this happens, it will bypass any state change and never free the assoc. The assoc has no chance to be freed and it will be kept in memory with the state it had even after the socket is closed by sctp_close(). So if sctp_make_abort_user fails to allocate memory, we should abort the asoc via sctp_primitive_ABORT as well. Just like the annotation in sctp_sf_cookie_wait_prm_abort and sctp_sf_do_9_1_prm_abort said, "Even if we can't send the ABORT due to low memory delete the TCB. This is a departure from our typical NOMEM handling". But then the chunk is NULL (low memory) and the SCTP_CMD_REPLY cmd would dereference the chunk pointer, and system crash. So we should add SCTP_CMD_REPLY cmd only when the chunk is not NULL, just like other places where it adds SCTP_CMD_REPLY cmd. Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-30	net, socket, socket_wq: fix missing initialization of flags	Nicolai Stange	1	-0/+1
	Commit ceb5d58b2170 ("net: fix sock_wake_async() rcu protection") from the current 4.4 release cycle introduced a new flags member in struct socket_wq and moved SOCKWQ_ASYNC_NOSPACE and SOCKWQ_ASYNC_WAITDATA from struct socket's flags member into that new place. Unfortunately, the new flags field is never initialized properly, at least not for the struct socket_wq instance created in sock_alloc_inode(). One particular issue I encountered because of this is that my GNU Emacs failed to draw anything on my desktop -- i.e. what I got is a transparent window, including the title bar. Bisection lead to the commit mentioned above and further investigation by means of strace told me that Emacs is indeed speaking to my Xorg through an O_ASYNC AF_UNIX socket. This is reproducible 100% of times and the fact that properly initializing the struct socket_wq ->flags fixes the issue leads me to the conclusion that somehow SOCKWQ_ASYNC_WAITDATA got set in the uninitialized ->flags, preventing my Emacs from receiving any SIGIO's due to data becoming available and it got stuck. Make sock_alloc_inode() set the newly created struct socket_wq's ->flags member to zero. Fixes: ceb5d58b2170 ("net: fix sock_wake_async() rcu protection") Signed-off-by: Nicolai Stange <nicstange@gmail.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-30	be2net: bump up the driver version to 11.0.0.0	Suresh Reddy	1	-1/+1
	Signed-off-by: Suresh Reddy <suresh.reddy@avagotech.com> Signed-off-by: Sathya Perla <sathya.perla@avagotech.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-30	be2net: support ethtool get-dump option	Venkat Duvvuru	5	-57/+71
	This patch adds support for ethtool's --get-dump option in be2net, to retrieve FW dump. In the past when this option was not yet available, this feature was supported via the --register-dump option as a workaround. This patch removes support for FW-dump via --register-dump option as it is now available via --get-dump option. Even though the "ethtool --register-dump" cmd which used to work earlier, will now fail with ENOTSUPP error, we feel it is not an issue as this is used only for diagnostics purpose. Signed-off-by: Venkat Duvvuru <venkatkumar.duvvuru@avagotech.com> Signed-off-by: Sathya Perla <sathya.perla@avagotech.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-30	be2net: fix port-res desc query of GET_PROFILE_CONFIG FW cmd	Suresh Reddy	4	-53/+63
	Commit 72ef3a88fa8e ("be2net: set pci_func_num while issuing GET_PROFILE_CONFIG cmd") passed a specific pf_num while issuing a GET_PROFILE_CONFIG cmd as FW returns descriptors for all functions when pf_num is zero. But, when pf_num is set to a non-zero value, FW does not return the Port resource descriptor. This patch fixes this by setting pf_num to 0 while issuing the query cmd and adds code to pick the correct NIC resource descriptor from the list of descriptors returned by FW. Fixes: 72ef3a88fa8e ("be2net: set pci_func_num while issuing GET_PROFILE_CONFIG cmd") Signed-off-by: Suresh Reddy <suresh.reddy@avagotech.com> Signed-off-by: Sathya Perla <sathya.perla@avagotech.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-30	be2net: remove unused error variables	Venkat Duvvuru	1	-4/+0
	eeh_error, fw_timeout, hw_error variables in the be_adapter structure are not used anymore. An earlier patch that introduced adapter->err_flags to store this information missed removing these variables. Signed-off-by: Venkat Duvvuru <venkatkumar.duvvuru@avagotech.com> Signed-off-by: Sathya Perla <sathya.perla@avagotech.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-30	be2net: remove a line of code that has no effect	Sathya Perla	3	-4/+3
	This patch removes a line of code that changes adapter->recommended_prio value followed by yet another assignment. Also, the variable is used to store the vlan priority value that is already shifted to the PCP bits position in the vlan tag format. Hence, the name of this variable is changed to recommended_prio_bits. Signed-off-by: Sathya Perla <sathya.perla@avagotech.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-30	be2net: log digital signature errors while flashing FW image	Suresh Reddy	2	-2/+16
	(based on a jumper setting on the adapter.) In this mode, the FW image when flashed is authenticated with a digital signature. This patch logs appropriate error messages and return a status to ethtool when errors relating to FW image authentication occur. Signed-off-by: Suresh Reddy <suresh.reddy@avagotech.com> Signed-off-by: Sathya Perla <sathya.perla@avagotech.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-30	be2net: move FW flash cmd code to be_cmds.c	Suresh Reddy	3	-584/+578
	All code relating to FW cmds is in be_cmds.[ch] excepting FW flash cmd related code. This patch moves these routines from be_main.c to be_cmds.c Signed-off-by: Suresh Reddy <suresh.reddy@avagotech.com> Signed-off-by: Sathya Perla <sathya.perla@avagotech.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-30	be2net: cleanup FW flash image related macro defines	Suresh Reddy	2	-101/+118
	Many constant definitions relating to the FW-image layout (such as section offset values) were defined in decimal format rather than hexa-decimal. This makes this part of the code un-readable. Also some defines related to BE2 are labeld "g2" and defines related to BE3 are labeled "g3". This patch cleans up all of this to make this code more readable. Signed-off-by: Suresh Reddy <suresh.reddy@avagotech.com> Signed-off-by: Sathya Perla <sathya.perla@avagotech.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-30	be2net: avoid configuring VEPA mode on BE3	Suresh Reddy	1	-0/+3
	BE3 chip doesn't support VEPA mode. Signed-off-by: Suresh Reddy <suresh.reddy@avagotech.com> Signed-off-by: Sathya Perla <sathya.perla@avagotech.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-30	be2net: fix VF link state transition from disabled to auto	Suresh Reddy	2	-10/+28
	The VF link state setting transition from "disable" to "auto" does not work due to a bug in SET_LOGICAL_LINK_CONFIG_V1 cmd in FW. This issue could not be fixed in FW due to some backward compatibility issues it causes with some released drivers. The issue has been fixed by introducing a new version (v2) of the cmd from 10.6 FW onwards. In v2, to set the VF link state to auto, both PLINK_ENABLE and PLINK_TRACK bits have to be set to 1. The VF link state setting feature now works on Lancer chips too from FW ver 10.6.315.0 onwards. Signed-off-by: Suresh Reddy <suresh.reddy@avagotech.com> Signed-off-by: Sathya Perla <sathya.perla@avagotech.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-29	ixgbe: Fix bugs in ixgbe_clear_vf_vlans()	Alexander Duyck	1	-4/+4
	When I had rewritten the code for ixgbe_clear_vf_vlans() it looks like I had transitioned back and forth between using word as an offset and using word as a register offset. As a result I honestly don't see how the code was working before other than the fact that resetting the VLANs on the VF like didn't do much to clear them. Another issue found is that the mask was using a divide instead of a modulus. As a result the mask bit was incorrectly being set to either bit 0 or 1 based on the value of the VF being tested. As a result the wrong VFs were having their VLANs cleared if they were enabled. I have updated the code so that word represents the offset in the array. This way we can use the modulus and xor operations and they will make sense instead of being performed on a 4 byte aligned value. I replaced the statement "(word % 2) ^ 1" with "~word % 2" in order to reduce the line length as the line exceeded 80 characters with the register name inserted. The two should be equivalent so the change should be safe. Reported-by: Emil Tantilov <emil.s.tantilov@intel.com> Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-12-29	ixgbe: Correct X550EM_x revision check	Mark Rustad	2	-7/+4
	The X550EM_x revision check needs to check a value, not just a bit. Use a mask and check the value. Also remove the redundant check inside the ixgbe_enter_lplu_t_x550em, because it can only be called when both the mac type and revision check pass. Signed-off-by: Mark Rustad <mark.d.rustad@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-12-29	ixgbe: fix RSS limit for X550	Emil Tantilov	1	-1/+1
	X550 allows for up to 64 RSS queues, but the driver can have max of 63 (-1 MSIX vector for link). On systems with >= 64 CPUs the driver will set the redirection table for all 64 queues which will result in packets being dropped. Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-12-29	ixgbe: Clean up redundancy in hw_enc_features	Mark Rustad	1	-5/+2
	Clean up minor redundancy in the setting of hw_enc_features that makes it appears that X550 uniquely has more encapsulation features than other devices. The driver only supports one more feature, so make it look that way. No longer set NETIF_F_SG since that is set by the register_netdev call. Thanks to Alex Duyck for noticing this slight confusion. Reported-by: Alexander Duyck <aduyck@mirantis.com> Signed-off-by: Mark Rustad <mark.d.rustad@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-12-29	ixgbe: report correct media type for KR, KX and KX4 interfaces	Veola Nazareth	1	-13/+42
	Ethtool reports backplane type interfaces as 1000/10000baseT link modes. This has been corrected to report the media as KR, KX or KX4 based on the backplane interface present. Signed-off-by: Veola Nazareth <veola.nazareth@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-12-29	ixgbe: add support for QSFP PHY types in ixgbe_get_settings()	Emil Tantilov	1	-0/+4
	Add missing QSFP PHY types to allow for more accurate reporting of port settings. Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-12-29	ixgbevf: minor cleanups for ixgbevf_set_itr()	Emil Tantilov	1	-2/+3
	adapter->rx_itr_setting is not a mask so check it with == instead of & do not default to 12K interrupts in ixgbevf_set_itr() There should be no functional effect from these changes. Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-12-29	ixgbevf: Fix handling of NAPI budget when multiple queues are enabled per vector	William Dauchy	1	-0/+2
	This is the same patch as for ixgbe but applied differently according to busy polling. See commit 5d6002b7b822c74 ("ixgbe: Fix handling of NAPI budget when multiple queues are enabled per vector") Signed-off-by: William Dauchy <william@gandi.net> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-12-29	mm/vmstat: fix overflow in mod_zone_page_state()	Heiko Carstens	2	-8/+8
	mod_zone_page_state() takes a "delta" integer argument. delta contains the number of pages that should be added or subtracted from a struct zone's vm_stat field. If a zone is larger than 8TB this will cause overflows. E.g. for a zone with a size slightly larger than 8TB the line mod_zone_page_state(zone, NR_ALLOC_BATCH, zone->managed_pages); in mm/page_alloc.c:free_area_init_core() will result in a negative result for the NR_ALLOC_BATCH entry within the zone's vm_stat, since 8TB contain 0x8xxxxxxx pages which will be sign extended to a negative value. Fix this by changing the delta argument to long type. This could fix an early boot problem seen on s390, where we have a 9TB system with only one node. ZONE_DMA contains 2GB and ZONE_NORMAL the rest. The system is trying to allocate a GFP_DMA page but ZONE_DMA is completely empty, so it tries to reclaim pages in an endless loop. This was seen on a heavily patched 3.10 kernel. One possible explaination seem to be the overflows caused by mod_zone_page_state(). Unfortunately I did not have the chance to verify that this patch actually fixes the problem, since I don't have access to the system right now. However the overflow problem does exist anyway. Given the description that a system with slightly less than 8TB does work, this seems to be a candidate for the observed problem. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Christoph Lameter <cl@linux.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-12-29	ocfs2/dlm: clear migration_pending when migration target goes down	xuejiufei	1	-0/+2
	We have found a BUG on res->migration_pending when migrating lock resources. The situation is as follows. dlm_mark_lockres_migration res->migration_pending = 1; __dlm_lockres_reserve_ast dlm_lockres_release_ast returns with res->migration_pending remains because other threads reserve asts wait dlm_migration_can_proceed returns 1 >>>>>>> o2hb found that target goes down and remove target from domain_map dlm_migration_can_proceed returns 1 dlm_mark_lockres_migrating returns -ESHOTDOWN with res->migration_pending still remains. When reentering dlm_mark_lockres_migrating(), it will trigger the BUG_ON with res->migration_pending. So clear migration_pending when target is down. Signed-off-by: Jiufei Xue <xuejiufei@huawei.com> Reviewed-by: Joseph Qi <joseph.qi@huawei.com> Cc: Mark Fasheh <mfasheh@suse.de> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-12-29	mm/memory_hotplug.c: check for missing sections in test_pages_in_a_zone()	Andrew Banman	1	-12/+19
	test_pages_in_a_zone() does not account for the possibility of missing sections in the given pfn range. pfn_valid_within always returns 1 when CONFIG_HOLES_IN_ZONE is not set, allowing invalid pfns from missing sections to pass the test, leading to a kernel oops. Wrap an additional pfn loop with PAGES_PER_SECTION granularity to check for missing sections before proceeding into the zone-check code. This also prevents a crash from offlining memory devices with missing sections. Despite this, it may be a good idea to keep the related patch '[PATCH 3/3] drivers: memory: prohibit offlining of memory blocks with missing sections' because missing sections in a memory block may lead to other problems not covered by the scope of this fix. Signed-off-by: Andrew Banman <abanman@sgi.com> Acked-by: Alex Thorlton <athorlton@sgi.com> Cc: Russ Anderson <rja@sgi.com> Cc: Alex Thorlton <athorlton@sgi.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Greg KH <greg@kroah.com> Cc: Seth Jennings <sjennings@variantweb.net> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-12-29	ocfs2: fix flock panic issue	Junxiao Bi	1	-1/+4
	Commit 4f6563677ae8 ("Move locks API users to locks_lock_inode_wait()") move flock/posix lock indentify code to locks_lock_inode_wait(), but missed to set fl_flags to FL_FLOCK which caused the following kernel panic on 4.4.0_rc5. kernel BUG at fs/locks.c:1895! invalid opcode: 0000 [#1] SMP Modules linked in: ocfs2(O) ocfs2_dlmfs(O) ocfs2_stack_o2cb(O) ocfs2_dlm(O) ocfs2_nodemanager(O) ocfs2_stackglue(O) iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi xen_kbdfront xen_netfront xen_fbfront xen_blkfront CPU: 0 PID: 20268 Comm: flock_unit_test Tainted: G O 4.4.0-rc5-next-20151217 #1 Hardware name: Xen HVM domU, BIOS 4.3.1OVM 05/14/2014 task: ffff88007b3672c0 ti: ffff880028b58000 task.ti: ffff880028b58000 RIP: locks_lock_inode_wait+0x2e/0x160 Call Trace: ocfs2_do_flock+0x91/0x160 [ocfs2] ocfs2_flock+0x76/0xd0 [ocfs2] SyS_flock+0x10f/0x1a0 entry_SYSCALL_64_fastpath+0x12/0x71 Code: e5 41 57 41 56 49 89 fe 41 55 41 54 53 48 89 f3 48 81 ec 88 00 00 00 8b 46 40 83 e0 03 83 f8 01 0f 84 ad 00 00 00 83 f8 02 74 04 <0f> 0b eb fe 4c 8d ad 60 ff ff ff 4c 8d 7b 58 e8 0e 8e 73 00 4d RIP locks_lock_inode_wait+0x2e/0x160 RSP <ffff880028b5bce8> ---[ end trace dfca74ec9b5b274c ]--- Fixes: 4f6563677ae8 ("Move locks API users to locks_lock_inode_wait()") Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com> Cc: Mark Fasheh <mfasheh@suse.de> Cc: Joel Becker <jlbec@evilplan.org> Cc: Joseph Qi <joseph.qi@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-12-29	m32r: add io*_rep helpers	Sudip Mukherjee	1	-1/+9
	m32r allmodconfig was failing with the error: error: implicit declaration of function 'read' On checking io.h it turned out that 'read' is not defined but 'readb' is defined and 'ioread8' will then obviously mean 'readb'. At the same time some of the helper functions ioreadN_rep() and iowriteN_rep() were missing which also led to the build failure. Signed-off-by: Sudip Mukherjee <sudip@vectorindia.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-12-29	m32r: fix build failure	Sudip Mukherjee	1	-0/+1
	m32r allmodconfig is failing with: In file included from ../include/linux/kvm_para.h:4:0, from ../kernel/watchdog.c:26: ../include/uapi/linux/kvm_para.h:30:26: fatal error: asm/kvm_para.h: No such file or directory kvm_para.h was not included in the build. Signed-off-by: Sudip Mukherjee <sudip@vectorindia.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-12-29	arch/x86/xen/suspend.c: include xen/xen.h	Andrew Morton	1	-0/+1
	Fix the build warning: arch/x86/xen/suspend.c: In function 'xen_arch_pre_suspend': arch/x86/xen/suspend.c:70:9: error: implicit declaration of function 'xen_pv_domain' [-Werror=implicit-function-declaration] if (xen_pv_domain()) ^ Reported-by: kbuild test robot <fengguang.wu@intel.com> Cc: Sasha Levin <sasha.levin@oracle.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: David Vrabel <david.vrabel@citrix.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-12-29	mm: memcontrol: fix possible memcg leak due to interrupted reclaim	Vladimir Davydov	1	-14/+46
	Memory cgroup reclaim can be interrupted with mem_cgroup_iter_break() once enough pages have been reclaimed, in which case, in contrast to a full round-trip over a cgroup sub-tree, the current position stored in mem_cgroup_reclaim_iter of the target cgroup does not get invalidated and so is left holding the reference to the last scanned cgroup. If the target cgroup does not get scanned again (we might have just reclaimed the last page or all processes might exit and free their memory voluntary), we will leak it, because there is nobody to put the reference held by the iterator. The problem is easy to reproduce by running the following command sequence in a loop: mkdir /sys/fs/cgroup/memory/test echo 100M > /sys/fs/cgroup/memory/test/memory.limit_in_bytes echo $$ > /sys/fs/cgroup/memory/test/cgroup.procs memhog 150M echo $$ > /sys/fs/cgroup/memory/cgroup.procs rmdir test The cgroups generated by it will never get freed. This patch fixes this issue by making mem_cgroup_iter avoid taking reference to the current position. In order not to hit use-after-free bug while running reclaim in parallel with cgroup deletion, we make use of ->css_released cgroup callback to clear references to the dying cgroup in all reclaim iterators that might refer to it. This callback is called right before scheduling rcu work which will free css, so if we access iter->position from rcu read section, we might be sure it won't go away under us. [hannes@cmpxchg.org: clean up css ref handling] Fixes: 5ac8fb31ad2e ("mm: memcontrol: convert reclaim iterator to simple css refcounting") Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Michal Hocko <mhocko@kernel.org> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: <stable@vger.kernel.org> [3.19+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-12-29	ocfs2: fix BUG when calculate new backup super	Joseph Qi	1	-3/+12
	When resizing, it firstly extends the last gd. Once it should backup super in the gd, it calculates new backup super and update the corresponding value. But it currently doesn't consider the situation that the backup super is already done. And in this case, it still sets the bit in gd bitmap and then decrease from bg_free_bits_count, which leads to a corrupted gd and trigger the BUG in ocfs2_block_group_set_bits: BUG_ON(le16_to_cpu(bg->bg_free_bits_count) < num_bits); So check whether the backup super is done and then do the updates. Signed-off-by: Joseph Qi <joseph.qi@huawei.com> Reviewed-by: Jiufei Xue <xuejiufei@huawei.com> Reviewed-by: Yiwen Jiang <jiangyiwen@huawei.com> Cc: Mark Fasheh <mfasheh@suse.de> Cc: Joel Becker <jlbec@evilplan.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-12-29	net: hns: use to_platform_device()	Geliang Tang	1	-2/+1
	Use to_platform_device() instead of open-coding it. Signed-off-by: Geliang Tang <geliangtang@163.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-29	atm: solos-pci: use to_pci_dev()	Geliang Tang	1	-3/+3
	Use to_pci_dev() instead of open-coding it. Signed-off-by: Geliang Tang <geliangtang@163.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-29	drivers: net: cpsw: fix error return code	Julia Lawall	1	-3/+7
	Propagate the return value of platform_get_irq on failure. A simplified version of the semantic match that finds the two cases where no error code is returned at all is as follows: (http://coccinelle.lip6.fr/) // <smpl> @@ identifier ret; expression e1,e2; @@ ( if ($ret < 0\\|ret != 0$) { ... return ret; } \| ret = 0 ) ... when != ret = e1 when != &ret *if(...) { ... when != ret = e2 when forall return ret; } // </smpl> Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-29	openvswitch: Fix template leak in error cases.	Joe Stringer	1	-2/+4
	Commit 5b48bb8506c5 ("openvswitch: Fix helper reference leak") fixed a reference leak on helper objects, but inadvertently introduced a leak on the ct template. Previously, ct_info.ct->general.use was initialized to 0 by nf_ct_tmpl_alloc() and only incremented when ovs_ct_copy_action() returned successful. If an error occurred while adding the helper or adding the action to the actions buffer, the __ovs_ct_free_action() cleanup would use nf_ct_put() to free the entry; However, this relies on atomic_dec_and_test(ct_info.ct->general.use). This reference must be incremented first, or nf_ct_put() will never free it. Fix the issue by acquiring a reference to the template immediately after allocation. Fixes: cae3a2627520 ("openvswitch: Allow attaching helpers to ct action") Fixes: 5b48bb8506c5 ("openvswitch: Fix helper reference leak") Signed-off-by: Joe Stringer <joe@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-29	bpf: hash: use per-bucket spinlock	tom.leiming@gmail.com	1	-18/+32
	Both htab_map_update_elem() and htab_map_delete_elem() can be called from eBPF program, and they may be in kernel hot path, so it isn't efficient to use a per-hashtable lock in this two helpers. The per-hashtable spinlock is used for protecting bucket's hlist, and per-bucket lock is just enough. This patch converts the per-hashtable lock into per-bucket spinlock, so that contention can be decreased a lot. Signed-off-by: Ming Lei <tom.leiming@gmail.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-29	bpf: hash: move select_bucket() out of htab's spinlock	tom.leiming@gmail.com	1	-4/+2
	The spinlock is just used for protecting the per-bucket hlist, so it isn't needed for selecting bucket. Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Ming Lei <tom.leiming@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-29	bpf: hash: use atomic count	tom.leiming@gmail.com	1	-6/+6
	Preparing for removing global per-hashtable lock, so the counter need to be defined as aotmic_t first. Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Ming Lei <tom.leiming@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-29	[PATCH] arm: fix handling of F_OFD_... in oabi_fcntl64()	Al Viro	1	-36/+37
	Cc: stable@vger.kernel.org # 3.15+ Reviewed-by: Jeff Layton <jeff.layton@primarydata.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>