wireguard-linux - WireGuard for the Linux kernel

Age	Commit message (Collapse)	Author	Files	Lines
2015-01-23	libata: use blk taging	Shaohua Li	4	-17/+48
	libata uses its own tag management which is duplication and the implementation is poor. And if we switch to blk-mq, tag is build-in. It's time to switch to generic taging. The SAS driver has its own tag management, and looks we can't directly map the host controler tag to SATA tag. So I just bypassed the SAS case. I changed the code/variable name for the tag management of libata to make it self contained. Only sas will use it. Later if libsas implements its tag management, the tag management code in libata can be deleted easily. Cc: Jens Axboe <axboe@fb.com> Cc: Christoph Hellwig <hch@infradead.org> Signed-off-by: Shaohua Li <shli@fb.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@fb.com>
2015-01-23	blk-mq: add tag allocation policy	Shaohua Li	5	-17/+39
	This is the blk-mq part to support tag allocation policy. The default allocation policy isn't changed (though it's not a strict FIFO). The new policy is round-robin for libata. But it's a try-best implementation. If multiple tasks are competing, the tags returned will be mixed (which is unavoidable even with !mq, as requests from different tasks can be mixed in queue) Cc: Jens Axboe <axboe@fb.com> Cc: Tejun Heo <tj@kernel.org> Cc: Christoph Hellwig <hch@infradead.org> Signed-off-by: Shaohua Li <shli@fb.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2015-01-23	block: support different tag allocation policy	Shaohua Li	6	-13/+39
	The libata tag allocation is using a round-robin policy. Next patch will make libata use block generic tag allocation, so let's add a policy to tag allocation. Currently two policies: FIFO (default) and round-robin. Cc: Jens Axboe <axboe@fb.com> Cc: Tejun Heo <tj@kernel.org> Cc: Christoph Hellwig <hch@infradead.org> Signed-off-by: Shaohua Li <shli@fb.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2015-01-22	block: Remove annoying "unknown partition table" message	Boaz Harrosh	1	-6/+6
	As Christoph put it: Can we just get rid of the warnings? It's fairly annoying as devices without partitions are perfectly fine and very useful. Me too I see this message every VM boot for ages on all my devices. Would love to just remove it. For me a partition-table is only needed for a booting BIOS, grub, and stuff. CC: Christoph Hellwig <hch@infradead.org> Signed-off-by: Boaz Harrosh <boaz@plexistor.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2015-01-21	NVMe: within nvme_free_queues(), delete RCU sychro/deferred free	kaoudis	1	-8/+1
	Converting from to blk-queue got rid of the driver's RCU locking-on-queue, so removing unnecessary RCU locking-on-queue artefacts. Reviewed-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Kelly Nicole Kaoudis <kaoudis@colorado.edu> Signed-off-by: Jens Axboe <axboe@fb.com>
2015-01-21	block: Add discard flag to blkdev_issue_zeroout() function	Martin K. Petersen	4	-8/+30
	blkdev_issue_discard() will zero a given block range. This is done by way of explicit writing, thus provisioning or allocating the blocks on disk. There are use cases where the desired behavior is to zero the blocks but unprovision them if possible. The blocks must deterministically contain zeroes when they are subsequently read back. This patch adds a flag to blkdev_issue_zeroout() that provides this variant. If the discard flag is set and a block device guarantees discard_zeroes_data we will use REQ_DISCARD to clear the block range. If the device does not support discard_zeroes_data or if the discard request fails we will fall back to first REQ_WRITE_SAME and then a regular REQ_WRITE. Also update the callers of blkdev_issue_zero() to reflect the new flag and make sb_issue_zeroout() prefer the discard approach. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
2015-01-21	cfq-iosched: fix incorrect filing of rt async cfqq	Jeff Moyer	1	-2/+7
	Hi, If you can manage to submit an async write as the first async I/O from the context of a process with realtime scheduling priority, then a cfq_queue is allocated, but filed into the wrong async_cfqq bucket. It ends up in the best effort array, but actually has realtime I/O scheduling priority set in cfqq->ioprio. The reason is that cfq_get_queue assumes the default scheduling class and priority when there is no information present (i.e. when the async cfqq is created): static struct cfq_queue * cfq_get_queue(struct cfq_data cfqd, bool is_sync, struct cfq_io_cq cic, struct bio bio, gfp_t gfp_mask) { const int ioprio_class = IOPRIO_PRIO_CLASS(cic->ioprio); const int ioprio = IOPRIO_PRIO_DATA(cic->ioprio); cic->ioprio starts out as 0, which is "invalid". So, class of 0 (IOPRIO_CLASS_NONE) is passed to cfq_async_queue_prio like so: async_cfqq = cfq_async_queue_prio(cfqd, ioprio_class, ioprio); static struct cfq_queue * cfq_async_queue_prio(struct cfq_data cfqd, int ioprio_class, int ioprio) { switch (ioprio_class) { case IOPRIO_CLASS_RT: return &cfqd->async_cfqq[0][ioprio]; case IOPRIO_CLASS_NONE: ioprio = IOPRIO_NORM; / fall through / case IOPRIO_CLASS_BE: return &cfqd->async_cfqq[1][ioprio]; case IOPRIO_CLASS_IDLE: return &cfqd->async_idle_cfqq; default: BUG(); } } Here, instead of returning a class mapped from the process' scheduling priority, we get back the bucket associated with IOPRIO_CLASS_BE. Now, there is no queue allocated there yet, so we create it: cfqq = cfq_find_alloc_queue(cfqd, is_sync, cic, bio, gfp_mask); That function ends up doing this: cfq_init_cfqq(cfqd, cfqq, current->pid, is_sync); cfq_init_prio_data(cfqq, cic); cfq_init_cfqq marks the priority as having changed. Then, cfq_init_prio data does this: ioprio_class = IOPRIO_PRIO_CLASS(cic->ioprio); switch (ioprio_class) { default: printk(KERN_ERR "cfq: bad prio %x\n", ioprio_class); case IOPRIO_CLASS_NONE: / * no prio set, inherit CPU scheduling settings */ cfqq->ioprio = task_nice_ioprio(tsk); cfqq->ioprio_class = task_nice_ioclass(tsk); break; So we basically have two code paths that treat IOPRIO_CLASS_NONE differently, which results in an RT async cfqq filed into a best effort bucket. Attached is a patch which fixes the problem. I'm not sure how to make it cleaner. Suggestions would be welcome. Signed-off-by: Jeff Moyer <jmoyer@redhat.com> Tested-by: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com> Cc: stable@kernel.org Signed-off-by: Jens Axboe <axboe@fb.com>
2015-01-16	null_blk: suppress invalid partition info	Jens Axboe	1	-1/+1
	null_blk is partitionable, but it doesn't store any of the info. When it is loaded, you would normally see: [1226739.343608] nullb0: unknown partition table [1226739.343746] nullb1: unknown partition table which can confuse some people. Add the appropriate gendisk flag to suppress this info. Signed-off-by: Jens Axboe <axboe@fb.com>
2015-01-14	blk-mq: fix false negative out-of-tags condition	Jens Axboe	1	-17/+26
	The blk-mq tagging tries to maintain some locality between CPUs and the tags issued. The tags are split into groups of words, and the words may not be fully populated. When searching for a new free tag, blk-mq may look at partial words, hence it passes in an offset/size to find_next_zero_bit(). However, it does that wrong, the size must always be the full length of the number of tags in that word, otherwise we'll potentially miss some near the end. Another issue is when __bt_get() goes from one word set to the next. It bumps the index, but not the last_tag associated with the previous index. Bump that to be in the range of the new word. Finally, clean up __bt_get() and __bt_get_word() a bit and get rid of the goto in there, and the unnecessary 'wrap' variable. Signed-off-by: Jens Axboe <axboe@fb.com>
2015-01-13	brd: Request from fdisk 4k alignment	Boaz Harrosh	1	-0/+9
	Because of the direct_access() API which returns a PFN. partitions better start on 4K boundary, else offset ZERO of a partition will not be aligned and blk_direct_access() will fail the call. By setting blk_queue_physical_block_size(PAGE_SIZE) we can communicate this to fdisk and friends. The call to blk_queue_physical_block_size() is harmless and will not affect the Kernel behavior in any way. It is only for communication to user-mode. before this patch running fdisk on a default size brd of 4M the first sector offered is 34 (BAD), but after this patch it will be 40, ie 8 sectors aligned. Also when entering some random partition sizes the next partition-start sector is offered 8 sectors aligned after this patch. (Please note that with fdisk the user can still enter bad values, only the offered default values will be correct) Note that with bdev-size > 4M fdisk will try to align on a 1M boundary (above first-sector will be 2048), in any case. CC: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Boaz Harrosh <boaz@plexistor.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2015-01-13	brd: Fix all partitions BUGs	Boaz Harrosh	1	-62/+38
	This patch fixes up brd's partitions scheme, now enjoying all worlds. The MAIN fix here is that currently, if one fdisks some partitions, a BAD bug will make all partitions point to the same start-end sector ie: 0 - brd_size And an mkfs of any partition would trash the partition table and the other partition. Another fix is that "mount -U uuid" will not work if show_part was not specified, because of the GENHD_FL_SUPPRESS_PARTITION_INFO flag. We now always load without it and remove the show_part parameter. [We remove Dmitry's new module-param part_show it is now always show] So NOW the logic goes like this: * max_part - Just says how many minors to reserve between ramX devices. In any way, there can be as many partition as requested. If minors between devices ends, then dynamic 259-major ids will be allocated on the fly. The default is now max_part=1, which means all partitions devt(s) will be from the dynamic (259) major-range. (If persistent partition minors is needed use max_part=X) For example with /dev/sdX max_part is hard coded 16. * Creation of new devices on the fly still/always work: mknod /path/devnod b 1 X fdisk -l /path/devnod Will create a new device if [X / max_part] was not already created before. (Just as before) partitions on the dynamically created device will work as well Same logic applies with minors as with the pre-created ones. TODO: dynamic grow of device size. So each device can have it's own size. CC: Dmitry Monakhov <dmonakhov@openvz.org> Tested-by: Ross Zwisler <ross.zwisler@linux.intel.com> Signed-off-by: Boaz Harrosh <boaz@plexistor.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2015-01-13	block: Change direct_access calling convention	Matthew Wilcox	7	-58/+86
	In order to support accesses to larger chunks of memory, pass in a 'size' parameter (counted in bytes), and return the amount available at that address. Add a new helper function, bdev_direct_access(), to handle common functionality including partition handling, checking the length requested is positive, checking for the sector being page-aligned, and checking the length of the request does not pass the end of the partition. Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Boaz Harrosh <boaz@plexistor.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2015-01-13	axonram: Fix bug in direct_access	Matthew Wilcox	1	-1/+1
	The 'pfn' returned by axonram was completely bogus, and has been since 2008. Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: stable@vger.kernel.org Signed-off-by: Jens Axboe <axboe@fb.com>
2015-01-02	loop: add blk-mq.h include	Jens Axboe	2	-1/+1
	Looks like we pull it in through other ways on x86, but we fail on sparc: In file included from drivers/block/cryptoloop.c:30:0: drivers/block/loop.h:63:24: error: field 'tag_set' has incomplete type struct blk_mq_tag_set tag_set; Add the include to loop.h, kill it from loop.c. Signed-off-by: Jens Axboe <axboe@fb.com>
2015-01-02	block: loop: don't handle REQ_FUA explicitly	Ming Lei	1	-11/+3
	block core handles REQ_FUA by its flush state machine, so won't do it in loop explicitly. Signed-off-by: Ming Lei <ming.lei@canonical.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2015-01-02	block: loop: introduce lo_discard() and lo_req_flush()	Ming Lei	1	-33/+40
	No behaviour change, just move the handling for REQ_DISCARD and REQ_FLUSH in these two functions. Signed-off-by: Ming Lei <ming.lei@canonical.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2015-01-02	block: loop: say goodby to bio	Ming Lei	1	-25/+20
	Switch to block request completely. Signed-off-by: Ming Lei <ming.lei@canonical.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2015-01-02	block: loop: improve performance via blk-mq	Ming Lei	2	-170/+174
	The conversion is a bit straightforward, and use work queue to dispatch requests of loop block, and one big change is that requests is submitted to backend file/device concurrently with work queue, so throughput may get improved much. Given write requests over same file are often run exclusively, so don't handle them concurrently for avoiding extra context switch cost, possible lock contention and work schedule cost. Also with blk-mq, there is opportunity to get loop I/O merged before submitting to backend file/device. In the following test: - base: v3.19-rc2-2041231 - loop over file in ext4 file system on SSD disk - bs: 4k, libaio, io depth: 64, O_DIRECT, num of jobs: 1 - throughput: IOPS ------------------------------------------------------ \| \| base \| base with loop-mq \| delta \| ------------------------------------------------------ \| randread \| 1740 \| 25318 \| +1355%\| ------------------------------------------------------ \| read \| 42196 \| 51771 \| +22.6%\| ----------------------------------------------------- \| randwrite \| 35709 \| 34624 \| -3% \| ----------------------------------------------------- \| write \| 39137 \| 40326 \| +3% \| ----------------------------------------------------- So loop-mq can improve throughput for both read and randread, meantime, performance of write and randwrite isn't hurted basically. Another benefit is that loop driver code gets simplified much after blk-mq conversion, and the patch can be thought as cleanup too. Signed-off-by: Ming Lei <ming.lei@canonical.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2015-01-02	blk-mq: export blk_mq_freeze_queue()	Jens Axboe	2	-0/+2
	Commit b4c6a028774b exported the start and unfreeze, but we need the regular blk_mq_freeze_queue() for the loop conversion. Signed-off-by: Jens Axboe <axboe@fb.com>
2015-01-02	block: fix checking return value of blk_mq_init_queue	Ming Lei	3	-3/+3
	Check IS_ERR_OR_NULL(return value) instead of just return value. Signed-off-by: Ming Lei <ming.lei@canonical.com> Reduced to IS_ERR() by me, we never return NULL. Signed-off-by: Jens Axboe <axboe@fb.com>
2014-12-31	block: wake up waiters when a queue is marked dying	Jens Axboe	5	-5/+42
	If it's dying, we can't expect new request to complete and come in an wake up other tasks waiting for requests. So after we have marked it as dying, wake up everybody currently waiting for a request. Once they wake, they will retry their allocation and fail appropriately due to the state of the queue. Tested-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2014-12-28	Linux 3.19-rc2	Linus Torvalds	1	-1/+1

2014-12-28	kvm: warn on more invariant breakage	Paolo Bonzini	1	-1/+3
	Modifying a non-existent slot is not allowed. Also check that the first loop doesn't move a deleted slot beyond the used part of the mslots array. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-12-28	kvm: fix sorting of memslots with base_gfn == 0	Paolo Bonzini	1	-5/+17
	Before commit 0e60b0799fed (kvm: change memslot sorting rule from size to GFN, 2014-12-01), the memslots' sorting key was npages, meaning that a valid memslot couldn't have its sorting key equal to zero. On the other hand, a valid memslot can have base_gfn == 0, and invalid memslots are identified by base_gfn == npages == 0. Because of this, commit 0e60b0799fed broke the invariant that invalid memslots are at the end of the mslots array. When a memslot with base_gfn == 0 was created, any invalid memslot before it were left in place. This can be fixed by changing the insertion to use a ">=" comparison instead of "<=", but some care is needed to avoid breaking the case of deleting a memslot; see the comment in update_memslots. Thanks to Tiejun Chen for posting an initial patch for this bug. Reported-by: Jamie Heilman <jamie@audible.transient.net> Reported-by: Andy Lutomirski <luto@amacapital.net> Tested-by: Jamie Heilman <jamie@audible.transient.net> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-12-27	kvm: x86: drop severity of "generation wraparound" message	Paolo Bonzini	1	-1/+1
	Since most virtual machines raise this message once, it is a bit annoying. Make it KERN_DEBUG severity. Cc: stable@vger.kernel.org Fixes: 7a2e8aaf0f6873b47bc2347f216ea5b0e4c258ab Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-12-27	kvm: x86: vmx: reorder some msr writing	Tiejun Chen	1	-44/+44
	The commit 34a1cd60d17f, "x86: vmx: move some vmx setting from vmx_init() to hardware_setup()", tried to refactor some codes specific to vmx hardware setting into hardware_setup(), but some msr writing should depend on our previous setting condition like enable_apicv, enable_ept and so on. Reported-by: Jamie Heilman <jamie@audible.transient.net> Tested-by: Jamie Heilman <jamie@audible.transient.net> Signed-off-by: Tiejun Chen <tiejun.chen@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-12-26	[regression] braino in "lustre: use is_root_inode()"	Al Viro	1	-1/+1
	In one of the places (ll_md_blocking_ast()) we had open-coded !is_root_inode(inode) and replaced it with is_root_inode(inode). See the last chunk of f76c23: - inode != inode->i_sb->s_root->d_inode) + is_root_inode(inode)) should've been + !is_root_inode(inode)) obviously... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-12-26	parisc: fix out-of-register compiler error in ldcw inline assembler function	John David Anglin	1	-3/+10
	The __ldcw macro has a problem when its argument needs to be reloaded from memory. The output memory operand and the input register operand both need to be reloaded using a register in class R1_REGS when generating 64-bit code. This fails because there's only a single register in the class. Instead, use a memory clobber. This also makes the __ldcw macro a compiler memory barrier. Signed-off-by: John David Anglin <dave.anglin@bell.net> Cc: <stable@vger.kernel.org> [3.13+] Signed-off-by: Helge Deller <deller@gmx.de>
2014-12-26	ALSA: hda_intel: apply the Seperate stream_tag for Skylake	Libin Yang	1	-1/+4
	The total stream number of Skylake's input and output stream exceeds 15, which will cause some streams do not work because of the overflow on SDxCTL.STRM field if using the legacy stream tag allocation method. This patch uses the new stream tag allocation method by add the flag AZX_DCAPS_SEPARATE_STREAM_TAG for Skylake platform. Signed-off-by: Libin Yang <libin.yang@intel.com> Reviewed-by: Vinod Koul <vinod.koul@intel.com> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2014-12-26	ALSA: hda_controller: Separate stream_tag for input and output streams.	Rafal Redzimski	2	-2/+23
	Implemented separate stream_tag assignment for input and output streams. According to hda specification stream tag must be unique throughout the input streams group, however an output stream might use a stream tag which is already in use by an input stream. This change is necessary to support HW which provides a total of more than 15 stream DMA engines which with legacy implementation causes an overflow on SDxCTL.STRM field (and the whole SDxCTL register) and as a result usage of Reserved value 0 in the SDxCTL.STRM field which confuses HDA controller. Signed-off-by: Rafal Redzimski <rafal.f.redzimski@intel.com> Signed-off-by: Jayachandran B <jayachandran.b@intel.com> Signed-off-by: Libin Yang <libin.yang@intel.com> Reviewed-by: Vinod Koul <vinod.koul@intel.com> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2014-12-24	Revert "drm/gem: Warn on illegal use of the dumb buffer interface v2"	Dave Airlie	9	-74/+12
	This reverts commit 355a70183848f21198e9f6296bd646df3478a26d. This had some bad side effects under normal operation, and should have been dropped earlier. Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-12-23	audit: restore AUDIT_LOGINUID unset ABI	Richard Guy Briggs	2	-0/+14
	A regression was caused by commit 780a7654cee8: audit: Make testing for a valid loginuid explicit. (which in turn attempted to fix a regression caused by e1760bd) When audit_krule_to_data() fills in the rules to get a listing, there was a missing clause to convert back from AUDIT_LOGINUID_SET to AUDIT_LOGINUID. This broke userspace by not returning the same information that was sent and expected. The rule: auditctl -a exit,never -F auid=-1 gives: auditctl -l LIST_RULES: exit,never f24=0 syscall=all when it should give: LIST_RULES: exit,never auid=-1 (0xffffffff) syscall=all Tag it so that it is reported the same way it was set. Create a new private flags audit_krule field (pflags) to store it that won't interact with the public one from the API. Cc: stable@vger.kernel.org # v3.10-rc1+ Signed-off-by: Richard Guy Briggs <rgb@redhat.com> Signed-off-by: Paul Moore <pmoore@redhat.com>
2014-12-23	arm64: mm: Add pgd_page to support RCU fast_gup	Jungseok Lee	1	-2/+3
	This patch adds pgd_page definition in order to keep supporting HAVE_GENERIC_RCU_GUP configuration. In addition, it changes pud_page expression to align with pmd_page for readability. An introduction of pgd_page resolves the following build breakage under 4KB + 4Level memory management combo. mm/gup.c: In function 'gup_huge_pgd': mm/gup.c:889:2: error: implicit declaration of function 'pgd_page' [-Werror=implicit-function-declaration] head = pgd_page(orig); ^ mm/gup.c:889:7: warning: assignment makes pointer from integer without a cast head = pgd_page(orig); Cc: Will Deacon <will.deacon@arm.com> Cc: Steve Capper <steve.capper@linaro.org> Signed-off-by: Jungseok Lee <jungseoklee85@gmail.com> [catalin.marinas@arm.com: remove duplicate pmd_page definition] Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2014-12-23	arm64: defconfig: defconfig update for 3.19	Will Deacon	1	-4/+5
	The usual defconfig tweaks, this time: - FHANDLE and AUTOFS4_FS to keep systemd happy - PID_NS, QUOTA and KEYS to keep LTP happy - Disable DEBUG_PREEMPT, as this really hurts performance Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2014-12-23	arm64: kernel: fix __cpu_suspend mm switch on warm-boot	Lorenzo Pieralisi	1	-1/+13
	On arm64 the TTBR0_EL1 register is set to either the reserved TTBR0 page tables on boot or to the active_mm mappings belonging to user space processes, it must never be set to swapper_pg_dir page tables mappings. When a CPU is booted its active_mm is set to init_mm even though its TTBR0_EL1 points at the reserved TTBR0 page mappings. This implies that when __cpu_suspend is triggered the active_mm can point at init_mm even if the current TTBR0_EL1 register contains the reserved TTBR0_EL1 mappings. Therefore, the mm save and restore executed in __cpu_suspend might turn out to be erroneous in that, if the current->active_mm corresponds to init_mm, on resume from low power it ends up restoring in the TTBR0_EL1 the init_mm mappings that are global and can cause speculation of TLB entries which end up being propagated to user space. This patch fixes the issue by checking the active_mm pointer before restoring the TTBR0 mappings. If the current active_mm == &init_mm, the code sets the TTBR0_EL1 to the reserved TTBR0 mapping instead of switching back to the active_mm, which is the expected behaviour corresponding to the TTBR0_EL1 settings when __cpu_suspend was entered. Fixes: 95322526ef62 ("arm64: kernel: cpu_{suspend/resume} implementation") Cc: <stable@vger.kernel.org> # 3.14+: 18ab7db Cc: <stable@vger.kernel.org> # 3.14+: 714f599 Cc: <stable@vger.kernel.org> # 3.14+: c3684fb Cc: <stable@vger.kernel.org> # 3.14+ Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2014-12-23	agp: Fix up email address & attributions in AGP MODULE_AUTHOR tags	Dave Jones	8	-8/+8
	- Remove soon-to-be-dead @redhat address. - Jeff Hartmann wrote the bulk of the original backend code, and should at least get a mention in the MODULE_AUTHOR for backend.o - Various people at Intel have done a lot more work than myself on the intel-* drivers, so again, mention that. Signed-off-by: Dave Jones <davej@redhat.com> Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-12-22	Revert "mm/memory.c: share the i_mmap_rwsem"	Kirill A. Shutemov	1	-2/+2
	This reverts commit c8475d144abb1e62958cc5ec281d2a9e161c1946. There are several[1][2] of bug reports which points to this commit as potential cause[3]. Let's revert it until we figure out what's going on. [1] https://lkml.org/lkml/2014/11/14/342 [2] https://lkml.org/lkml/2014/12/22/213 [3] https://lkml.org/lkml/2014/12/9/741 Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Reported-by: Sasha Levin <sasha.levin@oracle.com> Acked-by: Davidlohr Bueso <dave@stgolabs.net> Cc: Hugh Dickins <hughd@google.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: Mel Gorman <mgorman@suse.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-12-23	nouveau: bring back legacy mmap handler	Dave Airlie	1	-1/+2
	nouveau userspace back at 1.0.1 used to call the X server DRIOpenDRMMaster interface even for DRI2 (doh!), this attempts to map the sarea and fails if it can't. Since 884c6dabb0eafe7227f099c9e78e514191efaf13 from Daniel, this fails, but only ancient drivers would see it. Revert the nouveau bits of that fix. Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: <stable@vger.kernel.org> # 3.18 Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-12-22	NVMe: Fix double free irq	Keith Busch	1	-5/+12
	Sets the vector to an invalid value after it's freed so we don't free it twice. Signed-off-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2014-12-22	audit: correctly record file names with different path name types	Paul Moore	1	-4/+10
	There is a problem with the audit system when multiple audit records are created for the same path, each with a different path name type. The root cause of the problem is in __audit_inode() when an exact match (both the path name and path name type) is not found for a path name record; the existing code creates a new path name record, but it never sets the path name in this record, leaving it NULL. This patch corrects this problem by assigning the path name to these newly created records. There are many ways to reproduce this problem, but one of the easiest is the following (assuming auditd is running): # mkdir /root/tmp/test # touch /root/tmp/test/567 # auditctl -a always,exit -F dir=/root/tmp/test # touch /root/tmp/test/567 Afterwards, or while the commands above are running, check the audit log and pay special attention to the PATH records. A faulty kernel will display something like the following for the file creation: type=SYSCALL msg=audit(1416957442.025:93): arch=c000003e syscall=2 success=yes exit=3 ... comm="touch" exe="/usr/bin/touch" type=CWD msg=audit(1416957442.025:93): cwd="/root/tmp" type=PATH msg=audit(1416957442.025:93): item=0 name="test/" inode=401409 ... nametype=PARENT type=PATH msg=audit(1416957442.025:93): item=1 name=(null) inode=393804 ... nametype=NORMAL type=PATH msg=audit(1416957442.025:93): item=2 name=(null) inode=393804 ... nametype=NORMAL While a patched kernel will show the following: type=SYSCALL msg=audit(1416955786.566:89): arch=c000003e syscall=2 success=yes exit=3 ... comm="touch" exe="/usr/bin/touch" type=CWD msg=audit(1416955786.566:89): cwd="/root/tmp" type=PATH msg=audit(1416955786.566:89): item=0 name="test/" inode=401409 ... nametype=PARENT type=PATH msg=audit(1416955786.566:89): item=1 name="test/567" inode=393804 ... nametype=NORMAL This issue was brought up by a number of people, but special credit should go to hujianyang@huawei.com for reporting the problem along with an explanation of the problem and a patch. While the original patch did have some problems (see the archive link below), it did demonstrate the problem and helped kickstart the fix presented here. * https://lkml.org/lkml/2014/9/5/66 Reported-by: hujianyang <hujianyang@huawei.com> Signed-off-by: Paul Moore <pmoore@redhat.com> Acked-by: Richard Guy Briggs <rgb@redhat.com>
2014-12-22	arm64: Replace set_arch_dma_coherent_ops with arch_setup_dma_ops	Catalin Marinas	1	-5/+6
	Commit a3a60f81ee6f (dma-mapping: replace set_arch_dma_coherent_ops with arch_setup_dma_ops) changes the of_dma_configure() arch dma_ops callback to arch_setup_dma_ops but only the arch/arm code is updated. Subsequent commit 97890ba9289c (dma-mapping: detect and configure IOMMU in of_dma_configure) changes the arch_setup_dma_ops() prototype further to handle iommu. The patch makes the corresponding arm64 changes. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Reported-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Will Deacon <will.deacon@arm.com>
2014-12-21	ipmi: Fix compile issue with isspace()	Corey Minyard	1	-0/+1
	Some arches don't get ctypes.h included from these includes, so add it explicitly. Signed-off-by: Corey Minyard <cminyard@mvista.com>
2014-12-21	ipmi: Finish cleanup of BMC attributes	Corey Minyard	1	-29/+17
	The previous cleanup of BMC attributes left a few holes, and if you run with lockdep debugging with a BMC with the proper attributes, you could get a warning. This patch removes all the unused attributes from the BMC structure, since they are all declared in the .data section now. It makes the attributes all static. It fixes the referencing of the attributes in a couple of cases that dynamically added the files depending on BMC information. Signed-off-by: Corey Minyard <cminyard@mvista.com> Cc: Huang Ying <ying.huang@intel.com> Tested-by: Alexei Starovoitov <ast@plumgrid.com>
2014-12-20	Linux 3.19-rc1	Linus Torvalds	1	-2/+2

2014-12-20	blk-mq: Export freeze/unfreeze functions	Keith Busch	2	-2/+6
	Let drivers prevent entering a queue that isn't available. Signed-off-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2014-12-20	blk-mq: Exit queue on alloc failure	Keith Busch	1	-1/+3
	Fixes usage counter when a request could not be allocated. Signed-off-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2014-12-20	i2c: sh_mobile: fix uninitialized var when debug is enabled	Wolfram Sang	1	-0/+1
	Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2014-12-19	audit: use supplied gfp_mask from audit_buffer in kauditd_send_multicast_skb	Richard Guy Briggs	1	-4/+4
	Eric Paris explains: Since kauditd_send_multicast_skb() gets called in audit_log_end(), which can come from any context (aka even a sleeping context) GFP_KERNEL can't be used. Since the audit_buffer knows what context it should use, pass that down and use that. See: https://lkml.org/lkml/2014/12/16/542 BUG: sleeping function called from invalid context at mm/slab.c:2849 in_atomic(): 1, irqs_disabled(): 0, pid: 885, name: sulogin 2 locks held by sulogin/885: #0: (&sig->cred_guard_mutex){+.+.+.}, at: [<ffffffff91152e30>] prepare_bprm_creds+0x28/0x8b #1: (tty_files_lock){+.+.+.}, at: [<ffffffff9123e787>] selinux_bprm_committing_creds+0x55/0x22b CPU: 1 PID: 885 Comm: sulogin Not tainted 3.18.0-next-20141216 #30 Hardware name: Dell Inc. Latitude E6530/07Y85M, BIOS A15 06/20/2014 ffff880223744f10 ffff88022410f9b8 ffffffff916ba529 0000000000000375 ffff880223744f10 ffff88022410f9e8 ffffffff91063185 0000000000000006 0000000000000000 0000000000000000 0000000000000000 ffff88022410fa38 Call Trace: [<ffffffff916ba529>] dump_stack+0x50/0xa8 [<ffffffff91063185>] ___might_sleep+0x1b6/0x1be [<ffffffff910632a6>] __might_sleep+0x119/0x128 [<ffffffff91140720>] cache_alloc_debugcheck_before.isra.45+0x1d/0x1f [<ffffffff91141d81>] kmem_cache_alloc+0x43/0x1c9 [<ffffffff914e148d>] __alloc_skb+0x42/0x1a3 [<ffffffff914e2b62>] skb_copy+0x3e/0xa3 [<ffffffff910c263e>] audit_log_end+0x83/0x100 [<ffffffff9123b8d3>] ? avc_audit_pre_callback+0x103/0x103 [<ffffffff91252a73>] common_lsm_audit+0x441/0x450 [<ffffffff9123c163>] slow_avc_audit+0x63/0x67 [<ffffffff9123c42c>] avc_has_perm+0xca/0xe3 [<ffffffff9123dc2d>] inode_has_perm+0x5a/0x65 [<ffffffff9123e7ca>] selinux_bprm_committing_creds+0x98/0x22b [<ffffffff91239e64>] security_bprm_committing_creds+0xe/0x10 [<ffffffff911515e6>] install_exec_creds+0xe/0x79 [<ffffffff911974cf>] load_elf_binary+0xe36/0x10d7 [<ffffffff9115198e>] search_binary_handler+0x81/0x18c [<ffffffff91153376>] do_execveat_common.isra.31+0x4e3/0x7b7 [<ffffffff91153669>] do_execve+0x1f/0x21 [<ffffffff91153967>] SyS_execve+0x25/0x29 [<ffffffff916c61a9>] stub_execve+0x69/0xa0 Cc: stable@vger.kernel.org #v3.16-rc1 Reported-by: Valdis Kletnieks <Valdis.Kletnieks@vt.edu> Signed-off-by: Richard Guy Briggs <rgb@redhat.com> Tested-by: Valdis Kletnieks <Valdis.Kletnieks@vt.edu> Signed-off-by: Paul Moore <pmoore@redhat.com>
2014-12-19	audit: don't attempt to lookup PIDs when changing PID filtering audit rules	Paul Moore	1	-13/+0
	Commit f1dc4867 ("audit: anchor all pid references in the initial pid namespace") introduced a find_vpid() call when adding/removing audit rules with PID/PPID filters; unfortunately this is problematic as find_vpid() only works if there is a task with the associated PID alive on the system. The following commands demonstrate a simple reproducer. # auditctl -D # auditctl -l # autrace /bin/true # auditctl -l This patch resolves the problem by simply using the PID provided by the user without any additional validation, e.g. no calls to check to see if the task/PID exists. Cc: stable@vger.kernel.org # 3.15 Cc: Richard Guy Briggs <rgb@redhat.com> Signed-off-by: Paul Moore <pmoore@redhat.com> Acked-by: Eric Paris <eparis@redhat.com> Reviewed-by: Richard Guy Briggs <rgb@redhat.com>
2014-12-20	CRISv32: Remove last remnants of ETRAX_SPI_MMC_BOARD	Jesper Nilsson	1	-7/+0
	There are no users of this symbol left. Reported-by: Paul Bolle <pebolle@tiscali.nl> Signed-off-by: Jesper Nilsson <jesper.nilsson@axis.com>