linux-dev - Linux kernel development work

Age	Commit message (Collapse)	Author	Files	Lines
2022-08-09	fscache: add tracepoint when failing cookie	Jeff Layton	2	-0/+4
	Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: David Howells <dhowells@redhat.com>
2022-08-09	fscache: don't leak cookie access refs if invalidation is in progress or failed	Jeff Layton	1	-2/+5
	It's possible for a request to invalidate a fscache_cookie will come in while we're already processing an invalidation. If that happens we currently take an extra access reference that will leak. Only call __fscache_begin_cookie_access if the FSCACHE_COOKIE_DO_INVALIDATE bit was previously clear. Also, ensure that we attempt to clear the bit when the cookie is "FAILED" and put the reference to avoid an access leak. Fixes: 85e4ea1049c7 ("fscache: Fix invalidation/lookup race") Suggested-by: David Howells <dhowells@redhat.com> Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: David Howells <dhowells@redhat.com>
2022-08-08	fix copy_page_from_iter() for compound destinations	Al Viro	1	-4/+18
	had been broken for ITER_BVEC et.al. since ever (OK, v3.17 when ITER_BVEC had first appeared)... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-08-08	hugetlbfs: copy_page_to_iter() can deal with compound pages	Al Viro	1	-30/+1
	... since April 2021 Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-08-08	copy_page_to_iter(): don't split high-order page in case of ITER_PIPE	Al Viro	1	-15/+6
	... just shove it into one pipe_buffer. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-08-08	expand those iov_iter_advance()...	Al Viro	1	-2/+9
	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-08-08	pipe_get_pages(): switch to append_pipe()	Al Viro	1	-29/+6
	now that we are advancing the iterator, there's no need to treat the first page separately - just call append_pipe() in a loop. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-08-08	get rid of non-advancing variants	Al Viro	2	-31/+20
	mechanical change; will be further massaged in subsequent commits Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-08-08	ceph: switch the last caller of iov_iter_get_pages_alloc()	Al Viro	1	-1/+1
	here nothing even looks at the iov_iter after the call, so we couldn't care less whether it advances or not. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-08-08	9p: convert to advancing variant of iov_iter_get_pages_alloc()	Al Viro	3	-19/+26
	that one is somewhat clumsier than usual and needs serious testing. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-08-08	af_alg_make_sg(): switch to advancing variant of iov_iter_get_pages()	Al Viro	2	-4/+4
	... and adjust the callers Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-08-08	iter_to_pipe(): switch to advancing variant of iov_iter_get_pages()	Al Viro	1	-23/+24
	... and untangle the cleanup on failure to add into pipe. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-08-08	block: convert to advancing variants of iov_iter_get_pages{,_alloc}()	Al Viro	2	-14/+18
	... doing revert if we end up not using some pages Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-08-08	iov_iter: advancing variants of iov_iter_get_pages{,_alloc}()	Al Viro	13	-30/+34
	Most of the users immediately follow successful iov_iter_get_pages() with advancing by the amount it had returned. Provide inline wrappers doing that, convert trivial open-coded uses of those. BTW, iov_iter_get_pages() never returns more than it had been asked to; such checks in cifs ought to be removed someday... Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-08-08	iov_iter: saner helper for page array allocation	Al Viro	1	-45/+32
	All call sites of get_pages_array() are essenitally identical now. Replace with common helper... Returns number of slots available in resulting array or 0 on OOM; it's up to the caller to make sure it doesn't ask to zero-entry array (i.e. neither maxpages nor size are allowed to be zero). Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-08-08	fold __pipe_get_pages() into pipe_get_pages()	Al Viro	1	-37/+38
	... and don't mangle maxsize there - turn the loop into counting one instead. Easier to see that we won't run out of array that way. Note that special treatment of the partial buffer in that thing is an artifact of the non-advancing semantics of iov_iter_get_pages() - if not for that, it would be append_pipe(), same as the body of the loop that follows it. IOW, once we make iov_iter_get_pages() advancing, the whole thing will turn into calculate how many pages do we want allocate an array (if needed) call append_pipe() that many times. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-08-08	ITER_XARRAY: don't open-code DIV_ROUND_UP()	Al Viro	1	-9/+1
	Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-08-08	unify the rest of iov_iter_get_pages()/iov_iter_get_pages_alloc() guts	Al Viro	1	-59/+27
	same as for pipes and xarrays; after that iov_iter_get_pages() becomes a wrapper for __iov_iter_get_pages_alloc(). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-08-08	unify xarray_get_pages() and xarray_get_pages_alloc()	Al Viro	1	-39/+10
	same as for pipes Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-08-08	unify pipe_get_pages() and pipe_get_pages_alloc()	Al Viro	1	-32/+17
	The differences between those two are * pipe_get_pages() gets a non-NULL struct page ** value pointing to preallocated array + array size. * pipe_get_pages_alloc() gets an address of struct page variable that contains NULL, allocates the array and (on success) stores its address in that variable. Not hard to combine - always pass struct page *, have the previous pipe_get_pages_alloc() caller pass ~0U as cap for array size. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-08-08	iov_iter_get_pages(): sanity-check arguments	Al Viro	1	-7/+2
	zero maxpages is bogus, but best treated as "just return 0"; NULL pages, OTOH, should be treated as a hard bug. get rid of now completely useless checks in xarray_get_pages{,_alloc}(). Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-08-08	iov_iter_get_pages_alloc(): lift freeing pages array on failure exits into wrapper	Al Viro	1	-16/+22
	Incidentally, ITER_XARRAY did not free the sucker in case when iter_xarray_populate_pages() returned 0... Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-08-08	ITER_PIPE: fold data_start() and pipe_space_for_user() together	Al Viro	2	-45/+19
	All their callers are next to each other; all of them want the total amount of pages and, possibly, the offset in the partial final buffer. Combine into a new helper (pipe_npages()), fix the bogosity in pipe_space_for_user(), while we are at it. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-08-08	ITER_PIPE: cache the type of last buffer	Al Viro	2	-40/+42
	We often need to find whether the last buffer is anon or not, and currently it's rather clumsy: check if ->iov_offset is non-zero (i.e. that pipe is not empty) if so, get the corresponding pipe_buffer and check its ->ops if it's &default_pipe_buf_ops, we have an anon buffer. Let's replace the use of ->iov_offset (which is nowhere near similar to its role for other flavours) with signed field (->last_offset), with the following rules: empty, no buffers occupied: 0 anon, with bytes up to N-1 filled: N zero-copy, with bytes up to N-1 filled: -N That way abs(i->last_offset) is equal to what used to be in i->iov_offset and empty vs. anon vs. zero-copy can be distinguished by the sign of i->last_offset. Checks for "should we extend the last buffer or should we start a new one?" become easier to follow that way. Note that most of the operations can only be done in a sane state - i.e. when the pipe has nothing past the current position of iterator. About the only thing that could be done outside of that state is iov_iter_advance(), which transitions to the sane state by truncating the pipe. There are only two cases where we leave the sane state: 1) iov_iter_get_pages()/iov_iter_get_pages_alloc(). Will be dealt with later, when we make get_pages advancing - the callers are actually happier that way. 2) iov_iter copied, then something is put into the copy. Since they share the underlying pipe, the original gets behind. When we decide that we are done with the copy (original is not usable until then) we advance the original. direct_io used to be done that way; nowadays it operates on the original and we do iov_iter_revert() to discard the excessive data. At the moment there's nothing in the kernel that could do that to ITER_PIPE iterators, so this reason for insane state is theoretical right now. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-08-08	ITER_PIPE: clean iov_iter_revert()	Al Viro	1	-46/+14
	Fold pipe_truncate() into it, clean up. We can release buffers in the same loop where we walk backwards to the iterator beginning looking for the place where the new position will be. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-08-08	ITER_PIPE: clean pipe_advance() up	Al Viro	1	-17/+17
	instead of setting ->iov_offset for new position and calling pipe_truncate() to adjust ->len of the last buffer and discard everything after it, adjust ->len at the same time we set ->iov_offset and use pipe_discard_from() to deal with buffers past that. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-08-08	ITER_PIPE: lose iter_head argument of __pipe_get_pages()	Al Viro	1	-4/+3
	it's only used to get to the partial buffer we can add to, and that's always the last one, i.e. pipe->head - 1. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-08-08	ITER_PIPE: fold push_pipe() into __pipe_get_pages()	Al Viro	1	-55/+25
	Expand the only remaining call of push_pipe() (in __pipe_get_pages()), combine it with the page-collecting loop there. Note that the only reason it's not a loop doing append_pipe() is that append_pipe() is advancing, while iov_iter_get_pages() is not. As soon as it switches to saner semantics, this thing will switch to using append_pipe(). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-08-08	ITER_PIPE: allocate buffers as we go in copy-to-pipe primitives	Al Viro	1	-73/+98
	New helper: append_pipe(). Extends the last buffer if possible, allocates a new one otherwise. Returns page and offset in it on success, NULL on failure. iov_iter is advanced past the data we've got. Use that instead of push_pipe() in copy-to-pipe primitives; they get simpler that way. Handling of short copy (in "mc" one) is done simply by iov_iter_revert() - iov_iter is in consistent state after that one, so we can use that. [Fix for braino caught by Liu Xinpeng <liuxp11@chinatelecom.cn> folded in] [another braino fix, this time in copy_pipe_to_iter() and pipe_zero(); caught by testcase from Hugh Dickins] Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-08-08	ITER_PIPE: helpers for adding pipe buffers	Al Viro	1	-42/+46
	There are only two kinds of pipe_buffer in the area used by ITER_PIPE. 1) anonymous - copy_to_iter() et.al. end up creating those and copying data there. They have zero ->offset, and their ->ops points to default_pipe_page_ops. 2) zero-copy ones - those come from copy_page_to_iter(), and page comes from caller. ->offset is also caller-supplied - it might be non-zero. ->ops points to page_cache_pipe_buf_ops. Move creation and insertion of those into helpers - push_anon(pipe, size) and push_page(pipe, page, offset, size) resp., separating them from the "could we avoid creating a new buffer by merging with the current head?" logics. Acked-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-08-08	ITER_PIPE: helper for getting pipe buffer by index	Al Viro	1	-6/+9
	pipe_buffer instances of a pipe are organized as a ring buffer, with power-of-2 size. Indices are kept not reduced modulo ring size, so the buffer refered to by index N is pipe->bufs[N & (pipe->ring_size - 1)]. Ring size can change over the lifetime of a pipe, but not while the pipe is locked. So for any iov_iter primitives it's a constant. Original conversion of pipes to this layout went overboard trying to microoptimize that - calculating pipe->ring_size - 1, storing it in a local variable and using through the function. In some cases it might be warranted, but most of the times it only obfuscates what's going on in there. Introduce a helper (pipe_buf(pipe, N)) that would encapsulate that and use it in the obvious cases. More will follow... Reviewed-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-08-08	splice: stop abusing iov_iter_advance() to flush a pipe	Al Viro	1	-5/+2
	Use pipe_discard_from() explicitly in generic_file_read_iter(); don't bother with rather non-obvious use of iov_iter_advance() in there. Reviewed-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-08-08	switch new_sync_{read,write}() to ITER_UBUF	Al Viro	1	-4/+2
	Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-08-08	new iov_iter flavour - ITER_UBUF	Al Viro	12	-31/+108
	Equivalent of single-segment iovec. Initialized by iov_iter_ubuf(), checked for by iter_is_ubuf(), otherwise behaves like ITER_IOVEC ones. We are going to expose the things like ->write_iter() et.al. to those in subsequent commits. New predicate (user_backed_iter()) that is true for ITER_IOVEC and ITER_UBUF; places like direct-IO handling should use that for checking that pages we modify after getting them from iov_iter_get_pages() would need to be dirtied. DO NOT assume that replacing iter_is_iovec() with user_backed_iter() will solve all problems - there's code that uses iter_is_iovec() to decide how to poke around in iov_iter guts and for that the predicate replacement obviously won't suffice. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-08-08	vfs: Check the truncate maximum size in inode_newsize_ok()	David Howells	1	-0/+2
	If something manages to set the maximum file size to MAX_OFFSET+1, this can cause the xfs and ext4 filesystems at least to become corrupt. Ordinarily, the kernel protects against userspace trying this by checking the value early in the truncate() and ftruncate() system calls calls - but there are at least two places that this check is bypassed: (1) Cachefiles will round up the EOF of the backing file to DIO block size so as to allow DIO on the final block - but this might push the offset negative. It then calls notify_change(), but this inadvertently bypasses the checking. This can be triggered if someone puts an 8EiB-1 file on a server for someone else to try and access by, say, nfs. (2) ksmbd doesn't check the value it is given in set_end_of_file_info() and then calls vfs_truncate() directly - which also bypasses the check. In both cases, it is potentially possible for a network filesystem to cause a disk filesystem to be corrupted: cachefiles in the client's cache filesystem; ksmbd in the server's filesystem. nfsd is okay as it checks the value, but we can then remove this check too. Fix this by adding a check to inode_newsize_ok(), as called from setattr_prepare(), thereby catching the issue as filesystems set up to perform the truncate with minimal opportunity for bypassing the new check. Fixes: 1f08c925e7a3 ("cachefiles: Implement backing file wrangling") Fixes: f44158485826 ("cifsd: add file operations") Signed-off-by: David Howells <dhowells@redhat.com> Reported-by: Jeff Layton <jlayton@kernel.org> Tested-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Namjae Jeon <linkinjeon@kernel.org> Cc: stable@kernel.org Acked-by: Alexander Viro <viro@zeniv.linux.org.uk> cc: Steve French <sfrench@samba.org> cc: Hyunchul Lee <hyc.lee@gmail.com> cc: Chuck Lever <chuck.lever@oracle.com> cc: Dave Wysochanski <dwysocha@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-08-08	kernel/sysctl.c: Remove trailing white space	Fanjun Kong	1	-6/+6
	This patch removes the trailing white space in kernel/sysysctl.c Signed-off-by: Fanjun Kong <bh1scw@gmail.com> Reviewed-by: Muchun Song <songmuchun@bytedance.com> [mcgrof: fix commit message subject] Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
2022-08-08	kernel/sysctl.c: Clean up indentation, replace spaces with tab.	Fanjun Kong	1	-2/+2
	This patch fixes two coding style issues: 1. Clean up indentation, replace spaces with tab 2. Add space after ',' Signed-off-by: Fanjun Kong <bh1scw@gmail.com> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
2022-08-08	sysctl: Merge adjacent CONFIG_TREE_RCU blocks	Geert Uytterhoeven	1	-3/+1
	There are two adjacent sysctl entries protected by the same CONFIG_TREE_RCU config symbol. Merge them into a single block to improve readability. Use the more common "#ifdef" form while at it. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
2022-08-08	remoteproc: qcom_q6v5_pas: Do not fail if regulators are not found	Manivannan Sadhasivam	1	-4/+16
	devm_regulator_get_optional() API will return -ENODEV if the regulator was not found. For the optional supplies CX, PX we should not fail in that case but rather continue. So let's catch that error and continue silently if those regulators are not found. The commit 3f52d118f992 ("remoteproc: qcom_q6v5_pas: Deal silently with optional px and cx regulators") was supposed to do the same but it missed the fact that devm_regulator_get_optional() API returns -ENODEV when the regulator was not found. Cc: Abel Vesa <abel.vesa@linaro.org> Fixes: 3f52d118f992 ("remoteproc: qcom_q6v5_pas: Deal silently with optional px and cx regulators") Reported-by: Steev Klimaszewski <steev@kali.org> Reviewed-by: Abel Vesa <abel.vesa@linaro.org> Reviewed-by: Johan Hovold <johan+linaro@kernel.org> Tested-by: Steev Klimaszewski <steev@kali.org> Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org> Link: https://lore.kernel.org/r/20220801053939.12556-1-manivannan.sadhasivam@linaro.org
2022-08-07	update Coccinelle URL	Julia Lawall	43	-43/+43
	Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
2022-08-07	coccinelle: free: add version constraint	Julia Lawall	1	-0/+1
	The various functions contain a NULL check starting in v5.15. Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
2022-08-06	Revert "iommu/dma: Add config for PCI SAC address trick"	Linus Torvalds	2	-27/+1
	This reverts commit 4bf7fda4dce22214c70c49960b1b6438e6260b67. It turns out that it was hopelessly naive to think that this would work, considering that we've always done this. The first machine I actually tested this on broke at bootup, getting to Reached target cryptsetup.target - Local Encrypted Volumes. and then hanging. It's unclear what actually fails, since there's a lot else going on around that time (eg amdgpu probing also happens around that same time, but it could be some other random init thing that didn't complete earlier and just caused the boot to hang at that point). The expectations that we should default to some unsafe and untested mode seems entirely unfounded, and the belief that this wouldn't affect modern systems is clearly entirely false. The machine in question is about two years old, so it's not exactly shiny, but it's also not some dusty old museum piece PDP-11 in a closet. Cc: Robin Murphy <robin.murphy@arm.com> Cc: Christoph Hellwig <hch@lst.de> Cc: John Garry <john.garry@huawei.com> Cc: Joerg Roedel <jroedel@suse.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-08-06	apparmor: Update MAINTAINERS file with new email address	John Johansen	1	-0/+1
	Add the apparmor.net email address that the project is transitioning emails to. Signed-off-by: John Johansen <john.johansen@canonical.com>
2022-08-06	Revert "s390/smp: enforce lowcore protection on CPU restart"	Alexander Gordeev	1	-1/+1
	This reverts commit 6f5c672d17f583b081e283927f5040f726c54598. This breaks normal crash dump when CPU0 is offline. Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-08-06	Revert "s390/smp: rework absolute lowcore access"	Alexander Gordeev	14	-294/+83
	This reverts commit 7d06fed77b7d8fc9f6cc41b4e3f2823d32532ad8. This introduced vmem_mutex locking from vmem_map_4k_page() function called from smp_reinit_ipl_cpu() with interrupts disabled. While it is a pre-SMP early initcall no other CPUs running in parallel nor other code taking vmem_mutex on this boot stage - it still needs to be fixed. Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-08-06	Revert "s390/smp,ptdump: add absolute lowcore markers"	Alexander Gordeev	1	-7/+0
	This reverts commit e409b7f19172a3c154de62de4baf32a2c25a375a. Commit 7d06fed77b7d ("s390/smp: rework absolute lowcore access") introduced mutex lock with interrupts disabled. This commit is a follow-up that needs to be reverted as well. Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-08-05	cxl/hdm: Fix skip allocations vs multiple pmem allocations	Dan Williams	1	-1/+10
	Vishal notes that when attempting to define a second pmem region on a device the DPA allocation fails with a message of the form: decoder11.1: failed to reserve skipped space Recall that the skip setting is used when there is a pmem allocation in the presence of free ram DPA space. The first pmem allocation skips over the free ram and subsequent pmem allocations do not require a skip. The bug is that a skip is still attempted and the DPA reservation code flags the double skip allocation conflict. Fixes: cf880423b6a0 ("cxl/hdm: Add support for allocating DPA to an endpoint decoder") Reported-by: Vishal Verma <vishal.l.verma@intel.com> Tested-by: Vishal Verma <vishal.l.verma@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Link: https://lore.kernel.org/r/165973754730.1558392.15466392461645857658.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-08-05	cxl/region: Disallow region granularity != window granularity	Dan Williams	1	-6/+7
	The endpoint decode granularity must be <= the window granularity otherwise capacity in the endpoints is lost in the decode. Consider an attempt to have a region granularity of 512 with 4 devices within a window that maps 2 host bridges at a granularity of 256 bytes: HPA DPA Offset HB Port EP 0x0 0x0 0 0 0 0x100 0x0 1 0 2 0x200 0x100 0 0 0 0x300 0x100 1 0 2 0x400 0x200 0 1 1 0x500 0x200 1 1 3 0x600 0x300 0 1 1 0x700 0x300 1 1 3 0x800 0x400 0 0 0 0x900 0x400 1 0 2 0xA00 0x500 0 0 0 0xB00 0x500 1 0 2 Notice how endpoint0 maps HPA 0x0 and 0x200 correctly, but then at HPA 0x800 it results in DPA 0x200-0x400 on being skipped. Fix this by restricing the region granularity to be equal to the window granularity resulting in the following for a x4 region under a x2 window at a granularity of 256. HPA DPA Offset HB Port EP 0x0 0x0 0 0 0 0x100 0x0 1 0 2 0x200 0x0 0 1 1 0x300 0x0 1 1 3 0x400 0x100 0 0 0 0x500 0x100 1 0 2 0x600 0x100 0 1 1 0x700 0x100 1 1 3 Not that it ever made practical sense to support region granularity > window granularity. The window rotates host bridges causing endpoints to never see a consecutive stream of requests at the desired granularity without breaks to issue cycles to the other host bridge. Fixes: 80d10a6cee05 ("cxl/region: Add interleave geometry attributes") Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Vishal Verma <vishal.l.verma@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Link: https://lore.kernel.org/r/165973127171.1526540.9923273539049172976.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-08-05	cxl/region: Fix x1 interleave to greater than x1 interleave routing	Dan Williams	1	-1/+5
	In cases where the decode fans out as it traverses downstream, the interleave granularity needs to increment to identify the port selector bits out of the remaining address bits. For example, recall that with an x2 parent port intereleave (IW == 1), the downstream decode for children of those ports will either see address bit IG+8 always set, or address bit IG+8 always clear. So if the child port needs to select a downstream port it can only use address bits starting at IG+9 (where IG and IW are the CXL encoded values for interleave granularity (ilog2(ig) - 8) and ways (ilog2(iw))). When the parent port interleave is x1 no such masking occurs and the child port can maintain the granularity that was routed to the parent port. Reported-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Vishal Verma <vishal.l.verma@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Link: https://lore.kernel.org/r/165973126583.1526540.657948655360009242.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-08-05	cxl/region: Move HPA setup to cxl_region_attach()	Dan Williams	2	-26/+24
	A recent bug fix added the setup of the endpoint decoder interleave geometry settings to cxl_region_attach(). Move the HPA setup there as well to keep all endpoint decoder parameter setting in a central location. For symmetry, move endpoint HPA teardown to cxl_region_detach(), and for switches move HPA setup / teardown to cxl_port_{setup,reset}_targets(). Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Vishal Verma <vishal.l.verma@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Link: https://lore.kernel.org/r/165973126020.1526540.14701949254436069807.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>