path: root/tools/perf/scripts/python/stackcollapse.py
2021-10-25  io_uring: clean io_wq_submit_work()'s main loop  (Pavel Begunkov, 1 file changed, -28/+12)
Do a bit of cleaning for the main loop of io_wq_submit_work(). Get rid of the switch and replace it with a single if, since we're retrying in both of the other cases. Kill the issue_sqe label, get rid of the needs_poll nesting, and clarify the comment a bit.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/ed12ce0c64e051f9a6b8a37a24f8ea554d299c29.1634987320.git.asml.silence@gmail.com
Reviewed-by: Hao Xu <haoxu@linux.alibaba.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-23  io-wq: use helper for worker refcounting  (Pavel Begunkov, 1 file changed, -2/+1)
Use io_worker_release() instead of hand-coding it in io_worker_exit().
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/6f95f09d2cdbafcbb2e22ad0d1a2bc4d3962bf65.1634987320.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-22  io_uring: implement async hybrid mode for pollable requests  (Hao Xu, 1 file changed, -1/+35)
The current logic for requests with IOSQE_ASYNC is to first queue them to an io-worker, then execute them in a synchronous way. For unbound work like pollable requests (e.g. reads/writes on a socket fd), the io-worker may get stuck waiting for events for a long time, and thus other work waits in the list for a long time too.

Let's introduce a new mode for unbound work (currently pollable requests): a request is first queued to an io-worker, then executed with a nonblocking try rather than synchronously. If that fails, the request arms its poll machinery and the worker can move on to other work. In detail, a request of this kind goes through:

step 1, original context: queue it to an io-worker
step 2, io-worker context: nonblocking try (the old logic did a synchronous try here)
        on failure: arm poll
                on poll failure or readiness: issue synchronously
        on success: the worker's job is done; task work takes over the request

This works much better than the old IOSQE_ASYNC logic when unbound max_worker is relatively small: the number of io-workers easily climbs to max_worker, new workers cannot be created, and the running workers are stuck handling old work in IOSQE_ASYNC mode. On my 64-core machine, with unbound max_worker set to 20, an echo-server run (arguments: register_file, connection number 1000, message size 12 bytes) shows:

original IOSQE_ASYNC: 76664.151 tps
after this patch:     166934.985 tps

Suggested-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Hao Xu <haoxu@linux.alibaba.com>
Link: https://lore.kernel.org/r/20211018133445.103438-1-haoxu@linux.alibaba.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
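As a rough C sketch, the worker-side flow above condenses to the following. It reuses the real fs/io_uring.c names io_issue_sqe(), io_arm_poll_handler() and IO_APOLL_OK for orientation, but it is a sketch of the flow, not the literal patch:

    /* io-worker context: try nonblocking first instead of issuing
     * synchronously right away (sketch, not the actual hunk) */
    static void io_worker_handle_pollable(struct io_kiocb *req)
    {
            int ret = io_issue_sqe(req, IO_URING_F_NONBLOCK);   /* step 2 */

            if (ret != -EAGAIN)
                    return;                 /* done: task work takes over */

            if (io_arm_poll_handler(req) == IO_APOLL_OK)
                    return;                 /* poll armed: worker is free */

            io_issue_sqe(req, 0);           /* poll failed/ready: go sync */
    }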
2021-10-20  io_uring: Use ERR_CAST() instead of ERR_PTR(PTR_ERR())  (Changcheng Deng, 1 file changed, -1/+1)
Use ERR_CAST() instead of ERR_PTR(PTR_ERR()). This makes it more readable and also fixes this warning detected by err_cast.cocci:

./fs/io_uring.c: WARNING: 3208: 11-18: ERR_CAST can be used with buf

Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: Changcheng Deng <deng.changcheng@zte.com.cn>
Link: https://lore.kernel.org/r/20211020084948.1038420-1-deng.changcheng@zte.com.cn
Signed-off-by: Jens Axboe <axboe@kernel.dk>
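For reference, the transformation that err_cast.cocci suggests looks like this; a generic sketch built on the include/linux/err.h helpers, not the exact fs/io_uring.c hunk:

    #include <linux/err.h>

    static void *pick_buffer(void *buf)
    {
            if (IS_ERR(buf))
                    /* before: return ERR_PTR(PTR_ERR(buf));  unwrap, re-wrap */
                    return ERR_CAST(buf);   /* after: one self-explanatory cast */
            return buf;
    }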
2021-10-19  io_uring: split logic of force_nonblock  (Hao Xu, 1 file changed, -22/+26)
Currently force_nonblock stands for two meanings:
- nowait or not
- in an io-worker or not (i.e. whether uring_lock is held)

Let's split the logic into two flags, IO_URING_F_NONBLOCK and IO_URING_F_UNLOCKED, for the convenience of the next patch.
Suggested-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Hao Xu <haoxu@linux.alibaba.com>
Reviewed-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/20211018133431.103298-1-haoxu@linux.alibaba.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
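A minimal sketch of the split; the flag values and the function shown are illustrative assumptions, not the actual patch:

    #define IO_URING_F_NONBLOCK     1       /* nowait: must not block         */
    #define IO_URING_F_UNLOCKED     2       /* io-worker: uring_lock not held */

    static void io_rw_example(struct io_kiocb *req, unsigned int issue_flags)
    {
            bool force_nonblock = issue_flags & IO_URING_F_NONBLOCK;
            bool needs_lock     = issue_flags & IO_URING_F_UNLOCKED;

            /* the two concerns can now be set and tested independently */
    }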
2021-10-19  io_uring: warning about unused-but-set parameter  (Arnd Bergmann, 1 file changed, -4/+1)
When enabling -Wunused warnings by building with W=1, I get an instance of the -Wunused-but-set-parameter warning in the io_uring code:

fs/io_uring.c: In function 'io_queue_async_work':
fs/io_uring.c:1445:61: error: parameter 'locked' set but not used [-Werror=unused-but-set-parameter]
 1445 | static void io_queue_async_work(struct io_kiocb *req, bool *locked)
      |                                                        ~~~~~~^~~~~~

There are very few warnings of this type, so it would be nice to enable this by default and fix all the existing instances. As the assignment serves no purpose by itself other than to prevent developers from using the variable, an easy workaround is to remove the assignment and just rename the argument to "dont_use".

Fixes: f237c30a5610 ("io_uring: batch task work locking")
Link: https://lore.kernel.org/lkml/20210920121352.93063-1-arnd@kernel.org/
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20211019153507.348480-1-arnd@kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
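In shape, the workaround is just this (reconstructed from the description above, not copied from the patch):

    /* before: W=1 warns because 'locked' is assigned but never read */
    static void io_queue_async_work(struct io_kiocb *req, bool *locked)
    {
            locked = NULL;  /* only meant to keep developers from using it */
            /* ... */
    }

    /* after: drop the assignment and rename the parameter instead */
    static void io_queue_async_work(struct io_kiocb *req, bool *dont_use)
    {
            /* ... */
    }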
2021-10-19  io_uring: inform block layer of how many requests we are submitting  (Jens Axboe, 1 file changed, -1/+3)
The block layer can use this knowledge to make smarter decisions on how to handle the request, if it knows that N more may be coming. Switch to using blk_start_plug_nr_ios() to pass in that information.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19  io_uring: simplify io_file_supports_nowait()  (Pavel Begunkov, 1 file changed, -12/+22)
Make sure that REQ_F_SUPPORT_NOWAIT is always set in io_prep_rw(), so we can stop caring about setting it down the line, simplifying io_file_supports_nowait().
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/60c8f1f5e2cb45e00f4897b2cec10c5b3669da91.1634425438.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19  io_uring: combine REQ_F_NOWAIT_{READ,WRITE} flags  (Pavel Begunkov, 1 file changed, -40/+21)
Merge REQ_F_NOWAIT_READ and REQ_F_NOWAIT_WRITE into one flag, i.e. REQ_F_SUPPORT_NOWAIT. This not only gets rid of the dependence on CONFIG_64BIT but also simplifies the code. One thing to consider is the case where we don't have ->{read,write}_iter and go through loop_rw_iter(): just fail it with -EAGAIN if we expect nowait behaviour but are not sure whether the file supports it.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/f832a20e5186c2e79c6519280c238f559a1d2bbc.1634425438.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19  io_uring: arm poll for non-nowait files  (Pavel Begunkov, 1 file changed, -7/+0)
Don't check whether we can do nowait before arming apoll; there are several reasons for that. First, we don't care much about files that don't support nowait. Second, it may be useful: we don't want to be taking extra workers away from io-wq when the request can go async some other way. Even if it goes through io-wq eventually, it makes a difference in the number of workers actually used. And last, it's needed to clean up nowait handling in future commits.

[kernel test robot: fix unused-var]
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/9d06f3cb2c8b686d970269a87986f154edb83043.1634425438.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19  fs/io_uring: Prioritise checking faster conditions first in io_write  (Noah Goldstein, 1 file changed, -1/+1)
This commit reorders the conditions in a branch in io_write: check 'ret2 == -EAGAIN' first, as checking '(req->ctx->flags & IORING_SETUP_IOPOLL)' will likely be more expensive due to two memory dereferences.
Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>
Link: https://lore.kernel.org/r/20211017013229.4124279-1-goldstein.w.n@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
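Concretely, the reorder has this shape; a sketch of the pattern rather than the verbatim diff:

    /* before: req->ctx->flags costs two dependent loads on every pass */
    if ((req->ctx->flags & IORING_SETUP_IOPOLL) && ret2 == -EAGAIN)
            goto copy_iov;

    /* after: the cheap register compare short-circuits the two loads */
    if (ret2 == -EAGAIN && (req->ctx->flags & IORING_SETUP_IOPOLL))
            goto copy_iov;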
2021-10-19  io_uring: clean io_prep_rw()  (Pavel Begunkov, 1 file changed, -4/+3)
We already store req->file in a variable in io_prep_rw(); use it instead of the couple of remaining references to kiocb->ki_filp.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/2f5889fc7ab670daefd5ccaedd99416d8355f0ad.1634314022.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19  io_uring: optimise fixed rw rsrc node setting  (Pavel Begunkov, 1 file changed, -7/+4)
Move the fixed-rw io_req_set_rsrc_node() call from rw prep into io_import_fixed(). If we're using fixed buffers it will always be called during submission, as we save the state in advance.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/68c06f66d5aa9661f1e4b88d08c52d23528297ec.1634314022.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19  io_uring: return iovec from __io_import_iovec  (Pavel Begunkov, 1 file changed, -22/+23)
We pass an iovec** into __io_import_iovec(), which is expected to keep it, initialise it and modify it accordingly. That's expensive; instead, return the iovec directly from __io_import_iovec(), encoding errors with ERR_PTR() when needed. io_import_iovec() keeps the old interface, but it's inline, so everything is optimised nicely.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/6230e9769982f03a8f86fa58df24666088c44d3e.1634314022.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
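The resulting calling convention, sketched with the surrounding setup elided and a hypothetical function name:

    static struct iovec *import_iovec_sketch(struct io_kiocb *req)
    {
            struct iovec *iov;
            int ret;

            /* ... build iov, setting ret to a -errno on failure ... */
            if (ret)
                    return ERR_PTR(ret);    /* error travels in the pointer */
            return iov;                     /* caller tests with IS_ERR()   */
    }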
2021-10-19  io_uring: optimise io_import_iovec fixed path  (Pavel Begunkov, 1 file changed, -3/+6)
Delay loading req->rw.{addr,len} in io_import_iovec() until they're really needed, removing extra loads for the fixed path, which doesn't use them.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/3cc48dd0c4f1a37c4ce9aab5784281a2d83ad8be.1634314022.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19  io_uring: kill io_wq_current_is_worker() in iopoll  (Pavel Begunkov, 1 file changed, -5/+5)
Don't decide about locking based on io_wq_current_is_worker(); it's inconsistent with all the other code and expensive. Use issue_flags instead.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/7546d5a58efa4360173541c6fe02ee6b8c7b4ea7.1634314022.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19  io_uring: optimise req->ctx reloads  (Pavel Begunkov, 1 file changed, -2/+1)
Don't load req->ctx in advance; it takes an extra register and the field stays valid even after opcode handlers. It also optimises out the req->ctx load in io_iopoll_req_issued() once it's inlined.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/1e45ff671c44be0eb904f2e448a211734893fa0b.1634314022.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19  io_uring: rearrange io_read()/write()  (Pavel Begunkov, 1 file changed, -38/+37)
Combine force_nonblock branches (already optimised by the compiler), flip branches so the hottest/most common path comes first (e.g. the non on-stack iov setup), and add extra likely/unlikely annotations for error paths.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/2c2536c5896d70994de76e387ea09a0402173a3f.1634144845.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19  io_uring: clean up io_import_iovec  (Pavel Begunkov, 1 file changed, -15/+25)
Make io_import_iovec() take a struct io_rw_state instead of an iter pointer. First, it takes care of initialising the iovec pointer, which can otherwise be forgotten. Even better, we can avoid initialising it when it's not needed, e.g. for IORING_OP_READ_FIXED or IORING_OP_READ. Also hide saving iter_state inside by splitting out an inline helper, avoiding extra ifs.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/b1bbc213a95e5272d4da5867bb977d9acb6f2109.1634144845.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19  io_uring: optimise io_import_iovec nonblock passing  (Pavel Begunkov, 1 file changed, -22/+25)
First, change IO_URING_F_NONBLOCK to take the sign bit of the int, so checking for it can be turned into a test + sign-based jump; this makes the binary smaller and may be faster. Then, instead of passing a need_lock boolean into io_import_iovec(), just give it issue_flags, which is already stored somewhere. That saves some stack space plus a couple of test + cmov operations and other conversions.

Note: we still leave a force_nonblock = issue_flags & IO_URING_F_NONBLOCK variable, but the compiler optimises it out and tests issue_flags directly.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/ee96547e692f6c975c229cd82fc721679571a734.1634144845.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
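"Taking the sign bit" means defining the flag as the top bit of a signed int, so testing it degenerates into a sign check; a sketch based on the description above, with the helper name as an assumption:

    #include <limits.h>

    #define IO_URING_F_NONBLOCK     INT_MIN /* occupies the sign bit */

    static inline bool io_nonblock(unsigned int issue_flags)
    {
            /* (flags & IO_URING_F_NONBLOCK) is just "is the int negative?",
             * which compiles down to a plain test + sign-based jump */
            return (int)issue_flags < 0;
    }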
2021-10-19  io_uring: optimise read/write iov state storing  (Pavel Begunkov, 1 file changed, -42/+37)
Currently io_read() and io_write() keep separate pointers to an iter and to a struct iov_iter_state, which is not great for register spilling and requires more on-stack copies. They are both either on-stack or in req->async_data at the same time, so use struct io_rw_state and keep a single pointer to it, holding all the state behind one pointer.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/5c5e7ffd7dc25fc35075c70411ba99df72f237fa.1634144845.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19  io_uring: encapsulate rw state  (Pavel Begunkov, 1 file changed, -19/+23)
Add a new struct io_rw_state storing all the iov-related bits: the fast iov, the iterator and the iterator state. Not much changes here; simply convert struct io_async_rw to use it.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/e8245ffcb568b228a009ec1eb79c993c813679f1.1634144845.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19  io_uring: optimise rw completion handlers  (Pavel Begunkov, 1 file changed, -2/+2)
Don't override req->result in io_complete_rw_iopoll() when it's already the same value; we have an if just above it, so move the assignment there. Also, add one simple unlikely() in __io_complete_rw_common().
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/8dfeb4f84026a20172bcf82c05010abe955874ae.1634144845.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19  io_uring: prioritise read success path over fails  (Pavel Begunkov, 1 file changed, -1/+1)
Rearrange io_read()'s return handling so that we first expect it to complete successfully, and only then check for errors, which is the colder path.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/c91c7c2da11815ec8b04b5d872f60dc4cde662c5.1634144845.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19  io_uring: consistent typing for issue_flags  (Pavel Begunkov, 1 file changed, -3/+3)
Some of the functions keep issue_flags as an int; change those to unsigned.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/04ad43797783bc9cc7567f287ab545518f8e8cf2.1634144845.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19  io_uring: optimise rsrc referencing  (Pavel Begunkov, 1 file changed, -3/+49)
Apparently, percpu_ref_put/get() are expensive enough when done per request; take them in batches and cache them on the submission side to avoid getting them over and over again. Also, if we're completing under uring_lock, return refs back into the cache instead of calling percpu_ref_put(). This is pretty similar to how we do tctx->cached_refs accounting, but we fall back to normal putting when the rsrc node has already changed by the time of free.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/b40d8c5bc77d3c9550df8a319117a374ac85f8f4.1633817310.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19  io_uring: optimise io_req_set_rsrc_node()  (Pavel Begunkov, 1 file changed, -5/+4)
io_req_set_rsrc_node() reloads req->ctx; however, it's already in registers in all use cases, so it's better to pass it as a parameter.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/67a25557b8a51e90bfd578447a6f1671911b05ae.1633817310.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19  io_uring: fix io_free_batch_list races  (Pavel Begunkov, 1 file changed, -2/+3)
[  158.514382] WARNING: CPU: 5 PID: 15251 at fs/io_uring.c:1141 io_free_batch_list+0x269/0x360
[  158.514426] RIP: 0010:io_free_batch_list+0x269/0x360
[  158.514437] Call Trace:
[  158.514440]  __io_submit_flush_completions+0xde/0x180
[  158.514444]  tctx_task_work+0x14a/0x220
[  158.514447]  task_work_run+0x64/0xa0
[  158.514448]  __do_sys_io_uring_enter+0x7c/0x970
[  158.514450]  __x64_sys_io_uring_enter+0x22/0x30
[  158.514451]  do_syscall_64+0x43/0x90
[  158.514453]  entry_SYSCALL_64_after_hwframe+0x44/0xae

We should not touch request internals, including req->comp_list.next, after putting our ref if it's not the final one, because e.g. we can start freeing requests from the free cache at that point.

Fixes: 62ca9cb93e7f8 ("io_uring: optimise io_free_batch_list()")
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/b1f4df38fbb8f111f52911a02fd418d0283a4e6f.1634047298.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
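The general shape of the bug and the fix; a sketch in which req_put() stands in for the real reference-dropping helper:

    /* buggy: req may be recycled the moment its ref is dropped */
    while (node) {
            req = container_of(node, struct io_kiocb, comp_list);
            req_put(req);                   /* may free/recycle req...     */
            node = req->comp_list.next;     /* ...so this reads freed data */
    }

    /* fixed: read the next pointer before putting the reference */
    while (node) {
            struct io_wq_work_node *next = node->next;

            req = container_of(node, struct io_kiocb, comp_list);
            req_put(req);
            node = next;
    }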
2021-10-19  io_uring: remove extra io_ring_exit_work wake up  (Pavel Begunkov, 1 file changed, -1/+0)
task_work_add() takes care of waking up the thread; remove the useless wake_up_process().
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/de9a71ee255112dcaed3b5d426be24934e74722c.1633532552.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19  io_uring: optimise out req->opcode reloading  (Pavel Begunkov, 1 file changed, -14/+17)
Looking at the assembly, the compiler decided to reload req->opcode in io_op_defs[opcode].needs_file instead of using the one it already had in a register, so store it in a temporary variable so the reload can be optimised out. Also move the personality block later; it's better for spilling etc., as it only depends on @sqe, which we're keeping anyway. By the way, zero req->opcode if it's over IORING_OP_LAST; not a problem at the moment, but safer.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/6ba869f5f8b7b0f991c87fdf089f0abf87cbe06b.1633532552.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
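The reload-avoidance idiom in miniature (a sketch of the pattern, not the patch itself):

    /* before: nothing stops the compiler from re-reading req->opcode */
    if (io_op_defs[req->opcode].needs_file) { /* ... */ }

    /* after: a local temporary pins the value in a register */
    const u8 opcode = req->opcode;

    if (io_op_defs[opcode].needs_file) { /* ... */ }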
2021-10-19  io_uring: reshuffle io_submit_state bits  (Pavel Begunkov, 1 file changed, -9/+5)
struct io_submit_state's ->free_list and ->link are hotter and smaller than ->plug; place them first.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/6ad3c15849f50b27ad012c042c73e6e069d22df7.1633532552.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19  io_uring: safer fallback_work free  (Pavel Begunkov, 1 file changed, -1/+1)
Add extra wq flushing for fallback_work; it's not strictly necessary, but safer if the invariants of io_fallback_req_func() change.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/24179419d6748516299600bc914f50b9e0b02275.1633532552.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19  io_uring: optimise plugging  (Pavel Begunkov, 1 file changed, -16/+16)
Plugging is only needed with requests that also need a file, so hide plugging under a ->needs_file check. Also, place the ->needs_file and ->plug bits in the same byte of io_op_defs; it may matter to compilers, e.g. only with this change did one tested compiler decide to optimise two memory testb instructions into a mov with two register testb's.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/1600d1287bb7d16451d4ef3343252787a5314927.1633532552.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19  io_uring: correct fill events helpers types  (Pavel Begunkov, 1 file changed, -12/+12)
The CQE result is a 32-bit integer, so the functions generating CQEs should accept ints rather than longs. Convert io_cqring_fill_event() and the other helpers.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/7ca6f15255e9117eae28adcac272744cae29b113.1633373302.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19  io_uring: inline io_poll_complete  (Pavel Begunkov, 1 file changed, -11/+2)
Inline io_poll_complete(); it's simple and doesn't serve any particular purpose as a separate helper.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/933d7ee3e4450749a2d892235462c8f18d030293.1633373302.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19  io_uring: inline io_req_needs_clean()  (Pavel Begunkov, 1 file changed, -6/+1)
There is only a single user of io_req_needs_clean(), so inline it.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/6111d0221ef4b439cad401e135dd6a5f990a0501.1633373302.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19  io_uring: remove struct io_completion  (Pavel Begunkov, 1 file changed, -17/+7)
We keep struct io_completion only as temporary storage for cflags. Place cflags in io_kiocb instead; it's cleaner, removes extra bits and might even be used for future optimisations.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/5299bd5c223204065464bd87a515d0e405316086.1633373302.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19  io_uring: control ->async_data with a REQ_F flag  (Pavel Begunkov, 1 file changed, -26/+46)
->async_data is a slow path, so it won't matter much if we do the clean-up inside io_clean_op(). Moreover, in many cases it's allocated together with setting one or more of the IO_REQ_CLEAN_FLAGS flags, so it'd go through io_clean_op() anyway. Control ->async_data allocation with a new flag, REQ_F_ASYNC_DATA, so we can do all the maintenance under the fast io_req_needs_clean() check.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/6892cf5883c459f36bda26f30ceb16742b20b84b.1633373302.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19  io_uring: optimise io_free_batch_list()  (Pavel Begunkov, 1 file changed, -2/+4)
Delay reading the next node in io_free_batch_list(); this allows the compiler to load the value a bit later, improving register spilling in some cases. With gcc 11.1 it helped to move the @task_refs variable from the stack to a register and optimised out a couple of per-request instructions.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/cc9fdfb6f72a4e8bc9918a5e9f2d97869a263ae4.1633373302.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19  io_uring: mark cold functions  (Pavel Begunkov, 1 file changed, -59/+64)
Attribute cold functions so compilers can optimise them for size. It shrinks the binary by 2.5-3%:

   text    data     bss     dec     hex filename
  90670   14002       8  104680   198e8 ./fs/io_uring.o
  88053   14002       8  102063   18eaf ./fs/io_uring.o

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/b53d385f91dca45170b67d7f11c7abd787e821f6.1633373302.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
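The attribute in question is the kernel's __cold wrapper from include/linux/compiler_attributes.h; a minimal sketch of its use, with the function name being illustrative rather than taken from the patch:

    /* __cold tells the compiler to optimise this function for size and
     * place it away from the hot text (it runs once per ring at most) */
    static __cold void io_ring_exit_teardown(struct io_ring_ctx *ctx)
    {
            /* teardown work, never on the submission fast path */
    }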
2021-10-19  io_uring: optimise ctx referencing by requests  (Pavel Begunkov, 1 file changed, -12/+14)
Currently, we allocate one ctx reference per request at submission time and put it at free. It's batched and not so expensive, but it still bloats the kernel, adds two function calls for RCU and adds some overhead for request counting in io_free_batch_list(). Always keep one reference with a request, even when it's freed and sitting in the io_uring request caches. There is extra work at the ring exit / quiesce paths, which now need to put all cached requests; io_ring_exit_work() is already looping, so it's not a problem. Add hybrid busy waiting to io_ctx_quiesce() as well for now.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/99613fbe396e80777228cde39bbda1aa8938554e.1633373302.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19  io_uring: merge CQ and poll waitqueues  (Pavel Begunkov, 1 file changed, -7/+1)
->cq_wait and ->poll_wait are woken up in the same manner; use a single waitqueue for both of them. CQ waiters are queued exclusively, so a wake-up will first go over all pollers, which is exactly what we need.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/00fe603e50000365774cf8435ef5fe03f049c1c9.1633373302.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19  io_uring: don't wake sqpoll in io_cqring_ev_posted  (Pavel Begunkov, 1 file changed, -2/+0)
io_cqring_ev_posted() doesn't need to wake SQPOLL: waking is either done by userspace or via task_work, and no action is required on request completion. Rip out the bits waking it up in io_cqring_ev_posted().
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/b49dab27b64cf11f4c50f2f90dcaac123430e05d.1633373302.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19  io_uring: optimise INIT_WQ_LIST  (Pavel Begunkov, 1 file changed, -1/+0)
The invariant of io_wq_work_list is that it's empty if and only if ->first is NULL, so there is no need to initially set ->last. Now that the list has more users, this may play a role: the init runs in each task-work iteration and on every completion flush.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/c464ab5cab6e46a858c6d39c107e92b3b5291f13.1633373302.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
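The definitions behind the invariant, in the shape of fs/io-wq.h (the exact macro text in the tree may differ slightly):

    struct io_wq_work_node {
            struct io_wq_work_node *next;
    };

    struct io_wq_work_list {
            struct io_wq_work_node *first;
            struct io_wq_work_node *last;
    };

    /* empty IFF ->first == NULL, so init need not touch ->last */
    #define wq_list_empty(list)     ((list)->first == NULL)
    #define INIT_WQ_LIST(list)      do { (list)->first = NULL; } while (0)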
2021-10-19  io_uring: optimise request allocation  (Pavel Begunkov, 1 file changed, -8/+20)
Even after fully inlining io_alloc_req(), my compiler does a NULL check in the path of a successful allocation; no hacks like an empty dereference help it. Restructure io_alloc_req() by splitting out the refilling part, so the compiler generates a slightly better binary.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/eda17571bdc7248d8e617b23e7132a5416e4680b.1633373302.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
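The restructuring splits the cold refill out of the fast path; a sketch with hypothetical helper names, not the actual diff:

    /* slow path, split out: the only part that can fail */
    static __cold bool io_refill_req_cache(struct io_submit_state *state)
    {
            /* ... bulk-allocate requests into state->free_list ... */
            return true;
    }

    static inline struct io_kiocb *io_alloc_req(struct io_submit_state *state)
    {
            if (unlikely(wq_list_empty(&state->free_list)) &&
                !io_refill_req_cache(state))
                    return NULL;

            /* a successful refill guarantees a non-empty list, so the
             * success path needs no extra NULL check */
            return req_from_free_list(state);   /* hypothetical helper */
    }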
2021-10-19  io_uring: delay req queueing into compl-batch list  (Pavel Begunkov, 1 file changed, -10/+16)
io_req_complete_state() is inlined and used in lots of places, so we want to keep it concise. Move adding a request into the completion batch list from io_req_complete_state() into the consumer, i.e. __io_queue_sqe().

before vs after:
   text    data     bss     dec     hex filename
  91894   14002       8  105904   19db0 ./fs/io_uring.o
  91046   14002       8  105056   19a60 ./fs/io_uring.o

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/4afca4e11abfd4cc8e99777fdcaf4d34cf4d022d.1633373302.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19  io_uring: add more likely/unlikely() annotations  (Pavel Begunkov, 1 file changed, -3/+3)
Add two extra unlikely() annotations in io_submit_sqes() and one around io_req_needs_clean() to help the compiler avoid extra jumps in hot paths.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/88e087afe657e7660194353aada9b00f11d480f9.1633373302.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19  io_uring: optimise kiocb layout  (Pavel Begunkov, 1 file changed, -2/+5)
We want ->comp_list in the second cacheline, which is hotter than the third. Swap the field with ->link, which is not as hot; it's controlled by flags and so not accessed unless there is a link. By the way, add a couple of comments for io_kiocb fields.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/9d9dde31f8f62279a5f48c575bbc27b8290edc0c.1633373302.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19  io_uring: add flag to not fail link after timeout  (Pavel Begunkov, 2 files changed, -2/+7)
For some reason, a non-off IORING_OP_TIMEOUT always fails links. That's pretty inconvenient and unnecessarily limits chaining after it to hard linking, which is far from ideal, e.g. it doesn't pair well with timeout cancellation. Add a flag forcing it to not fail links on -ETIME.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/17c7ec0fb7a6113cc6be8cdaedcada0ba836ac0e.1633199723.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19  io_uring: clean up buffer select  (Pavel Begunkov, 1 file changed, -34/+12)
Hiding a pointer to a struct io_buffer in rw.addr is error prone. We have some space in io_kiocb, so keep kbufs in a separate field without aliasing and without the risk of misuse.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/3e63a6a953b04cad81d9ea827b12344dd57b37b4.1633107393.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>