aboutsummaryrefslogtreecommitdiffstats
path: root/drivers/md/dm-integrity.c (follow)
AgeCommit message (Collapse)AuthorFilesLines
2018-07-27dm integrity: implement fair range locksMikulas Patocka1-9/+59
dm-integrity locks a range of sectors to prevent concurrent I/O or journal writeback. These locks were not fair - so that many small overlapping I/Os could starve a large I/O indefinitely. Fix this by making the range locks fair. The ranges that are waiting are added to the list "wait_list". If a new I/O overlaps some of the waiting I/Os, it is not dispatched, but it is also added to that wait list. Entries on the wait list are processed in first-in-first-out order, so that an I/O can't starve indefinitely. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2018-07-27dm integrity: decouple common code in dm_integrity_map_continue()Mikulas Patocka1-3/+7
Decouple how dm_integrity_map_continue() responds to being out of free sectors and when add_new_range() fails. This has no functional change, but helps prepare for the next commit. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2018-07-27dm integrity: change 'suspending' variable from bool to intMikulas Patocka1-3/+3
Early alpha processors can't write a byte or short atomically - they read 8 bytes, modify the byte or two bytes in registers and write back 8 bytes. The modification of the variable "suspending" may race with modification of the variable "failed". Fix this by changing "suspending" to an int. Cc: stable@vger.kernel.org Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2018-06-12treewide: kvmalloc() -> kvmalloc_array()Kees Cook1-4/+11
The kvmalloc() function has a 2-factor argument form, kvmalloc_array(). This patch replaces cases of: kvmalloc(a * b, gfp) with: kvmalloc_array(a * b, gfp) as well as handling cases of: kvmalloc(a * b * c, gfp) with: kvmalloc(array3_size(a, b, c), gfp) as it's slightly less ugly than: kvmalloc_array(array_size(a, b), c, gfp) This does, however, attempt to ignore constant size factors like: kvmalloc(4 * 1024, gfp) though any constants defined via macros get caught up in the conversion. Any factors with a sizeof() of "unsigned char", "char", and "u8" were dropped, since they're redundant. The Coccinelle script used for this was: // Fix redundant parens around sizeof(). @@ type TYPE; expression THING, E; @@ ( kvmalloc( - (sizeof(TYPE)) * E + sizeof(TYPE) * E , ...) | kvmalloc( - (sizeof(THING)) * E + sizeof(THING) * E , ...) ) // Drop single-byte sizes and redundant parens. @@ expression COUNT; typedef u8; typedef __u8; @@ ( kvmalloc( - sizeof(u8) * (COUNT) + COUNT , ...) | kvmalloc( - sizeof(__u8) * (COUNT) + COUNT , ...) | kvmalloc( - sizeof(char) * (COUNT) + COUNT , ...) | kvmalloc( - sizeof(unsigned char) * (COUNT) + COUNT , ...) | kvmalloc( - sizeof(u8) * COUNT + COUNT , ...) | kvmalloc( - sizeof(__u8) * COUNT + COUNT , ...) | kvmalloc( - sizeof(char) * COUNT + COUNT , ...) | kvmalloc( - sizeof(unsigned char) * COUNT + COUNT , ...) ) // 2-factor product with sizeof(type/expression) and identifier or constant. @@ type TYPE; expression THING; identifier COUNT_ID; constant COUNT_CONST; @@ ( - kvmalloc + kvmalloc_array ( - sizeof(TYPE) * (COUNT_ID) + COUNT_ID, sizeof(TYPE) , ...) | - kvmalloc + kvmalloc_array ( - sizeof(TYPE) * COUNT_ID + COUNT_ID, sizeof(TYPE) , ...) | - kvmalloc + kvmalloc_array ( - sizeof(TYPE) * (COUNT_CONST) + COUNT_CONST, sizeof(TYPE) , ...) | - kvmalloc + kvmalloc_array ( - sizeof(TYPE) * COUNT_CONST + COUNT_CONST, sizeof(TYPE) , ...) | - kvmalloc + kvmalloc_array ( - sizeof(THING) * (COUNT_ID) + COUNT_ID, sizeof(THING) , ...) | - kvmalloc + kvmalloc_array ( - sizeof(THING) * COUNT_ID + COUNT_ID, sizeof(THING) , ...) | - kvmalloc + kvmalloc_array ( - sizeof(THING) * (COUNT_CONST) + COUNT_CONST, sizeof(THING) , ...) | - kvmalloc + kvmalloc_array ( - sizeof(THING) * COUNT_CONST + COUNT_CONST, sizeof(THING) , ...) ) // 2-factor product, only identifiers. @@ identifier SIZE, COUNT; @@ - kvmalloc + kvmalloc_array ( - SIZE * COUNT + COUNT, SIZE , ...) // 3-factor product with 1 sizeof(type) or sizeof(expression), with // redundant parens removed. @@ expression THING; identifier STRIDE, COUNT; type TYPE; @@ ( kvmalloc( - sizeof(TYPE) * (COUNT) * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) | kvmalloc( - sizeof(TYPE) * (COUNT) * STRIDE + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) | kvmalloc( - sizeof(TYPE) * COUNT * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) | kvmalloc( - sizeof(TYPE) * COUNT * STRIDE + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) | kvmalloc( - sizeof(THING) * (COUNT) * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) | kvmalloc( - sizeof(THING) * (COUNT) * STRIDE + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) | kvmalloc( - sizeof(THING) * COUNT * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) | kvmalloc( - sizeof(THING) * COUNT * STRIDE + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) ) // 3-factor product with 2 sizeof(variable), with redundant parens removed. @@ expression THING1, THING2; identifier COUNT; type TYPE1, TYPE2; @@ ( kvmalloc( - sizeof(TYPE1) * sizeof(TYPE2) * COUNT + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2)) , ...) | kvmalloc( - sizeof(TYPE1) * sizeof(THING2) * (COUNT) + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2)) , ...) | kvmalloc( - sizeof(THING1) * sizeof(THING2) * COUNT + array3_size(COUNT, sizeof(THING1), sizeof(THING2)) , ...) | kvmalloc( - sizeof(THING1) * sizeof(THING2) * (COUNT) + array3_size(COUNT, sizeof(THING1), sizeof(THING2)) , ...) | kvmalloc( - sizeof(TYPE1) * sizeof(THING2) * COUNT + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2)) , ...) | kvmalloc( - sizeof(TYPE1) * sizeof(THING2) * (COUNT) + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2)) , ...) ) // 3-factor product, only identifiers, with redundant parens removed. @@ identifier STRIDE, SIZE, COUNT; @@ ( kvmalloc( - (COUNT) * STRIDE * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) | kvmalloc( - COUNT * (STRIDE) * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) | kvmalloc( - COUNT * STRIDE * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) | kvmalloc( - (COUNT) * (STRIDE) * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) | kvmalloc( - COUNT * (STRIDE) * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) | kvmalloc( - (COUNT) * STRIDE * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) | kvmalloc( - (COUNT) * (STRIDE) * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) | kvmalloc( - COUNT * STRIDE * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) ) // Any remaining multi-factor products, first at least 3-factor products, // when they're not all constants... @@ expression E1, E2, E3; constant C1, C2, C3; @@ ( kvmalloc(C1 * C2 * C3, ...) | kvmalloc( - (E1) * E2 * E3 + array3_size(E1, E2, E3) , ...) | kvmalloc( - (E1) * (E2) * E3 + array3_size(E1, E2, E3) , ...) | kvmalloc( - (E1) * (E2) * (E3) + array3_size(E1, E2, E3) , ...) | kvmalloc( - E1 * E2 * E3 + array3_size(E1, E2, E3) , ...) ) // And then all remaining 2 factors products when they're not all constants, // keeping sizeof() as the second factor argument. @@ expression THING, E1, E2; type TYPE; constant C1, C2, C3; @@ ( kvmalloc(sizeof(THING) * C2, ...) | kvmalloc(sizeof(TYPE) * C2, ...) | kvmalloc(C1 * C2 * C3, ...) | kvmalloc(C1 * C2, ...) | - kvmalloc + kvmalloc_array ( - sizeof(TYPE) * (E2) + E2, sizeof(TYPE) , ...) | - kvmalloc + kvmalloc_array ( - sizeof(TYPE) * E2 + E2, sizeof(TYPE) , ...) | - kvmalloc + kvmalloc_array ( - sizeof(THING) * (E2) + E2, sizeof(THING) , ...) | - kvmalloc + kvmalloc_array ( - sizeof(THING) * E2 + E2, sizeof(THING) , ...) | - kvmalloc + kvmalloc_array ( - (E1) * E2 + E1, E2 , ...) | - kvmalloc + kvmalloc_array ( - (E1) * (E2) + E1, E2 , ...) | - kvmalloc + kvmalloc_array ( - E1 * E2 + E1, E2 , ...) ) Signed-off-by: Kees Cook <keescook@chromium.org>
2018-06-12treewide: kmalloc() -> kmalloc_array()Kees Cook1-1/+2
The kmalloc() function has a 2-factor argument form, kmalloc_array(). This patch replaces cases of: kmalloc(a * b, gfp) with: kmalloc_array(a * b, gfp) as well as handling cases of: kmalloc(a * b * c, gfp) with: kmalloc(array3_size(a, b, c), gfp) as it's slightly less ugly than: kmalloc_array(array_size(a, b), c, gfp) This does, however, attempt to ignore constant size factors like: kmalloc(4 * 1024, gfp) though any constants defined via macros get caught up in the conversion. Any factors with a sizeof() of "unsigned char", "char", and "u8" were dropped, since they're redundant. The tools/ directory was manually excluded, since it has its own implementation of kmalloc(). The Coccinelle script used for this was: // Fix redundant parens around sizeof(). @@ type TYPE; expression THING, E; @@ ( kmalloc( - (sizeof(TYPE)) * E + sizeof(TYPE) * E , ...) | kmalloc( - (sizeof(THING)) * E + sizeof(THING) * E , ...) ) // Drop single-byte sizes and redundant parens. @@ expression COUNT; typedef u8; typedef __u8; @@ ( kmalloc( - sizeof(u8) * (COUNT) + COUNT , ...) | kmalloc( - sizeof(__u8) * (COUNT) + COUNT , ...) | kmalloc( - sizeof(char) * (COUNT) + COUNT , ...) | kmalloc( - sizeof(unsigned char) * (COUNT) + COUNT , ...) | kmalloc( - sizeof(u8) * COUNT + COUNT , ...) | kmalloc( - sizeof(__u8) * COUNT + COUNT , ...) | kmalloc( - sizeof(char) * COUNT + COUNT , ...) | kmalloc( - sizeof(unsigned char) * COUNT + COUNT , ...) ) // 2-factor product with sizeof(type/expression) and identifier or constant. @@ type TYPE; expression THING; identifier COUNT_ID; constant COUNT_CONST; @@ ( - kmalloc + kmalloc_array ( - sizeof(TYPE) * (COUNT_ID) + COUNT_ID, sizeof(TYPE) , ...) | - kmalloc + kmalloc_array ( - sizeof(TYPE) * COUNT_ID + COUNT_ID, sizeof(TYPE) , ...) | - kmalloc + kmalloc_array ( - sizeof(TYPE) * (COUNT_CONST) + COUNT_CONST, sizeof(TYPE) , ...) | - kmalloc + kmalloc_array ( - sizeof(TYPE) * COUNT_CONST + COUNT_CONST, sizeof(TYPE) , ...) | - kmalloc + kmalloc_array ( - sizeof(THING) * (COUNT_ID) + COUNT_ID, sizeof(THING) , ...) | - kmalloc + kmalloc_array ( - sizeof(THING) * COUNT_ID + COUNT_ID, sizeof(THING) , ...) | - kmalloc + kmalloc_array ( - sizeof(THING) * (COUNT_CONST) + COUNT_CONST, sizeof(THING) , ...) | - kmalloc + kmalloc_array ( - sizeof(THING) * COUNT_CONST + COUNT_CONST, sizeof(THING) , ...) ) // 2-factor product, only identifiers. @@ identifier SIZE, COUNT; @@ - kmalloc + kmalloc_array ( - SIZE * COUNT + COUNT, SIZE , ...) // 3-factor product with 1 sizeof(type) or sizeof(expression), with // redundant parens removed. @@ expression THING; identifier STRIDE, COUNT; type TYPE; @@ ( kmalloc( - sizeof(TYPE) * (COUNT) * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) | kmalloc( - sizeof(TYPE) * (COUNT) * STRIDE + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) | kmalloc( - sizeof(TYPE) * COUNT * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) | kmalloc( - sizeof(TYPE) * COUNT * STRIDE + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) | kmalloc( - sizeof(THING) * (COUNT) * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) | kmalloc( - sizeof(THING) * (COUNT) * STRIDE + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) | kmalloc( - sizeof(THING) * COUNT * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) | kmalloc( - sizeof(THING) * COUNT * STRIDE + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) ) // 3-factor product with 2 sizeof(variable), with redundant parens removed. @@ expression THING1, THING2; identifier COUNT; type TYPE1, TYPE2; @@ ( kmalloc( - sizeof(TYPE1) * sizeof(TYPE2) * COUNT + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2)) , ...) | kmalloc( - sizeof(TYPE1) * sizeof(THING2) * (COUNT) + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2)) , ...) | kmalloc( - sizeof(THING1) * sizeof(THING2) * COUNT + array3_size(COUNT, sizeof(THING1), sizeof(THING2)) , ...) | kmalloc( - sizeof(THING1) * sizeof(THING2) * (COUNT) + array3_size(COUNT, sizeof(THING1), sizeof(THING2)) , ...) | kmalloc( - sizeof(TYPE1) * sizeof(THING2) * COUNT + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2)) , ...) | kmalloc( - sizeof(TYPE1) * sizeof(THING2) * (COUNT) + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2)) , ...) ) // 3-factor product, only identifiers, with redundant parens removed. @@ identifier STRIDE, SIZE, COUNT; @@ ( kmalloc( - (COUNT) * STRIDE * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) | kmalloc( - COUNT * (STRIDE) * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) | kmalloc( - COUNT * STRIDE * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) | kmalloc( - (COUNT) * (STRIDE) * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) | kmalloc( - COUNT * (STRIDE) * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) | kmalloc( - (COUNT) * STRIDE * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) | kmalloc( - (COUNT) * (STRIDE) * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) | kmalloc( - COUNT * STRIDE * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) ) // Any remaining multi-factor products, first at least 3-factor products, // when they're not all constants... @@ expression E1, E2, E3; constant C1, C2, C3; @@ ( kmalloc(C1 * C2 * C3, ...) | kmalloc( - (E1) * E2 * E3 + array3_size(E1, E2, E3) , ...) | kmalloc( - (E1) * (E2) * E3 + array3_size(E1, E2, E3) , ...) | kmalloc( - (E1) * (E2) * (E3) + array3_size(E1, E2, E3) , ...) | kmalloc( - E1 * E2 * E3 + array3_size(E1, E2, E3) , ...) ) // And then all remaining 2 factors products when they're not all constants, // keeping sizeof() as the second factor argument. @@ expression THING, E1, E2; type TYPE; constant C1, C2, C3; @@ ( kmalloc(sizeof(THING) * C2, ...) | kmalloc(sizeof(TYPE) * C2, ...) | kmalloc(C1 * C2 * C3, ...) | kmalloc(C1 * C2, ...) | - kmalloc + kmalloc_array ( - sizeof(TYPE) * (E2) + E2, sizeof(TYPE) , ...) | - kmalloc + kmalloc_array ( - sizeof(TYPE) * E2 + E2, sizeof(TYPE) , ...) | - kmalloc + kmalloc_array ( - sizeof(THING) * (E2) + E2, sizeof(THING) , ...) | - kmalloc + kmalloc_array ( - sizeof(THING) * E2 + E2, sizeof(THING) , ...) | - kmalloc + kmalloc_array ( - (E1) * E2 + E1, E2 , ...) | - kmalloc + kmalloc_array ( - (E1) * (E2) + E1, E2 , ...) | - kmalloc + kmalloc_array ( - E1 * E2 + E1, E2 , ...) ) Signed-off-by: Kees Cook <keescook@chromium.org>
2018-06-04Merge tag 'for-4.18/block-20180603' of git://git.kernel.dk/linux-blockLinus Torvalds1-8/+7
Pull block updates from Jens Axboe: - clean up how we pass around gfp_t and blk_mq_req_flags_t (Christoph) - prepare us to defer scheduler attach (Christoph) - clean up drivers handling of bounce buffers (Christoph) - fix timeout handling corner cases (Christoph/Bart/Keith) - bcache fixes (Coly) - prep work for bcachefs and some block layer optimizations (Kent). - convert users of bio_sets to using embedded structs (Kent). - fixes for the BFQ io scheduler (Paolo/Davide/Filippo) - lightnvm fixes and improvements (Matias, with contributions from Hans and Javier) - adding discard throttling to blk-wbt (me) - sbitmap blk-mq-tag handling (me/Omar/Ming). - remove the sparc jsflash block driver, acked by DaveM. - Kyber scheduler improvement from Jianchao, making it more friendly wrt merging. - conversion of symbolic proc permissions to octal, from Joe Perches. Previously the block parts were a mix of both. - nbd fixes (Josef and Kevin Vigor) - unify how we handle the various kinds of timestamps that the block core and utility code uses (Omar) - three NVMe pull requests from Keith and Christoph, bringing AEN to feature completeness, file backed namespaces, cq/sq lock split, and various fixes - various little fixes and improvements all over the map * tag 'for-4.18/block-20180603' of git://git.kernel.dk/linux-block: (196 commits) blk-mq: update nr_requests when switching to 'none' scheduler block: don't use blocking queue entered for recursive bio submits dm-crypt: fix warning in shutdown path lightnvm: pblk: take bitmap alloc. out of critical section lightnvm: pblk: kick writer on new flush points lightnvm: pblk: only try to recover lines with written smeta lightnvm: pblk: remove unnecessary bio_get/put lightnvm: pblk: add possibility to set write buffer size manually lightnvm: fix partial read error path lightnvm: proper error handling for pblk_bio_add_pages lightnvm: pblk: fix smeta write error path lightnvm: pblk: garbage collect lines with failed writes lightnvm: pblk: rework write error recovery path lightnvm: pblk: remove dead function lightnvm: pass flag on graceful teardown to targets lightnvm: pblk: check for chunk size before allocating it lightnvm: pblk: remove unnecessary argument lightnvm: pblk: remove unnecessary indirection lightnvm: pblk: return NVM_ error on failed submission lightnvm: pblk: warn in case of corrupted write buffer ...
2018-05-30dm: convert to bioset_init()/mempool_init()Kent Overstreet1-8/+7
Convert dm to embedded bio sets. Acked-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-04-30dm integrity: use kvfree for kvmalloc'd memoryMikulas Patocka1-1/+1
Use kvfree instead of kfree because the array is allocated with kvmalloc. Fixes: 7eada909bfd7a ("dm: add integrity target") Cc: stable@vger.kernel.org # v4.12+ Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2018-04-03dm bufio: move dm-bufio.h to include/linux/Mikulas Patocka1-1/+1
Move dm-bufio.h to include/linux/ so that external GPL'd DM target modules can use it. It is better to allow the use of dm-bufio than force external modules to implement the equivalent buffered IO mechanism in some new way. The hope is this will encourage the use of dm-bufio; which will then make it easier for a GPL'd external DM target module to be included upstream. A couple dm-bufio EXPORT_SYMBOL exports have also been updated to use EXPORT_SYMBOL_GPL. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2018-04-03dm integrity: fail early if required HMAC key is not availableMilan Broz1-0/+3
Since crypto API commit 9fa68f62004 ("crypto: hash - prevent using keyed hashes without setting key") dm-integrity cannot use keyed algorithms without the key being set. The dm-integrity recognizes this too late (during use of HMAC), so it allows creation and formatting of superblock, but the device is in fact unusable. Fix it by detecting the key requirement in integrity table constructor. Signed-off-by: Milan Broz <gmazyland@gmail.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2018-01-17dm integrity: don't store cipher request on the stackMikulas Patocka1-12/+37
Some asynchronous cipher implementations may use DMA. The stack may be mapped in the vmalloc area that doesn't support DMA. Therefore, the cipher request and initialization vector shouldn't be on the stack. Fix this by allocating the request and iv with kmalloc. Cc: stable@vger.kernel.org # v4.12+ Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2017-11-14md: Convert timers to use timer_setup()Kees Cook1-3/+3
In preparation for unconditionally passing the struct timer_list pointer to all timer callbacks, switch to using the new timer_setup() and from_timer() to pass the timer pointer explicitly. Cc: Kent Overstreet <kent.overstreet@gmail.com> Cc: Shaohua Li <shli@kernel.org> Cc: Alasdair Kergon <agk@redhat.com> Cc: Mike Snitzer <snitzer@redhat.com> Cc: dm-devel@redhat.com Cc: linux-bcache@vger.kernel.org Cc: linux-raid@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: Michael Lyle <mlyle@lyle.org> Reviewed-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-11-14Merge tag 'for-4.15/dm' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dmLinus Torvalds1-1/+1
Pull device mapper updates from Mike Snitzer: - a few conversions from atomic_t to ref_count_t - a DM core fix for a race during device destruction that could result in a BUG_ON - a stable@ fix for a DM cache race condition that could lead to data corruption when operating in writeback mode (writethrough is default) - various DM cache cleanups and improvements - add DAX support to the DM log-writes target - a fix for the DM zoned target's ability to deal with the last zone of the drive being smaller than all others - a stable@ DM crypt and DM integrity fix for a negative check that was to restrictive (prevented slab debug with XFS ontop of DM crypt from working) - a DM raid target fix for a panic that can occur when forcing a raid to sync * tag 'for-4.15/dm' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: (25 commits) dm cache: lift common migration preparation code to alloc_migration() dm cache: remove usused deferred_cells member from struct cache dm cache policy smq: allocate cache blocks in order dm cache policy smq: change max background work from 10240 to 4096 blocks dm cache background tracker: limit amount of background work that may be issued at once dm cache policy smq: take origin idle status into account when queuing writebacks dm cache policy smq: handle races with queuing background_work dm raid: fix panic when attempting to force a raid to sync dm integrity: allow unaligned bv_offset dm crypt: allow unaligned bv_offset dm: small cleanup in dm_get_md() dm: fix race between dm_get_from_kobject() and __dm_destroy() dm: allocate struct mapped_device with kvzalloc dm zoned: ignore last smaller runt zone dm space map metadata: use ARRAY_SIZE dm log writes: add support for DAX dm log writes: add support for inline data buffers dm cache: simplify get_per_bio_data() by removing data_size argument dm cache: remove all obsolete writethrough-specific code dm cache: submit writethrough writes in parallel to origin and cache ...
2017-11-10dm integrity: allow unaligned bv_offsetMikulas Patocka1-1/+1
When slub_debug is enabled kmalloc returns unaligned memory. XFS uses this unaligned memory for its buffers (if an unaligned buffer crosses a page, XFS frees it and allocates a full page instead - see the function xfs_buf_allocate_memory). dm-integrity checks if bv_offset is aligned on page size and this check fail with slub_debug and XFS. Fix this bug by removing the bv_offset check, leaving only the check for bv_len. Fixes: 7eada909bfd7 ("dm: add integrity target") Cc: stable@vger.kernel.org # v4.12+ Reported-by: Bruno Prémont <bonbons@sysophe.eu> Reviewed-by: Milan Broz <gmazyland@gmail.com> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2017-10-25locking/atomics, dm-integrity: Convert ACCESS_ONCE() to READ_ONCE()/WRITE_ONCE()Mark Rutland1-7/+8
For several reasons, it is desirable to use {READ,WRITE}_ONCE() in preference to ACCESS_ONCE(), and new code is expected to use one of the former. So far, there's been no reason to change most existing uses of ACCESS_ONCE(), as these aren't currently harmful. However, for some features it is necessary to instrument reads and writes separately, which is not possible with ACCESS_ONCE(). This distinction is critical to correct operation. It's possible to transform the bulk of kernel code using the Coccinelle script below. However, this doesn't pick up some uses, including those in dm-integrity.c. As a preparatory step, this patch converts the driver to use {READ,WRITE}_ONCE() consistently. At the same time, this patch adds the missing include of <linux/compiler.h> necessary for the {READ,WRITE}_ONCE() definitions. ---- virtual patch @ depends on patch @ expression E1, E2; @@ - ACCESS_ONCE(E1) = E2 + WRITE_ONCE(E1, E2) @ depends on patch @ expression E; @@ - ACCESS_ONCE(E) + READ_ONCE(E) ---- Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Snitzer <snitzer@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: davem@davemloft.net Cc: linux-arch@vger.kernel.org Cc: mpe@ellerman.id.au Cc: shuah@kernel.org Cc: thor.thayer@linux.intel.com Cc: tj@kernel.org Cc: viro@zeniv.linux.org.uk Cc: will.deacon@arm.com Link: http://lkml.kernel.org/r/1508792849-3115-1-git-send-email-paulmck@linux.vnet.ibm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-09-14Merge tag 'for-4.14/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dmLinus Torvalds1-16/+26
Pull device mapper updates from Mike Snitzer: - Some request-based DM core and DM multipath fixes and cleanups - Constify a few variables in DM core and DM integrity - Add bufio optimization and checksum failure accounting to DM integrity - Fix DM integrity to avoid checking integrity of failed reads - Fix DM integrity to use init_completion - A couple DM log-writes target fixes - Simplify DAX flushing by eliminating the unnecessary flush abstraction that was stood up for DM's use. * tag 'for-4.14/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: dax: remove the pmem_dax_ops->flush abstraction dm integrity: use init_completion instead of COMPLETION_INITIALIZER_ONSTACK dm integrity: make blk_integrity_profile structure const dm integrity: do not check integrity for failed read operations dm log writes: fix >512b sectorsize support dm log writes: don't use all the cpu while waiting to log blocks dm ioctl: constify ioctl lookup table dm: constify argument arrays dm integrity: count and display checksum failures dm integrity: optimize writing dm-bufio buffers that are partially changed dm rq: do not update rq partially in each ending bio dm rq: make dm-sq requeuing behavior consistent with dm-mq behavior dm mpath: complain about unsupported __multipath_map_bio() return values dm mpath: avoid that building with W=1 causes gcc 7 to complain about fall-through
2017-09-11dm integrity: use init_completion instead of COMPLETION_INITIALIZER_ONSTACKArnd Bergmann1-10/+10
The new lockdep support for completions causeed the stack usage in dm-integrity to explode, in case of write_journal from 504 bytes to 1120 (using arm gcc-7.1.1): drivers/md/dm-integrity.c: In function 'write_journal': drivers/md/dm-integrity.c:827:1: error: the frame size of 1120 bytes is larger than 1024 bytes [-Werror=frame-larger-than=] The problem is that not only the size of 'struct completion' grows significantly, but we end up having multiple copies of it on the stack when we assign it from a local variable after the initial declaration. COMPLETION_INITIALIZER_ONSTACK() is the right thing to use when we want to declare and initialize a completion on the stack. However, this driver doesn't do that and instead initializes the completion just before it is used. In this case, init_completion() does the same thing more efficiently, and drops the stack usage for the function above down to 496 bytes. While the other functions in this file are not bad enough to cause a warning, they benefit equally from the change, so I do the change across the entire file. In the one place where we reuse a completion, I picked the cheaper reinit_completion() over init_completion(). Fixes: cd8084f91c02 ("locking/lockdep: Apply crossrelease to completions") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Mikulas Patocka <mpatocka@redhat.com> Acked-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2017-09-11dm integrity: make blk_integrity_profile structure constBhumika Goyal1-1/+1
Make this structure const as it is only stored in the profile field of a blk_integrity structure. This field is of type const, so make structure as const. Signed-off-by: Bhumika Goyal <bhumirks@gmail.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2017-09-11dm integrity: do not check integrity for failed read operationsHyunchul Lee1-1/+5
Even though read operations fail, dm_integrity_map_continue() calls integrity_metadata() to check integrity. In this case, just complete these. This also makes it so read I/O errors do not generate integrity warnings in the kernel log. Cc: stable@vger.kernel.org Signed-off-by: Hyunchul Lee <cheol.lee@lge.com> Acked-by: Milan Broz <gmazyland@gmail.com> Acked-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2017-08-28dm: constify argument arraysEric Biggers1-1/+1
The arrays of 'struct dm_arg' are never modified by the device-mapper core, so constify them so that they are placed in .rodata. (Exception: the args array in dm-raid cannot be constified because it is allocated on the stack and modified.) Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2017-08-28dm integrity: count and display checksum failuresMikulas Patocka1-2/+8
This changes DM integrity to count the number of checksum failures and report the counter in response to STATUSTYPE_INFO request (via 'dmsetup status'). Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2017-08-28dm integrity: optimize writing dm-bufio buffers that are partially changedMikulas Patocka1-1/+1
Rather than write the entire dm-bufio buffer when only a subset is changed, improve dm-bufio (and dm-integrity) by only writing the subset of the buffer that changed. Update dm-integrity to make use of dm-bufio's new dm_bufio_mark_partial_buffer_dirty() interface. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2017-08-23block: replace bi_bdev with a gendisk pointer and partitions indexChristoph Hellwig1-4/+7
This way we don't need a block_device structure to submit I/O. The block_device has different life time rules from the gendisk and request_queue and is usually only available when the block device node is open. Other callers need to explicitly create one (e.g. the lightnvm passthrough code, or the new nvme multipathing code). For the actual I/O path all that we need is the gendisk, which exists once per block device. But given that the block layer also does partition remapping we additionally need a partition index, which is used for said remapping in generic_make_request. Note that all the block drivers generally want request_queue or sometimes the gendisk, so this removes a layer of indirection all over the stack. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-07-25dm integrity: test for corrupted disk format during table loadMikulas Patocka1-0/+5
If the dm-integrity superblock was corrupted in such a way that the journal_sections field was zero, the integrity target would deadlock because it would wait forever for free space in the journal. Detect this situation and refuse to activate the device. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Cc: stable@vger.kernel.org Fixes: 7eada909bfd7 ("dm: add integrity target") Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2017-07-25dm integrity: WARN_ON if variables representing journal usage get out of syncMikulas Patocka1-0/+2
If this WARN_ON triggers it speaks to programmer error, and likely implies corruption, but no released kernel should trigger it. This WARN_ON serves to assist DM integrity developers as changes are made/tested in the future. BUG_ON is excessive for catching programmer error, if a user or developer would like warnings to trigger a panic, they can enable that via /proc/sys/kernel/panic_on_warn Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2017-07-19dm integrity: use plugging when writing the journalMikulas Patocka1-0/+5
When copying data from the journal to the appropriate place, we submit many IOs. Some of these IOs could go to adjacent areas. Use on-stack plugging so that adjacent IOs get merged during submission. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2017-07-19dm integrity: fix inefficient allocation of journal spaceMikulas Patocka1-4/+6
When using a block size greater than 512 bytes, the dm-integrity target allocates journal space inefficiently. It allocates one journal entry for each 512-byte chunk of data, fills an entry for each block of data and leaves the remaining entries unused. This issue doesn't cause data corruption, but all the unused journal entries degrade performance severely. For example, with 4k blocks and an 8k bio, it would allocate 16 journal entries but only use 2 entries. The remaining 14 entries were left unused. Fix this by adding the missing 'log2_sectors_per_block' shifts that are required to have each journal entry map to a full block. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Cc: stable@vger.kernel.org Fixes: 7eada909bfd7 ("dm: add integrity target") Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2017-07-03Merge branch 'for-4.13/block' of git://git.kernel.dk/linux-blockLinus Torvalds1-15/+15
Pull core block/IO updates from Jens Axboe: "This is the main pull request for the block layer for 4.13. Not a huge round in terms of features, but there's a lot of churn related to some core cleanups. Note this depends on the UUID tree pull request, that Christoph already sent out. This pull request contains: - A series from Christoph, unifying the error/stats codes in the block layer. We now use blk_status_t everywhere, instead of using different schemes for different places. - Also from Christoph, some cleanups around request allocation and IO scheduler interactions in blk-mq. - And yet another series from Christoph, cleaning up how we handle and do bounce buffering in the block layer. - A blk-mq debugfs series from Bart, further improving on the support we have for exporting internal information to aid debugging IO hangs or stalls. - Also from Bart, a series that cleans up the request initialization differences across types of devices. - A series from Goldwyn Rodrigues, allowing the block layer to return failure if we will block and the user asked for non-blocking. - Patch from Hannes for supporting setting loop devices block size to that of the underlying device. - Two series of patches from Javier, fixing various issues with lightnvm, particular around pblk. - A series from me, adding support for write hints. This comes with NVMe support as well, so applications can help guide data placement on flash to improve performance, latencies, and write amplification. - A series from Ming, improving and hardening blk-mq support for stopping/starting and quiescing hardware queues. - Two pull requests for NVMe updates. Nothing major on the feature side, but lots of cleanups and bug fixes. From the usual crew. - A series from Neil Brown, greatly improving the bio rescue set support. Most notably, this kills the bio rescue work queues, if we don't really need them. - Lots of other little bug fixes that are all over the place" * 'for-4.13/block' of git://git.kernel.dk/linux-block: (217 commits) lightnvm: pblk: set line bitmap check under debug lightnvm: pblk: verify that cache read is still valid lightnvm: pblk: add initialization check lightnvm: pblk: remove target using async. I/Os lightnvm: pblk: use vmalloc for GC data buffer lightnvm: pblk: use right metadata buffer for recovery lightnvm: pblk: schedule if data is not ready lightnvm: pblk: remove unused return variable lightnvm: pblk: fix double-free on pblk init lightnvm: pblk: fix bad le64 assignations nvme: Makefile: remove dead build rule blk-mq: map all HWQ also in hyperthreaded system nvmet-rdma: register ib_client to not deadlock in device removal nvme_fc: fix error recovery on link down. nvmet_fc: fix crashes on bad opcodes nvme_fc: Fix crash when nvme controller connection fails. nvme_fc: replace ioabort msleep loop with completion nvme_fc: fix double calls to nvme_cleanup_cmd() nvme-fabrics: verify that a controller returns the correct NQN nvme: simplify nvme_dev_attrs_are_visible ...
2017-06-21dm integrity: fix to not disable/enable interrupts from interrupt contextMike Snitzer1-2/+5
Use spin_lock_irqsave and spin_unlock_irqrestore rather than spin_{lock,unlock}_irq in submit_flush_bio(). Otherwise lockdep issues the following warning: DEBUG_LOCKS_WARN_ON(current->hardirq_context) WARNING: CPU: 1 PID: 0 at kernel/locking/lockdep.c:2748 trace_hardirqs_on_caller+0x107/0x180 Reported-by: Ondrej Kozina <okozina@redhat.com> Tested-by: Ondrej Kozina <okozina@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Acked-by: Mikulas Patocka <mpatocka@redhat.com>
2017-06-12dm integrity: reject mappings too large for deviceOndrej Mosnáček1-0/+5
dm-integrity would successfully create mappings with the number of sectors greater than the provided data sector count. Attempts to read sectors of this mapping that were beyond the provided data sector count would then yield run-time messages of the form "device-mapper: integrity: Too big sector number: ...". Fix this by emitting an error when the requested mapping size is bigger than the provided data sector count. Signed-off-by: Ondrej Mosnacek <omosnacek@gmail.com> Acked-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2017-06-12Merge tag 'v4.12-rc5' into for-4.13/blockJens Axboe1-22/+8
We've already got a few conflicts and upcoming work depends on some of the changes that have gone into mainline as regression fixes for this series. Pull in 4.12-rc5 to resolve these conflicts and make it easier on down stream trees to continue working on 4.13 changes. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-06-09block: switch bios to blk_status_tChristoph Hellwig1-9/+9
Replace bi_error with a new bi_status to allow for a clear conversion. Note that device mapper overloaded bi_error with a private value, which we'll have to keep arround at least for now and thus propagate to a proper blk_status_t value. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
2017-06-09dm: don't return errnos from ->mapChristoph Hellwig1-6/+6
Instead use the special DM_MAPIO_KILL return value to return -EIO just like we do for the request based path. Note that dm-log-writes returned -ENOMEM in a few places, which now becomes -EIO instead. No consumer treats -ENOMEM special so this shouldn't be an issue (and it should use a mempool to start with to make guaranteed progress). Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2017-05-31dm: make flush bios explicitly syncJan Kara1-1/+2
Commit b685d3d65ac7 ("block: treat REQ_FUA and REQ_PREFLUSH as synchronous") removed REQ_SYNC flag from WRITE_{FUA|PREFLUSH|...} definitions. generic_make_request_checks() however strips REQ_FUA and REQ_PREFLUSH flags from a bio when the storage doesn't report volatile write cache and thus write effectively becomes asynchronous which can lead to performance regressions. Fix the problem by making sure all bios which are synchronous are properly marked with REQ_SYNC. Fixes: b685d3d65ac7 ("block: treat REQ_FUA and REQ_PREFLUSH as synchronous") Cc: stable@vger.kernel.org Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2017-05-22dm integrity: use kvmalloc() instead of dm_integrity_kvmalloc()Mikulas Patocka1-21/+6
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2017-04-27dm integrity: use previously calculated log2 of sectors_per_blockMikulas Patocka1-2/+1
The log2 of sectors_per_block was already calculated, so we don't have to use the ilog2 function. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2017-04-27dm integrity: use hex2bin instead of open-coded variantMikulas Patocka1-12/+2
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2017-04-24dm integrity: support larger block sizesMikulas Patocka1-45/+174
The DM integrity block size can now be 512, 1k, 2k or 4k. Using larger blocks reduces metadata handling overhead. The block size can be configured at table load time using the "block_size:<value>" option; where <value> is expressed in bytes (defult is still 512 bytes). It is safe to use larger block sizes with DM integrity, because the DM integrity journal makes sure that the whole block is updated atomically even if the underlying device doesn't support atomic writes of that size (e.g. 4k block ontop of a 512b device). Depends-on: 2859323e ("block: fix blk_integrity_register to use template's interval_exp if not 0") Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2017-04-24dm integrity: various small changes and cleanupsMikulas Patocka1-54/+62
Some coding style changes. Fix a bug that the array test_tag has insufficient size if the digest size of internal has is bigger than the tag size. The function __fls is undefined for zero argument, this patch fixes undefined behavior if the user sets zero interleave_sectors. Fix the limit of optional arguments to 8. Don't allocate crypt_data on the stack to avoid a BUG with debug kernel. Rename all optional argument names to have underscores rather than dashes. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2017-03-24dm integrity: add recovery modeMikulas Patocka1-13/+27
In recovery mode, we don't: - replay the journal - check checksums - allow writes to the device This mode can be used as a last resort for data recovery. The motivation for recovery mode is that when there is a single error in the journal, the user should not lose access to the whole device. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2017-03-24dm integrity: factor out create_journal() from dm_integrity_ctr()Mike Snitzer1-183/+196
Preparation for next commit that makes call to create_journal() optional. Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2017-03-24dm: add integrity targetMikulas Patocka1-0/+3085
The dm-integrity target emulates a block device that has additional per-sector tags that can be used for storing integrity information. A general problem with storing integrity tags with every sector is that writing the sector and the integrity tag must be atomic - i.e. in case of crash, either both sector and integrity tag or none of them is written. To guarantee write atomicity the dm-integrity target uses a journal. It writes sector data and integrity tags into a journal, commits the journal and then copies the data and integrity tags to their respective location. The dm-integrity target can be used with the dm-crypt target - in this situation the dm-crypt target creates the integrity data and passes them to the dm-integrity target via bio_integrity_payload attached to the bio. In this mode, the dm-crypt and dm-integrity targets provide authenticated disk encryption - if the attacker modifies the encrypted device, an I/O error is returned instead of random data. The dm-integrity target can also be used as a standalone target, in this mode it calculates and verifies the integrity tag internally. In this mode, the dm-integrity target can be used to detect silent data corruption on the disk or in the I/O path. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Milan Broz <gmazyland@gmail.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>