[RHEL7,COMMIT] fs: direct-io: don't dirtying pages for ITER_BVEC/ITER_KVEC direct read

Submitted by Konstantin Khorenko on May 25, 2020, 2:52 p.m.


Message ID 202005251452.04PEqd0c004940@finist-ce7.sw.ru
State New
Series "fs, direct_IO: Switch to iov_iter and allow bio_vec for ext4"

Commit Message

The commit is pushed to "branch-rh7-3.10.0-1127.8.2.vz7.161.x-ovz" and will appear at https://src.openvz.org/scm/ovz/vzkernel.git
after rh7-3.10.0-1127.8.2.vz7.161.1
commit 96629f0a3cdb20f11d3a44db3f428e667f6ae23a
Author: Ming Lei <ming.lei@canonical.com>
Date:   Mon May 25 17:52:38 2020 +0300

    fs: direct-io: don't dirtying pages for ITER_BVEC/ITER_KVEC direct read
    ms commit 53cbf3b157a0
    When direct read I/O is submitted from the kernel, it is often
    unnecessary to dirty pages. In the loop case, for example, dirtying
    pages has already been handled on the upper filesystem (over loop)
    side, so they don't need to be dirtied again.
    So this patch doesn't dirty pages for ITER_BVEC/ITER_KVEC
    direct reads; loop should be the first user of ITER_BVEC/ITER_KVEC
    for direct read I/O.
    The patch is based on Dave's previous patch.
    Reviewed-by: Dave Kleikamp <dave.kleikamp@oracle.com>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Signed-off-by: Ming Lei <ming.lei@canonical.com>
    Signed-off-by: Jens Axboe <axboe@fb.com>
    Without this patch, bio_set_pages_dirty() results in a deadlock for bvec pages.
    Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
    Patchset description:
    [00/30] fs,direct_IO: Switch to iov_iter and allow bio_vec for ext4
    This patchset converts the direct_IO callbacks, blockdev_direct_IO
    and its underlying functions to iov_iter, and introduces complete
    iov_iter support for ext4.
    The iov_iter subtypes supported for ext4 are iovec and bio_vec. The
    first is for traditional user-submitted aio, while bio_vec is the
    type that matters to us, since we use it in ploop.
    bio_vec operates on pages instead of user addresses (as iovec
    does), so it requires specific callbacks in do_blockdev_direct_IO()
    and in the functions it calls.
    The patchset reworks do_blockdev_direct_IO() in the same manner
    as in mainstream. Most of the remaining patches were prepared manually,
    since we differ significantly from ms (RHEL7 patches, our
    direct IO patches for FUSE; all of these have changed many functions).
    As a result, the kaio engine (which goes through direct_IO) can now
    be enabled for ext4.
 fs/direct-io.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/fs/direct-io.c b/fs/direct-io.c
index f7e464d8bcdb0..1c3a4851e5cf6 100644
--- a/fs/direct-io.c
+++ b/fs/direct-io.c
@@ -129,6 +129,7 @@  struct dio {
 	int page_errors;		/* errno from get_user_pages() */
 	int is_async;			/* is IO async ? */
 	bool defer_completion;		/* defer AIO completion to workqueue? */
+	bool should_dirty;		/* if pages should be dirtied */
 	int io_error;			/* IO error in completion path */
 	unsigned long refcount;		/* direct_io_worker() and bios */
 	struct bio *bio_list;		/* singly linked via bi_private */
@@ -471,7 +472,7 @@  static inline void dio_bio_submit(struct dio *dio, struct dio_submit *sdio)
 	spin_unlock_irqrestore(&dio->bio_lock, flags);
-	if (dio->is_async && dio->rw == READ)
+	if (dio->is_async && dio->rw == READ && dio->should_dirty)
 	if (sdio->submit_io)
@@ -542,13 +543,14 @@  static int dio_bio_complete(struct dio *dio, struct bio *bio)
 	if (!uptodate)
 		dio->io_error = -EIO;
-	if (dio->is_async && dio->rw == READ) {
+	if (dio->is_async && dio->rw == READ && dio->should_dirty) {
 		bio_check_pages_dirty(bio);	/* transfers ownership */
 	} else {
 		bio_for_each_segment_all(bvec, bio, i) {
 			struct page *page = bvec->bv_page;
-			if (dio->rw == READ && !PageCompound(page))
+			if (dio->rw == READ && !PageCompound(page) &&
+					dio->should_dirty)
@@ -1324,6 +1326,7 @@  do_blockdev_direct_IO(int rw, struct kiocb *iocb, struct inode *inode,
 	dio->refcount = 1;
+	dio->should_dirty = iov_iter_has_iovec(iter);
 	sdio.iter = iter;
 	sdio.final_block_in_request =
 		(offset + iov_iter_count(iter)) >> blkbits;
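
For readers who want to see the effect of the new flag outside the kernel, the following is a minimal userspace sketch of the decision this patch introduces. It is not kernel code: every `model_*` and `MODEL_*` name is a hypothetical stand-in invented for illustration, and it assumes that iov_iter_has_iovec() is true only for iterators backed by a user iovec. The loop mirrors the per-page condition in the else branch of dio_bio_complete(); where the kernel would dirty the page, the model only sets a flag and counts it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's iterator kinds and rw values. */
enum model_iter_type { MODEL_ITER_IOVEC, MODEL_ITER_KVEC, MODEL_ITER_BVEC };
enum model_rw { MODEL_READ, MODEL_WRITE };

/* Just enough page state to observe the effect of the flag. */
struct model_page {
	bool compound;	/* stands in for PageCompound(page) */
	bool dirty;	/* set where the kernel would dirty the page */
};

/* The subset of struct dio that the patch touches. */
struct model_dio {
	enum model_rw rw;
	bool should_dirty;
};

/* Assumption: iov_iter_has_iovec() is true only for user iovecs. */
static bool model_iter_has_iovec(enum model_iter_type type)
{
	return type == MODEL_ITER_IOVEC;
}

/* Mirrors the assignment added to do_blockdev_direct_IO(). */
static void model_dio_init(struct model_dio *dio, enum model_rw rw,
			   enum model_iter_type type)
{
	dio->rw = rw;
	dio->should_dirty = model_iter_has_iovec(type);
}

/*
 * Mirrors the per-page condition in the else branch of
 * dio_bio_complete(). Returns the number of pages marked dirty.
 */
static size_t model_complete(const struct model_dio *dio,
			     struct model_page *pages, size_t n)
{
	size_t dirtied = 0;

	assert(pages != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (dio->rw == MODEL_READ && !pages[i].compound &&
		    dio->should_dirty) {
			pages[i].dirty = true;
			dirtied++;
		}
	}
	return dirtied;
}

/* Runs one request over two pages: one normal page, one compound page. */
static size_t model_run(enum model_rw rw, enum model_iter_type type)
{
	struct model_page pages[2] = {
		{ .compound = false, .dirty = false },
		{ .compound = true,  .dirty = false },
	};
	struct model_dio dio;

	model_dio_init(&dio, rw, type);
	return model_complete(&dio, pages, 2);
}
```

In this model, model_run(MODEL_READ, MODEL_ITER_IOVEC) dirties only the non-compound page, while the same read with MODEL_ITER_BVEC or MODEL_ITER_KVEC dirties nothing, which is the behaviour the commit message describes for kernel-submitted direct reads such as those from loop or ploop.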