Skip to content

Commit 1d8400b

Browse files
fdmananakdave
authored andcommitted
btrfs: avoid transaction commit on any fsync after subvolume creation
As of commit 1b53e51 ("btrfs: don't commit transaction for every subvol create") we started to make any fsync after creating a subvolume to fallback to a transaction commit if the fsync is performed in the same transaction that was used to create the subvolume. This happens with the following at ioctl.c:create_subvol(): $ cat fs/btrfs/ioctl.c (...) /* Tree log can't currently deal with an inode which is a new root. */ btrfs_set_log_full_commit(trans); (...) Note that the comment is misleading as the problem is not that fsync can not deal with the root inode of a new root, but that we can not log any inode that belongs to a root that was not yet persisted because that would make log replay fail since the root doesn't exist at log replay time. The above simply makes any fsync fallback to a full transaction commit if it happens in the same transaction used to create the subvolume - even if it's an inode that belongs to any other subvolume. This is a brute force solution and it doesn't necessarily improve performance for every workload out there - it just moves a full transaction commit from one place, the subvolume creation, to another - an fsync for any inode. Just improve on this by making the fallback to a transaction commit only for an fsync against an inode of the new subvolume, or for the directory that contains the dentry that points to the new subvolume (in case anyone attempts to fsync the directory in the same transaction). Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
1 parent 6c3c553 commit 1d8400b

File tree

3 files changed

+31
-2
lines changed

3 files changed

+31
-2
lines changed

fs/btrfs/ioctl.c

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -662,8 +662,6 @@ static noinline int create_subvol(struct mnt_idmap *idmap,
662662
qgroup_reserved = 0;
663663
trans->block_rsv = &block_rsv;
664664
trans->bytes_reserved = block_rsv.size;
665-
/* Tree log can't currently deal with an inode which is a new root. */
666-
btrfs_set_log_full_commit(trans);
667665

668666
ret = btrfs_qgroup_inherit(trans, 0, objectid, btrfs_root_id(root), inherit);
669667
if (ret)
@@ -764,6 +762,8 @@ static noinline int create_subvol(struct mnt_idmap *idmap,
764762
goto out;
765763
}
766764

765+
btrfs_record_new_subvolume(trans, BTRFS_I(dir));
766+
767767
d_instantiate_new(dentry, new_inode_args.inode);
768768
new_inode_args.inode = NULL;
769769

fs/btrfs/tree-log.c

Lines changed: 27 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -7065,6 +7065,15 @@ static int btrfs_log_inode_parent(struct btrfs_trans_handle *trans,
70657065
goto end_no_trans;
70667066
}
70677067

7068+
/*
7069+
* If we're logging an inode from a subvolume created in the current
7070+
* transaction we must force a commit since the root is not persisted.
7071+
*/
7072+
if (btrfs_root_generation(&root->root_item) == trans->transid) {
7073+
ret = BTRFS_LOG_FORCE_COMMIT;
7074+
goto end_no_trans;
7075+
}
7076+
70687077
/*
70697078
* Skip already logged inodes or inodes corresponding to tmpfiles
70707079
* (since logging them is pointless, a link count of 0 means they
@@ -7445,6 +7454,24 @@ void btrfs_record_snapshot_destroy(struct btrfs_trans_handle *trans,
74457454
mutex_unlock(&dir->log_mutex);
74467455
}
74477456

7457+
/*
7458+
* Call this when creating a subvolume in a directory.
7459+
* Because we don't commit a transaction when creating a subvolume, we can't
7460+
* allow the directory pointing to the subvolume to be logged with an entry that
7461+
* points to an unpersisted root if we are still in the transaction used to
7462+
* create the subvolume, so make any attempt to log the directory to result in a
7463+
* full log sync.
7464+
* Also we don't need to worry with renames, since btrfs_rename() marks the log
7465+
* for full commit when renaming a subvolume.
7466+
*/
7467+
void btrfs_record_new_subvolume(const struct btrfs_trans_handle *trans,
7468+
struct btrfs_inode *dir)
7469+
{
7470+
mutex_lock(&dir->log_mutex);
7471+
dir->last_unlink_trans = trans->transid;
7472+
mutex_unlock(&dir->log_mutex);
7473+
}
7474+
74487475
/*
74497476
* Update the log after adding a new name for an inode.
74507477
*

fs/btrfs/tree-log.h

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -94,6 +94,8 @@ void btrfs_record_unlink_dir(struct btrfs_trans_handle *trans,
9494
bool for_rename);
9595
void btrfs_record_snapshot_destroy(struct btrfs_trans_handle *trans,
9696
struct btrfs_inode *dir);
97+
void btrfs_record_new_subvolume(const struct btrfs_trans_handle *trans,
98+
struct btrfs_inode *dir);
9799
void btrfs_log_new_name(struct btrfs_trans_handle *trans,
98100
struct dentry *old_dentry, struct btrfs_inode *old_dir,
99101
u64 old_dir_index, struct dentry *parent);

0 commit comments

Comments
 (0)