    Btrfs: fix fsync after truncate when no_holes feature is enabled · a89ca6f2
    Filipe Manana authored
    
    
    When the no_holes feature is enabled, if we truncate a file to a
    smaller size, truncate it again to a size greater than or equal to
    its original size, and then fsync it, the log tree will not have any
    information about the hole covering the range
    [truncate_1_offset, new_file_size[. This means that if the fsync log
    is replayed, the file will still contain the data it had before both
    truncate operations.
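As an illustration only, the truncate sequence can be sketched with GNU coreutils on an ordinary temporary file. This does not reproduce the bug (that needs btrfs with no_holes plus a simulated power failure, as in the fstests case in this message); it only shows the state that a correct log replay should preserve. The temporary file and the variable names are assumptions made for this sketch.

```shell
#!/bin/bash
# Hypothetical scratch file; any filesystem works for this illustration.
f=$(mktemp)

# 64K of 0xaa followed by 61K of 0xbb, mirroring file foo in the test.
head -c $((64 * 1024)) /dev/zero | LC_ALL=C tr '\0' '\252' > "$f"
head -c $((61 * 1024)) /dev/zero | LC_ALL=C tr '\0' '\273' >> "$f"

# Shrinking truncate, then expanding truncate back to the original size.
truncate -s 64K "$f"
truncate -s 125K "$f"

# The size is back to 125K, but the last 61K must now read as zeroes.
size=$(( $(wc -c < "$f") ))
tail_nonzero=$(( $(tail -c $((61 * 1024)) "$f" | LC_ALL=C tr -d '\0' | wc -c) ))
echo "size=$size tail_nonzero_bytes=$tail_nonzero"

rm -f "$f"
```

On a filesystem that behaves correctly, the old 0xbb bytes are gone after the second truncate, so the count of non-zero bytes in the tail is 0.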
    
    Without the no_holes feature this does not happen: when the inode is
    logged with the full sync flag set, the logging code finds in the
    fs/subvol tree a leaf with a generation matching the current
    transaction id that has an explicit extent item representing the hole.
    
    Fix this by adding an explicit extent item representing a hole between
    the last extent and the inode's i_size if we are doing a full sync.
    
    The issue is easy to reproduce with the following test case for fstests:
    
      . ./common/rc
      . ./common/filter
      . ./common/dmflakey
    
      _need_to_be_root
      _supported_fs generic
      _supported_os Linux
      _require_scratch
      _require_dm_flakey
    
      # This test was motivated by an issue found in btrfs when the btrfs
      # no-holes feature is enabled (introduced in kernel 3.14). So enable
      # the feature if the fs being tested is btrfs.
      if [ "$FSTYP" = "btrfs" ]; then
          _require_btrfs_fs_feature "no_holes"
          _require_btrfs_mkfs_feature "no-holes"
          MKFS_OPTIONS="$MKFS_OPTIONS -O no-holes"
      fi
    
      rm -f $seqres.full
    
      _scratch_mkfs >>$seqres.full 2>&1
      _init_flakey
      _mount_flakey
    
      # Create our test files and make sure everything is durably persisted.
      $XFS_IO_PROG -f -c "pwrite -S 0xaa 0 64K"         \
                      -c "pwrite -S 0xbb 64K 61K"       \
                      $SCRATCH_MNT/foo | _filter_xfs_io
      $XFS_IO_PROG -f -c "pwrite -S 0xee 0 64K"         \
                      -c "pwrite -S 0xff 64K 61K"       \
                      $SCRATCH_MNT/bar | _filter_xfs_io
      sync
    
      # Now truncate our file foo to a smaller size (64Kb) and then truncate
      # it to the size it had before the shrinking truncate (125Kb). Then
      # fsync our file. If a power failure happens after the fsync, we expect
      # our file to have a size of 125Kb, with the first 64Kb of data having
      # the value 0xaa and the second 61Kb of data having the value 0x00.
      $XFS_IO_PROG -c "truncate 64K" \
                   -c "truncate 125K" \
                   -c "fsync" \
                   $SCRATCH_MNT/foo
    
      # Do something similar to our file bar, but the first truncation sets
      # the file size to 0 and the second truncation expands the size to
      # about double what it was initially.
      $XFS_IO_PROG -c "truncate 0" \
                   -c "truncate 253K" \
                   -c "fsync" \
                   $SCRATCH_MNT/bar
    
      _load_flakey_table $FLAKEY_DROP_WRITES
      _unmount_flakey
    
      # Allow writes again, mount to trigger log replay and validate file
      # contents.
      _load_flakey_table $FLAKEY_ALLOW_WRITES
      _mount_flakey
    
      # We expect foo to have a size of 125Kb, the first 64Kb of data all
      # having the value 0xaa and the remaining 61Kb to be a hole (all bytes
      # with value 0x00).
      echo "File foo content after log replay:"
      od -t x1 $SCRATCH_MNT/foo
    
      # We expect bar to have a size of 253Kb and no extents (any byte read
      # from bar has the value 0x00).
      echo "File bar content after log replay:"
      od -t x1 $SCRATCH_MNT/bar
    
      status=0
      exit
    
    The expected file contents in the golden output are:
    
      File foo content after log replay:
      0000000 aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
      *
      0200000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      *
      0372000
      File bar content after log replay:
      0000000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      *
      0772000
    
    Without this fix, their contents are:
    
      File foo content after log replay:
      0000000 aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
      *
      0200000 bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb
      *
      0372000
      File bar content after log replay:
      0000000 ee ee ee ee ee ee ee ee ee ee ee ee ee ee ee ee
      *
      0200000 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
      *
      0372000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      *
      0772000
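The offsets printed by od in the outputs above are octal byte offsets. A minimal sketch for converting them to the sizes used in the test follows; the helper name `od_offset_to_kib` is hypothetical and not part of fstests.

```shell
#!/bin/bash
# Hypothetical helper: convert an od octal offset to a size in KiB.
od_offset_to_kib() {
    # A leading 0 makes shell arithmetic parse the value as octal.
    local bytes=$(($1))
    echo $((bytes / 1024))
}

for off in 0200000 0372000 0772000; do
    echo "$off = $((off)) bytes = $(od_offset_to_kib "$off")K"
done
```

This maps 0200000 to 64K (the end of the first write), 0372000 to 125K (the size of foo) and 0772000 to 253K (the size of bar).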
    
    A test case submission for fstests follows soon.
    
    Signed-off-by: Filipe Manana <fdmanana@suse.com>
    Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
    Signed-off-by: Chris Mason <clm@fb.com>