    Btrfs: send, flush delalloc in order to avoid data loss · 9f89d5de
    Filipe Manana authored
    When we set a subvolume to read-only mode we do not flush delalloc for any
    of its inodes (except if the filesystem is mounted with -o flushoncommit),
    since it does not affect correctness for any subsequent operations - except
    for a future send operation. The send operation will not be able to see the
    delalloc data since the respective file extent items, inode item updates,
    backreferences, etc, have not yet hit the subvolume and extent trees.
    
    Effectively this means data loss, since the send stream will not contain
    any data from existing delalloc. Another problem is that if writeback
    starts and finishes while the send operation is in progress, the
    subvolume tree is modified concurrently, which can result in send failing
    unexpectedly with EIO or hitting runtime errors, assertion failures,
    BUG_ONs, etc.
    
    Simple reproducer:
    
      $ mkfs.btrfs -f /dev/sdb
      $ mount /dev/sdb /mnt
    
      $ btrfs subvolume create /mnt/sv
      $ xfs_io -f -c "pwrite -S 0xea 0 108K" /mnt/sv/foo
    
      $ btrfs property set /mnt/sv ro true
      $ btrfs send -f /tmp/send.stream /mnt/sv
    
      $ od -t x1 -A d /mnt/sv/foo
      0000000 ea ea ea ea ea ea ea ea ea ea ea ea ea ea ea ea
      *
      0110592
    
      $ umount /mnt
      $ mkfs.btrfs -f /dev/sdc
      $ mount /dev/sdc /mnt
    
      $ btrfs receive -f /tmp/send.stream /mnt
      $ echo $?
      0
      $ od -t x1 -A d /mnt/sv/foo
      0000000
      # ---> empty file
    
    Since this is a problem that affects send only, fix it in send by flushing
    delalloc for all the roots used by the send operation before send starts
    to process the commit roots.
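
The ordering described above can be illustrated with a small user-space
toy model. This is only a sketch and not the kernel change itself: the
names toy_root, toy_flush_delalloc() and toy_send() are hypothetical and
do not exist in btrfs. The model assumes that send can only see data
already committed to the trees.

```c
#include <assert.h>
#include <stddef.h>

/* Toy user-space model; none of these names are kernel APIs. */
struct toy_root {
	size_t committed_bytes; /* data visible through the commit root */
	size_t delalloc_bytes;  /* buffered writes not yet in the trees */
};

/* Model of a flush: pending delalloc becomes visible in the trees. */
static void toy_flush_delalloc(struct toy_root *root)
{
	assert(root != NULL);
	root->committed_bytes += root->delalloc_bytes;
	root->delalloc_bytes = 0;
}

/*
 * Model of send: only committed data ends up in the stream.
 * If flush_first is non-zero, every root is flushed before any of
 * them is read. Returns the number of bytes placed in the stream.
 */
static size_t toy_send(struct toy_root **roots, size_t nr_roots,
		       int flush_first)
{
	size_t streamed = 0;
	size_t i;

	if (flush_first)
		for (i = 0; i < nr_roots; i++)
			toy_flush_delalloc(roots[i]);

	for (i = 0; i < nr_roots; i++)
		streamed += roots[i]->committed_bytes;

	return streamed;
}
```

With a single root holding 108K of delalloc, as in the reproducer, the
model streams nothing when called without the flush and streams all
110592 bytes when the flush runs first.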
    
    This problem has affected send since it was introduced (commit
    31db9f7c ("Btrfs: introduce BTRFS_IOC_SEND for btrfs send/receive"))
    but backporting this fix to older kernels has some dependencies:
    
    - For kernels between 3.19 and 4.20, it depends on commit 3cd24c69
      ("btrfs: use tagged writepage to mitigate livelock of snapshot") because
      the function btrfs_start_delalloc_snapshot() does not exist before that
      commit. So one has to either pick that commit or replace the calls to
      btrfs_start_delalloc_snapshot() in this patch with calls to
      btrfs_start_delalloc_inodes().
    
    - For kernels older than 3.19 it also requires commit e5fa8f86
      ("Btrfs: ensure send always works on roots without orphans") because
      it depends on the function ensure_commit_roots_uptodate() which that
      commit introduced.
    
    - No dependencies for 5.0+ kernels.
    
    A test case for fstests follows soon.
    
    CC: stable@vger.kernel.org # 3.19+
    Signed-off-by: Filipe Manana <fdmanana@suse.com>
    Signed-off-by: David Sterba <dsterba@suse.com>