  • Btrfs: fix incremental send failure after deduplication · b4f9a1a8
    Filipe Manana authored
    When doing an incremental send operation we can fail if we previously did
    deduplication operations against a file that exists in both snapshots. In
    that case we will fail the send operation with -EIO and print a message
    to dmesg/syslog like the following:
    
      BTRFS error (device sdc): Send: inconsistent snapshot, found updated \
      extent for inode 257 without updated inode item, send root is 258, \
      parent root is 257
    
    This requires that we deduplicate into the same file in both snapshots
    the same number of times on each snapshot. The issue happens because a
    deduplication only updates the iversion of an inode and does not update
    any other field of the inode. Therefore, if we deduplicate the file on
    each snapshot the same number of times, the inode will have the same
    iversion value (stored as the "sequence" field on the inode item) on both
    snapshots, so it will be seen as unchanged in the send snapshot while
    there are new/updated/deleted extent items when comparing to the parent
    snapshot. This makes the send operation return -EIO and print an error
    message.
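
    The effect on the inode item can be illustrated with a small Python
    model. This is only a sketch of the behaviour described above, not
    kernel code: the InodeItem class, its reduced field list and the
    dedupe() helper are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class InodeItem:
    """Hypothetical, reduced stand-in for the on-disk inode item."""
    size: int
    mtime: int
    ctime: int
    sequence: int  # the field where the inode's iversion is stored


def dedupe(item: InodeItem) -> InodeItem:
    """Model a deduplication: bump the iversion, touch nothing else."""
    return replace(item, sequence=item.sequence + 1)


# State of the inode when the snapshots were taken (arbitrary values).
base = InodeItem(size=1 << 20, mtime=100, ctime=100, sequence=10)

# One deduplication in each snapshot: the inode items end up identical,
# even though the extent items of the file now differ.
parent_item = dedupe(base)
send_item = dedupe(base)
print("same count, items equal:", parent_item == send_item)

# A different number of deduplications makes the inode items differ,
# which is why the failure needs the same count on both snapshots.
print("different count, items equal:", parent_item == dedupe(send_item))
```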
    
    Example reproducer:
    
      $ mkfs.btrfs -f /dev/sdb
      $ mount /dev/sdb /mnt
    
      # Create our first file. The first half of the file has several 64KiB
      # extents while the second half has a single 512KiB extent.
      $ xfs_io -f -s -c "pwrite -S 0xb8 -b 64K 0 512K" /mnt/foo
      $ xfs_io -c "pwrite -S 0xb8 512K 512K" /mnt/foo
    
      # Create the base snapshot and the parent send stream from it.
      $ btrfs subvolume snapshot -r /mnt /mnt/mysnap1
      $ btrfs send -f /tmp/1.snap /mnt/mysnap1
    
      # Create our second file, that has exactly the same data as the first
      # file.
      $ xfs_io -f -c "pwrite -S 0xb8 0 1M" /mnt/bar
    
      # Create the second snapshot, used for the incremental send, before
      # doing the file deduplication.
      $ btrfs subvolume snapshot -r /mnt /mnt/mysnap2
    
      # Now before creating the incremental send stream:
      #
      # 1) Deduplicate into a subrange of file foo in snapshot mysnap1. This
      #    will drop several extent items and add a new one, also updating
      #    the inode's iversion (sequence field in inode item) by 1, but not
      #    any other field of the inode;
      #
      # 2) Deduplicate into a different subrange of file foo in snapshot
      #    mysnap2. This will replace an extent item with a new one, also
      #    updating the inode's iversion by 1 but not any other field of the
      #    inode.
      #
      # After these two deduplication operations, the inode items, for file
      # foo, are identical in both snapshots, but we have different extent
      # items for this inode in both snapshots. We want to check this doesn't
      # cause send to fail with an error or produce an incorrect stream.
    
      $ xfs_io -r -c "dedupe /mnt/bar 0 0 512K" /mnt/mysnap1/foo
      $ xfs_io -r -c "dedupe /mnt/bar 512K 512K 512K" /mnt/mysnap2/foo
    
      # Create the incremental send stream.
      $ btrfs send -p /mnt/mysnap1 -f /tmp/2.snap /mnt/mysnap2
      ERROR: send ioctl failed with -5: Input/output error
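
    The state reached by the reproducer can be modelled in a few lines of
    Python. This is a toy sketch under stated assumptions, not the actual
    send implementation: items are kept in a plain dict keyed by
    (inode, type, offset), file bar is assumed to be a single 1MiB extent,
    the sequence value is arbitrary, and all function names are
    hypothetical.

```python
import errno

K = 1024
FOO = 257  # inode number of file foo, as in the error message


def original_extents():
    """Eight 64KiB extents followed by one 512KiB extent."""
    ext = {i * 64 * K: (f"foo-ext{i}", 0, 64 * K) for i in range(8)}
    ext[512 * K] = ("foo-ext8", 0, 512 * K)
    return ext


def foo_items(sequence, extents):
    """Build the toy item map for file foo in one snapshot."""
    items = {(FOO, "INODE_ITEM", 0): {"sequence": sequence}}
    for file_off, extent in extents.items():
        items[(FOO, "EXTENT_DATA", file_off)] = extent
    return items


def incremental_send(parent, send):
    """Return the changed keys, or fail like the check described above."""
    keys = parent.keys() | send.keys()
    changed = {k for k in keys if parent.get(k) != send.get(k)}
    for ino, kind, _ in sorted(changed):
        if kind == "EXTENT_DATA" and (ino, "INODE_ITEM", 0) not in changed:
            raise OSError(errno.EIO, f"found updated extent for inode {ino} "
                                     "without updated inode item")
    return changed


# mysnap1: dedupe range [0, 512K) of foo against bar (assumed one extent).
snap1_extents = original_extents()
for i in range(8):
    del snap1_extents[i * 64 * K]
snap1_extents[0] = ("bar-ext", 0, 512 * K)

# mysnap2: dedupe range [512K, 1M) of foo against bar.
snap2_extents = original_extents()
snap2_extents[512 * K] = ("bar-ext", 512 * K, 512 * K)

# One deduplication per snapshot: same sequence value in both.
mysnap1 = foo_items(11, snap1_extents)
mysnap2 = foo_items(11, snap2_extents)

try:
    incremental_send(mysnap1, mysnap2)
except OSError as err:
    print("send failed:", err)
```

    In this model, giving mysnap2 a different sequence value makes the
    inode item show up as changed and the check no longer fails.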
    
    This issue started happening back in 2015 when deduplication was updated
    to not update the inode's ctime and mtime and update only the iversion.
    Back then we would hit a BUG_ON() in send, but later in 2016 send was
    updated to return -EIO and print the error message instead of doing the
    BUG_ON().
    
    A test case for fstests follows soon.
    
    Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=203933
    Fixes: 1c919a5e ("btrfs: don't update mtime/ctime on deduped inodes")
    CC: stable@vger.kernel.org # 4.4+
    Signed-off-by: Filipe Manana <fdmanana@suse.com>
    Signed-off-by: David Sterba <dsterba@suse.com>