1. 30 Jul, 2019 1 commit
    • Filipe Manana's avatar
      Btrfs: fix incremental send failure after deduplication · b4f9a1a8
      Filipe Manana authored
      When doing an incremental send operation we can fail if we previously did
      deduplication operations against a file that exists in both snapshots. In
      that case we will fail the send operation with -EIO and print a message
      to dmesg/syslog like the following:
      
        BTRFS error (device sdc): Send: inconsistent snapshot, found updated \
        extent for inode 257 without updated inode item, send root is 258, \
        parent root is 257
      
      This requires that we deduplicate to the same file in both snapshots for
      the same amount of times on each snapshot. The issue happens because a
      deduplication only updates the iversion of an inode and does not update
      any other field of the inode, therefore if we deduplicate the file on
      each snapshot for the same amount of time, the inode will have the same
      iversion value (stored as the "sequence" field on the inode item) on both
      snapshots, therefore it will be seen as unchanged between in the send
      snapshot while there are new/updated/deleted extent items when comparing
      to the parent snapshot. This makes the send operation return -EIO and
      print an error message.
      
      Example reproducer:
      
        $ mkfs.btrfs -f /dev/sdb
        $ mount /dev/sdb /mnt
      
        # Create our first file. The first half of the file has several 64Kb
        # extents while the second half as a single 512Kb extent.
        $ xfs_io -f -s -c "pwrite -S 0xb8 -b 64K 0 512K" /mnt/foo
        $ xfs_io -c "pwrite -S 0xb8 512K 512K" /mnt/foo
      
        # Create the base snapshot and the parent send stream from it.
        $ btrfs subvolume snapshot -r /mnt /mnt/mysnap1
        $ btrfs send -f /tmp/1.snap /mnt/mysnap1
      
        # Create our second file, that has exactly the same data as the first
        # file.
        $ xfs_io -f -c "pwrite -S 0xb8 0 1M" /mnt/bar
      
        # Create the second snapshot, used for the incremental send, before
        # doing the file deduplication.
        $ btrfs subvolume snapshot -r /mnt /mnt/mysnap2
      
        # Now before creating the incremental send stream:
        #
        # 1) Deduplicate into a subrange of file foo in snapshot mysnap1. This
        #    will drop several extent items and add a new one, also updating
        #    the inode's iversion (sequence field in inode item) by 1, but not
        #    any other field of the inode;
        #
        # 2) Deduplicate into a different subrange of file foo in snapshot
        #    mysnap2. This will replace an extent item with a new one, also
        #    updating the inode's iversion by 1 but not any other field of the
        #    inode.
        #
        # After these two deduplication operations, the inode items, for file
        # foo, are identical in both snapshots, but we have different extent
        # items for this inode in both snapshots. We want to check this doesn't
        # cause send to fail with an error or produce an incorrect stream.
      
        $ xfs_io -r -c "dedupe /mnt/bar 0 0 512K" /mnt/mysnap1/foo
        $ xfs_io -r -c "dedupe /mnt/bar 512K 512K 512K" /mnt/mysnap2/foo
      
        # Create the incremental send stream.
        $ btrfs send -p /mnt/mysnap1 -f /tmp/2.snap /mnt/mysnap2
        ERROR: send ioctl failed with -5: Input/output error
      
      This issue started happening back in 2015 when deduplication was updated
      to not update the inode's ctime and mtime and update only the iversion.
      Back then we would hit a BUG_ON() in send, but later in 2016 send was
      updated to return -EIO and print the error message instead of doing the
      BUG_ON().
      
      A test case for fstests follows soon.
      
      Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=203933
      Fixes: 1c919a5e
      
       ("btrfs: don't update mtime/ctime on deduped inodes")
      CC: stable@vger.kernel.org # 4.4+
      Signed-off-by: default avatarFilipe Manana <fdmanana@suse.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      b4f9a1a8
  2. 26 Jul, 2019 1 commit
  3. 25 Jul, 2019 1 commit
  4. 17 Jul, 2019 4 commits
    • Johannes Thumshirn's avatar
      btrfs: don't leak extent_map in btrfs_get_io_geometry() · 373c3b80
      Johannes Thumshirn authored
      btrfs_get_io_geometry() calls btrfs_get_chunk_map() to acquire a reference
      on a extent_map, but on normal operation it does not drop this reference
      anymore.
      
      This leads to excessive kmemleak reports.
      
      Always call free_extent_map(), not just in the error case.
      
      Fixes: 5f141126
      
       ("btrfs: Introduce btrfs_io_geometry infrastructure")
      Reviewed-by: default avatarNikolay Borisov <nborisov@suse.com>
      Signed-off-by: default avatarJohannes Thumshirn <jthumshirn@suse.de>
      Reviewed-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      373c3b80
    • Johannes Thumshirn's avatar
      btrfs: free checksum hash on in close_ctree · bfcea1c6
      Johannes Thumshirn authored
      fs_info::csum_hash gets initialized in btrfs_init_csum_hash() which is
      called by open_ctree().
      
      But it only gets freed if open_ctree() fails, not on normal operation.
      
      This leads to a memory leak like the following found by kmemleak:
      unreferenced object 0xffff888132cb8720 (size 96):
      
        comm "mount", pid 450, jiffies 4294912436 (age 17.584s)
        hex dump (first 32 bytes):
          04 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
          00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        backtrace:
          [<000000000c9643d4>] crypto_create_tfm+0x2d/0xd0
          [<00000000ae577f68>] crypto_alloc_tfm+0x4b/0xb0
          [<000000002b5cdf30>] open_ctree+0xb84/0x2060 [btrfs]
          [<0000000043204297>] btrfs_mount_root+0x552/0x640 [btrfs]
          [<00000000c99b10ea>] legacy_get_tree+0x22/0x40
          [<0000000071a6495f>] vfs_get_tree+0x1f/0xc0
          [<00000000f180080e>] fc_mount+0x9/0x30
          [<000000009e36cebd>] vfs_kern_mount.part.11+0x6a/0x80
          [<0000000004594c05>] btrfs_mount+0x174/0x910 [btrfs]
          [<00000000c99b10ea>] legacy_get_tree+0x22/0x40
          [<0000000071a6495f>] vfs_get_tree+0x1f/0xc0
          [<00000000b86e92c5>] do_mount+0x6b0/0x940
          [<0000000097464494>] ksys_mount+0x7b/0xd0
          [<0000000057213c80>] __x64_sys_mount+0x1c/0x20
          [<00000000cb689b5e>] do_syscall_64+0x43/0x130
          [<000000002194e289>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Free fs_info::csum_hash in close_ctree() to avoid the memory leak.
      
      Fixes: 6d97c6e3
      
       ("btrfs: add boilerplate code for directly including the crypto framework")
      Reviewed-by: default avatarQu Wenruo <wqu@suse.com>
      Signed-off-by: default avatarJohannes Thumshirn <jthumshirn@suse.de>
      Reviewed-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      bfcea1c6
    • YueHaibing's avatar
      btrfs: Fix build error while LIBCRC32C is module · 314c4cd6
      YueHaibing authored
      
      
      If CONFIG_BTRFS_FS is y and CONFIG_LIBCRC32C is m,
      building fails:
      
        fs/btrfs/super.o: In function `btrfs_mount_root':
        super.c:(.text+0xb7f9): undefined reference to `crc32c_impl'
        fs/btrfs/super.o: In function `init_btrfs_fs':
        super.c:(.init.text+0x3465): undefined reference to `crc32c_impl'
        fs/btrfs/extent-tree.o: In function `hash_extent_data_ref':
        extent-tree.c:(.text+0xe60): undefined reference to `crc32c'
        extent-tree.c:(.text+0xe78): undefined reference to `crc32c'
        extent-tree.c:(.text+0xe8b): undefined reference to `crc32c'
        fs/btrfs/dir-item.o: In function `btrfs_insert_xattr_item':
        dir-item.c:(.text+0x291): undefined reference to `crc32c'
        fs/btrfs/dir-item.o: In function `btrfs_insert_dir_item':
        dir-item.c:(.text+0x429): undefined reference to `crc32c'
      
      Select LIBCRC32C to fix it.
      
      Reported-by: default avatarHulk Robot <hulkci@huawei.com>
      Fixes: d5178578
      
       ("btrfs: directly call into crypto framework for checksumming")
      Reviewed-by: default avatarJohannes Thumshirn <jthumshirn@suse.de>
      Signed-off-by: default avatarYueHaibing <yuehaibing@huawei.com>
      Reviewed-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      314c4cd6
    • Qu Wenruo's avatar
      btrfs: inode: Don't compress if NODATASUM or NODATACOW set · 42c16da6
      Qu Wenruo authored
      As btrfs(5) specified:
      
      	Note
      	If nodatacow or nodatasum are enabled, compression is disabled.
      
      If NODATASUM or NODATACOW set, we should not compress the extent.
      
      Normally NODATACOW is detected properly in run_delalloc_range() so
      compression won't happen for NODATACOW.
      
      However for NODATASUM we don't have any check, and it can cause
      compressed extent without csum pretty easily, just by:
        mkfs.btrfs -f $dev
        mount $dev $mnt -o nodatasum
        touch $mnt/foobar
        mount -o remount,datasum,compress $mnt
        xfs_io -f -c "pwrite 0 128K" $mnt/foobar
      
      And in fact, we have a bug report about corrupted compressed extent
      without proper data checksum so even RAID1 can't recover the corruption.
      (https://bugzilla.kernel.org/show_bug.cgi?id=199707
      
      )
      
      Running compression without proper checksum could cause more damage when
      corruption happens, as compressed data could make the whole extent
      unreadable, so there is no need to allow compression for
      NODATACSUM.
      
      The fix will refactor the inode compression check into two parts:
      
      - inode_can_compress()
        As the hard requirement, checked at btrfs_run_delalloc_range(), so no
        compression will happen for NODATASUM inode at all.
      
      - inode_need_compress()
        As the soft requirement, checked at btrfs_run_delalloc_range() and
        compress_file_range().
      
      Reported-by: default avatarJames Harvey <jamespharvey20@gmail.com>
      CC: stable@vger.kernel.org # 4.4+
      Signed-off-by: default avatarQu Wenruo <wqu@suse.com>
      Reviewed-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      42c16da6
  5. 05 Jul, 2019 1 commit
  6. 04 Jul, 2019 5 commits
  7. 02 Jul, 2019 27 commits