    btrfs: qgroup: Fix reserved data space leak if we have multiple reserve calls · d4e20494
    Qu Wenruo authored
    [BUG]
    The following script can cause btrfs qgroup data space leak:
    
      mkfs.btrfs -f $dev
      mount $dev -o nospace_cache $mnt
    
      btrfs subv create $mnt/subv
      btrfs quota en $mnt
      btrfs quota rescan -w $mnt
      btrfs qgroup limit 128m $mnt/subv
    
      for (( i = 0; i < 3; i++)); do
              # Create 3 64M holes for later fallocate to fail
              truncate -s 192m $mnt/subv/file
              xfs_io -c "pwrite 64m 4k" $mnt/subv/file > /dev/null
              xfs_io -c "pwrite 128m 4k" $mnt/subv/file > /dev/null
              sync
    
              # it's supposed to fail, and each failure will leak at least 64M
              # data space
              xfs_io -f -c "falloc 0 192m" $mnt/subv/file &> /dev/null
              rm $mnt/subv/file
              sync
      done
    
      # Shouldn't fail after we removed the file
      xfs_io -f -c "falloc 0 64m" $mnt/subv/file
    
    [CAUSE]
    Btrfs qgroup data reserve code allows multiple reservations to happen on
    a single extent_changeset, e.g.:
    	btrfs_qgroup_reserve_data(inode, &data_reserved, 0, SZ_1M);
    	btrfs_qgroup_reserve_data(inode, &data_reserved, SZ_1M, SZ_2M);
    	btrfs_qgroup_reserve_data(inode, &data_reserved, 0, SZ_4M);
    
    Btrfs qgroup code has its internal tracking to make sure we don't
    double-reserve in the above example.
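
    As a rough illustration of that tracking, here is a toy user-space model
    (not the kernel implementation). ToyReserveTracker, BLOCK and the set-based
    bookkeeping are assumptions made for this sketch; the real code tracks
    ranges in the inode's io_tree. It replays the three calls above and charges
    only the bytes that were not already reserved.

```python
SZ_1M = 1 << 20
BLOCK = 4096  # hypothetical tracking granularity for this toy model


class ToyReserveTracker:
    """Toy stand-in for per-inode tracking of already reserved ranges."""

    def __init__(self):
        self.reserved_blocks = set()  # indexes of blocks already reserved

    def reserve(self, start, length):
        """Mark [start, start + length) reserved; return newly charged bytes."""
        first = start // BLOCK
        last = (start + length + BLOCK - 1) // BLOCK
        wanted = set(range(first, last))
        new_blocks = wanted - self.reserved_blocks  # skip what is already held
        self.reserved_blocks |= new_blocks
        return len(new_blocks) * BLOCK


tracker = ToyReserveTracker()
charged = [
    tracker.reserve(0, SZ_1M),          # [0, 1M): all new
    tracker.reserve(SZ_1M, 2 * SZ_1M),  # [1M, 3M): all new
    tracker.reserve(0, 4 * SZ_1M),      # [0, 4M): only [3M, 4M) is new
]
total = sum(charged)
print(charged, total)
```

    In this model the three calls charge 1M, 2M and 1M, so 4M in total rather
    than the 7M a naive sum of the lengths would give.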
    
    The only pattern utilizing this feature is in the main while loop of
    btrfs_fallocate() function.
    
    However btrfs_qgroup_reserve_data()'s error handling has a bug in that
    on error it clears all ranges in the io_tree with EXTENT_QGROUP_RESERVED
    flag but doesn't free previously reserved bytes.
    
    This bug has a twofold effect:
    - Clearing EXTENT_QGROUP_RESERVED ranges
      This is the correct behavior, but it prevents
      btrfs_qgroup_check_reserved_leak() from catching the leakage, as the
      detector is purely EXTENT_QGROUP_RESERVED flag based.
    
    - Leaking the previously reserved data bytes.
    
    The bug manifests when N calls to btrfs_qgroup_reserve_data are made and
    the last one fails, leaking space reserved in the previous ones.
    
    [FIX]
    Also free previously reserved data bytes when btrfs_qgroup_reserve_data
    fails.
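
    The leak and the fix can be sketched with a toy user-space model. This is
    only an illustration under stated assumptions, not the kernel code:
    ToyQgroup, the 1M slot size, the free_on_error switch and the helper names
    are all hypothetical. Three reservations share one changeset, the third
    exceeds a 128M limit, and the error path either only clears the flags (old
    behavior) or also frees the bytes charged by the earlier calls (fixed
    behavior).

```python
SZ_1M = 1 << 20


class ToyQgroup:
    """Toy model: a byte counter with a limit, plus flagged 1M slots."""

    def __init__(self, limit):
        self.limit = limit
        self.reserved = 0     # bytes currently charged to the qgroup
        self.flagged = set()  # slots carrying the "reserved" flag

    def reserve_data(self, changeset, start_mb, len_mb, free_on_error):
        """Reserve [start_mb, start_mb + len_mb) slots on a shared changeset."""
        previously_charged = len(changeset) * SZ_1M  # from earlier calls
        new = set(range(start_mb, start_mb + len_mb)) - self.flagged
        self.flagged |= new
        changeset |= new
        to_charge = len(new) * SZ_1M
        if self.reserved + to_charge > self.limit:
            # Error path: clear the flag on every range in the changeset,
            # including ranges charged by earlier successful calls.
            self.flagged -= changeset
            changeset.clear()
            if free_on_error:
                # Modeled fix: also give back the earlier charged bytes.
                self.reserved -= previously_charged
            return False
        self.reserved += to_charge
        return True


def state_after_failed_fallocate(free_on_error):
    """Run three 64M reservations; return (charged bytes, flagged slots)."""
    qgroup = ToyQgroup(limit=128 * SZ_1M)
    changeset = set()
    for start_mb in (0, 64, 128):
        if not qgroup.reserve_data(changeset, start_mb, 64, free_on_error):
            break
    return qgroup.reserved, len(qgroup.flagged)


buggy = state_after_failed_fallocate(free_on_error=False)
fixed = state_after_failed_fallocate(free_on_error=True)
print(buggy, fixed)
```

    In the old-behavior run the counter stays at 128M while no slot is flagged
    any more, which is why a purely flag-based leak detector sees nothing. In
    the fixed run the counter returns to 0.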
    
    Fixes: 52472553 ("btrfs: qgroup: Introduce btrfs_qgroup_reserve_data function")
    CC: stable@vger.kernel.org # 4.4+
    Signed-off-by: Qu Wenruo <wqu@suse.com>
    Signed-off-by: David Sterba <dsterba@suse.com>