    Btrfs: fix assertion failure during fsync in no-holes mode · 6399fb5a
    Filipe Manana authored
    When logging an inode in full mode that has an inline compressed extent
    that represents a range with a size matching the sector size (currently
    the same as the page size), has a trailing hole and the no-holes feature
    is enabled, we end up failing an assertion leading to a trace like the
    following:
    [141812.031528] assertion failed: len == i_size, file: fs/btrfs/tree-log.c, line: 4453
    [141812.033069] ------------[ cut here ]------------
    [141812.034330] kernel BUG at fs/btrfs/ctree.h:3452!
    [141812.035137] invalid opcode: 0000 [#1] PREEMPT SMP
    [141812.035932] Modules linked in: btrfs dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio dm_flakey dm_mod dax ppdev evdev ghash_clmulni_intel pcbc aesni_intel aes_x86_64 tpm_tis psmouse crypto_simd parport_pc sg pcspkr tpm_tis_core cryptd parport serio_raw glue_helper tpm i2c_piix4 i2c_core button sunrpc loop autofs4 ext4 crc16 jbd2 mbcache raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c crc32c_generic raid1 raid0 multipath linear md_mod sd_mod ata_generic virtio_scsi ata_piix floppy crc32c_intel libata scsi_mod virtio_pci virtio_ring e1000 virtio [last unloaded: btrfs]
    [141812.036790] CPU: 3 PID: 845 Comm: fdm-stress Tainted: G    B   W       4.12.3-btrfs-next-52+ #1
    [141812.036790] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.2-0-g5f4c7b1-prebuilt.qemu-project.org 04/01/2014
    [141812.036790] task: ffff8801e6694180 task.stack: ffffc90009004000
    [141812.036790] RIP: 0010:assfail.constprop.18+0x1c/0x1e [btrfs]
    [141812.036790] RSP: 0018:ffffc90009007bc0 EFLAGS: 00010282
    [141812.036790] RAX: 0000000000000046 RBX: ffff88017512c008 RCX: 0000000000000001
    [141812.036790] RDX: ffff88023fd95201 RSI: ffffffff8182264c RDI: 00000000ffffffff
    [141812.036790] RBP: ffffc90009007bc0 R08: 0000000000000001 R09: 0000000000000001
    [141812.036790] R10: 0000000000001000 R11: ffffffff82f5a0c9 R12: ffff88014e5947e8
    [141812.036790] R13: 00000000000b4000 R14: ffff8801b234d008 R15: 0000000000000000
    [141812.036790] FS:  00007fdba6ffd700(0000) GS:ffff88023fd80000(0000) knlGS:0000000000000000
    [141812.036790] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [141812.036790] CR2: 00007fdb9c000010 CR3: 000000016efa2000 CR4: 00000000001406e0
    [141812.036790] Call Trace:
    [141812.036790]  btrfs_log_inode+0x9f0/0xd3d [btrfs]
    [141812.036790]  ? __mutex_lock+0x120/0x3ce
    [141812.036790]  btrfs_log_inode_parent+0x224/0x685 [btrfs]
    [141812.036790]  ? lock_acquire+0x16b/0x1af
    [141812.036790]  btrfs_log_dentry_safe+0x60/0x7b [btrfs]
    [141812.036790]  btrfs_sync_file+0x32e/0x3f8 [btrfs]
    [141812.036790]  vfs_fsync_range+0x8a/0x9d
    [141812.036790]  vfs_fsync+0x1c/0x1e
    [141812.036790]  do_fsync+0x31/0x4a
    [141812.036790]  SyS_fdatasync+0x13/0x17
    [141812.036790]  entry_SYSCALL_64_fastpath+0x18/0xad
    [141812.036790] RIP: 0033:0x7fdbac41a47d
    [141812.036790] RSP: 002b:00007fdba6ffce30 EFLAGS: 00000293 ORIG_RAX: 000000000000004b
    [141812.036790] RAX: ffffffffffffffda RBX: ffffffff81092c9f RCX: 00007fdbac41a47d
    [141812.036790] RDX: 0000004cf0160a40 RSI: 0000000000000000 RDI: 0000000000000006
    [141812.036790] RBP: ffffc90009007f98 R08: 0000000000000000 R09: 0000000000000010
    [141812.036790] R10: 00000000000002e8 R11: 0000000000000293 R12: ffffffff8110cd90
    [141812.036790] R13: ffffc90009007f78 R14: 0000000000000000 R15: 0000000000000000
    [141812.036790]  ? time_hardirqs_off+0x9/0x14
    [141812.036790]  ? trace_hardirqs_off_caller+0x1f/0xa3
    [141812.036790] Code: c7 d6 61 6b a0 48 89 e5 e8 ba ef a8 e0 0f 0b 55 89 f1 48 c7 c2 6d 65 6b a0 48 89 fe 48 c7 c7 81 65 6b a0 48 89 e5 e8 9c ef a8 e0 <0f> 0b 0f 1f 44 00 00 55 48 89 e5 41 57 41 56 41 55 41 54 49 89
    [141812.036790] RIP: assfail.constprop.18+0x1c/0x1e [btrfs] RSP: ffffc90009007bc0
    [141812.084448] ---[ end trace 44e472684c7a32cc ]---
    This happens because the code that logs a trailing hole when the no-holes
    feature is enabled did not consider that a compressed inline extent can
    represent a range with a size matching the sector size. In that case,
    expanding the inode's i_size through a truncate operation does not pad
    with zeroes the page that represents the inline extent, and therefore
    the inline extent remains after the truncation.
    Fix this by adapting the assertion to accept inline extents representing
    data with a sector size length if, and only if, the inline extents are
    compressed.
    A sample and trivial reproducer (for systems with a 4K page size) for this
    issue:
      mkfs.btrfs -O no-holes -f /dev/sdc
      mount -o compress /dev/sdc /mnt
      xfs_io -f -c "pwrite -S 0xab 0 4K" /mnt/foobar
      xfs_io -c "truncate 32K" /mnt/foobar
      xfs_io -c "fsync" /mnt/foobar
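    The condition described above can be modeled outside the kernel. The
    sketch below is not the code in fs/btrfs/tree-log.c; it is a minimal
    Python model of the old and the adapted check, assuming a 4K sector
    size. The function names and parameters are hypothetical and exist
    only for illustration.

```python
SECTOR_SIZE = 4096  # assumption: 4K sectors, as in the reproducer


def old_check(inline_len, i_size):
    """Model of the original assertion: len == i_size."""
    return inline_len == i_size


def adapted_check(inline_len, i_size, compressed, sector_size=SECTOR_SIZE):
    """Model of the adapted assertion (hypothetical helper).

    Accepts an inline extent whose length differs from i_size if, and
    only if, it covers exactly one sector and is compressed.
    """
    return inline_len == i_size or (inline_len == sector_size and compressed)


# State reached by the reproducer: a 4K compressed inline extent in a file
# whose i_size was expanded to 32K by truncate.
inline_len, i_size = 4096, 32 * 1024
print(old_check(inline_len, i_size))                       # old assertion fails
print(adapted_check(inline_len, i_size, compressed=True))  # adapted one passes
```

    In this model an uncompressed inline extent of the same length is
    still rejected, which matches the "if, and only if" wording of the fix.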
    Signed-off-by: Filipe Manana <fdmanana@suse.com>
    Signed-off-by: Chris Mason <clm@fb.com>
    Signed-off-by: David Sterba <dsterba@suse.com>