    mm: migrate: fix remove_migration_pte() for ksm pages · 4b0ece6f
    Naoya Horiguchi authored
    I found that calling page migration for ksm pages causes the following
    bug:
    
        page:ffffea0004d51180 count:2 mapcount:2 mapping:ffff88013c785141 index:0x913
        flags: 0x57ffffc0040068(uptodate|lru|active|swapbacked)
        raw: 0057ffffc0040068 ffff88013c785141 0000000000000913 0000000200000001
        raw: ffffea0004d5f9e0 ffffea0004d53f60 0000000000000000 ffff88007d81b800
        page dumped because: VM_BUG_ON_PAGE(!PageLocked(page))
        page->mem_cgroup:ffff88007d81b800
        ------------[ cut here ]------------
        kernel BUG at /src/linux-dev/mm/rmap.c:1086!
        invalid opcode: 0000 [#1] SMP
        Modules linked in: ppdev parport_pc virtio_balloon i2c_piix4 pcspkr parport i2c_core acpi_cpufreq ip_tables xfs libcrc32c ata_generic pata_acpi ata_piix 8139too libata virtio_blk 8139cp crc32c_intel mii virtio_pci virtio_ring serio_raw virtio floppy dm_mirror dm_region_hash dm_log dm_mod
        CPU: 0 PID: 3162 Comm: bash Not tainted 4.11.0-rc2-mm1+ #1
        Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
        RIP: 0010:do_page_add_anon_rmap+0x1ba/0x260
        RSP: 0018:ffffc90002473b30 EFLAGS: 00010282
        RAX: 0000000000000021 RBX: ffffea0004d51180 RCX: 0000000000000006
        RDX: 0000000000000000 RSI: 0000000000000082 RDI: ffff88007dc0dfe0
        RBP: ffffc90002473b58 R08: 00000000fffffffe R09: 00000000000001c1
        R10: 0000000000000005 R11: 00000000000001c0 R12: ffff880139ab3d80
        R13: 0000000000000000 R14: 0000700000000200 R15: 0000160000000000
        FS:  00007f5195f50740(0000) GS:ffff88007dc00000(0000) knlGS:0000000000000000
        CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        CR2: 00007fd450287000 CR3: 000000007a08e000 CR4: 00000000001406f0
        Call Trace:
         page_add_anon_rmap+0x18/0x20
         remove_migration_pte+0x220/0x2c0
         rmap_walk_ksm+0x143/0x220
         rmap_walk+0x55/0x60
         remove_migration_ptes+0x53/0x80
         migrate_pages+0x8ed/0xb60
         soft_offline_page+0x309/0x8d0
         store_soft_offline_page+0xaf/0xf0
         dev_attr_store+0x18/0x30
         sysfs_kf_write+0x3a/0x50
         kernfs_fop_write+0xff/0x180
         __vfs_write+0x37/0x160
         vfs_write+0xb2/0x1b0
         SyS_write+0x55/0xc0
         do_syscall_64+0x67/0x180
         entry_SYSCALL64_slow_path+0x25/0x25
        RIP: 0033:0x7f51956339e0
        RSP: 002b:00007ffcfa0dffc8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
        RAX: ffffffffffffffda RBX: 000000000000000c RCX: 00007f51956339e0
        RDX: 000000000000000c RSI: 00007f5195f53000 RDI: 0000000000000001
        RBP: 00007f5195f53000 R08: 000000000000000a R09: 00007f5195f50740
        R10: 000000000000000b R11: 0000000000000246 R12: 00007f5195907400
        R13: 000000000000000c R14: 0000000000000001 R15: 0000000000000000
        Code: fe ff ff 48 81 c2 00 02 00 00 48 89 55 d8 e8 2e c3 fd ff 48 8b 55 d8 e9 42 ff ff ff 48 c7 c6 e0 52 a1 81 48 89 df e8 46 ad fe ff <0f> 0b 48 83 e8 01 e9 7f fe ff ff 48 83 e8 01 e9 96 fe ff ff 48
        RIP: do_page_add_anon_rmap+0x1ba/0x260 RSP: ffffc90002473b30
        ---[ end trace a679d00f4af2df48 ]---
        Kernel panic - not syncing: Fatal exception
        Kernel Offset: disabled
        ---[ end Kernel panic - not syncing: Fatal exception
    
    The problem is in the following lines:
    
        new = page - pvmw.page->index +
            linear_page_index(vma, pvmw.address);
    
    'new' is calculated from 'page', which the caller passes in as the
    destination page, plus an offset adjustment for thp.  This does not
    work for ksm pages: pvmw.page->index is the same for every address
    that maps the ksm page, while linear_page_index() differs per
    address, so 'new' points to a different page for each address backed
    by the ksm page.  As a result, we try to set totally unrelated pages
    as destination pages, and that causes a kernel crash.
    
    This patch fixes the miscalculation so that ksm page migration works
    correctly.
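
    For illustration only, the arithmetic can be modelled in user space.
    The sketch below is not the kernel code and not necessarily the exact
    change made by this patch: struct toy_page, buggy_offset() and
    fixed_offset() are hypothetical names, and the "use the destination
    page as-is for ksm pages" guard is an assumption about one way to
    avoid the miscalculation.  It only shows that the offset added to the
    destination page is zero or a small subpage offset in the normal and
    thp cases, but becomes arbitrary once one ksm page is mapped at
    several unrelated linear indexes.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct page: only what the calculation reads. */
struct toy_page {
	unsigned long index;	/* page offset recorded in the page itself */
	bool ksm;		/* true if this models a ksm (merged) page */
};

/*
 * Offset, in pages, that the original expression adds to the destination
 * page:  linear_page_index(vma, address) - old->index.
 * The linear index is passed in directly instead of being derived from a vma.
 */
static long buggy_offset(const struct toy_page *old, unsigned long linear_index)
{
	assert(old != 0);
	return (long)linear_index - (long)old->index;
}

/*
 * Assumed guard: a ksm page is a single page shared by many mappings, so
 * the destination page is used unchanged.  Other pages keep the original
 * calculation, which yields the subpage offset for thp.
 */
static long fixed_offset(const struct toy_page *old, unsigned long linear_index)
{
	assert(old != 0);
	if (old->ksm)
		return 0;
	return buggy_offset(old, linear_index);
}
```

    With a toy ksm page whose index is 0x100, a mapping at linear index
    0x180 gives buggy_offset() == 0x80, i.e. a destination 128 pages away
    from the real one, while fixed_offset() gives 0.  For a non-ksm page
    with index 0x200 mapped at linear index 0x203, both return 3.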
    
    Fixes: 3fe87967 ("mm: convert remove_migration_pte() to use page_vma_mapped_walk()")
    Link: http://lkml.kernel.org/r/1489717683-29905-1-git-send-email-n-horiguchi@ah.jp.nec.com
    Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
    Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
    Cc: Michal Hocko <mhocko@kernel.org>
    Cc: Mel Gorman <mgorman@suse.de>
    Cc: Hillf Danton <hillf.zj@alibaba-inc.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>