  • mm, thp: add new defer+madvise defrag option · 21440d7e
    David Rientjes authored
    There is no thp defrag option that currently allows MADV_HUGEPAGE
    regions to do direct compaction and reclaim while all other thp
    allocations simply trigger kswapd and kcompactd in the background and
    fail immediately.
    
    The "defer" setting simply triggers background reclaim and compaction
    for all regions, regardless of MADV_HUGEPAGE, which makes it unusable
    for our userspace, where MADV_HUGEPAGE is used to indicate that the
    application is willing to wait for the work needed to make thp memory
    available.
    
    The "madvise" setting will do direct compaction and reclaim for these
    MADV_HUGEPAGE regions, but does not trigger kswapd and kcompactd in the
    background for anybody else.
    
    For reasonable usage, there needs to be a blend of the two options.
    This patch introduces a fifth mode, "defer+madvise", that will do direct
    reclaim and compaction for MADV_HUGEPAGE regions and trigger background
    reclaim and compaction for everybody else so that hugepages may be
    available in the near future.
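
The per-mode behaviour described above can be summarised as a small decision table. The following Python sketch is illustrative only: it is not kernel code, the names `Action` and `thp_fault_action` are hypothetical, and the rows for "always" and "never" follow the kernel's transparent hugepage documentation rather than this changelog.

```python
from enum import Enum


class Action(Enum):
    """What a thp allocation does when no hugepage is immediately available."""
    DIRECT = "direct reclaim and compaction"
    BACKGROUND = "wake kswapd and kcompactd, then fail immediately"
    NONE = "fail immediately without background work"


def thp_fault_action(mode: str, madv_hugepage: bool) -> Action:
    """Model the defrag policy for one allocation (hypothetical helper).

    mode           -- one of the five defrag settings
    madv_hugepage  -- True if the region was marked with MADV_HUGEPAGE
    """
    if mode == "always":
        # Assumption from the kernel documentation: every region stalls.
        return Action.DIRECT
    if mode == "defer":
        # Background work for all regions, regardless of MADV_HUGEPAGE.
        return Action.BACKGROUND
    if mode == "defer+madvise":
        # The new mode: madvised regions stall, everybody else defers.
        return Action.DIRECT if madv_hugepage else Action.BACKGROUND
    if mode == "madvise":
        # Madvised regions stall, nobody else triggers background work.
        return Action.DIRECT if madv_hugepage else Action.NONE
    if mode == "never":
        # Assumption from the kernel documentation: no defrag effort at all.
        return Action.NONE
    raise ValueError(f"unknown defrag mode: {mode!r}")
```

Read this way, "defer+madvise" differs from "madvise" only for regions without MADV_HUGEPAGE, and from "defer" only for regions with it.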
    
    A proposal to allow direct reclaim and compaction for MADV_HUGEPAGE
    regions as part of the "defer" mode, which would make it a very
    powerful setting and avoid breaking userspace, was offered:
         http://marc.info/?t=148236612700003
    This additional mode is a compromise.
    
    A second proposal to allow both "defer" and "madvise" to be selected at
    the same time was also offered:
         http://marc.info/?t=148357345300001
    This is possible, but there was a concern that it might break existing
    userspaces that parse the output of the defrag mode, so the fifth
    option was introduced instead.
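
To illustrate the parsing concern: the defrag file under /sys/kernel/mm/transparent_hugepage/ lists every mode on one line and marks the selected one with square brackets. A minimal sketch of such a userspace parser follows; the function name `active_defrag_mode` and the sample string in the comment are assumptions for illustration, not taken from any real tool.

```python
def active_defrag_mode(text: str) -> str:
    """Return the bracketed (currently selected) mode from the sysfs text.

    Hypothetical helper: it expects exactly one token of the form
    "[mode]", for example in a line such as
    "always defer defer+madvise [madvise] never".
    """
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            # Strip the surrounding brackets to get the mode name.
            return token[1:-1]
    raise ValueError("no bracketed mode found")
```

A parser like this keeps working when a fifth token is added, because exactly one mode stays bracketed. Allowing "defer" and "madvise" to be bracketed at the same time would silently return only the first of them.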
    
    This patch also cleans up the helper function for storing to "enabled"
    and "defrag" since the former supports three modes while the latter
    supports five and triple_flag_store() was getting unnecessarily messy.
    
    Link: http://lkml.kernel.org/r/alpine.DEB.2.10.1701101614330.41805@chino.kir.corp.google.com
    
    
    Signed-off-by: David Rientjes <rientjes@google.com>
    Acked-by: Mel Gorman <mgorman@techsingularity.net>
    Cc: Vlastimil Babka <vbabka@suse.cz>
    Cc: Michal Hocko <mhocko@kernel.org>
    Cc: Jonathan Corbet <corbet@lwn.net>
    Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>