    mm, compaction: drain pcps for zone when kcompactd fails · bc3106b2
    David Rientjes authored
    It's possible for free pages to become stranded on per-cpu pagesets
    (pcps) that, if drained, could be merged with buddy pages on the zone's
    free area to form large order pages, including up to MAX_ORDER.
    
    Consider a verbose example using the tools/vm/page-types tool at the
    beginning of a ZONE_NORMAL ('B' indicates a buddy page and 'S' indicates
    a slab page).  Pages on pcps do not have any page flags set.
    
      109954  1       _______S________________________________________________________
      109955  2       __________B_____________________________________________________
      109957  1       ________________________________________________________________
      109958  1       __________B_____________________________________________________
      109959  7       ________________________________________________________________
      109960  1       __________B_____________________________________________________
      109961  9       ________________________________________________________________
      10996a  1       __________B_____________________________________________________
      10996b  3       ________________________________________________________________
      10996e  1       __________B_____________________________________________________
      10996f  1       ________________________________________________________________
      ...
      109f8c  1       __________B_____________________________________________________
      109f8d  2       ________________________________________________________________
      109f8f  2       __________B_____________________________________________________
      109f91  f       ________________________________________________________________
      109fa0  1       __________B_____________________________________________________
      109fa1  7       ________________________________________________________________
      109fa8  1       __________B_____________________________________________________
      109fa9  1       ________________________________________________________________
      109faa  1       __________B_____________________________________________________
      109fab  1       _______S________________________________________________________
    
    The compaction migration scanner is attempting to defragment this memory
    since it is at the beginning of the zone.  It has done so quite well:
    all movable pages have been migrated.  From pfn [0x109955, 0x109fab),
    there are only buddy pages and pages without flags set.
    
    These flagless pages may be stranded on pcps; if they were freed back
    to the zone free area, this memory could be coalesced.  It is possible
    that some of these pages are not on pcps and that something has called
    alloc_pages() and used the memory directly, but we rely on the absence
    of __GFP_MOVABLE in these cases to allocate from MIGRATE_UNMOVABLE
    pageblocks to try to keep these MIGRATE_MOVABLE pageblocks as free as
    possible.
    
    These buddy and pcp pages, spanning 1,622 pages, could be coalesced and
    allow for three transparent hugepages to be dynamically allocated.
    Running the numbers for all such spans on the system, it was found that
    there were over 400 such spans of only buddy pages and pages without
    flags set at the time this /proc/kpageflags sample was collected.
    Without this support, there were _no_ order-9 or order-10 pages free.
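
A search for such spans can be approximated in a few lines.  The sketch
below assumes listing lines of the form "pfn len flags" with hexadecimal
pfn and len, as in the excerpt above, and treats a row as free when its
flag string is empty or contains only 'B'.  The helper names parse_rows
and free_spans are hypothetical and the flag strings in the sample are
shortened; this is not the script that produced the numbers quoted here.

```python
def parse_rows(text):
    """Yield (start_pfn, length, flags) from 'pfn len flags' lines (hex pfn and len)."""
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank lines and elisions such as '...'
        pfn, length, flags = parts
        yield int(pfn, 16), int(length, 16), flags.replace("_", "")


def free_spans(rows):
    """Merge adjacent rows that are buddy pages ('B') or have no flags set."""
    spans = []
    for pfn, length, flags in rows:
        if flags not in ("", "B"):
            continue  # e.g. a slab page ('S') ends the current span
        if spans and spans[-1][1] == pfn:
            spans[-1][1] = pfn + length        # extend the current span
        else:
            spans.append([pfn, pfn + length])  # start a new half-open span
    return [(start, end) for start, end in spans]


# Shortened flag strings; pfn and len columns copied from the excerpt.
sample = """\
109954  1       _______S____
109955  2       __________B_
109957  1       ____________
109958  1       __________B_
109959  7       ____________
109960  1       __________B_
"""

spans = free_spans(parse_rows(sample))
for start, end in spans:
    print(f"{start:#x}-{end:#x}: {end - start} pages")
```

Spans are kept half-open, [start, end), to match the pfn range notation
used in this changelog.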
    
    When kcompactd fails to defragment memory such that a cc.order page can
    be allocated, drain all pcps for the zone back to the buddy allocator so
    this stranding cannot occur.  Compaction for that order will
    subsequently be deferred, which acts as a ratelimit on this drain.
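
The effect of the drain can be illustrated with a toy model.  This is
only a sketch, not the kernel implementation: ToyZone, max_free_order
and after_failed_compaction are hypothetical names, pages are plain
integers, the default max_order of 10 is an assumption, and deferral is
not modelled.  It shows that free pages parked on pcps prevent a
naturally aligned block from forming until they are returned to the
free area.

```python
def max_free_order(free_pfns, max_order=10):
    """Return the largest order with a fully free, naturally aligned block, or -1."""
    free = set(free_pfns)
    best = -1
    for order in range(max_order + 1):
        size = 1 << order
        # Candidate block starts: each free pfn rounded down to the block size.
        starts = {pfn & ~(size - 1) for pfn in free}
        if any(all(p in free for p in range(s, s + size)) for s in starts):
            best = order
        else:
            break  # no free block at this order implies none at higher orders
    return best


class ToyZone:
    def __init__(self, buddy, pcp):
        self.buddy = set(buddy)  # pfns on the zone free area
        self.pcp = set(pcp)      # free pfns parked on per-cpu lists

    def drain_pcps(self):
        """Return every pcp page to the free area so it can coalesce."""
        self.buddy |= self.pcp
        self.pcp.clear()


def after_failed_compaction(zone, order):
    """If no order-`order` block is free, drain pcps and check again."""
    if max_free_order(zone.buddy) >= order:
        return True
    zone.drain_pcps()
    return max_free_order(zone.buddy) >= order


# 16 contiguous free pages: even pfns on the free area, odd pfns on pcps.
zone = ToyZone(buddy=range(0, 16, 2), pcp=range(1, 16, 2))
before = max_free_order(zone.buddy)
ok = after_failed_compaction(zone, order=4)
after = max_free_order(zone.buddy)
print(before, ok, after)
```

In the model, half of the pages in the block are free on the zone free
area, yet nothing above order 0 is available until the pcp pages are
drained.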
    
    Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1803010340100.88270@chino.kir.corp.google.com
    
    
    Signed-off-by: David Rientjes <rientjes@google.com>
    Acked-by: Vlastimil Babka <vbabka@suse.cz>
    Cc: Mel Gorman <mgorman@techsingularity.net>
    Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>