    mm/slab: fix the theoretical race by holding proper lock · 18726ca8
    Joonsoo Kim authored
    
    
Under concurrent allocation, SLAB can be heavily contended because it does
a lot of work while holding a lock.  This patchset tries to shrink the
critical sections in order to reduce lock contention.  The major changes
are a lockless decision to allocate more slabs and a lockless cpu cache
refill from the newly allocated slab.
    
Below are the results of the concurrent allocation/free slab benchmark
that Christoph wrote a long time ago.  I simplified the output.  The
numbers are cycle counts for alloc/free respectively, so lower is better.
    
      * Before
      Kmalloc N*alloc N*free(32): Average=365/806
      Kmalloc N*alloc N*free(64): Average=452/690
      Kmalloc N*alloc N*free(128): Average=736/886
      Kmalloc N*alloc N*free(256): Average=1167/985
      Kmalloc N*alloc N*free(512): Average=2088/1125
      Kmalloc N*alloc N*free(1024): Average=4115/1184
      Kmalloc N*alloc N*free(2048): Average=8451/1748
      Kmalloc N*alloc N*free(4096): Average=16024/2048
    
      * After
      Kmalloc N*alloc N*free(32): Average=344/792
      Kmalloc N*alloc N*free(64): Average=347/882
      Kmalloc N*alloc N*free(128): Average=390/959
      Kmalloc N*alloc N*free(256): Average=393/1067
      Kmalloc N*alloc N*free(512): Average=683/1229
      Kmalloc N*alloc N*free(1024): Average=1295/1325
      Kmalloc N*alloc N*free(2048): Average=2513/1664
      Kmalloc N*alloc N*free(4096): Average=4742/2172
    
The results show that performance improves greatly (roughly more than
50%) for object classes larger than 128 bytes.
    
    This patch (of 11):
    
If we hold neither the slab_mutex nor the node lock, the node's shared
array cache could be freed and re-populated.  If __kmem_cache_shrink()
is called at the same time, it will call drain_array() with n->shared
without holding the node lock, so a race can occur.  This patch fixes
the situation by taking the node lock before trying to drain the shared
array.
    
In addition, add a debug check to confirm that no race on n->shared
access remains.
    
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
    Cc: Jesper Dangaard Brouer <brouer@redhat.com>
    Cc: Christoph Lameter <cl@linux.com>
    Cc: Pekka Enberg <penberg@kernel.org>
    Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>