    slab: remove synchronous rcu_barrier() call in memcg cache release path · 657dc2f9
    Tejun Heo authored
    With kmem cgroup support enabled, kmem_caches can be created and
    destroyed frequently and a great number of near empty kmem_caches can
    accumulate if there are a lot of transient cgroups and the system is not
    under memory pressure.  When memory reclaim starts under such
    conditions, it can lead to consecutive deactivation and destruction of
    many kmem_caches, easily hundreds of thousands on moderately large
    systems, exposing scalability issues in the current slab management
    code.  This is one of the patches to address the issue.
    
    SLAB_DESTROY_BY_RCU caches need to flush all RCU operations before
    destruction because slab pages are freed through RCU and the deferred
    free needs to be able to dereference the associated kmem_cache.
    Currently, this is done synchronously with rcu_barrier().  As
    rcu_barrier() is expensive time-wise, slab implements a batching
    mechanism so that rcu_barrier() can be done for multiple caches at the
    same time.
    
    Unfortunately, the rcu_barrier() sits in a synchronous path that is
    called while holding cgroup_mutex, and the batching is too limited to
    actually be helpful.
    
    This patch updates the cache release path so that the batching is
    asynchronous and global.  All SLAB_DESTROY_BY_RCU caches are queued
    on a global list and a work item consumes the list.  The work item
    calls rcu_barrier() only once for all caches that are currently queued.
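
The queue-then-drain scheme described above can be sketched in userspace C.
This is only an illustration under stated assumptions, not the code from the
patch: `struct cache`, `fake_rcu_barrier()`, `shutdown_cache_sketch()`,
`deferred_release_workfn()` and `new_cache()` are hypothetical names, the
barrier is a stub that merely counts calls, and locking and real work item
scheduling are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct kmem_cache; not the kernel definition. */
struct cache {
	bool destroy_by_rcu;	/* models the SLAB_DESTROY_BY_RCU flag */
	struct cache *next;	/* link on the global deferred-release list */
};

static struct cache *deferred_head;	/* caches waiting for the barrier */
static int barrier_calls;		/* times the stub barrier has run */
static int released;			/* caches released so far */

/* Stub standing in for rcu_barrier(); it only counts invocations. */
static void fake_rcu_barrier(void)
{
	barrier_calls++;
}

static void release_cache(struct cache *c)
{
	free(c);
	released++;
}

/* Either release right away or queue the cache for the deferred pass. */
static void shutdown_cache_sketch(struct cache *c)
{
	if (c->destroy_by_rcu) {
		c->next = deferred_head;
		deferred_head = c;
		/* Assumption: real code would schedule the work item here. */
	} else {
		release_cache(c);
	}
}

/* Work function: detach the whole list, wait once, then release all. */
static void deferred_release_workfn(void)
{
	struct cache *list = deferred_head;

	deferred_head = NULL;
	if (!list)
		return;

	fake_rcu_barrier();	/* one barrier covers every queued cache */

	while (list) {
		struct cache *next = list->next;

		release_cache(list);
		list = next;
	}
}

/* Helper for the sketch: allocate a cache with the given flag. */
static struct cache *new_cache(bool destroy_by_rcu)
{
	struct cache *c = calloc(1, sizeof(*c));

	assert(c != NULL);
	c->destroy_by_rcu = destroy_by_rcu;
	return c;
}
```

Queuing any number of flagged caches with `shutdown_cache_sketch()` and then
running `deferred_release_workfn()` once invokes the stub barrier a single
time, which is the cost amortization the paragraph above describes.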
    
    * release_caches() is removed and shutdown_cache() now either directly
      releases the cache or schedules an RCU callback to do that.  This
      makes the cache inaccessible once shutdown_cache() is called and
      makes it impossible for shutdown_memcg_caches() to do memcg-specific
      cleanups afterwards.  Move the memcg-specific part into a helper,
      unlink_memcg_cache(), and make shutdown_cache() call it directly.
    
    Link: http://lkml.kernel.org/r/20170117235411.9408-4-tj@kernel.org
    
    Signed-off-by: Tejun Heo <tj@kernel.org>
    Reported-by: Jay Vana <jsvana@fb.com>
    Acked-by: Vladimir Davydov <vdavydov@tarantool.org>
    Cc: Christoph Lameter <cl@linux.com>
    Cc: Pekka Enberg <penberg@kernel.org>
    Cc: David Rientjes <rientjes@google.com>
    Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>