Commit 010b495e authored by Sergey Senozhatsky, committed by Linus Torvalds

zsmalloc: introduce zs_huge_class_size()

Patch series "zsmalloc/zram: drop zram's max_zpage_size", v3.

ZRAM's max_zpage_size is a bad thing.  It forces zsmalloc to store
normal objects as huge ones, which results in bigger zsmalloc memory
usage.  Drop it and use the actual zsmalloc huge-class value when deciding
whether the object is huge or not.

This patch (of 2):

Not every object can share its zspage with other objects, e.g. when
the object is as big as a zspage or nearly as big as a zspage.  For such
objects zsmalloc has a so-called huge class: every object which belongs
to the huge class consumes an entire zspage (which consists of a single
physical page).  On an x86_64 box with PAGE_SHIFT 12, the first non-huge
class size is 3264, so from size 3264 downwards, objects can share
page(-s) and thus minimize memory wastage.
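
The 3264 figure can be reproduced with a small userspace sketch.  It is
not part of this patch: the constants (4096-byte pages, class sizes 16
bytes apart starting at 32, at most 4 pages per zspage) and the
utilisation heuristic in pages_per_zspage() are assumptions about how
zsmalloc lays out its classes on such a box.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed values for x86_64 with PAGE_SHIFT 12; not taken from the patch. */
#define PAGE_SIZE               4096UL
#define ZS_MAX_PAGES_PER_ZSPAGE 4
#define ZS_MIN_ALLOC_SIZE       32UL
#define ZS_SIZE_CLASS_DELTA     16UL
#define ZS_MAX_ALLOC_SIZE       PAGE_SIZE

/* Pick the page count (1..4) that gives the best zspage utilisation. */
int pages_per_zspage(size_t class_size)
{
	int best_order = 1;
	unsigned long best_usedpc = 0;

	assert(class_size > 0 && class_size <= ZS_MAX_ALLOC_SIZE);
	for (int i = 1; i <= ZS_MAX_PAGES_PER_ZSPAGE; i++) {
		unsigned long zspage_size = i * PAGE_SIZE;
		unsigned long waste = zspage_size % class_size;
		unsigned long usedpc = (zspage_size - waste) * 100 / zspage_size;

		/* Strictly greater: ties keep the smaller page count. */
		if (usedpc > best_usedpc) {
			best_usedpc = usedpc;
			best_order = i;
		}
	}
	return best_order;
}

/* Walk classes from largest to smallest; return the first non-huge size. */
size_t first_non_huge_class_size(void)
{
	for (size_t size = ZS_MAX_ALLOC_SIZE; size >= ZS_MIN_ALLOC_SIZE;
	     size -= ZS_SIZE_CLASS_DELTA) {
		int pages = pages_per_zspage(size);
		size_t objs = pages * PAGE_SIZE / size;

		if (pages != 1 && objs != 1)
			return size;
	}
	return 0; /* no non-huge class found */
}
```

Under these assumptions every class from 4096 down to 3280 fits only one
object per page, while the 3264 class packs 5 objects into a 4-page
zspage.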

ZRAM, however, has its own statically defined watermark for huge
objects, namely "3 * PAGE_SIZE / 4 = 3072", and forcibly stores every
object larger than this watermark (3072) as a PAGE_SIZE object, in other
words, to a huge class, while zsmalloc can keep some of those objects in
non-huge classes.  This results in increased memory consumption.
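
A rough illustration of the cost, as a sketch rather than a measurement:
the helper names are hypothetical, and the geometry of the 3264-byte
class (5 objects per 4-page zspage) is an assumption derived from the
class layout on x86_64 with 4096-byte pages.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SIZE 4096UL

/* Assumed geometry of the 3264-byte class: 5 objects per 4-page zspage. */
#define OBJS_PER_ZSPAGE  5UL
#define PAGES_PER_ZSPAGE 4UL

/* ZRAM's static watermark as quoted in the commit message. */
size_t zram_watermark(void)
{
	return 3 * PAGE_SIZE / 4;
}

/* Pages consumed when every object occupies a whole page (huge class). */
size_t pages_if_huge(size_t nr_objects)
{
	return nr_objects;
}

/* Pages consumed when objects share zspages in the assumed class. */
size_t pages_if_shared(size_t nr_objects)
{
	size_t zspages;

	assert(nr_objects <= (size_t)-1 - OBJS_PER_ZSPAGE); /* no overflow */
	zspages = (nr_objects + OBJS_PER_ZSPAGE - 1) / OBJS_PER_ZSPAGE;
	return zspages * PAGES_PER_ZSPAGE;
}
```

With these numbers, 1000 objects that are just above the 3072 watermark
but still fit the 3264 class would take 1000 pages as huge objects and
800 pages when they share zspages.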

zsmalloc knows better whether an object is huge or not.  Introduce the
zs_huge_class_size() function, which returns the size of the first huge
class, so a caller can tell whether a given object can be stored in one
of the non-huge classes.  This will let us drop ZRAM's huge object
watermark and rely fully on zsmalloc when we decide whether an object is
huge.
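
A minimal sketch of how a caller such as ZRAM might use the new function.
This is a userspace mock, not code from this series: stored_len() is a
hypothetical helper, and the zs_huge_class_size() body below is a stub
that returns an assumed value (3257) instead of consulting a real pool.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SIZE 4096UL

struct zs_pool; /* opaque here; a real pool comes from zs_create_pool() */

/* Userspace stand-in for the kernel function; 3257 is an assumed value. */
size_t zs_huge_class_size(struct zs_pool *pool)
{
	(void)pool;
	return 3257;
}

/* Hypothetical helper: length to store for a compressed page. */
size_t stored_len(struct zs_pool *pool, size_t comp_len)
{
	assert(comp_len > 0 && comp_len <= PAGE_SIZE);
	/* A huge object occupies a full page either way. */
	if (comp_len >= zs_huge_class_size(pool))
		return PAGE_SIZE;
	return comp_len;
}
```

The point of the design is that the caller no longer hard-codes a
threshold; it asks the allocator once and compares against the answer.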

[ add pool param to zs_huge_class_size()]

Signed-off-by: Sergey Senozhatsky <>
Acked-by: Minchan Kim <>
Cc: Mike Rapoport <>
Signed-off-by: Andrew Morton <>
Signed-off-by: Linus Torvalds <>
parent cb9f753a
@@ -47,6 +47,8 @@ void zs_destroy_pool(struct zs_pool *pool);
unsigned long zs_malloc(struct zs_pool *pool, size_t size, gfp_t flags);
void zs_free(struct zs_pool *pool, unsigned long obj);
size_t zs_huge_class_size(struct zs_pool *pool);
void *zs_map_object(struct zs_pool *pool, unsigned long handle,
enum zs_mapmode mm);
void zs_unmap_object(struct zs_pool *pool, unsigned long handle);
@@ -193,6 +193,7 @@ static struct vfsmount *zsmalloc_mnt;
 * (see: fix_fullness_group())
 */
static const int fullness_threshold_frac = 4;
static size_t huge_class_size;
struct size_class {
spinlock_t lock;
@@ -1407,6 +1408,25 @@ void zs_unmap_object(struct zs_pool *pool, unsigned long handle)
/**
 * zs_huge_class_size() - Returns the size (in bytes) of the first huge
 *                        zsmalloc &size_class.
 * @pool: zsmalloc pool to use
 *
 * The function returns the size of the first huge class - any object of equal
 * or bigger size will be stored in zspage consisting of a single physical
 * page.
 *
 * Context: Any context.
 *
 * Return: the size (in bytes) of the first huge zsmalloc &size_class.
 */
size_t zs_huge_class_size(struct zs_pool *pool)
{
	return huge_class_size;
}
static unsigned long obj_malloc(struct size_class *class,
struct zspage *zspage, unsigned long handle)
@@ -2363,6 +2383,27 @@ struct zs_pool *zs_create_pool(const char *name)
		pages_per_zspage = get_pages_per_zspage(size);
		objs_per_zspage = pages_per_zspage * PAGE_SIZE / size;

		/*
		 * We iterate from biggest down to smallest classes,
		 * so huge_class_size holds the size of the first huge
		 * class. Any object bigger than or equal to that will
		 * endup in the huge class.
		 */
		if (pages_per_zspage != 1 && objs_per_zspage != 1 &&
				!huge_class_size) {
			huge_class_size = size;
			/*
			 * The object uses ZS_HANDLE_SIZE bytes to store the
			 * handle. We need to subtract it, because zs_malloc()
			 * unconditionally adds handle size before it performs
			 * size class search - so object may be smaller than
			 * huge class size, yet it still can end up in the huge
			 * class because it grows by ZS_HANDLE_SIZE extra bytes
			 * right before class lookup.
			 */
			huge_class_size -= (ZS_HANDLE_SIZE - 1);
		}

		/*
		 * size_class is used for normal zsmalloc operation such
		 * as alloc/free for that size. Although it is natural that we
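
The handle adjustment in the hunk above can be checked with a small
userspace sketch.  The values are assumptions, not taken from the diff:
an 8-byte ZS_HANDLE_SIZE (sizeof(unsigned long) on x86_64) and the
3264-byte first non-huge class quoted in the commit message.  The helper
names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed values for x86_64 with PAGE_SHIFT 12. */
#define ZS_HANDLE_SIZE       8UL
#define FIRST_NON_HUGE_CLASS 3264UL

/* Threshold computed the same way as in the hunk above. */
size_t huge_threshold(void)
{
	return FIRST_NON_HUGE_CLASS - (ZS_HANDLE_SIZE - 1);
}

/* Does the object, once the handle is added, overflow the non-huge class? */
int lands_in_huge_class(size_t obj_size)
{
	return obj_size + ZS_HANDLE_SIZE > FIRST_NON_HUGE_CLASS;
}

/* Check that "size >= threshold" agrees with the class lookup on a range. */
int threshold_matches_lookup(size_t lo, size_t hi)
{
	assert(lo <= hi);
	for (size_t s = lo; s <= hi; s++)
		if ((s >= huge_threshold()) != lands_in_huge_class(s))
			return 0;
	return 1;
}
```

Under these assumptions the threshold is 3257: a 3256-byte object grows
to 3264 bytes and still fits the non-huge class, while a 3257-byte
object grows to 3265 bytes and does not.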