1. 22 Jan, 2014 10 commits
    • mm/memblock: add memblock memory allocation apis · 26f09e9b
      Santosh Shilimkar authored
      
      
      Introduce memblock memory allocation APIs that make it possible to support
      the PAE or LPAE extensions on 32-bit archs where the physical memory start
      address can be beyond 4GB.  In such cases, the existing bootmem APIs, which
      operate on 32-bit addresses, won't work; the memblock layer, which operates
      on 64-bit addresses, is needed.
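A small sketch of why a 32-bit address type is not enough in this situation. The helper name and the sample addresses are illustrative assumptions, not part of the patch:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: report whether a physical address survives a
 * round trip through a 32-bit type. Addresses at or above 4GB do not.
 */
static bool fits_in_32_bits(uint64_t phys)
{
    return (uint64_t)(uint32_t)phys == phys;
}
```

An address such as 0x800000000 (32GB) loses its upper bits when stored in a 32-bit variable, which is the case the 64-bit memblock layer is meant to handle.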
      
      So we add equivalent APIs so that we can replace usage of bootmem with
      memblock interfaces.  Architectures already converted to NO_BOOTMEM use
      these new memblock interfaces.  The architectures which are not yet
      converted to NO_BOOTMEM continue to function as is, because we still
      maintain the fallback option of a bootmem back-end supporting these new
      interfaces.  So there is no functional change as such.
      
      In the long run, once all the architectures move to NO_BOOTMEM, we can get
      rid of the bootmem layer completely.  This is one step towards removing the
      core code dependency on bootmem, and it also gives architectures a path to
      move away from bootmem.
      
      The proposed interface will become active if both CONFIG_HAVE_MEMBLOCK
      and CONFIG_NO_BOOTMEM are specified by the arch.  In case of
      !CONFIG_NO_BOOTMEM, the memblock() wrappers will fall back to the
      existing bootmem APIs so that archs not converted to NO_BOOTMEM
      continue to work as is.

      The meaning of MEMBLOCK_ALLOC_ACCESSIBLE and MEMBLOCK_ALLOC_ANYWHERE
      is kept the same.
      
      [akpm@linux-foundation.org: s/depricated/deprecated/]
      Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
      Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Christoph Lameter <cl@linux-foundation.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Paul Walmsley <paul@pwsan.com>
      Cc: Pavel Machek <pavel@ucw.cz>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Tony Lindgren <tony@atomide.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm/memblock: switch to use NUMA_NO_NODE instead of MAX_NUMNODES · b1154233
      Grygorii Strashko authored
      
      
      It's recommended to use NUMA_NO_NODE everywhere to select "process any
      node" behavior or to indicate that "no node id specified".
      
      Hence, update the __next_free_mem_range*() APIs to accept both NUMA_NO_NODE
      and MAX_NUMNODES, but emit a warning once on MAX_NUMNODES, and correct
      the corresponding API documentation to describe the new behavior.  Also,
      update other memblock/nobootmem APIs where MAX_NUMNODES is used
      directly.
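The "accept both, warn once" behavior could be sketched roughly as below. The helper names, the MAX_NUMNODES value and the warning text are assumptions made for illustration; the kernel code itself is not reproduced here:

```c
#include <stdbool.h>
#include <stdio.h>

#define NUMA_NO_NODE (-1)
#define MAX_NUMNODES 64 /* illustrative; the real value is config-dependent */

/* Hypothetical helper: map the legacy MAX_NUMNODES value to NUMA_NO_NODE. */
static int normalize_nid(int nid)
{
    static bool warned;

    if (nid == MAX_NUMNODES) {
        if (!warned) {
            /* Emitted only on the first legacy call. */
            fprintf(stderr, "MAX_NUMNODES passed, please use NUMA_NO_NODE\n");
            warned = true;
        }
        return NUMA_NO_NODE;
    }
    return nid;
}

/* A region matches when any node is acceptable or the node ids agree. */
static bool nid_matches(int wanted, int region_nid)
{
    wanted = normalize_nid(wanted);
    return wanted == NUMA_NO_NODE || wanted == region_nid;
}
```

Callers passing either constant get the same "any node" result, so existing users keep working while they are converted.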
      
      The change was suggested by Tejun Heo.
      Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
      Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Christoph Lameter <cl@linux-foundation.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Paul Walmsley <paul@pwsan.com>
      Cc: Pavel Machek <pavel@ucw.cz>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Tony Lindgren <tony@atomide.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm/memblock: reorder parameters of memblock_find_in_range_node · 87029ee9
      Grygorii Strashko authored
      
      
      Reorder parameters of memblock_find_in_range_node to be consistent with
      other memblock APIs.
      
      The change was suggested by Tejun Heo <tj@kernel.org>.
      Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
      Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Christoph Lameter <cl@linux-foundation.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Paul Walmsley <paul@pwsan.com>
      Cc: Pavel Machek <pavel@ucw.cz>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Tony Lindgren <tony@atomide.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm/memblock: drop WARN and use SMP_CACHE_BYTES as a default alignment · 79f40fab
      Grygorii Strashko authored
      Don't produce a warning; instead, interpret 0 as the "default align",
      equal to SMP_CACHE_BYTES, when the caller of memblock_alloc_base_nid()
      doesn't specify an alignment for the block (align == 0).
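As a minimal sketch of the "0 means default alignment" rule: the helper names are hypothetical and the SMP_CACHE_BYTES value of 64 is an assumption (it is architecture-specific in the kernel):

```c
#include <stdint.h>

#define SMP_CACHE_BYTES 64u /* assumed value for illustration */

/* Treat align == 0 as a request for the default alignment. */
static uint64_t effective_align(uint64_t align)
{
    return align ? align : SMP_CACHE_BYTES;
}

/* Round addr up to the effective alignment. */
static uint64_t round_up_to(uint64_t addr, uint64_t align)
{
    align = effective_align(align);
    return (addr + align - 1) / align * align;
}
```

With this rule a caller that omits the alignment gets a cache-line aligned block instead of a warning.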
      
      This is done in preparation for introducing a common memblock alloc
      interface, to make code behavior consistent.  More details are in the
      thread below:

      	https://lkml.org/lkml/2013/10/13/117

      Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
      Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Christoph Lameter <cl@linux-foundation.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Paul Walmsley <paul@pwsan.com>
      Cc: Pavel Machek <pavel@ucw.cz>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Tony Lindgren <tony@atomide.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm/memblock: debug: don't free reserved array if !ARCH_DISCARD_MEMBLOCK · fd615c4e
      Grygorii Strashko authored
      
      
      Currently the nobootmem allocator will always try to free memory allocated for
      reserved memory regions (free_low_memory_core_early()) without taking
      into account the current memblock debugging configuration
      (CONFIG_ARCH_DISCARD_MEMBLOCK and CONFIG_DEBUG_FS state).

      As a result, if:

       - CONFIG_DEBUG_FS is defined;
       - CONFIG_ARCH_DISCARD_MEMBLOCK is not defined;
       - the reserved memory regions array has been resized during boot
      
      then:
      
       - memory allocated for the reserved memory regions array will be freed to
         the buddy allocator;
       - the debugfs entry "sys/kernel/debug/memblock/reserved" will show garbage
         instead of the state of memory reservations, like:
         0: 0x98393bc0..0x9a393bbf
         1: 0xff120000..0xff11ffff
         2: 0x00000000..0xffffffff
      
      Hence, do not free memory allocated for reserved memory regions if
      defined(CONFIG_DEBUG_FS) && !defined(CONFIG_ARCH_DISCARD_MEMBLOCK).
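
The condition above can be written as a small predicate. This is only a sketch: the function name is hypothetical, and the two config options are modeled as run-time booleans rather than preprocessor symbols:

```c
#include <stdbool.h>

/*
 * Hypothetical predicate: the reserved regions array may be handed to
 * the buddy allocator unless debugfs can still read it.
 */
static bool should_free_reserved_array(bool debug_fs, bool arch_discard_memblock)
{
    if (debug_fs && !arch_discard_memblock)
        return false; /* debugfs still needs the array */
    return true;
}
```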
      Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
      Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
      Reviewed-by: Tejun Heo <tj@kernel.org>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Christoph Lameter <cl@linux-foundation.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Paul Walmsley <paul@pwsan.com>
      Cc: Pavel Machek <pavel@ucw.cz>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Tony Lindgren <tony@atomide.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • memblock, mem_hotplug: make memblock skip hotpluggable regions if needed · 55ac590c
      Tang Chen authored
      
      
      The Linux kernel cannot migrate pages used by the kernel.  As a result,
      hotpluggable memory used by the kernel cannot be hot-removed.
      To solve this problem, the basic idea is to prevent memblock from
      allocating hotpluggable memory for the kernel early in boot, and to arrange
      all hotpluggable memory in the ACPI SRAT (System Resource Affinity Table) as
      ZONE_MOVABLE when initializing zones.

      In the previous patches, we have marked hotpluggable memory regions with
      the MEMBLOCK_HOTPLUG flag in memblock.memory.

      In this patch, we make memblock skip these hotpluggable memory regions
      in the default top-down allocation function if the movable_node boot option
      is specified.
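
A simplified sketch of skipping flagged regions during a top-down walk. The struct, the function name and the flag value are assumptions for illustration; the real allocator works on free ranges rather than whole regions:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MEMBLOCK_HOTPLUG 0x1UL /* flag name from the commit; value assumed */

struct region {
    uint64_t base;
    uint64_t size;
    unsigned long flags;
};

/*
 * Walk regions (assumed sorted by base) from highest to lowest and
 * return the index of the first one that can hold `size` bytes, or -1.
 */
static int pick_region_top_down(const struct region *r, size_t n,
                                uint64_t size, bool movable_node)
{
    for (size_t i = n; i-- > 0; ) {
        if (movable_node && (r[i].flags & MEMBLOCK_HOTPLUG))
            continue; /* leave hotpluggable memory alone */
        if (r[i].size >= size)
            return (int)i;
    }
    return -1;
}
```

Without movable_node the highest region wins as before; with it, the search falls through to the next non-hotpluggable region.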
      
      [akpm@linux-foundation.org: coding-style fixes]
      Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
      Signed-off-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: "Rafael J . Wysocki" <rjw@sisk.pl>
      Cc: Chen Tang <imtangchen@gmail.com>
      Cc: Gong Chen <gong.chen@linux.intel.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Jiang Liu <jiang.liu@huawei.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Larry Woodman <lwoodman@redhat.com>
      Cc: Len Brown <lenb@kernel.org>
      Cc: Liu Jiang <jiang.liu@huawei.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Michal Nazarewicz <mina86@mina86.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Prarit Bhargava <prarit@redhat.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Taku Izumi <izumi.taku@jp.fujitsu.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Thomas Renninger <trenn@suse.de>
      Cc: Toshi Kani <toshi.kani@hp.com>
      Cc: Vasilis Liaskovitis <vasilis.liaskovitis@profitbricks.com>
      Cc: Wanpeng Li <liwanp@linux.vnet.ibm.com>
      Cc: Wen Congyang <wency@cn.fujitsu.com>
      Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • memblock: make memblock_set_node() support different memblock_type · e7e8de59
      Tang Chen authored
      
      
      [sfr@canb.auug.org.au: fix powerpc build]
      Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
      Reviewed-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: "Rafael J . Wysocki" <rjw@sisk.pl>
      Cc: Chen Tang <imtangchen@gmail.com>
      Cc: Gong Chen <gong.chen@linux.intel.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Jiang Liu <jiang.liu@huawei.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Larry Woodman <lwoodman@redhat.com>
      Cc: Len Brown <lenb@kernel.org>
      Cc: Liu Jiang <jiang.liu@huawei.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Michal Nazarewicz <mina86@mina86.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Prarit Bhargava <prarit@redhat.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Taku Izumi <izumi.taku@jp.fujitsu.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Thomas Renninger <trenn@suse.de>
      Cc: Toshi Kani <toshi.kani@hp.com>
      Cc: Vasilis Liaskovitis <vasilis.liaskovitis@profitbricks.com>
      Cc: Wanpeng Li <liwanp@linux.vnet.ibm.com>
      Cc: Wen Congyang <wency@cn.fujitsu.com>
      Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • memblock, mem_hotplug: introduce MEMBLOCK_HOTPLUG flag to mark hotpluggable regions · 66b16edf
      Tang Chen authored
      
      
      In find_hotpluggable_memory, once we find a memory region which is
      hotpluggable, we want to mark it in memblock.memory, so that we can later
      keep the memblock allocator from allocating hotpluggable memory for the
      kernel.

      To achieve this goal, we introduce the MEMBLOCK_HOTPLUG flag to indicate
      hotpluggable memory regions in memblock, and a function
      memblock_mark_hotplug() to mark hotpluggable memory when we find it.
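
The marking step might look roughly like the sketch below. It is deliberately simplified: the struct, the function name mark_hotplug() and the flag value are assumptions, and only regions lying entirely inside the given range are marked, with no splitting of partially covered regions:

```c
#include <stddef.h>
#include <stdint.h>

#define MEMBLOCK_HOTPLUG 0x1UL /* flag name from the commit; value assumed */

struct region {
    uint64_t base;
    uint64_t size;
    unsigned long flags;
};

/*
 * Set MEMBLOCK_HOTPLUG on every region fully contained in
 * [base, base + size) and return how many regions were marked.
 */
static int mark_hotplug(struct region *r, size_t n, uint64_t base, uint64_t size)
{
    int marked = 0;

    for (size_t i = 0; i < n; i++) {
        if (r[i].base >= base && r[i].base + r[i].size <= base + size) {
            r[i].flags |= MEMBLOCK_HOTPLUG;
            marked++;
        }
    }
    return marked;
}
```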
      
      [akpm@linux-foundation.org: coding-style fixes]
      Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
      Reviewed-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: "Rafael J . Wysocki" <rjw@sisk.pl>
      Cc: Chen Tang <imtangchen@gmail.com>
      Cc: Gong Chen <gong.chen@linux.intel.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Jiang Liu <jiang.liu@huawei.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Larry Woodman <lwoodman@redhat.com>
      Cc: Len Brown <lenb@kernel.org>
      Cc: Liu Jiang <jiang.liu@huawei.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Michal Nazarewicz <mina86@mina86.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Prarit Bhargava <prarit@redhat.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Taku Izumi <izumi.taku@jp.fujitsu.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Thomas Renninger <trenn@suse.de>
      Cc: Toshi Kani <toshi.kani@hp.com>
      Cc: Vasilis Liaskovitis <vasilis.liaskovitis@profitbricks.com>
      Cc: Wanpeng Li <liwanp@linux.vnet.ibm.com>
      Cc: Wen Congyang <wency@cn.fujitsu.com>
      Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • memblock, numa: introduce flags field into memblock · 66a20757
      Tang Chen authored
      
      
      There is no flag in memblock to describe what type the memory is.
      Sometimes, we may use memblock to reserve some memory for special usage.
      And we want to know what kind of memory it is.  So we need a way to
      
      In a hotplug environment, we want to reserve hotpluggable memory so the
      kernel won't be able to use it.  And when the system is up, we have to
      free this hotpluggable memory to buddy.  So we need to mark this
      memory first.

      In order to do so, we need to mark out this special memory in memblock.
      In this patch, we introduce a new "flags" member into memblock_region:
      
         struct memblock_region {
                 phys_addr_t base;
                 phys_addr_t size;
                 unsigned long flags;		/* This is new. */
         #ifdef CONFIG_HAVE_MEMBLOCK_NODE_MAP
                 int nid;
         #endif
         };
      
      This patch does the following things:
      1) Add a "flags" member to memblock_region.
      2) Modify the prototypes of the following APIs:
      	memblock_add_region()
      	memblock_insert_region()
      3) Add memblock_reserve_region() to support reserving memory with flags, and keep
         memblock_reserve()'s prototype unmodified.
      4) Modify other APIs to support flags, but keep their prototypes unmodified.
      
      The idea is from Wen Congyang <wency@cn.fujitsu.com> and Liu Jiang <jiang.liu@huawei.com>.
      Suggested-by: Wen Congyang <wency@cn.fujitsu.com>
      Suggested-by: Liu Jiang <jiang.liu@huawei.com>
      Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
      Reviewed-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: "Rafael J . Wysocki" <rjw@sisk.pl>
      Cc: Chen Tang <imtangchen@gmail.com>
      Cc: Gong Chen <gong.chen@linux.intel.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Jiang Liu <jiang.liu@huawei.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Larry Woodman <lwoodman@redhat.com>
      Cc: Len Brown <lenb@kernel.org>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Michal Nazarewicz <mina86@mina86.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Prarit Bhargava <prarit@redhat.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Taku Izumi <izumi.taku@jp.fujitsu.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Thomas Renninger <trenn@suse.de>
      Cc: Toshi Kani <toshi.kani@hp.com>
      Cc: Vasilis Liaskovitis <vasilis.liaskovitis@profitbricks.com>
      Cc: Wanpeng Li <liwanp@linux.vnet.ibm.com>
      Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm/memblock: debug: correct displaying of upper memory boundary · 931d13f5
      Grygorii Strashko authored
      
      
      Current memblock APIs don't work on 32-bit arches with the PAE or LPAE
      extension where the physical memory start address is beyond 4GB.  The problem was
      discussed here [3] where Tejun, Yinghai(thanks) proposed a way forward
      with memblock interfaces.  Based on the proposal, this series adds the
      necessary memblock interfaces and converts the core kernel code to use
      them.  Architectures already converted to NO_BOOTMEM use these new
      interfaces; for others which still use bootmem, these new interfaces just
      fall back to the existing bootmem APIs.

      So no functional change in behavior.  In the long run, once all the
      architectures move to NO_BOOTMEM, we can get rid of the bootmem layer
      completely.  This is one step towards removing the core code dependency on
      bootmem, and it also gives architectures a path to move away from bootmem.

      Testing is done on the ARM architecture with 32-bit ARM LPAE machines with
      normal as well as sparse (faked) memory models.
      
      This patch (of 23):
      
      When debugging is enabled (cmdline has "memblock=debug"), memblock
      displays the upper memory boundary of each allocated/freed memory range
      incorrectly.  For example:

       memblock_reserve: [0x0000009e7e8000-0x0000009e7ed000] _memblock_early_alloc_try_nid_nopanic+0xfc/0x12c

      The 0x0000009e7ed000 is displayed instead of 0x0000009e7ecfff.

      Hence, correct this by changing the formula used to calculate the upper memory
      boundary to (u64)base + size - 1 instead of (u64)base + size everywhere
      in the debug messages.
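
A worked instance of the corrected formula, as a sketch. The helper names are hypothetical, and the size of 0x5000 is derived from the two addresses in the example above rather than stated in the commit:

```c
#include <stdint.h>
#include <stdio.h>

/* Last byte covered by the range [base, base + size). */
static uint64_t range_last_byte(uint64_t base, uint64_t size)
{
    return base + size - 1;
}

/* Format the range with an inclusive upper boundary. */
static int format_range(char *buf, size_t len, uint64_t base, uint64_t size)
{
    return snprintf(buf, len, "[%#016llx-%#016llx]",
                    (unsigned long long)base,
                    (unsigned long long)range_last_byte(base, size));
}
```

For base 0x9e7e8000 and size 0x5000 the upper boundary becomes 0x9e7ecfff, the last byte that actually belongs to the range.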
      Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
      Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Acked-by: Tejun Heo <tj@kernel.org>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Christoph Lameter <cl@linux-foundation.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Paul Walmsley <paul@pwsan.com>
      Cc: Pavel Machek <pavel@ucw.cz>
      Cc: Tony Lindgren <tony@atomide.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  2. 13 Nov, 2013 2 commits
    • mm/memblock.c: introduce bottom-up allocation mode · 79442ed1
      Tang Chen authored
      
      
      The Linux kernel cannot migrate pages used by the kernel.  As a result,
      kernel pages cannot be hot-removed.  So we cannot allocate hotpluggable
      memory for the kernel.
      
      ACPI SRAT (System Resource Affinity Table) contains the memory hotplug
      info.  But before SRAT is parsed, memblock has already started to allocate
      memory for the kernel.  So we need to prevent memblock from doing this.
      
      In a memory hotplug system, any numa node the kernel resides in should be
      unhotpluggable.  And for a modern server, each node could have at least
      16GB memory.  So memory around the kernel image is highly likely
      unhotpluggable.
      
      So the basic idea is: allocate memory starting from the end of the kernel
      image and moving towards higher memory.  Since not much memory is allocated
      before SRAT is parsed, it is highly likely to be in the same node as the
      kernel image.
      
      The current memblock can only allocate memory top-down.  So this patch
      introduces a new bottom-up allocation mode to allocate memory bottom-up.
      And later when we use this allocation direction to allocate memory, we
      will limit the start address above the kernel.
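
A bottom-up search over free ranges could be sketched as follows. The struct, the function names and the parameter `lowest` (standing in for the end of the kernel image) are assumptions; 0 is used as the "not found" result for this sketch:

```c
#include <stddef.h>
#include <stdint.h>

struct free_range {
    uint64_t base;
    uint64_t size;
};

static uint64_t align_up(uint64_t x, uint64_t align)
{
    return (x + align - 1) / align * align;
}

/*
 * Scan free ranges (assumed sorted by base) from low to high and return
 * the lowest aligned address at or above `lowest` that fits `size`
 * bytes, or 0 if nothing fits.
 */
static uint64_t find_bottom_up(const struct free_range *r, size_t n,
                               uint64_t lowest, uint64_t size, uint64_t align)
{
    for (size_t i = 0; i < n; i++) {
        uint64_t start = r[i].base > lowest ? r[i].base : lowest;
        uint64_t end = r[i].base + r[i].size;

        start = align_up(start, align);
        if (start < end && end - start >= size)
            return start;
    }
    return 0;
}
```

Passing the end of the kernel image as `lowest` keeps early allocations just above the image, which is the placement the commit message argues is most likely to be non-hotpluggable.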
      Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
      Signed-off-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
      Acked-by: Toshi Kani <toshi.kani@hp.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Wanpeng Li <liwanp@linux.vnet.ibm.com>
      Cc: Thomas Renninger <trenn@suse.de>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Jiang Liu <jiang.liu@huawei.com>
      Cc: Wen Congyang <wency@cn.fujitsu.com>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
      Cc: Taku Izumi <izumi.taku@jp.fujitsu.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Michal Nazarewicz <mina86@mina86.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm/memblock.c: factor out of top-down allocation · 1402899e
      Tang Chen authored
      
      
      [Problem]
      
      The current Linux cannot migrate pages used by the kernel because of the
      kernel direct mapping.  In Linux kernel space, va = pa + PAGE_OFFSET.
      When the pa is changed, we cannot simply update the pagetable and keep the
      va unmodified.  So the kernel pages are not migratable.
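
The direct-mapping relation can be illustrated with a small sketch. The PAGE_OFFSET value and the helper names are assumptions chosen for the example; the real constant depends on architecture and configuration:

```c
#include <stdint.h>

#define PAGE_OFFSET 0xffff880000000000ULL /* assumed value for illustration */

/* Direct mapping: the virtual address is a fixed offset from the physical one. */
static uint64_t direct_map_va(uint64_t pa)
{
    return pa + PAGE_OFFSET;
}

static uint64_t direct_map_pa(uint64_t va)
{
    return va - PAGE_OFFSET;
}
```

Because the offset is fixed, moving a page to a different physical address necessarily changes its virtual address, so any pointer into the old location would become stale.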
      
      There are also some other issues that make kernel pages
      non-migratable.  For example, the physical address may be cached somewhere and
      used later, and it is not feasible to update all such caches.
      
      When doing memory hotplug in Linux, we first migrate all the pages in one
      memory device somewhere else, and then remove the device.  But if pages
      are used by the kernel, they are not migratable.  As a result, memory used
      by the kernel cannot be hot-removed.
      
      Modifying the kernel direct mapping mechanism is too difficult.  It
      may also hurt kernel performance and stability.  So we use the
      following approach to memory hotplug.
      
      [What we are doing]
      
      In Linux, memory in one numa node is divided into several zones.  One of
      the zones is ZONE_MOVABLE, which the kernel won't use.
      
      In order to implement memory hotplug in Linux, we are going to arrange all
      hotpluggable memory in ZONE_MOVABLE so that the kernel won't use this
      memory.  To do this, we need ACPI's help.
      
      In ACPI, SRAT(System Resource Affinity Table) contains NUMA info.  The
      memory affinities in SRAT record every memory range in the system, and
      also, flags specifying if the memory range is hotpluggable.  (Please refer
      to ACPI spec 5.0 5.2.16)
      
      With the help of SRAT, we have to do the following two things to achieve our
      goal:
      
      1. When doing memory hot-add, allow users to arrange hotpluggable memory as
         ZONE_MOVABLE.
         (This has been done by the MOVABLE_NODE functionality in Linux.)

      2. When the system is booting, prevent the bootmem allocator from allocating
         hotpluggable memory for the kernel before the memory initialization
         finishes.

      Problem 2 is the key problem we are going to solve. But before solving it,
      we need some preparation. Please see below.
      
      [Preparation]
      
      The bootloader has to load the kernel image into memory.  And this memory must
      be unhotpluggable.  We cannot prevent this anyway.  So in a memory hotplug
      system, we can assume any node the kernel resides in is not hotpluggable.

      Before SRAT is parsed, we don't know which memory ranges are hotpluggable.
      But memblock has already started to work.  In the current kernel,
      memblock allocates the following memory before SRAT is parsed:
      
      setup_arch()
       |->memblock_x86_fill()            /* memblock is ready */
       |......
       |->early_reserve_e820_mpc_new()   /* allocate memory under 1MB */
       |->reserve_real_mode()            /* allocate memory under 1MB */
       |->init_mem_mapping()             /* allocate page tables, about 2MB to map 1GB memory */
       |->dma_contiguous_reserve()       /* specified by user, should be low */
       |->setup_log_buf()                /* specified by user, several mega bytes */
       |->relocate_initrd()              /* could be large, but will be freed after boot, should reorder */
       |->acpi_initrd_override()         /* several mega bytes */
       |->reserve_crashkernel()          /* could be large, should reorder */
       |......
       |->initmem_init()                 /* Parse SRAT */
      
      According to Tejun's advice, before SRAT is parsed, we should try our best
      to allocate memory near the kernel image.  Since the whole node the kernel
      resides in won't be hotpluggable, and for a modern server a node may have
      at least 16GB of memory, allocating several megabytes of memory around the
      kernel image won't cross into hotpluggable memory.
      
      [About this patchset]
      
      So this patchset is the preparation for problem 2 that we want to
      solve.  It does the following:

      1. Make memblock able to allocate memory bottom up.
         1) Keep all the memblock APIs' prototypes unmodified.
         2) When the direction is bottom up, keep the start address greater than the
            end of the kernel image.

      2. Improve init_mem_mapping() to support allocating page tables in the
         bottom-up direction.

      3. Introduce the "movable_node" boot option to enable and disable this
         functionality.
      
      This patch (of 6):
      
      Create a new function __memblock_find_range_top_down to factor the
      top-down allocation out of memblock_find_in_range_node.  This is
      preparation for the new bottom-up allocation mode introduced in
      the following patch.
      Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
      Signed-off-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
      Acked-by: Tejun Heo <tj@kernel.org>
      Acked-by: Toshi Kani <toshi.kani@hp.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Wanpeng Li <liwanp@linux.vnet.ibm.com>
      Cc: Thomas Renninger <trenn@suse.de>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Jiang Liu <jiang.liu@huawei.com>
      Cc: Wen Congyang <wency@cn.fujitsu.com>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
      Cc: Taku Izumi <izumi.taku@jp.fujitsu.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Michal Nazarewicz <mina86@mina86.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  3. 11 Sep, 2013 1 commit
    • memblock, numa: binary search node id · e76b63f8
      Yinghai Lu authored
      
      
      The current early_pfn_to_nid() on archs that support memblock walks
      memblock.memory one entry at a time, so lookups near the end take many tries.

      We can use the existing memblock_search to find the node id for a given pfn,
      which could save some time on bigger systems that have many entries in the
      memblock.memory array.
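
The lookup idea can be sketched as a plain binary search over sorted, non-overlapping regions. The struct and function name are assumptions, and the sketch works on addresses rather than pfns to stay short:

```c
#include <stddef.h>
#include <stdint.h>

struct region {
    uint64_t base;
    uint64_t size;
    int nid;
};

/*
 * Return the node id of the region containing addr, or -1 if addr falls
 * into a hole. Regions must be sorted by base and must not overlap.
 */
static int search_nid(const struct region *r, size_t n, uint64_t addr)
{
    size_t left = 0, right = n;

    while (left < right) {
        size_t mid = left + (right - left) / 2;

        if (addr < r[mid].base)
            right = mid;
        else if (addr >= r[mid].base + r[mid].size)
            left = mid + 1;
        else
            return r[mid].nid;
    }
    return -1;
}
```

The number of comparisons grows with the logarithm of the array length instead of linearly, which matters most for lookups that previously had to walk almost the entire array.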
      
      Here are the timing differences for several machines.  In each case with
      the patch less time was spent in __early_pfn_to_nid().
      
                              3.11-rc5        with patch      difference (%)
                              --------        ----------      --------------
      UV1: 256 nodes  9TB:     411.66          402.47         -9.19 (2.23%)
      UV2: 255 nodes 16TB:    1141.02         1138.12         -2.90 (0.25%)
      UV2:  64 nodes  2TB:     128.15          126.53         -1.62 (1.26%)
      UV2:  32 nodes  2TB:     121.87          121.07         -0.80 (0.66%)
                              Time in seconds.
      Signed-off-by: Yinghai Lu <yinghai@kernel.org>
      Cc: Tejun Heo <tj@kernel.org>
      Acked-by: Russ Anderson <rja@sgi.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      e76b63f8
  4. 09 Jul, 2013 1 commit
  5. 29 Apr, 2013 2 commits
  6. 02 Mar, 2013 1 commit
    • Yinghai Lu's avatar
      x86, ACPI, mm: Revert movablemem_map support · 20e6926d
      Yinghai Lu authored
      Tim found:
      
        WARNING: at arch/x86/kernel/smpboot.c:324 topology_sane.isra.2+0x6f/0x80()
        Hardware name: S2600CP
        sched: CPU #1's llc-sibling CPU #0 is not on the same node! [node: 1 != 0]. Ignoring dependency.
        smpboot: Booting Node   1, Processors  #1
        Modules linked in:
        Pid: 0, comm: swapper/1 Not tainted 3.9.0-0-generic #1
        Call Trace:
          set_cpu_sibling_map+0x279/0x449
          start_secondary+0x11d/0x1e5
      
      Don Morris reproduced on a HP z620 workstation, and bisected it to
      commit e8d19552 ("acpi, memory-hotplug: parse SRAT before memblock
      is ready")
      
      It turns out movable_map has some problems, and it breaks several things
      
      1. numa_init is called several times, NOT just for srat, so
      	nodes_clear(numa_nodes_parsed)
      	memset(&numa_meminfo, 0, sizeof(numa_meminfo))
         cannot simply be removed.  The sequence to consider is: numaq, srat, amd, dummy,
         and the fallback path must keep working.
      
      2. It simply splits early_parse_srat out of acpi_numa_init.
         a. early_parse_srat is NOT called for ia64, so ia64 breaks.
         b.  for (i = 0; i < MAX_LOCAL_APIC; i++)
       	     set_apicid_to_node(i, NUMA_NO_NODE)
            is still left in numa_init, so it just clears the result from early_parse_srat.
            It should be moved before that.
         c.  It breaks ACPI_TABLE_OVERRIDE, as the acpi table scan is moved
             early, before the override from INITRD is settled.
      
      3. The patch TITLE is totally misleading: there is NO x86 in the title,
         but it changes critical x86 code.  As a result, the x86 maintainers
         did not pay attention and the problem was not found early.  Those
         patches really should be routed via tip/x86/mm.
      
      4. After that commit, the following ranges can no longer use movable RAM:
        a. real_mode code.... well..funny, legacy Node0 [0,1M) could be hot-removed?
        b. initrd... it will be freed after booting, so it could be on movable...
        c. crashkernel for kdump...: looks like we can not put kdump kernel above 4G
      	anymore.
        d. init_mem_mapping: can not put page table high anymore.
        e. initmem_init: vmemmap can not be high local node anymore. That is
           not good.
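      
      Point 1 above can be sketched as follows.  This is a simplified
      user-space model, not the x86 code: struct numa_state and the two
      init functions are hypothetical stand-ins for numa_nodes_parsed,
      numa_meminfo and the numaq/srat/amd/dummy init methods.  It shows why
      the state has to be cleared before every attempt when several methods
      are tried in sequence.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_NODES 8

/* Hypothetical stand-in for numa_nodes_parsed and numa_meminfo. */
struct numa_state {
    unsigned char parsed[MAX_NODES];
    int nr_blks;
};

typedef int (*init_fn)(struct numa_state *);

/* Hypothetical method that records partial results and then fails. */
static int srat_init_fails(struct numa_state *st)
{
    st->parsed[1] = 1;
    st->nr_blks = 3;
    return -1;
}

/* Hypothetical last-resort method: a single node 0 with one block. */
static int dummy_init(struct numa_state *st)
{
    st->parsed[0] = 1;
    st->nr_blks = 1;
    return 0;
}

/* Clear whatever a previous, failed attempt left behind, then try fn. */
static int try_init(struct numa_state *st, init_fn fn)
{
    memset(st, 0, sizeof(*st));
    return fn(st);
}

/* Try each method in order; return the index of the first that succeeds. */
static int init_with_fallback(struct numa_state *st, const init_fn *fns, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (try_init(st, fns[i]) == 0)
            return (int)i;
    return -1;
}
```

      Without the reset in try_init(), the stale node 1 from the failed
      attempt would leak into the result of the fallback method.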
      
      If a node is hotpluggable, memory-related ranges such as the page table
      and vmemmap could be on that node without problem, and should be on that
      node.
      
      We have a workaround patch that could fix some of the problems, but some
      cannot be fixed.
      
      So just remove that offending commit and related ones including:
      
       f7210e6c ("mm/memblock.c: use CONFIG_HAVE_MEMBLOCK_NODE_MAP to
          protect movablecore_map in memblock_overlaps_region().")
      
       01a178a9 ("acpi, memory-hotplug: support getting hotplug info from
          SRAT")
      
       27168d38 ("acpi, memory-hotplug: extend movablemem_map ranges to
          the end of node")
      
       e8d19552 ("acpi, memory-hotplug: parse SRAT before memblock is
          ready")
      
       fb06bc8e ("page_alloc: bootmem limit with movablecore_map")
      
       42f47e27 ("page_alloc: make movablemem_map have higher priority")
      
       6981ec31 ("page_alloc: introduce zone_movable_limit[] to keep
          movable limit for nodes")
      
       34b71f1e ("page_alloc: add movable_memmap kernel parameter")
      
       4d59a751 ("x86: get pg_data_t's memory from other node")
      
      Later we should have patches that make sure the kernel puts the page
      table and vmemmap on local node RAM instead of pushing them down to
      node0.  We also need to find a way to put other kernel-used RAM on
      local node RAM.
      Reported-by: Tim Gardner <tim.gardner@canonical.com>
      Reported-by: Don Morris <don.morris@hp.com>
      Bisected-by: Don Morris <don.morris@hp.com>
      Tested-by: Don Morris <don.morris@hp.com>
      Signed-off-by: Yinghai Lu <yinghai@kernel.org>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Thomas Renninger <trenn@suse.de>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Tang Chen <tangchen@cn.fujitsu.com>
      Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      20e6926d
  7. 24 Feb, 2013 2 commits
  8. 30 Jan, 2013 1 commit
  9. 11 Jan, 2013 1 commit
    • Lin Feng's avatar
      mm: memblock: fix wrong memmove size in memblock_merge_regions() · c0232ae8
      Lin Feng authored
      
      
      The memmove span covers from (next+1) to the end of the array, and the
      index of next is (i+1), so the index of (next+1) is (i+2).  So the size
      of remaining array elements is (type->cnt - (i + 2)).
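      
      A minimal sketch of the index arithmetic, not the kernel function:
      struct region and merge_regions() are hypothetical simplifications
      that merge regions which touch end to start.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified region: [base, base + size). */
struct region {
    uint64_t base;
    uint64_t size;
};

/* Merge regions that touch end to start, in place; returns the new count. */
static size_t merge_regions(struct region *regs, size_t cnt)
{
    size_t i = 0;

    while (i + 1 < cnt) {
        struct region *this = &regs[i];
        struct region *next = &regs[i + 1];

        if (this->base + this->size != next->base) {
            i++;
            continue;
        }

        this->size += next->size;
        /*
         * next has index i + 1, so next + 1 has index i + 2 and
         * cnt - (i + 2) elements remain to be moved down by one slot.
         */
        memmove(next, next + 1, (cnt - (i + 2)) * sizeof(*next));
        cnt--;
    }
    return cnt;
}
```

      Using cnt - (i + 1) instead would copy one element too many and read
      past the last valid entry.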
      
      The remaining elements of the memblock array are moved forward by one
      element, and the bug copies only one additional element, so there is no
      write overflow here, only a read overflow.  It may read one element past
      the end of the array if the array happens to be full.  Usually this does
      not matter at all, but if the array happens to be located at the end of
      a memblock, it may cause an invalid read because the physical address
      does not exist.
      
      There are 2 *happens to be* conditions here, so I think the probability
      is quite low; I don't know whether anyone has been hit by this bug before.
      
      Mostly I think it's user-invisible.
      Signed-off-by: Lin Feng <linfeng@cn.fujitsu.com>
      Acked-by: Tejun Heo <tj@kernel.org>
      Reviewed-by: Wanpeng Li <liwanp@linux.vnet.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      c0232ae8
  10. 24 Oct, 2012 1 commit
  11. 09 Oct, 2012 2 commits
  12. 05 Sep, 2012 1 commit
  13. 01 Aug, 2012 1 commit
  14. 11 Jul, 2012 1 commit
    • Yinghai Lu's avatar
      memblock: free allocated memblock_reserved_regions later · 29f67386
      Yinghai Lu authored
      memblock_free_reserved_regions() calls memblock_free(), but
      memblock_free() can itself double reserved.regions, so we could end up
      freeing the old range used for reserved.regions.
      
      Also tj said there is another bug which could be related to this.
      
      | I don't think we're saving any noticeable
      | amount by doing this "free - give it to page allocator - reserve
      | again" dancing.  We should just allocate regions aligned to page
      | boundaries and free them later when memblock is no longer in use.
      
      In that case, with DEBUG_PAGEALLOC enabled, we get a panic:
      
           memblock_free: [0x0000102febc080-0x0000102febf080] memblock_free_reserved_regions+0x37/0x39
        BUG: unable to handle kernel paging request at ffff88102febd948
        IP: [<ffffffff836a5774>] __next_free_mem_range+0x9b/0x155
        PGD 4826063 PUD cf67a067 PMD cf7fa067 PTE 800000102febd160
        Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
        CPU 0
        Pid: 0, comm: swapper Not tainted 3.5.0-rc2-next-20120614-sasha #447
        RIP: 0010:[<ffffffff836a5774>]  [<ffffffff836a5774>] __next_free_mem_range+0x9b/0x155
      
      See the discussion at https://lkml.org/lkml/2012/6/13/469
      
      
      
      So try to allocate with PAGE_SIZE alignment and free it later.
      Reported-by: Sasha Levin <levinsasha928@gmail.com>
      Acked-by: Tejun Heo <tj@kernel.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Yinghai Lu <yinghai@kernel.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      29f67386
  15. 20 Jun, 2012 2 commits
  16. 08 Jun, 2012 1 commit
  17. 29 May, 2012 2 commits
    • Gavin Shan's avatar
      mm/memblock: fix memory leak on extending regions · 181eb394
      Gavin Shan authored
      
      
      The overall memblock has been organized into the memory regions and
      reserved regions.  Initially, the memory regions and reserved regions are
      stored in the predetermined arrays of "struct memblock_region".  It's
      possible for the arrays to be enlarged when we have newly added regions
      but no free space is left there.  The policy here is to create a
      double-sized array using either the slab allocator or the memblock
      allocator.  Unfortunately, we didn't free the old array, which might have
      been allocated through the slab allocator before.  That would cause a
      memory leak.
      
      The patch introduces 2 variables to track where (slab or memblock) the
      memory and reserved regions come from.  The memory for the memory or
      reserved regions will be deallocated by kfree() if it was allocated by
      the slab allocator, which fixes the memory leak.
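      
      A user-space sketch of the same bookkeeping, under stated assumptions:
      malloc()/free() stand in for the slab allocator, a static array stands
      in for the predetermined array, and struct region_array with its
      heap_backed flag is a hypothetical analogue of the tracking variables.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define INIT_REGIONS 4

struct region {
    unsigned long base;
    unsigned long size;
};

struct region_array {
    struct region *regions;
    size_t cnt;
    size_t max;
    bool heap_backed;   /* true once regions points at malloc()ed storage */
};

/* Statically sized initial storage, which must never be passed to free(). */
static struct region init_regions[INIT_REGIONS];

static void region_array_init(struct region_array *a)
{
    a->regions = init_regions;
    a->cnt = 0;
    a->max = INIT_REGIONS;
    a->heap_backed = false;
}

/* Double the capacity; release the old array only if it came from the heap. */
static int region_array_double(struct region_array *a)
{
    size_t new_max = a->max * 2;
    struct region *new_regions = malloc(new_max * sizeof(*new_regions));

    if (!new_regions)
        return -1;
    memcpy(new_regions, a->regions, a->cnt * sizeof(*new_regions));

    if (a->heap_backed)
        free(a->regions);   /* without this, every doubling leaks the old array */

    a->regions = new_regions;
    a->max = new_max;
    a->heap_backed = true;
    return 0;
}

static int region_array_add(struct region_array *a, unsigned long base,
                            unsigned long size)
{
    if (a->cnt == a->max && region_array_double(a) != 0)
        return -1;
    a->regions[a->cnt].base = base;
    a->regions[a->cnt].size = size;
    a->cnt++;
    return 0;
}
```

      The flag is what makes the free safe: the first doubling leaves the
      static array alone, and later doublings release the previous heap
      array.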
      Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      181eb394
    • Gavin Shan's avatar
      mm/memblock: cleanup on duplicate VA/PA conversion · 4e2f0775
      Gavin Shan authored
      
      
      The overall memblock has been organized into the memory regions and
      reserved regions.  Initially, the memory regions and reserved regions are
      stored in the predetermined arrays of "struct memblock_region".  It's
      possible for the arrays to be enlarged when we have newly added regions
      for them but not enough space there.  In that situation, we create a
      double-sized array to meet the requirement.  However, the original
      implementation converted the VA (Virtual Address) of the newly allocated
      array of regions to a PA (Physical Address), then translated it back
      when the new array was allocated from slab.  That's actually unnecessary.
      
      The patch removes the duplicate VA/PA conversion.
      Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      4e2f0775
  18. 20 Apr, 2012 1 commit
  19. 01 Mar, 2012 1 commit
    • Tejun Heo's avatar
      memblock: Fix size aligning of memblock_alloc_base_nid() · 847854f5
      Tejun Heo authored
      memblock allocator aligns @size to @align to reduce the amount
      of fragmentation.  Commit 7bd0b0f0 ("memblock: Reimplement memblock
      allocation using reverse free area iterator") broke it by
      incorrectly relocating @size aligning to
      memblock_find_in_range_node().  As the aligned size is not
      propagated back to memblock_alloc_base_nid(), the actually
      reserved size isn't aligned.
      
      While this increases memory use for memblock reserved array,
      this shouldn't cause any critical failure; however, it seems
      that the size aligning was hiding a use-beyond-allocation bug in
      sparc64 and losing the aligning causes boot failure.
      
      The underlying problem is currently being debugged but this is a
      proper fix in itself, it's already pretty late in -rc cycle for
      boot failures and reverting the change for debugging isn't
      difficult. Restore the size aligning moving it to
      memblock_alloc_base_nid().
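      
      A minimal sketch of why the rounding has to happen in the function that
      records the reservation.  It is not the memblock code: struct
      reservation, round_up_pow2() and alloc_top_down() are hypothetical, and
      @align is assumed to be a power of two.

```c
#include <assert.h>
#include <stdint.h>

struct reservation {
    uint64_t base;
    uint64_t size;
};

/* Round x up to a multiple of align; align must be a power of two. */
static uint64_t round_up_pow2(uint64_t x, uint64_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (x + align - 1) & ~(align - 1);
}

/*
 * Carve an aligned block from the top of [start, end).
 * The size is rounded up here, in the function that records the
 * reservation, so the reserved size is aligned as well as the base.
 * Returns 0 on success, -1 if the block does not fit.
 */
static int alloc_top_down(uint64_t start, uint64_t end, uint64_t size,
                          uint64_t align, struct reservation *out)
{
    uint64_t base;

    size = round_up_pow2(size, align);
    if (size == 0 || end < start || end - start < size)
        return -1;

    base = (end - size) & ~(align - 1);
    if (base < start)
        return -1;

    out->base = base;
    out->size = size;
    return 0;
}
```

      If the rounding were done only inside a helper that searches for the
      base address, the caller would still reserve the original, unaligned
      size, which is the behaviour described above.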
      Reported-by: Meelis Roos <mroos@linux.ee>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Grant Likely <grant.likely@secretlab.ca>
      Cc: Rob Herring <rob.herring@calxeda.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Link: http://lkml.kernel.org/r/20120228205621.GC3252@dhcp-172-17-108-109.mtv.corp.google.com
      
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      LKML-Reference: <alpine.SOC.1.00.1202130942030.1488@math.ut.ee>
      847854f5
  20. 16 Jan, 2012 1 commit
  21. 08 Dec, 2011 5 commits
    • Tejun Heo's avatar
      memblock: Reimplement memblock allocation using reverse free area iterator · 7bd0b0f0
      Tejun Heo authored
      
      
      Now that all early memory information is in memblock when enabled, we
      can implement reverse free area iterator and use it to implement NUMA
      aware allocator which is then wrapped for simpler variants instead of
      the confusing and inefficient mending of information in separate NUMA
      aware allocator.
      
      Implement for_each_free_mem_range_reverse(), use it to reimplement
      memblock_find_in_range_node() which in turn is used by all allocators.
      
      The visible allocator interface is inconsistent and can probably use
      some cleanup too.
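      
      A simplified sketch of a top-down search over free areas, that is,
      memory ranges minus reserved ranges.  It is not the kernel iterator:
      struct range and find_top_down() are hypothetical, NUMA node filtering
      is left out, both arrays are assumed sorted and non-overlapping, and
      @align is assumed to be a power of two.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical half-open range [start, end). */
struct range {
    uint64_t start;
    uint64_t end;
};

/*
 * Walk the free areas from the highest address down and return the
 * highest aligned base where size bytes fit, or 0 on failure.
 */
static uint64_t find_top_down(const struct range *mem, size_t nmem,
                              const struct range *rsv, size_t nrsv,
                              uint64_t size, uint64_t align)
{
    assert(size > 0 && align != 0 && (align & (align - 1)) == 0);

    for (size_t m = nmem; m-- > 0; ) {
        uint64_t hi = mem[m].end;   /* top of the current free candidate */

        /* r counts down to 0; r == 0 means "below every reservation". */
        for (size_t r = nrsv + 1; r-- > 0; ) {
            /* The free gap starts where the next lower reservation ends. */
            uint64_t lo = r ? rsv[r - 1].end : 0;

            if (lo < mem[m].start)
                lo = mem[m].start;

            if (hi > lo && hi - lo >= size) {
                uint64_t base = (hi - size) & ~(align - 1);

                if (base >= lo)
                    return base;
            }

            if (r == 0)
                break;
            /* The next gap ends where this reservation starts. */
            if (rsv[r - 1].start < hi)
                hi = rsv[r - 1].start;
            if (hi <= mem[m].start)
                break;
        }
    }
    return 0;
}
```

      Searching from the top keeps low memory free for users that need it,
      and a single routine of this shape can back all the allocator
      variants.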
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Yinghai Lu <yinghai@kernel.org>
      7bd0b0f0
    • Tejun Heo's avatar
      memblock: Kill early_node_map[] · 0ee332c1
      Tejun Heo authored
      
      
      Now all ARCH_POPULATES_NODE_MAP archs select HAVE_MEMBLOCK_NODE_MAP -
      there's no user of early_node_map[] left.  Kill early_node_map[] and
      replace ARCH_POPULATES_NODE_MAP with HAVE_MEMBLOCK_NODE_MAP.  Also,
      relocate for_each_mem_pfn_range() and helper from mm.h to memblock.h
      as page_alloc.c would no longer host an alternative implementation.
      
      This change is ultimately one to one mapping and shouldn't cause any
      observable difference; however, after the recent changes, there are
      some functions which now would fit memblock.c better than page_alloc.c
      and dependency on HAVE_MEMBLOCK_NODE_MAP instead of HAVE_MEMBLOCK
      doesn't make much sense on some of them.  Further cleanups for
      functions inside HAVE_MEMBLOCK_NODE_MAP in mm.h would be nice.
      
      -v2: Fix compile bug introduced by mis-spelling
       CONFIG_HAVE_MEMBLOCK_NODE_MAP to CONFIG_MEMBLOCK_HAVE_NODE_MAP in
       mmzone.h.  Reported by Stephen Rothwell.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Chen Liqin <liqin.chen@sunplusct.com>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      0ee332c1
    • Tejun Heo's avatar
      memblock: Implement memblock_add_node() · 7fb0bc3f
      Tejun Heo authored
      
      
      Implement memblock_add_node() which can add a new memblock memory
      region with specific node ID.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Yinghai Lu <yinghai@kernel.org>
      7fb0bc3f
    • Tejun Heo's avatar
      memblock: s/memblock_analyze()/memblock_allow_resize()/ and update users · 1aadc056
      Tejun Heo authored
      
      
      The only function of memblock_analyze() is now allowing resize of
      memblock region arrays.  Rename it to memblock_allow_resize() and
      update its users.
      
      * The following users remain the same other than renaming.
      
        arm/mm/init.c::arm_memblock_init()
        microblaze/kernel/prom.c::early_init_devtree()
        powerpc/kernel/prom.c::early_init_devtree()
        openrisc/kernel/prom.c::early_init_devtree()
        sh/mm/init.c::paging_init()
        sparc/mm/init_64.c::paging_init()
        unicore32/mm/init.c::uc32_memblock_init()
      
      * In the following users, analyze was used to update total size which
        is no longer necessary.
      
        powerpc/kernel/machine_kexec.c::reserve_crashkernel()
        powerpc/kernel/prom.c::early_init_devtree()
        powerpc/mm/init_32.c::MMU_init()
        powerpc/mm/tlb_nohash.c::__early_init_mmu()  
        powerpc/platforms/ps3/mm.c::ps3_mm_add_memory()
        powerpc/platforms/embedded6xx/wii.c::wii_memory_fixups()
        sh/kernel/machine_kexec.c::reserve_crashkernel()
      
      * x86/kernel/e820.c::memblock_x86_fill() was directly setting
        memblock_can_resize before populating memblock and calling analyze
        afterwards.  Call memblock_allow_resize() before starting to populate.
      
      memblock_can_resize is now static inside memblock.c.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      1aadc056
    • Tejun Heo's avatar
      memblock: Track total size of regions automatically · 1440c4e2
      Tejun Heo authored
      
      
      Total size of memory regions was calculated by memblock_analyze()
      requiring explicitly calling the function between operations which can
      change memory regions and possible users of total size, which is
      cumbersome and fragile.
      
      This patch makes each memblock_type track total size automatically
      with minor modifications to memblock manipulation functions and removes
      the requirement to call memblock_analyze().  [__]memblock_dump_all()
      now also dumps the total size of reserved regions.
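      
      A minimal sketch of the bookkeeping, assuming a fixed-size array and
      no merging: struct region_type, type_add() and type_remove() are
      hypothetical simplifications of memblock_type and its manipulation
      functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_REGIONS 16

struct region {
    uint64_t base;
    uint64_t size;
};

/* Hypothetical analogue of a memblock_type that carries its own total. */
struct region_type {
    size_t cnt;
    uint64_t total_size;
    struct region regions[MAX_REGIONS];
};

/* Append a region and keep total_size in step. */
static int type_add(struct region_type *t, uint64_t base, uint64_t size)
{
    if (t->cnt == MAX_REGIONS)
        return -1;
    t->regions[t->cnt].base = base;
    t->regions[t->cnt].size = size;
    t->cnt++;
    t->total_size += size;
    return 0;
}

/* Remove region idx (idx < cnt) and keep total_size in step. */
static void type_remove(struct region_type *t, size_t idx)
{
    t->total_size -= t->regions[idx].size;
    memmove(&t->regions[idx], &t->regions[idx + 1],
            (t->cnt - (idx + 1)) * sizeof(t->regions[0]));
    t->cnt--;
}
```

      Because every mutation updates the total, readers can use total_size
      at any point without a separate recalculation pass.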
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Yinghai Lu <yinghai@kernel.org>
      1440c4e2