- 16 Feb, 2017 1 commit
Paul Mackerras authored
Currently, if the kernel is running on a POWER9 processor under a hypervisor, it may try to use the radix MMU even though it doesn't have the necessary code to do so (it doesn't negotiate use of radix, and it doesn't do the H_REGISTER_PROC_TBL hcall). If the hypervisor supports both radix and HPT, then it will set up the guest to use HPT (since the guest doesn't request radix in the CAS call), but if the radix feature bit is set in the ibm,pa-features property (which is valid, since ibm,pa-features is defined to represent the capabilities of the processor) the guest will try to use radix, resulting in a crash when it turns the MMU on.

This makes the minimal fix for the current code, which is to disable radix unless we are running in hypervisor mode.

Fixes: 2bfd65e4 ("powerpc/mm/radix: Add radix callbacks for early init routines")
Cc: stable@vger.kernel.org # v4.7+
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
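As a rough illustration, the minimal fix described above boils down to a check of this shape (a sketch only; it assumes the mfmsr()/MSR_HV and cur_cpu_spec->mmu_features names used elsewhere in arch/powerpc):

    /* Sketch: clear the radix feature bit unless running bare metal
     * (MSR_HV set), since we can't yet do radix as a guest. */
    if (!(mfmsr() & MSR_HV))
        cur_cpu_spec->mmu_features &= ~MMU_FTR_TYPE_RADIX;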
-
- 24 Dec, 2016 1 commit
Linus Torvalds authored
This was entirely automated, using the script by Al:

    PATT='^[[:blank:]]*#[[:blank:]]*include[[:blank:]]*<asm/uaccess.h>'
    sed -i -e "s!$PATT!#include <linux/uaccess.h>!" \
        $(git grep -l "$PATT"|grep -v ^include/linux/uaccess.h)

to do the replacement at the end of the merge window.

Requested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 10 Dec, 2016 1 commit
Christophe Leroy authored
Today powerpc64 uses a set of pgtable_caches while powerpc32 uses standard pages when using 4k pages and a single pgtable_cache if using other size pages. In preparation for implementing huge pages on the 8xx, this patch replaces the specific powerpc32 handling by the 64-bit approach. This is done by:

* moving the 64-bit pgtable_cache_add() and pgtable_cache_init() into a new file called init-common.c
* modifying pgtable_cache_init() to also handle the case without PMD
* removing the 32-bit version of pgtable_cache_add() and pgtable_cache_init()
* copying related header contents from the 64-bit headers into both the book3s/32 and nohash/32 header files

On the 8xx, the following cache sizes will be used:

* 4k pages mode:
  - PGT_CACHE(10) for PGD
  - PGT_CACHE(3) for 512k hugepage tables
* 16k pages mode:
  - PGT_CACHE(6) for PGD
  - PGT_CACHE(7) for 512k hugepage tables
  - PGT_CACHE(3) for 8M hugepage tables

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Scott Wood <oss@buserror.net>
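A sketch of what the shared init in init-common.c might look like (the pgd_ctor/pmd_ctor names and the PMD_CACHE_INDEX guard are assumptions beyond what the message states); the no-PMD case simply skips that cache:

    void pgtable_cache_init(void)
    {
        pgtable_cache_add(PGD_INDEX_SIZE, pgd_ctor);

        if (PMD_CACHE_INDEX && !PGT_CACHE(PMD_CACHE_INDEX))
            pgtable_cache_add(PMD_CACHE_INDEX, pmd_ctor);

        if (!PGT_CACHE(PGD_INDEX_SIZE))
            panic("Couldn't allocate pgtable caches");
    }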
-
- 01 Aug, 2016 6 commits
Aneesh Kumar K.V authored
This switches early feature checks to use the non-static-key variant of the function. In later patches we will be switching cpu_has_feature() and mmu_has_feature() to use static keys, and we can use them only after static key/jump label is initialized. Any feature check before jump label init should be done using this new helper.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
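Such a helper can read the feature word directly (a sketch; it mirrors the existing cpu_features bookkeeping, and the CPU_FTRS_* macro names follow the existing powerpc headers):

    /* Safe before jump_label_init(): no static keys involved. */
    static inline bool early_cpu_has_feature(unsigned long feature)
    {
        return !!((CPU_FTRS_ALWAYS & feature) ||
                  (CPU_FTRS_POSSIBLE & cur_cpu_spec->cpu_features & feature));
    }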
-
Aneesh Kumar K.V authored
MMU feature bits are defined such that we use the lower half to represent MMU family features. Remove the strict half-and-half split and also move radix to an MMU family feature. Radix introduces a new MMU model and, strictly speaking, it is a new MMU family. This also frees up bits which can be used for individual features later.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
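Illustratively, the family bits then sit together in the low bits of mmu_features, with radix among them (the concrete values below are assumptions, shown only to make the layout concrete):

    /* Sketch: MMU family bits, radix now treated as its own family. */
    #define MMU_FTR_HPTE_TABLE   ASM_CONST(0x00000001)
    #define MMU_FTR_TYPE_8xx     ASM_CONST(0x00000002)
    #define MMU_FTR_TYPE_40x     ASM_CONST(0x00000004)
    #define MMU_FTR_TYPE_RADIX   ASM_CONST(0x00000040)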
-
Michael Ellerman authored
Like we just did for hash, split the device tree scanning parts out and call them from mmu_early_init_devtree().

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Michael Ellerman authored
Currently MMU initialisation (early_init_mmu()) consists of a mixture of scanning the device tree, setting MMU feature bits, and then also doing actual initialisation of MMU data structures. We'd like to decouple the setting of the MMU features from the actual setup. So split out the device tree scanning, and associated code, and call it from mmu_early_init_devtree().

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
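A sketch of the decoupled entry point, assuming the hash/radix scanning helpers split out across this series; the actual MMU setup stays in early_init_mmu():

    void __init mmu_early_init_devtree(void)
    {
        /* Only device tree scanning and feature selection here. */
        if (early_radix_enabled())
            radix__early_init_devtree();
        else
            hash__early_init_devtree();
    }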
-
Michael Ellerman authored
Move the handling of the disable_radix command line argument into the newly created mmu_early_init_devtree(). It's an MMU option so it's preferable to have it in an mm related file, and it also means platforms that don't support radix don't have to carry the code.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
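The option's parser might look like this (a sketch; the handler shape is an assumption), with mmu_early_init_devtree() then clearing MMU_FTR_TYPE_RADIX when the flag is set:

    static bool disable_radix;

    static int __init parse_disable_radix(char *p)
    {
        disable_radix = true;
        return 0;
    }
    early_param("disable_radix", parse_disable_radix);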
-
Michael Ellerman authored
Empty for now, but we'll add to it in the next patch.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
- 01 May, 2016 3 commits
Aneesh Kumar K.V authored
For hash we create vmemmap mappings using bolted hash page table entries. For radix we fill the radix page table. The next patch will add the radix details for creating vmemmap mappings.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
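A sketch of the split (the radix callee is a stub until the next patch; the double-underscore naming follows the convention used in this series):

    int __meminit vmemmap_create_mapping(unsigned long start,
                                         unsigned long page_size,
                                         unsigned long phys)
    {
        /* Hash uses bolted HPTEs; radix fills its page table. */
        if (radix_enabled())
            return radix__vmemmap_create_mapping(start, page_size, phys);
        return hash__vmemmap_create_mapping(start, page_size, phys);
    }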
-
Aneesh Kumar K.V authored
Radix and hash MMU models support different page table sizes. Make the #defines variables so that existing code can work with variable sizes. Slice related code is only used by hash, so use hash constants there. We will replicate some of the boundary conditions with respect to TASK_SIZE using radix values too. Right now we do the boundary condition check using hash constants.

Swapper pgdir size is initialized in asm code. We select the max pgd size to keep it simple. For now we select the hash pgdir. When adding radix we will switch that to the radix pgdir, which is 64K.

The BUILD_BUG_ON check which is removed is already done in hugepage_init() using MAYBE_BUILD_BUG_ON().

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
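The pattern is roughly this (a sketch; __pte_index_size is illustrative of how a former compile-time constant stays source-compatible while being chosen at boot per MMU model):

    /* Former constant, now set during early MMU init. */
    extern unsigned long __pte_index_size;
    #define PTE_INDEX_SIZE  __pte_index_size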
-
Aneesh Kumar K.V authored
This patch reduces the number of #ifdefs in C code and will also help in adding radix changes later. Only code movement in this patch.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
[mpe: Propagate copyrights and update GPL text]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
- 03 Mar, 2016 1 commit
Aneesh Kumar K.V authored
This is needed so that we can support both hash and radix page tables using a single kernel. A radix kernel uses a 4-level table. We now use physical addresses in the upper page table tree levels. Even though they are aligned to their size, for the masked bits we use the bit positions as per PowerISA 3.0.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
- 01 Mar, 2016 2 commits
David Gibson authored
This makes a number of cleanups to handling of mapping failures during memory hotplug on Power:

For errors creating the linear mapping for the hot-added region:
* This is now reported with EFAULT, which is more appropriate than the previous EINVAL (the failure is unlikely to be related to the function's parameters)
* An error in this path now prints a warning message, rather than just silently failing to add the extra memory.
* Previously a failure here could result in the region being partially mapped. We now clean up any partial mapping before failing.

For errors creating the vmemmap for the hot-added region:
* This is now reported with EFAULT instead of causing a BUG() - this could happen for external reasons (e.g. full hash table), so it's better to handle it non-fatally
* An error message is also printed, so the failure won't be silent
* As above, a failure could cause a partially mapped region; we now clean this up.

[mpe: move htab_remove_mapping() out of #ifdef CONFIG_MEMORY_HOTPLUG to enable this]

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Paul Mackerras <paulus@samba.org>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
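The linear-mapping error path then has roughly this shape inside arch_add_memory() (a sketch under the series' assumption that create_section_mapping() returns non-zero on failure):

    if (create_section_mapping(start, start + size)) {
        pr_warn("Unable to create mapping for hot added memory 0x%llx..0x%llx\n",
                start, start + size);
        /* EFAULT, not EINVAL: the failure isn't about the arguments. */
        return -EFAULT;
    }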
-
David Gibson authored
At the moment the hpte_removebolted callback in ppc_md returns void and will BUG_ON() if the hpte it's asked to remove doesn't exist in the first place. This is awkward for the case of cleaning up a mapping which was partially made before failing.

So, we add a return value to hpte_removebolted, and have it return ENOENT in the case that the HPTE to remove didn't exist in the first place. In the (sole) caller, we propagate errors in hpte_removebolted to its caller to handle. However, we handle ENOENT specially, continuing to complete the unmapping over the specified range before returning the error to the caller. This means that htab_remove_mapping() will work sanely on a partially present mapping, removing any HPTEs which are present, while also returning ENOENT to its caller in case it's important there.

There are two callers of htab_remove_mapping():
- In remove_section_mapping() we already WARN_ON() any error return, which is reasonable - in this case the mapping should be fully present
- In vmemmap_remove_mapping() we BUG_ON() any error. We change that to just a WARN_ON() in the case of ENOENT, since failing to remove a mapping that wasn't there in the first place probably shouldn't be fatal.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
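The resulting loop in htab_remove_mapping() might look like this (a sketch of the behaviour described, not the verbatim patch):

    for (vaddr = vstart; vaddr < vend; vaddr += step) {
        rc = ppc_md.hpte_removebolted(vaddr, psize, ssize);
        if (rc == -ENOENT) {
            ret = -ENOENT;  /* remember it, but keep unmapping */
            continue;
        }
        if (rc < 0)
            return rc;      /* other errors are propagated at once */
    }
    return ret;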
-
- 14 Dec, 2015 1 commit
Aneesh Kumar K.V authored
The pte and pmd table sizes are dependent on config items - don't hard-code them. This makes sure we use the right value when masking pmd entries and also while checking pmd_bad().

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
- 26 Mar, 2015 1 commit
Yanjiang Jin authored
kmem_cache_create()->kmem_cache_create_memcg()->kstrdup() allocates new space and copies the name's content, so it is safe to free the name memory after calling kmem_cache_create(). Otherwise kmemleak will report the below warning:

    unreferenced object 0xc0000000f9002160 (size 16):
      comm "swapper/0", pid 0, jiffies 4294892296 (age 1386.640s)
      hex dump (first 16 bytes):
        70 67 74 61 62 6c 65 2d 32 5e 39 00 de ad be ef  pgtable-2^9.....
      backtrace:
        [<c0000000004e03ec>] .kvasprintf+0x5c/0xa0
        [<c0000000004e045c>] .kasprintf+0x2c/0x50
        [<c00000000002e36c>] .pgtable_cache_add+0xac/0x100
        [<c00000000002e3e4>] .pgtable_cache_init+0x24/0x80
        [<c000000000c6c67c>] .start_kernel+0x228/0x4c8
        [<c000000000000594>] .start_here_common+0x24/0x90

Signed-off-by: Yanjiang Jin <yanjiang.jin@windriver.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
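The fix therefore reduces to freeing the buffer right after the cache is created (a sketch around pgtable_cache_add(); the table_size/align/ctor locals are illustrative):

    char *name = kasprintf(GFP_KERNEL, "pgtable-2^%d", shift);
    struct kmem_cache *new;

    new = kmem_cache_create(name, table_size, align, 0, ctor);
    kfree(name);   /* safe: kmem_cache_create() kstrdup()s the name */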
-
- 09 Nov, 2014 1 commit
Anton Blanchard authored
Lots of places included bootmem.h even when not using bootmem.

Signed-off-by: Anton Blanchard <anton@samba.org>
Tested-by: Emil Medve <Emilian.Medve@Freescale.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
- 25 Sep, 2014 1 commit
Anton Blanchard authored
A recent patch added a function prototype for htab_remove_mapping() in C code. Fix it.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
- 05 Aug, 2014 4 commits
Li Zhong authored
vmemmap_populated() checks whether the range [start, start + page_size) has valid pfn numbers, to know whether a vmemmap mapping has been created that includes this range. Some range before end might not be checked by this loop:

    sec11start......start11..sec11end/sec12start..end....start12..sec12end

As above, for start11 (section 11), it checks [sec11start, sec11end), and the loop ends as the next start (start12) is bigger than end. However, [sec11end/sec12start, end) is not checked here. So before the loop, adjust the start to be the start of the section, so we don't miss ranges like the above.

After we adjust start to be the start of the section, it is also aligned with vmemmap in units of sizeof(struct page), so we can use page_to_pfn directly in the loop.

Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Acked-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
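The adjusted check could look like this (a sketch; vmemmap_section_start() is assumed to round down to the owning section):

    static int __meminit vmemmap_populated(unsigned long start, int page_size)
    {
        unsigned long end = start + page_size;

        /* Round down so the loop also covers [section start, start). */
        start = (unsigned long)pfn_to_page(vmemmap_section_start(start));

        for (; start < end; start += (PAGES_PER_SECTION * sizeof(struct page)))
            if (pfn_valid(page_to_pfn((struct page *)start)))
                return 1;

        return 0;
    }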
-
Li Zhong authored
vmemmap_free() does the opposite of vmemmap_populate(). This patch also puts vmemmap_free() and vmemmap_list_free() under CONFIG_MEMORY_HOTPLUG.

Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Acked-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Li Zhong authored
This is to be called in vmemmap_free(); leave the implementation on BOOK3E empty as before.

Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Acked-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
Li Zhong authored
This patch implements vmemmap_list_free() for vmemmap_free(). The freed entries will be removed from vmemmap_list and form a freed list, with next as the header. The next position in the last allocated page is kept at the list tail.

When allocating, if there are freed entries left, get one from the freed list; if no freed entries are left, get one as before from the last allocated page.

With this change, realmode_pfn_to_page() also needs to be changed to walk all the entries in the vmemmap_list, as the virt_addr of the entries might not be stored in order anymore.

This helps to reuse the memory when repeatedly doing memory hot-plug/remove operations, but doesn't reclaim the pages already allocated, so the memory usage will only increase, bounded by the value for the largest memory configuration.

Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Acked-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
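The allocation side then prefers the freed list (a sketch of the described behaviour; the num_freed/num_left counters are illustrative):

    static struct vmemmap_backing * __meminit vmemmap_list_alloc(int node)
    {
        struct vmemmap_backing *vmem_back;

        /* Reuse a freed entry first; the freed list's tail points back
         * into the last allocated page, so the handoff is seamless. */
        if (num_freed) {
            num_freed--;
            vmem_back = next;
            next = next->list;
            return vmem_back;
        }

        /* Otherwise carve the next chunk out of the last page. */
        if (!num_left) {
            next = vmemmap_alloc_block(PAGE_SIZE, node);
            if (unlikely(!next))
                return NULL;
            num_left = PAGE_SIZE / sizeof(struct vmemmap_backing);
        }
        num_left--;
        return next++;
    }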
-
- 11 Oct, 2013 1 commit
Alexey Kardashevskiy authored
The current VFIO-on-POWER implementation supports only user mode driven mapping, i.e. QEMU is sending requests to map/unmap pages. However this approach is really slow, so we want to move that to KVM. Since H_PUT_TCE can be extremely performance sensitive (especially with network adapters where each packet needs to be mapped/unmapped) we chose to implement that as a "fast" hypercall directly in "real mode" (processor still in the guest context but MMU off).

To be able to do that, we need to provide some facilities to access the struct page count within that real mode environment, as things like the sparsemem vmemmap mappings aren't accessible.

This adds an API function realmode_pfn_to_page() to get the page struct when the MMU is off.

This adds to MM a new function put_page_unless_one() which drops a page if its counter is bigger than one. It is going to be used when the MMU is off (for example, real mode on PPC64) and we want to make sure that a page release will not happen in real mode as it may crash the kernel in a horrible way.

CONFIG_SPARSEMEM_VMEMMAP and CONFIG_FLATMEM are supported.

Cc: linux-mm@kvack.org
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
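The new helper is tiny, mirroring get_page_unless_zero() (a sketch; the _count field name matches struct page of that era):

    static inline int put_page_unless_one(struct page *page)
    {
        /* Drop a reference only if it is not the last one. */
        return atomic_add_unless(&page->_count, -1, 1);
    }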
-
- 03 Oct, 2013 1 commit
Nathan Fontenot authored
Previous commit 46723bfa ... introduced a new config option HAVE_BOOTMEM_INFO_NODE that ended up breaking memory hot-remove for ppc when sparse vmemmap is not defined.

This patch defines HAVE_BOOTMEM_INFO_NODE for ppc and adds the call to register_page_bootmem_info_node. Without this we get a BUG_ON for memory hot remove in put_page_bootmem().

This also adds a stub for register_page_bootmem_memmap to allow ppc to build with sparse vmemmap defined. Leaving this as a stub is fine since the same vmemmap addresses are also handled in vmemmap_populate and as such are properly mapped.

Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
CC: <stable@vger.kernel.org> [v3.9+]
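The stub itself can be empty (a sketch; the signature follows the generic declaration of register_page_bootmem_memmap):

    void register_page_bootmem_memmap(unsigned long section_nr,
                                      struct page *start_page,
                                      unsigned long size)
    {
        /* Empty: these vmemmap addresses are mapped in vmemmap_populate(). */
    }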
-
- 21 Jun, 2013 1 commit
Aneesh Kumar K.V authored
THP code does PTE page allocation along with large page requests and deposits them for later use. This is to ensure that we won't have any failures when we split hugepages to regular pages. On powerpc we want to use the deposited PTE page for storing hash pte slot and secondary bit information for the HPTEs. We use the second half of the pmd table to save the deposited PTE page.

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
- 14 May, 2013 1 commit
Aneesh Kumar K.V authored
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
- 30 Apr, 2013 1 commit
Aneesh Kumar K.V authored
Change the hugepage directory format so that we can have leaf ptes directly at the page directory, avoiding the allocation of a hugepage directory. With the new table format we have 3 cases for pgds and pmds:

(1) invalid (all zeroes)
(2) pointer to next table, as normal; bottom 6 bits == 0
(3) hugepd pointer; bottom two bits == 00, next 4 bits indicate size of table

Instead of storing the shift value in the hugepd pointer we use the mmu_psize_def index so that we can fit all the supported hugepage sizes in 4 bits.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
- 29 Apr, 2013 1 commit
Johannes Weiner authored
The sparse code, when asking the architecture to populate the vmemmap, specifies the section range as a starting page and a number of pages. This is an awkward interface, because none of the arch-specific code actually thinks of the range in terms of 'struct page' units and always translates it to bytes first.

In addition, later patches mix huge page and regular page backing for the vmemmap. For this, they need to call vmemmap_populate_basepages() on sub-section ranges with PAGE_SIZE and PMD_SIZE in mind. But these are not necessarily multiples of the 'struct page' size and so this unit is too coarse.

Just translate the section range into bytes once in the generic sparse code, then pass byte ranges down the stack.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Ben Hutchings <ben@decadent.org.uk>
Cc: Bernhard Schmidt <Bernhard.Schmidt@lrz.de>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Acked-by: David S. Miller <davem@davemloft.net>
Tested-by: David S. Miller <davem@davemloft.net>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 24 Feb, 2013 2 commits
Tang Chen authored
Introduce a new API vmemmap_free() to free and remove vmemmap pagetables. Since pagetable implementations are different, each architecture has to provide its own version of vmemmap_free(), just like vmemmap_populate().

Note: vmemmap_free() is not implemented for ia64, ppc, s390, and sparc.

[mhocko@suse.cz: fix implicit declaration of remove_pagetable]

Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Signed-off-by: Jianguo Wu <wujianguo@huawei.com>
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Jiang Liu <jiang.liu@huawei.com>
Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Wu Jianguo <wujianguo@huawei.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Yasuaki Ishimatsu authored
For removing a memmap region of sparse-vmemmap which is allocated from bootmem, the memmap region needs to be registered by get_page_bootmem(). So the patch searches pages of virtual mapping and registers the pages by get_page_bootmem().

NOTE: register_page_bootmem_memmap() is not implemented for ia64, ppc, s390, and sparc. So introduce CONFIG_HAVE_BOOTMEM_INFO_NODE and revert register_page_bootmem_info_node() when the platform doesn't support it. It's implemented by adding a new Kconfig option named CONFIG_HAVE_BOOTMEM_INFO_NODE, which will be automatically selected by archs that fully support the memory-hotplug feature (currently only x86_64).

Since we have 2 config options called MEMORY_HOTPLUG and MEMORY_HOTREMOVE used for memory hot-add and hot-remove separately, and the code in register_page_bootmem_info_node() is only used for collecting information for hot-remove, it resides under MEMORY_HOTREMOVE. Besides, page_isolation.c, selected by MEMORY_ISOLATION under MEMORY_HOTPLUG, is a similar case, so move it too.

[mhocko@suse.cz: put register_page_bootmem_memmap inside CONFIG_MEMORY_HOTPLUG_SPARSE]
[linfeng@cn.fujitsu.com: introduce CONFIG_HAVE_BOOTMEM_INFO_NODE and revert register_page_bootmem_info_node()]
[mhocko@suse.cz: remove the arch specific functions without any implementation]
[linfeng@cn.fujitsu.com: mm/Kconfig: move auto selects from MEMORY_HOTPLUG to MEMORY_HOTREMOVE as needed]
[rientjes@google.com: fix defined but not used warning]

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
Reviewed-by: Wu Jianguo <wujianguo@huawei.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Jiang Liu <jiang.liu@huawei.com>
Cc: Jianguo Wu <wujianguo@huawei.com>
Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Lin Feng <linfeng@cn.fujitsu.com>
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 05 Sep, 2012 1 commit
Michael Ellerman authored
It's empty now, apart from other includes. Fix up a few files that were getting things via this header.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
- 28 Mar, 2012 1 commit
David Howells authored
Disintegrate asm/system.h for PowerPC.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
cc: linuxppc-dev@lists.ozlabs.org
-
- 30 Jun, 2011 1 commit
Dave Carroll authored
The free_initmem function is basically duplicated in mm/init_32.c and init_64.c, and is moved to the common 32/64-bit mm/mem.c. All other sections except init were removed in v2.6.15 by 6c45ab99 ("powerpc: Remove section free() and linker script bits"), and therefore the bulk of the executed code is identical. This patch also removes the updating of ppc_md.progress to NULL in the powermac late_initcall.

Suggested-by: Milton Miller <miltonm@bga.com>
Suggested-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Dave Carroll <dcarroll@astekcorp.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
- 09 Jun, 2011 1 commit
Benjamin Herrenschmidt authored
When using 64K pages with a separate cpio rootfs, U-Boot will align the rootfs on a 4K page boundary. When the memory is reserved and a subsequent early memblock_alloc is called, it will allocate memory between the 64K page alignment and the reserved memory. When the reserved memory is subsequently freed, it is done so by pages, causing the memory handed out by the early memblock_alloc requests to be re-used, which in my case caused the device-tree to be clobbered.

This patch forces the reserved memory for initrd to be kernel page aligned, and will move the device tree if it overlaps with the range extension of initrd. This patch will also consolidate the identical function free_initrd_mem() from mm/init_32.c and init_64.c into mm/mem.c, and adds the same range extension when freeing initrd. free_initrd_mem() is also moved to the __init section.

Many thanks to Milton Miller for his input on this patch.

[BenH: Fixed build without CONFIG_BLK_DEV_INITRD]

Signed-off-by: Dave Carroll <dcarroll@astekcorp.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
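The reservation change amounts to rounding the initrd range out to kernel pages (a sketch with the usual powerpc _ALIGN helpers; the variable names are illustrative):

    /* Reserve whole kernel pages so freeing the initrd page by page
     * can never release memory that early memblock_alloc() handed out. */
    unsigned long base = _ALIGN_DOWN(initrd_start, PAGE_SIZE);
    unsigned long end  = _ALIGN_UP(initrd_end, PAGE_SIZE);

    memblock_reserve(base, end - base);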
-
- 24 Aug, 2010 1 commit
Sonny Rao authored
Some modules (like eHCA) want to map all of kernel memory. For this to work with a relocated kernel, we need to export kernstart_addr so modules can use PHYSICAL_START, and memstart_addr so they can use MEMORY_START. Note that the 32-bit code already exports these symbols.

Signed-off-by: Sonny Rao <sonnyrao@us.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
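The change is essentially just the export lines (a sketch; the variables already exist in the 64-bit code):

    EXPORT_SYMBOL(kernstart_addr);   /* backs PHYSICAL_START */
    EXPORT_SYMBOL(memstart_addr);    /* backs MEMORY_START */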
-
- 05 Aug, 2010 1 commit
Benjamin Herrenschmidt authored
The RMA (RMO is a misnomer) is a concept specific to ppc64 (in fact server ppc64, though I hijack it on embedded ppc64 for similar purposes) and represents the area of memory that can be accessed in real mode (aka with the MMU off), or on embedded, from the exception vectors (which are bolted in the TLB), which pretty much boils down to the same thing. We take that out of the generic MEMBLOCK data structure and move it into arch/powerpc where it belongs, renaming it to "RMA" while at it.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
- 14 Jul, 2010 1 commit
Yinghai Lu authored
Done via the following script:

    FILES=$(find * -type f | grep -vE 'oprofile|[^K]config')
    sed -i \
        -e 's/lmb/memblock/g' \
        -e 's/LMB/MEMBLOCK/g' \
        $FILES

    for N in $(find . -name lmb.[ch]); do
        M=$(echo $N | sed 's/lmb/memblock/g')
        mv $N $M
    done

then removing some wrong changes, like lmbench and dlmb etc.

This also moves memblock.c from lib/ to mm/.

Suggested-by: Ingo Molnar <mingo@elte.hu>
Acked-by: "H. Peter Anvin" <hpa@zytor.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
- 06 May, 2010 1 commit
Mark Nelson authored
We need to keep track of the backing pages that get allocated by vmemmap_populate() so that when we use kdump, the dump-capture kernel knows where these pages are.

We use a simple linked list of structures that contain the physical address of the backing page and the corresponding virtual address to track the backing pages. To save space, we just use a pointer to the next struct vmemmap_backing. We can also do this because we never remove nodes. We call the pointer "list" to be compatible with changes made to the crash utility.

vmemmap_populate() is called either at boot-time or on a memory hotplug operation. We don't have to worry about the boot-time calls because they will be inherently single-threaded, and for a memory hotplug operation vmemmap_populate() is called through:

    sparse_add_one_section()
            |
            V
    kmalloc_section_memmap()
            |
            V
    sparse_mem_map_populate()
            |
            V
    vmemmap_populate()

and in sparse_add_one_section() we're protected by pgdat_resize_lock(). So we don't need a spinlock to protect the vmemmap_list.

We allocate space for the vmemmap_backing structs by allocating whole pages in vmemmap_list_alloc() and then handing out chunks of this to vmemmap_list_populate(). This means that we waste at most just under one page, but this keeps the code simple.

Signed-off-by: Mark Nelson <markn@au1.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
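The tracking structure described above is small (a sketch matching the description; one entry per backing page):

    struct vmemmap_backing {
        struct vmemmap_backing *list;  /* next entry; "list" for crash compat */
        unsigned long phys;            /* physical address of backing page */
        unsigned long virt_addr;       /* vmemmap virtual address covered */
    };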
-
- 30 Mar, 2010 1 commit
Tejun Heo authored
include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h.

percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h, making everything defined by the two files universally available and complicating inclusion dependencies.

The percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities to include those headers directly instead of assuming availability. As this conversion needs to touch a large number of source files, the following script is used as the basis of conversion:

    http://userweb.kernel.org/~tj/misc/slabh-sweep.py

The script does the following:

* Scan files for gfp and slab usages and update includes such that only the necessary includes are there, i.e. if only gfp is used, gfp.h, if slab is used, slab.h.

* When the script inserts a new include, it looks at the include blocks and tries to put the new include such that its order conforms to its surroundings. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree, or at the end if there doesn't seem to be any matching order.

* If the script can't find a place to put a new include (mostly because the file doesn't have a fitting include block), it prints out an error message indicating which .h file needs to be added to the file.

The conversion was done in the following steps:

1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files.

2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition, while adding it to the implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files.

3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind.

4. Several build tests were done and a couple of problems were fixed, e.g. lib/decompress_*.c used malloc/free() wrappers around slab APIs, requiring slab.h to be added manually.

5. The script was run on all .h files, but without automatically editing them, as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually wildly available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary.

6. percpu.h was updated not to include slab.h.

7. Build tests were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64, which failed due to missing writeq).

   * x86 and x86_64 UP and SMP allmodconfig and a custom test config
   * powerpc and powerpc64 SMP allmodconfig
   * sparc and sparc64 SMP allmodconfig
   * ia64 SMP allmodconfig
   * s390 SMP allmodconfig
   * alpha SMP allmodconfig
   * um on x86_64 SMP allmodconfig

8. percpu.h modifications were reverted so that they could be applied as a separate patch and serve as a bisection point.

Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers, which should be easily discoverable on most builds of the specific arch.

Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
-