- 24 Sep, 2019 1 commit
-
-
Aneesh Kumar K.V authored
With a PFN_MODE_PMEM namespace, the memmap area is allocated from the device area. Some architectures map the memmap area with a large page size. On architectures like ppc64, a 16MB page for the memmap mapping can map 262144 pfns, which corresponds to a namespace size of 16G. When populating the memmap region with 16MB pages from the device area, make sure the allocated space is not used to map resources outside this namespace; such usage of the device area will prevent the namespace from being destroyed. Add the resource end pfn to the altmap and use it to check whether the memmap area allocation would map pfns outside the namespace. On ppc64, in that case we fall back to allocation from system memory. This fixes the kernel crash reported below:
[ 132.034989] WARNING: CPU: 13 PID: 13719 at mm/memremap.c:133 devm_memremap_pages_release+0x2d8/0x2e0
[ 133.464754] BUG: Unable to handle kernel data access at 0xc00c00010b204000
[ 133.464760] Faulting instruction address: 0xc00000000007580c
[ 133.464766] Oops: Kernel access of bad area, sig: 11 [#1]
[ 133.464771] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
.....
[ 133.464901] NIP [c00000000007580c] vmemmap_free+0x2ac/0x3d0
[ 133.464906] LR [c0000000000757f8] vmemmap_free+0x298/0x3d0
[ 133.464910] Call Trace:
[ 133.464914] [c000007cbfd0f7b0] [c0000000000757f8] vmemmap_free+0x298/0x3d0 (unreliable)
[ 133.464921] [c000007cbfd0f8d0] [c000000000370a44] section_deactivate+0x1a4/0x240
[ 133.464928] [c000007cbfd0f980] [c000000000386270] __remove_pages+0x3a0/0x590
[ 133.464935] [c000007cbfd0fa50] [c000000000074158] arch_remove_memory+0x88/0x160
[ 133.464942] [c000007cbfd0fae0] [c0000000003be8c0] devm_memremap_pages_release+0x150/0x2e0
[ 133.464949] [c000007cbfd0fb70] [c000000000738ea0] devm_action_release+0x30/0x50
[ 133.464955] [c000007cbfd0fb90] [c00000000073a5a4] release_nodes+0x344/0x400
[ 133.464961] [c000007cbfd0fc40] [c00000000073378c] device_release_driver_internal+0x15c/0x250
[ 133.464968] [c000007cbfd0fc80] [c00000000072fd14] unbind_store+0x104/0x110
[ 133.464973] [c000007cbfd0fcd0] [c00000000072ee24] drv_attr_store+0x44/0x70
[ 133.464981] [c000007cbfd0fcf0] [c0000000004a32bc] sysfs_kf_write+0x6c/0xa0
[ 133.464987] [c000007cbfd0fd10] [c0000000004a1dfc] kernfs_fop_write+0x17c/0x250
[ 133.464993] [c000007cbfd0fd60] [c0000000003c348c] __vfs_write+0x3c/0x70
[ 133.464999] [c000007cbfd0fd80] [c0000000003c75d0] vfs_write+0xd0/0x250
djbw: Aneesh notes that this crash can likely be triggered in any kernel that supports 'papr_scm', so flagging that commit for -stable consideration.
Fixes: b5beae5e ("powerpc/pseries: Add driver for PAPR SCM regions")
Cc: <stable@vger.kernel.org>
Reported-by: Sachin Sant <sachinp@linux.vnet.ibm.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Reviewed-by: Pankaj Gupta <pagupta@redhat.com>
Tested-by: Santosh Sivaraj <santosh@fossix.org>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Link: https://lore.kernel.org/r/20190910062826.10041-1-aneesh.kumar@linux.ibm.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
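The boundary check described in this commit amounts to: before carving a huge-page-sized chunk of memmap out of the device area, verify the chunk stays within the namespace's pfn range, otherwise take the system-memory path. A minimal, self-contained sketch of that idea (simplified fields modelled loosely on struct vmem_altmap; an illustration, not the kernel code):

    #include <stdbool.h>
    #include <stdint.h>

    /* Simplified altmap descriptor; end_pfn is the field this patch adds. */
    struct altmap {
        uint64_t base_pfn;  /* first pfn of the device-reserved area */
        uint64_t end_pfn;   /* last pfn belonging to this namespace */
        uint64_t alloc;     /* pfns already handed out for memmap */
    };

    /*
     * Return true if mapping nr_pfns worth of memmap (e.g. one 16MB page on
     * ppc64) from the altmap would spill past the namespace boundary, in
     * which case the caller should fall back to system memory.
     */
    static bool altmap_crosses_boundary(const struct altmap *altmap,
                                        uint64_t nr_pfns)
    {
        uint64_t start_pfn = altmap->base_pfn + altmap->alloc;

        return start_pfn + nr_pfns > altmap->end_pfn;
    }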
-
- 04 Jul, 2019 1 commit
-
-
Aneesh Kumar K.V authored
Allocation from the altmap area can fail depending on the vmemmap page size used. Add a kernel info message to indicate the failure. That allows the user to identify whether they are really using the persistent memory reserved space for per-page metadata. The message looks like:
[ 136.587212] altmap block allocation failed, falling back to system memory
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Reviewed-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
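The fallback the message reports is a simple try-the-altmap-first pattern. A hedged, user-space sketch of that flow (the allocator names are stand-ins, not the kernel helpers):

    #include <stdio.h>
    #include <stdlib.h>

    /* Stand-in for an allocation from the device-reserved altmap area;
     * returns NULL here to simulate the failure the message reports. */
    static void *altmap_alloc(size_t size) { (void)size; return NULL; }

    /* Allocate backing for a vmemmap block, preferring the altmap. */
    static void *vmemmap_backing_alloc(size_t size)
    {
        void *p = altmap_alloc(size);

        if (!p) {
            /* The info line quoted in the commit message above. */
            printf("altmap block allocation failed, falling back to system memory\n");
            p = malloc(size);  /* stand-in for a system-memory allocation */
        }
        return p;
    }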
-
- 30 May, 2019 1 commit
-
-
Thomas Gleixner authored
Based on 1 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license as published by the free software foundation either version 2 of the license or at your option any later version extracted by the scancode license scanner the SPDX license identifier GPL-2.0-or-later has been chosen to replace the boilerplate/reference in 3029 file(s).
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Allison Randal <allison@lohutok.net>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190527070032.746973796@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 02 May, 2019 1 commit
-
-
Christophe Leroy authored
This patch makes inclusion of mmu_decl.h independent of the location of the file including it.
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
- 02 Mar, 2019 1 commit
-
-
Qian Cai authored
Commit 24b6d416 ("mm: pass the vmem_altmap to vmemmap_free") removed the line altmap = to_vmem_altmap((unsigned long) section_base); from vmemmap_free() but left the variable unused:
arch/powerpc/mm/init_64.c: In function 'vmemmap_free':
arch/powerpc/mm/init_64.c:277:16: error: variable 'section_base' set but not used [-Werror=unused-but-set-variable]
Signed-off-by: Qian Cai <cai@lca.pw>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
- 09 Dec, 2018 1 commit
-
-
Oliver O'Halloran authored
The "altmap" is used to provide a pool of memory that is reserved for the vmemmap backing of hot-plugged memory. This is useful when adding large amount of ZONE_DEVICE memory to a system with a limited amount of normal memory. On ppc64 we use huge pages to map the vmemmap which requires the backing storage to be contigious and aligned to the hugepage size. The altmap implementation allows for the altmap provider to reserve a few PFNs at the start of the range for it's own uses and when this occurs the first chunk of the altmap is not usable for hugepage mappings. On hash there is no sane way to fall back to a normal sized page mapping so we fail the allocation. This results in memory hotplug failing with ENOMEM when the new range doesn't fall into an existing vmemmap block. This patch handles this case by falling back to using system memory rather than failing if we cannot allocate from the altmap. This fallback should only ever be used for the first vmemmap block so it should not cause excess memory consumption. Fixes: 7b73d978 ("mm: pass the vmem_altmap to vmemmap_populate") Signed-off-by:
Oliver O'Halloran <oohall@gmail.com> Signed-off-by:
Michael Ellerman <mpe@ellerman.id.au>
-
- 11 Sep, 2018 1 commit
-
-
Alexey Kardashevskiy authored
At the moment the real mode handler of H_PUT_TCE calls iommu_tce_xchg_rm(), which in turn reads the old TCE and, if it was a valid entry, marks the physical page dirty if it was mapped for writing. Since it is in real mode, realmode_pfn_to_page() is used instead of pfn_to_page() to get the page struct. However SetPageDirty() itself reads the compound page head and returns a virtual address for the head page struct, and setting the dirty bit for that kills the system. This adds additional dirty bit tracking into the MM/IOMMU API for use in real mode. Note that this does not change how VFIO and KVM (in virtual mode) set this bit. The KVM (real mode) changes include:
- use the lowest bit of the cached host phys address to carry the dirty bit;
- mark pages dirty when they are unpinned, which happens when the preregistered memory is released, which always happens in virtual mode;
- add mm_iommu_ua_mark_dirty_rm() helper to set the delayed dirty bit;
- change iommu_tce_xchg_rm() to take the kvm struct for the mm to use in the new mm_iommu_ua_mark_dirty_rm() helper;
- move iommu_tce_xchg_rm() to book3s_64_vio_hv.c (which is the only caller anyway) to reduce the real mode KVM and IOMMU knowledge across different subsystems.
This removes realmode_pfn_to_page() as it is not used anymore. While we are at it, remove some EXPORT_SYMBOL_GPL() as that code is for real mode only and modules cannot call it anyway.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
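The "lowest bit of the cached host phys address" trick works because the cached addresses are page aligned, so bit 0 is always free to carry a flag. A self-contained sketch of the idea (hypothetical helper names, not the KVM code itself):

    #include <stdbool.h>
    #include <stdint.h>

    #define DIRTY_BIT 0x1ULL  /* bit 0 is unused in a page-aligned address */

    /* Record the dirty state in the cached address without extra storage. */
    static inline uint64_t hpa_set_dirty(uint64_t hpa)
    {
        return hpa | DIRTY_BIT;
    }

    static inline bool hpa_is_dirty(uint64_t hpa)
    {
        return hpa & DIRTY_BIT;
    }

    /* Strip the flag before using the value as a real address again. */
    static inline uint64_t hpa_address(uint64_t hpa)
    {
        return hpa & ~DIRTY_BIT;
    }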
-
- 04 Apr, 2018 1 commit
-
-
Aneesh Kumar K.V authored
The kernel parameter disable_radix takes different options: disable_radix=yes|no|1|0, or just disable_radix. When using the latter format we get the error below:
`Malformed early option 'disable_radix'`
Fixes: 1fd6c022 ("powerpc/mm: Add a CONFIG option to choose if radix is used by default")
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
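Accepting both the bare flag and the explicit value means the parser must treat a missing argument as "enabled" rather than as malformed. A hedged, user-space sketch of that parsing (the real fix goes through the kernel's early_param()/strtobool machinery):

    #include <stdbool.h>
    #include <string.h>

    /*
     * Parse the value of a boolean early parameter. A NULL or empty value
     * (i.e. plain "disable_radix" with no '=') counts as true; otherwise
     * accept yes/no/1/0. Returns 0 on success, -1 on a malformed value.
     */
    static int parse_bool_param(const char *p, bool *val)
    {
        if (!p || !*p) {  /* bare "disable_radix" */
            *val = true;
            return 0;
        }
        if (!strcmp(p, "yes") || !strcmp(p, "1")) {
            *val = true;
            return 0;
        }
        if (!strcmp(p, "no") || !strcmp(p, "0")) {
            *val = false;
            return 0;
        }
        return -1;  /* the "Malformed early option" case */
    }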
-
- 30 Mar, 2018 1 commit
-
-
Aneesh Kumar K.V authored
This patch increases the max virtual (effective) address value to 4PB. With a 4K page size config we continue to limit ourselves to 64TB.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
[mpe: Keep the H_PGTABLE_RANGE test, update it to work]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
- 08 Jan, 2018 3 commits
-
-
Christoph Hellwig authored
No functional changes, just untangling the call chain and documenting why the altmap is passed around the hotplug code.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
-
Christoph Hellwig authored
We can just pass this on instead of having to do a radix tree lookup without proper locking a few levels into the call chain.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
-
Christoph Hellwig authored
We can just pass this on instead of having to do a radix tree lookup without proper locking a few levels into the call chain.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
-
- 04 Dec, 2017 1 commit
-
-
Joe Perches authored
At some point, pr_warning will be removed so all logging messages use a consistent <prefix>_warn style. Update arch/powerpc/.
Miscellanea:
o Coalesce formats
o Realign arguments
o Use %s, __func__ instead of embedded function names
o Remove unnecessary line continuations
Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Geoff Levand <geoff@infradead.org>
[mpe: Rebase due to some %pOF changes.]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
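The conversion itself is mechanical. A representative, hypothetical call site (pr_warning and pr_warn are the real kernel printk wrappers; the message text here is made up):

    /* Before: deprecated alias, embedded function name. */
    pr_warning("setup_memory: unable to map memory region\n");

    /* After: the consistent <prefix>_warn form, using %s/__func__ as the
     * cleanup list above describes. */
    pr_warn("%s: unable to map memory region\n", __func__);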
-
- 06 Nov, 2017 2 commits
-
-
Michael Ellerman authored
Currently if the hardware supports the radix MMU we will use it, *unless* "disable_radix" is passed on the kernel command line. However some users would like the reverse semantics, i.e. the kernel uses the hash MMU by default unless radix is explicitly requested on the command line. So add a CONFIG option to choose whether we use radix by default or not, and expand the disable_radix command line option to allow "disable_radix=no", which *enables* radix.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Michael Ellerman authored
CONFIG_PPC_STD_MMU_64 indicates support for the "standard" powerpc MMU on 64-bit CPUs. The "standard" MMU refers to the hash page table MMU found in "server" processors, from IBM mainly. Currently CONFIG_PPC_STD_MMU_64 is == CONFIG_PPC_BOOK3S_64. While it's annoying to have two symbols that always have the same value, it's not quite annoying enough to bother removing one. However with the arrival of Power9, we now have the situation where CONFIG_PPC_STD_MMU_64 is enabled, but the kernel is running using the Radix MMU - *not* the "standard" MMU. So it is now actively confusing to use it, because it implies that code is disabled or inactive when the Radix MMU is in use, however that is not necessarily true. So s/CONFIG_PPC_STD_MMU_64/CONFIG_PPC_BOOK3S_64/, and do some minor formatting updates of some of the affected lines. This will be a pain for backports, but c'est la vie.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
- 10 Aug, 2017 1 commit
-
-
Michael Ellerman authored
early_check_vec5() is called from and calls __init routines, so should also be __init.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
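For illustration, a hypothetical __init-annotated function (not early_check_vec5() itself); the point of the section annotation is that the text is discarded once boot completes, so only other boot-time code may call it:

    #include <linux/init.h>

    /* Only called from other __init code during boot, so its text can be
     * freed along with the rest of .init.text once userspace starts. */
    static int __init check_boot_options(void)
    {
        return 0;
    }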
-
- 24 Jul, 2017 1 commit
-
-
Aneesh Kumar K.V authored
We can use pfn_to_page() in realmode for other configs. Hence remove the CONFIG_FLATMEM ifdef.
Fixes: 8e0861fa ("powerpc: Prepare to support kernel handling of IOMMU map/unmap")
Cc: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
[mpe: Also fix up the #endif comment]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
- 02 Jul, 2017 2 commits
-
-
Oliver O'Halloran authored
Adds support to powerpc for the altmap feature of ZONE_DEVICE memory. An altmap is a driver-provided region that is used to provide the backing storage for the struct pages of ZONE_DEVICE memory. In situations where large amounts of ZONE_DEVICE memory are being added to the system, the altmap reduces pressure on main system memory by allowing the mm/ metadata to be stored on the device itself rather than in main memory.
Reviewed-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Oliver O'Halloran authored
Removes an indentation level and shuffles some code around to make the following patch cleaner. No functional changes.
Reviewed-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
- 28 Jun, 2017 1 commit
-
-
Anshuman Khandual authored
Adds some explanation of how the vmemmap-based struct page layout's physical mapping is allocated and tracked through a linked list. It also notes a possible race condition.
Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
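The tracking in question is a singly linked list that records, for each populated vmemmap block, the physical block backing it, so the backing can be located again at hot-unplug time. A simplified, self-contained sketch of that bookkeeping (field names modelled loosely on the list in arch/powerpc/mm/init_64.c; an illustration under those assumptions, not the kernel code):

    #include <stdint.h>
    #include <stdlib.h>

    /* One node per populated vmemmap block. */
    struct vmemmap_backing {
        struct vmemmap_backing *list;  /* next populated block */
        uint64_t phys;                 /* physical address backing the block */
        uint64_t virt_addr;            /* start of the vmemmap block it backs */
    };

    static struct vmemmap_backing *vmemmap_list;

    /* Record a newly populated block at the head of the list. */
    static void vmemmap_list_populate(uint64_t phys, uint64_t virt_addr)
    {
        struct vmemmap_backing *vmem_back = malloc(sizeof(*vmem_back));

        if (!vmem_back)
            return;  /* the real code also has to cope with this failure */

        vmem_back->phys = phys;
        vmem_back->virt_addr = virt_addr;
        vmem_back->list = vmemmap_list;
        vmemmap_list = vmem_back;
    }

    /* Look up the physical backing for a block, e.g. before freeing it. */
    static uint64_t vmemmap_list_lookup(uint64_t virt_addr)
    {
        struct vmemmap_backing *vmem_back;

        for (vmem_back = vmemmap_list; vmem_back; vmem_back = vmem_back->list)
            if (vmem_back->virt_addr == virt_addr)
                return vmem_back->phys;

        return 0;
    }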
-
- 31 Mar, 2017 1 commit
-
-
Aneesh Kumar K.V authored
Remove the checks that TASK_SIZE_USER64 is smaller than H_PGTABLE_RANGE and USER_VSID_RANGE. In a following patch we will deliberately add support for a TASK_SIZE smaller than both ranges, so this will no longer be an error condition.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
[mpe: Keep the check in pgtable_64.c that we don't exceed USER_VSID_RANGE]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
- 21 Mar, 2017 1 commit
-
-
Paul Mackerras authored
This reverts commit 3f91a89d. Now that we do have the machinery for using the radix MMU under a hypervisor, the extra check and comment introduced in 3f91a89d are no longer correct. The result is that when booted under a hypervisor that only allows use of radix, we clear the MMU_FTR_TYPE_RADIX and then set it again, and print a warning about ignoring the disable_radix command line option, even though the command line does not include "disable_radix".
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
- 06 Mar, 2017 1 commit
-
-
Suraj Jitindar Singh authored
On POWER9 the ibm,client-architecture-support (CAS) negotiation process has been updated to change how the host to guest negotiation is done for the new hash/radix mmu as well as the nest mmu, process tables and guest translation shootdown (GTSE). This is documented in the unreleased PAPR ACR "CAS option vector additions for P9". The host tells the guest which options it supports in ibm,arch-vec-5-platform-support. The guest then chooses a subset of these to request in the CAS call and these are agreed to in the ibm,architecture-vec-5 property of the chosen node. Thus we read ibm,arch-vec-5-platform-support and make our selection before calling CAS. We then parse the ibm,architecture-vec-5 property of the chosen node to check whether we should run as hash or radix.
ibm,arch-vec-5-platform-support format:
  index value pairs: <index, val> ... <index, val>
  index: Option vector 5 byte number
  val:   Some representation of supported values
Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Acked-by: Paul Mackerras <paulus@ozlabs.org>
[mpe: Don't print about unknown options, be consistent with OV5_FEAT]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
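The property layout above is just a flat array of (index, value) byte pairs, so reading it is a walk in steps of two. A hedged, self-contained sketch of that walk (how each value is interpreted against the OV5_FEAT definitions is left out):

    #include <stdint.h>
    #include <stdio.h>

    /*
     * Walk an "index value" pair list like ibm,arch-vec-5-platform-support:
     * each pair names an option-vector-5 byte and some representation of
     * the values the platform supports for it.
     */
    static void parse_platform_support(const uint8_t *prop, int len)
    {
        int i;

        for (i = 0; i + 1 < len; i += 2) {
            uint8_t index = prop[i];      /* option vector 5 byte number */
            uint8_t val = prop[i + 1];    /* supported values for that byte */

            printf("OV5 byte %u: platform supports 0x%02x\n", index, val);
        }
    }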
-
- 16 Feb, 2017 1 commit
-
-
Paul Mackerras authored
Currently, if the kernel is running on a POWER9 processor under a hypervisor, it may try to use the radix MMU even though it doesn't have the necessary code to do so (it doesn't negotiate use of radix, and it doesn't do the H_REGISTER_PROC_TBL hcall). If the hypervisor supports both radix and HPT, then it will set up the guest to use HPT (since the guest doesn't request radix in the CAS call), but if the radix feature bit is set in the ibm,pa-features property (which is valid, since ibm,pa-features is defined to represent the capabilities of the processor) the guest will try to use radix, resulting in a crash when it turns the MMU on. This makes the minimal fix for the current code, which is to disable radix unless we are running in hypervisor mode.
Fixes: 2bfd65e4 ("powerpc/mm/radix: Add radix callbacks for early init routines")
Cc: stable@vger.kernel.org # v4.7+
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
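The minimal fix amounts to: if the MSR HV bit is clear we are running as a guest, and since this kernel cannot negotiate radix, the radix feature must not be advertised. A hedged sketch of that check (mfmsr(), MSR_HV, cur_cpu_spec and MMU_FTR_TYPE_RADIX are the real kernel names; the surrounding function is illustrative only):

    /* Illustrative early-boot check: running without MSR_HV means we are a
     * guest, and without CAS negotiation we must fall back to hash. */
    static void early_disable_radix_as_guest(void)
    {
        if (!(mfmsr() & MSR_HV))
            cur_cpu_spec->mmu_features &= ~MMU_FTR_TYPE_RADIX;
    }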
-
- 31 Jan, 2017 2 commits
-
-
Paul Mackerras authored
To use radix as a guest, we first need to tell the hypervisor via the ibm,client-architecture-support call that we support POWER9 and architecture v3.00, and that we can do either radix or hash and that we would like to choose later using an hcall (the H_REGISTER_PROC_TBL hcall). Then we need to check whether the hypervisor agreed to us using radix. We need to do this very early in the kernel boot process, before any of the MMU initialization is done. If the hypervisor doesn't agree, we can't use radix and therefore clear the radix MMU feature bit. Later, when we have set up our process table, which points to the radix tree for each process, we need to install that using the H_REGISTER_PROC_TBL hcall.
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Paul Mackerras authored
Currently, if the kernel is running on a POWER9 processor under a hypervisor, it will try to use the radix MMU even though it doesn't have the necessary code to use radix under a hypervisor (it doesn't negotiate use of radix, and it doesn't do the H_REGISTER_PROC_TBL hcall). The result is that the guest kernel will crash when it tries to turn on the MMU. This fixes it by looking for the /chosen/ibm,architecture-vec-5 property, and if it exists, clears the radix MMU feature bit, before we decide whether to initialize for radix or HPT. This property is created by the hypervisor as a result of the guest calling the ibm,client-architecture-support method to indicate its capabilities, so it will indicate whether the hypervisor agreed to us using radix. Systems without a hypervisor may have this property also (for example, skiboot creates it), so we check the HV bit in the MSR to see whether we are running as a guest or not. If we are in hypervisor mode, then we can do whatever we like including using the radix MMU. The reason for using this property is that in future, when we have support for using radix under a hypervisor, we will need to check this property to see whether the hypervisor agreed to us using radix.
Fixes: 2bfd65e4 ("powerpc/mm/radix: Add radix callbacks for early init routines")
Cc: stable@vger.kernel.org # v4.7+
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
- 24 Dec, 2016 1 commit
-
-
Linus Torvalds authored
This was entirely automated, using the script by Al:
PATT='^[[:blank:]]*#[[:blank:]]*include[[:blank:]]*<asm/uaccess.h>'
sed -i -e "s!$PATT!#include <linux/uaccess.h>!" \
    $(git grep -l "$PATT"|grep -v ^include/linux/uaccess.h)
to do the replacement at the end of the merge window.
Requested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 10 Dec, 2016 1 commit
-
-
Christophe Leroy authored
Today powerpc64 uses a set of pgtable_caches while powerpc32 uses standard pages when using 4k pages, and a single pgtable_cache if using other page sizes. In preparation for implementing huge pages on the 8xx, this patch replaces the specific powerpc32 handling by the 64-bit approach. This is done by:
* moving the 64-bit pgtable_cache_add() and pgtable_cache_init() into a new file called init-common.c
* modifying pgtable_cache_init() to also handle the case without PMD
* removing the 32-bit version of pgtable_cache_add() and pgtable_cache_init()
* copying related header contents from 64 bits into both the book3s/32 and nohash/32 header files
On the 8xx, the following cache sizes will be used:
* 4k pages mode:
  - PGT_CACHE(10) for PGD
  - PGT_CACHE(3) for 512k hugepage tables
* 16k pages mode:
  - PGT_CACHE(6) for PGD
  - PGT_CACHE(7) for 512k hugepage tables
  - PGT_CACHE(3) for 8M hugepage tables
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Scott Wood <oss@buserror.net>
-
- 01 Aug, 2016 6 commits
-
-
Aneesh Kumar K.V authored
This switches early feature checks to use the non static key variant of the function. In later patches we will be switching cpu_has_feature() and mmu_has_feature() to use static keys, and we can use them only after the static key/jump label machinery is initialized. Any check for a feature before jump label init should be done using this new helper.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
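An illustrative call site using the non-static-key helper (early_mmu_has_feature() and MMU_FTR_TYPE_RADIX are real kernel names; the branch shown is an assumption about where such a check might sit):

    /* Safe before jump_label_init(): reads the feature word directly
     * instead of going through a static key. */
    if (early_mmu_has_feature(MMU_FTR_TYPE_RADIX))
        radix__early_init_devtree();
    else
        hash__early_init_devtree();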
-
Aneesh Kumar K.V authored
MMU feature bits are defined such that we use the lower half to present MMU family features. Remove the strict split in half and also move Radix to an MMU family feature. Radix introduces a new MMU model and, strictly speaking, it is a new MMU family. This also frees up bits which can be used for individual features later.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Michael Ellerman authored
Like we just did for hash, split the device tree scanning parts out and call them from mmu_early_init_devtree().
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Michael Ellerman authored
Currently MMU initialisation (early_init_mmu()) consists of a mixture of scanning the device tree, setting MMU feature bits, and then also doing actual initialisation of MMU data structures. We'd like to decouple the setting of the MMU features from the actual setup. So split out the device tree scanning, and associated code, and call it from mmu_early_init_devtree().
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Michael Ellerman authored
Move the handling of the disable_radix command line argument into the newly created mmu_early_init_devtree(). It's an MMU option so it's preferable to have it in an mm related file, and it also means platforms that don't support radix don't have to carry the code.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Michael Ellerman authored
Empty for now, but we'll add to it in the next patch.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
- 01 May, 2016 3 commits
-
-
Aneesh Kumar K.V authored
For hash we create vmemmap mapping using bolted hash page table entries. For radix we fill the radix page table. The next patch will add the radix details for creating vmemmap mappings.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Aneesh Kumar K.V authored
Radix and hash MMU models support different page table sizes. Make the #defines variables so that existing code can work with variable sizes. Slice related code is only used by hash, so use hash constants there. We will replicate some of the boundary conditions with respect to TASK_SIZE using radix values too. Right now we do boundary condition checks using hash constants. Swapper pgdir size is initialized in asm code. We select the max pgd size to keep it simple. For now we select the hash pgdir. When adding radix we will switch that to the radix pgdir, which is 64K. The BUILD_BUG_ON check which is removed is already done in hugepage_init() using MAYBE_BUILD_BUG_ON().
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Aneesh Kumar K.V authored
This patch reduces the number of #ifdefs in C code and will also help in adding radix changes later. Only code movement in this patch.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
[mpe: Propagate copyrights and update GPL text]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
- 03 Mar, 2016 1 commit
-
-
Aneesh Kumar K.V authored
This is needed so that we can support both hash and radix page tables using a single kernel. A radix kernel uses a 4-level table. We now use physical addresses in the upper page table tree levels. Even though they are aligned to their size, for the masked bits we use the bit positions as per PowerISA 3.0.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
- 01 Mar, 2016 2 commits
-
-
David Gibson authored
This makes a number of cleanups to handling of mapping failures during memory hotplug on Power:
For errors creating the linear mapping for the hot-added region:
* This is now reported with EFAULT, which is more appropriate than the previous EINVAL (the failure is unlikely to be related to the function's parameters)
* An error in this path now prints a warning message, rather than just silently failing to add the extra memory.
* Previously a failure here could result in the region being partially mapped. We now clean up any partial mapping before failing.
For errors creating the vmemmap for the hot-added region:
* This is now reported with EFAULT instead of causing a BUG() - this could happen for external reasons (e.g. full hash table) so it's better to handle this non-fatally
* An error message is also printed, so the failure won't be silent
* As above a failure could cause a partially mapped region, we now clean this up.
[mpe: move htab_remove_mapping() out of #ifdef CONFIG_MEMORY_HOTPLUG to enable this]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Paul Mackerras <paulus@samba.org>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
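The cleanup-on-partial-failure behaviour described above follows a common pattern: on a mid-range failure, undo whatever was already mapped before returning -EFAULT, so the region is never left half-mapped. A hedged, self-contained sketch of that pattern (the map/unmap helpers, message and block size are hypothetical):

    #include <errno.h>
    #include <stdint.h>
    #include <stdio.h>

    /* Hypothetical per-block helpers; each maps or unmaps one block. */
    static int map_block(uint64_t addr)    { (void)addr; return 0; }
    static void unmap_block(uint64_t addr) { (void)addr; }

    #define BLOCK_SIZE 0x1000000ULL

    /* Map [start, end); on failure, tear down the partial mapping first. */
    static int create_mapping(uint64_t start, uint64_t end)
    {
        uint64_t addr;

        for (addr = start; addr < end; addr += BLOCK_SIZE) {
            if (map_block(addr)) {
                fprintf(stderr, "failed to map hot-added range\n");
                /* Undo the blocks mapped so far. */
                while (addr > start) {
                    addr -= BLOCK_SIZE;
                    unmap_block(addr);
                }
                return -EFAULT;
            }
        }
        return 0;
    }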
-
David Gibson authored
At the moment the hpte_removebolted callback in ppc_md returns void and will BUG_ON() if the hpte it's asked to remove doesn't exist in the first place. This is awkward for the case of cleaning up a mapping which was partially made before failing. So, we add a return value to hpte_removebolted, and have it return ENOENT in the case that the HPTE to remove didn't exist in the first place. In the (sole) caller, we propagate errors in hpte_removebolted to its caller to handle. However, we handle ENOENT specially, continuing to complete the unmapping over the specified range before returning the error to the caller. This means that htab_remove_mapping() will work sanely on a partially present mapping, removing any HPTEs which are present, while also returning ENOENT to its caller in case it's important there.
There are two callers of htab_remove_mapping():
- In remove_section_mapping() we already WARN_ON() any error return, which is reasonable - in this case the mapping should be fully present
- In vmemmap_remove_mapping() we BUG_ON() any error. We change that to just a WARN_ON() in the case of ENOENT, since failing to remove a mapping that wasn't there in the first place probably shouldn't be fatal.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
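The ENOENT handling makes the removal path tolerant of partially present mappings: keep removing what is there, remember that something was missing, and report that at the end. A simplified, self-contained sketch of that control flow (hypothetical helpers, not the hash-table code itself):

    #include <errno.h>
    #include <stdint.h>

    /* Hypothetical per-page unmap helper standing in for hpte_removebolted():
     * returns 0 on success and -ENOENT if the entry was never there. */
    static int remove_one(uint64_t addr) { (void)addr; return 0; }

    #define PAGE_SZ 0x10000ULL

    /*
     * Remove every entry in [start, end). A missing entry is not fatal: keep
     * going so the rest of the range is cleaned up, but let the caller know
     * via -ENOENT in case it cares.
     */
    static int remove_mapping(uint64_t start, uint64_t end)
    {
        int ret = 0, rc;
        uint64_t addr;

        for (addr = start; addr < end; addr += PAGE_SZ) {
            rc = remove_one(addr);
            if (rc == -ENOENT) {
                ret = -ENOENT;  /* remember, but keep unmapping */
                continue;
            }
            if (rc < 0)
                return rc;      /* any other error is propagated immediately */
        }
        return ret;
    }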
-