    powerpc/mm/hash: Fix sharing context ids between kernel & userspace · 5d2e5dd5
    Aneesh Kumar K.V authored
    Commit 0034d395 ("powerpc/mm/hash64: Map all the kernel regions in
    the same 0xc range") has a bug in the definition of MIN_USER_CONTEXT.
    
    The result is that the context id used for the vmemmap and the lowest
    context id handed out to userspace are the same. The context id is
    essentially the process identifier as far as the first stage of the
    MMU translation is concerned.
    
    This can result in multiple SLB entries with the same VSID (Virtual
    Segment ID), accessible to the kernel and some random userspace
    process that happens to get the overlapping id, which is not
    expected, eg:
    
      07 c00c000008000000 40066bdea7000500  1T  ESID=   c00c00  VSID=      66bdea7 LLP:100
      12 0002000008000000 40066bdea7000d80  1T  ESID=      200  VSID=      66bdea7 LLP:100
    
    Even though the user process and the kernel use the same VSID, the
    permissions in the hash page table prevent the user process from
    reading or writing to any kernel mappings.
    
    It can also lead to SLB entries with different base page size
    encodings (LLP), eg:
    
      05 c00c000008000000 00006bde0053b500 256M ESID=c00c00000  VSID=    6bde0053b LLP:100
      09 0000000008000000 00006bde0053bc80 256M ESID=        0  VSID=    6bde0053b LLP:  0
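
    For readers who want to check the dumps above by hand, the following
    is a minimal sketch of how the two SLB words can be decoded. The
    shift and mask values are assumptions modelled on the hash MMU
    headers (they are not quoted in this change log), and slb_decode()
    and struct slb_decoded are hypothetical helpers, not kernel APIs.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Assumed constants, modelled on the powerpc hash MMU headers.
 * They are illustrative and not copied verbatim from the kernel.
 */
#define SID_SHIFT_256M       28            /* 256M segment: EA >> 28 */
#define SID_SHIFT_1T         40            /* 1T segment:   EA >> 40 */
#define SLB_B_MASK           (3ULL << 62)  /* segment size field */
#define SLB_B_1T             (1ULL << 62)  /* value meaning "1T segment" */
#define SLB_VSID_SHIFT_256M  12
#define SLB_VSID_SHIFT_1T    24
#define SLB_VSID_LLP         0x130ULL      /* L bit plus LP bits */

/* Hypothetical container for the decoded fields. */
struct slb_decoded {
	int is_1t;       /* non-zero for a 1T segment */
	uint64_t esid;   /* effective segment id */
	uint64_t vsid;   /* virtual segment id */
	uint64_t llp;    /* base page size encoding */
};

/* Decode one SLB entry from its ESID word and VSID word. */
struct slb_decoded slb_decode(uint64_t esid_word, uint64_t vsid_word)
{
	struct slb_decoded d;

	d.is_1t = (vsid_word & SLB_B_MASK) == SLB_B_1T;
	d.esid = esid_word >> (d.is_1t ? SID_SHIFT_1T : SID_SHIFT_256M);
	d.vsid = (vsid_word & ~SLB_B_MASK) >>
		 (d.is_1t ? SLB_VSID_SHIFT_1T : SLB_VSID_SHIFT_256M);
	d.llp = vsid_word & SLB_VSID_LLP;
	return d;
}
```

    Under these assumptions, decoding entries 05 and 09 gives the same
    VSID (6bde0053b) but LLP values of 100 and 0, which is the mismatch
    described above.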
    
    Such SLB entries can result in machine checks, eg. as seen on a G5:
    
      Oops: Machine check, sig: 7 [#1]
      BE PAGE SIZE=64K MU-Hash SMP NR_CPUS=4 NUMA Power Mac
      NIP: c00000000026f248 LR: c000000000295e58 CTR: 0000000000000000
      REGS: c0000000erfd3d70 TRAP: 0200 Tainted: G M (5.5.0-rc1-gcc-8.2.0-00010-g228b667d8ea1)
      MSR: 9000000000109032 <SF,HV,EE,ME,IR,DR,RI> CR: 24282048 XER: 00000000
      DAR: c00c000000612c80 DSISR: 00000400 IRQMASK: 0
      ...
      NIP [c00000000026f248] .kmem_cache_free+0x58/0x140
      LR  [c000000000295e58] .putname+0xB8/0xa
      Call Trace:
        .putname+0xB8/0xa
        .filename_lookup.part.76+0xbe/0x160
        .do_faccessat+0xe0/0x380
        system_call+0x5c/0x68
    
    This happens with 256MB segments and 64K pages, as the duplicate VSID
    is hit with the first vmemmap segment and the first user segment, and
    older 32-bit userspace maps things in the first user segment.
    
    On other CPUs a machine check is not seen. Instead the userspace
    process can get stuck continuously faulting, with the fault never
    properly serviced, due to the kernel not understanding that there is
    already a HPTE for the address but with inaccessible permissions.
    
    On machines with 1T segments we've not seen the bug hit other than by
    deliberately exercising it. That seems to be just a matter of luck
    though, due to the typical layout of the user virtual address space
    and the ranges of vmemmap that are typically populated.
    
    To fix it we add 2 to MIN_USER_CONTEXT. This ensures the lowest
    context given to userspace doesn't overlap with the VMEMMAP context,
    or with the context for INVALID_REGION_ID.
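
    The arithmetic behind the overlap can be sketched as follows. The
    region ids, the per-region context counts and the REGION_CTX()
    mapping are assumptions about the header layout (none of them are
    spelled out in this change log), and the _OLD/_NEW macro names are
    hypothetical; only the "+ 2" comes from the fix itself.

```c
#include <assert.h>

/* Assumed context counts per kernel region (illustrative values). */
#define MAX_KERNEL_CTX_CNT   8   /* linear map contexts */
#define MAX_VMALLOC_CTX_CNT  1
#define MAX_IO_CTX_CNT       1
#define MAX_VMEMMAP_CTX_CNT  1

/* Assumed region ids; INVALID_REGION_ID follows VMEMMAP_REGION_ID. */
enum {
	USER_REGION_ID       = 0,
	LINEAR_MAP_REGION_ID = 1,
	VMALLOC_REGION_ID    = 2,
	IO_REGION_ID         = 3,
	VMEMMAP_REGION_ID    = 4,
	INVALID_REGION_ID    = 5,
};

/* Assumed mapping from a non-linear kernel region id to its context id. */
#define REGION_CTX(id)  ((id) + MAX_KERNEL_CTX_CNT - 1)

/* Hypothetical names for the definition before and after the fix. */
#define MIN_USER_CONTEXT_OLD  (MAX_KERNEL_CTX_CNT + MAX_VMALLOC_CTX_CNT + \
			       MAX_IO_CTX_CNT + MAX_VMEMMAP_CTX_CNT)
#define MIN_USER_CONTEXT_NEW  (MIN_USER_CONTEXT_OLD + 2)

/* Function form of REGION_CTX(), convenient for runtime checks. */
unsigned long region_context(int region_id)
{
	return REGION_CTX(region_id);
}

/* Before the fix: the first user context collides with vmemmap. */
static_assert(MIN_USER_CONTEXT_OLD == REGION_CTX(VMEMMAP_REGION_ID),
	      "old definition overlaps the vmemmap context");

/* After the fix: user contexts start above every kernel context. */
static_assert(MIN_USER_CONTEXT_NEW > REGION_CTX(INVALID_REGION_ID),
	      "new definition clears vmemmap and INVALID_REGION_ID");
```

    With these assumed values the vmemmap context and the old
    MIN_USER_CONTEXT are both 11, the INVALID_REGION_ID context is 12,
    and the fixed MIN_USER_CONTEXT is 13.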
    
    Fixes: 0034d395 ("powerpc/mm/hash64: Map all the kernel regions in the same 0xc range")
    Cc: stable@vger.kernel.org # v5.2+
    Reported-by: Christian Marillat <marillat@debian.org>
    Reported-by: Romain Dolbeau <romain@dolbeau.org>
    Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
    [mpe: Account for INVALID_REGION_ID, mostly rewrite change log]
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Link: https://lore.kernel.org/r/20200123102547.11623-1-mpe@ellerman.id.au