    libnvdimm/dax: Pick the right alignment default when creating dax devices · f5376699
    Aneesh Kumar K.V authored
    Allow the architecture to provide the supported alignments, and use hugepage
    alignment only if hugepages are supported. Right now we depend on compile-time
    configs; this patch switches to runtime discovery.
    
    Architectures like ppc64 can have THP enabled in code, but then can have
    hugepage size disabled by the hypervisor. This allows us to create dax devices
    with PAGE_SIZE alignment in this case.
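
    The selection rule described above can be sketched in plain userspace C.
    This is only an illustration, not the kernel code from this patch: the
    struct arch_caps type and both function names are hypothetical, and the
    sketch assumes a single huge page size (the 64K/16M ppc64 values used in
    this message are just an example).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of what the platform reports at runtime. */
struct arch_caps {
    unsigned long page_size;     /* base page size, e.g. 64K */
    unsigned long hugepage_size; /* huge page size, e.g. 16M */
    bool hugepage_usable;        /* false if the hypervisor disabled it */
};

/*
 * Fill 'out' with the supported alignments, smallest first, and return
 * how many entries were written (1 or 2 in this simplified model).
 */
static size_t supported_alignments(const struct arch_caps *caps,
                                   unsigned long out[2])
{
    size_t n = 0;

    assert(caps->page_size != 0);
    out[n++] = caps->page_size;

    /* Offer hugepage alignment only when it is usable at runtime. */
    if (caps->hugepage_usable && caps->hugepage_size > caps->page_size)
        out[n++] = caps->hugepage_size;

    return n;
}

/* Default alignment: hugepage size if supported, else the base page size. */
static unsigned long default_alignment(const struct arch_caps *caps)
{
    unsigned long align[2];
    size_t n = supported_alignments(caps, align);

    return align[n - 1];
}
```

    With hugepage_usable set to false, the sketch reports only the base page
    size and picks it as the default, which corresponds to the PAGE_SIZE
    alignment case mentioned above.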
    
    An existing dax namespace with an alignment larger than PAGE_SIZE will fail to
    initialize in this specific case. We still allow fsdax namespace initialization.
    
    To decide whether to take hugepage faults for a dax device: if THP is enabled
    at compile time, we default to taking a hugepage fault, and if the dax fault
    handler finds that the fault size is larger than the device alignment, we
    retry with a PAGE_SIZE fault.
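
    A minimal userspace sketch of that retry logic follows. It models only the
    size comparison stated above; the enum and both function names are
    hypothetical and do not correspond to the kernel's fault handler
    interfaces.

```c
#include <assert.h>

/* Hypothetical outcome of one fault attempt. */
enum fault_result { FAULT_HANDLED, FAULT_FALLBACK };

/* A fault larger than the device alignment cannot be served as-is. */
static enum fault_result try_fault(unsigned long fault_size,
                                   unsigned long dev_align)
{
    if (fault_size > dev_align)
        return FAULT_FALLBACK;
    return FAULT_HANDLED;
}

/*
 * Try a hugepage-sized fault first; on fallback, retry with the base
 * page size. Returns the fault size that was finally handled.
 */
static unsigned long handle_fault(unsigned long dev_align,
                                  unsigned long page_size,
                                  unsigned long hugepage_size)
{
    /* Assumption of this sketch: alignment is never below the base page size. */
    assert(dev_align >= page_size);

    if (try_fault(hugepage_size, dev_align) == FAULT_HANDLED)
        return hugepage_size;

    /* Retry with a PAGE_SIZE fault, which always fits the alignment. */
    assert(try_fault(page_size, dev_align) == FAULT_HANDLED);
    return page_size;
}
```

    For a device created with 64K alignment, a 16M fault attempt falls back and
    the 64K retry succeeds; for a 16M-aligned device the first attempt is
    handled directly.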
    
    This also addresses the following failure scenario on ppc64:
    
    ndctl create-namespace --mode=devdax  | grep align
     "align":16777216,
     "align":16777216
    
    cat /sys/devices/ndbus0/region0/dax0.0/supported_alignments
     65536 16777216
    
    daxio.static-debug  -z -o /dev/dax0.0
      Bus error (core dumped)
    
      $ dmesg | tail
       lpar: Failed hash pte insert with error -4
       hash-mmu: mm: Hashing failure ! EA=0x7fff17000000 access=0x8000000000000006 current=daxio
       hash-mmu:     trap=0x300 vsid=0x22cb7a3a ssize=1 base psize=2 psize 10 pte=0xc000000501002b86
       daxio[3860]: bus error (7) at 7fff17000000 nip 7fff973c007c lr 7fff973bff34 code 2 in libpmem.so.1.0.0[7fff973b0000+20000]
       daxio[3860]: code: 792945e4 7d494b78 e95f0098 7d494b78 f93f00a0 4800012c e93f0088 f93f0120
       daxio[3860]: code: e93f00a0 f93f0128 e93f0120 e95f0128 <f9490000> e93f0088 39290008 f93f0110
    
    The failure was due to the guest kernel using the wrong page size.
    
    The namespaces created with 16M alignment will appear as below on a config with
    16M page size disabled.
    
    $ ndctl list -Ni
    [
      {
        "dev":"namespace0.1",
        "mode":"fsdax",
        "map":"dev",
        "size":5351931904,
        "uuid":"fc6e9667-461a-4718-82b4-69b24570bddb",
        "align":16777216,
        "blockdev":"pmem0.1",
        "supported_alignments":[
          65536
        ]
      },
      {
        "dev":"namespace0.0",
        "mode":"fsdax",    <==== devdax 16M alignment marked disabled.
        "map":"mem",
        "size":5368709120,
        "uuid":"a4bdf81a-f2ee-4bc6-91db-7b87eddd0484",
        "state":"disabled"
      }
    ]
    
    Cc: linux-mm@kvack.org
    Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
    Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
    Link: https://lore.kernel.org/r/20190905154603.10349-8-aneesh.kumar@linux.ibm.com
    
    Signed-off-by: Dan Williams <dan.j.williams@intel.com>