1. 24 Sep, 2019 6 commits
    • libnvdimm/region: Enable MAP_SYNC for volatile regions · 4c806b89
      Aneesh Kumar K.V authored
      
      
      Some environments want to use a host tmpfs/ramdisk to back guest pmem.
      While the data is not persisted relative to the host it *is* persisted
      relative to guest crashes / reboots. The guest is free to use dax and
      MAP_SYNC to keep filesystem metadata consistent with dax accesses
      without requiring guest fsync(). The guest can also observe that the
      region is volatile and skip cache flushing as global visibility is
      enough to "persist" data relative to the host staying alive over guest
      reset events.
      
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Reviewed-by: Pankaj Gupta <pagupta@redhat.com>
      Link: https://lore.kernel.org/r/20190924114327.14700-1-aneesh.kumar@linux.ibm.com
      [djbw: reword the changelog]
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
    • libnvdimm: prevent nvdimm from requesting key when security is disabled · 674f31a3
      Dave Jiang authored
      The current implementation attempts to request keys from the keyring even when
      security is not enabled. Change the behavior so that the key request is
      skipped when security is disabled.
      
      Error messages seen when no keys are installed and libnvdimm is loaded:
      
          request-key[4598]: Cannot find command to construct key 661489677
          request-key[4606]: Cannot find command to construct key 34713726
      
      Cc: stable@vger.kernel.org
      Fixes: 4c6926a2 ("acpi/nfit, libnvdimm: Add unlock of nvdimm support for Intel DIMMs")
      Signed-off-by: Dave Jiang <dave.jiang@intel.com>
      Link: https://lore.kernel.org/r/156934642272.30222.5230162488753445916.stgit@djiang5-desk3.ch.intel.com
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
    • libnvdimm/region: Initialize bad block for volatile namespaces · c42adf87
      Aneesh Kumar K.V authored
      
      
      We check for bad blocks during namespace init, and that check uses the
      region bad block list. We need to initialize the bad block list
      for volatile regions for this to work. We also observe the lockdep
      warning below because the lock is not initialized correctly
      when bad block init is skipped for volatile regions.
      
       INFO: trying to register non-static key.
       the code is fine but needs lockdep annotation.
       turning off the locking correctness validator.
       CPU: 2 PID: 1 Comm: swapper/0 Not tainted 5.3.0-rc1-15699-g3dee241c937e #149
       Call Trace:
       [c0000000f95cb250] [c00000000147dd84] dump_stack+0xe8/0x164 (unreliable)
       [c0000000f95cb2a0] [c00000000022ccd8] register_lock_class+0x308/0xa60
       [c0000000f95cb3a0] [c000000000229cc0] __lock_acquire+0x170/0x1ff0
       [c0000000f95cb4c0] [c00000000022c740] lock_acquire+0x220/0x270
       [c0000000f95cb580] [c000000000a93230] badblocks_check+0xc0/0x290
       [c0000000f95cb5f0] [c000000000d97540] nd_pfn_validate+0x5c0/0x7f0
       [c0000000f95cb6d0] [c000000000d98300] nd_dax_probe+0xd0/0x1f0
       [c0000000f95cb760] [c000000000d9b66c] nd_pmem_probe+0x10c/0x160
       [c0000000f95cb790] [c000000000d7f5ec] nvdimm_bus_probe+0x10c/0x240
       [c0000000f95cb820] [c000000000d0f844] really_probe+0x254/0x4e0
       [c0000000f95cb8b0] [c000000000d0fdfc] driver_probe_device+0x16c/0x1e0
       [c0000000f95cb930] [c000000000d10238] device_driver_attach+0x68/0xa0
       [c0000000f95cb970] [c000000000d1040c] __driver_attach+0x19c/0x1c0
       [c0000000f95cb9f0] [c000000000d0c4c4] bus_for_each_dev+0x94/0x130
       [c0000000f95cba50] [c000000000d0f014] driver_attach+0x34/0x50
       [c0000000f95cba70] [c000000000d0e208] bus_add_driver+0x178/0x2f0
       [c0000000f95cbb00] [c000000000d117c8] driver_register+0x108/0x170
       [c0000000f95cbb70] [c000000000d7edb0] __nd_driver_register+0xe0/0x100
       [c0000000f95cbbd0] [c000000001a6baa4] nd_pmem_driver_init+0x34/0x48
       [c0000000f95cbbf0] [c0000000000106f4] do_one_initcall+0x1d4/0x4b0
       [c0000000f95cbcd0] [c0000000019f499c] kernel_init_freeable+0x544/0x65c
       [c0000000f95cbdb0] [c000000000010d6c] kernel_init+0x2c/0x180
       [c0000000f95cbe20] [c00000000000b954] ret_from_kernel_thread+0x5c/0x68
      
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Link: https://lore.kernel.org/r/20190919083355.26340-1-aneesh.kumar@linux.ibm.com
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
    • libnvdimm/altmap: Track namespace boundaries in altmap · cf387d96
      Aneesh Kumar K.V authored
      With a PFN_MODE_PMEM namespace, the memmap area is allocated from the device
      area. Some architectures map the memmap area with a large page size. On
      architectures like ppc64, a 16MB page for the memmap mapping can map 262144 pfns.
      This maps a namespace size of 16G.
      
      When populating memmap region with 16MB page from the device area,
      make sure the allocated space is not used to map resources outside this
      namespace. Such usage of device area will prevent a namespace destroy.
      
      Add the resource end pfn to the altmap and use it to check whether the memmap area
      allocation can map pfns outside the namespace. On ppc64, in such a case we fall back
      to allocation from memory.
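
The arithmetic and the boundary check described in this commit message can be sketched as follows. This is a minimal model, not the kernel code: it assumes a 64-byte struct page and a 64K base page size (which is what makes 262144 pfns add up to 16G), and the names `pfns_per_memmap_block`, `memmap_block_fits`, `choose_memmap_backing`, `base_pfn` and `end_pfn` are hypothetical.

```python
STRUCT_PAGE_SIZE = 64                  # assumed sizeof(struct page), in bytes
BASE_PAGE_SIZE = 64 * 1024             # assumed 64K base page size
MEMMAP_PAGE_SIZE = 16 * 1024 * 1024    # 16MB page used to map the memmap


def pfns_per_memmap_block():
    """Number of pfns whose struct pages fit in one 16MB memmap page."""
    return MEMMAP_PAGE_SIZE // STRUCT_PAGE_SIZE


def memmap_block_fits(block_start_pfn, base_pfn, end_pfn):
    """True if one memmap page starting at block_start_pfn describes only
    pfns inside the namespace [base_pfn, end_pfn] (both inclusive)."""
    block_last_pfn = block_start_pfn + pfns_per_memmap_block() - 1
    return base_pfn <= block_start_pfn and block_last_pfn <= end_pfn


def choose_memmap_backing(block_start_pfn, base_pfn, end_pfn):
    """Use the device area only when the block stays inside the namespace;
    otherwise fall back to regular memory."""
    if memmap_block_fits(block_start_pfn, base_pfn, end_pfn):
        return "device"
    return "memory"
```

With these assumptions, a namespace smaller than 16G can never host a full 16MB memmap page from its own device area, so the model always picks "memory" for it.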
      
      This fixes the kernel crash reported below:
      
      [  132.034989] WARNING: CPU: 13 PID: 13719 at mm/memremap.c:133 devm_memremap_pages_release+0x2d8/0x2e0
      [  133.464754] BUG: Unable to handle kernel data access at 0xc00c00010b204000
      [  133.464760] Faulting instruction address: 0xc00000000007580c
      [  133.464766] Oops: Kernel access of bad area, sig: 11 [#1]
      [  133.464771] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
      .....
      [  133.464901] NIP [c00000000007580c] vmemmap_free+0x2ac/0x3d0
      [  133.464906] LR [c0000000000757f8] vmemmap_free+0x298/0x3d0
      [  133.464910] Call Trace:
      [  133.464914] [c000007cbfd0f7b0] [c0000000000757f8] vmemmap_free+0x298/0x3d0 (unreliable)
      [  133.464921] [c000007cbfd0f8d0] [c000000000370a44] section_deactivate+0x1a4/0x240
      [  133.464928] [c000007cbfd0f980] [c000000000386270] __remove_pages+0x3a0/0x590
      [  133.464935] [c000007cbfd0fa50] [c000000000074158] arch_remove_memory+0x88/0x160
      [  133.464942] [c000007cbfd0fae0] [c0000000003be8c0] devm_memremap_pages_release+0x150/0x2e0
      [  133.464949] [c000007cbfd0fb70] [c000000000738ea0] devm_action_release+0x30/0x50
      [  133.464955] [c000007cbfd0fb90] [c00000000073a5a4] release_nodes+0x344/0x400
      [  133.464961] [c000007cbfd0fc40] [c00000000073378c] device_release_driver_internal+0x15c/0x250
      [  133.464968] [c000007cbfd0fc80] [c00000000072fd14] unbind_store+0x104/0x110
      [  133.464973] [c000007cbfd0fcd0] [c00000000072ee24] drv_attr_store+0x44/0x70
      [  133.464981] [c000007cbfd0fcf0] [c0000000004a32bc] sysfs_kf_write+0x6c/0xa0
      [  133.464987] [c000007cbfd0fd10] [c0000000004a1dfc] kernfs_fop_write+0x17c/0x250
      [  133.464993] [c000007cbfd0fd60] [c0000000003c348c] __vfs_write+0x3c/0x70
      [  133.464999] [c000007cbfd0fd80] [c0000000003c75d0] vfs_write+0xd0/0x250
      
      djbw: Aneesh notes that this crash can likely be triggered in any kernel that
      supports 'papr_scm', so flagging that commit for -stable consideration.
      
      Fixes: b5beae5e ("powerpc/pseries: Add driver for PAPR SCM regions")
      Cc: <stable@vger.kernel.org>
      Reported-by: Sachin Sant <sachinp@linux.vnet.ibm.com>
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Reviewed-by: Pankaj Gupta <pagupta@redhat.com>
      Tested-by: Santosh Sivaraj <santosh@fossix.org>
      Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
      Link: https://lore.kernel.org/r/20190910062826.10041-1-aneesh.kumar@linux.ibm.com
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
    • libnvdimm: Fix endian conversion issues · 86aa6668
      Aneesh Kumar K.V authored
      nd_label->dpa issue was observed when trying to enable the namespace created
      with little-endian kernel on a big-endian kernel. That made me run
      `sparse` on the rest of the code and other changes are the result of that.
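
The class of bug described here can be illustrated with a small sketch. It is not taken from the patch: the `dpa` value is made up, and the only assumption carried over from the kernel is that the on-media label field is stored little-endian and must be converted before use.

```python
import struct

dpa = 0x0000_0001_0000_0000  # made-up device physical address for illustration

# The on-media label field is little-endian regardless of which CPU wrote it.
on_media = struct.pack("<Q", dpa)

# Bug pattern: a big-endian host interpreting the raw bytes in native order.
misread = struct.unpack(">Q", on_media)[0]

# Correct pattern: always convert from little-endian before using the value.
converted = int.from_bytes(on_media, byteorder="little")
```

On a little-endian host the two reads agree, which is why the missing conversion goes unnoticed until the namespace is moved to a big-endian kernel.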
      
      Fixes: d9b83c75 ("libnvdimm, btt: rework error clearing")
      Fixes: 9dedc73a ("libnvdimm/btt: Fix LBA masking during 'free list' population")
      Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Link: https://lore.kernel.org/r/20190809074726.27815-1-aneesh.kumar@linux.ibm.com
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
    • libnvdimm/dax: Pick the right alignment default when creating dax devices · f5376699
      Aneesh Kumar K.V authored
      Allow the arch to provide the supported alignments and use hugepage alignment only
      if hugepages are supported. Right now we depend on compile-time configs, whereas this
      patch switches to runtime discovery.
      
      Architectures like ppc64 can have THP enabled in code, but then can have
      hugepage size disabled by the hypervisor. This allows us to create dax devices
      with PAGE_SIZE alignment in this case.
      
      Existing dax namespace with alignment larger than PAGE_SIZE will fail to
      initialize in this specific case. We still allow fsdax namespace initialization.
      
      With respect to identifying whether to enable hugepage faults for a dax device,
      if THP is enabled at compile time, we default to taking a hugepage fault, and in the dax
      fault handler, if we find the fault size > alignment, we retry with a PAGE_SIZE
      fault size.
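
The two decisions described above (choosing a default alignment from what the platform reports at runtime, and falling back to a PAGE_SIZE fault) can be modelled roughly as below. This is a sketch under assumptions, not the kernel implementation: the sizes match the 65536 and 16777216 values in the ppc64 example that follows, and the function names are hypothetical.

```python
PAGE_SIZE = 64 * 1024            # assumed base page size (65536)
HPAGE_SIZE = 16 * 1024 * 1024    # assumed huge page size (16777216)


def default_alignment(supported_alignments):
    """Pick the huge page alignment only if the platform reports it at runtime."""
    if HPAGE_SIZE in supported_alignments:
        return HPAGE_SIZE
    return PAGE_SIZE


def effective_fault_size(requested_fault_size, device_alignment):
    """Retry with PAGE_SIZE when the requested fault exceeds the device alignment."""
    if requested_fault_size > device_alignment:
        return PAGE_SIZE
    return requested_fault_size
```

For example, when the hypervisor disables the 16M page size, `default_alignment([65536])` yields the base page size and a huge page fault on that device is retried at PAGE_SIZE.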
      
      This also addresses the below failure scenario on ppc64
      
      ndctl create-namespace --mode=devdax  | grep align
       "align":16777216,
       "align":16777216
      
      cat /sys/devices/ndbus0/region0/dax0.0/supported_alignments
       65536 16777216
      
      daxio.static-debug  -z -o /dev/dax0.0
        Bus error (core dumped)
      
        $ dmesg | tail
         lpar: Failed hash pte insert with error -4
         hash-mmu: mm: Hashing failure ! EA=0x7fff17000000 access=0x8000000000000006 current=daxio
         hash-mmu:     trap=0x300 vsid=0x22cb7a3a ssize=1 base psize=2 psize 10 pte=0xc000000501002b86
         daxio[3860]: bus error (7) at 7fff17000000 nip 7fff973c007c lr 7fff973bff34 code 2 in libpmem.so.1.0.0[7fff973b0000+20000]
         daxio[3860]: code: 792945e4 7d494b78 e95f0098 7d494b78 f93f00a0 4800012c e93f0088 f93f0120
         daxio[3860]: code: e93f00a0 f93f0128 e93f0120 e95f0128 <f9490000> e93f0088 39290008 f93f0110
      
      The failure was due to the guest kernel using the wrong page size.
      
      The namespaces created with 16M alignment will appear as below on a config with
      16M page size disabled.
      
      $ ndctl list -Ni
      [
        {
          "dev":"namespace0.1",
          "mode":"fsdax",
          "map":"dev",
          "size":5351931904,
          "uuid":"fc6e9667-461a-4718-82b4-69b24570bddb",
          "align":16777216,
          "blockdev":"pmem0.1",
          "supported_alignments":[
            65536
          ]
        },
        {
          "dev":"namespace0.0",
          "mode":"fsdax",    <==== devdax 16M alignment marked disabled.
          "map":"mem",
          "size":5368709120,
          "uuid":"a4bdf81a-f2ee-4bc6-91db-7b87eddd0484",
          "state":"disabled"
        }
      ]
      
      Cc: linux-mm@kvack.org
      Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Link: https://lore.kernel.org/r/20190905154603.10349-8-aneesh.kumar@linux.ibm.com
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
  2. 07 Sep, 2019 1 commit
  3. 05 Sep, 2019 6 commits
  4. 29 Aug, 2019 4 commits
  5. 28 Aug, 2019 1 commit
  6. 14 Aug, 2019 1 commit
  7. 19 Jul, 2019 2 commits
    • libnvdimm/pfn: stop padding pmem namespaces to section alignment · a3619190
      Dan Williams authored
      Now that the mm core supports section-unaligned hotplug of ZONE_DEVICE
      memory, we no longer need to add padding at pfn/dax device creation
      time.  The kernel will still honor padding established by older kernels.
      
      Link: http://lkml.kernel.org/r/156092356588.979959.6793371748950931916.stgit@dwillia2-desk3.amr.corp.intel.com
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
      Reported-by: Jeff Moyer <jmoyer@redhat.com>
      Tested-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>	[ppc64]
      Cc: David Hildenbrand <david@redhat.com>
      Cc: Jane Chu <jane.chu@oracle.com>
      Cc: Jérôme Glisse <jglisse@redhat.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Logan Gunthorpe <logang@deltatee.com>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Mike Rapoport <rppt@linux.ibm.com>
      Cc: Oscar Salvador <osalvador@suse.de>
      Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
      Cc: Toshi Kani <toshi.kani@hpe.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Wei Yang <richardw.yang@linux.intel.com>
      Cc: Jason Gunthorpe <jgg@mellanox.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • libnvdimm/pfn: fix fsdax-mode namespace info-block zero-fields · 7e3e888d
      Dan Williams authored
      At namespace creation time there is the potential for the "expected to
      be zero" fields of a 'pfn' info-block to be filled with indeterminate
      data.  While the kernel buffer is zeroed on allocation it is immediately
      overwritten by nd_pfn_validate() filling it with the current contents of
      the on-media info-block location.  For fields like 'flags' and
      'padding', it potentially means that future implementations cannot rely on
      those fields being zero.
      
      In preparation to stop using the 'start_pad' and 'end_trunc' fields for
      section alignment, arrange for fields that are not explicitly
      initialized to be guaranteed zero.  Bump the minor version to indicate
      it is safe to assume the 'padding' and 'flags' are zero.  Otherwise,
      this corruption is expected to be benign since all other critical fields
      are explicitly initialized.
      
      Note: the cc: stable is about spreading this new policy to as many
      kernels as possible, not fixing an issue in those kernels.  It is not
      until the change titled "libnvdimm/pfn: Stop padding pmem namespaces to
      section alignment" where this improper initialization becomes a problem.
      So if someone decides to backport "libnvdimm/pfn: Stop padding pmem
      namespaces to section alignment" (which is not tagged for stable), make
      sure this pre-requisite is flagged.
      
      Link: http://lkml.kernel.org/r/156092356065.979959.6681003754765958296.stgit@dwillia2-desk3.amr.corp.intel.com
      Fixes: 32ab0a3f ("libnvdimm, pmem: 'struct page' for pmem")
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
      Tested-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>	[ppc64]
      Cc: <stable@vger.kernel.org>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: Jane Chu <jane.chu@oracle.com>
      Cc: Jeff Moyer <jmoyer@redhat.com>
      Cc: Jérôme Glisse <jglisse@redhat.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Logan Gunthorpe <logang@deltatee.com>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Mike Rapoport <rppt@linux.ibm.com>
      Cc: Oscar Salvador <osalvador@suse.de>
      Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
      Cc: Toshi Kani <toshi.kani@hpe.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Wei Yang <richardw.yang@linux.intel.com>
      Cc: Jason Gunthorpe <jgg@mellanox.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  8. 18 Jul, 2019 6 commits
    • driver-core, libnvdimm: Let device subsystems add local lockdep coverage · 87a30e1f
      Dan Williams authored
      
      
      For good reason, the standard device_lock() is marked
      lockdep_set_novalidate_class() because there is simply no sane way to
      describe the myriad ways the device_lock() is ordered with other locks.
      However, that leaves subsystems that know their own local device_lock()
      ordering rules to find lock ordering mistakes manually. Instead,
      introduce an optional / additional lockdep-enabled lock that a subsystem
      can acquire in all the same paths that the device_lock() is acquired.
      
      A conversion of the NFIT driver and NVDIMM subsystem to a
      lockdep-validate device_lock() scheme is included. The
      debug_nvdimm_lock() implementation implements the correct lock-class and
      stacking order for the libnvdimm device topology hierarchy.
      
      Yes, this is a hack, but hopefully it is a useful hack for other
      subsystems' device_lock() debug sessions. Quoting Greg:
      
          "Yeah, it feels a bit hacky but it's really up to a subsystem to mess up
           using it as much as anything else, so user beware :)
      
           I don't object to it if it makes things easier for you to debug."
      
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Ira Weiny <ira.weiny@intel.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Dave Jiang <dave.jiang@intel.com>
      Cc: Keith Busch <keith.busch@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Vishal Verma <vishal.l.verma@intel.com>
      Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
      Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Reviewed-by: Ira Weiny <ira.weiny@intel.com>
      Link: https://lore.kernel.org/r/156341210661.292348.7014034644265455704.stgit@dwillia2-desk3.amr.corp.intel.com
    • libnvdimm/bus: Fix wait_nvdimm_bus_probe_idle() ABBA deadlock · ca6bf264
      Dan Williams authored
      A multithreaded namespace creation/destruction stress test currently
      deadlocks with the following lockup signature:
      
          INFO: task ndctl:2924 blocked for more than 122 seconds.
                Tainted: G           OE     5.2.0-rc4+ #3382
          "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
          ndctl           D    0  2924   1176 0x00000000
          Call Trace:
           ? __schedule+0x27e/0x780
           schedule+0x30/0xb0
           wait_nvdimm_bus_probe_idle+0x8a/0xd0 [libnvdimm]
           ? finish_wait+0x80/0x80
           uuid_store+0xe6/0x2e0 [libnvdimm]
           kernfs_fop_write+0xf0/0x1a0
           vfs_write+0xb7/0x1b0
           ksys_write+0x5c/0xd0
           do_syscall_64+0x60/0x240
      
           INFO: task ndctl:2923 blocked for more than 122 seconds.
                 Tainted: G           OE     5.2.0-rc4+ #3382
           "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
           ndctl           D    0  2923   1175 0x00000000
           Call Trace:
            ? __schedule+0x27e/0x780
            ? __mutex_lock+0x489/0x910
            schedule+0x30/0xb0
            schedule_preempt_disabled+0x11/0x20
            __mutex_lock+0x48e/0x910
            ? nvdimm_namespace_common_probe+0x95/0x4d0 [libnvdimm]
            ? __lock_acquire+0x23f/0x1710
            ? nvdimm_namespace_common_probe+0x95/0x4d0 [libnvdimm]
            nvdimm_namespace_common_probe+0x95/0x4d0 [libnvdimm]
            __dax_pmem_probe+0x5e/0x210 [dax_pmem_core]
            ? nvdimm_bus_probe+0x1d0/0x2c0 [libnvdimm]
            dax_pmem_probe+0xc/0x20 [dax_pmem]
            nvdimm_bus_probe+0x90/0x2c0 [libnvdimm]
            really_probe+0xef/0x390
            driver_probe_device+0xb4/0x100
      
      In this sequence an 'nd_dax' device is being probed and trying to take
      the lock on its backing namespace to validate that the 'nd_dax' device
      indeed has exclusive access to the backing namespace. Meanwhile, another
      thread is trying to update the uuid property of that same backing
      namespace. So one thread is in the probe path trying to acquire the
      lock, and the other thread has acquired the lock and tries to flush the
      probe path.
      
      Fix this deadlock by not holding the namespace device_lock over the
      wait_nvdimm_bus_probe_idle() synchronization step. In turn this requires
      the device_lock to be held on entry to wait_nvdimm_bus_probe_idle() and
      subsequently dropped internally to wait_nvdimm_bus_probe_idle().
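
The locking pattern behind this fix can be modelled in user space as follows. This is only an illustrative sketch, not the kernel code: `device_lock`, `probe_cv` and the `probe_active` counter are stand-ins chosen for the example. The wait helper is entered with the lock held, drops it for the duration of the wait so the probe path can make progress, then retakes it before returning.

```python
import threading


def run_scenario():
    device_lock = threading.Lock()     # stand-in for the namespace device_lock
    probe_cv = threading.Condition()   # protects the probe_active counter
    state = {"probe_active": 1}        # one probe is in flight at the start
    events = []

    def wait_probe_idle():
        # Entered with device_lock held: drop it while waiting, retake after.
        device_lock.release()
        with probe_cv:
            while state["probe_active"]:
                probe_cv.wait()
        device_lock.acquire()

    def uuid_store():
        with device_lock:
            wait_probe_idle()
            events.append("uuid updated")

    def probe():
        # The probe path needs the same lock to validate the namespace.
        with device_lock:
            events.append("probe validated namespace")
        with probe_cv:
            state["probe_active"] -= 1
            probe_cv.notify_all()

    threads = [
        threading.Thread(target=uuid_store, daemon=True),
        threading.Thread(target=probe, daemon=True),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join(timeout=5)
    finished = not any(t.is_alive() for t in threads)
    return finished, events
```

If `wait_probe_idle()` kept the lock while waiting, `probe()` could never acquire it and both threads would block, which is the ABBA pattern in the hung-task traces above.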
      
      Cc: <stable@vger.kernel.org>
      Fixes: bf9bccc1 ("libnvdimm: pmem label sets and namespace instantiation")
      Cc: Vishal Verma <vishal.l.verma@intel.com>
      Tested-by: Jane Chu <jane.chu@oracle.com>
      Link: https://lore.kernel.org/r/156341210094.292348.2384694131126767789.stgit@dwillia2-desk3.amr.corp.intel.com
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
    • libnvdimm/bus: Stop holding nvdimm_bus_list_mutex over __nd_ioctl() · b70d31d0
      Dan Williams authored
      In preparation for fixing a deadlock between wait_nvdimm_bus_probe_idle()
      and the nvdimm_bus_list_mutex, arrange for __nd_ioctl() to run without
      nvdimm_bus_list_mutex held. This also unifies the 'dimm' and 'bus' level
      ioctls into a common nd_ioctl() preamble implementation.
      
      Marked for -stable as it is a pre-requisite for a follow-on fix.
      
      Cc: <stable@vger.kernel.org>
      Fixes: bf9bccc1 ("libnvdimm: pmem label sets and namespace instantiation")
      Cc: Vishal Verma <vishal.l.verma@intel.com>
      Tested-by: Jane Chu <jane.chu@oracle.com>
      Link: https://lore.kernel.org/r/156341209518.292348.7183897251740665198.stgit@dwillia2-desk3.amr.corp.intel.com
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
    • libnvdimm/bus: Prepare the nd_ioctl() path to be re-entrant · 6de5d06e
      Dan Williams authored
      
      
      In preparation for not holding a lock over the execution of nd_ioctl(),
      update the implementation to allow multiple threads to be attempting
      ioctls at the same time. The bus lock still prevents multiple in-flight
      ->ndctl() invocations from corrupting each other's state, but static
      global staging buffers are moved to the heap.
      
      Reported-by: Vishal Verma <vishal.l.verma@intel.com>
      Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>
      Tested-by: Vishal Verma <vishal.l.verma@intel.com>
      Link: https://lore.kernel.org/r/156341208947.292348.10560140326807607481.stgit@dwillia2-desk3.amr.corp.intel.com
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
    • libnvdimm/region: Register badblocks before namespaces · 700cd033
      Dan Williams authored
      Namespace activation expects to be able to reference region badblocks.
      The following warning sometimes triggers when asynchronous namespace
      activation races in front of the completion of namespace probing. Move
      all possible namespace probing after region badblocks initialization.
      
      Otherwise, lockdep sometimes catches the uninitialized state of the
      badblocks seqlock with stack trace signatures like:
      
          INFO: trying to register non-static key.
          pmem2: detected capacity change from 0 to 136365211648
          the code is fine but needs lockdep annotation.
          turning off the locking correctness validator.
          CPU: 9 PID: 358 Comm: kworker/u80:5 Tainted: G           OE     5.2.0-rc4+ #3382
          Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.0.0 02/06/2015
          Workqueue: events_unbound async_run_entry_fn
          Call Trace:
           dump_stack+0x85/0xc0
          pmem1.12: detected capacity change from 0 to 8589934592
           register_lock_class+0x56a/0x570
           ? check_object+0x140/0x270
           __lock_acquire+0x80/0x1710
           ? __mutex_lock+0x39d/0x910
           lock_acquire+0x9e/0x180
           ? nd_pfn_validate+0x28f/0x440 [libnvdimm]
           badblocks_check+0x93/0x1f0
           ? nd_pfn_validate+0x28f/0x440 [libnvdimm]
           nd_pfn_validate+0x28f/0x440 [libnvdimm]
           ? lockdep_hardirqs_on+0xf0/0x180
           nd_dax_probe+0x9a/0x120 [libnvdimm]
           nd_pmem_probe+0x6d/0x180 [nd_pmem]
           nvdimm_bus_probe+0x90/0x2c0 [libnvdimm]
      
      Fixes: 48af2f7e ("libnvdimm, pfn: during init, clear errors...")
      Cc: <stable@vger.kernel.org>
      Cc: Vishal Verma <vishal.l.verma@intel.com>
      Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>
      Link: https://lore.kernel.org/r/156341208365.292348.1547528796026249120.stgit@dwillia2-desk3.amr.corp.intel.com
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
    • libnvdimm/bus: Prevent duplicate device_unregister() calls · 8aac0e23
      Dan Williams authored
      
      
      A multithreaded namespace creation/destruction stress test currently
      fails with signatures like the following:
      
          sysfs group 'power' not found for kobject 'dax1.1'
          RIP: 0010:sysfs_remove_group+0x76/0x80
          Call Trace:
           device_del+0x73/0x370
           device_unregister+0x16/0x50
           nd_async_device_unregister+0x1e/0x30 [libnvdimm]
           async_run_entry_fn+0x39/0x160
           process_one_work+0x23c/0x5e0
           worker_thread+0x3c/0x390
      
          BUG: kernel NULL pointer dereference, address: 0000000000000020
          RIP: 0010:klist_put+0x1b/0x6c
          Call Trace:
           klist_del+0xe/0x10
           device_del+0x8a/0x2c9
           ? __switch_to_asm+0x34/0x70
           ? __switch_to_asm+0x40/0x70
           device_unregister+0x44/0x4f
           nd_async_device_unregister+0x22/0x2d [libnvdimm]
           async_run_entry_fn+0x47/0x15a
           process_one_work+0x1a2/0x2eb
           worker_thread+0x1b8/0x26e
      
      Use the kill_device() helper to atomically resolve the race of multiple
      threads issuing kill, device_unregister(), requests.
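
The idea of resolving the race with a single "first caller wins" check can be sketched in user space as below. This is a simplified model and not the kernel helper itself: the `Device` class, its `dead` flag and the `unregister_calls` counter are hypothetical stand-ins, and the lock is taken inside `kill_device()` here purely to keep the example short.

```python
import threading


class Device:
    def __init__(self, name):
        self.name = name
        self.lock = threading.Lock()   # stand-in for the device lock
        self.dead = False
        self.unregister_calls = 0


def kill_device(dev):
    """Mark the device dead; only the first caller is told to proceed."""
    with dev.lock:
        if dev.dead:
            return False
        dev.dead = True
        return True


def unregister_once(dev):
    # Only the thread that won kill_device() performs the teardown.
    if kill_device(dev):
        dev.unregister_calls += 1      # stand-in for device_unregister()


def stress(n_threads=8):
    dev = Device("dax1.1")
    threads = [
        threading.Thread(target=unregister_once, args=(dev,))
        for _ in range(n_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return dev.unregister_calls
```

However many threads race, the teardown runs once, which is the property the duplicate device_unregister() traces above show was missing.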
      
      Reported-by: Jane Chu <jane.chu@oracle.com>
      Reported-by: Erwin Tsaur <erwin.tsaur@oracle.com>
      Fixes: 4d88a97a ("libnvdimm, nvdimm: dimm driver and base libnvdimm device-driver...")
      Cc: <stable@vger.kernel.org>
      Link: https://github.com/pmem/ndctl/issues/96
      Tested-by: Jane Chu <jane.chu@oracle.com>
      Link: https://lore.kernel.org/r/156341207846.292348.10435719262819764054.stgit@dwillia2-desk3.amr.corp.intel.com
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
  9. 17 Jul, 2019 1 commit
  10. 15 Jul, 2019 2 commits
  11. 11 Jul, 2019 1 commit
  12. 05 Jul, 2019 4 commits
  13. 02 Jul, 2019 5 commits