1. 25 Aug, 2019 6 commits
    • psi: get poll_work to run when calling poll syscall next time · 7b2b55da
      Jason Xing authored
      Only the first call to the poll syscall lets the user receive POLLPRI
      correctly.  After that, the user always fails to acquire the event.
      Steps to reproduce:
       1. Get the monitor code in Documentation/accounting/psi.txt
       2. Run it, and wait for the event to be triggered.
       3. Kill and restart the process.
      The question is why we can end up with poll_scheduled = 1 but the work
      not running (which would reset it to 0).  And the answer is because the
      scheduling side sees group->poll_kworker under RCU protection and then
      schedules it, but here we cancel the work and destroy the worker.  The
      cancel needs to pair with resetting the poll_scheduled flag.
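      The pairing described above can be sketched in userspace with C11
      atomics.  This is only an illustration: group_sketch,
      schedule_poll_work() and destroy_trigger() are hypothetical stand-ins
      rather than the kernel's code, and the reset_flag switch exists purely
      to show both behaviours side by side.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-group polling state. */
struct group_sketch {
    atomic_int poll_scheduled; /* 1 while poll work is considered queued */
    bool work_queued;          /* stand-in for the worker's queue */
};

static void group_init(struct group_sketch *g)
{
    assert(g != NULL);
    atomic_init(&g->poll_scheduled, 0);
    g->work_queued = false;
}

/* Scheduling side: queue the work only if the flag flips from 0 to 1. */
static bool schedule_poll_work(struct group_sketch *g)
{
    int expected = 0;

    if (!atomic_compare_exchange_strong(&g->poll_scheduled, &expected, 1))
        return false;          /* already scheduled, nothing to do */
    g->work_queued = true;
    return true;
}

/* Teardown side: cancel the pending work and destroy the worker. */
static void destroy_trigger(struct group_sketch *g, bool reset_flag)
{
    g->work_queued = false;    /* the cancelled work never runs ... */
    if (reset_flag)            /* ... so nobody else will clear the flag */
        atomic_store(&g->poll_scheduled, 0);
}
```

      Without the reset, a monitor created after teardown can never get its
      work queued, because the flag still reads 1 and the scheduling side
      returns early every time.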
      Link: http://lkml.kernel.org/r/1566357985-97781-1-git-send-email-joseph.qi@linux.alibaba.com
      Signed-off-by: Jason Xing <kerneljasonxing@linux.alibaba.com>
      Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com>
      Reviewed-by: Caspar Zhang <caspar@linux.alibaba.com>
      Reviewed-by: Suren Baghdasaryan <surenb@google.com>
      Acked-by: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: memcontrol: flush percpu vmevents before releasing memcg · bb65f89b
      Roman Gushchin authored
      Similar to vmstats, percpu caching of local vmevents leads to an
      accumulation of errors on non-leaf levels.  This happens because some
      leftovers may remain in percpu caches, so that they are never propagated
      up by the cgroup tree and just disappear into nonexistence when the
      memory cgroup is released.
      To fix this issue let's accumulate and propagate percpu vmevents values
      before releasing the memory cgroup similar to what we're doing with
      vmstats.
      Since on cpu hotplug we do flush percpu vmstats anyway, we can iterate
      only over online cpus.
      Link: http://lkml.kernel.org/r/20190819202338.363363-4-guro@fb.com
      Fixes: 42a30035 ("mm: memcontrol: fix recursive statistics correctness & scalabilty")
      Signed-off-by: Roman Gushchin <guro@fb.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: memcontrol: flush percpu vmstats before releasing memcg · c350a99e
      Roman Gushchin authored
      Percpu caching of local vmstats with the conditional propagation by the
      cgroup tree leads to an accumulation of errors on non-leaf levels.
      Let's imagine two nested memory cgroups A and A/B.  Say, a process
      belonging to A/B allocates 100 pagecache pages on the CPU 0.  The percpu
      cache will spill 3 times, so that 32*3=96 pages will be accounted to A/B
      and A atomic vmstat counters, and 4 pages will remain in the percpu cache.
      Imagine A/B is near memory.max, so that every following allocation
      triggers a direct reclaim on the local CPU.  Say, each such attempt will
      free 16 pages on a new cpu.  That means every percpu cache will have -16
      pages, except the first one, which will have 4 - 16 = -12.  A/B and A
      atomic counters will not be touched at all.
      Now a user removes A/B.  All percpu caches are freed and corresponding
      vmstat numbers are forgotten.  A has 96 pages more than expected.
      As memory cgroups are created and destroyed, errors do accumulate.  Even
      differences of 1-2 pages can accumulate into large numbers.
      To fix this issue let's accumulate and propagate percpu vmstat values
      before releasing the memory cgroup.  At this point these numbers are
      stable and cannot be changed.
      Since on cpu hotplug we do flush percpu vmstats anyway, we can iterate
      only over online cpus.
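      The arithmetic of the A and A/B example can be replayed with a small
      userspace model.  It is a sketch under stated assumptions, not the
      memcg code: memcg_sketch, mod_stat() and flush_percpu() are
      hypothetical names, the cache spills as soon as it reaches 32 pages
      (which reproduces the 3 spills above), and reclaim is assumed to
      eventually free all 100 pages so that the expected value for A is 0.

```c
#include <assert.h>
#include <stdlib.h>

#define NR_CPUS_SKETCH 8
#define BATCH 32 /* spill threshold assumed from the 32*3=96 example */

struct memcg_sketch {
    struct memcg_sketch *parent;
    long atomic_stat;                 /* stand-in for the atomic counter */
    long percpu_stat[NR_CPUS_SKETCH]; /* stand-in for the percpu cache */
};

/* Account delta pages on one CPU; spill to self and ancestors at BATCH. */
static void mod_stat(struct memcg_sketch *m, int cpu, long delta)
{
    assert(cpu >= 0 && cpu < NR_CPUS_SKETCH);
    m->percpu_stat[cpu] += delta;
    if (labs(m->percpu_stat[cpu]) >= BATCH) {
        for (struct memcg_sketch *p = m; p; p = p->parent)
            p->atomic_stat += m->percpu_stat[cpu];
        m->percpu_stat[cpu] = 0;
    }
}

/* The fix: fold percpu leftovers into the hierarchy before release. */
static void flush_percpu(struct memcg_sketch *m)
{
    long sum = 0;

    for (int cpu = 0; cpu < NR_CPUS_SKETCH; cpu++) {
        sum += m->percpu_stat[cpu];
        m->percpu_stat[cpu] = 0;
    }
    for (struct memcg_sketch *p = m; p; p = p->parent)
        p->atomic_stat += sum;
}

/* Replay the scenario; return A's counter once A/B has been removed. */
static long replay(int flush_before_release)
{
    struct memcg_sketch a = { 0 };
    struct memcg_sketch b = { .parent = &a };

    for (int i = 0; i < 100; i++)
        mod_stat(&b, 0, 1);          /* 100 pagecache pages on CPU 0 */
    for (int cpu = 0; cpu < 6; cpu++)
        mod_stat(&b, cpu, -16);      /* each reclaim frees 16 pages */
    mod_stat(&b, 6, -4);             /* assumed: the last 4 pages go too */
    if (flush_before_release)
        flush_percpu(&b);
    return a.atomic_stat;            /* b's percpu caches are dropped here */
}
```

      In this model replay(0) leaves A with the 96 stale pages from the
      example, while replay(1) brings A back to 0.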
      Link: http://lkml.kernel.org/r/20190819202338.363363-2-guro@fb.com
      Fixes: 42a30035 ("mm: memcontrol: fix recursive statistics correctness & scalabilty")
      Signed-off-by: Roman Gushchin <guro@fb.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • parisc: fix compilation errrors · bbcb03a9
      Qian Cai authored
      Commit 0cfaee2a ("include/asm-generic/5level-fixup.h: fix variable
      'p4d' set but not used") converted a few functions from macros to static
      inline, which causes parisc to complain,
        In file included from include/asm-generic/4level-fixup.h:38:0,
                         from arch/parisc/include/asm/pgtable.h:5,
                         from arch/parisc/include/asm/io.h:6,
                         from include/linux/io.h:13,
                         from sound/core/memory.c:9:
        include/asm-generic/5level-fixup.h:14:18: error: unknown type name 'pgd_t'; did you mean 'pid_t'?
         #define p4d_t    pgd_t
        include/asm-generic/5level-fixup.h:24:28: note: in expansion of macro 'p4d_t'
         static inline int p4d_none(p4d_t p4d)
      It is because "4level-fixup.h" is included before "asm/page.h" where
      "pgd_t" is defined.
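      The ordering requirement can be reduced to a few lines.  This is a
      simplified stand-in for the two headers, not their actual contents:
      the pgd_t layout is invented for illustration, and the old macro is
      shown only in a comment in roughly the shape it had.

```c
/* Stand-in for asm/page.h: the type must be visible first. */
typedef struct { unsigned long pgd; } pgd_t;

/* Stand-in for the relevant part of 5level-fixup.h. */
#define p4d_t pgd_t

/*
 * A macro such as
 *     #define p4d_none(p4d) 0
 * never names p4d_t at the point of definition, so it compiles even when
 * pgd_t is declared later.  A static inline function does name the type
 * in its signature, so pgd_t has to be known by the time this line is
 * parsed.
 */
static inline int p4d_none(p4d_t p4d)
{
    (void)p4d; /* a folded p4d level is never "none" in this sketch */
    return 0;
}
```

      Moving the typedef below the function reproduces the "unknown type
      name" error quoted above.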
      Link: http://lkml.kernel.org/r/20190815205305.1382-1-cai@lca.pw
      Fixes: 0cfaee2a ("include/asm-generic/5level-fixup.h: fix variable 'p4d' set but not used")
      Signed-off-by: Qian Cai <cai@lca.pw>
      Reported-by: Guenter Roeck <linux@roeck-us.net>
      Tested-by: Guenter Roeck <linux@roeck-us.net>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm, page_alloc: move_freepages should not examine struct page of reserved memory · cd961038
      David Rientjes authored
      After commit 907ec5fc ("mm: zero remaining unavailable struct
      pages"), struct page of reserved memory is zeroed.  This causes
      page->flags to be 0 and fixes issues related to reading
      /proc/kpageflags, for example, of reserved memory.
      The VM_BUG_ON() in move_freepages_block(), however, assumes that
      page_zone() is meaningful even for reserved memory.  That assumption is
      no longer true after the aforementioned commit.
      There's no reason why move_freepages_block() should be testing the
      legitimacy of page_zone() for reserved memory; its scope is limited only
      to pages on the zone's freelist.
      Note that pfn_valid() can be true for reserved memory: there is a
      backing struct page.  The check for page_to_nid(page) is also buggy but
      reserved memory normally only appears on node 0 so the zeroing doesn't
      affect this.
      Move the debug checks to after verifying PageBuddy is true.  This
      isolates the scope of the checks to only be for buddy pages which are on
      the zone's freelist which move_freepages_block() is operating on.  In
      this case, an incorrect node or zone is a bug worthy of being warned
      about (and the examination of struct page is acceptable because this
      memory is not reserved).
      Why does move_freepages_block() get called on reserved memory? It's
      simply math after finding a valid free page from the per-zone free area
      to use as fallback.  We find the beginning and end of the pageblock of
      the valid page and that can bring us into memory that was reserved per
      the e820.  pfn_valid() is still true (it's backed by a struct page), but
      since it's zero'd we shouldn't make any inferences here about comparing
      its node or zone.  The current node check just happens to succeed most
      of the time by luck because reserved memory typically appears on node 0.
      The fix here is to validate that we actually have buddy pages before
      testing if there's any type of zone or node strangeness going on.
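      The reordering can be sketched as follows.  This is a simplified
      userspace model and not the page allocator: page_sketch and
      move_free_sketch() are hypothetical, each entry stands for a single
      page (buddy orders are ignored), and assert() plays the role of the
      debug check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page; reserved pages are all zero. */
struct page_sketch {
    bool buddy;  /* stand-in for PageBuddy() */
    int zone_id; /* stand-in for page_zone(); 0 when zeroed */
    int nid;     /* stand-in for page_to_nid(); 0 when zeroed */
};

/* Walk one pageblock and count the free pages that would be moved. */
static size_t move_free_sketch(const struct page_sketch *pages, size_t n,
                               int zone_id, int nid)
{
    size_t moved = 0;

    for (size_t i = 0; i < n; i++) {
        const struct page_sketch *page = &pages[i];

        if (!page->buddy)
            continue; /* reserved or in use: do not inspect it further */

        /* Only for buddy pages is a zone or node mismatch a real bug. */
        assert(page->zone_id == zone_id);
        assert(page->nid == nid);
        moved++;
    }
    return moved;
}
```

      With the checks placed before the buddy test, the zeroed entry in the
      middle of a block belonging to zone 1 would trip the assertion even
      though it is never moved.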
      We noticed it almost immediately after bringing 907ec5fc in on
      CONFIG_DEBUG_VM builds.  It depends on finding specific free pages in
      the per-zone free area where the math in move_freepages() will bring the
      start or end pfn into reserved memory and wanting to claim that entire
      pageblock as a new migratetype.  So the path will be rare, require
      CONFIG_DEBUG_VM, and require fallback to a different migratetype.
      Some struct pages were already zeroed from reserve pages before
      907ec5fca3c so it theoretically could trigger before this commit.  I
      think it's rare enough under a config option that most people don't run
      that others may not have noticed.  I wouldn't argue against a stable tag
      and the backport should be easy enough, but probably wouldn't single out
      a commit that this is fixing.
      Mel said:
      : The overhead of the debugging check is higher with this patch although
      : it'll only affect debug builds and the path is not particularly hot.
      : If this was a concern, I think it would be reasonable to simply remove
      : the debugging check as the zone boundaries are checked in
      : move_freepages_block and we never expect a zone/node to be smaller than
      : a pageblock and stuck in the middle of another zone.
      Link: http://lkml.kernel.org/r/alpine.DEB.2.21.1908122036560.10779@chino.kir.corp.google.com
      Signed-off-by: David Rientjes <rientjes@google.com>
      Acked-by: Mel Gorman <mgorman@techsingularity.net>
      Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Cc: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com>
      Cc: Oscar Salvador <osalvador@suse.de>
      Cc: Pavel Tatashin <pavel.tatashin@microsoft.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm/z3fold.c: fix race between migration and destruction · d776aaa9
      Henry Burns authored
      In z3fold_destroy_pool() we call destroy_workqueue(&pool->compact_wq).
      However, we have no guarantee that migration isn't happening in the
      background at that time.
      Migration directly calls queue_work_on(pool->compact_wq); if destruction
      wins that race we are using a destroyed workqueue.
      Link: http://lkml.kernel.org/r/20190809213828.202833-1-henryburns@google.com
      Signed-off-by: Henry Burns <henryburns@google.com>
      Cc: Vitaly Wool <vitalywool@gmail.com>
      Cc: Shakeel Butt <shakeelb@google.com>
      Cc: Jonathan Adams <jwadams@google.com>
      Cc: Henry Burns <henrywolfeburns@gmail.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  2. 24 Aug, 2019 7 commits
  3. 23 Aug, 2019 16 commits
    • Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma · 9140d8bd
      Linus Torvalds authored
      Pull rdma fixes from Doug Ledford:
       "No beating around the bush: this is a monster pull request for an -rc5
        kernel. Intel hit me with a series of fixes for TID processing.
        Mellanox hit me with a series for their UMR memory support.
        And we had one fix for siw that fixes the 32bit build warnings and
        because of the number of casts that had to be changed to properly
        silence the warnings, that one patch alone is a full 40% of the LOC of
        this entire pull request. Given that this is the initial release
        kernel for siw, I'm trying to fix anything in it that we can, so that
        adds to the impetus to take fixes for it like this one.
        I had to do a rebase early in the week. Jason had thought he put a
        patch on the rc queue that he needed to be there so he could base some
        work off of it, and it had actually not been placed there. So he asked
        me (on Tuesday) to fix that up before pushing my wip branch to the
        official rc branch. I did, and that's why the early patches look like
        they were all committed at the same time on Tuesday. That bunch had
        been in my queue prior.
        The various patches all pass my test for being legitimate fixes and
        not attempts to slide new features or development into a late rc.
        Well, they were all fixes with the exception of a couple clean up
        patches people wrote for making the fixes they also wrote better (like
        a cleanup patch to move UMR checking into a function so that the
        remaining UMR fix patches can reference that function), so I left
        those in place too.
        My apologies for the LOC count and the number of patches here, it's
        just how the cards fell this cycle.
         - Fix siw buffer mapping issue
         - Fix siw 32/64 casting issues
         - Fix a KASAN access issue in bnxt_re
         - Fix several memory leaks (hfi1, mlx4)
         - Fix a NULL deref in cma_cleanup
         - Fixes for UMR memory support in mlx5 (4 patch series)
         - Fix namespace check for restrack
         - Fixes for counter support
         - Fixes for hfi1 TID processing (5 patch series)
         - Fix potential NULL deref in siw
         - Fix memory page calculations in mlx5"
      * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (21 commits)
        RDMA/siw: Fix 64/32bit pointer inconsistency
        RDMA/siw: Fix SGL mapping issues
        RDMA/bnxt_re: Fix stack-out-of-bounds in bnxt_qplib_rcfw_send_message
        infiniband: hfi1: fix memory leaks
        infiniband: hfi1: fix a memory leak bug
        IB/mlx4: Fix memory leaks
        RDMA/cma: fix null-ptr-deref Read in cma_cleanup
        IB/mlx5: Block MR WR if UMR is not possible
        IB/mlx5: Fix MR re-registration flow to use UMR properly
        IB/mlx5: Report and handle ODP support properly
        IB/mlx5: Consolidate use_umr checks into single function
        RDMA/restrack: Rewrite PID namespace check to be reliable
        RDMA/counters: Properly implement PID checks
        IB/core: Fix NULL pointer dereference when bind QP to counter
        IB/hfi1: Drop stale TID RDMA packets that cause TIDErr
        IB/hfi1: Add additional checks when handling TID RDMA WRITE DATA packet
        IB/hfi1: Add additional checks when handling TID RDMA READ RESP packet
        IB/hfi1: Unsafe PSN checking for TID RDMA READ Resp packet
        IB/hfi1: Drop stale TID RDMA packets
        RDMA/siw: Fix potential NULL de-ref
    • Merge tag 'for-linus-20190823' of git://git.kernel.dk/linux-block · b9bd6806
      Linus Torvalds authored
      Pull block fixes from Jens Axboe:
       "Here's a set of fixes that should go into this release. This contains:
         - Three minor fixes for NVMe.
         - Three minor tweaks for the io_uring polling logic.
         - Officially mark Song as the MD maintainer, after he's been filling
           that role successfully for the last 6 months or so"
      * tag 'for-linus-20190823' of git://git.kernel.dk/linux-block:
        io_uring: add need_resched() check in inner poll loop
        md: update MAINTAINERS info
        io_uring: don't enter poll loop if we have CQEs pending
        nvme: Add quirk for LiteON CL1 devices running FW 22301111
        nvme: Fix cntlid validation when not using NVMEoF
        nvme-multipath: fix possible I/O hang when paths are updated
        io_uring: fix potential hang with polled IO
    • Merge tag 'for-5.3/dm-fixes-2' of... · dd469a45
      Linus Torvalds authored
      Merge tag 'for-5.3/dm-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm
      Pull device mapper fixes from Mike Snitzer:
       - Revert a DM bufio change from during the 5.3 merge window now that a
         proper fix has been made to the block loopback driver.
       - Fix DM kcopyd to wakeup so failed subjobs get completed.
       - Various fixes to DM zoned target to address error handling, and other
         small tweaks (SPDX license identifiers and fix typos).
       - Fix DM integrity range locking race by tracking whether journal has
       - Fix DM dust target to detect reads of badblocks beyond the first 512b
         sector (applicable if blocksize is larger than 512b).
       - Fix DM persistent-data issue in both the DM btree and DM
         space-map-metadata interfaces.
       - Fix out of bounds memory access with certain DM table configurations.
      * tag 'for-5.3/dm-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
        dm table: fix invalid memory accesses with too high sector number
        dm space map metadata: fix missing store of apply_bops() return value
        dm btree: fix order of block initialization in btree_split_beneath
        dm raid: add missing cleanup in raid_ctr()
        dm zoned: fix potential NULL dereference in dmz_do_reclaim()
        dm dust: use dust block size for badblocklist index
        dm integrity: fix a crash due to BUG_ON in __journal_read_write()
        dm zoned: fix a few typos
        dm zoned: add SPDX license identifiers
        dm zoned: properly handle backing device failure
        dm zoned: improve error handling in i/o map code
        dm zoned: improve error handling in reclaim
        dm kcopyd: always complete failed jobs
        Revert "dm bufio: fix deadlock with loop device"
    • Merge tag 'xfs-5.3-fixes-4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux · f576518c
      Linus Torvalds authored
      Pull xfs fixes from Darrick Wong:
       "Here are a few more bug fixes that trickled in since the last pull.
        They've survived the usual xfstests runs and merge cleanly with this
        morning's master.
        I expect there to be one more pull request tomorrow for the fix to
        that quota related inode unlock bug that we were reviewing last night,
        but it will continue to soak in the testing machine for several more
         - Fix missing compat ioctl handling for get/setlabel
         - Fix missing ioctl pointer sanitization on s390
         - Fix a page locking deadlock in the dedupe comparison code
         - Fix inadequate locking in reflink code w.r.t. concurrent directio
         - Fix broken error detection when breaking layouts"
      * tag 'xfs-5.3-fixes-4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
        fs/xfs: Fix return code of xfs_break_leased_layouts()
        xfs: fix reflink source file racing with directio writes
        vfs: fix page locking deadlocks when deduping files
        xfs: compat_ioctl: use compat_ptr()
        xfs: fall back to native ioctls for unhandled compat ones
    • KVM: arm/arm64: VGIC: Properly initialise private IRQ affinity · 2e16f3e9
      Andre Przywara authored
      At the moment we initialise the target *mask* of a virtual IRQ to the
      VCPU it belongs to, even though this mask is only defined for GICv2 and
      quickly runs out of bits for many GICv3 guests.
      This behaviour triggers an UBSAN complaint for more than 32 VCPUs:
      [ 5659.462377] UBSAN: Undefined behaviour in virt/kvm/arm/vgic/vgic-init.c:223:21
      [ 5659.471689] shift exponent 32 is too large for 32-bit type 'unsigned int'
      Also for GICv3 guests the reporting of TARGET in the "vgic-state" debugfs
      dump is wrong, due to this very same problem.
      Because there is no requirement to create the VGIC device before the
      VCPUs (and QEMU actually does it the other way round), we can't safely
      initialise mpidr or targets in kvm_vgic_vcpu_init(). But since we touch
      every private IRQ for each VCPU anyway later (in vgic_init()), we can
      just move the initialisation of those fields into there, where we
      definitely know the VGIC type.
      On the way make sure we really have either a VGICv2 or a VGICv3 device,
      since the existing code is just checking for "VGICv3 or not", silently
      ignoring the uninitialised case.
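      A sketch of the resulting initialisation logic, for illustration
      only.  The enum, struct and function names are hypothetical and the
      field widths are assumptions; the point is that the bit mask is built
      only for a GICv2 model, where the CPU index is small enough to shift
      safely, and that an unset device type is reported instead of being
      treated as "not GICv3".

```c
#include <assert.h>
#include <stdint.h>

enum vgic_type_sketch { VGIC_NONE, VGIC_V2, VGIC_V3 };

/* Hypothetical stand-in for the per-IRQ affinity fields. */
struct irq_sketch {
    uint8_t targets; /* GICv2: one bit per target CPU */
    uint32_t mpidr;  /* GICv3: affinity value of the target VCPU */
};

/* Returns 0 on success, -1 when no VGIC model has been selected yet. */
static int init_private_irq(struct irq_sketch *irq,
                            enum vgic_type_sketch type,
                            unsigned int vcpu_id, uint32_t vcpu_mpidr)
{
    switch (type) {
    case VGIC_V2:
        assert(vcpu_id < 8); /* GICv2 addresses at most 8 CPUs */
        irq->targets = (uint8_t)(1U << vcpu_id);
        irq->mpidr = 0;
        return 0;
    case VGIC_V3:
        irq->targets = 0;    /* no shift at all, so no UB for id >= 32 */
        irq->mpidr = vcpu_mpidr;
        return 0;
    default:
        return -1;           /* uninitialised device type */
    }
}
```

      Called for VCPU 40 of a GICv3 guest, the function never evaluates a
      shift, which is the case UBSAN complained about.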
      Signed-off-by: Andre Przywara <andre.przywara@arm.com>
      Reported-by: Dave Martin <dave.martin@arm.com>
      Tested-by: Julien Grall <julien.grall@arm.com>
      Signed-off-by: Marc Zyngier <maz@kernel.org>
    • Merge tag 'modules-for-v5.3-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux · e3fb13b7
      Linus Torvalds authored
      Pull modules fixes from Jessica Yu:
       "Fix BUG_ON() being triggered in frob_text() due to non-page-aligned
        module sections"
      * tag 'modules-for-v5.3-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux:
        modules: page-align module section allocations only for arches supporting strict module rwx
        modules: always page-align module section allocations
    • Merge tag 'ceph-for-5.3-rc6' of git://github.com/ceph/ceph-client · 4e563944
      Linus Torvalds authored
      Pull ceph fixes from Ilya Dryomov:
       "Three important fixes tagged for stable (an indefinite hang, a crash
        on an assert and a NULL pointer dereference) plus a small series from
        Luis fixing instances of vfree() under spinlock"
      * tag 'ceph-for-5.3-rc6' of git://github.com/ceph/ceph-client:
        libceph: fix PG split vs OSD (re)connect race
        ceph: don't try fill file_lock on unsuccessful GETFILELOCK reply
        ceph: clear page dirty before invalidate page
        ceph: fix buffer free while holding i_ceph_lock in fill_inode()
        ceph: fix buffer free while holding i_ceph_lock in __ceph_build_xattrs_blob()
        ceph: fix buffer free while holding i_ceph_lock in __ceph_setxattr()
        libceph: allow ceph_buffer_put() to receive a NULL ceph_buffer
    • RDMA/siw: Fix 64/32bit pointer inconsistency · c536277e
      Bernard Metzler authored
      Fixes improper casting between addresses and unsigned types.
      Changes siw_pbl_get_buffer() function to return appropriate
      dma_addr_t, and not u64.
      Also fixes debug prints. Now any potentially kernel private
      pointers are printed formatted as '%pK', to allow keeping that
      information secret.
      Fixes: d941bfe500be ("RDMA/siw: Change CQ flags from 64->32 bits")
      Fixes: b0fff731 ("rdma/siw: completion queue methods")
      Fixes: 8b6a361b ("rdma/siw: receive path")
      Fixes: b9be6f18 ("rdma/siw: transmit path")
      Fixes: f29dd55b ("rdma/siw: queue pair methods")
      Fixes: 2251334d ("rdma/siw: application buffer management")
      Fixes: 303ae1cd ("rdma/siw: application interface")
      Fixes: 6c52fdc2 ("rdma/siw: connection management")
      Fixes: a5319752 ("rdma/siw: main include file")
      Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
      Reported-by: Jason Gunthorpe <jgg@ziepe.ca>
      Reported-by: Leon Romanovsky <leon@kernel.org>
      Signed-off-by: Bernard Metzler <bmt@zurich.ibm.com>
      Link: https://lore.kernel.org/r/20190822173738.26817-1-bmt@zurich.ibm.com
      Signed-off-by: Doug Ledford <dledford@redhat.com>
    • Merge tag 'drm-fixes-2019-08-23' of git://anongit.freedesktop.org/drm/drm · 1374a22e
      Linus Torvalds authored
      Pull drm fixes from Dave Airlie:
       "Live from the laundromat after my washing machine broke down, we have
        the 5.3-rc6 fixes. Changelog is in the tag below, but nothing too
        noteworthy in here:
         - LVDS dual-link mode fix
         - of node refcount fix
         - prime buffer import fix
         - dma max seg fix
         - output polling fix
         - afbc format fix
         - memory-region DT fix
         - bpc display fix
         - ioctl memory leak fix
         - gfxoff fix
         - smu warnings fix
         - HDMI mode readout fix"
      * tag 'drm-fixes-2019-08-23' of git://anongit.freedesktop.org/drm/drm:
        drm/amdgpu/powerplay: silence a warning in smu_v11_0_setup_pptable
        drm/amd/display: Calculate bpc based on max_requested_bpc
        drm/amdgpu: prevent memory leaks in AMDGPU_CS ioctl
        drm/amd/amdgpu: disable MMHUB PG for navi10
        drm/amd/powerplay: remove duplicate macro smu_get_uclk_dpm_states in amdgpu_smu.h
        drm/amd/powerplay: fix variable type errors in smu_v11_0_setup_pptable
        drm/amdgpu/gfx9: update pg_flags after determining if gfx off is possible
        drm/i915: Fix HW readout for crtc_clock in HDMI mode
        drm/mediatek: mtk_drm_drv.c: Add of_node_put() before goto
        drm: rcar_lvds: Fix dual link mode operations
        drm/mediatek: set DMA max segment size
        drm/mediatek: use correct device to import PRIME buffers
        drm/omap: ensure we have a valid dma_mask
        drm/komeda: Add support for 'memory-region' DT node property
        drm/komeda: Adds internal bpp computing for arm afbc only format YU08 YU10
        drm/komeda: Initialize and enable output polling on Komeda
    • dm table: fix invalid memory accesses with too high sector number · 1cfd5d33
      Mikulas Patocka authored
      If the sector number is too high, dm_table_find_target() should return a
      pointer to a zeroed dm_target structure (the caller should test it with
      However, for some table sizes, the code in dm_table_find_target() that
      performs btree lookup will access out of bound memory structures.
      Fix this bug by testing the sector number at the beginning of
      dm_table_find_target(). Also, add an "inline" keyword to the function
      dm_table_get_size() because this is a hot path.
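      The shape of the fix, as a hedged sketch.  table_sketch and
      find_target() are hypothetical; a flat sorted array with a binary
      search is swapped in for the real btree lookup, and the out-of-range
      case returns -1 rather than a pointer to a zeroed dm_target.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a DM table. */
struct table_sketch {
    size_t num_targets;
    const uint64_t *highs; /* highs[i]: last sector of target i, ascending */
};

/* Number of sectors covered by the table. */
static inline uint64_t table_size(const struct table_sketch *t)
{
    return t->num_targets ? t->highs[t->num_targets - 1] + 1 : 0;
}

/* Index of the target covering sector, or -1 if it is beyond the table. */
static long find_target(const struct table_sketch *t, uint64_t sector)
{
    /* Test the sector first, before any lookup touches the index. */
    if (sector >= table_size(t))
        return -1;

    size_t lo = 0, hi = t->num_targets - 1;

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (t->highs[mid] >= sector)
            hi = mid;
        else
            lo = mid + 1;
    }
    return (long)lo;
}
```

      Because the bound is checked up front, the search only ever runs for
      sectors that some target is guaranteed to cover.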
      Fixes: 512875bd ("dm: table detect io beyond device")
      Cc: stable@vger.kernel.org
      Reported-by: Zhang Tao <kontais@zoho.com>
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
    • xfs: fix missing ILOCK unlock when xfs_setattr_nonsize fails due to EDQUOT · 1fb254aa
      Darrick J. Wong authored
      Benjamin Moody reported to Debian that XFS partially wedges when a chgrp
      fails on account of being out of disk quota.  I ran his reproducer
      # adduser dummy
      # adduser dummy plugdev
      # dd if=/dev/zero bs=1M count=100 of=test.img
      # mkfs.xfs test.img
      # mount -t xfs -o gquota test.img /mnt
      # mkdir -p /mnt/dummy
      # chown -c dummy /mnt/dummy
      # xfs_quota -xc 'limit -g bsoft=100k bhard=100k plugdev' /mnt
      (and then as user dummy)
      $ dd if=/dev/urandom bs=1M count=50 of=/mnt/dummy/foo
      $ chgrp plugdev /mnt/dummy/foo
      and saw:
      WARNING: lock held when returning to user space!
      5.3.0-rc5 #rc5 Tainted: G        W
      chgrp/47006 is leaving the kernel with locks still held!
      1 lock held by chgrp/47006:
       #0: 000000006664ea2d (&xfs_nondir_ilock_class){++++}, at: xfs_ilock+0xd2/0x290 [xfs]
      ...which is clearly caused by xfs_setattr_nonsize failing to unlock the
      ILOCK after the xfs_qm_vop_chown_reserve call fails.  Add the missing
      unlock.
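      The class of bug is an error branch that returns while still holding
      a lock.  A minimal sketch, with hypothetical names throughout: a
      counter stands in for the ILOCK, quota_reserve() stands in for the
      reservation call, and -1 stands in for -EDQUOT.  The real function's
      success path is more involved than shown here.

```c
#include <assert.h>

/* Hypothetical stand-in for an inode with a counted lock. */
struct inode_sketch {
    int ilock_depth;
};

static void ilock(struct inode_sketch *ip)
{
    ip->ilock_depth++;
}

static void iunlock(struct inode_sketch *ip)
{
    assert(ip->ilock_depth > 0);
    ip->ilock_depth--;
}

/* Stand-in for the quota reservation; fails when over quota. */
static int quota_reserve(int over_quota)
{
    return over_quota ? -1 : 0;
}

static int setattr_sketch(struct inode_sketch *ip, int over_quota)
{
    int error;

    ilock(ip);
    error = quota_reserve(over_quota);
    if (error)
        goto out_unlock; /* the failing branch must drop the lock too */

    /* ... apply the ownership change here ... */

out_unlock:
    iunlock(ip);
    return error;
}
```

      After either outcome the lock depth is back at zero, which is the
      invariant the "lock held when returning to user space" warning
      checks.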
      Reported-by: <benjamin.moody@gmail.com>
      Fixes: 253f4911 ("xfs: better xfs_trans_alloc interface")
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
      Tested-by: Salvatore Bonaccorso <carnil@debian.org>
    • drm/nouveau: Don't retry infinitely when receiving no data on i2c over AUX · c358ebf5
      Lyude Paul authored
      While I had thought I had fixed this issue in:
      commit 342406e4
       ("drm/nouveau/i2c: Disable i2c bus access after
      It turns out that while I did fix the error messages I was seeing on my
      P50 when trying to access i2c busses with the GPU in runtime suspend, I
      accidentally had missed one important detail that was mentioned on the
      bug report this commit was supposed to fix: that the CPU would only lock
      up when trying to access i2c busses _on connected devices_ _while the
      GPU is not in runtime suspend_. Whoops. That definitely explains why I
      was not able to get my machine to hang with i2c bus interactions until
      now, as plugging my P50 into its dock with an HDMI monitor connected
      allowed me to finally reproduce this locally.
      Now that I have managed to reproduce this issue properly, it looks like
      the problem is much simpler than it looks. It turns out that some
      connected devices, such as MST laptop docks, will actually ACK i2c reads
      even if no data was actually read:
      [  275.063043] nouveau 0000:01:00.0: i2c: aux 000a: 1: 0000004c 1
      [  275.063447] nouveau 0000:01:00.0: i2c: aux 000a: 00 01101000 10040000
      [  275.063759] nouveau 0000:01:00.0: i2c: aux 000a: rd 00000001
      [  275.064024] nouveau 0000:01:00.0: i2c: aux 000a: rd 00000000
      [  275.064285] nouveau 0000:01:00.0: i2c: aux 000a: rd 00000000
      [  275.064594] nouveau 0000:01:00.0: i2c: aux 000a: rd 00000000
      Because we don't handle the situation of i2c ack without any data, we
      end up entering an infinite loop in nvkm_i2c_aux_i2c_xfer() since the
      value of cnt always remains at 0. This finally properly explains how
      this could result in a CPU hang like the ones observed in the
      aforementioned commit.
      So, fix this by retrying transactions if no data is written or received,
      and give up and fail the transaction if we continue to not write or
      receive any data after 32 retries.
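      The bounded retry can be sketched like this.  It is an illustration
      under assumptions, not the nouveau code: xfer_fn, xfer_all() and the
      two demo callbacks are hypothetical, and the sketch counts 32
      consecutive empty transfers before failing, which may differ in
      detail from how the driver counts its retries.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_EMPTY_XFERS 32 /* retry budget named in the text above */

/* Hypothetical callback: bytes moved, 0 for an ACK without data, <0 error. */
typedef int (*xfer_fn)(void *ctx, size_t want);

/* Move `remaining` bytes, giving up after too many empty transfers. */
static int xfer_all(xfer_fn xfer, void *ctx, size_t remaining)
{
    int empty = 0;

    assert(xfer != NULL);
    while (remaining > 0) {
        int got = xfer(ctx, remaining);

        if (got < 0)
            return got;            /* hard error from the transport */
        if (got == 0) {
            if (++empty >= MAX_EMPTY_XFERS)
                return -1;         /* no progress: fail instead of spinning */
            continue;
        }
        if ((size_t)got > remaining)
            got = (int)remaining;  /* never count more than was asked for */
        empty = 0;                 /* progress resets the budget */
        remaining -= (size_t)got;
    }
    return 0;
}

/* Demo device that ACKs every transfer without data; counts the calls. */
static int xfer_empty(void *ctx, size_t want)
{
    (void)want;
    ++*(int *)ctx;
    return 0;
}

/* Demo device that moves one byte per transfer; counts the calls. */
static int xfer_one(void *ctx, size_t want)
{
    (void)want;
    ++*(int *)ctx;
    return 1;
}
```

      Without the `empty` counter, the loop condition depends solely on
      `remaining`, which never changes for a device that behaves like
      xfer_empty().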
      Signed-off-by: Lyude Paul <lyude@redhat.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
    • drm/amdgpu/powerplay: silence a warning in smu_v11_0_setup_pptable · 75710f08
      Alex Deucher authored
      I think gcc is confused as I don't see how size could be used
      uninitialized, but go ahead and silence the warning.
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
      Reviewed-by: Evan Quan <evan.quan@amd.com>
      Signed-off-by: Dave Airlie <airlied@redhat.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190822032527.1376-1-alexander.deucher@amd.com
    • Merge tag 'drm-misc-fixes-2019-08-22' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes · cf3627fb
      Dave Airlie authored
      Fixes for v5.3-rc6:
      - dma fix for omap.
      - Make output polling work on komeda.
      - Fix bpp computing for AFBC formats in komeda.
      - Support the memory-region property in komeda.
      Signed-off-by: Dave Airlie <airlied@redhat.com>
      From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/5f1fdfe3-814e-fad1-663c-7279217fc085@linux.intel.com
    • Dave Airlie's avatar
      Merge tag 'drm-intel-fixes-2019-08-22' of... · dd89c112
      Dave Airlie authored
      Merge tag 'drm-intel-fixes-2019-08-22' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes
      drm/i915 fixes for v5.3-rc6:
      - fix hardware state readout for 10 bpc HDMI
      Signed-off-by: Dave Airlie <airlied@redhat.com>
      From: Jani Nikula <jani.nikula@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/87sgptd114.fsf@intel.com
  4. 22 Aug, 2019 11 commits
    • Jens Axboe's avatar
      io_uring: add need_resched() check in inner poll loop · 08f5439f
      Jens Axboe authored
      The outer poll loop checks whether we need to reschedule, and
      returns to userspace if we do. However, it's possible to get stuck
      in the inner loop as well, if the CPU we are running on needs to
      reschedule to finish the IO work.
      Add the need_resched() check in the inner loop as well. This fixes
      a potential hang if the kernel is configured with
      Reported-by: Sagi Grimberg <sagi@grimberg.me>
      Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
      Tested-by: Sagi Grimberg <sagi@grimberg.me>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • Linus Torvalds's avatar
      Merge tag 'pci-v5.3-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci · 59c36bc8
      Linus Torvalds authored
      Pull PCI fixes from Bjorn Helgaas:
       - Reset both NVIDIA GPU and HDA in ThinkPad P50 quirk, which was broken
         by another quirk that enabled the HDA device (Lyude Paul)
       - Fix pciebus-howto.rst documentation filename typo (Bjorn Helgaas)
      * tag 'pci-v5.3-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
        Documentation PCI: Fix pciebus-howto.rst filename typo
        PCI: Reset both NVIDIA GPU and HDA in ThinkPad P50 workaround
    • ZhangXiaoxu's avatar
      dm space map metadata: fix missing store of apply_bops() return value · ae148243
      ZhangXiaoxu authored
      In commit 6096d91a ("dm space map metadata: fix occasional leak
      of a metadata block on resize"), the commit logic was refactored into a
      new function, apply_bops().  But when that logic was replaced in out(),
      the return value was not stored.  This may lead to out() returning the
      wrong value to the caller.
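A simplified model of this class of bug, assuming a helper that can fail: `apply_ops`, `out_buggy` and `out_fixed` are hypothetical stand-ins and do not reproduce the dm code.

```c
#include <errno.h>

/* Hypothetical helper standing in for the refactored commit logic:
 * fails with -EIO when asked to. */
int apply_ops(int fail)
{
    return fail ? -EIO : 0;
}

/* Buggy shape: the helper's result is dropped, so the caller always
 * sees the initial value of r. */
int out_buggy(int fail)
{
    int r = 0;

    apply_ops(fail);
    return r;
}

/* Fixed shape: the result is stored and propagated. */
int out_fixed(int fail)
{
    int r = 0;

    r = apply_ops(fail);
    return r;
}
```

In this model `out_buggy(1)` reports success even though the helper failed, while `out_fixed(1)` returns the error.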
      Fixes: 6096d91a ("dm space map metadata: fix occasional leak of a metadata block on resize")
      Cc: stable@vger.kernel.org
      Signed-off-by: ZhangXiaoxu <zhangxiaoxu5@huawei.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
    • ZhangXiaoxu's avatar
      dm btree: fix order of block initialization in btree_split_beneath · e4f9d601
      ZhangXiaoxu authored
      When btree_split_beneath() splits a node into two new children, it
      allocates two blocks: left and right.  If the right block's allocation
      fails, the left block is unlocked and marked dirty.  When that
      happens, the left block's content is zero, because it wasn't
      initialized with the btree struct before the attempt to allocate the
      right block.  Upon return, when the left block is flushed to disk, the
      validator fails when checking this block, and a BUG_ON is raised.
      Fix this by completely initializing the left block before allocating and
      initializing the right block.
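The ordering issue can be illustrated with a toy model. Everything here is hypothetical (`struct block`, `new_block`, `init_node` and the two split functions); it only shows why initializing the left block before the second allocation leaves it valid on the error path.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a metadata block. */
struct block {
    bool has_header;    /* true once a valid node header was written */
};

/* Toy allocator: hands out a zeroed block, or fails on request. */
int new_block(struct block *b, bool fail)
{
    if (fail)
        return -ENOMEM;
    b->has_header = false;
    return 0;
}

void init_node(struct block *b)
{
    b->has_header = true;
}

/* Old order: allocate both blocks first, initialize afterwards. */
int split_alloc_first(struct block *left, struct block *right,
                      bool fail_right)
{
    int r = new_block(left, false);

    if (r)
        return r;
    r = new_block(right, fail_right);
    if (r)
        return r;       /* left is released while still zeroed */
    init_node(left);
    init_node(right);
    return 0;
}

/* Fixed order: finish the left block before touching the right one. */
int split_init_first(struct block *left, struct block *right,
                     bool fail_right)
{
    int r = new_block(left, false);

    if (r)
        return r;
    init_node(left);
    r = new_block(right, fail_right);
    if (r)
        return r;       /* left already carries a valid header */
    init_node(right);
    return 0;
}
```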
      Fixes: 4dcb8b57 ("dm btree: fix leak of bufio-backed block in btree_split_beneath error path")
      Cc: stable@vger.kernel.org
      Signed-off-by: ZhangXiaoxu <zhangxiaoxu5@huawei.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
    • Linus Torvalds's avatar
      Merge tag 'Wimplicit-fallthrough-5.3-rc6' of... · 20eabc89
      Linus Torvalds authored
      Merge tag 'Wimplicit-fallthrough-5.3-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux
      Pull more fallthrough fixes from Gustavo A. R. Silva:
       "Fix fall-through warnings on arm and mips for multiple configurations"
      * tag 'Wimplicit-fallthrough-5.3-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux:
        video: fbdev: acornfb: Mark expected switch fall-through
        scsi: libsas: sas_discover: Mark expected switch fall-through
        MIPS: Octeon: Mark expected switch fall-through
        power: supply: ab8500_charger: Mark expected switch fall-through
        watchdog: wdt285: Mark expected switch fall-through
        mtd: sa1100: Mark expected switch fall-through
        drm/sun4i: tcon: Mark expected switch fall-through
        drm/sun4i: sun6i_mipi_dsi: Mark expected switch fall-through
        ARM: riscpc: Mark expected switch fall-through
        dmaengine: fsldma: Mark expected switch fall-through
    • Linus Torvalds's avatar
      Merge tag 'tag-chrome-platform-fixes-for-v5.3-rc6' of... · e5b7c167
      Linus Torvalds authored
      Merge tag 'tag-chrome-platform-fixes-for-v5.3-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/chrome-platform/linux
      Pull chrome platform fix from Benson Leung:
       "Fix a kernel crash during suspend/resume of cros_ec_ishtp"
      * tag 'tag-chrome-platform-fixes-for-v5.3-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/chrome-platform/linux:
        platform/chrome: cros_ec_ishtp: fix crash during suspend
    • Linus Torvalds's avatar
      Merge tag 'afs-fixes-20190822' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs · e8c3fa9f
      Linus Torvalds authored
      Pull AFS fixes from David Howells:
       - Fix a cell record leak due to the default error not being cleared.
       - Fix an oops in tracepoint due to a pointer that may contain an error.
       - Fix the ACL storage op for YFS where the wrong op definition is being
         used. By luck, this only actually affects the information appearing
         in traces.
      * tag 'afs-fixes-20190822' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
        afs: use correct afs_call_type in yfs_fs_store_opaque_acl2
        afs: Fix possible oops in afs_lookup trace event
        afs: Fix leak in afs_lookup_cell_rcu()
    • Bernard Metzler's avatar
      RDMA/siw: Fix SGL mapping issues · fab4f97e
      Bernard Metzler authored
      All user level and most in-kernel applications submit WQEs
      where the SG list entries are all of a single type.
      iSER in particular, however, will send us WQEs with mixed SG
      types: sge[0] = kernel buffer, sge[1] = PBL region.
      Check and set is_kva on each SG entry individually instead of
      assuming the first SGE type carries through to the last.
      This fixes iSER over siw.
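A minimal sketch of the difference between classifying a whole SG list from its first entry and checking each entry. The `enum sge_kind`, `struct sge` and both counting functions are hypothetical and only model the idea; they are not the siw data structures.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical SG entry types, modelled on the two mentioned above. */
enum sge_kind {
    SGE_KERNEL_BUF,     /* kernel virtual address */
    SGE_PBL             /* page buffer list region */
};

struct sge {
    enum sge_kind kind;
};

/* Old assumption: the type of sge[0] holds for the whole list.
 * Returns how many entries get treated as kernel addresses. */
size_t kva_entries_first_only(const struct sge *sgl, size_t n)
{
    bool is_kva = n > 0 && sgl[0].kind == SGE_KERNEL_BUF;

    return is_kva ? n : 0;
}

/* Per-entry check: is_kva is decided for each SGE on its own. */
size_t kva_entries_per_entry(const struct sge *sgl, size_t n)
{
    size_t count = 0;

    for (size_t i = 0; i < n; i++) {
        bool is_kva = sgl[i].kind == SGE_KERNEL_BUF;

        if (is_kva)
            count++;
    }
    return count;
}
```

For a mixed list such as { kernel buffer, PBL region }, the first function treats both entries as kernel addresses, while the second treats only the first one that way.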
      Fixes: b9be6f18 ("rdma/siw: transmit path")
      Reported-by: Krishnamraju Eraparaju <krishna2@chelsio.com>
      Tested-by: Krishnamraju Eraparaju <krishna2@chelsio.com>
      Signed-off-by: Bernard Metzler <bmt@zurich.ibm.com>
      Link: https://lore.kernel.org/r/20190822150741.21871-1-bmt@zurich.ibm.com
      Signed-off-by: Doug Ledford <dledford@redhat.com>
    • Selvin Xavier's avatar
      RDMA/bnxt_re: Fix stack-out-of-bounds in bnxt_qplib_rcfw_send_message · d37b1e53
      Selvin Xavier authored
      The driver copies FW commands to the HW queue in units of 16 bytes. Some
      of the command structures are not an exact multiple of 16 bytes, so
      copying the data from those structures triggers stack-out-of-bounds
      reports from KASAN. The following error is reported:
      [ 1337.530155] ==================================================================
      [ 1337.530277] BUG: KASAN: stack-out-of-bounds in bnxt_qplib_rcfw_send_message+0x40a/0x850 [bnxt_re]
      [ 1337.530413] Read of size 16 at addr ffff888725477a48 by task rmmod/2785
      [ 1337.530540] CPU: 5 PID: 2785 Comm: rmmod Tainted: G           OE     5.2.0-rc6+ #75
      [ 1337.530541] Hardware name: Dell Inc. PowerEdge R730/0599V5, BIOS 1.0.4 08/28/2014
      [ 1337.530542] Call Trace:
      [ 1337.530548]  dump_stack+0x5b/0x90
      [ 1337.530556]  ? bnxt_qplib_rcfw_send_message+0x40a/0x850 [bnxt_re]
      [ 1337.530560]  print_address_description+0x65/0x22e
      [ 1337.530568]  ? bnxt_qplib_rcfw_send_message+0x40a/0x850 [bnxt_re]
      [ 1337.530575]  ? bnxt_qplib_rcfw_send_message+0x40a/0x850 [bnxt_re]
      [ 1337.530577]  __kasan_report.cold.3+0x37/0x77
      [ 1337.530581]  ? _raw_write_trylock+0x10/0xe0
      [ 1337.530588]  ? bnxt_qplib_rcfw_send_message+0x40a/0x850 [bnxt_re]
      [ 1337.530590]  kasan_report+0xe/0x20
      [ 1337.530592]  memcpy+0x1f/0x50
      [ 1337.530600]  bnxt_qplib_rcfw_send_message+0x40a/0x850 [bnxt_re]
      [ 1337.530608]  ? bnxt_qplib_creq_irq+0xa0/0xa0 [bnxt_re]
      [ 1337.530611]  ? xas_create+0x3aa/0x5f0
      [ 1337.530613]  ? xas_start+0x77/0x110
      [ 1337.530615]  ? xas_clear_mark+0x34/0xd0
      [ 1337.530623]  bnxt_qplib_free_mrw+0x104/0x1a0 [bnxt_re]
      [ 1337.530631]  ? bnxt_qplib_destroy_ah+0x110/0x110 [bnxt_re]
      [ 1337.530633]  ? bit_wait_io_timeout+0xc0/0xc0
      [ 1337.530641]  bnxt_re_dealloc_mw+0x2c/0x60 [bnxt_re]
      [ 1337.530648]  bnxt_re_destroy_fence_mr+0x77/0x1d0 [bnxt_re]
      [ 1337.530655]  bnxt_re_dealloc_pd+0x25/0x60 [bnxt_re]
      [ 1337.530677]  ib_dealloc_pd_user+0xbe/0xe0 [ib_core]
      [ 1337.530683]  srpt_remove_one+0x5de/0x690 [ib_srpt]
      [ 1337.530689]  ? __srpt_close_all_ch+0xc0/0xc0 [ib_srpt]
      [ 1337.530692]  ? xa_load+0x87/0xe0
      [ 1337.530840]  do_syscall_64+0x6d/0x1f0
      [ 1337.530843]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
      [ 1337.530845] RIP: 0033:0x7ff5b389035b
      [ 1337.530848] Code: 73 01 c3 48 8b 0d 2d 0b 2c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 b0 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d fd 0a 2c 00 f7 d8 64 89 01 48
      [ 1337.530849] RSP: 002b:00007fff83425c28 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
      [ 1337.530852] RAX: ffffffffffffffda RBX: 00005596443e6750 RCX: 00007ff5b389035b
      [ 1337.530853] RDX: 000000000000000a RSI: 0000000000000800 RDI: 00005596443e67b8
      [ 1337.530854] RBP: 0000000000000000 R08: 00007fff83424ba1 R09: 0000000000000000
      [ 1337.530856] R10: 00007ff5b3902960 R11: 0000000000000206 R12: 00007fff83425e50
      [ 1337.530857] R13: 00007fff8342673c R14: 00005596443e6260 R15: 00005596443e6750
      [ 1337.530885] The buggy address belongs to the page:
      [ 1337.530962] page:ffffea001c951dc0 refcount:0 mapcount:0 mapping:0000000000000000 index:0x0
      [ 1337.530964] flags: 0x57ffffc0000000()
      [ 1337.530967] raw: 0057ffffc0000000 0000000000000000 ffffffff1c950101 0000000000000000
      [ 1337.530970] raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
      [ 1337.530970] page dumped because: kasan: bad access detected
      [ 1337.530996] Memory state around the buggy address:
      [ 1337.531072]  ffff888725477900: 00 00 00 00 f1 f1 f1 f1 00 00 00 00 00 f2 f2 f2
      [ 1337.531180]  ffff888725477980: 00 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1 00
      [ 1337.531288] >ffff888725477a00: 00 f2 f2 f2 f2 f2 f2 00 00 00 f2 00 00 00 00 00
      [ 1337.531393]                                                  ^
      [ 1337.531478]  ffff888725477a80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      [ 1337.531585]  ffff888725477b00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      [ 1337.531691] ==================================================================
      Fix this by passing the exact size of each FW command to
      bnxt_qplib_rcfw_send_message as req->cmd_size. Before sending
      the command to HW, convert req->cmd_size to the number of 16-byte units.
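The size handling can be sketched as follows. This is an illustrative model, not the bnxt_re code: `SLOT_BYTES`, `bytes_to_slots` and `copy_cmd` are hypothetical names, and zero-padding the tail of the last slot is an assumption made for the example.

```c
#include <stddef.h>
#include <string.h>

#define SLOT_BYTES 16   /* queue entry size from the commit text */

/* Round an exact byte count up to whole 16-byte units. */
size_t bytes_to_slots(size_t bytes)
{
    return (bytes + SLOT_BYTES - 1) / SLOT_BYTES;
}

/* Copy a command of cmd_bytes into consecutive queue slots without
 * reading past the end of the source. The last slot may be only
 * partly filled; its tail is zeroed (an assumption of this sketch).
 * Returns the number of slots used. */
size_t copy_cmd(unsigned char (*queue)[SLOT_BYTES], const void *cmd,
                size_t cmd_bytes)
{
    const unsigned char *src = cmd;
    size_t slots = bytes_to_slots(cmd_bytes);

    for (size_t i = 0; i < slots; i++) {
        size_t chunk = cmd_bytes < SLOT_BYTES ? cmd_bytes : SLOT_BYTES;

        memset(queue[i], 0, SLOT_BYTES);
        memcpy(queue[i], src, chunk);
        src += chunk;
        cmd_bytes -= chunk;
    }
    return slots;
}
```

A 24-byte command occupies two slots, but only 24 bytes are read from the source; copying two full 16-byte units would read 8 bytes beyond the structure, which is the kind of access KASAN flags.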
      Fixes: 1ac5a404 ("RDMA/bnxt_re: Add bnxt_re RoCE driver")
      Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
      Link: https://lore.kernel.org/r/1566468170-489-1-git-send-email-selvin.xavier@broadcom.com
      Signed-off-by: Doug Ledford <dledford@redhat.com>
    • YueHaibing's avatar
      afs: use correct afs_call_type in yfs_fs_store_opaque_acl2 · 7533be85
      YueHaibing authored
      It seems that 'yfs_RXYFSStoreOpaqueACL2' should be used in
      Fixes: f5e45463 ("afs: Implement YFS ACL setting")
      Signed-off-by: YueHaibing <yuehaibing@huawei.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
    • Marc Dionne's avatar
      afs: Fix possible oops in afs_lookup trace event · c4c613ff
      Marc Dionne authored
      The afs_lookup trace event can cause the following:
      [  216.576777] BUG: kernel NULL pointer dereference, address: 000000000000023b
      [  216.576803] #PF: supervisor read access in kernel mode
      [  216.576813] #PF: error_code(0x0000) - not-present page
      [  216.576913] RIP: 0010:trace_event_raw_event_afs_lookup+0x9e/0x1c0 [kafs]
      If the inode from afs_do_lookup() is an error other than ENOENT, or if it
      is ENOENT and afs_try_auto_mntpt() returns an error, the trace event will
      try to dereference the error pointer as a valid pointer.
      Use IS_ERR_OR_NULL to only pass a valid pointer for the trace, or NULL.
      Ideally the trace would include the error value, but for now just avoid
      the oops.
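The guard can be illustrated in userspace C. The helpers below (`err_ptr`, `is_err_or_null`) are re-implementations written for this sketch that mimic the kernel's ERR_PTR() and IS_ERR_OR_NULL() encoding (error codes stored in the top 4095 addresses); `struct inode_stub` and `trace_arg` are hypothetical.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Userspace stand-in for ERR_PTR(): encode a negative errno value
 * as a pointer. */
void *err_ptr(long error)
{
    return (void *)(intptr_t)error;
}

/* Userspace stand-in for IS_ERR_OR_NULL(): true for NULL and for any
 * pointer in the error range at the top of the address space. */
int is_err_or_null(const void *ptr)
{
    return !ptr || (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical inode type for the example. */
struct inode_stub {
    int ino;
};

/* Pick what to hand to the trace event: a valid pointer or NULL,
 * never an encoded error. */
const struct inode_stub *trace_arg(const struct inode_stub *inode)
{
    return is_err_or_null(inode) ? NULL : inode;
}
```

An error-encoded pointer such as `err_ptr(-EACCES)` is mapped to NULL here, so the consumer never dereferences it.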
      Fixes: 80548b03 ("afs: Add more tracepoints")
      Signed-off-by: Marc Dionne <marc.dionne@auristor.com>
      Signed-off-by: David Howells <dhowells@redhat.com>