1. 01 Apr, 2017 3 commits
  2. 25 Mar, 2017 2 commits
    • Dmitry V. Levin's avatar
      uapi: fix rdma/mlx5-abi.h userspace compilation errors · 812755d6
      Dmitry V. Levin authored
      Consistently use types from linux/types.h to fix the following
      rdma/mlx5-abi.h userspace compilation errors:
      /usr/include/rdma/mlx5-abi.h:69:25: error: 'u64' undeclared here (not in a function)
        MLX5_LIB_CAP_4K_UAR = (u64)1 << 0,
      /usr/include/rdma/mlx5-abi.h:69:29: error: expected ',' or '}' before numeric constant
        MLX5_LIB_CAP_4K_UAR = (u64)1 << 0,
      Include <linux/if_ether.h> to fix the following rdma/mlx5-abi.h
      userspace compilation error:
      /usr/include/rdma/mlx5-abi.h:286:12: error: 'ETH_ALEN' undeclared here (not in a function)
        __u8 dmac[ETH_ALEN];
      Signed-off-by: default avatarDmitry V. Levin <ldv@altlinux.org>
      Signed-off-by: default avatarDoug Ledford <dledford@redhat.com>
    • Bart Van Assche's avatar
      IB/core: Restore I/O MMU, s390 and powerpc support · 0957c29f
      Bart Van Assche authored
      Avoid that the following error message is reported on the console
      while loading an RDMA driver with I/O MMU support enabled:
      DMAR: Allocating domain for mlx5_0 failed
      Ensure that DMA mapping operations that use to_pci_dev() to
      access to struct pci_dev see the correct PCI device. E.g. the s390
      and powerpc DMA mapping operations use to_pci_dev() even with I/O
      MMU support disabled.
      This patch preserves the following changes of the DMA mapping updates
      patch series:
      - Introduction of dma_virt_ops.
      - Removal of ib_device.dma_ops.
      - Removal of struct ib_dma_mapping_ops.
      - Removal of an if-statement from each ib_dma_*() operation.
      - IB HW drivers no longer set dma_device directly.
      Reported-by: default avatarSebastian Ott <sebott@linux.vnet.ibm.com>
      Reported-by: default avatarParav Pandit <parav@mellanox.com>
      Fixes: commit 99db9494
       ("IB/core: Remove ib_device.dma_device")
      Signed-off-by: default avatarBart Van Assche <bart.vanassche@sandisk.com>
      Reviewed-by: parav@mellanox.com
      Tested-by: parav@mellanox.com
      Reviewed-by: default avatarLeon Romanovsky <leonro@mellanox.com>
      Signed-off-by: default avatarDoug Ledford <dledford@redhat.com>
  3. 23 Mar, 2017 1 commit
  4. 22 Mar, 2017 5 commits
  5. 21 Mar, 2017 5 commits
  6. 20 Mar, 2017 1 commit
    • Stafford Horne's avatar
      generic syscalls: Wire up statx syscall · fdfe4a39
      Stafford Horne authored
      The new syscall statx is implemented as generic code, so enable it
      for architectures like openrisc which use the generic syscall table.
      Fixes: a528d35e
       ("statx: Add a system call to make enhanced file info available")
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: linux-arm-kernel@lists.infradead.org
      Signed-off-by: default avatarStafford Horne <shorne@gmail.com>
      Signed-off-by: default avatarWill Deacon <will.deacon@arm.com>
  7. 18 Mar, 2017 2 commits
    • Mike Christie's avatar
      target: fix ALUA transition timeout handling · d7175373
      Mike Christie authored
      The implicit transition time tells initiators the min time
      to wait before timing out a transition. We currently schedule
      the transition to occur in tg_pt_gp_implicit_trans_secs
      seconds so there is no room for delays. If
      needs to write out info to a remote file, then the initiator can
      easily time out the operation.
      Signed-off-by: default avatarMike Christie <mchristi@redhat.com>
      Signed-off-by: default avatarNicholas Bellinger <nab@linux-iscsi.org>
    • Mike Christie's avatar
      target: allow ALUA setup for some passthrough backends · 530c6891
      Mike Christie authored
      This patch allows passthrough backends to use the core/base LIO
      ALUA setup and state checks, but still handle the execution of
      This will allow the target_core_user module to execute STPG and RTPG
      in userspace, and not have to duplicate the ALUA state checks, path
      information (needed so we can check if command is executable on
      specific paths) and setup (rtslib sets/updates the configfs ALUA
      interface like it does for iblock or file).
      For STPG, the target_core_user userspace daemon, tcmu-runner will
      still execute the STPG, and to update the core/base LIO state it
      will use the existing configfs interface. For RTPG, tcmu-runner
      will loop over configfs and/or cache the state.
      Signed-off-by: default avatarMike Christie <mchristi@redhat.com>
      Signed-off-by: default avatarNicholas Bellinger <nab@linux-iscsi.org>
  8. 17 Mar, 2017 1 commit
    • Jack Morgenstein's avatar
      net/mlx4_core: Avoid delays during VF driver device shutdown · 4cbe4dac
      Jack Morgenstein authored
      Some Hypervisors detach VFs from VMs by instantly causing an FLR event
      to be generated for a VF.
      In the mlx4 case, this will cause that VF's comm channel to be disabled
      before the VM has an opportunity to invoke the VF device's "shutdown"
      For such Hypervisors, there is a race condition between the VF's
      shutdown method and its internal-error detection/reset thread.
      The internal-error detection/reset thread (which runs every 5 seconds) also
      detects a disabled comm channel. If the internal-error detection/reset
      flow wins the race, we still get delays (while that flow tries repeatedly
      to detect comm-channel recovery).
      The cited commit fixed the command timeout problem when the
      internal-error detection/reset flow loses the race.
      This commit avoids the unneeded delays when the internal-error
      detection/reset flow wins.
      Fixes: d585df1c
       ("net/mlx4_core: Avoid command timeouts during VF driver device shutdown")
      Signed-off-by: default avatarJack Morgenstein <jackm@dev.mellanox.co.il>
      Reported-by: default avatarSimon Xiao <sixiao@microsoft.com>
      Signed-off-by: default avatarTariq Toukan <tariqt@mellanox.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
  9. 16 Mar, 2017 5 commits
  10. 15 Mar, 2017 1 commit
    • Eric Biggers's avatar
      fscrypt: eliminate ->prepare_context() operation · 94840e3c
      Eric Biggers authored
      The only use of the ->prepare_context() fscrypt operation was to allow
      ext4 to evict inline data from the inode before ->set_context().
      However, there is no reason why this cannot be done as simply the first
      step in ->set_context(), and in fact it makes more sense to do it that
      way because then the policy modes and flags get validated before any
      real work is done.  Therefore, merge ext4_prepare_context() into
      ext4_set_context(), and remove ->prepare_context().
      Signed-off-by: default avatarEric Biggers <ebiggers@google.com>
      Signed-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
  11. 14 Mar, 2017 2 commits
  12. 13 Mar, 2017 6 commits
    • Lars-Peter Clausen's avatar
      iio: sw-device: Fix config group initialization · c42f8218
      Lars-Peter Clausen authored
      Use the IS_ENABLED() helper macro to ensure that the configfs group is
      initialized either when configfs is built-in or when configfs is built as a
      module. Otherwise software device creation will result in undefined
      behaviour when configfs is built as a module since the configfs group for
      the device not properly initialized.
      Similar to commit b2f0c096 ("iio: sw-trigger: Fix config group
      Fixes: 0f3a8c3f
       ("iio: Add support for creating IIO devices via configfs")
      Reported-by: default avatarMiguel Robles <miguel.robles@farole.net>
      Signed-off-by: default avatarLars-Peter Clausen <lars@metafoo.de>
      Acked-by: default avatarDaniel Baluta <daniel.baluta@gmail.com>
      Cc: <Stable@vger.kernel.org>
      Signed-off-by: default avatarJonathan Cameron <jic23@kernel.org>
    • Pablo Neira Ayuso's avatar
      Revert "netfilter: nf_tables: add flush field to struct nft_set_iter" · 04166f48
      Pablo Neira Ayuso authored
      This reverts commit 1f48ff6c
      This patch is not required anymore now that we keep a dummy list of
      set elements in the bitmap set implementation, so revert this before
      we forget this code has no clients.
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
    • Steven Rostedt (VMware)'s avatar
      netfilter: Force fake conntrack entry to be at least 8 bytes aligned · 170a1fb9
      Steven Rostedt (VMware) authored
      Since the nfct and nfctinfo have been combined, the nf_conn structure
      must be at least 8 bytes aligned, as the 3 LSB bits are used for the
      nfctinfo. But there's a fake nf_conn structure to denote untracked
      connections, which is created by a PER_CPU construct. This does not
      guarantee that it will be 8 bytes aligned and can break the logic in
      determining the correct nfctinfo.
      I triggered this on a 32bit machine with the following error:
      BUG: unable to handle kernel NULL pointer dereference at 00000af4
      IP: nf_ct_deliver_cached_events+0x1b/0xfb
      *pdpt = 0000000031962001 *pde = 0000000000000000
      Oops: 0000 [#1] SMP
      [Modules linked in: ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6_tables ipv6 crc_ccitt ppdev r8169 parport_pc parport
        OK  ]
      CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.10.0-test+ #75
      Hardware name: MSI MS-7823/CSM-H87M-G43 (MS-7823), BIOS V1.6 02/22/2014
      task: c126ec00 task.stack: c1258000
      EIP: nf_ct_deliver_cached_events+0x1b/0xfb
      EFLAGS: 00010202 CPU: 0
      EAX: 0021cd01 EBX: 00000000 ECX: 27b0c767 EDX: 32bcb17a
      ESI: f34135c0 EDI: f34135c0 EBP: f2debd60 ESP: f2debd3c
       DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
      CR0: 80050033 CR2: 00000af4 CR3: 309a0440 CR4: 001406f0
      Call Trace:
       ? ipv6_skip_exthdr+0xac/0xcb
       ipv6_confirm+0x10c/0x119 [nf_conntrack_ipv6]
       nf_hook+0x9a/0xad [ipv6]
       ? ip6t_do_table+0x356/0x379 [ip6_tables]
       ? ip6_fragment+0x9e9/0x9e9 [ipv6]
       ip6_output+0xee/0x107 [ipv6]
       ? ip6_fragment+0x9e9/0x9e9 [ipv6]
       dst_output+0x36/0x4d [ipv6]
       NF_HOOK.constprop.37+0xb2/0xba [ipv6]
       ? icmp6_dst_alloc+0x2c/0xfd [ipv6]
       ? local_bh_enable+0x14/0x14 [ipv6]
       mld_sendpack+0x1c5/0x281 [ipv6]
       ? mark_held_locks+0x40/0x5c
       mld_ifc_timer_expire+0x1f6/0x21e [ipv6]
       ? detach_if_pending+0x55/0x55
       ? mld_dad_timer_expire+0x3e/0x3e [ipv6]
       ? mld_dad_timer_expire+0x3e/0x3e [ipv6]
       ? test_ti_thread_flag.constprop.19+0xd/0xd
      By using DEFINE/DECLARE_PER_CPU_ALIGNED we can enforce at least 8 byte
      alignment as all cache line sizes are at least 8 bytes or more.
      Fixes: a9e419dc
       ("netfilter: merge ctinfo into nfct pointer storage area")
      Signed-off-by: default avatarSteven Rostedt (VMware) <rostedt@goodmis.org>
      Acked-by: default avatarFlorian Westphal <fw@strlen.de>
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
    • Liping Zhang's avatar
      netfilter: nf_tables: fix mismatch in big-endian system · 10596608
      Liping Zhang authored
      Currently, there are two different methods to store an u16 integer to
      the u32 data register. For example:
        u32 *dest = &regs->data[priv->dreg];
        1. *dest = 0; *(u16 *) dest = val_u16;
        2. *dest = val_u16;
      For method 1, the u16 value will be stored like this, either in
      big-endian or little-endian system:
        0          15           31
        |   Value   |     0     |
      For method 2, in little-endian system, the u16 value will be the same
      as listed above. But in big-endian system, the u16 value will be stored
      like this:
        0          15           31
        |     0     |   Value   |
      So later we use "memcmp(&regs->data[priv->sreg], data, 2);" to do
      compare in nft_cmp, nft_lookup expr ..., method 2 will get the wrong
      result in big-endian system, as 0~15 bits will always be zero.
      For the similar reason, when loading an u16 value from the u32 data
      register, we should use "*(u16 *) sreg;" instead of "(u16)*sreg;",
      the 2nd method will get the wrong value in the big-endian system.
      So introduce some wrapper functions to store/load an u8 or u16
      integer to/from the u32 data register, and use them in the right
      Signed-off-by: default avatarLiping Zhang <zlpnobody@gmail.com>
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
    • Dmitry V. Levin's avatar
      uapi: fix drm/omap_drm.h userspace compilation errors · 337ba7fb
      Dmitry V. Levin authored
      Consistently use types from linux/types.h like in other uapi drm/*_drm.h
      header files to fix the following drm/omap_drm.h userspace compilation
      /usr/include/drm/omap_drm.h:36:2: error: unknown type name 'uint64_t'
        uint64_t param;   /* in */
      /usr/include/drm/omap_drm.h:37:2: error: unknown type name 'uint64_t'
        uint64_t value;   /* in (set_param), out (get_param) */
      /usr/include/drm/omap_drm.h:56:2: error: unknown type name 'uint32_t'
        uint32_t bytes;  /* (for non-tiled formats) */
      /usr/include/drm/omap_drm.h:58:3: error: unknown type name 'uint16_t'
         uint16_t width;
      /usr/include/drm/omap_drm.h:59:3: error: unknown type name 'uint16_t'
         uint16_t height;
      /usr/include/drm/omap_drm.h:65:2: error: unknown type name 'uint32_t'
        uint32_t flags;   /* in */
      /usr/include/drm/omap_drm.h:66:2: error: unknown type name 'uint32_t'
        uint32_t handle;  /* out */
      /usr/include/drm/omap_drm.h:67:2: error: unknown type name 'uint32_t'
        uint32_t __pad;
      /usr/include/drm/omap_drm.h:77:2: error: unknown type name 'uint32_t'
        uint32_t handle;  /* buffer handle (in) */
      /usr/include/drm/omap_drm.h:78:2: error: unknown type name 'uint32_t'
        uint32_t op;   /* mask of omap_gem_op (in) */
      /usr/include/drm/omap_drm.h:82:2: error: unknown type name 'uint32_t'
        uint32_t handle;  /* buffer handle (in) */
      /usr/include/drm/omap_drm.h:83:2: error: unknown type name 'uint32_t'
        uint32_t op;   /* mask of omap_gem_op (in) */
      /usr/include/drm/omap_drm.h:88:2: error: unknown type name 'uint32_t'
        uint32_t nregions;
      /usr/include/drm/omap_drm.h:89:2: error: unknown type name 'uint32_t'
        uint32_t __pad;
      /usr/include/drm/omap_drm.h:93:2: error: unknown type name 'uint32_t'
        uint32_t handle;  /* buffer handle (in) */
      /usr/include/drm/omap_drm.h:94:2: error: unknown type name 'uint32_t'
        uint32_t pad;
      /usr/include/drm/omap_drm.h:95:2: error: unknown type name 'uint64_t'
        uint64_t offset;  /* mmap offset (out) */
      /usr/include/drm/omap_drm.h:102:2: error: unknown type name 'uint32_t'
        uint32_t size;   /* virtual size for mmap'ing (out) */
      /usr/include/drm/omap_drm.h:103:2: error: unknown type name 'uint32_t'
        uint32_t __pad;
      Fixes: ef6503e8
       ("drm: Kbuild: add omap_drm.h to the installed headers")
      Signed-off-by: default avatarDmitry V. Levin <ldv@altlinux.org>
      Signed-off-by: default avatarTomi Valkeinen <tomi.valkeinen@ti.com>
    • Daniel Borkmann's avatar
      bpf: improve read-only handling · 65869a47
      Daniel Borkmann authored
      Improve bpf_{prog,jit_binary}_{un,}lock_ro() by throwing a
      one-time warning in case of an error when the image couldn't
      be set read-only, and also mark struct bpf_prog as locked when
      bpf_prog_lock_ro() was called.
      Reason for the latter is that bpf_prog_unlock_ro() is called from
      various places including error paths, and we shouldn't mess with
      page attributes when really not needed.
      For bpf_jit_binary_unlock_ro() this is not needed as jited flag
      implicitly indicates this, thus for archs with ARCH_HAS_SET_MEMORY
      we're guaranteed to have a previously locked image. Overall, this
      should also help us to identify any further potential issues with
      set_memory_*() helpers.
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Acked-by: default avatarAlexei Starovoitov <ast@kernel.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
  13. 11 Mar, 2017 2 commits
  14. 10 Mar, 2017 4 commits
    • Thomas Gleixner's avatar
      kexec, x86/purgatory: Unbreak it and clean it up · 40c50c1f
      Thomas Gleixner authored
      The purgatory code defines global variables which are referenced via a
      symbol lookup in the kexec code (core and arch).
      A recent commit addressing sparse warnings made these static and thereby
      broke kexec_file.
      Why did this happen? Simply because the whole machinery is undocumented and
      lacks any form of forward declarations. The variable names are unspecific
      and lack a prefix, so adding forward declarations creates shadow variables
      in the core code. Aside of that the code relies on magic constants and
      duplicate struct definitions with no way to ensure that these things stay
      in sync. The section placement of the purgatory variables happened by
      chance and not by design.
      Unbreak kexec and cleanup the mess:
       - Add proper forward declarations and document the usage
       - Use common struct definition
       - Use the proper common defines instead of magic constants
       - Add a purgatory_ prefix to have a proper name space
       - Use ARRAY_SIZE() instead of a homebrewn reimplementation
       - Add proper sections to the purgatory variables [ From Mike ]
      Fixes: 72042a8c
       ("x86/purgatory: Make functions and variables static")
      Reported-by: default avatarMike Galbraith <&lt;efault@gmx.de>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Cc: Nicholas Mc Guire <der.herr@hofr.at>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Cc: "Tobin C. Harding" <me@tobin.cc>
      Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1703101315140.3681@nanos
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
    • David Howells's avatar
      net: Work around lockdep limitation in sockets that use sockets · cdfbabfb
      David Howells authored
      Lockdep issues a circular dependency warning when AFS issues an operation
      through AF_RXRPC from a context in which the VFS/VM holds the mmap_sem.
      The theory lockdep comes up with is as follows:
       (1) If the pagefault handler decides it needs to read pages from AFS, it
           calls AFS with mmap_sem held and AFS begins an AF_RXRPC call, but
           creating a call requires the socket lock:
      	mmap_sem must be taken before sk_lock-AF_RXRPC
       (2) afs_open_socket() opens an AF_RXRPC socket and binds it.  rxrpc_bind()
           binds the underlying UDP socket whilst holding its socket lock.
           inet_bind() takes its own socket lock:
      	sk_lock-AF_RXRPC must be taken before sk_lock-AF_INET
       (3) Reading from a TCP socket into a userspace buffer might cause a fault
           and thus cause the kernel to take the mmap_sem, but the TCP socket is
           locked whilst doing this:
      	sk_lock-AF_INET must be taken before mmap_sem
      However, lockdep's theory is wrong in this...
    • Andrea Arcangeli's avatar
      userfaultfd: non-cooperative: userfaultfd_remove revalidate vma in MADV_DONTNEED · 70ccb92f
      Andrea Arcangeli authored
      userfaultfd_remove() has to be execute before zapping the pagetables or
      UFFDIO_COPY could keep filling pages after zap_page_range returned,
      which would result in non zero data after a MADV_DONTNEED.
      However userfaultfd_remove() may have to release the mmap_sem.  This was
      handled correctly in MADV_REMOVE, but MADV_DONTNEED accessed a
      potentially stale vma (the very vma passed to zap_page_range(vma, ...)).
      The fix consists in revalidating the vma in case userfaultfd_remove()
      had to release the mmap_sem.
      This also optimizes away an unnecessary down_read/up_read in the
      MADV_REMOVE case if UFFD_EVENT_FORK had to be delivered.
      It all remains zero runtime cost in case CONFIG_USERFAULTFD=n as
      userfaultfd_remove() will be defined as "true" at build time.
      Link: http://lkml.kernel.org/r/20170302173738.18994-3-aarcange@redhat.com
      Signed-off-by: default avatarAndrea Arcangeli <aarcange@redhat.com>
      Acked-by: default avatarMike Rapoport <rppt@linux.vnet.ibm.com>
      Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
      Cc: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: Pavel Emelyanov <xemul@parallels.com>
      Cc: Hillf Danton <hillf.zj@alibaba-inc.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
    • Yisheng Xie's avatar
      mm/vmstats: add thp_split_pud event for clarity · ce9311cf
      Yisheng Xie authored
      We added support for PUD-sized transparent hugepages, however we count
      the event "thp split pud" into thp_split_pmd event.
      To separate the event count of thp split pud from pmd, add a new event
      named thp_split_pud.
      Link: http://lkml.kernel.org/r/1488282380-5076-1-git-send-email-xieyisheng1@huawei.com
      Signed-off-by: default avatarYisheng Xie <xieyisheng1@huawei.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Sebastian Siewior <bigeasy@linutronix.de>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Ebru Akagunduz <ebru.akagunduz@gmail.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Hanjun Guo <guohanjun@huawei.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>