1. 01 Dec, 2018 10 commits
  2. 30 Nov, 2018 9 commits
    • John Hurley's avatar
      nfp: flower: prevent offload if rhashtable insert fails · b5f0cf08
      John Hurley authored
      For flow offload adds, if the rhash insert code fails, the flow will still
      have been offloaded but the reference to it in the driver freed.
      
      Re-order the offload setup calls to ensure that a flow will only be written
      to FW if a kernel reference is held and stored in the rhashtable. Remove
      this hashtable entry if the offload fails.
      
      Fixes: c01d0efa
      
       ("nfp: flower: use rhashtable for flow caching")
      Signed-off-by: default avatarJohn Hurley <john.hurley@netronome.com>
      Reviewed-by: default avatarPieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
      Reviewed-by: default avatarJakub Kicinski <jakub.kicinski@netronome.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      b5f0cf08
    • John Hurley's avatar
      nfp: flower: release metadata on offload failure · 11664948
      John Hurley authored
      Calling nfp_compile_flow_metadata both assigns a stats context and
      increments a ref counter on (or allocates) a mask id table entry. These
      are released by the nfp_modify_flow_metadata call on flow deletion,
      however, if a flow add fails after metadata is set then the flow entry
      will be deleted but the metadata assignments leaked.
      
      Add an error path to the flow add offload function to ensure allocated
      metadata is released in the event of an offload fail.
      
      Fixes: 81f3ddf2
      
       ("nfp: add control message passing capabilities to flower offloads")
      Signed-off-by: default avatarJohn Hurley <john.hurley@netronome.com>
      Reviewed-by: default avatarPieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
      Reviewed-by: default avatarJakub Kicinski <jakub.kicinski@netronome.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      11664948
    • Toni Peltonen's avatar
      bonding: fix 802.3ad state sent to partner when unbinding slave · 3b5b3a33
      Toni Peltonen authored
      
      
      Previously when unbinding a slave the 802.3ad implementation only told
      partner that the port is not suitable for aggregation by setting the port
      aggregation state from aggregatable to individual. This is not enough. If the
      physical layer still stays up and we only unbinded this port from the bond there
      is nothing in the aggregation status alone to prevent the partner from sending
      traffic towards us. To ensure that the partner doesn't consider this
      port at all anymore we should also disable collecting and distributing to
      signal that this actor is going away. Also clear AD_STATE_SYNCHRONIZATION to
      ensure partner exits collecting + distributing state.
      
      I have tested this behaviour againts Arista EOS switches with mlx5 cards
      (physical link stays up even when interface is down) and simulated
      the same situation virtually Linux <-> Linux with two network namespaces
      running two veth device pairs. In both cases setting aggregation to
      individual doesn't alone prevent traffic from being to sent towards this
      port given that the link stays up in partners end. Partner still keeps
      it's end in collecting + distributing state and continues until timeout is
      reached. In most cases this means we are losing the traffic partner sends
      towards our port while we wait for timeout. This is most visible with slow
      periodic time (LACP rate slow).
      
      Other open source implementations like Open VSwitch and libreswitch, and
      vendor implementations like Arista EOS, seem to disable collecting +
      distributing to when doing similar port disabling/detaching/removing change.
      With this patch kernel implementation would behave the same way and ensure
      partner doesn't consider our actor viable anymore.
      
      Signed-off-by: default avatarToni Peltonen <peltzi@peltzi.fi>
      Signed-off-by: default avatarJay Vosburgh <jay.vosburgh@canonical.com>
      Acked-by: default avatarJonathan Toppins <jtoppins@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      3b5b3a33
    • Dmitry Bogdanov's avatar
      net: aquantia: fix rx checksum offload bits · 37c4b91f
      Dmitry Bogdanov authored
      
      
      The last set of csum offload fixes had a leak:
      
      Checksum enabled status bits from rx descriptor were incorrectly
      interpreted. Consequently all the other valid logic worked on zero bits.
      That caused rx checksum offloads never to trigger.
      
      Tested by dumping rx descriptors and validating resulting csum_level.
      
      Reported-by: default avatarIgor Russkikh <igor.russkikh@aquantia.com>
      Signed-off-by: default avatarDmitry Bogdanov <dmitry.bogdanov@aquantia.com>
      Signed-off-by: default avatarIgor Russkikh <igor.russkikh@aquantia.com>
      Fixes: ad703c2b
      
       ("net: aquantia: invalid checksumm offload implementation")
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      37c4b91f
    • Colin Ian King's avatar
      openvswitch: fix spelling mistake "execeeds" -> "exceeds" · 43d0e960
      Colin Ian King authored
      
      
      There is a spelling mistake in a net_warn_ratelimited message, fix this.
      
      Signed-off-by: default avatarColin Ian King <colin.king@canonical.com>
      Reviewed-by: default avatarSimon Horman <simon.horman@netronome.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      43d0e960
    • Colin Ian King's avatar
      liquidio: fix spelling mistake "deferal" -> "deferral" · 56e0e295
      Colin Ian King authored
      
      
      There is a spelling mistake in the oct_stats_strings array, fix it.
      
      Signed-off-by: default avatarColin Ian King <colin.king@canonical.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      56e0e295
    • Thierry Reding's avatar
      net: stmmac: Move debugfs init/exit to ->probe()/->remove() · 5f2b8b62
      Thierry Reding authored
      
      
      Setting up and tearing down debugfs is current unbalanced, as seen by
      this error during resume from suspend:
      
          [  752.134067] dwc-eth-dwmac 2490000.ethernet eth0: ERROR failed to create debugfs directory
          [  752.134347] dwc-eth-dwmac 2490000.ethernet eth0: stmmac_hw_setup: failed debugFS registration
      
      The imbalance happens because the driver creates the debugfs hierarchy
      when the device is opened and tears it down when the device is closed.
      There's little gain in that, and it could be argued that it is even
      surprising because it's not usually done for other devices. Fix the
      imbalance by moving the debugfs creation and teardown to the driver's
      ->probe() and ->remove() implementations instead.
      
      Note that the ring descriptors cannot be read while the interface is
      down, so make sure to return an empty file when the descriptors_status
      debugfs file is read.
      
      Signed-off-by: default avatarThierry Reding <treding@nvidia.com>
      Acked-by: default avatarJose Abreu <joabreu@synopsys.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      5f2b8b62
    • Xin Long's avatar
      sctp: update frag_point when stream_interleave is set · 4135cce7
      Xin Long authored
      sctp_assoc_update_frag_point() should be called whenever asoc->pathmtu
      changes, but we missed one place in sctp_association_init(). It would
      cause frag_point is zero when sending data.
      
      As says in Jakub's reproducer, if sp->pathmtu is set by socketopt, the
      new asoc->pathmtu inherits it in sctp_association_init(). Later when
      transports are added and their pmtu >= asoc->pathmtu, it will never
      call sctp_assoc_update_frag_point() to set frag_point.
      
      This patch is to fix it by updating frag_point after asoc->pathmtu is
      set as sp->pathmtu in sctp_association_init(). Note that it moved them
      after sctp_stream_init(), as stream->si needs to be set first.
      
      Frag_point's calculation is also related with datachunk's type, so it
      needs to update frag_point when stream->si may be changed in
      sctp_process_init().
      
      v1->v2:
        - call sctp_assoc_update_frag_point() separately in sctp_process_init
          and sctp_association_init, per Marcelo's suggestion.
      
      Fixes: 2f5e3c9d
      
       ("sctp: introduce sctp_assoc_update_frag_point")
      Reported-by: default avatarJakub Audykowicz <jakub.audykowicz@gmail.com>
      Signed-off-by: default avatarXin Long <lucien.xin@gmail.com>
      Acked-by: default avatarMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Acked-by: default avatarNeil Horman <nhorman@tuxdriver.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      4135cce7
    • Christoph Paasch's avatar
      net: Prevent invalid access to skb->prev in __qdisc_drop_all · 9410d386
      Christoph Paasch authored
      __qdisc_drop_all() accesses skb->prev to get to the tail of the
      segment-list.
      
      With commit 68d2f84a ("net: gro: properly remove skb from list")
      the skb-list handling has been changed to set skb->next to NULL and set
      the list-poison on skb->prev.
      
      With that change, __qdisc_drop_all() will panic when it tries to
      dereference skb->prev.
      
      Since commit 992cba7e ("net: Add and use skb_list_del_init().")
      __list_del_entry is used, leaving skb->prev unchanged (thus,
      pointing to the list-head if it's the first skb of the list).
      This will make __qdisc_drop_all modify the next-pointer of the list-head
      and result in a panic later on:
      
      [   34.501053] general protection fault: 0000 [#1] SMP KASAN PTI
      [   34.501968] CPU: 2 PID: 0 Comm: swapper/2 Not tainted 4.20.0-rc2.mptcp #108
      [   34.502887] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.5.1 01/01/2011
      [   34.504074] RIP: 0010:dev_gro_receive+0x343/0x1f90
      [   34.504751] Code: e0 48 c1 e8 03 42 80 3c 30 00 0f 85 4a 1c 00 00 4d 8b 24 24 4c 39 65 d0 0f 84 0a 04 00 00 49 8d 7c 24 38 48 89 f8 48 c1 e8 03 <42> 0f b6 04 30 84 c0 74 08 3c 04
      [   34.507060] RSP: 0018:ffff8883af507930 EFLAGS: 00010202
      [   34.507761] RAX: 0000000000000007 RBX: ffff8883970b2c80 RCX: 1ffff11072e165a6
      [   34.508640] RDX: 1ffff11075867008 RSI: ffff8883ac338040 RDI: 0000000000000038
      [   34.509493] RBP: ffff8883af5079d0 R08: ffff8883970b2d40 R09: 0000000000000062
      [   34.510346] R10: 0000000000000034 R11: 0000000000000000 R12: 0000000000000000
      [   34.511215] R13: 0000000000000000 R14: dffffc0000000000 R15: ffff8883ac338008
      [   34.512082] FS:  0000000000000000(0000) GS:ffff8883af500000(0000) knlGS:0000000000000000
      [   34.513036] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [   34.513741] CR2: 000055ccc3e9d020 CR3: 00000003abf32000 CR4: 00000000000006e0
      [   34.514593] Call Trace:
      [   34.514893]  <IRQ>
      [   34.515157]  napi_gro_receive+0x93/0x150
      [   34.515632]  receive_buf+0x893/0x3700
      [   34.516094]  ? __netif_receive_skb+0x1f/0x1a0
      [   34.516629]  ? virtnet_probe+0x1b40/0x1b40
      [   34.517153]  ? __stable_node_chain+0x4d0/0x850
      [   34.517684]  ? kfree+0x9a/0x180
      [   34.518067]  ? __kasan_slab_free+0x171/0x190
      [   34.518582]  ? detach_buf+0x1df/0x650
      [   34.519061]  ? lapic_next_event+0x5a/0x90
      [   34.519539]  ? virtqueue_get_buf_ctx+0x280/0x7f0
      [   34.520093]  virtnet_poll+0x2df/0xd60
      [   34.520533]  ? receive_buf+0x3700/0x3700
      [   34.521027]  ? qdisc_watchdog_schedule_ns+0xd5/0x140
      [   34.521631]  ? htb_dequeue+0x1817/0x25f0
      [   34.522107]  ? sch_direct_xmit+0x142/0xf30
      [   34.522595]  ? virtqueue_napi_schedule+0x26/0x30
      [   34.523155]  net_rx_action+0x2f6/0xc50
      [   34.523601]  ? napi_complete_done+0x2f0/0x2f0
      [   34.524126]  ? kasan_check_read+0x11/0x20
      [   34.524608]  ? _raw_spin_lock+0x7d/0xd0
      [   34.525070]  ? _raw_spin_lock_bh+0xd0/0xd0
      [   34.525563]  ? kvm_guest_apic_eoi_write+0x6b/0x80
      [   34.526130]  ? apic_ack_irq+0x9e/0xe0
      [   34.526567]  __do_softirq+0x188/0x4b5
      [   34.527015]  irq_exit+0x151/0x180
      [   34.527417]  do_IRQ+0xdb/0x150
      [   34.527783]  common_interrupt+0xf/0xf
      [   34.528223]  </IRQ>
      
      This patch makes sure that skb->prev is set to NULL when entering
      netem_enqueue.
      
      Cc: Prashant Bhole <bhole_prashant_q7@lab.ntt.co.jp>
      Cc: Tyler Hicks <tyhicks@canonical.com>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Fixes: 68d2f84a
      
       ("net: gro: properly remove skb from list")
      Suggested-by: default avatarEric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: default avatarChristoph Paasch <cpaasch@apple.com>
      Reviewed-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      9410d386
  3. 29 Nov, 2018 12 commits
  4. 28 Nov, 2018 9 commits
    • Linus Torvalds's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 60b54823
      Linus Torvalds authored
      Pull networking fixes from David Miller:
      
       1) ARM64 JIT fixes for subprog handling from Daniel Borkmann.
      
       2) Various sparc64 JIT bug fixes (fused branch convergance, frame
          pointer usage detection logic, PSEODU call argument handling).
      
       3) Fix to use BH locking in nf_conncount, from Taehee Yoo.
      
       4) Fix race of TX skb freeing in ipheth driver, from Bernd Eckstein.
      
       5) Handle return value of TX NAPI completion properly in lan743x
          driver, from Bryan Whitehead.
      
       6) MAC filter deletion in i40e driver clears wrong state bit, from
          Lihong Yang.
      
       7) Fix use after free in rionet driver, from Pan Bian.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (53 commits)
        s390/qeth: fix length check in SNMP processing
        net: hisilicon: remove unexpected free_netdev
        rapidio/rionet: do not free skb before reading its length
        i40e: fix kerneldoc for xsk methods
        ixgbe: recognize 1000BaseLX SFP modules as 1Gbps
        i40e: Fix deletion of MAC filters
        igb: fix uninitialized variables
        netfilter: nf_tables: deactivate expressions in rule replecement routine
        lan743x: Enable driver to work with LAN7431
        tipc: fix lockdep warning during node delete
        lan743x: fix return value for lan743x_tx_napi_poll
        net: via: via-velocity: fix spelling mistake "alignement" -> "alignment"
        qed: fix spelling mistake "attnetion" -> "attention"
        net: thunderx: fix NULL pointer dereference in nic_remove
        sctp: increase sk_wmem_alloc when head->truesize is increased
        firestream: fix spelling mistake: "Inititing" -> "Initializing"
        net: phy: add workaround for issue where PHY driver doesn't bind to the device
        usbnet: ipheth: fix potential recvmsg bug and recvmsg bug 2
        sparc: Adjust bpf JIT prologue for PSEUDO calls.
        bpf, doc: add entries of who looks over which jits
        ...
      60b54823
    • Linus Torvalds's avatar
      Merge tag 'xtensa-20181128' of git://github.com/jcmvbkbc/linux-xtensa · b26b2b24
      Linus Torvalds authored
      Pull Xtensa fixes from Max Filippov:
      
       - fix kernel exception on userspace access to a currently disabled
         coprocessor
      
       - fix coprocessor data saving/restoring in configurations with multiple
         coprocessors
      
       - fix ptrace access to coprocessor data on configurations with multiple
         coprocessors with high alignment requirements
      
      * tag 'xtensa-20181128' of git://github.com/jcmvbkbc/linux-xtensa:
        xtensa: fix coprocessor part of ptrace_{get,set}xregs
        xtensa: fix coprocessor context offset definitions
        xtensa: enable coprocessors that are being flushed
      b26b2b24
    • David S. Miller's avatar
      Merge branch '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-queue · d78a5ebd
      David S. Miller authored
      
      
      Jeff Kirsher says:
      
      ====================
      Intel Wired LAN Driver Fixes 2018-11-28
      
      This series contains fixes to igb, ixgbe and i40e.
      
      Yunjian Wang from Huawei resolves a variable that could potentially be
      NULL before it is used.
      
      Lihong fixes an i40e issue which goes back to 4.17 kernels, where
      deleting any of the MAC filters was causing the incorrect syncing for
      the PF.
      
      Josh Elsasser caught that there were missing enum values in the link
      capabilities for x550 devices, which was preventing link for 1000BaseLX
      SFP modules.
      
      Jan fixes the function header comments for XSK methods.
      ====================
      
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      d78a5ebd
    • Julian Wiedmann's avatar
      s390/qeth: fix length check in SNMP processing · 9a764c1e
      Julian Wiedmann authored
      The response for a SNMP request can consist of multiple parts, which
      the cmd callback stages into a kernel buffer until all parts have been
      received. If the callback detects that the staging buffer provides
      insufficient space, it bails out with error.
      This processing is buggy for the first part of the response - while it
      initially checks for a length of 'data_len', it later copies an
      additional amount of 'offsetof(struct qeth_snmp_cmd, data)' bytes.
      
      Fix the calculation of 'data_len' for the first part of the response.
      This also nicely cleans up the memcpy code.
      
      Fixes: 1da177e4
      
       ("Linux-2.6.12-rc2")
      Signed-off-by: default avatarJulian Wiedmann <jwi@linux.ibm.com>
      Reviewed-by: default avatarUrsula Braun <ubraun@linux.ibm.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      9a764c1e
    • David S. Miller's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf · e9d8faf9
      David S. Miller authored
      
      
      Pablo Neira Ayuso says:
      
      ====================
      Netfilter fixes for net
      
      The following patchset contains Netfilter fixes for net:
      
      1) Disable BH while holding list spinlock in nf_conncount, from
         Taehee Yoo.
      
      2) List corruption in nf_conncount, also from Taehee.
      
      3) Fix race that results in leaving around an empty list node in
         nf_conncount, from Taehee Yoo.
      
      4) Proper chain handling for inactive chains from the commit path,
         from Florian Westphal. This includes a selftest for this.
      
      5) Do duplicate rule handles when replacing rules, also from Florian.
      
      6) Remove net_exit path in xt_RATEEST that results in splat, from Taehee.
      
      7) Possible use-after-free in nft_compat when releasing extensions.
         From Florian.
      
      8) Memory leak in xt_hashlimit, from Taehee.
      
      9) Call ip_vs_dst_notifier after ipv6_dev_notf, from Xin Long.
      
      10) Fix cttimeout with udplite and gre, from Florian.
      
      11) Preserve oif for IPv6 link-local generated traffic from mangle
          table, from Alin Nastac.
      
      12) Missing error handling in masquerade notifiers, from Taehee Yoo.
      
      13) Use mutex to protect registration/unregistration of masquerade
          extensions in order to prevent a race, from Taehee.
      
      14) Incorrect condition check in tree_nodes_free(), also from Taehee.
      
      15) Fix chain counter leak in rule replacement path, from Taehee.
      ====================
      
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      e9d8faf9
    • Pan Bian's avatar
      net: hisilicon: remove unexpected free_netdev · c7589401
      Pan Bian authored
      
      
      The net device ndev is freed via free_netdev when failing to register
      the device. The control flow then jumps to the error handling code
      block. ndev is used and freed again. Resulting in a use-after-free bug.
      
      Signed-off-by: default avatarPan Bian <bianpan2016@163.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      c7589401
    • Pan Bian's avatar
      rapidio/rionet: do not free skb before reading its length · cfc43519
      Pan Bian authored
      skb is freed via dev_kfree_skb_any, however, skb->len is read then. This
      may result in a use-after-free bug.
      
      Fixes: e6161d64
      
       ("rapidio/rionet: rework driver initialization and removal")
      Signed-off-by: default avatarPan Bian <bianpan2016@163.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      cfc43519
    • Jan Sokolowski's avatar
      i40e: fix kerneldoc for xsk methods · 529eb362
      Jan Sokolowski authored
      
      
      One method, xsk_umem_setup, had an incorrect kernel doc
      description, which has been corrected.
      
      Also fixes small typos found in the comments.
      
      Signed-off-by: default avatarJan Sokolowski <jan.sokolowski@intel.com>
      Tested-by: default avatarAndrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: default avatarJeff Kirsher <jeffrey.t.kirsher@intel.com>
      529eb362
    • Linus Torvalds's avatar
      Merge tag 'for-4.20-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux · 121b018f
      Linus Torvalds authored
      Pull btrfs fixes from David Sterba:
       "Some of these bugs are being hit during testing so we'd like to get
        them merged, otherwise there are usual stability fixes for stable
        trees"
      
      * tag 'for-4.20-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
        btrfs: relocation: set trans to be NULL after ending transaction
        Btrfs: fix race between enabling quotas and subvolume creation
        Btrfs: send, fix infinite loop due to directory rename dependencies
        Btrfs: ensure path name is null terminated at btrfs_control_ioctl
        Btrfs: fix rare chances for data loss when doing a fast fsync
        btrfs: Always try all copies when reading extent buffers
      121b018f