1. 18 Jan, 2019 9 commits
    • Ido Schimmel's avatar
      mlxsw: pci: Ring CQ's doorbell before RDQ's · c9ebea04
      Ido Schimmel authored
      When a packet should be trapped to the CPU the device consumes a WQE
      (work queue element) from an RDQ (receive descriptor queue) and copies
      the packet to the address specified in the WQE. The device then tries to
      post a CQE (completion queue element) that contains various metadata
      (e.g., ingress port) about the packet to a CQ (completion queue).
      
      In case the device managed to consume a WQE, but did not manage to post
      the corresponding CQE, it will get stuck. This unlikely situation can be
      triggered due to the scheme the driver is currently using to process
      CQEs.
      
      The driver will consume up to 512 CQEs at a time and after processing
      each corresponding WQE it will ring the RDQ's doorbell, letting the
      device know that a new WQE was posted for it to consume. Only after
      processing all the CQEs (up to 512), the driver will ring the CQ's
      doorbell, letting the device know that new ones can be posted.
      
      Fix this by having the driver ring the CQ's doorbell for every processed
      CQE, but before ringing the RDQ's doorbell. This guarantees that
      whenever we post a new WQE, there is a corresponding CQE available. Copy
      the currently processed CQE to prevent the device from overwriting it
      with a new CQE after ringing the doorbell.
      
      Note that the driver still arms the CQ only after processing all the
      pending CQEs, so that interrupts for this CQ will only be delivered
      after the driver finished its processing.
      
      Before commit 8404f6f2 ("mlxsw: pci: Allow to use CQEs of version 1
      and version 2") the issue was virtually impossible to trigger since the
      number of CQEs was twice the number of WQEs and the number of CQEs
      processed at a time was equal to the number of available WQEs.
      
      Fixes: 8404f6f2
      
       ("mlxsw: pci: Allow to use CQEs of version 1 and version 2")
      Signed-off-by: default avatarIdo Schimmel <idosch@mellanox.com>
      Reported-by: default avatarSemion Lisyansky <semionl@mellanox.com>
      Tested-by: default avatarSemion Lisyansky <semionl@mellanox.com>
      Acked-by: default avatarJiri Pirko <jiri@mellanox.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      c9ebea04
    • Jonathan Neuschäfer's avatar
      net: Fix typo in NET_FAILOVER help text · 9437b629
      Jonathan Neuschäfer authored
      "also enables" should not be spelled as one word.
      
      Fixes: cfc80d9a
      
       ("net: Introduce net_failover driver")
      Signed-off-by: default avatarJonathan Neuschäfer <j.neuschaefer@gmx.net>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      9437b629
    • Ross Lagerwall's avatar
      net: Fix usage of pskb_trim_rcsum · 6c57f045
      Ross Lagerwall authored
      
      
      In certain cases, pskb_trim_rcsum() may change skb pointers.
      Reinitialize header pointers afterwards to avoid potential
      use-after-frees. Add a note in the documentation of
      pskb_trim_rcsum(). Found by KASAN.
      Signed-off-by: default avatarRoss Lagerwall <ross.lagerwall@citrix.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      6c57f045
    • Thomas Petazzoni's avatar
      net: phy: mdio_bus: add missing device_del() in mdiobus_register() error handling · e40e2a2e
      Thomas Petazzoni authored
      The current code in __mdiobus_register() doesn't properly handle
      failures returned by the devm_gpiod_get_optional() call: it returns
      immediately, without unregistering the device that was added by the
      call to device_register() earlier in the function.
      
      This leaves a stale device, which then causes a NULL pointer
      dereference in the code that handles deferred probing:
      
      [    1.489982] Unable to handle kernel NULL pointer dereference at virtual address 00000074
      [    1.498110] pgd = (ptrval)
      [    1.500838] [00000074] *pgd=00000000
      [    1.504432] Internal error: Oops: 17 [#1] SMP ARM
      [    1.509133] Modules linked in:
      [    1.512192] CPU: 1 PID: 51 Comm: kworker/1:3 Not tainted 4.20.0-00039-g3b73a4cc8b3e-dirty #99
      [    1.520708] Hardware name: Xilinx Zynq Platform
      [    1.525261] Workqueue: events deferred_probe_work_func
      [    1.530403] PC is at klist_next+0x10/0xfc
      [    1.534403] LR is at device_for_each_child+0x40/0x94
      [    1.539361] pc : [<c0683fbc>]    lr : [<c0455d90>]    psr: 200e0013
      [    1.545628] sp : ceeefe68  ip : 00000001  fp : ffffe000
      [    1.550863] r10: 00000000  r9 : c0c66790  r8 : 00000000
      [    1.556079] r7 : c0457d44  r6 : 00000000  r5 : ceeefe8c  r4 : cfa2ec78
      [    1.562604] r3 : 00000064  r2 : c0457d44  r1 : ceeefe8c  r0 : 00000064
      [    1.569129] Flags: nzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
      [    1.576263] Control: 18c5387d  Table: 0ed7804a  DAC: 00000051
      [    1.582013] Process kworker/1:3 (pid: 51, stack limit = 0x(ptrval))
      [    1.588280] Stack: (0xceeefe68 to 0xceef0000)
      [    1.592630] fe60:                   cfa2ec78 c0c03c08 00000000 c0457d44 00000000 c0c66790
      [    1.600814] fe80: 00000000 c0455d90 ceeefeac 00000064 00000000 0d7a542e cee9d494 cfa2ec78
      [    1.608998] fea0: cfa2ec78 00000000 c0457d44 c0457d7c cee9d494 c0c03c08 00000000 c0455dac
      [    1.617182] fec0: cf98ba44 cf926a00 cee9d494 0d7a542e 00000000 cf935a10 cf935a10 cf935a10
      [    1.625366] fee0: c0c4e9b8 c0457d7c c0c4e80c 00000001 cf935a10 c0457df4 cf935a10 c0c4e99c
      [    1.633550] ff00: c0c4e99c c045a27c c0c4e9c4 ced63f80 cfde8a80 cfdebc00 00000000 c013893c
      [    1.641734] ff20: cfde8a80 cfde8a80 c07bd354 ced63f80 ced63f94 cfde8a80 00000008 c0c02d00
      [    1.649936] ff40: cfde8a98 cfde8a80 ffffe000 c0139a30 ffffe000 c0c6624a c07bd354 00000000
      [    1.658120] ff60: ffffe000 cee9e780 ceebfe00 00000000 ceeee000 ced63f80 c0139788 cf8cdea4
      [    1.666304] ff80: cee9e79c c013e598 00000001 ceebfe00 c013e44c 00000000 00000000 00000000
      [    1.674488] ffa0: 00000000 00000000 00000000 c01010e8 00000000 00000000 00000000 00000000
      [    1.682671] ffc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
      [    1.690855] ffe0: 00000000 00000000 00000000 00000000 00000013 00000000 00000000 00000000
      [    1.699058] [<c0683fbc>] (klist_next) from [<c0455d90>] (device_for_each_child+0x40/0x94)
      [    1.707241] [<c0455d90>] (device_for_each_child) from [<c0457d7c>] (device_reorder_to_tail+0x38/0x88)
      [    1.716476] [<c0457d7c>] (device_reorder_to_tail) from [<c0455dac>] (device_for_each_child+0x5c/0x94)
      [    1.725692] [<c0455dac>] (device_for_each_child) from [<c0457d7c>] (device_reorder_to_tail+0x38/0x88)
      [    1.734927] [<c0457d7c>] (device_reorder_to_tail) from [<c0457df4>] (device_pm_move_to_tail+0x28/0x40)
      [    1.744235] [<c0457df4>] (device_pm_move_to_tail) from [<c045a27c>] (deferred_probe_work_func+0x58/0x8c)
      [    1.753746] [<c045a27c>] (deferred_probe_work_func) from [<c013893c>] (process_one_work+0x210/0x4fc)
      [    1.762888] [<c013893c>] (process_one_work) from [<c0139a30>] (worker_thread+0x2a8/0x5c0)
      [    1.771072] [<c0139a30>] (worker_thread) from [<c013e598>] (kthread+0x14c/0x154)
      [    1.778482] [<c013e598>] (kthread) from [<c01010e8>] (ret_from_fork+0x14/0x2c)
      [    1.785689] Exception stack(0xceeeffb0 to 0xceeefff8)
      [    1.790739] ffa0:                                     00000000 00000000 00000000 00000000
      [    1.798923] ffc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
      [    1.807107] ffe0: 00000000 00000000 00000000 00000000 00000013 00000000
      [    1.813724] Code: e92d47f0 e1a05000 e8900048 e1a00003 (e5937010)
      [    1.819844] ---[ end trace 3c2c0c8b65399ec9 ]---
      
      The actual error that we had from devm_gpiod_get_optional() was
      -EPROBE_DEFER, due to the GPIO being provided by a driver that is
      probed later than the Ethernet controller driver.
      
      To fix this, we simply add the missing device_del() invocation in the
      error path.
      
      Fixes: 69226896
      
       ("mdio_bus: Issue GPIO RESET to PHYs")
      Signed-off-by: default avatarThomas Petazzoni <thomas.petazzoni@bootlin.com>
      Reviewed-by: default avatarAndrew Lunn <andrew@lunn.ch>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      e40e2a2e
    • Yang Wei's avatar
      macvlan: replace kfree_skb by consume_skb for drop profiles · bf97403a
      Yang Wei authored
      
      
      Replace the kfree_skb() by consume_skb() to be drop monitor(dropwatch,
      perf) friendly.
      Signed-off-by: default avatarYang Wei <yang.wei9@zte.com.cn>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      bf97403a
    • Lendacky, Thomas's avatar
      amd-xgbe: Fix mdio access for non-zero ports and clause 45 PHYs · 5ab3121b
      Lendacky, Thomas authored
      The XGBE hardware has support for performing MDIO operations using an
      MDIO command request. The driver mistakenly uses the mdio port address
      as the MDIO command request device address instead of the MDIO command
      request port address. Additionally, the driver does not properly check
      for and create a clause 45 MDIO command.
      
      Check the supplied MDIO register to determine if the request is a clause
      45 operation (MII_ADDR_C45). For a clause 45 operation, extract the device
      address and register number from the supplied MDIO register and use them
      to set the MDIO command request device address and register number fields.
      For a clause 22 operation, the MDIO request device address is set to zero
      and the MDIO command request register number is set to the supplied MDIO
      register. In either case, the supplied MDIO port address is used as the
      MDIO command request port address.
      
      Fixes: 732f2ab7
      
       ("amd-xgbe: Add support for MDIO attached PHYs")
      Signed-off-by: default avatarTom Lendacky <thomas.lendacky@amd.com>
      Tested-by: default avatarShyam Sundar S K <Shyam-sundar.S-k@amd.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      5ab3121b
    • Camelia Groza's avatar
      net: phy: add missing phy driver features · 40f89ebf
      Camelia Groza authored
      The phy drivers for CS4340 and TN2020 are missing their
      features attributes. Add them.
      
      Fixes: 719655a1
      
       ("net: phy: Replace phy driver features u32 with link_mode bitmap")
      Reported-by: default avatarScott Wood <oss@buserror.net>
      Signed-off-by: default avatarCamelia Groza <camelia.groza@nxp.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      40f89ebf
    • Madalin Bucur's avatar
      dpaa_eth: NETIF_F_LLTX requires to do our own update of trans_start · c6ddfb9a
      Madalin Bucur authored
      
      
      As txq_trans_update() only updates trans_start when the lock is held,
      trans_start does not get updated if NETIF_F_LLTX is declared.
      Signed-off-by: default avatarMadalin Bucur <madalin.bucur@nxp.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      c6ddfb9a
    • Jason Wang's avatar
      vhost: log dirty page correctly · cc5e7107
      Jason Wang authored
      Vhost dirty page logging API is designed to sync through GPA. But we
      try to log GIOVA when device IOTLB is enabled. This is wrong and may
      lead to missing data after migration.
      
      To solve this issue, when logging with device IOTLB enabled, we will:
      
      1) reuse the device IOTLB translation result of GIOVA->HVA mapping to
         get HVA, for writable descriptor, get HVA through iovec. For used
         ring update, translate its GIOVA to HVA
      2) traverse the GPA->HVA mapping to get the possible GPA and log
         through GPA. Pay attention this reverse mapping is not guaranteed
         to be unique, so we should log each possible GPA in this case.
      
      This fix the failure of scp to guest during migration. In -next, we
      will probably support passing GIOVA->GPA instead of GIOVA->HVA.
      
      Fixes: 6b1e6cc7
      
       ("vhost: new device IOTLB API")
      Reported-by: default avatarJintack Lim <jintack@cs.columbia.edu>
      Cc: Jintack Lim <jintack@cs.columbia.edu>
      Signed-off-by: default avatarJason Wang <jasowang@redhat.com>
      Acked-by: default avatarMichael S. Tsirkin <mst@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      cc5e7107
  2. 17 Jan, 2019 2 commits
  3. 16 Jan, 2019 6 commits
  4. 15 Jan, 2019 2 commits
  5. 13 Jan, 2019 1 commit
  6. 12 Jan, 2019 5 commits
    • Michael Chan's avatar
      bnxt_en: Fix context memory allocation. · 6ef982de
      Michael Chan authored
      When allocating memory pages for context memory, if the last page table
      should be fully populated, the current code will set nr_pages to 0 when
      calling bnxt_alloc_ctx_mem_blk().  This will cause the last page table
      to be completely blank and causing some RDMA failures.
      
      Fix it by setting the last page table's nr_pages to the remainder only
      if it is non-zero.
      
      Fixes: 08fe9d18
      
       ("bnxt_en: Add Level 2 context memory paging support.")
      Reported-by: default avatarEric Davis <eric.davis@broadcom.com>
      Signed-off-by: default avatarMichael Chan <michael.chan@broadcom.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      6ef982de
    • Michael Chan's avatar
      bnxt_en: Fix ring checking logic on 57500 chips. · 0b815023
      Michael Chan authored
      In bnxt_hwrm_check_pf_rings(), add the proper flag to test the NQ
      resources.  Without the proper flag, the firmware will change
      the NQ resource allocation and remap the IRQ, causing missing
      IRQs.  This issue shows up when adding MQPRIO TX queues, for example.
      
      Fixes: 36d65be9
      
       ("bnxt_en: Disable MSIX before re-reserving NQs/CMPL rings.")
      Signed-off-by: default avatarMichael Chan <michael.chan@broadcom.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      0b815023
    • Gustavo A. R. Silva's avatar
      mISDN: hfcsusb: Use struct_size() in kzalloc() · 8d008e64
      Gustavo A. R. Silva authored
      
      
      One of the more common cases of allocation size calculations is finding the
      size of a structure that has a zero-sized array at the end, along with memory
      for some number of elements for that array. For example:
      
      struct foo {
          int stuff;
          void *entry[];
      };
      
      instance = kzalloc(sizeof(struct foo) + sizeof(void *) * count, GFP_KERNEL);
      
      Instead of leaving these open-coded and prone to type mistakes, we can now
      use the new struct_size() helper:
      
      instance = kzalloc(struct_size(instance, entry, count), GFP_KERNEL);
      
      This code was detected with the help of Coccinelle.
      Signed-off-by: default avatarGustavo A. R. Silva <gustavo@embeddedor.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      8d008e64
    • Jia-Ju Bai's avatar
      isdn: i4l: isdn_tty: Fix some concurrency double-free bugs · 2ff33d66
      Jia-Ju Bai authored
      
      
      The functions isdn_tty_tiocmset() and isdn_tty_set_termios() may be
      concurrently executed.
      
      isdn_tty_tiocmset
        isdn_tty_modem_hup
          line 719: kfree(info->dtmf_state);
          line 721: kfree(info->silence_state);
          line 723: kfree(info->adpcms);
          line 725: kfree(info->adpcmr);
      
      isdn_tty_set_termios
        isdn_tty_modem_hup
          line 719: kfree(info->dtmf_state);
          line 721: kfree(info->silence_state);
          line 723: kfree(info->adpcms);
          line 725: kfree(info->adpcmr);
      
      Thus, some concurrency double-free bugs may occur.
      
      These possible bugs are found by a static tool written by myself and
      my manual code review.
      
      To fix these possible bugs, the mutex lock "modem_info_mutex" used in
      isdn_tty_tiocmset() is added in isdn_tty_set_termios().
      Signed-off-by: default avatarJia-Ju Bai <baijiaju1990@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      2ff33d66
    • Zha Bin's avatar
      vhost/vsock: fix vhost vsock cid hashing inconsistent · 7fbe078c
      Zha Bin authored
      The vsock core only supports 32bit CID, but the Virtio-vsock spec define
      CID (dst_cid and src_cid) as u64 and the upper 32bits is reserved as
      zero. This inconsistency causes one bug in vhost vsock driver. The
      scenarios is:
      
        0. A hash table (vhost_vsock_hash) is used to map an CID to a vsock
        object. And hash_min() is used to compute the hash key. hash_min() is
        defined as:
        (sizeof(val) <= 4 ? hash_32(val, bits) : hash_long(val, bits)).
        That means the hash algorithm has dependency on the size of macro
        argument 'val'.
        0. In function vhost_vsock_set_cid(), a 64bit CID is passed to
        hash_min() to compute the hash key when inserting a vsock object into
        the hash table.
        0. In function vhost_vsock_get(), a 32bit CID is passed to hash_min()
        to compute the hash key when looking up a vsock for an CID.
      
      Because the different size of the CID, hash_min() returns different hash
      key, thus fails to look up the vsock object for an CID.
      
      To fix this bug, we keep CID as u64 in the IOCTLs and virtio message
      headers, but explicitly convert u64 to u32 when deal with the hash table
      and vsock core.
      
      Fixes: 834e772c ("vhost/vsock: fix use-after-free in network stack callers")
      Link: https://github.com/stefanha/virtio/blob/vsock/trunk/content.tex
      
      Signed-off-by: default avatarZha Bin <zhabin@linux.alibaba.com>
      Reviewed-by: default avatarLiu Jiang <gerry@linux.alibaba.com>
      Reviewed-by: default avatarStefan Hajnoczi <stefanha@redhat.com>
      Acked-by: default avatarJason Wang <jasowang@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      7fbe078c
  7. 11 Jan, 2019 15 commits