1. 18 Jan, 2019 1 commit
    • mlxsw: pci: Ring CQ's doorbell before RDQ's · c9ebea04
      Ido Schimmel authored
      When a packet should be trapped to the CPU the device consumes a WQE
      (work queue element) from an RDQ (receive descriptor queue) and copies
      the packet to the address specified in the WQE. The device then tries to
      post a CQE (completion queue element) that contains various metadata
      (e.g., ingress port) about the packet to a CQ (completion queue).
      
      If the device manages to consume a WQE but cannot post the
      corresponding CQE, it gets stuck. This unlikely situation can be
      triggered by the scheme the driver currently uses to process CQEs.
      
      The driver consumes up to 512 CQEs at a time, and after processing
      each corresponding WQE it rings the RDQ's doorbell, letting the
      device know that a new WQE was posted for it to consume. Only after
      processing all the CQEs (up to 512) does the driver ring the CQ's
      doorbell, letting the device know that new ones can be posted. In the
      meantime the device can consume the newly posted WQEs while the CQ
      may still have no free entries.
      
      Fix this by having the driver ring the CQ's doorbell for every processed
      CQE, but before ringing the RDQ's doorbell. This guarantees that
      whenever we post a new WQE, there is a corresponding CQE available. Copy
      the currently processed CQE to prevent the device from overwriting it
      with a new CQE after ringing the doorbell.
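
      The ordering described above can be sketched in plain C. This is a
      minimal userspace model, not the driver's code: 'struct model_queue',
      'model_cq_process()' and the ring sizes are hypothetical names and
      values chosen for illustration, and the two doorbells are modelled as
      simple counters.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MODEL_CQ_SIZE  4   /* hypothetical number of CQE slots in the ring */
#define MODEL_CQE_SIZE 16  /* hypothetical CQE size in bytes */

/* Hypothetical model of one CQ/RDQ pair; not the driver's real structures. */
struct model_queue {
	uint8_t cqes[MODEL_CQ_SIZE][MODEL_CQE_SIZE]; /* ring the device writes */
	unsigned int cq_consumer;   /* next CQE slot the driver reads */
	unsigned int cq_doorbell;   /* CQE slots handed back to the device */
	unsigned int rdq_doorbell;  /* WQEs reposted to the device */
	uint8_t last_tag;           /* first byte of the last handled CQE */
};

/* Handle up to 'budget' of the 'pending' CQEs; returns how many were handled. */
static unsigned int model_cq_process(struct model_queue *q,
				     unsigned int pending, unsigned int budget)
{
	unsigned int done = 0;

	while (done < pending && done < budget) {
		uint8_t cqe_copy[MODEL_CQE_SIZE];

		/* Copy first: after the CQ doorbell the slot may be overwritten. */
		memcpy(cqe_copy, q->cqes[q->cq_consumer % MODEL_CQ_SIZE],
		       MODEL_CQE_SIZE);

		/* Step 1: ring the CQ doorbell, freeing a CQE slot. */
		q->cq_doorbell = ++q->cq_consumer;

		/* Packet handling works on the private copy only. */
		q->last_tag = cqe_copy[0];

		/* Step 2: ring the RDQ doorbell, reposting the WQE. */
		q->rdq_doorbell++;

		/* Every reposted WQE is matched by a CQE slot freed before it. */
		assert(q->cq_doorbell >= q->rdq_doorbell);
		done++;
	}
	return done;
}
```

      In this model the assertion inside the loop expresses the guarantee
      from the commit message: the device never sees a new WQE without a
      free CQE slot to report it in.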
      
      Note that the driver still arms the CQ only after processing all the
      pending CQEs, so that interrupts for this CQ are only delivered
      after the driver has finished its processing.
      
      Before commit 8404f6f2 ("mlxsw: pci: Allow to use CQEs of version 1
      and version 2") the issue was virtually impossible to trigger since the
      number of CQEs was twice the number of WQEs and the number of CQEs
      processed at a time was equal to the number of available WQEs.
      
      Fixes: 8404f6f2 ("mlxsw: pci: Allow to use CQEs of version 1 and version 2")
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Reported-by: Semion Lisyansky <semionl@mellanox.com>
      Tested-by: Semion Lisyansky <semionl@mellanox.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 08 Jan, 2019 7 commits
  3. 24 Dec, 2018 1 commit
  4. 20 Dec, 2018 8 commits
  5. 19 Dec, 2018 9 commits
    • mlxsw: spectrum_router: Hold a reference on RIF's netdev · b61cd7c6
      Ido Schimmel authored
      
      
      Previous patches tried to make RIF deletion more robust and avoid
      use-after-free situations.
      
      As another precaution, hold a reference on a RIF's netdev and release it
      when the RIF is deleted.
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Reviewed-by: Petr Machata <petrm@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mlxsw: spectrum_router: Make RIF deletion more robust · 965fa8e6
      Ido Schimmel authored
      
      
      In the past we had multiple instances where RIFs were not properly
      deleted.
      
      One of the reasons for leaking a RIF was that at the time when IP
      addresses were flushed from the respective netdev (prompting the
      destruction of the RIF), the netdev was no longer a mlxsw upper. This
      caused the inet{,6}addr notification blocks to ignore the NETDEV_DOWN
      event and leak the RIF.
      
      Rather than checking whether the netdev is our upper when an IP address
      is removed, we can check whether the netdev has a RIF configured.
      
      To look up a RIF we need to access mlxsw private data, so the patch
      stores the notification blocks inside a mlxsw struct. This then allows
      us to use container_of() and extract the required private data.
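
      The container_of() pattern mentioned above can be sketched in
      userspace C. The macro below is a simplified stand-in for the kernel
      macro, and 'struct model_notifier_block', 'struct model_router' and
      'model_inetaddr_event()' are hypothetical names used only for this
      illustration, not the driver's definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified userspace stand-in for the kernel's container_of() macro. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical notification block: only carries a callback pointer. */
struct model_notifier_block {
	int (*notifier_call)(struct model_notifier_block *nb,
			     unsigned long event);
};

/* Hypothetical private struct that embeds the notification block. */
struct model_router {
	int rif_count;                           /* private data the callback needs */
	struct model_notifier_block inetaddr_nb; /* embedded, not a pointer */
};

/* The callback only receives 'nb', yet can recover the enclosing struct. */
static int model_inetaddr_event(struct model_notifier_block *nb,
				unsigned long event)
{
	struct model_router *router;

	assert(nb != NULL);
	(void)event; /* the event type is irrelevant for this sketch */
	router = container_of(nb, struct model_router, inetaddr_nb);
	return router->rif_count;
}
```

      Because the block is embedded in the private struct, no global
      lookup is needed to get from the callback argument to the private
      data.
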
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Reviewed-by: Petr Machata <petrm@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mlxsw: spectrum_router: Propagate 'struct mlxsw_sp' further · 21ffedb6
      Ido Schimmel authored
      
      
      Next patch is going to make RIF deletion more robust by removing
      reliance on fragile mlxsw_sp_lower_get(). This is because a netdev is
      not necessarily our upper anymore when its IP addresses are flushed.
      
      The inet{,6}addr notification blocks are going to resolve 'struct
      mlxsw_sp' using container_of(), but the functions they call still use
      mlxsw_sp_lower_get().
      
      As a preparation for the next patch, propagate 'struct mlxsw_sp' down to
      the functions called from the notification blocks and remove reliance on
      mlxsw_sp_lower_get().
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Reviewed-by: Petr Machata <petrm@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mlxsw: spectrum: Properly cleanup LAG uppers when removing port from LAG · be2d6f42
      Ido Schimmel authored
      
      
      When a LAG device or a VLAN device on top of it is enslaved to a bridge,
      the driver propagates the CHANGEUPPER event to the LAG's slaves.
      
      This causes each physical port to increase the reference count of the
      internal representation of the bridge port by calling
      mlxsw_sp_port_bridge_join().
      
      However, when a port is removed from a LAG, the corresponding leave()
      function is not called and the reference count is not decremented. This
      leads to ugly hacks such as mlxsw_sp_bridge_port_should_destroy() that
      try to understand if the bridge port should be destroyed even when its
      reference count is not 0.
      
      Instead, make sure that when a port is unlinked from a LAG it sees
      the same events as if the LAG (or its uppers) were unlinked from a
      bridge.
      
      The above is achieved by walking the LAG's uppers when a port is
      unlinked and calling mlxsw_sp_port_bridge_leave() for each upper that is
      enslaved to a bridge.
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Reviewed-by: Petr Machata <petrm@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mlxsw: spectrum: Remove reference count from VLAN entries · 635c8c8b
      Ido Schimmel authored
      Commit b3529af6 ("spectrum: Reference count VLAN entries") started
      reference counting port-VLAN entries in a similar fashion to the 8021q
      driver.
      
      However, this is not actually needed and only complicates things.
      Instead, the driver should forbid the creation of a VLAN on a port if
      this VLAN already exists. This would also solve the issue fixed by the
      mentioned commit.
      
      Therefore, remove the get()/put() API and use create()/destroy()
      instead.
      
      One place that needs special attention is VLAN addition in a VLAN-aware
      bridge via switchdev operations. In case the VLAN flags (e.g., 'pvid')
      are toggled, then the VLAN entry already exists. To prevent the driver
      from wrongly returning EEXIST, the driver is changed to check in the
      prepare phase whether the entry already exists and only returns an error
      in case it is not associated with the correct bridge port.
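
      A minimal sketch of the create()/destroy() semantics and the
      prepare-phase check described above, as a userspace model. All names
      here ('struct model_port', 'model_port_vlan_create()',
      'model_bridge_vlan_add_prepare()' and so on) are hypothetical and do
      not mirror the driver's actual functions.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MODEL_VID_COUNT 4096 /* VIDs 0..4095 */

/* Hypothetical port-VLAN entry: no reference count, it exists or it does not. */
struct model_port_vlan {
	int in_use;
	const void *bridge_port; /* bridge port the entry is associated with */
};

struct model_port {
	struct model_port_vlan vlans[MODEL_VID_COUNT];
};

/* Creating an entry that already exists is refused instead of counted. */
static int model_port_vlan_create(struct model_port *port, unsigned int vid)
{
	if (vid >= MODEL_VID_COUNT)
		return -EINVAL;
	if (port->vlans[vid].in_use)
		return -EEXIST;
	port->vlans[vid].in_use = 1;
	port->vlans[vid].bridge_port = NULL;
	return 0;
}

static void model_port_vlan_destroy(struct model_port *port, unsigned int vid)
{
	assert(vid < MODEL_VID_COUNT && port->vlans[vid].in_use);
	port->vlans[vid].in_use = 0;
	port->vlans[vid].bridge_port = NULL;
}

/*
 * Prepare phase of a VLAN addition: an existing entry is only an error
 * when it belongs to a different bridge port (e.g. a flag toggle on the
 * same bridge port must succeed).
 */
static int model_bridge_vlan_add_prepare(const struct model_port *port,
					 unsigned int vid,
					 const void *bridge_port)
{
	if (vid >= MODEL_VID_COUNT)
		return -EINVAL;
	if (port->vlans[vid].in_use &&
	    port->vlans[vid].bridge_port != bridge_port)
		return -EEXIST;
	return 0;
}
```
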
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Reviewed-by: Petr Machata <petrm@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mlxsw: spectrum: Handle VLAN device unlinking · e149113a
      Ido Schimmel authored
      In commit 993107fe ("mlxsw: spectrum_switchdev: Fix VLAN device
      deletion via ioctl") I fixed a bug caused by the fact that the driver
      treats the deletion of a VLAN device differently depending on whether
      it is deleted via an ioctl or via netlink.
      
      Instead of relying on a specific order of events (device being
      unregistered vs. VLAN filter being updated), simply make sure that the
      driver performs the necessary cleanup when the VLAN device is unlinked,
      which always happens before the other two events.
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Reviewed-by: Petr Machata <petrm@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mlxsw: spectrum_fid: Remove unused function · f1d7c33d
      Ido Schimmel authored
      
      
      This function is no longer used. Remove it.
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Reviewed-by: Petr Machata <petrm@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mlxsw: spectrum_router: Do not destroy RIFs based on FID's reference count · 32fd4b49
      Ido Schimmel authored
      
      
      Currently, when a RIF is constructed on top of a FID, the RIF increments
      the FID's reference count and the RIF is destroyed when the FID's
      reference count drops to 1. This effectively means that when no local
      ports are members of the FID, the FID is destroyed regardless of
      whether the router port is a member of the FID.
      
      The above can lead to the unexpected behavior in which routes using a
      VLAN interface as their nexthop device are no longer offloaded after the
      last local port leaves the corresponding VLAN (FID).
      
      Example:
      # ip -4 route show dev br0.10
      192.0.2.0/24 proto kernel scope link src 192.0.2.1 offload
      # bridge vlan del vid 10 dev swp3
      # ip -4 route show dev br0.10
      192.0.2.0/24 proto kernel scope link src 192.0.2.1
      
      After the patch, the route is offloaded before and after the VLAN is
      removed from local port 'swp3', as the RIF corresponding to 'br0.10'
      continues to exist.
      
      In order to remove RIFs' reliance on the underlying FID's reference
      count, we need to add a reference count to sub-port RIFs, which are RIFs
      that correspond to physical ports and their uppers (e.g., LAG devices).
      
      In this case, each {Port, VID} ('struct mlxsw_sp_port_vlan') needs to
      hold a reference on the RIF. For example:
      
                             bond0.10
                                |
                              bond0
                                |
                            +-------+
                            |       |
                          swp1    swp2
      
      Both {Port 1, VID 10} and {Port 2, VID 10} will hold a reference on the
      RIF corresponding to 'bond0.10'. When the last reference is dropped, the
      RIF will be destroyed.
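
      The reference counting described above can be sketched as a small
      userspace model in which each {Port, VID} takes one reference on a
      shared RIF. 'struct model_rif', 'model_rif_get()' and
      'model_rif_put()' are hypothetical names for illustration only.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical sub-port RIF: only the reference count is modelled. */
struct model_rif {
	unsigned int ref_count;
};

/*
 * Take a reference. Passing NULL creates the RIF with one reference,
 * as the first {Port, VID} would. Returns NULL on allocation failure.
 */
static struct model_rif *model_rif_get(struct model_rif *rif)
{
	if (!rif) {
		rif = calloc(1, sizeof(*rif));
		if (!rif)
			return NULL;
	}
	rif->ref_count++;
	return rif;
}

/*
 * Drop a reference. Returns NULL once the last reference is gone and
 * the RIF has been destroyed, otherwise the still-live RIF.
 */
static struct model_rif *model_rif_put(struct model_rif *rif)
{
	assert(rif && rif->ref_count > 0);
	if (--rif->ref_count == 0) {
		free(rif);
		return NULL;
	}
	return rif;
}
```

      With two ports under 'bond0', the RIF in this model survives the
      first put() and is freed by the second.
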
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Reviewed-by: Petr Machata <petrm@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mlxsw: spectrum: Sanitize VLAN interface's uppers · 927d0ef1
      Ido Schimmel authored
      
      
      Currently, only VRF and macvlan uppers are supported on top of a VLAN
      device configured over a bridge, so make sure the driver forbids other
      uppers.
      
      Note that enslavement to a VRF is handled earlier in the notification
      block, so there is no need to check for a VRF upper here.
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Reviewed-by: Petr Machata <petrm@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  6. 18 Dec, 2018 4 commits
    • mlxsw: spectrum_nve: Fix memory leak upon driver reload · 5edb7e8b
      Ido Schimmel authored
      The pointer was NULLed before freeing the memory, resulting in a memory
      leak. Trace from kmemleak:
      
      unreferenced object 0xffff88820ae36528 (size 512):
        comm "devlink", pid 5374, jiffies 4295354033 (age 10829.296s)
        hex dump (first 32 bytes):
          00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
          00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        backtrace:
          [<00000000a43f5195>] kmem_cache_alloc_trace+0x1be/0x330
          [<00000000312f8140>] mlxsw_sp_nve_init+0xcb/0x1ae0
          [<0000000009201d22>] mlxsw_sp_init+0x1382/0x2690
          [<000000007227d877>] mlxsw_sp1_init+0x1b5/0x260
          [<000000004a16feec>] __mlxsw_core_bus_device_register+0x776/0x1360
          [<0000000070ab954c>] mlxsw_devlink_core_bus_device_reload+0x129/0x220
          [<00000000432313d5>] devlink_nl_cmd_reload+0x119/0x1e0
          [<000000003821a06b>] genl_family_rcv_msg+0x813/0x1150
          [<00000000d54d04c0>] genl_rcv_msg+0xd1/0x180
          [<0000000040543d12>] netlink_rcv_skb+0x152/0x3c0
          [<00000000efc4eae8>] genl_rcv+0x2d/0x40
          [<00000000ea645603>] netlink_unicast+0x52f/0x740
          [<00000000641fca1a>] netlink_sendmsg+0x9c7/0xf50
          [<00000000fed4a4b8>] sock_sendmsg+0xbe/0x120
          [<00000000d85795a9>] __sys_sendto+0x397/0x620
          [<00000000c5f84622>] __x64_sys_sendto+0xe6/0x1a0
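
      The ordering bug behind this trace can be illustrated with a small
      userspace sketch. 'struct model_sp', 'struct model_nve' and the
      init/fini helpers are hypothetical stand-ins, with calloc()/free()
      used in place of the kernel allocator.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the driver's private structures. */
struct model_nve {
	int placeholder;
};

struct model_sp {
	struct model_nve *nve;
};

static int model_nve_init(struct model_sp *sp)
{
	sp->nve = calloc(1, sizeof(*sp->nve));
	return sp->nve ? 0 : -1;
}

/*
 * Correct teardown order: free first, then clear the pointer.
 *
 * The leaky order would be:
 *     sp->nve = NULL;
 *     free(sp->nve);   // frees NULL, a no-op; the object is lost
 */
static void model_nve_fini(struct model_sp *sp)
{
	assert(sp != NULL);
	free(sp->nve);
	sp->nve = NULL;
}
```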
      
      Fixes: 6e6030bd ("mlxsw: spectrum_nve: Implement common NVE core")
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Reviewed-by: Petr Machata <petrm@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mlxsw: spectrum: Add trap for decapsulated ARP packets · 5d504391
      Ido Schimmel authored
      After a packet is decapsulated, it is classified to the relevant FID
      based on its VNI and undergoes L2 forwarding.
      
      Unlike regular (non-encapsulated) ARP packets, Spectrum does not trap
      decapsulated ARP packets during L2 forwarding and instead can only trap
      such packets in the underlay router during decapsulation.
      
      Add this missing packet trap, which is required for VXLAN routing when
      the MAC of the target host is not known.
      
      Fixes: b02597d5 ("mlxsw: spectrum: Add NVE packet traps")
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Reviewed-by: Petr Machata <petrm@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mlxsw: core: Increase timeout during firmware flash process · cf0b70e7
      Shalom Toledo authored
      During the firmware flash process, some of the EMADs time out, which
      causes the driver to send them again, up to a limit of 5 retries. In
      some situations 5 retries are not enough and the EMAD access fails.
      If the failed EMAD was related to the flashing process, the driver fails
      the flashing.
      
      The reason for these timeouts during firmware flashing is cache misses in
      the CPU running the firmware. If the CPU needs to fetch instructions
      from the flash while firmware is being flashed, it has to wait for the
      flashing to complete. Since flashing takes time, pending EMADs can
      time out.
      
      Fix by increasing EMADs' timeout while flashing firmware.
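
      Why a longer per-attempt timeout helps can be sketched with a toy
      model. The timeout values (200 ms and 3000 ms), the attempt limit
      and every name below are assumptions made for this illustration and
      are not taken from the commit; the model also simplifies the device
      to "unresponsive for busy_ms, then answers".

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_EMAD_MAX_ATTEMPTS        5    /* illustrative attempt limit */
#define MODEL_EMAD_TIMEOUT_MS          200  /* assumed normal timeout */
#define MODEL_EMAD_TIMEOUT_FW_FLASH_MS 3000 /* assumed timeout while flashing */

/* Pick the per-attempt timeout depending on whether a flash is running. */
static unsigned int model_emad_timeout_ms(bool fw_flash_in_progress)
{
	return fw_flash_in_progress ? MODEL_EMAD_TIMEOUT_FW_FLASH_MS
				    : MODEL_EMAD_TIMEOUT_MS;
}

/*
 * Toy transaction: the device cannot answer for 'busy_ms'. Each attempt
 * waits one timeout. Returns the attempt number that got an answer, or
 * -1 when all attempts timed out.
 */
static int model_emad_transact(unsigned int busy_ms, bool fw_flash_in_progress)
{
	unsigned int timeout = model_emad_timeout_ms(fw_flash_in_progress);
	unsigned int waited = 0;
	int attempt;

	assert(timeout > 0);
	for (attempt = 1; attempt <= MODEL_EMAD_MAX_ATTEMPTS; attempt++) {
		waited += timeout;
		if (waited >= busy_ms)
			return attempt;
	}
	return -1;
}
```

      In this model a device that stalls for 2500 ms exhausts all attempts
      with the normal timeout, but answers within the first attempt when
      the longer flashing timeout is used.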
      
      Fixes: ce6ef68f ("mlxsw: spectrum: Implement the ethtool flash_device callback")
      Signed-off-by: Shalom Toledo <shalomt@mellanox.com>
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mlxsw: spectrum: Update the supported firmware to version 13.1910.622 · d1675a16
      Shalom Toledo authored
      
      
      This new firmware contains:
       * New packet traps for discarded packets
       * Secure firmware flash bug fix
       * Fence mechanism bug fix
       * TCAM RMA bug fix
      Signed-off-by: Shalom Toledo <shalomt@mellanox.com>
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  7. 16 Dec, 2018 9 commits
  8. 14 Dec, 2018 1 commit