      ethernet: use net core MTU range checking in more drivers · d894be57
      Somehow, I missed a healthy number of ethernet drivers in the last pass.
      Most of these drivers were in need of an updated max_mtu to make
      jumbo frames possible to enable again. A few also set a different
      min_mtu to match their previous lower bounds. There are also a few
      drivers that had no upper bounds checking, so they're getting a brand new
      ETH_MAX_MTU that is identical to IP_MAX_MTU, but accessible via includes
      that all ethernet and ethernet-like drivers already have.
      - min_mtu = 0, max_mtu = 9000
      - min_mtu = 128, max_mtu = adapter->max_mtu
      - min_mtu = 0, max_mtu = 9000
      - min_mtu = 0, max_mtu = 1518
      - min_mtu = 81, max_mtu = 65535
      - min_mtu = 81, max_mtu = 9600
      - min_mtu = 81, max_mtu = 65535
      - min_mtu = 256, max_mtu = 9000
      - min_mtu = 68, max_mtu = 65535
      - min_mtu = adapter->min_mtu, max_mtu = adapter->max_mtu
      - remove now redundant ibmvnic_change_mtu
      - min_mtu = 1280, max_mtu = 9202
      - min_mtu = 64, max_mtu = 9500
      - min_mtu = 0, max_mtu = 65535
      - Basically bypassing the core checks, and instead relying on dynamic
        checks in the respective switch drivers' ndo_change_mtu functions
      - min_mtu = 0
      - remove redundant ns83820_change_mtu, only checked for mtu > 1500
      - min_mtu = 0, max_mtu = 8000 (P2), max_mtu = 9600 (P3)
      - min_mtu = 1500, max_mtu = 9000
      - driver only supports setting mtu to 1500 or 9000, so the core check only
        rules out < 1500 and > 9000, qlge_change_mtu still needs to check that
        the value is 1500 or 9000
      - min_mtu = 46, max_mtu = 9194
      - min_mtu = 64, max_mtu = 9000
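The split between the core range check and a remaining driver-side check (as in the 1500-or-9000 case above) can be sketched roughly as follows. This is a minimal userspace illustration, not the kernel code: `struct mtu_limits`, `core_check_mtu`, `two_value_change_mtu` and `set_mtu` are hypothetical names, and treating `max_mtu == 0` as "no upper bound" is an assumption of the sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the min_mtu/max_mtu limits a driver registers. */
struct mtu_limits {
	unsigned int min_mtu;
	unsigned int max_mtu;
};

/* Range check of the kind performed centrally, before the driver is called. */
static int core_check_mtu(const struct mtu_limits *lim, int new_mtu)
{
	assert(lim != NULL);

	if (new_mtu < 0 || (unsigned int)new_mtu < lim->min_mtu)
		return -EINVAL;
	/* Assumption: max_mtu == 0 means no upper bound was set. */
	if (lim->max_mtu > 0 && (unsigned int)new_mtu > lim->max_mtu)
		return -EINVAL;
	return 0;
}

/* Driver-side check for hardware that accepts only two discrete values. */
static int two_value_change_mtu(int new_mtu)
{
	if (new_mtu != 1500 && new_mtu != 9000)
		return -EINVAL;
	return 0;
}

/* Core check first; the driver check only runs for in-range values. */
static int set_mtu(const struct mtu_limits *lim, int new_mtu)
{
	int err = core_check_mtu(lim, new_mtu);

	if (err)
		return err;
	return two_value_change_mtu(new_mtu);
}
```

With limits of 1500 and 9000, a request for 4000 passes the range check but is still rejected by the driver-side check, which is why that driver keeps its own change_mtu function.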
      Fixes: 61e84623 ("net: centralize net_device min/max MTU checking")
      Signed-off-by: Jarod Wilson <jarod@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      net/mlx5: Add MLX5_ARRAY_SET64 to fix BUILD_BUG_ON · b8a4ddb2
      I am hitting this in mlx5:
      drivers/net/ethernet/mellanox/mlx5/core/pagealloc.c: In function
      drivers/net/ethernet/mellanox/mlx5/core/pagealloc.c:346: error: call
      to __compiletime_assert_346 declared with attribute error:
      BUILD_BUG_ON failed: __mlx5_bit_off(manage_pages_out, pas[i]) % 64
      drivers/net/ethernet/mellanox/mlx5/core/pagealloc.c: In function give_pages:
      drivers/net/ethernet/mellanox/mlx5/core/pagealloc.c:291: error: call
      to __compiletime_assert_291 declared with attribute error:
      BUILD_BUG_ON failed: __mlx5_bit_off(manage_pages_in, pas[i]) % 64
      Problem is that this is doing a BUILD_BUG_ON on a non-constant
      expression because of trying to take offset of pas[i] in the
      Fix is to create MLX5_ARRAY_SET64 that takes an additional argument
      that is the field index to separate between BUILD_BUG_ON on the array
      constant field and the indexed field to assign the value to.
      There are two callers of MLX5_SET64 that are trying to get a variable
      offset, change those to call MLX5_ARRAY_SET64 passing 'pas' and 'i'
      as the arguments to use in the offset check and the indexed value
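The idea of checking the array field at compile time while indexing at run time can be sketched as below. This is a simplified userspace analogue under stated assumptions, not the mlx5 macro: `struct manage_pages_cmd`, `ARRAY_SET64` and `fill_pages` are hypothetical, the layout is an ordinary C struct rather than an mlx5_ifc bit layout, and any byte-order conversion is omitted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical command layout with an array of 64-bit page addresses. */
struct manage_pages_cmd {
	uint64_t header[2];
	uint64_t pas[8];
};

/*
 * The alignment check uses only the array field, whose offset is a
 * compile-time constant. The runtime index is used only for the store.
 */
#define ARRAY_SET64(typ, p, fld, idx, v) do {                        \
	static_assert(offsetof(typ, fld) % 8 == 0,                   \
		      "array field must be 64-bit aligned");          \
	(p)->fld[(idx)] = (v);                                       \
} while (0)

/* Fill the first n entries with made-up page addresses. */
static void fill_pages(struct manage_pages_cmd *cmd, unsigned int n)
{
	assert(cmd != NULL && n <= 8);

	for (unsigned int i = 0; i < n; i++)
		ARRAY_SET64(struct manage_pages_cmd, cmd, pas, i,
			    0x1000ull * (i + 1));
}
```

Passing the field name and the index separately keeps the asserted expression constant, which a single `pas[i]` argument cannot do when `i` is a loop variable.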
      Fixes: a533ed5e ("net/mlx5: Pages management commands via mlx5 ifc")
      Signed-off-by: Tom Herbert <tom@herbertland.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      IB/mlx4: Fix possible vl/sl field mismatch in LRH header in QP1 packets · fd10ed8e
      In MLX qp packets, the LRH (built by the driver) has both a VL field
      and an SL field. When building a QP1 packet, the VL field should
      reflect the SLtoVL mapping and not arbitrarily contain zero (as is
      done now). This bug causes credit problems in IB switches at
      high rates of QP1 packets.
      The fix is to cache the SL to VL mapping in the driver, and look up
      the VL mapped to the SL provided in the send request when sending
      QP1 packets.
      For FW versions which support generating a port_management_config_change
      event with subtype sl-to-vl-table-change, the driver uses that event
      to update its sl-to-vl mapping cache.  Otherwise, the driver snoops
      incoming SMP mads to update the cache.
      There remains the case where the FW is running in secure-host mode
      (so no QP0 packets are delivered to the driver), and the FW does not
      generate the sl2vl mapping change event. To support this case, the
      driver updates (via querying the FW) its sl2vl mapping cache when
      running in secure-host mode when it receives either a Port Up event
      or a client-reregister event (where the port is still up, but there
      may have been an opensm failover).
      OpenSM modifies the sl2vl mapping before Port Up and Client-reregister
      events occur, so if there is a mapping change the driver's cache will
      be properly updated.
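A cache of this kind might look roughly like the sketch below. It is illustrative only: `struct sl2vl_cache`, `sl2vl_cache_update` and `sl2vl_lookup` are hypothetical names, and the packed input format (two 4-bit VL entries per byte, even SL in the high nibble) is an assumption of the sketch rather than something taken from the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_SL 16

/* Hypothetical per-port cache: one VL for each of the 16 SLs. */
struct sl2vl_cache {
	uint8_t vl[NUM_SL];
};

/*
 * Refresh the cache from a packed table. Assumed packing: two 4-bit
 * entries per byte, with the even-numbered SL in the high nibble.
 */
static void sl2vl_cache_update(struct sl2vl_cache *c,
			       const uint8_t packed[NUM_SL / 2])
{
	assert(c != NULL && packed != NULL);

	for (int sl = 0; sl < NUM_SL; sl++) {
		uint8_t byte = packed[sl / 2];

		c->vl[sl] = (sl % 2 == 0) ? (uint8_t)(byte >> 4)
					  : (uint8_t)(byte & 0x0f);
	}
}

/* Look up the VL to place in the header for the SL of a send request. */
static uint8_t sl2vl_lookup(const struct sl2vl_cache *c, uint8_t sl)
{
	return c->vl[sl & 0x0f];
}
```

The update function would be called from whichever path learns of a mapping change (event, snooped MAD, or FW query), while the lookup is used on the send path.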
      Fixes: 225c7b1f ("IB/mlx4: Add a driver Mellanox ConnectX InfiniBand adapters")
      Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
      Signed-off-by: Leon Romanovsky <leon@kernel.org>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
      mlxsw: spectrum_router: avoid potential uninitialized data usage · ab580705
      If fi->fib_nhs is zero, the router interface pointer is uninitialized, as shown by
      this warning:
      drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c: In function 'mlxsw_sp_router_fib_event':
      drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c:1674:21: error: 'r' may be used uninitialized in this function [-Werror=maybe-uninitialized]
      drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c:1643:23: note: 'r' was declared here
      This changes the loop so that this case is handled the same way as finding
      no router interface pointer attached to one of the nexthops, ensuring we
      always trap here instead of using uninitialized data.
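The loop shape described above can be sketched as follows. This is a hypothetical reduction, not the driver code: `struct rif`, `struct nexthop` and `entry_should_trap` are made-up names, and the lookup of a router interface is replaced by a plain pointer field.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct rif { int id; };               /* hypothetical router interface */
struct nexthop { struct rif *rif; };  /* hypothetical nexthop */

/*
 * Returns true when the entry should be trapped. On false, *out holds
 * the router interface found for the last nexthop.
 */
static bool entry_should_trap(const struct nexthop *nhs, size_t n,
			      struct rif **out)
{
	struct rif *r = NULL;   /* stays NULL when there are no nexthops */

	assert(out != NULL);

	for (size_t i = 0; i < n; i++) {
		r = nhs[i].rif;
		if (!r)
			break;  /* one nexthop without a router interface */
	}

	if (!r)
		return true;    /* covers both n == 0 and a missing interface */

	*out = r;
	return false;
}
```

Initializing the pointer before the loop and testing it once afterwards makes the zero-nexthop case fall into the same trap path as a failed lookup.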
      Fixes: b45f64d1 ("mlxsw: spectrum_router: Use FIB notifications instead of switchdev calls")
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Acked-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      net/mlx5e: shut up maybe-uninitialized warning · d0debb76
      Build-testing this driver with -Wmaybe-uninitialized gives a new false-positive
      warning that I can't really explain:
      drivers/net/ethernet/mellanox/mlx5/core/en_tc.c: In function 'mlx5e_configure_flower':
      drivers/net/ethernet/mellanox/mlx5/core/en_tc.c:509:3: error: 'old_attr' may be used uninitialized in this function [-Werror=maybe-uninitialized]
      It's obvious from the code that 'old_attr' is initialized whenever 'old'
      is non-NULL here. The warning appears with all versions I tested from gcc-4.7
      through gcc-6.1, and I could not come up with a way to rewrite the function
      in a more readable way that avoids the warning, so I'm adding another
      initialization to shut it up.
      Fixes: 8b32580d ("net/mlx5e: Add TC vlan action for SRIOV offloads")
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Acked-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      net/mlx5e: XDP TX xmit more · 35b510e2
      Previously we rang the XDP SQ doorbell on every forwarded XDP packet.
      Here we introduce an xmit-more-like mechanism that queues more
      than one packet into the SQ (up to the RX napi budget) w/o notifying the hardware.
      Once the RX napi budget is consumed and we exit the napi RX loop, we
      flush (doorbell) all XDP looped packets, if there are any.
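The batching described above can be modelled in a few lines. This is a toy userspace model under stated assumptions: `struct xdp_sq`, `sq_queue`, `sq_flush` and `rx_poll` are hypothetical names, and a counter stands in for the actual doorbell write.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of a send queue; all fields are hypothetical. */
struct xdp_sq {
	unsigned int pc;         /* producer counter: descriptors queued */
	unsigned int db_pc;      /* producer value last told to the hardware */
	unsigned int doorbells;  /* doorbell writes, counted for illustration */
	bool db_pending;         /* queued work not yet announced */
};

/* Queue one packet without notifying the hardware. */
static void sq_queue(struct xdp_sq *sq)
{
	sq->pc++;
	sq->db_pending = true;
}

/* Ring the doorbell once for everything queued since the last flush. */
static void sq_flush(struct xdp_sq *sq)
{
	if (!sq->db_pending)
		return;
	sq->db_pc = sq->pc;
	sq->doorbells++;
	sq->db_pending = false;
}

/* RX poll loop: forward up to budget packets, then flush once on exit. */
static int rx_poll(struct xdp_sq *sq, int npkts, int budget)
{
	int done = 0;

	assert(sq != NULL);

	while (done < budget && done < npkts) {
		sq_queue(sq);
		done++;
	}
	sq_flush(sq);
	return done;
}
```

One poll that forwards many packets costs a single doorbell in this model, and a poll that forwards nothing costs none.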
      XDP forward packet rate:
      Comparing XDP with and w/o xmit more (bulk transmit):
      RX Cores    XDP TX       XDP TX (xmit more)
      1           6.5Mpps      12.4Mpps
      2          13.2Mpps      24.2Mpps
      4          25.2Mpps      36.3Mpps*
      8          36.3Mpps*     36.3Mpps*
      *My xmitter was limited to 36.3Mpps, so it is the bottleneck.
      It seems that receive side can handle more.
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      net/mlx5e: XDP TX forwarding support · b5503b99
      Add support for XDP_TX forwarding from an xdp program.
      Using XDP, the user can now loop packets back out of the same port.
      We create a dedicated TX SQ for each channel that will serve
      XDP programs that return XDP_TX action to loop packets back to
      the wire directly from the channel RQ RX path.
      For that RX pages will now need to be mapped bi-directionally,
      and on XDP_TX action we will sync the page back to device then
      queue it into SQ for transmission.  The XDP xmit frame function will
      report back to the RX path if the page was consumed (transmitted), if so,
      RX path will forget about that page as if it were released to the stack.
      Later on, on XDP TX completion, the page will be released back to the
      page cache.
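The page ownership hand-off described above can be sketched like this. It is a simplified model, not the driver: `struct rx_page`, `struct tx_sq`, `xdp_xmit_frame`, `rx_handle_xdp_tx` and `tx_complete` are hypothetical, DMA syncing is left out, and the ring size is arbitrary.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SQ_SIZE 4

struct rx_page { int id; };  /* hypothetical RX page */

/* Toy XDP TX queue holding pages until their completion arrives. */
struct tx_sq {
	struct rx_page *ring[SQ_SIZE];
	unsigned int pc;  /* producer counter */
	unsigned int cc;  /* consumer (completion) counter */
};

/* Returns true if the page was consumed, i.e. queued for transmission. */
static bool xdp_xmit_frame(struct tx_sq *sq, struct rx_page *pg)
{
	if (sq->pc - sq->cc == SQ_SIZE)
		return false;  /* queue full: the RX path keeps the page */
	sq->ring[sq->pc % SQ_SIZE] = pg;
	sq->pc++;
	return true;
}

/* RX path on an XDP_TX verdict: forget the page if TX consumed it. */
static void rx_handle_xdp_tx(struct tx_sq *sq, struct rx_page **slot)
{
	assert(sq != NULL && slot != NULL);

	if (xdp_xmit_frame(sq, *slot))
		*slot = NULL;
}

/* TX completion: hand back the oldest page so the caller can release it. */
static struct rx_page *tx_complete(struct tx_sq *sq)
{
	if (sq->cc == sq->pc)
		return NULL;
	return sq->ring[sq->cc++ % SQ_SIZE];
}
```

The boolean return value is what lets the RX path know whether it still owns the page after the transmit attempt.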
      For simplicity this patch hits the doorbell on every XDP TX packet.
      The next patch will introduce an xmit-more-like mechanism that
      queues more than one packet into the SQ w/o notifying the hardware;
      once the RX napi loop is done we will hit the doorbell once for all XDP TX
      packets from the previous loop.  This should drastically improve
      XDP TX performance.
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      net/mlx5e: Have a clear separation between different SQ types · f10b7cc7
      Make a clear separation between Regular SQ (TXQ) and ICO SQ creation and
      destruction, and union their mutual information structures.
      Don't allocate redundant TXQ skb/wqe_info/dma_fifo arrays for ICO SQ.
      And have a different SQ edge for ICO SQ than TXQ SQ, to be more
      In preparation for XDP TX support.
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      net/mlx5e: XDP fast RX drop bpf programs support · 86994156
      Add support for the BPF_PROG_TYPE_PHYS_DEV hook in mlx5e driver.
      When XDP is on we make sure to change channels RQs type to
      MLX5_WQ_TYPE_LINKED_LIST rather than "striding RQ" type to
      ensure "page per packet".
      On XDP set, we fail if HW LRO is set and ask the user to turn it
      off.  Since on ConnectX4-LX HW LRO is always on by default, this will be
      annoying, but we prefer not to enforce LRO off from the XDP set function.
      Full channels reset (close/open) is required only when setting XDP
      When XDP set is called just to exchange programs, we update
      each RQ xdp program on the fly. To synchronize with the current
      data path RX activity of that RQ, we temporarily disable the RQ,
      ensure the RX path is not running, quickly update and re-enable the RQ.
      For that we do:
      	- rq.state = disabled
      	- napi_synchronize
      	- xchg(rq->xdp_prg)
      	- rq.state = enabled
      	- napi_schedule // Just in case we've missed an IRQ
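The sequence above can be mirrored in a small userspace sketch using C11 atomics. It is an approximation under stated assumptions: `struct rq`, `wait_for_rx_quiescence`, `kick_rx` and `rq_swap_prog` are hypothetical names, and the two helper functions are empty stand-ins for the napi synchronize and schedule steps.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct xdp_prog { int id; };  /* hypothetical program handle */

/* Toy receive queue; all fields are hypothetical. */
struct rq {
	atomic_bool enabled;
	_Atomic(struct xdp_prog *) prog;
	unsigned int reschedules;  /* counted for illustration */
};

/* Stand-in for waiting until the RX poll routine is not running. */
static void wait_for_rx_quiescence(struct rq *rq)
{
	(void)rq;
}

/* Stand-in for rescheduling RX polling in case an IRQ was missed. */
static void kick_rx(struct rq *rq)
{
	rq->reschedules++;
}

/* Swap the program on one RQ and return the previous one. */
static struct xdp_prog *rq_swap_prog(struct rq *rq, struct xdp_prog *new_prog)
{
	struct xdp_prog *old;

	assert(rq != NULL);

	atomic_store(&rq->enabled, false);           /* rq.state = disabled */
	wait_for_rx_quiescence(rq);                  /* RX path not running */
	old = atomic_exchange(&rq->prog, new_prog);  /* exchange programs */
	atomic_store(&rq->enabled, true);            /* rq.state = enabled */
	kick_rx(rq);                                 /* just in case */
	return old;
}
```

Returning the old program lets the caller drop its reference once the swap has completed on every RQ.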
      Packet rate performance testing was done with pktgen sending 64B packets
      on the TX side, comparing the TC drop action on the RX side with XDP fast drop.
      CPU: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz
      Comparison is done between:
      	1. Baseline, Before this patch with TC drop action
      	2. This patch with TC drop action
      	3. This patch with XDP RX fast drop
      RX Cores  Baseline(TC drop)    TC drop    XDP fast Drop
      1            5.3Mpps           5.3Mpps     16.5Mpps
      2           10.2Mpps          10.2Mpps     31.3Mpps
      4           20.5Mpps          19.9Mpps     36.3Mpps*
      *My xmitter was limited to 36.3Mpps, so it is the bottleneck.
      It seems that receive side can handle more.
      Signed-off-by: Rana Shahout <ranas@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>