1. 05 Feb, 2015 11 commits
    • Moni Shoua's avatar
      net/mlx4_en: Port aggregation configuration · 5da03547
      Moni Shoua authored
      
      
      Capture NETDEV events generated by the bonding driver and based on that
      make decisions of how to configure port aggregation in the mlx4 core driver.
      
      This includes setting the V2P port table and re-creating the interested
      interfaces in bonded/non-bonded mode.
      
      Signed-off-by: default avatarMoni Shoua <monis@mellanox.com>
      Signed-off-by: default avatarOr Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      5da03547
    • Moni Shoua's avatar
      net/mlx4_core: Port aggregation upper layer interface · 53f33ae2
      Moni Shoua authored
      
      
      Supply interface functions to bond and unbond ports of a mlx4 internal
      interfaces. Example for such an interface is the one registered by the
      mlx4 IB driver under RoCE.
      
      There are
      
      1. Functions to go in/out to/from bonded mode
      2. Function to remap virtual ports to physical ports
      
      The bond_mutex prevents simultaneous access to data that keep status of
      the device in bonded mode.
      
      The upper mlx4 interface marks to the mlx4 core module that they
      want to be subject for such bonding by setting the MLX4_INTFF_BONDING
      flag. Interface which goes to/from bonded mode is re-created.
      
      The mlx4 Ethernet driver does not set this flag when registering the
      interface, the IB driver does.
      
      Signed-off-by: default avatarMoni Shoua <monis@mellanox.com>
      Signed-off-by: default avatarOr Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      53f33ae2
    • Moni Shoua's avatar
      net/mlx4_core: Port aggregation low level interface · 59e14e32
      Moni Shoua authored
      
      
      Implement the hardware interface required for port aggregation.
      
      1. Disable RX port check on receive - don't perform a validity check
      that matches to QP's port and the port where the packet is received.
      
      2. Virtual to physical port remap - configure virtual to physical port
      mapping. Port remap capability for virtual functions.
      
      Signed-off-by: default avatarMoni Shoua <monis@mellanox.com>
      Signed-off-by: default avatarOr Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      59e14e32
    • Moni Shoua's avatar
      net/bonding: Notify state change on slaves · 69e61133
      Moni Shoua authored
      
      
      Use notifier chain to dispatch an event upon a change in slave state.
      Event is dispatched with slave specific info.
      
      Signed-off-by: default avatarMoni Shoua <monis@mellanox.com>
      Signed-off-by: default avatarOr Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      69e61133
    • Moni Shoua's avatar
      net/bonding: Move slave state changes to a helper function · 69a2338e
      Moni Shoua authored
      
      
      Move slave state changes to a helper function, this is a pre-step for adding
      functionality of dispatching an event when this helper is called.
      
      This commit doesn't add new functionality.
      
      Signed-off-by: default avatarMoni Shoua <monis@mellanox.com>
      Signed-off-by: default avatarOr Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      69a2338e
    • Moni Shoua's avatar
      net/core: Add event for a change in slave state · 61bd3857
      Moni Shoua authored
      
      
      Add event which provides an indication on a change in the state
      of a bonding slave. The event handler should cast the pointer to the
      appropriate type (struct netdev_bonding_info) in order to get the
      full info about the slave.
      
      Signed-off-by: default avatarMoni Shoua <monis@mellanox.com>
      Signed-off-by: default avatarOr Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      61bd3857
    • David S. Miller's avatar
      Merge branch 'tipc-next' · 251c005a
      David S. Miller authored
      
      
      Jon Maloy says:
      
      ====================
      tipc: some small fixes
      
      During extensive testing and analysis of running dual links between
      nodes, we have encountered some issues that potentially may cause
      problems. We choose to fix those proactively in this series.
      ====================
      
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      251c005a
    • Jon Paul Maloy's avatar
      tipc: separate link starting event from link timeout event · af9946fd
      Jon Paul Maloy authored
      
      
      When a new link instance is created, it is trigged to start by
      sending it a TIPC_STARTING_EVT, whereafter a regular link
      reset is applied to it.
      
      The starting event is codewise treated as a timeout event, and prompts
      a link RESET message to be sent to the peer node, carrying a link
      session identifier. The later link_reset() call nudges this session
      identifier, whereafter all subsequent RESET messages will be sent out
      with the new identifier. The latter session number overrides the former,
      causing the peer to unconditionally accept it irrespective of its
      current working state.
      
      We don't think that this causes any problem, but it is not in accordance
      with the protocol spec, and may cause confusion when debugging TIPC
      sessions.
      
      To avoid this, we make the starting event distinct from the
      subsequent timeout events, by not allowing the former to send
      out any RESET message. This eliminates the described problem.
      
      Reviewed-by: default avatarErik Hugne <erik.hugne@ericsson.com>
      Reviewed-by: Ying Xue's avatarYing Xue <ying.xue@windriver.com>
      Signed-off-by: default avatarJon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      af9946fd
    • Jon Paul Maloy's avatar
      tipc: eliminate race during node creation · b45db71b
      Jon Paul Maloy authored
      
      
      Instances of struct node are created in the function tipc_disc_rcv()
      under the assumption that there is no race between received discovery
      messages arriving from the same node. This assumption is wrong.
      When we use more than one bearer, it is possible that discovery
      messages from the same node arrive at the same moment, resulting in
      creation of two instances of struct tipc_node. This may later cause
      confusion during link establishment, and may result in one of the links
      never becoming activated.
      
      We fix this by making lookup and potential creation of nodes atomic.
      Instead of first looking up the node, and in case of failure, create it,
      we now start with looking up the node inside node_link_create(), and
      return a reference to that one if found. Otherwise, we go ahead and
      create the node as we did before.
      
      Reviewed-by: default avatarErik Hugne <erik.hugne@ericsson.com>
      Reviewed-by: Ying Xue's avatarYing Xue <ying.xue@windriver.com>
      Signed-off-by: default avatarJon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      b45db71b
    • Jon Paul Maloy's avatar
      tipc: avoid stale link after aborted failover · 7d24dcdb
      Jon Paul Maloy authored
      
      
      During link failover it may happen that the remaining link goes
      down while it is still in the process of taking over traffic
      from a previously failed link. When this happens, we currently
      abort the failover procedure and reset the first failed link to
      non-failover mode, so that it will be ready to re-establish
      contact with its peer when it comes available.
      
      However, if the first link goes down because its bearer was manually
      disabled, it is not enough to reset it; it must also be deleted;
      which is supposed to happen when the failover procedure is finished.
      Otherwise it will remain a zombie link: attached to the owner node
      structure, in mode LINK_STOPPED, and permanently blocking any re-
      establishing of the link to the peer via the interface in question.
      
      We fix this by amending the failover abort procedure. Apart from
      resetting the link to non-failover state, we test if the link is
      also in LINK_STOPPED mode. If so, we delete it, using the conditional
      tipc_link_delete() function introduced in the previous commit.
      
      Reviewed-by: default avatarErik Hugne <erik.hugne@ericsson.com>
      Reviewed-by: Ying Xue's avatarYing Xue <ying.xue@windriver.com>
      Signed-off-by: default avatarJon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      7d24dcdb
    • Jon Paul Maloy's avatar
      tipc: add reference count to struct tipc_link · 2d72d495
      Jon Paul Maloy authored
      
      
      When a bearer is disabled, all pertaining links will be reset and
      deleted. However, if there is a second active link towards a killed
      link's destination, the delete has to be postponed until the failover
      is finished. During this interval, we currently put the link in zombie
      mode, i.e., we take it out of traffic, delete its timer, but leave it
      attached to the owner node structure until all missing packets have
      been received.  When this is done, we detach the link from its node
      and delete it, assuming that the synchronous timer deletion that was
      initiated earlier in a different thread has finished.
      
      This is unsafe, as the failover may finish before del_timer_sync()
      has returned in the other thread.
      
      We fix this by adding an atomic reference counter of type kref in
      struct tipc_link. The counter keeps track of the references kept
      to the link by the owner node and the timer. We then do a conditional
      delete, based on the reference counter, both after the failover has
      been finished and when the timer expires, if applicable. Whoever
      comes last, will actually delete the link. This approach also implies
      that we can make the deletion of the timer asynchronous.
      
      Reviewed-by: default avatarErik Hugne <erik.hugne@ericsson.com>
      Reviewed-by: Ying Xue's avatarYing Xue <ying.xue@windriver.com>
      Signed-off-by: default avatarJon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      2d72d495
  2. 04 Feb, 2015 14 commits
  3. 03 Feb, 2015 15 commits