1. 24 Oct, 2015 8 commits
    • tipc: introduce capability bit for broadcast synchronization · fd556f20
      Jon Paul Maloy authored
      
      
      Until now, we have tried to support both the newer, dedicated broadcast
      synchronization mechanism and the older, less safe, RESET_MSG/
      ACTIVATE_MSG based one. The latter method has turned out to be a hazard
      in a highly dynamic cluster, so we find it safer to disable it completely
      when we find that the former mechanism is supported by the peer node.
      
      For this purpose, we now introduce a new capability bit,
      TIPC_BCAST_SYNCH, to inform any peer nodes that dedicated broadcast
      synchronization is supported by the present node. The new bit is conveyed
      between peers in the 'capabilities' field of neighbor discovery messages.
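
A minimal sketch of how such a capability bit could be tested, assuming a 16-bit capabilities field. The bit position and the helper names are assumptions made for illustration; only the name TIPC_BCAST_SYNCH comes from the commit message.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit position, chosen for illustration only. */
#define TIPC_BCAST_SYNCH (1u << 1)

/* Capabilities this node would advertise in its discovery messages. */
#define TIPC_NODE_CAPABILITIES TIPC_BCAST_SYNCH

/* True when the peer advertised dedicated broadcast synchronization. */
static bool peer_has_bcast_synch(uint16_t peer_caps)
{
	return (peer_caps & TIPC_BCAST_SYNCH) != 0;
}

/* The RESET_MSG/ACTIVATE_MSG based method is kept only as a fallback. */
static bool use_legacy_bcast_synch(uint16_t peer_caps)
{
	return !peer_has_bcast_synch(peer_caps);
}
```

A peer that never sets the bit (an older node) keeps using the legacy method, so the two kinds of node can still interoperate.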
      
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Reviewed-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tipc: let broadcast transmission use new link transmit function · 2f566124
      Jon Paul Maloy authored
      
      
      This commit simplifies the broadcast link transmission function, by
      leveraging previous changes to the link transmission function and the
      broadcast transmission link life cycle.
      
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Reviewed-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tipc: make struct tipc_link generic to support broadcast · c1ab3f1d
      Jon Paul Maloy authored
      
      
      Realizing that unicast is just a special case of broadcast, we also see
      that we can go in the other direction, i.e., that modest changes to the
      current unicast link can make it generic enough to support broadcast.
      
      The following changes are introduced here:
      
      - A new counter ("ackers") in struct tipc_link, to indicate how many
        peers need to ack a packet before it can be released.
      - A corresponding counter in the skb user area, to keep track of how
        many peers are left to ack before a buffer can be released.
      - A new counter ("acked"), to keep persistent track of how far a peer
        has acked at the moment, i.e., where in the transmission queue to
        start updating buffers when the next ack arrives. This is to avoid
        double acknowledgements from a peer, with inadvertent release of
        packets as a result.
      - A more generic tipc_link_retrans() function, where retransmit starts
        from a given sequence number, instead of the first packet in the
        transmission queue. This is to minimize the number of retransmitted
        packets on the broadcast media.
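
The interplay of the "ackers" and "acked" counters listed above can be sketched in user space as follows. This is an illustration under stated assumptions, not the kernel code: struct pkt, struct peer, txq_init() and peer_ack() are hypothetical names, the queue is a plain array, and sequence number wrap-around is ignored.

```c
#include <stdint.h>

struct pkt {
	uint16_t seqno;
	int ackers;	/* peers still expected to ack this packet */
	int released;	/* set once the last expected ack has arrived */
};

struct peer {
	uint16_t acked;	/* highest seqno this peer has acked so far */
};

/* Queue n packets numbered first, first + 1, ... each needing `ackers` acks. */
static void txq_init(struct pkt *q, int n, uint16_t first, int ackers)
{
	for (int i = 0; i < n; i++) {
		q[i].seqno = (uint16_t)(first + i);
		q[i].ackers = ackers;
		q[i].released = 0;
	}
}

/*
 * Handle "peer p acks everything up to and including `ack`".
 * Only packets above p->acked are touched, so a repeated ack from the
 * same peer changes nothing. Returns the number of packets released.
 */
static int peer_ack(struct pkt *q, int n, struct peer *p, uint16_t ack)
{
	int released = 0;

	for (int i = 0; i < n; i++) {
		if (q[i].seqno <= p->acked || q[i].seqno > ack)
			continue;
		if (--q[i].ackers == 0) {
			q[i].released = 1;
			released++;
		}
	}
	if (ack > p->acked)
		p->acked = ack;
	return released;
}
```

With ackers set to 1 the same logic degenerates to the unicast case, which is the point the commit message makes about unicast being a special case of broadcast.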
      
      When the new functionality is taken into use in the next commits,
      we expect it to have minimal effect on unicast mode performance.
      
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Reviewed-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tipc: use explicit allocation of broadcast send link · 32301906
      Jon Paul Maloy authored
      
      
      The broadcast link instance (struct tipc_link) used for sending is
      currently aggregated into struct tipc_bclink. This means that we cannot
      use the regular tipc_link_create() function for initializing the link,
      but instead have to initialize numerous fields directly from the
      bcast_init() function.
      
      We want to reduce dependencies between the broadcast functionality
      and the inner workings of tipc_link. In this commit, we introduce
      a new function tipc_bclink_create() to link.c, and allocate the
      instance of the link separately using this function.
      
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Reviewed-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tipc: make link implementation independent from struct tipc_bearer · 0e05498e
      Jon Paul Maloy authored
      
      
      In reality, the link implementation is already independent from
      struct tipc_bearer, in that it doesn't store any reference to it.
      However, we still pass on a pointer to a bearer instance in the
      function tipc_link_create(), just to have it extract some
      initialization information from it.
      
      In later commits, we need to create instances of tipc_link without
      having any associated struct tipc_bearer. To facilitate this, we
      want to extract the initialization data already in the creator
      function in node.c, before calling tipc_link_create(), and pass
      this info on as individual parameters in the call.
      
      This commit introduces this change.
      
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Reviewed-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tipc: create broadcast transmission link at namespace init · 5fd9fd63
      Jon Paul Maloy authored
      
      
      The broadcast transmission link is currently instantiated when the
      network subsystem is started, i.e., on order from user space via netlink.
      
      This forces the broadcast transmission code to do unnecessary tests for
      the existence of the transmission link, in single node mode as well as
      in network mode.
      
      In this commit, we instead create the link during initialization of
      the name space, and remove it when the name space is stopped. The fact
      that the transmission link now has a guaranteed longer life cycle than
      any of its potential clients paves the way for further code
      simplifications and optimizations.
      
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Reviewed-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tipc: move broadcast link lock to struct tipc_net · 0043550b
      Jon Paul Maloy authored
      
      
      The broadcast lock will need to be acquired outside bcast.c in a later
      commit. For this reason, we move the lock to struct tipc_net. Consistent
      with the changes in the previous commit, we also introduce two new
      functions tipc_bcast_lock() and tipc_bcast_unlock(). The code that is
      currently using tipc_bclink_lock()/unlock() will be phased out during
      the coming commits in this series.
      
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Reviewed-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tipc: move bcast definitions to bcast.c · 6beb19a6
      Jon Paul Maloy authored
      
      
      Currently, a number of structure and function definitions related
      to the broadcast functionality are unnecessarily exposed in the file
      bcast.h. This obscures the fact that the external interface towards
      the broadcast link in fact is very narrow, and causes unnecessary
      recompilations of other files when anything changes in those
      definitions.
      
      In this commit, we move as many of those definitions as is currently
      possible to the file bcast.c.
      
      We also rename the structure 'tipc_bclink' to 'tipc_bc_base', both
      since the name does not correctly describe the contents of this
      struct, and will do so even less in the future, and because we want
      to use the term 'link' more appropriately in the functionality
      introduced later in this series.
      
      Finally, we rename a couple of functions, such as tipc_bclink_xmit()
      and others that will be kept in the future, to include the term 'bcast'
      instead.
      
      There are no functional changes in this commit.
      
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Reviewed-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 22 Oct, 2015 3 commits
    • tipc: conditionally expand buffer headroom over udp tunnel · e5356794
      Jon Paul Maloy authored
      In commit d999297c ("tipc: reduce locking scope during packet reception")
      we altered the packet retransmission function. Since then, when
      retransmitting packets, we create a clone of the original buffer
      using __pskb_copy(skb, MIN_H_SIZE), where MIN_H_SIZE is the size of
      the area we want to have copied, but also the smallest possible TIPC
      packet size. The value of MIN_H_SIZE is 24.
      
      Unfortunately, __pskb_copy() also has the effect that the headroom
      of the cloned buffer takes the size MIN_H_SIZE. This is too small
      for carrying the packet over the UDP tunnel bearer, which requires
      a minimum headroom of 28 bytes. A change to just use pskb_copy()
      lets the clone inherit the original headroom of 80 bytes, but also
      assumes that the copied data area is of at least that size, something
      that is not always the case. So that is not a viable solution.
      
      We now fix this by adding a check for sufficient headroom in the
      transmit function of udp_media.c, and expanding it when necessary.
      
      Fixes: commit d999297c ("tipc: reduce locking scope during packet reception")
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tipc: allow non-linear first fragment buffer · 45c8b7b1
      Jon Paul Maloy authored
      The current code for message reassembly erroneously assumes that
      the first arriving fragment buffer is always linear, and then goes
      ahead and resets the fragment list of that buffer in anticipation of
      more arriving fragments.
      
      However, if the buffer already happens to be non-linear, we will
      inadvertently drop the already attached fragment list, and later
      on trigger a BUG() in __pskb_pull_tail().
      
      We see this happen when running fragmented TIPC multicast across UDP,
      something made possible since
      commit d0f91938 ("tipc: add ip/udp media type").
      
      We fix this by not resetting the fragment list when the buffer is
      non-linear, and by initializing our private fragment list tail pointer
      to the tail of the existing fragment list.
      
      Fixes: commit d0f91938 ("tipc: add ip/udp media type")
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tipc: extend broadcast link window size · 53387c4e
      Jon Paul Maloy authored
      The default fixed broadcast window size is currently set to 20 packets.
      This is a very low value, set at a time when we were still testing on
      10 Mb/s hubs, and a change to it is long overdue.
      
      Commit 7845989c ("net: tipc: fix stall during bclink wakeup procedure")
      revealed a problem with this low value. For messages of importance LOW,
      the backlog queue limit is calculated to be 30 packets, while a
      single, maximum sized message of 66000 bytes, carried across a 1500 MTU
      network, consists of 46 packets.
      
      This leads to the following scenario (among others leading to the same
      situation):
      
      1: Msg 1 of 46 packets is sent. 20 packets go to the transmit queue, 26
         packets to the backlog queue.
      2: An attempt is made to send Msg 2 of 46 packets, but it is rejected
         because there is no more space in the backlog queue at this level.
         The sender is added to the wakeup queue with a "pending packets
         chain size" number of 46.
      3: Some packets in the transmit queue are acked and released. We try to
         wake up the sender, but the pending size of 46 is bigger than the LOW
         wakeup limit of 30, so this doesn't happen.
      4: Subsequent acks release all the remaining buffers. Each time we test
         for the wakeup criteria and find that 46 still is larger than 30,
         even after both the transmit and the backlog queues are empty.
      5: The sender is never woken up and given a chance to send its message.
         It is stuck.
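
The stall in the scenario above can be reproduced with a few lines of arithmetic. The sketch below assumes a wakeup rule of the form "backlog length plus pending chain size must fit under the limit"; the function names are hypothetical and the exact comparison in link_prepare_wakeup() may differ. The numbers 46 and 30 are the ones quoted in the text.

```c
#include <stdbool.h>

/*
 * Assumed wakeup rule, for illustration: a waiting sender is woken only
 * when its pending chain, added to the current backlog, fits under the
 * backlog limit of its importance level.
 */
static bool can_wake_sender(unsigned int backlog_len, unsigned int pending,
			    unsigned int limit)
{
	return backlog_len + pending <= limit;
}

/* A sender stalls for good when even an empty backlog cannot admit it. */
static bool sender_is_stuck(unsigned int pending, unsigned int limit)
{
	return !can_wake_sender(0, pending, limit);
}
```

Under this rule, sender_is_stuck(46, 30) holds no matter how many acks arrive, which is why the limit itself has to grow to at least the size of one maximum sized message.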
      
      We could now loosen the wakeup criteria (used by link_prepare_wakeup())
      to become equal to the send criteria (used by tipc_link_xmit()), i.e.,
      by ignoring the "pending packets chain size" value altogether, or we can
      just increase the queue limits so that the criteria can be satisfied
      anyway. There are good reasons (potentially multiple waiting senders) to
      not opt for the former solution, so we choose the latter one.
      
      This commit fixes the problem by giving the broadcast link window a
      default value of 50 packets. We also introduce a new minimum link
      window size BCLINK_MIN_WIN of 32, which is enough to always avoid the
      described situation. Finally, in order to not break any existing users
      which may set the window explicitly, we enforce that the window is set
      to the new minimum value in case the user is trying to set it to
      anything lower.
      
      Fixes: 7845989c ("net: tipc: fix stall during bclink wakeup procedure")
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Reviewed-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  3. 16 Oct, 2015 7 commits
    • tipc: update node FSM when peer RESET message is received · c8199300
      Jon Paul Maloy authored
      
      
      The change made in the previous commit revealed a small flaw in the way
      the node FSM is updated. When the function tipc_node_link_down() is
      called for the last link to a node, we should check whether this was
      caused by a local reset or by a received RESET message from the peer.
      In the latter case, we can directly issue a PEER_LOST_CONTACT_EVT to
      the node FSM, so that it is ready to re-establish contact. If this is
      not done, the peer node will sometimes have to go through a second
      establish cycle before the link becomes stable.
      
      We fix this in this commit by conditionally issuing the mentioned
      event in the function tipc_node_link_down(). We also move the LINK_RESET
      FSM event away from the link_reset() function and into the caller
      function, partly because it is easier to follow the code when state
      changes are gathered at a limited number of locations, partly
      because there will be cases in future commits where we don't want the
      link to go into RESET mode when link_reset() is called.
      
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Acked-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tipc: send out RESET immediately when link goes down · 282b3a05
      Jon Paul Maloy authored
      
      
      When a link is taken down because of a node local event, such as
      disabling of a bearer or an interface, we currently leave it to the
      peer node to discover the broken communication. The default time for
      such failure discovery is 1.5-2 seconds.
      
      If we instead allow the terminating link endpoint to send out a RESET
      message at the moment it is reset, we can achieve the impression that
      both endpoints are going down instantly. Since this is a very common
      scenario, we find it worthwhile to make this small modification.
      
      Apart from letting the link produce the said message, we also have to
      ensure that the interface is able to transmit it before TIPC is
      detached. We do this by performing the disabling of a bearer in three
      steps:
      
      1) Disable reception of TIPC packets from the interface in question.
      2) Take down the links, while allowing them to send out a RESET message.
      3) Disable transmission of TIPC packets on the interface.
      
      Apart from this, we now have to react to the NETDEV_GOING_DOWN event,
      instead of the NETDEV_DOWN event as currently, to ensure that such
      transmission is possible during the teardown phase.
      
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Acked-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tipc: delay ESTABLISH state event when link is established · 73f646ce
      Jon Paul Maloy authored
      
      
      Link establishing, just like link teardown, is a non-atomic action, in
      the sense that discovering that conditions are right to establish a link,
      and the actual adding of the link to one of the node's send slots is done
      in two different lock contexts. The link FSM is designed to help bridge
      the gap between the two contexts in a safe manner.
      
      We have now discovered a weakness in the implementation of this FSM.
      Because we directly let the link go from state LINK_ESTABLISHING to
      state LINK_ESTABLISHED already in the first lock context, we are unable
      to distinguish between a fully established link, i.e., a link that has
      been added to its slot, and a link that has not yet reached the second
      lock context. It may hence happen that a manual intervention, e.g., when
      disabling an interface, causes the function tipc_node_link_down() to try
      removing the link from the node slots, decrementing its active link
      counter etc, although the link was never added there in the first place.
      
      We solve this by delaying the actual state change until we reach the
      second lock context, inside the function tipc_node_link_up(). This
      makes it possible for potential callers of __tipc_node_link_down() to
      know if they should proceed or not, and the problem is solved.
      
      Unfortunately, the situation described above also has a second problem.
      Since there by necessity is a tipc_node_link_up() call pending once
      the node lock has been released, we must defuse that call by setting
      the link back from LINK_ESTABLISHING to LINK_RESET state. This forces
      us to make a slight modification to the link FSM, which will now look
      as follows.
      
       +------------------------------------+
       |RESET_EVT                           |
       |                                    |
       |                             +--------------+
       |           +-----------------|   SYNCHING   |-----------------+
       |           |FAILURE_EVT      +--------------+   PEER_RESET_EVT|
       |           |                  A            |                  |
       |           |                  |            |                  |
       |           |                  |            |                  |
       |           |                  |SYNCH_      |SYNCH_            |
       |           |                  |BEGIN_EVT   |END_EVT           |
       |           |                  |            |                  |
       |           V                  |            V                  V
       |    +-------------+          +--------------+          +------------+
       |    |  RESETTING  |<---------|  ESTABLISHED |--------->| PEER_RESET |
       |    +-------------+ FAILURE_ +--------------+ PEER_    +------------+
       |           |        EVT        |    A         RESET_EVT       |
       |           |                   |    |                         |
       |           |  +----------------+    |                         |
       |  RESET_EVT|  |RESET_EVT            |                         |
       |           |  |                     |                         |
       |           |  |                     |ESTABLISH_EVT            |
       |           |  |  +-------------+    |                         |
       |           |  |  | RESET_EVT   |    |                         |
       |           |  |  |             |    |                         |
       |           V  V  V             |    |                         |
       |    +-------------+          +--------------+        RESET_EVT|
       +--->|    RESET    |--------->| ESTABLISHING |<----------------+
            +-------------+ PEER_    +--------------+
             |           A  RESET_EVT       |
             |           |                  |
             |           |                  |
             |FAILOVER_  |FAILOVER_         |FAILOVER_
             |BEGIN_EVT  |END_EVT           |BEGIN_EVT
             |           |                  |
             V           |                  |
            +-------------+                 |
            | FAILINGOVER |<----------------+
            +-------------+
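
The diagram can be transcribed into a transition function as sketched below. This is one reading of the ASCII drawing, not the kernel implementation: the enum names are assumptions (state names follow the boxes with a LINK_ prefix, event names are taken as drawn), and an event with no arrow from the current state is assumed to leave the state unchanged.

```c
enum link_state {
	LINK_RESET, LINK_ESTABLISHING, LINK_ESTABLISHED, LINK_SYNCHING,
	LINK_RESETTING, LINK_PEER_RESET, LINK_FAILINGOVER
};

enum link_evt {
	ESTABLISH_EVT, RESET_EVT, PEER_RESET_EVT, FAILURE_EVT,
	SYNCH_BEGIN_EVT, SYNCH_END_EVT, FAILOVER_BEGIN_EVT, FAILOVER_END_EVT
};

/* Next state for (s, e); events without an arrow keep the state as is. */
static enum link_state link_fsm_next(enum link_state s, enum link_evt e)
{
	switch (s) {
	case LINK_RESET:
		if (e == PEER_RESET_EVT)
			return LINK_ESTABLISHING;
		if (e == FAILOVER_BEGIN_EVT)
			return LINK_FAILINGOVER;
		break;
	case LINK_ESTABLISHING:
		if (e == ESTABLISH_EVT)
			return LINK_ESTABLISHED;
		if (e == RESET_EVT)	/* the arrow added by this commit */
			return LINK_RESET;
		if (e == FAILOVER_BEGIN_EVT)
			return LINK_FAILINGOVER;
		break;
	case LINK_ESTABLISHED:
		if (e == SYNCH_BEGIN_EVT)
			return LINK_SYNCHING;
		if (e == FAILURE_EVT)
			return LINK_RESETTING;
		if (e == PEER_RESET_EVT)
			return LINK_PEER_RESET;
		if (e == RESET_EVT)
			return LINK_RESET;
		break;
	case LINK_SYNCHING:
		if (e == SYNCH_END_EVT)
			return LINK_ESTABLISHED;
		if (e == FAILURE_EVT)
			return LINK_RESETTING;
		if (e == PEER_RESET_EVT)
			return LINK_PEER_RESET;
		if (e == RESET_EVT)
			return LINK_RESET;
		break;
	case LINK_RESETTING:
		if (e == RESET_EVT)
			return LINK_RESET;
		break;
	case LINK_PEER_RESET:
		if (e == RESET_EVT)
			return LINK_ESTABLISHING;
		break;
	case LINK_FAILINGOVER:
		if (e == FAILOVER_END_EVT)
			return LINK_RESET;
		break;
	}
	return s;
}
```

The ESTABLISHING to RESET transition is the one that lets a pending tipc_node_link_up() call be defused, as described in the paragraph before the diagram.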
      
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Acked-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tipc: disallow packet duplicates in link deferred queue · 8306f99a
      Jon Paul Maloy authored
      
      
      After the previous commits, we are guaranteed that no attempt will be
      made to add packets of type LINK_PROTOCOL, or packets with illegal
      sequence numbers, to the link deferred queue. This makes it possible to
      make some simplifications to the sorting algorithm in the function
      tipc_skb_queue_sorted().
      
      We also alter the function so that it will drop packets if one with
      the same sequence number is already present in the queue. This is
      necessary because we have identified weird packet sequences, involving
      duplicate packets, where a legitimate in-sequence packet may advance to
      the head of the queue without being detected and de-queued.
      
      Finally, we make this function non-inline, since it will now be called
      only in exceptional cases.
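
A sorted insertion that rejects duplicates can be sketched as follows. This is a user-space illustration, not tipc_skb_queue_sorted() itself: struct defq and defq_add_sorted() are hypothetical names, the queue is a fixed-size array instead of an skb list, and 16-bit wrap-around is not handled.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define DEFQ_MAX 64	/* illustrative capacity, not a TIPC constant */

struct defq {
	uint16_t seqno[DEFQ_MAX];	/* ascending, no duplicates */
	int len;
};

/*
 * Insert seqno at its sorted position. Returns false and leaves the
 * queue untouched when the number is already queued or the queue is
 * full; the caller would then drop the packet.
 */
static bool defq_add_sorted(struct defq *q, uint16_t seqno)
{
	int pos = 0;

	while (pos < q->len && q->seqno[pos] < seqno)
		pos++;
	if (pos < q->len && q->seqno[pos] == seqno)
		return false;
	if (q->len == DEFQ_MAX)
		return false;
	/* Shift the tail up by one slot to make room at pos. */
	memmove(&q->seqno[pos + 1], &q->seqno[pos],
		(size_t)(q->len - pos) * sizeof(q->seqno[0]));
	q->seqno[pos] = seqno;
	q->len++;
	return true;
}
```

Because duplicates never enter the queue, the head of the queue is always the only candidate for the next in-sequence packet.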
      
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Acked-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tipc: improve sequence number checking · 81204c49
      Jon Paul Maloy authored
      
      
      The sequence number of an incoming packet is currently only checked
      for less than, equality to, or bigger than the next expected number,
      meaning that the receive window in practice becomes one half sequence
      number cycle, or U16_MAX/2. This does not make sense, and may not even
      be safe if there are extreme delays in the network. Any packet sent by
      the peer during the ongoing cycle must belong inside his current send
      window, or should otherwise be dropped if possible.
      
      Since a link endpoint cannot know its peer's current send window, it
      has to base this sanity check on a worst-case assumption, i.e., that
      the peer is using a maximum sized window of 8191 packets. Using this
      assumption, we now add a check that the sequence number is not bigger
      than next_expected + TIPC_MAX_LINK_WIN. We also re-order the checks
      done, so that the receive window test is performed before the gap test.
      This way, we are guaranteed that no packets with illegal sequence numbers
      are ever added to the deferred queue.
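
The ordering of the two tests can be sketched with 16-bit modular arithmetic. The value 8191 for TIPC_MAX_LINK_WIN is taken from the text; the helper names, the verdict enum and classify_seqno() are assumptions made for this illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define TIPC_MAX_LINK_WIN 8191u	/* maximum window, as stated in the text */

enum rcv_verdict { RCV_DROP, RCV_DELIVER, RCV_DEFER };

/* left <= right in a 16-bit sequence space that wraps around. */
static bool seq_less_eq(uint16_t left, uint16_t right)
{
	return (uint16_t)(right - left) < 0x8000u;
}

static bool seq_less(uint16_t left, uint16_t right)
{
	return seq_less_eq(left, right) && left != right;
}

static bool seq_more(uint16_t left, uint16_t right)
{
	return !seq_less_eq(left, right);
}

/* Receive window test first, gap test second. */
static enum rcv_verdict classify_seqno(uint16_t seqno, uint16_t rcv_nxt)
{
	uint16_t win_lim = (uint16_t)(rcv_nxt + TIPC_MAX_LINK_WIN);

	if (seq_less(seqno, rcv_nxt) || seq_more(seqno, win_lim))
		return RCV_DROP;	/* outside any possible send window */
	if (seqno != rcv_nxt)
		return RCV_DEFER;	/* gap: park in the deferred queue */
	return RCV_DELIVER;
}
```

Since the window test runs first, anything that reaches the RCV_DEFER branch is known to lie within rcv_nxt + TIPC_MAX_LINK_WIN.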
      
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Acked-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tipc: simplify tipc_link_rcv() reception loop · f9aa358a
      Jon Paul Maloy authored
      
      
      Currently, all packets received in tipc_link_rcv() are unconditionally
      added to the packet deferred queue, whereafter that queue is walked and
      all its buffers evaluated for delivery. This is both non-optimal and
      makes the queue sorting function unnecessarily complex.
      
      This commit changes the loop so that an arrived packet is evaluated
      first, and added to the deferred queue only when a sequence number gap
      is discovered. A non-empty deferred queue is walked until it is empty
      or until its head's sequence number doesn't fit.
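
The reshaped loop can be sketched as below: evaluate the arrived packet first, park it only on a gap, and drain parked packets for as long as the next expected number is present. This is a simplified model, not tipc_link_rcv(): struct rcv_link and link_rcv() are hypothetical, and the deferred queue is replaced by a small table of flags indexed by sequence number.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEFER_SLOTS 16	/* illustrative capacity, not a TIPC constant */

struct rcv_link {
	uint16_t rcv_nxt;		/* next expected sequence number */
	bool deferred[DEFER_SLOTS];	/* slot s % DEFER_SLOTS: seqno s is parked */
};

/* Process one arrived packet; returns how many packets were delivered. */
static int link_rcv(struct rcv_link *l, uint16_t seqno)
{
	uint16_t gap = (uint16_t)(seqno - l->rcv_nxt);
	int delivered = 0;

	if (gap >= DEFER_SLOTS)
		return 0;	/* old duplicate or too far ahead: drop */
	if (gap > 0) {
		l->deferred[seqno % DEFER_SLOTS] = true;
		return 0;	/* gap found: park the packet */
	}
	/* In sequence: deliver it, then drain whatever now fits. */
	l->rcv_nxt++;
	delivered++;
	while (l->deferred[l->rcv_nxt % DEFER_SLOTS]) {
		l->deferred[l->rcv_nxt % DEFER_SLOTS] = false;
		l->rcv_nxt++;
		delivered++;
	}
	return delivered;
}
```

In the common case of in-sequence arrival the deferred structure is never written to, which is the saving the commit is after.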
      
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Acked-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tipc: limit usage of temporary skb list during packet reception · 9945e804
      Jon Paul Maloy authored
      
      
      During packet reception, the function tipc_link_rcv() adds its accepted
      packets to a temporary buffer queue, before finally splicing this queue
      into the lock protected input queue that will be delivered up to the
      socket layer. The purpose is to reduce potential contention on the input
      queue lock. However, since the vast majority of packets arrive in
      sequence, they will anyway be added one by one to the input queue, and
      the use of the temporary queue becomes a sub-optimization.
      
      The only case where this queue makes sense is when unpacking buffers
      from a bundle packet; here we want to avoid dozens of small buffers
      being added individually to the lock-protected input queue in a tight
      loop.
      
      In this commit, we remove the general usage of the temporary queue,
      and keep it only for the packet unbundling case.
      
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Acked-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  4. 15 Oct, 2015 1 commit
    • tipc: move fragment importance field to new header position · dde4b5ae
      Jon Paul Maloy authored
      In commit e3eea1eb ("tipc: clean up handling of message priorities")
      we introduced a field in the packet header for keeping track of the
      priority of fragments, since this value is not present in the specified
      protocol header. Since the value so far only is used at the transmitting
      end of the link, we have not yet officially defined it as part of the
      protocol.
      
      Unfortunately, the field we use for keeping this value, bits 13-15 in
      word 5, has turned out to be a poor choice; it is already used by the
      broadcast protocol for carrying the 'network id' field of the sending
      node. Since packet fragments also need to be transported across the
      broadcast protocol, the risk of conflict is obvious, and we see this
      happen when we use network identities larger than 2^13-1. This has
      escaped our testing because we have so far only been using small network
      id values.
      
      We now move this field to bits 0-2 in word 9, a field that is guaranteed
      to be unused by all involved protocols.
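
The conflict and its fix can be illustrated with plain bit manipulation on a header modelled as an array of 32-bit words. The sketch assumes host byte order, bit 0 as the least significant bit, and a network id that occupies word 5 from bit 0 upwards; all helper names are hypothetical.

```c
#include <stdint.h>

#define HDR_WORDS 10	/* enough words to reach word 9 */

static uint32_t hdr_bits(const uint32_t *hdr, int w, int pos, uint32_t mask)
{
	return (hdr[w] >> pos) & mask;
}

static void hdr_set_bits(uint32_t *hdr, int w, int pos, uint32_t mask,
			 uint32_t val)
{
	hdr[w] &= ~(mask << pos);
	hdr[w] |= (val & mask) << pos;
}

/* Old position: bits 13-15 of word 5, overlapping large network ids. */
static void set_fragm_prio_old(uint32_t *hdr, uint32_t prio)
{
	hdr_set_bits(hdr, 5, 13, 0x7, prio);
}

/* New position: bits 0-2 of word 9. */
static void set_fragm_prio_new(uint32_t *hdr, uint32_t prio)
{
	hdr_set_bits(hdr, 9, 0, 0x7, prio);
}

static uint32_t fragm_prio_new(const uint32_t *hdr)
{
	return hdr_bits(hdr, 9, 0, 0x7);
}
```

With a network id of 8192 (2^13) stored in word 5, writing a priority through the old accessor alters the id, while the new accessor leaves word 5 untouched.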
      
      Fixes: e3eea1eb ("tipc: clean up handling of message priorities")
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Acked-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  5. 14 Oct, 2015 1 commit
    • tipc: eliminate risk of stalled link synchronization · 0f8b8e28
      Jon Paul Maloy authored
      In commit 6e498158 ("tipc: move link synch and failover to link aggregation level")
      we introduced a new mechanism for performing link failover and
      synchronization. We have now detected a bug in this mechanism.
      
      During link synchronization we use the arrival of any packet on
      the tunnel link to trigger a check for whether it has reached the
      synchronization point or not. This has turned out to be too
      permissive, since it may cause an arriving non-last SYNCH packet to
      end the synch state, just to see the next SYNCH packet initiate a
      new synch state with a new, higher synch point. This is not fatal,
      but should be avoided, because it may significantly extend the
      synchronization period, while at the same time we are not allowed
      to send NACKs if packets are lost. In the worst case, a low-traffic
      user may see its traffic stall until a LINK_PROTOCOL state message
      triggers the link to leave synchronization state.
      
      At the same time, LINK_PROTOCOL packets which happen to have a (non-
      valid) sequence number lower than the tunnel link's rcv_nxt value will
      be consistently dropped, and will never be able to resolve the situation
      described above.
      
      We fix this by exempting LINK_PROTOCOL packets from the sequence number
      check, as they should be. We also reduce (but don't completely
      eliminate) the risk of entering multiple synchronization states by only
      allowing the (logically) first SYNCH packet to initiate a synchronization
      state. This works independently of actual packet arrival order.
      
      Fixes: commit 6e498158 ("tipc: move link synch and failover to link aggregation level")
      
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Acked-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  6. 21 Sep, 2015 1 commit
  7. 09 Sep, 2015 1 commit
    • net: tipc: fix stall during bclink wakeup procedure · 7845989c
      Kolmakov Dmitriy authored
      
      
      If an attempt to wake up users of the broadcast link is made when there
      is not enough space in the send queue, then it may hang inside the
      tipc_sk_rcv() function, since the loop breaks only after the wakeup
      queue becomes empty. This can lead to a complete CPU stall with the
      following message generated by RCU:
      
      INFO: rcu_sched self-detected stall on CPU { 0}  (t=2101 jiffies
      					g=54225 c=54224 q=11465)
      Task dump for CPU 0:
      tpch            R  running task        0 39949  39948 0x0000000a
       ffffffff818536c0 ffff88181fa037a0 ffffffff8106a4be 0000000000000000
       ffffffff818536c0 ffff88181fa037c0 ffffffff8106d8a8 ffff88181fa03800
       0000000000000001 ffff88181fa037f0 ffffffff81094a50 ffff88181fa15680
      Call Trace:
       <IRQ>  [<ffffffff8106a4be>] sched_show_task+0xae/0x120
       [<ffffffff8106d8a8>] dump_cpu_task+0x38/0x40
       [<ffffffff81094a50>] rcu_dump_cpu_stacks+0x90/0xd0
       [<ffffffff81097c3b>] rcu_check_callbacks+0x3eb/0x6e0
       [<ffffffff8106e53f>] ? account_system_time+0x7f/0x170
       [<ffffffff81099e64>] update_process_times+0x34/0x60
       [<ffffffff810a84d1>] tick_sched_handle.isra.18+0x31/0x40
       [<ffffffff810a851c>] tick_sched_timer+0x3c/0x70
       [<ffffffff8109a43d>] __run_hrtimer.isra.34+0x3d/0xc0
       [<ffffffff8109aa95>] hrtimer_interrupt+0xc5/0x1e0
       [<ffffffff81030d52>] ? native_smp_send_reschedule+0x42/0x60
       [<ffffffff81032f04>] local_apic_timer_interrupt+0x34/0x60
       [<ffffffff810335bc>] smp_apic_timer_interrupt+0x3c/0x60
       [<ffffffff8165a3fb>] apic_timer_interrupt+0x6b/0x70
       [<ffffffff81659129>] ? _raw_spin_unlock_irqrestore+0x9/0x10
       [<ffffffff8107eb9f>] __wake_up_sync_key+0x4f/0x60
       [<ffffffffa313ddd1>] tipc_write_space+0x31/0x40 [tipc]
       [<ffffffffa313dadf>] filter_rcv+0x31f/0x520 [tipc]
       [<ffffffffa313d699>] ? tipc_sk_lookup+0xc9/0x110 [tipc]
       [<ffffffff81659259>] ? _raw_spin_lock_bh+0x19/0x30
       [<ffffffffa314122c>] tipc_sk_rcv+0x2dc/0x3e0 [tipc]
       [<ffffffffa312e7ff>] tipc_bclink_wakeup_users+0x2f/0x40 [tipc]
       [<ffffffffa313ce26>] tipc_node_unlock+0x186/0x190 [tipc]
       [<ffffffff81597c1c>] ? kfree_skb+0x2c/0x40
       [<ffffffffa313475c>] tipc_rcv+0x2ac/0x8c0 [tipc]
       [<ffffffffa312ff58>] tipc_l2_rcv_msg+0x38/0x50 [tipc]
       [<ffffffff815a76d3>] __netif_receive_skb_core+0x5a3/0x950
       [<ffffffff815a98d3>] __netif_receive_skb+0x13/0x60
       [<ffffffff815a993e>] netif_receive_skb_internal+0x1e/0x90
       [<ffffffff815aa138>] napi_gro_receive+0x78/0xa0
       [<ffffffffa07f93f4>] tg3_poll_work+0xc54/0xf40 [tg3]
       [<ffffffff81597c8c>] ? consume_skb+0x2c/0x40
       [<ffffffffa07f9721>] tg3_poll_msix+0x41/0x160 [tg3]
       [<ffffffff815ab0f2>] net_rx_action+0xe2/0x290
       [<ffffffff8104b92a>] __do_softirq+0xda/0x1f0
       [<ffffffff8104bc26>] irq_exit+0x76/0xa0
       [<ffffffff81004355>] do_IRQ+0x55/0xf0
       [<ffffffff8165a12b>] common_interrupt+0x6b/0x6b
       <EOI>
      
      The issue occurs only when tipc_sk_rcv() is used to wake up postponed
      senders:
      
      	tipc_bclink_wakeup_users()
      		// wakeupq - is a queue which consists of special
      		// 		 messages with SOCK_WAKEUP type.
      		tipc_sk_rcv(wakeupq)
      			...
      			while (skb_queue_len(inputq)) {
      				filter_rcv(skb)
      					// Here the type of message is checked
      					// and if it is SOCK_WAKEUP then
      					// it tries to wake up a sender.
      					tipc_write_space(sk)
      						wake_up_interruptible_sync_poll()
      			}
      
      After the sender thread is woken up, it can gain control and attempt
      to send a message. But if there is not enough space in the send
      queue, it will call the link_schedule_user() function, which puts a
      message of type SOCK_WAKEUP on the wakeup queue and puts the sender
      back to sleep. Thus the size of the queue does not actually change
      and the while() loop never exits.
      
      The approach I propose is to wake up only those senders for which there
      is enough space in the send queue, so the described issue can't occur.
      Moreover, the same approach is already used to wake up senders on
      unicast links.
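
      A minimal userspace sketch of that idea, not the actual patch. The
      types bclink_model and wakeup_req and the function prepare_wakeup()
      are hypothetical names introduced for illustration. Requests are
      released oldest first, and the walk stops at the first sender whose
      packets would not fit, so a woken sender never has to re-queue itself
      immediately:

```c
#include <assert.h>
#include <stddef.h>

#define MAX_REQ 64

/* Hypothetical model: one pending wakeup request per blocked sender. */
struct wakeup_req {
	int sender_id;
	int chain_sz;		/* packets the sender wants to queue */
};

struct bclink_model {
	int queue_len;		/* packets currently in the send queue */
	int queue_limit;	/* e.g. 40 in the report above */
	struct wakeup_req wakeupq[MAX_REQ];
	size_t nreq;		/* valid entries in wakeupq */
};

/*
 * Move the requests that fit from wakeupq to 'ready', oldest first, and
 * stop at the first one that does not fit. Returns the number of senders
 * to wake. Senders left in wakeupq stay asleep until a later call.
 */
static size_t prepare_wakeup(struct bclink_model *l, struct wakeup_req *ready)
{
	int pending = 0;
	size_t n = 0, i;

	assert(l->nreq <= MAX_REQ);
	while (n < l->nreq) {
		pending += l->wakeupq[n].chain_sz;
		if (l->queue_len + pending > l->queue_limit)
			break;
		ready[n] = l->wakeupq[n];
		n++;
	}
	/* Compact the remaining requests to the front of wakeupq. */
	for (i = n; i < l->nreq; i++)
		l->wakeupq[i - n] = l->wakeupq[i];
	l->nreq -= n;
	return n;
}
```

      Because the loop is bounded by the number of queued requests rather
      than by "queue is empty", it terminates even when no sender can be
      released.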
      
      I ran into the issue in our product code, but to reproduce it I
      changed a benchmark test application (from
      tipcutils/demos/benchmark) to perform the following scenario:
      	1. Run 64 instances of the test application (nodes). This can be
      	   done on a single physical machine.
      	2. Each application connects to all the others using TIPC sockets
      	   in RDM mode.
      	3. When setup is done, all nodes simultaneously start sending
      	   broadcast messages.
      	4. Everything hangs.
      
      The issue is reproducible only when congestion occurs on the broadcast
      link. For example, with only 8 nodes it works fine, since congestion
      doesn't occur. The send queue limit is 40 in my case (I use the
      critical importance level), and when 64 nodes send a message at the
      same moment, congestion occurs every time.
      
      Signed-off-by: Dmitry S Kolmakov <kolmakov.dmitriy@huawei.com>
      Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
      Acked-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      7845989c
  8. 23 Aug, 2015 3 commits
    • tipc: fix stale link problem during synchronization · 2be80c2d
      Jon Paul Maloy authored
      
      
      Recent changes to the link synchronization mean that we can now just
      drop packets arriving on the synchronizing link before the synch point
      is reached. This has led to significant simplifications to the
      implementation, but also turns out to have a flip side that we need
      to consider.
      
      Under unlucky circumstances, the two endpoints may end up
      repeatedly dropping each other's packets, while immediately
      asking for retransmission of the same packets, just to drop
      them once more. This pattern will eventually be broken when
      the synch point is reached on the other link, but before that,
      the endpoints may have arrived at the retransmission limit
      (stale counter) that indicates that the link should be broken.
      We see this happen on rare occasions.
      
      The fix for this is to not ask for retransmissions when a link is in
      state LINK_SYNCHING. The fact that the link has reached this state
      means that it has already received the first SYNCH packet, and that it
      knows the synch point. Hence, it doesn't need any more packets until the
      other link has reached the synch point, whereafter it can go ahead and
      ask for the missing packets.
      
      However, because of the reduced traffic on the synching link that
      follows this change, it may now take longer to discover that the
      synch point has been reached. We compensate for this by letting all
      packets, on any of the links, trigger a check for synchronization
      termination. This is possible because the packets themselves don't
      contain any information that is needed for discovering this condition.
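
      The two rules above can be sketched as follows. This is an
      illustrative model only: struct link_model, its fields, and both
      function names are assumptions, and where the synch point is stored
      in the real code is not stated here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum link_state { LINK_ESTABLISHED, LINK_SYNCHING };

/* Hypothetical, reduced view of a link endpoint. */
struct link_model {
	enum link_state state;
	uint16_t rcv_nxt;	/* next in-sequence packet expected */
	uint16_t synch_point;	/* meaningful only while LINK_SYNCHING */
};

/* Wrap-safe "a comes before b" for 16-bit sequence numbers. */
static bool seq_less(uint16_t a, uint16_t b)
{
	return (uint16_t)(a - b) > 0x7fff;
}

/* A gap was detected on 'l': should a retransmission be requested? */
static bool should_request_retransmit(const struct link_model *l)
{
	/* A synching link already knows the synch point; asking again
	 * would only feed the drop/retransmit cycle described above. */
	return l->state != LINK_SYNCHING;
}

/*
 * Run for every packet arriving on either link. The packet itself is not
 * inspected: only the other link's receive position matters.
 */
static void check_synch_end(struct link_model *synching,
			    const struct link_model *other)
{
	assert(synching != other);
	if (synching->state != LINK_SYNCHING)
		return;
	if (!seq_less(other->rcv_nxt, synching->synch_point))
		synching->state = LINK_ESTABLISHED;
}
```

      Once the link has left LINK_SYNCHING, should_request_retransmit()
      returns true again and the missing packets can be requested.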
      
      Reviewed-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      2be80c2d
    • tipc: interrupt link synchronization when a link goes down · 5ae2f8e6
      Jon Paul Maloy authored
      
      
      When we introduced the new link failover/synch mechanism in commit
      6e498158 ("tipc: move link synch and failover to link aggregation level"),
      we missed the case when the non-tunnel link goes down during the link
      synchronization period. In this case the tunnel link will remain in
      state LINK_SYNCHING, which leads to unpredictable behavior when
      the failover procedure is initiated.
      
      In this commit, we ensure that the node and remaining link go
      back to regular communication state (SELF_UP_PEER_UP/LINK_ESTABLISHED)
      when one of the parallel links goes down. We also ensure that we don't
      re-enter synch mode if subsequent SYNCH packets arrive on the remaining
      link.
      
      Reviewed-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      5ae2f8e6
    • tipc: eliminate risk of premature link setup during failover · 17b20630
      Jon Paul Maloy authored
      
      
      When a link goes down, and there is still a working link towards its
      destination node, a failover is initiated, and the failed link is not
      allowed to re-establish until that procedure is finished. To ensure
      this, the concerned link endpoints are set to state LINK_FAILINGOVER,
      and the node endpoints to NODE_FAILINGOVER during the failover period.
      
      However, if the link reset is due to a disabled bearer, the
      corresponding link endpoint is deleted, and only the node endpoint knows
      about the ongoing failover. Now, if the disabled bearer is re-enabled
      during the failover period, the discovery mechanism may create a new
      link endpoint that is ready to be established, even though this is not
      permitted. This situation may cause both the ongoing failover and any
      subsequent link synchronization to fail.
      
      In this commit, we ensure that a newly created link goes directly to
      state LINK_FAILINGOVER if the corresponding node state is
      NODE_FAILINGOVER. This eliminates the problem described above.
      
      Furthermore, we tighten the criteria for which packets are allowed
      to end a failover state in the function tipc_node_check_state().
      By checking that the receiving link is up and running, instead of just
      checking that it is not in failover mode, we eliminate the risk that
      protocol packets from the re-created link may cause the failover to
      be prematurely terminated.
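
      A rough sketch of the two checks described above. Only the state
      names LINK_FAILINGOVER and NODE_FAILINGOVER come from the text; the
      structs, the helper names, and the assumption that a fresh link
      otherwise starts in a reset state are illustrative.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum node_state { SELF_UP_PEER_UP, NODE_FAILINGOVER };
enum link_state { LINK_RESET, LINK_ESTABLISHED, LINK_FAILINGOVER };

struct node_model { enum node_state state; };

/* Initial state for a link endpoint created by neighbor discovery. */
static enum link_state initial_link_state(const struct node_model *n)
{
	assert(n != NULL);
	/* A link born during a failover must not be allowed to establish. */
	return n->state == NODE_FAILINGOVER ? LINK_FAILINGOVER : LINK_RESET;
}

/* Looser test: any link not in failover mode could end the failover. */
static bool may_end_failover_loose(enum link_state rcv_link)
{
	return rcv_link != LINK_FAILINGOVER;
}

/* Tightened test: the receiving link must be up and running. */
static bool may_end_failover_strict(enum link_state rcv_link)
{
	return rcv_link == LINK_ESTABLISHED;
}
```

      The difference shows for a link that is reset but not failing over:
      the loose test lets its protocol packets end the failover, while the
      strict test does not.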
      
      Reviewed-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      17b20630
  9. 17 Aug, 2015 1 commit
  10. 31 Jul, 2015 13 commits
    • ipv6: change ipv6_stub_impl.ipv6_dst_lookup to take net argument · 343d60aa
      Roopa Prabhu authored
      
      
      This patch adds a net argument to ipv6_stub_impl.ipv6_dst_lookup
      for use cases where sk is not available (like mpls).
      sk appears to be needed to get the namespace 'net' and is optional
      otherwise. This patch series changes ipv6_stub_impl.ipv6_dst_lookup
      to take a net argument. sk remains optional.
      
      All callers of ipv6_stub_impl.ipv6_dst_lookup have been modified
      to pass net. I have modified them to use the 'net' already available
      in the scope of the call. I can change them to
      sock_net(sk) to avoid any unintended change in behaviour if the sock
      namespace is different. From code inspection, they don't seem to differ.
      
      Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      343d60aa
    • tipc: clean up link creation · 440d8963
      Jon Paul Maloy authored
      
      
      We simplify the link creation function tipc_link_create() and the way
      the link struct is connected to the node struct. In particular, we
      remove the duplicate initialization of some fields which are anyway set
      in tipc_link_reset().
      
      Tested-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      440d8963
    • tipc: use temporary, non-protected skb queue for bundle reception · 9073fb8b
      Jon Paul Maloy authored
      
      
      Currently, when we extract small messages from a message bundle, or
      when many messages have accumulated in the link arrival queue, those
      messages are added one by one to the lock protected link input queue.
      This may increase contention with the reader of that queue, in
      the function tipc_sk_rcv().
      
      This commit introduces a temporary, unprotected input queue in
      tipc_link_rcv() for such cases. Only when the arrival queue has been
      emptied, and the function is ready to return, does it splice the whole
      temporary queue into the real input queue.
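
      The pattern can be illustrated with a small userspace model. All
      names below are hypothetical, and a counter stands in for the real
      spinlock so that the number of lock acquisitions is visible. In the
      kernel, the sk_buff API provides skb_queue_splice_tail_init() for
      this kind of constant-time splice.

```c
#include <assert.h>
#include <stddef.h>

struct msg {
	struct msg *next;
	int id;
};

struct queue {
	struct msg *head, *tail;
};

/* Stand-in for a lock-protected queue; counts lock acquisitions. */
struct locked_queue {
	struct queue q;
	unsigned int lock_count;
};

static void enqueue(struct queue *q, struct msg *m)
{
	m->next = NULL;
	if (q->tail)
		q->tail->next = m;
	else
		q->head = m;
	q->tail = m;
}

/* Append all of 'src' to 'dst' in O(1) and leave 'src' empty. */
static void splice_tail_init(struct queue *src, struct queue *dst)
{
	if (!src->head)
		return;
	if (dst->tail)
		dst->tail->next = src->head;
	else
		dst->head = src->head;
	dst->tail = src->tail;
	src->head = src->tail = NULL;
}

/* Deliver n messages while taking the input queue lock only once. */
static void rcv_batch(struct locked_queue *inputq, struct msg *msgs, size_t n)
{
	struct queue tmpq = { NULL, NULL };	/* private to this call */
	size_t i;

	for (i = 0; i < n; i++)
		enqueue(&tmpq, &msgs[i]);	/* no lock needed here */

	inputq->lock_count++;			/* models taking the lock */
	splice_tail_init(&tmpq, &inputq->q);
	assert(tmpq.head == NULL);		/* lock would be dropped here */
}
```

      With per-message insertion, a batch of n messages costs n lock
      acquisitions; here it costs one, regardless of n.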
      
      Tested-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      9073fb8b
    • tipc: remove implicit message delivery in node_unlock() · 23d8335d
      Jon Paul Maloy authored
      
      
      After the most recent changes, all access calls to a link which
      may entail addition of messages to the link's input queue are
      followed by an explicit call to tipc_sk_rcv(), using a reference
      to the correct queue.
      
      This means that the potentially hazardous implicit delivery, using
      tipc_node_unlock() in combination with a binary flag and a cached
      queue pointer, now has become redundant.
      
      This commit removes this implicit delivery mechanism both for regular
      data messages and for binding table update messages.
      
      Tested-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      23d8335d
    • tipc: make resetting of links non-atomic · 598411d7
      Jon Paul Maloy authored
      
      
      In order to facilitate future improvements to the locking structure, we
      want to make resetting and establishing of links non-atomic. I.e., the
      functions tipc_node_link_up() and tipc_node_link_down() should be called
      from outside the node lock context, and grab/release the node lock
      themselves. This requires that we can freeze the link state from the
      moment it is set to RESETTING or PEER_RESET in one lock context until
      it is set to RESET or ESTABLISHING in a later context. The recently
      introduced link FSM makes this possible, so we are now ready to introduce
      the above change.
      
      This commit implements this.
      
      Tested-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      598411d7
    • tipc: move received discovery data evaluation inside node.c · cf148816
      Jon Paul Maloy authored
      
      
      The node lock is currently grabbed and released in the function
      tipc_disc_rcv() in the file discover.c. As a preparation for the next
      commits, we need to move this node lock handling, along with the code
      area it is covering, to node.c.
      
      This commit introduces this change.
      
      Tested-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      cf148816
    • tipc: merge link->exec_mode and link->state into one FSM · 662921cd
      Jon Paul Maloy authored
      
      
      Until now, we have been handling link failover and synchronization
      by using an additional link state variable, "exec_mode". This variable
      is not independent of the link FSM state, which creates a risk of
      inconsistencies and also clutters the code.
      
      The conditions are now in place to define a new link FSM that covers
      all existing use cases, including failover and synchronization, and
      eliminate the "exec_mode" field altogether. The FSM must also support
      non-atomic resetting of links, which will be introduced later.
      
      The new link FSM is shown below, with 7 states and 8 events.
      Only events leading to state change are shown as edges.
      
      +------------------------------------+
      |RESET_EVT                           |
      |                                    |
      |                             +--------------+
      |           +-----------------|   SYNCHING   |-----------------+
      |           |FAILURE_EVT      +--------------+   PEER_RESET_EVT|
      |           |                  A            |                  |
      |           |                  |            |                  |
      |           |                  |            |                  |
      |           |                  |SYNCH_      |SYNCH_            |
      |           |                  |BEGIN_EVT   |END_EVT           |
      |           |                  |            |                  |
      |           V                  |            V                  V
      |    +-------------+          +--------------+          +------------+
      |    |  RESETTING  |<---------|  ESTABLISHED |--------->| PEER_RESET |
      |    +-------------+ FAILURE_ +--------------+ PEER_    +------------+
      |           |        EVT        |    A         RESET_EVT       |
      |           |                   |    |                         |
      |           |                   |    |                         |
      |           |    +--------------+    |                         |
      |  RESET_EVT|    |RESET_EVT          |ESTABLISH_EVT            |
      |           |    |                   |                         |
      |           |    |                   |                         |
      |           V    V                   |                         |
      |    +-------------+          +--------------+        RESET_EVT|
      +--->|    RESET    |--------->| ESTABLISHING |<----------------+
           +-------------+ PEER_    +--------------+
            |           A  RESET_EVT       |
            |           |                  |
            |           |                  |
            |FAILOVER_  |FAILOVER_         |FAILOVER_
            |BEGIN_EVT  |END_EVT           |BEGIN_EVT
            |           |                  |
            V           |                  |
           +-------------+                 |
           | FAILINGOVER |<----------------+
           +-------------+
      
      These changes are fully backwards compatible.
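
      As a reading aid, the edges of the diagram can be written out as a
      transition table. This is a sketch, not the kernel implementation:
      the LS_/EV_ identifiers are made up for this example, and any
      (state, event) pair not drawn in the diagram is simply treated as
      "no state change", which may be more lenient than the real FSM.

```c
#include <assert.h>

enum link_state {
	LS_RESET, LS_RESETTING, LS_PEER_RESET, LS_ESTABLISHING,
	LS_ESTABLISHED, LS_SYNCHING, LS_FAILINGOVER, LS_COUNT
};

enum link_evt {
	EV_RESET, EV_FAILURE, EV_PEER_RESET, EV_ESTABLISH,
	EV_SYNCH_BEGIN, EV_SYNCH_END, EV_FAILOVER_BEGIN, EV_FAILOVER_END,
	EV_COUNT
};

struct edge {
	enum link_state from;
	enum link_evt evt;
	enum link_state to;
};

/* One entry per arrow in the diagram. */
static const struct edge link_edges[] = {
	{ LS_RESET,        EV_PEER_RESET,     LS_ESTABLISHING },
	{ LS_RESET,        EV_FAILOVER_BEGIN, LS_FAILINGOVER  },
	{ LS_FAILINGOVER,  EV_FAILOVER_END,   LS_RESET        },
	{ LS_ESTABLISHING, EV_ESTABLISH,      LS_ESTABLISHED  },
	{ LS_ESTABLISHING, EV_FAILOVER_BEGIN, LS_FAILINGOVER  },
	{ LS_ESTABLISHED,  EV_SYNCH_BEGIN,    LS_SYNCHING     },
	{ LS_ESTABLISHED,  EV_FAILURE,        LS_RESETTING    },
	{ LS_ESTABLISHED,  EV_PEER_RESET,     LS_PEER_RESET   },
	{ LS_ESTABLISHED,  EV_RESET,          LS_RESET        },
	{ LS_SYNCHING,     EV_SYNCH_END,      LS_ESTABLISHED  },
	{ LS_SYNCHING,     EV_FAILURE,        LS_RESETTING    },
	{ LS_SYNCHING,     EV_PEER_RESET,     LS_PEER_RESET   },
	{ LS_SYNCHING,     EV_RESET,          LS_RESET        },
	{ LS_RESETTING,    EV_RESET,          LS_RESET        },
	{ LS_PEER_RESET,   EV_RESET,          LS_ESTABLISHING },
};

/* Return the state reached from 's' on event 'e'. */
static enum link_state link_fsm(enum link_state s, enum link_evt e)
{
	unsigned int i;

	assert(s < LS_COUNT && e < EV_COUNT);
	for (i = 0; i < sizeof(link_edges) / sizeof(link_edges[0]); i++)
		if (link_edges[i].from == s && link_edges[i].evt == e)
			return link_edges[i].to;
	return s;	/* not an edge in the diagram: state unchanged */
}
```

      The table makes it easy to see that RESETTING and PEER_RESET each
      have a single exit, on RESET_EVT, which is what allows the state to
      be held between two lock contexts.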
      
      Tested-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      662921cd
    • tipc: move protocol message sending away from link FSM · 5045f7b9
      Jon Paul Maloy authored
      
      
      The implementation of the link FSM currently takes decisions about and
      sends out link protocol messages. This is unnecessary, since such
      actions are not the result of any link state change, and are even
      decided based on non-FSM state information ("silent_intv_cnt").
      
      We now move the sending of unicast link protocol messages to the
      function tipc_link_timeout(), and the initial broadcast synchronization
      message to tipc_node_link_up(). The latter is done because a link
      instance should not need to know whether it is the first or second
      link to a destination. Such information is now restricted to and
      handled by the link aggregation layer in node.c.
      
      Tested-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      5045f7b9
    • tipc: move link synch and failover to link aggregation level · 6e498158
      Jon Paul Maloy authored
      
      
      Link failover and synchronization have until now been handled by the
      links themselves, forcing them to have knowledge about and to access
      parallel links in order to make the two algorithms work correctly.
      
      In this commit, we move the control part of this functionality to the
      link aggregation level in node.c, which is the right location for this.
      As a result, the two algorithms become easier to follow, and the link
      implementation becomes simpler.
      
      Tested-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      6e498158
    • tipc: extend node FSM · 66996b6c
      Jon Paul Maloy authored
      
      
      In the next commit, we will move link synch/failover orchestration to
      the link aggregation level. In order to do this, we first need to extend
      the node FSM with two more states, NODE_SYNCHING and NODE_FAILINGOVER,
      plus four new events to enter and leave those states.
      
      This commit introduces this change, without yet making use of it.
      The node FSM now looks as follows:
      
                                 +-----------------------------------------+
                                 |                            PEER_DOWN_EVT|
                                 |                                         |
        +------------------------+----------------+                        |
        |SELF_DOWN_EVT           |                |                        |
        |                        |                |                        |
        |              +-----------+          +-----------+                |
        |              |NODE_      |          |NODE_      |                |
        |   +----------|FAILINGOVER|<---------|SYNCHING   |------------+   |
        |   |SELF_     +-----------+ FAILOVER_+-----------+    PEER_   |   |
        |   |DOWN_EVT   |         A  BEGIN_EVT A         |     DOWN_EVT|   |
        |   |           |         |            |         |             |   |
        |   |           |         |            |         |             |   |
        |   |           |FAILOVER_|FAILOVER_   |SYNCH_   |SYNCH_       |   |
        |   |           |END_EVT  |BEGIN_EVT   |BEGIN_EVT|END_EVT      |   |
        |   |           |         |            |         |             |   |
        |   |           |         |            |         |             |   |
        |   |           |        +--------------+        |             |   |
        |   |           +------->|   SELF_UP_   |<-------+             |   |
        |   |   +----------------|   PEER_UP    |------------------+   |   |
        |   |   |SELF_DOWN_EVT   +--------------+     PEER_DOWN_EVT|   |   |
        |   |   |                   A          A                   |   |   |
        |   |   |                   |          |                   |   |   |
        |   |   |        PEER_UP_EVT|          |SELF_UP_EVT        |   |   |
        |   |   |                   |          |                   |   |   |
        V   V   V                   |          |                   V   V   V
      +------------+       +-----------+    +-----------+       +------------+
      |SELF_DOWN_  |       |SELF_UP_   |    |PEER_UP_   |       |PEER_DOWN   |
      |PEER_LEAVING|<------|PEER_COMING|    |SELF_COMING|------>|SELF_LEAVING|
      +------------+ SELF_ +-----------+    +-----------+ PEER_ +------------+
             |       DOWN_EVT       A          A          DOWN_EVT     |
             |                      |          |                       |
             |                      |          |                       |
             |           SELF_UP_EVT|          |PEER_UP_EVT            |
             |                      |          |                       |
             |                      |          |                       |
             |PEER_DOWN_EVT       +--------------+        SELF_DOWN_EVT|
             +------------------->|  SELF_DOWN_  |<--------------------+
                                  |  PEER_DOWN   |
                                  +--------------+
      
      Tested-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      66996b6c
    • tipc: reverse call order for link_reset()->node_link_down() · 655fb243
      Jon Paul Maloy authored
      
      
      In many cases the call order when a link is reset goes as follows:
      tipc_node_xx()->tipc_link_reset()->tipc_node_link_down()
      
      This is not the right order if we want the node to be in control,
      so in this commit we change the order to:
      tipc_node_xx()->tipc_node_link_down()->tipc_link_reset()
      
      The fact that tipc_link_reset() now is called from only one
      location with a well-defined state will also facilitate later
      simplifications of tipc_link_reset() and the link FSM.
      
      Tested-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      655fb243
    • tipc: move all link_reset() calls to link aggregation level · 6144a996
      Jon Paul Maloy authored
      
      
      In line with our effort to let the node level have full control over
      its links, we want to move all link reset calls from link.c to node.c.
      Some of the calls can be moved by simply moving the calling function,
      when this is the right thing to do. For the remaining calls we use
      the now established technique of returning a TIPC_LINK_DOWN_EVT
      flag from tipc_link_rcv(), whereafter we perform the reset call when
      the call returns.
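
      The flag-returning technique might look roughly like this. The flag
      names TIPC_LINK_UP_EVT and TIPC_LINK_DOWN_EVT follow the commit
      messages, but their values, the structs, and the stale-counter
      trigger are assumptions made for the sake of a self-contained
      example.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bit values for the event flags. */
enum {
	TIPC_LINK_UP_EVT   = 1 << 0,
	TIPC_LINK_DOWN_EVT = 1 << 1,
};

struct link_model {
	bool up;
	int stale_count;	/* consecutive failed retransmissions */
	int stale_limit;
};

struct node_model {
	struct link_model link;
	int resets;		/* how often the node reset the link */
};

/* Link level: detect the failure, but only report it to the caller. */
static int link_rcv(struct link_model *l, bool retransmit_failed)
{
	int rc = 0;

	assert(l->stale_limit > 0);
	if (retransmit_failed && ++l->stale_count >= l->stale_limit)
		rc |= TIPC_LINK_DOWN_EVT;
	return rc;
}

/* Node level: the only place where the link is actually reset. */
static void node_link_down(struct node_model *n)
{
	n->link.up = false;
	n->link.stale_count = 0;
	n->resets++;
}

static void node_rcv(struct node_model *n, bool retransmit_failed)
{
	int rc = link_rcv(&n->link, retransmit_failed);

	/* Act on the event after the link-level call has returned. */
	if (rc & TIPC_LINK_DOWN_EVT)
		node_link_down(n);
}
```

      Keeping link_rcv() free of side effects on the node is what lets the
      node layer decide when, and under which lock, the reset happens.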
      
      This change serves as a preparation for the coming commits.
      
      Tested-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      6144a996
    • tipc: eliminate function tipc_link_activate() · cbeb83ca
      Jon Paul Maloy authored
      
      
      The function tipc_link_activate() is redundant, since it mostly performs
      settings that have already been done in a preceding tipc_link_reset().
      
      There are three exceptions to this:
      - The actual state change to TIPC_LINK_WORKING. This should anyway be done
        in the FSM, and not in a separate function.
      - Registration of the link with the bearer. This should be done by the
        node, since we don't want the link to have any knowledge about its
        specific bearer.
      - Call to tipc_node_link_up() for user access registration. With the new
        role distribution between link aggregation and link level this becomes
        the wrong call order; tipc_node_link_up() should instead be called
        directly as a result of a TIPC_LINK_UP event, hence by the node itself.
      
      This commit implements those changes.
      
      Tested-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      cbeb83ca
  11. 29 Jul, 2015 1 commit
    • tipc: fix bug in broadcast synch message create function · 5a4c3552
      Jon Maloy authored
      
      
      In commit d999297c ("tipc: reduce locking scope during packet reception")
      we introduced a new function tipc_build_bcast_sync_msg(), which carries
      initial synchronization data between two nodes at first contact and at
      re-contact. In this function, we failed to add the synchronization data,
      with the effect that the broadcast link endpoints will fail to
      synchronize correctly at re-contact between a running and a restarted
      node. All other cases work as intended.
      
      With this commit, we fix this bug.
      
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      5a4c3552