1. 24 Oct, 2015 1 commit
    • tipc: introduce capability bit for broadcast synchronization · fd556f20
      Jon Paul Maloy authored
      Until now, we have tried to support both the newer, dedicated broadcast
      synchronization mechanism along with the older, less safe, RESET_MSG/
      ACTIVATE_MSG based one. The latter method has turned out to be a hazard
      in a highly dynamic cluster, so we find it safer to disable it completely
      when we find that the former mechanism is supported by the peer node.
      For this purpose, we now introduce a new capability bit,
      TIPC_BCAST_SYNCH, to inform any peer nodes that dedicated broadcast
      synchronization is supported by the present node. The new bit is conveyed
      between peers in the 'capabilities' field of neighbor discovery messages.
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Reviewed-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
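The capability check described in the commit above can be sketched in plain C. This is a minimal user-space illustration, not the kernel code: the bit position of TIPC_BCAST_SYNCH, the struct peer type and both helper functions are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed bit position; the commit message names the flag but not its value. */
#define TIPC_BCAST_SYNCH (1u << 1)

/* Capabilities this node would advertise in its discovery messages. */
#define OWN_CAPABILITIES TIPC_BCAST_SYNCH

/* Hypothetical per-peer state, filled in from a received discovery message. */
struct peer {
	uint16_t capabilities;
};

/* True if the peer advertised dedicated broadcast synchronization. */
static inline bool peer_has_bcast_synch(const struct peer *p)
{
	assert(p != NULL);
	return (p->capabilities & TIPC_BCAST_SYNCH) != 0;
}

/*
 * True if the older RESET_MSG/ACTIVATE_MSG based synchronization should
 * be used for this peer: only when the peer lacks the dedicated mechanism.
 */
static inline bool use_legacy_bcast_synch(const struct peer *p)
{
	return !peer_has_bcast_synch(p);
}
```

A peer whose capabilities field is 0 falls back to the legacy method, while a peer that sets the bit has the legacy method disabled completely.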
  2. 31 Jul, 2015 1 commit
  3. 21 Jul, 2015 2 commits
  4. 29 Mar, 2015 1 commit
    • tipc: involve reference counter for node structure · 8a0f6ebe
      Ying Xue authored
      The TIPC node hash table is protected by the RCU lock on the read side.
      tipc_node_find() looks up a node object by node address by iterating
      over the hash table. As the entire traversal inside tipc_node_find()
      is guarded by the RCU read lock, the lookup itself is safe. However,
      when callers use the node object returned by tipc_node_find(), no RCU
      read lock is held. Therefore, this is unsafe for callers of
      tipc_node_find().
      Now we introduce a reference counter for the node structure. Before
      tipc_node_find() returns a node object to its caller, it first increments
      the reference counter. Accordingly, once the caller is done with the
      node, it decrements the counter again. This prevents a node being used by
      one thread from being freed by another thread.
      Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
      Reviewed-by: Jon Maloy <jon.maloy@ericson.com>
      Signed-off-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
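The get/put discipline described in the commit above can be illustrated with C11 atomics. This is a user-space sketch under stated assumptions: struct node is a simplified stand-in for the kernel structure, and node_create(), node_get() and node_put() are hypothetical names chosen for the example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified node object; not the kernel's node structure. */
struct node {
	uint32_t addr;
	atomic_int refcnt;
};

/* Allocate a node holding one reference, owned by the lookup table. */
static inline struct node *node_create(uint32_t addr)
{
	struct node *n = malloc(sizeof(*n));

	if (!n)
		return NULL;
	n->addr = addr;
	atomic_init(&n->refcnt, 1);
	return n;
}

/* Take an extra reference before handing the node to a caller. */
static inline void node_get(struct node *n)
{
	assert(n != NULL);
	atomic_fetch_add(&n->refcnt, 1);
}

/*
 * Drop one reference. Frees the node and returns 1 when the last
 * reference is dropped, otherwise returns 0.
 */
static inline int node_put(struct node *n)
{
	assert(n != NULL);
	if (atomic_fetch_sub(&n->refcnt, 1) == 1) {
		free(n);
		return 1;
	}
	return 0;
}
```

A lookup function would call node_get() while still inside its protected traversal, and every caller would pair that with one node_put() when finished, so the object cannot be freed while it is still in use.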
  5. 14 Mar, 2015 1 commit
  6. 06 Mar, 2015 1 commit
  7. 06 Feb, 2015 1 commit
    • tipc: reduce usage of context info in socket and link · c5898636
      Jon Paul Maloy authored
      The most common usage of namespace information is when we fetch the
      own node address from the net structure. This leads to a lot of
      passing around of a parameter of type 'struct net *' between
      functions just to make them able to obtain this address.
      However, in many cases this is unnecessary. The own node address
      is readily available as a member of both struct tipc_sock and
      tipc_link, and can be fetched from there instead.
      The fact that the vast majority of functions in socket.c and link.c
      anyway are maintaining a pointer to their respective base structures
      makes this option even more compelling.
      In this commit, we introduce the inline functions tsk_own_node()
      and link_own_node() to make it easy for functions to fetch the node
      address from those structs instead of having to pass along and
      dereference the namespace struct.
      In particular, we make calls to the msg_xx() functions in msg.{h,c}
      context independent by directly passing them the own node address
      as parameter when needed. Those functions should be regarded as
      leaves in the code dependency tree, and it is hence desirable to
      keep them namespace unaware.
      Apart from a potential positive effect on cache behavior, these
      changes make it easier to introduce the changes that will follow
      later in this series.
      Reviewed-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  8. 05 Feb, 2015 1 commit
    • tipc: eliminate race during node creation · b45db71b
      Jon Paul Maloy authored
      Instances of struct node are created in the function tipc_disc_rcv()
      under the assumption that there is no race between received discovery
      messages arriving from the same node. This assumption is wrong.
      When we use more than one bearer, it is possible that discovery
      messages from the same node arrive at the same moment, resulting in
      creation of two instances of struct tipc_node. This may later cause
      confusion during link establishment, and may result in one of the links
      never becoming activated.
      We fix this by making lookup and potential creation of nodes atomic.
      Instead of first looking up the node and, in case of failure, creating it,
      we now start with looking up the node inside node_link_create(), and
      return a reference to that one if found. Otherwise, we go ahead and
      create the node as we did before.
      Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
      Reviewed-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
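The idea of making lookup and creation one atomic step, as described in the commit above, can be sketched as follows. This is a user-space illustration only: the linked list, the spin lock built on atomic_flag and the name node_find_or_create() are assumptions for the example, and the kernel uses its own hash table and locking primitives.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified node list standing in for the node table. */
struct node {
	uint32_t addr;
	struct node *next;
};

static struct node *node_list;
static atomic_flag node_list_lock = ATOMIC_FLAG_INIT;

static inline void list_lock(void)
{
	while (atomic_flag_test_and_set_explicit(&node_list_lock,
						 memory_order_acquire))
		; /* spin until the flag is released */
}

static inline void list_unlock(void)
{
	atomic_flag_clear_explicit(&node_list_lock, memory_order_release);
}

/*
 * Return the node for addr, creating it if absent. Lookup and insertion
 * happen while the lock is held once, so two concurrent callers cannot
 * both miss and then both create. Returns NULL only if allocation fails.
 */
static inline struct node *node_find_or_create(uint32_t addr)
{
	struct node *n;

	assert(addr != 0); /* this sketch treats 0 as "no address" */
	list_lock();
	for (n = node_list; n; n = n->next)
		if (n->addr == addr)
			break;
	if (!n) {
		n = malloc(sizeof(*n));
		if (n) {
			n->addr = addr;
			n->next = node_list;
			node_list = n;
		}
	}
	list_unlock();
	return n;
}
```

Two discovery messages from the same address, arriving on different bearers, would both end up with a pointer to the same object instead of creating two instances.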
  9. 12 Jan, 2015 6 commits
  10. 14 May, 2014 2 commits
    • tipc: clean up neighbor discovery message reception · c82910e2
      Jon Paul Maloy authored
      The function tipc_disc_rcv(), which is handling received neighbor
      discovery messages, is perceived as messy, and it is hard to verify
      its correctness by code inspection. The fact that the task it is set
      to resolve is fairly complex does not make the situation better.
      In this commit we try to take a more systematic approach to the
      problem. We define a decision machine which takes three state flags
      as input, and produces three action flags as output. We then walk
      through all permutations of the state flags, and for each of them we
      describe verbally what is going on and set zero or more of
      the action flags. The action flags indicate what should be done once
      the decision machine has finished its job, while the last part of the
      function deals with performing those actions.
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Reviewed-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
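A decision machine of the kind described in the commit above, with three state flags in and three action flags out, can be written as a lookup table with one entry per permutation. The flag names and the particular rules in the table are assumptions of this sketch; the commit message does not list them.

```c
#include <assert.h>

/* State flags (inputs); names are assumptions for this sketch. */
enum {
	LINK_UP    = 1 << 0, /* a working link to the node already exists */
	ADDR_MATCH = 1 << 1, /* media address equals the stored one */
	SIGN_MATCH = 1 << 2, /* node signature equals the stored one */
};

/* Action flags (outputs), carried out after the decision is made. */
enum {
	ACCEPT_SIGN = 1 << 0, /* store the new node signature */
	ACCEPT_ADDR = 1 << 1, /* store the new media address */
	RESPOND     = 1 << 2, /* send a discovery response */
};

/* One entry per permutation of the three state flags. */
static const unsigned char disc_rules[8] = {
	/* Nothing matches, no link: treat the peer as replaced. */
	[0] = ACCEPT_SIGN | ACCEPT_ADDR | RESPOND,
	/* Nothing matches, link is up: possible duplicate, do nothing. */
	[LINK_UP] = 0,
	/* Same address, new signature, no link: e.g. peer rebooted. */
	[ADDR_MATCH] = ACCEPT_SIGN | RESPOND,
	/* Same address, new signature, link is up: delayed rediscovery. */
	[ADDR_MATCH | LINK_UP] = ACCEPT_SIGN,
	/* Same signature, new address, no link: e.g. interface replaced. */
	[SIGN_MATCH] = ACCEPT_ADDR | RESPOND,
	/* Same signature, new address, link is up: possible duplicate. */
	[SIGN_MATCH | LINK_UP] = 0,
	/* Everything matches, no link: just answer the request. */
	[SIGN_MATCH | ADDR_MATCH] = RESPOND,
	/* Everything matches, link is up: all is fine. */
	[SIGN_MATCH | ADDR_MATCH | LINK_UP] = 0,
};

/* Map a combination of state flags to the action flags to perform. */
static inline int disc_decide(int state)
{
	assert(state >= 0 && state < 8);
	return disc_rules[state];
}
```

Keeping the decision in a table separates "what is going on" from "what to do about it", which is the property that makes such a function easier to verify by inspection.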
    • tipc: improve and extend media address conversion functions · 38504c28
      Jon Paul Maloy authored
      TIPC currently handles two media specific addresses: Ethernet MAC
      addresses and InfiniBand addresses. Those are kept in three different
      formats:
      1) A "raw" format as obtained from the device. This format is known
         only by the media specific adapter code in eth_media.c and
      2) A "generic" internal format, in the form of struct tipc_media_addr,
         which can be referenced and passed around by the generic media-
         unaware code.
      3) A serialized version of the latter, to be conveyed in neighbor
         discovery messages.
      Conversion between the three formats can only be done by the media
      specific code, so we have function pointers for this purpose in
      struct tipc_media. Here, the media adapters can install their own
      conversion functions at startup.
      We now introduce a new such function, 'raw2addr()', whose purpose
      is to convert from format 1 to format 2 above. We also try, as far
      as possible, to unify commenting, variable names and usage of these
      functions, with the purpose of making them more comprehensible.
      We can now also remove the function tipc_l2_media_addr_set(), whose
      job is done better by the new function.
      Finally, we expand the field for serialized addresses (format 3)
      in discovery messages from 20 to 32 bytes. This is permitted
      according to the spec, and reduces the risk of problems when we
      add new media in the future.
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Reviewed-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
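The three address formats and the per-media conversion hooks described in the commit above can be sketched like this. The struct layouts, the function signatures, the media identifier value and the serialized byte layout are all assumptions made for the illustration; only the hook name raw2addr and the 32-byte field size come from the commit message.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MEDIA_ADDR_SIZE 32 /* serialized field size named in the commit */
#define RAW_ETH_LEN     6  /* length of a raw Ethernet MAC address */
#define MEDIA_ID_ETH    1  /* assumed media identifier */

/* Format 2: generic internal address (simplified stand-in). */
struct media_addr {
	uint8_t value[MEDIA_ADDR_SIZE];
	uint8_t media_id;
};

/* Conversion hooks that a media adapter installs at startup. */
struct media {
	void (*raw2addr)(struct media_addr *addr, const uint8_t *raw); /* 1 -> 2 */
	void (*addr2msg)(uint8_t *msg, const struct media_addr *addr); /* 2 -> 3 */
	void (*msg2addr)(struct media_addr *addr, const uint8_t *msg); /* 3 -> 2 */
};

/* Build the generic form from a raw MAC address. */
static void eth_raw2addr(struct media_addr *addr, const uint8_t *raw)
{
	memset(addr, 0, sizeof(*addr));
	memcpy(addr->value, raw, RAW_ETH_LEN);
	addr->media_id = MEDIA_ID_ETH;
}

/* Assumed wire layout: byte 0 holds the media id, the MAC follows. */
static void eth_addr2msg(uint8_t *msg, const struct media_addr *addr)
{
	memset(msg, 0, MEDIA_ADDR_SIZE);
	msg[0] = addr->media_id;
	memcpy(msg + 1, addr->value, RAW_ETH_LEN);
}

/* Rebuild the generic form from the serialized field. */
static void eth_msg2addr(struct media_addr *addr, const uint8_t *msg)
{
	assert(msg[0] == MEDIA_ID_ETH);
	eth_raw2addr(addr, msg + 1);
}

static const struct media eth_media = {
	.raw2addr = eth_raw2addr,
	.addr2msg = eth_addr2msg,
	.msg2addr = eth_msg2addr,
};
```

Generic code would only ever call through the struct media pointers, so it never needs to know how a particular medium lays out its raw address.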
  11. 27 Apr, 2014 1 commit
  12. 23 Apr, 2014 2 commits
    • tipc: fix race in disc create/delete · a8b9b96e
      Ying Xue authored
      Commit a21a584d (tipc: fix neighbor detection problem after hw
      address change) introduces a race condition
      involving tipc_disc_delete() and tipc_disc_add/remove_dest that can
      cause TIPC to dereference the pointer to the bearer discovery request
      structure after it has been freed since a stray pointer is left in the
      bearer structure.
      In order to fix the issue, the process of resetting the discovery
      request handler is optimized: the discovery request handler and request
      buffer are just reset instead of being freed, allocated and initialized.
      As the request pointer is always valid and the request's lock is taken
      while the request handler is reset, the race doesn't happen any more.
      Reported-by: Erik Hugne <erik.hugne@ericsson.com>
      Signed-off-by: Ying Xue <ying.xue@windriver.com>
      Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
      Tested-by: Erik Hugne <erik.hugne@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tipc: decouple the relationship between bearer and link · 7a2f7d18
      Ying Xue authored
      Currently on both paths of message transmission and reception, the
      read lock of tipc_net_lock must be held before bearer is accessed,
      while the write lock of tipc_net_lock has to be taken before bearer
      is configured. Although this ensures that the bearer is always valid on
      the two data paths, it binds link and bearer closely together.
      So, as part of the effort to remove tipc_net_lock, the locking
      policy for bearer protection is adjusted as follows: on the two
      data paths, RCU is used, and on the configuration path of bearer,
      RTNL lock is applied.
      Now RCU just covers the path of message reception. To make it possible
      to protect the path of message transmission with RCU, link should not
      use its stored bearer pointer to access bearer, but it should use the
      bearer identity of its attached bearer as index to get bearer instance
      from bearer_list array, which can help us decouple the relationship
      between bearer and link. As a result, bearer on the path of message
      transmission can be safely protected by RCU when we access bearer_list
      array within RCU lock protection.
      Signed-off-by: Ying Xue <ying.xue@windriver.com>
      Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
      Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
      Tested-by: Erik Hugne <erik.hugne@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  13. 28 Mar, 2014 1 commit
  14. 18 Feb, 2014 1 commit
    • tipc: align tipc function names with common naming practice in the network · 247f0f3c
      Ying Xue authored
      Rename the following functions, which are shorter and more in line
      with common naming practice in the network subsystem.
      Above changes have no impact on current users of the functions.
      Signed-off-by: Ying Xue <ying.xue@windriver.com>
      Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  15. 07 Jan, 2014 1 commit
    • tipc: introduce new spinlock to protect struct link_req · f9a2c80b
      Ying Xue authored
      Currently, only 'bearer_lock' is used to protect struct link_req in
      the function disc_timeout(). This is unsafe, since the member fields
      'num_nodes' and 'timer_intv' might be accessed by the three different
      threads below simultaneously, none of them grabbing bearer_lock in the
      critical region:
        read req->num_nodes
        write req->timer_intv
        read req->num_nodes
        read/write req->timer_intv
      Without lock protection, the only symptom of a race is that discovery
      messages occasionally may not be sent out. This is not fatal, since such
      messages are best-effort anyway. On the other hand, since discovery
      messages are not time critical, adding a protecting lock brings no
      serious overhead either. So we add a new, dedicated spinlock in
      order to guarantee absolute data consistency in link_req objects.
      This also helps reduce the overall role of the bearer_lock, which
      we want to remove completely in a later commit series.
      Signed-off-by: Ying Xue <ying.xue@windriver.com>
      Reviewed-by: Paul Gortmaker <paul.gortmaker@windriver.com>
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
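The dedicated lock described in the commit above can be sketched in user space with a small spin lock embedded in the request object. Everything here is illustrative: the struct is a simplified stand-in for struct link_req, the interval values and the update policy in req_update() are assumptions, and the kernel would use its own spinlock primitives.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define TIMER_INTV_FAST 1000u  /* assumed intervals in ms, for illustration */
#define TIMER_INTV_SLOW 60000u

/* Hypothetical, simplified request object with its own dedicated lock. */
struct link_req {
	atomic_flag lock;
	int num_nodes;
	unsigned int timer_intv;
};

static inline void req_lock(struct link_req *req)
{
	while (atomic_flag_test_and_set_explicit(&req->lock,
						 memory_order_acquire))
		; /* spin until the flag is released */
}

static inline void req_unlock(struct link_req *req)
{
	atomic_flag_clear_explicit(&req->lock, memory_order_release);
}

/* Recompute the interval from num_nodes; the caller holds the lock. */
static inline void req_update(struct link_req *req)
{
	req->timer_intv = req->num_nodes ? TIMER_INTV_SLOW : TIMER_INTV_FAST;
}

/* A destination was added: count it under the lock. */
static inline void req_add_dest(struct link_req *req)
{
	req_lock(req);
	req->num_nodes++;
	req_unlock(req);
}

/* A destination was removed: count and update under the same lock. */
static inline void req_remove_dest(struct link_req *req)
{
	req_lock(req);
	assert(req->num_nodes > 0);
	req->num_nodes--;
	req_update(req);
	req_unlock(req);
}

/* Timer path: read num_nodes and read/write timer_intv atomically. */
static inline unsigned int req_timeout(struct link_req *req)
{
	unsigned int intv;

	req_lock(req);
	req_update(req);
	intv = req->timer_intv;
	req_unlock(req);
	return intv;
}
```

Because every path that touches num_nodes or timer_intv takes the same per-object lock, the two fields are always observed in a consistent state, independent of any broader bearer-level lock.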
  16. 10 Dec, 2013 1 commit
    • tipc: remove interface state mirroring in bearer · 512137ee
      Erik Hugne authored
      struct 'tipc_bearer' is a generic representation of the underlying
      media type, and exists in a one-to-one relationship to each interface
      TIPC is using. The struct contains a 'blocked' flag that mirrors the
      operational and execution state of the represented interface, and is
      updated through notification calls from the latter. The users of
      tipc_bearer are checking this flag before each attempt to send a
      packet via the interface.
      This state mirroring serves no purpose in the current code base. TIPC
      links will not discover a media failure any faster through this
      mechanism, and in reality the flag only adds overhead at packet
      sending and reception.
      Furthermore, the fact that the flag needs to be protected by a spinlock
      aggregated into tipc_bearer has turned out to cause a serious and
      completely unnecessary deadlock problem.
      CPU0                                    CPU1
      ----                                    ----
      Time 0: bearer_disable()                link_timeout()
      Time 1:   spin_lock_bh(&b_ptr->lock)      tipc_link_push_queue()
      Time 2:   tipc_link_delete()                tipc_bearer_blocked(b_ptr)
      Time 3:     k_cancel_timer(&req->timer)       spin_lock_bh(&b_ptr->lock)
      Time 4:       del_timer_sync(&req->timer)
      I.e., del_timer_sync() on CPU0 never returns, because the timer handler
      on CPU1 is waiting for the bearer lock.
      We eliminate the 'blocked' flag from struct tipc_bearer, along with all
      tests on this flag. This not only resolves the deadlock, but also
      simplifies and speeds up the data path execution of TIPC. It also fits
      well into our ongoing effort to make the locking policy simpler and
      more manageable.
      An effect of this change is that we can get rid of functions such as
      tipc_bearer_blocked(), tipc_continue() and tipc_block_bearer().
      We replace the latter with a new function, tipc_reset_bearer(), which
      resets all links associated to the bearer immediately after an
      interface goes down.
      A user might notice one slight change in link behaviour after this
      change. When an interface goes down (e.g. through a NETDEV_DOWN
      event), all attached links will be reset immediately, instead of
      leaving it to each link to detect the failure through a timer-driven
      mechanism. We consider this an improvement, and see no obvious risks
      with the new behavior.
      Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
      Reviewed-by: Ying Xue <ying.xue@windriver.com>
      Reviewed-by: Paul Gortmaker <Paul.Gortmaker@windriver.com>
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  17. 17 Jun, 2013 1 commit
  18. 17 Apr, 2013 1 commit
  19. 22 Nov, 2012 1 commit
    • tipc: remove the bearer congestion mechanism · 3c294cb3
      Ying Xue authored
      Currently at the TIPC bearer layer there is the following congestion
      mechanism:
      Once sending packets has failed via that bearer, the bearer will be
      flagged as being in congested state at once. During bearer congestion,
      all packets arriving at link will be queued on the link's outgoing
      buffer.  When we detect that the state of bearer congestion has
      relaxed (e.g. some packets are received from the bearer) we will try
      our best to push all packets in the link's outgoing buffer until the
      buffer is empty, or until the bearer is congested again.
      However, in fact the TIPC bearer never receives any feedback from the
      device layer whether a send was successful or not, so it must always
      assume it was successful. Therefore, the bearer congestion mechanism
      as it exists currently is of no value.
      But the bearer blocking state is still useful for us. For example,
      when the physical media goes down/up, we need to change the state of
      the links bound to the bearer.  So the code maintaining the state
      information is not removed.
      Signed-off-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
  20. 13 Jul, 2012 2 commits
    • tipc: phase out most of the struct print_buf usage · dc1aed37
      Erik Hugne authored
      The tipc_printf is renamed to tipc_snprintf, as the new name
      describes more what the function actually does.  It is also
      changed to take a buffer and length parameter and return
      number of characters written to the buffer.  All callers of
      this function that used to pass a print_buf are updated.
      Final removal of the struct print_buf itself will be done
      synchronously with the pending removal of the deprecated
      logging code that also was using it.
      Functions that build up a response message with a list of
      ports, nametable contents etc. are changed to return the number
      of characters written to the output buffer. This information
      was previously hidden in a field of the print_buf struct, and
      the number of chars written was fetched with a call to
      tipc_printbuf_validate.  This function is removed since it
      is no longer referenced nor needed.
      A generic max size ULTRA_STRING_MAX_LEN is defined, named
      in keeping with the existing TIPC_TLV_ULTRA_STRING, and the
      various definitions in port, link and nametable code that
      largely duplicated this information are removed.  This means
      that the amount of link statistics that can be returned is now
      increased from 2k to 32k.
      The buffer overflow check is now done just before the reply
      message is passed over netlink or TIPC to a remote node and
      the message indicating a truncated buffer is changed to a less
      dramatic one (less CAPS), placed at the end of the message.
      Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
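The calling convention described in the commit above, where a formatting function takes a buffer and a length and returns the number of characters written, can be sketched in user space. The function name buf_printf() is hypothetical, and clamping the return value to what actually fits is an assumption of this sketch rather than something the commit message states.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/*
 * Format into buf (capacity len) and return the number of characters
 * actually stored, excluding the terminating NUL. Unlike plain
 * snprintf(), the return value never exceeds len - 1, so it can be
 * added to a running offset without overshooting the buffer.
 */
static int buf_printf(char *buf, int len, const char *fmt, ...)
{
	va_list args;
	int n;

	assert(buf != NULL && len > 0);
	va_start(args, fmt);
	n = vsnprintf(buf, (size_t)len, fmt, args);
	va_end(args);
	if (n < 0)
		return 0; /* encoding error: report nothing written */
	return n < len ? n : len - 1;
}
```

A function that builds a reply can chain calls by advancing an offset with each return value, for example `n += buf_printf(buf + n, size - n, ...)`, and hand the final count back to its own caller instead of hiding it in a separate buffer struct.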
    • tipc: use standard printk shortcut macros (pr_err etc.) · 2cf8aa19
      Erik Hugne authored
      All messages should go directly to the kernel log.  The TIPC
      specific error, warning, info and debug trace macros are
      removed and all references replaced with pr_err, pr_warn,
      pr_info and pr_debug.
      Commonly used sub-strings are explicitly declared as a const
      char to reduce .text size.
      Note that this means the debug messages (changed to pr_debug),
      are now enabled through dynamic debugging, instead of a TIPC
      specific Kconfig option (TIPC_DEBUG).  The latter will be
      phased out completely.
      Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      [PG: use pr_fmt as suggested by Joe Perches <joe@perches.com>]
      Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
  21. 30 Apr, 2012 1 commit
    • tipc: compress out gratuitous extra carriage returns · 617d3c7a
      Paul Gortmaker authored
      Some of the comment blocks are floating in limbo between two
      functions, or between blocks of code.  Delete the extra line
      feeds between any comment and its associated following block
      of code, to be consistent with the majority of the rest of
      the kernel.  Also delete trailing newlines at EOF and fix
      a couple trivial typos in existing comments.
      This is a 100% cosmetic change with no runtime impact.  We get
      rid of over 500 lines of non-code, and being blank line deletes,
      they won't even show up as noise in git blame.
      Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
  22. 24 Feb, 2012 3 commits
    • tipc: Eliminate trivial buffer manipulation helper routines · 5f6d9123
      Allan Stephens authored
      Gets rid of two inlined routines that simply call existing sk_buff
      manipulation routines, since there is no longer any extra processing
      done by the helper routines.
      Note that these changes are essentially cosmetic in nature, and have
      no impact on the actual operation of TIPC.
      Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
      Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
    • tipc: Detect duplicate nodes using different network interfaces · 97878a40
      Allan Stephens authored
      Utilizes the new "node signature" field in neighbor discovery messages
      to ensure that all links TIPC associates with a given <Z.C.N> network
      address belong to the same neighboring node. (Previously, TIPC could not
      tell if link setup requests arriving on different interfaces were from
      the same node or from two different nodes that have mistakenly been assigned
      the same network address.)
      The revised algorithm for detecting a duplicate node considers both the
      node signature and the network interface address specified in a request
      message when deciding how to respond to a link setup request. This prevents
      false alarms that might otherwise arise during normal network operation
      under the following scenarios:
      a) A neighboring node reboots. (The node's signature changes, but the
      network interface address remains unchanged.)
      b) A neighboring node's network interface is replaced. (The node's signature
      remains unchanged, but the network interface address changes.)
      c) A neighboring node is completely replaced. (The node's signature and
      network interface address both change.)
      The algorithm also handles cases in which a node reboots and re-establishes
      its links to TIPC (or begins re-establishing those links) before TIPC
      detects that it is using a new node signature. In such cases of "delayed
      rediscovery" TIPC simply accepts the new signature without disrupting
      communication that is already underway over the links.
      Thanks to Laser [gotolaser@gmail.com] for his contributions to the
      development of this enhancement.
      Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
      Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
    • tipc: Introduce node signature field in neighbor discovery message · fc0eea69
      Allan Stephens authored
      Adds support for the new "node signature" in neighbor discovery messages,
      which is a 16 bit identifier chosen randomly when TIPC is initialized.
      This field makes it possible for nodes receiving a neighbor discovery
      message to detect if multiple neighboring nodes are using the same network
      address (i.e. <Z.C.N>), even when the messages are arriving on different
      interfaces.
      This first phase of node signature support creates the signature,
      incorporates it into outgoing neighbor discovery messages, and tracks
      the signature used by valid neighbors. An upcoming patch builds on this
      foundation to implement the improved duplicate neighbor detection checking.
      Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
      Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
  23. 30 Dec, 2011 1 commit
  24. 27 Dec, 2011 2 commits
  25. 18 Sep, 2011 1 commit
    • tipc: Ensure both nodes recognize loss of contact between them · b4b56102
      Allan Stephens authored
      Enhances TIPC to ensure that a node that loses contact with a
      neighboring node does not allow contact to be re-established until
      it sees that its peer has also recognized the loss of contact.
      Previously, nodes that were connected by two or more links could
      encounter a situation in which node A would lose contact with node B
      on all of its links, purge its name table of names published by B,
      and then fail to repopulate those names once contact with B was restored.
      This would happen because B was able to re-establish one or more links
      so quickly that it never reached a point where it had no links to A --
      meaning that B never saw a loss of contact with A, and consequently
      didn't re-publish its names to A.
      This problem is now prevented by enhancing the cleanup done by TIPC
      following a loss of contact with a neighboring node to ensure that
      node A ignores all messages sent by B until it receives a LINK_PROTOCOL
      message that indicates B has lost contact with A, thereby preventing
      the (re)establishment of links between the nodes. The loss of contact
      is recognized when a RESET or ACTIVATE message is received that has
      a "redundant link exists" field of 0, indicating that B's sending link
      endpoint is in a reset state and that B has no other working links.
      Additionally, TIPC now suppresses the sending of (most) link protocol
      messages to a neighboring node while it is cleaning up after an earlier
      loss of contact with that node. This stops the peer node from prematurely
      activating its link endpoint, which would prevent TIPC from later
      activating its own end. TIPC still allows outgoing RESET messages to
      occur during cleanup, to avoid problems if its own node recognizes
      the loss of contact first and tries to notify the peer of the situation.
      Finally, TIPC now recognizes an impending loss of contact with a peer node
      as soon as it receives a RESET message on a working link that is the
      peer's only link to the node, and ensures that the link protocol
      suppression mentioned above goes into effect right away -- that is,
      even before its own link endpoints have failed. This is necessary to
      ensure correct operation when there are redundant links between the nodes,
      since otherwise TIPC would send an ACTIVATE message upon receiving a RESET
      on its first link and only begin suppressing when a RESET on its second
      link was received, instead of initiating suppression with the first RESET
      message as it needs to.
      Note: The reworked cleanup code also eliminates a check that prevented
      a link endpoint's discovery object from responding to incoming messages
      while stale name table entries are being purged. This check is now
      unnecessary and would have slowed down re-establishment of communication
      between the nodes in some situations.
      Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
      Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
  26. 10 May, 2011 3 commits