    tipc: add neighbor monitoring framework · 35c55c98
    Jon Paul Maloy authored
    TIPC based clusters are by default set up with full-mesh link
    connectivity between all nodes. Those links are expected to provide
    a short failure detection time, by default set to 1500 ms. Because
    of this, the background load for neighbor monitoring in an N-node
    cluster grows linearly with N on each node, while the overall
    monitoring traffic through the network infrastructure increases at
    a ~(N * (N - 1)) rate. Experience has shown that such clusters don't
    scale well beyond ~100 nodes unless we significantly increase the
    failure detection tolerance.
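
To make the full-mesh figures above concrete, here is a minimal sketch of the arithmetic. The function name `full_mesh_load` is hypothetical and only for illustration; it counts monitored links, not actual probe packets or bytes.

```python
def full_mesh_load(n: int) -> tuple[int, int]:
    """Return (links monitored per node, directed links monitored cluster-wide)
    for an n-node cluster with full-mesh connectivity.

    Illustrative only: this counts links, not real TIPC probe traffic.
    """
    per_node = n - 1           # every node monitors all of its peers
    cluster_wide = n * per_node  # the ~(N * (N - 1)) growth described above
    return per_node, cluster_wide


for n in (10, 100, 400):
    per_node, cluster_wide = full_mesh_load(n)
    print(f"N={n}: {per_node} links per node, {cluster_wide} cluster-wide")
```

The cluster-wide count grows quadratically, which fits the observation that full-mesh monitoring becomes a burden at around 100 nodes.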
    
    This commit introduces a framework and an algorithm that drastically
    reduce this background load, while largely maintaining the original
    failure detection times across the whole cluster. Using this algorithm,
    background load will now grow at a rate of ~(2 * sqrt(N)) per node, and
    at ~(2 * N * sqrt(N)) in traffic overhead. As an example, each node will
    now have to actively monitor 38 neighbors in a 400-node cluster, i...
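
As a rough illustration of the scaling claim, the sketch below compares the per-node monitoring load of a full mesh with an estimate for the new scheme. The expression `2 * (sqrt(N) - 1)` is an assumption, chosen because it reproduces the 38-neighbor figure quoted for a 400-node cluster and agrees with the stated ~(2 * sqrt(N)) growth; it is not taken from the implementation. Both function names are hypothetical.

```python
import math


def full_mesh_neighbors(n: int) -> int:
    """Neighbors each node actively monitors in a full mesh."""
    return n - 1


def estimated_monitored_neighbors(n: int) -> int:
    """Estimated neighbors each node actively monitors under the new scheme.

    Assumption: 2 * (sqrt(n) - 1), which matches the 38-neighbor example
    for n = 400. The real algorithm may round or count differently.
    """
    return round(2 * (math.sqrt(n) - 1))


for n in (100, 400, 1000):
    print(
        f"N={n}: full mesh {full_mesh_neighbors(n)}, "
        f"estimated {estimated_monitored_neighbors(n)}"
    )
```

For N = 400 this gives 38 actively monitored neighbors per node, compared with 399 in a full mesh.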