    net: align sk_refcnt on 128 bytes boundary · 8e5eb54d
    Eric Dumazet authored
    sk->sk_refcnt is dirtied for every TCP/UDP incoming packet.
    This is a performance issue if multiple cpus hit a common socket,
    or multiple sockets are chained due to SO_REUSEPORT.
    By moving sk_refcnt 8 bytes further, the first 128 bytes of a socket
    become read-mostly. As they contain the lookup keys, this has
    a considerable performance impact, as cpus can keep them cached.
    These 8 bytes are not wasted: we use them as a placeholder
    for various fields, depending on the socket type.
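
The layout change described above can be pictured with a small C sketch. It is not the kernel's actual socket structure: the struct names, the field names, and the 120-byte key area are assumptions chosen to make the arithmetic visible. Only the idea comes from the commit message, namely an 8-byte placeholder that pushes the reference count to offset 128.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof, size_t */
#include <stdint.h>

/* Assumption: the boundary named in the commit title. */
#define DEMO_BOUNDARY 128

/* Hypothetical read-mostly lookup keys (NOT the real kernel layout). */
struct demo_lookup_keys {
    uint32_t daddr;
    uint32_t saddr;
    uint16_t dport;
    uint16_t sport;
    uint8_t  other_keys[108];   /* filler so the keys occupy 120 bytes */
};

struct demo_sock {
    struct demo_lookup_keys keys;   /* bytes 0..119: read on every lookup */
    union {                         /* bytes 120..127: the "not wasted" 8 bytes */
        uint64_t raw;
        uint32_t per_type[2];       /* hypothetical per-socket-type fields */
    } placeholder;
    uint32_t refcnt;                /* byte 128: written for every incoming packet */
};

/* Fail the build if the layout drifts from what the sketch claims. */
static_assert(sizeof(struct demo_lookup_keys) == 120, "keys must span 120 bytes");
static_assert(offsetof(struct demo_sock, refcnt) == DEMO_BOUNDARY,
              "refcnt must start at offset 128");

/* Offset of the frequently written counter inside the sketch struct. */
static inline size_t demo_refcnt_offset(void)
{
    return offsetof(struct demo_sock, refcnt);
}

/* Nonzero if two byte offsets fall within the same 128-byte block. */
static inline int demo_same_block(size_t a, size_t b)
{
    return a / DEMO_BOUNDARY == b / DEMO_BOUNDARY;
}
```

With the static assertions in place, a later change that grows or shrinks the key area fails at compile time instead of silently moving the counter back into the block that holds the lookup keys.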
     SYN flood hitting a NIC with 16 RX queues.
     TCP listener using 16 sockets with SO_REUSEPORT
     and SO_INCOMING_CPU for proper siloing.
     Could process 6.0 Mpps SYN instead of 4.2 Mpps.
     Kernel profile looked like:
        11.68%  [kernel]  [k] sha_transform
         6.51%  [kernel]  [k] __inet_lookup_listener
         5.07%  [kernel]  [k] __inet_lookup_established
         4.15%  [kernel]  [k] memcpy_erms
         3.46%  [kernel]  [k] ipt_do_table
         2.74%  [kernel]  [k] fib_table_lookup
         2.54%  [kernel]  [k] tcp_make_synack
         2.34%  [kernel]  [k] tcp_conn_request
         2.05%  [kernel]  [k] __netif_receive_skb_core
         2.03%  [kernel]  [k] kmem_cache_alloc
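
To put the two throughput figures above in per-queue terms, here is a back-of-the-envelope calculation in C. It assumes the SYN load is spread evenly over the 16 RX queues and that each queue is served by one CPU. The commit message states neither, and the function and macro names are illustrative only.

```c
#include <stdint.h>

/* Figures quoted in the commit message. */
#define DEMO_PPS_BEFORE 4200000ULL   /* 4.2 Mpps */
#define DEMO_PPS_AFTER  6000000ULL   /* 6.0 Mpps */
#define DEMO_RX_QUEUES  16ULL

/* Packets per second per queue, assuming an even spread (assumption). */
static inline uint64_t demo_pps_per_queue(uint64_t total_pps)
{
    return total_pps / DEMO_RX_QUEUES;
}

/* Average time budget per packet on one queue, in nanoseconds (truncated). */
static inline uint64_t demo_ns_per_packet(uint64_t total_pps)
{
    return 1000000000ULL / demo_pps_per_queue(total_pps);
}

/* Overall throughput gain in whole percent (truncated). */
static inline uint64_t demo_gain_percent(void)
{
    return (DEMO_PPS_AFTER - DEMO_PPS_BEFORE) * 100 / DEMO_PPS_BEFORE;
}
```

Under those assumptions each queue goes from 262,500 to 375,000 SYNs per second, which is roughly 3.8 µs down to 2.7 µs per SYN, a gain of about 43%.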
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>