Skip to content
  • Mahesh Bandewar's avatar
    bonding: fix wq initialization for links created via netlink · ea8ffc08
    Mahesh Bandewar authored
    Earlier patch 4493b81b ("bonding: initialize work-queues during
    creation of bond") moved the work-queue initialization from bond_open()
    to bond_create(). However this caused the link those are created using
    netlink 'create bond option' (ip link add bondX type bond); create the
    new trunk without initializing work-queues. Prior to the above mentioned
    change, ndo_open was in both paths and things worked correctly. The
    consequence is visible in the report shared by Joe Stringer -
    
    I've noticed that this patch breaks bonding within namespaces if
    you're not careful to perform device cleanup correctly.
    
    Here's my repro script, you can run on any net-next with this patch
    and you'll start seeing some weird behaviour:
    
    ip netns add foo
    ip li add veth0 type veth peer name veth0+ netns foo
    ip li add veth1 type veth peer name veth1+ netns foo
    ip netns exec foo ip li add bond0 type bond
    ip netns exec foo ip li set dev veth0+ master bond0
    ip netns exec foo ip li set dev veth1+ master bond0
    ip netns exec foo ip addr add dev bond0 192.168.0.1/24
    ip netns exec foo ip li set dev bond0 up
    ip li del dev veth0
    ip li del dev veth1
    
    The second to last command segfaults, last command hangs. rtnl is now
    permanently locked. It's not a problem if you take bond0 down before
    deleting veths, or delete bond0 before deleting veths. If you delete
    either end of the veth pair as per above, either inside or outside the
    namespace, it hits this problem.
    
    Here's some kernel logs:
    [ 1221.801610] bond0: Enslaving veth0+ as an active interface with an up link
    [ 1224.449581] bond0: Enslaving veth1+ as an active interface with an up link
    [ 1281.193863] bond0: Releasing backup interface veth0+
    [ 1281.193866] bond0: the permanent HWaddr of veth0+ -
    16:bf:fb:e0:b8:43 - is still in use by bond0 - set the HWaddr of
    veth0+ to a different address to avoid conflicts
    [ 1281.193867] ------------[ cut here ]------------
    [ 1281.193873] WARNING: CPU: 0 PID: 2024 at kernel/workqueue.c:1511
    __queue_delayed_work+0x13f/0x150
    [ 1281.193873] Modules linked in: bonding veth openvswitch nf_nat_ipv6
    nf_nat_ipv4 nf_nat autofs4 nfsd auth_rpcgss nfs_acl binfmt_misc nfs
    lockd grace sunrpc fscache ppdev vmw_balloon coretemp psmouse
    serio_raw vmwgfx ttm drm_kms_helper vmw_vmci netconsole parport_pc
    configfs drm i2c_piix4 fb_sys_fops syscopyarea sysfillrect sysimgblt
    shpchp mac_hid nf_conntrack_ipv6 nf_defrag_ipv6 nf_conntrack_ipv4
    nf_defrag_ipv4 nf_conntrack libcrc32c lp parport hid_generic usbhid
    hid mptspi mptscsih e1000 mptbase ahci libahci
    [ 1281.193905] CPU: 0 PID: 2024 Comm: ip Tainted: G        W
    4.10.0-bisect-bond-v0.14 #37
    [ 1281.193906] Hardware name: VMware, Inc. VMware Virtual
    Platform/440BX Desktop Reference Platform, BIOS 6.00 09/30/2014
    [ 1281.193906] Call Trace:
    [ 1281.193912]  dump_stack+0x63/0x89
    [ 1281.193915]  __warn+0xd1/0xf0
    [ 1281.193917]  warn_slowpath_null+0x1d/0x20
    [ 1281.193918]  __queue_delayed_work+0x13f/0x150
    [ 1281.193920]  queue_delayed_work_on+0x27/0x40
    [ 1281.193929]  bond_change_active_slave+0x25b/0x670 [bonding]
    [ 1281.193932]  ? synchronize_rcu_expedited+0x27/0x30
    [ 1281.193935]  __bond_release_one+0x489/0x510 [bonding]
    [ 1281.193939]  ? addrconf_notify+0x1b7/0xab0
    [ 1281.193942]  bond_netdev_event+0x2c5/0x2e0 [bonding]
    [ 1281.193944]  ? netconsole_netdev_event+0x124/0x190 [netconsole]
    [ 1281.193947]  notifier_call_chain+0x49/0x70
    [ 1281.193948]  raw_notifier_call_chain+0x16/0x20
    [ 1281.193950]  call_netdevice_notifiers_info+0x35/0x60
    [ 1281.193951]  rollback_registered_many+0x23b/0x3e0
    [ 1281.193953]  unregister_netdevice_many+0x24/0xd0
    [ 1281.193955]  rtnl_delete_link+0x3c/0x50
    [ 1281.193956]  rtnl_dellink+0x8d/0x1b0
    [ 1281.193960]  rtnetlink_rcv_msg+0x95/0x220
    [ 1281.193962]  ? __kmalloc_node_track_caller+0x35/0x280
    [ 1281.193964]  ? __netlink_lookup+0xf1/0x110
    [ 1281.193966]  ? rtnl_newlink+0x830/0x830
    [ 1281.193967]  netlink_rcv_skb+0xa7/0xc0
    [ 1281.193969]  rtnetlink_rcv+0x28/0x30
    [ 1281.193970]  netlink_unicast+0x15b/0x210
    [ 1281.193971]  netlink_sendmsg+0x319/0x390
    [ 1281.193974]  sock_sendmsg+0x38/0x50
    [ 1281.193975]  ___sys_sendmsg+0x25c/0x270
    [ 1281.193978]  ? mem_cgroup_commit_charge+0x76/0xf0
    [ 1281.193981]  ? page_add_new_anon_rmap+0x89/0xc0
    [ 1281.193984]  ? lru_cache_add_active_or_unevictable+0x35/0xb0
    [ 1281.193985]  ? __handle_mm_fault+0x4e9/0x1170
    [ 1281.193987]  __sys_sendmsg+0x45/0x80
    [ 1281.193989]  SyS_sendmsg+0x12/0x20
    [ 1281.193991]  do_syscall_64+0x6e/0x180
    [ 1281.193993]  entry_SYSCALL64_slow_path+0x25/0x25
    [ 1281.193995] RIP: 0033:0x7f6ec122f5a0
    [ 1281.193995] RSP: 002b:00007ffe69e89c48 EFLAGS: 00000246 ORIG_RAX:
    000000000000002e
    [ 1281.193997] RAX: ffffffffffffffda RBX: 00007ffe69e8dd60 RCX: 00007f6ec122f5a0
    [ 1281.193997] RDX: 0000000000000000 RSI: 00007ffe69e89c90 RDI: 0000000000000003
    [ 1281.193998] RBP: 00007ffe69e89c90 R08: 0000000000000000 R09: 0000000000000003
    [ 1281.193999] R10: 00007ffe69e89a10 R11: 0000000000000246 R12: 0000000058f14b9f
    [ 1281.193999] R13: 0000000000000000 R14: 00000000006473a0 R15: 00007ffe69e8e450
    [ 1281.194001] ---[ end trace 713a77486cbfbfa3 ]---
    
    Fixes: 4493b81b
    
     ("bonding: initialize work-queues during creation of bond")
    Reported-by: default avatarJoe Stringer <joe@ovn.org>
    Tested-by: default avatarJoe Stringer <joe@ovn.org>
    Signed-off-by: default avatarMahesh Bandewar <maheshb@google.com>
    Acked-by: default avatarAndy Gospodarek <andy@greyhouse.net>
    Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
    ea8ffc08