Skip to content
  • Maor Gottlieb's avatar
    net/mlx5: Fix deadlock in fs_core · c1948390
    Maor Gottlieb authored
    free_match_list could be called when the flow table is already
    locked. We need to pass this notation to tree_put_node.
    
    It fixes the following lockdep warnning:
    
    [ 1797.268537] ============================================
    [ 1797.276837] WARNING: possible recursive locking detected
    [ 1797.285101] 5.5.0-rc5+ #10 Not tainted
    [ 1797.291641] --------------------------------------------
    [ 1797.299917] handler10/9296 is trying to acquire lock:
    [ 1797.307885] ffff889ad399a0a0 (&node->lock){++++}, at:
    tree_put_node+0x1d5/0x210 [mlx5_core]
    [ 1797.319694]
    [ 1797.319694] but task is already holding lock:
    [ 1797.330904] ffff889ad399a0a0 (&node->lock){++++}, at:
    nested_down_write_ref_node.part.33+0x1a/0x60 [mlx5_core]
    [ 1797.344707]
    [ 1797.344707] other info that might help us debug this:
    [ 1797.356952]  Possible unsafe locking scenario:
    [ 1797.356952]
    [ 1797.368333]        CPU0
    [ 1797.373357]        ----
    [ 1797.378364]   lock(&node->lock);
    [ 1797.384222]   lock(&node->lock);
    [ 1797.390031]
    [ 1797.390031]  *** DEADLOCK ***
    [ 1797.390031]
    [ 1797.403003]  May be due to missing lock nesting notation
    [ 1797.403003]
    [ 1797.414691] 3 locks held by handler10/9296:
    [ 1797.421465]  #0: ffff889cf2c5a110 (&block->cb_lock){++++}, at:
    tc_setup_cb_add+0x70/0x250
    [ 1797.432810]  #1: ffff88a030081490 (&comp->sem){++++}, at:
    mlx5_devcom_get_peer_data+0x4c/0xb0 [mlx5_core]
    [ 1797.445829]  #2: ffff889ad399a0a0 (&node->lock){++++}, at:
    nested_down_write_ref_node.part.33+0x1a/0x60 [mlx5_core]
    [ 1797.459913]
    [ 1797.459913] stack backtrace:
    [ 1797.469436] CPU: 1 PID: 9296 Comm: handler10 Kdump: loaded Not
    tainted 5.5.0-rc5+ #10
    [ 1797.480643] Hardware name: Dell Inc. PowerEdge R730/072T6D, BIOS
    2.4.3 01/17/2017
    [ 1797.491480] Call Trace:
    [ 1797.496701]  dump_stack+0x96/0xe0
    [ 1797.502864]  __lock_acquire.cold.63+0xf8/0x212
    [ 1797.510301]  ? lockdep_hardirqs_on+0x250/0x250
    [ 1797.517701]  ? mark_held_locks+0x55/0xa0
    [ 1797.524547]  ? quarantine_put+0xb7/0x160
    [ 1797.531422]  ? lockdep_hardirqs_on+0x17d/0x250
    [ 1797.538913]  lock_acquire+0xd6/0x1f0
    [ 1797.545529]  ? tree_put_node+0x1d5/0x210 [mlx5_core]
    [ 1797.553701]  down_write+0x94/0x140
    [ 1797.560206]  ? tree_put_node+0x1d5/0x210 [mlx5_core]
    [ 1797.568464]  ? down_write_killable_nested+0x170/0x170
    [ 1797.576925]  ? del_hw_flow_group+0xde/0x1f0 [mlx5_core]
    [ 1797.585629]  tree_put_node+0x1d5/0x210 [mlx5_core]
    [ 1797.593891]  ? free_match_list.part.25+0x147/0x170 [mlx5_core]
    [ 1797.603389]  free_match_list.part.25+0xe0/0x170 [mlx5_core]
    [ 1797.612654]  _mlx5_add_flow_rules+0x17e2/0x20b0 [mlx5_core]
    [ 1797.621838]  ? lock_acquire+0xd6/0x1f0
    [ 1797.629028]  ? esw_get_prio_table+0xb0/0x3e0 [mlx5_core]
    [ 1797.637981]  ? alloc_insert_flow_group+0x420/0x420 [mlx5_core]
    [ 1797.647459]  ? try_to_wake_up+0x4c7/0xc70
    [ 1797.654881]  ? lock_downgrade+0x350/0x350
    [ 1797.662271]  ? __mutex_unlock_slowpath+0xb1/0x3f0
    [ 1797.670396]  ? find_held_lock+0xac/0xd0
    [ 1797.677540]  ? mlx5_add_flow_rules+0xdc/0x360 [mlx5_core]
    [ 1797.686467]  mlx5_add_flow_rules+0xdc/0x360 [mlx5_core]
    [ 1797.695134]  ? _mlx5_add_flow_rules+0x20b0/0x20b0 [mlx5_core]
    [ 1797.704270]  ? irq_exit+0xa5/0x170
    [ 1797.710764]  ? retint_kernel+0x10/0x10
    [ 1797.717698]  ? mlx5_eswitch_set_rule_source_port.isra.9+0x122/0x230
    [mlx5_core]
    [ 1797.728708]  mlx5_eswitch_add_offloaded_rule+0x465/0x6d0 [mlx5_core]
    [ 1797.738713]  ? mlx5_eswitch_get_prio_range+0x30/0x30 [mlx5_core]
    [ 1797.748384]  ? mlx5_fc_stats_work+0x670/0x670 [mlx5_core]
    [ 1797.757400]  mlx5e_tc_offload_fdb_rules.isra.27+0x24/0x90 [mlx5_core]
    [ 1797.767665]  mlx5e_tc_add_fdb_flow+0xaf8/0xd40 [mlx5_core]
    [ 1797.776886]  ? mlx5e_encap_put+0xd0/0xd0 [mlx5_core]
    [ 1797.785562]  ? mlx5e_alloc_flow.isra.43+0x18c/0x1c0 [mlx5_core]
    [ 1797.795353]  __mlx5e_add_fdb_flow+0x2e2/0x440 [mlx5_core]
    [ 1797.804558]  ? mlx5e_tc_update_neigh_used_value+0x8c0/0x8c0
    [mlx5_core]
    [ 1797.815093]  ? wait_for_completion+0x260/0x260
    [ 1797.823272]  mlx5e_configure_flower+0xe94/0x1620 [mlx5_core]
    [ 1797.832792]  ? __mlx5e_add_fdb_flow+0x440/0x440 [mlx5_core]
    [ 1797.842096]  ? down_read+0x11a/0x2e0
    [ 1797.849090]  ? down_write+0x140/0x140
    [ 1797.856142]  ? mlx5e_rep_indr_setup_block_cb+0xc0/0xc0 [mlx5_core]
    [ 1797.866027]  tc_setup_cb_add+0x11a/0x250
    [ 1797.873339]  fl_hw_replace_filter+0x25e/0x320 [cls_flower]
    [ 1797.882385]  ? fl_hw_destroy_filter+0x1c0/0x1c0 [cls_flower]
    [ 1797.891607]  fl_change+0x1d54/0x1fb6 [cls_flower]
    [ 1797.899772]  ? __rhashtable_insert_fast.constprop.50+0x9f0/0x9f0
    [cls_flower]
    [ 1797.910728]  ? lock_downgrade+0x350/0x350
    [ 1797.918187]  ? __radix_tree_lookup+0xa5/0x130
    [ 1797.926046]  ? fl_set_key+0x1590/0x1590 [cls_flower]
    [ 1797.934611]  ? __rhashtable_insert_fast.constprop.50+0x9f0/0x9f0
    [cls_flower]
    [ 1797.945673]  tc_new_tfilter+0xcd1/0x1240
    [ 1797.953138]  ? tc_del_tfilter+0xb10/0xb10
    [ 1797.960688]  ? avc_has_perm_noaudit+0x92/0x320
    [ 1797.968721]  ? avc_has_perm_noaudit+0x1df/0x320
    [ 1797.976816]  ? avc_has_extended_perms+0x990/0x990
    [ 1797.985090]  ? mark_lock+0xaa/0x9e0
    [ 1797.991988]  ? match_held_lock+0x1b/0x240
    [ 1797.999457]  ? match_held_lock+0x1b/0x240
    [ 1798.006859]  ? find_held_lock+0xac/0xd0
    [ 1798.014045]  ? symbol_put_addr+0x40/0x40
    [ 1798.021317]  ? rcu_read_lock_sched_held+0xd0/0xd0
    [ 1798.029460]  ? tc_del_tfilter+0xb10/0xb10
    [ 1798.036810]  rtnetlink_rcv_msg+0x4d5/0x620
    [ 1798.044236]  ? rtnl_bridge_getlink+0x460/0x460
    [ 1798.052034]  ? lockdep_hardirqs_on+0x250/0x250
    [ 1798.059837]  ? match_held_lock+0x1b/0x240
    [ 1798.067146]  ? find_held_lock+0xac/0xd0
    [ 1798.074246]  netlink_rcv_skb+0xc6/0x1f0
    [ 1798.081339]  ? rtnl_bridge_getlink+0x460/0x460
    [ 1798.089104]  ? netlink_ack+0x440/0x440
    [ 1798.096061]  netlink_unicast+0x2d4/0x3b0
    [ 1798.103189]  ? netlink_attachskb+0x3f0/0x3f0
    [ 1798.110724]  ? _copy_from_iter_full+0xda/0x370
    [ 1798.118415]  netlink_sendmsg+0x3ba/0x6a0
    [ 1798.125478]  ? netlink_unicast+0x3b0/0x3b0
    [ 1798.132705]  ? netlink_unicast+0x3b0/0x3b0
    [ 1798.139880]  sock_sendmsg+0x94/0xa0
    [ 1798.146332]  ____sys_sendmsg+0x36c/0x3f0
    [ 1798.153251]  ? copy_msghdr_from_user+0x165/0x230
    [ 1798.160941]  ? kernel_sendmsg+0x30/0x30
    [ 1798.167738]  ___sys_sendmsg+0xeb/0x150
    [ 1798.174411]  ? sendmsg_copy_msghdr+0x30/0x30
    [ 1798.181649]  ? lock_downgrade+0x350/0x350
    [ 1798.188559]  ? rcu_read_lock_sched_held+0xd0/0xd0
    [ 1798.196239]  ? __fget+0x21d/0x320
    [ 1798.202335]  ? do_dup2+0x2a0/0x2a0
    [ 1798.208499]  ? lock_downgrade+0x350/0x350
    [ 1798.215366]  ? __fget_light+0xd6/0xf0
    [ 1798.221808]  ? syscall_trace_enter+0x369/0x5d0
    [ 1798.229112]  __sys_sendmsg+0xd3/0x160
    [ 1798.235511]  ? __sys_sendmsg_sock+0x60/0x60
    [ 1798.242478]  ? syscall_trace_enter+0x233/0x5d0
    [ 1798.249721]  ? syscall_slow_exit_work+0x280/0x280
    [ 1798.257211]  ? do_syscall_64+0x1e/0x2e0
    [ 1798.263680]  do_syscall_64+0x72/0x2e0
    [ 1798.269950]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
    
    Fixes: bd71b08e
    
     ("net/mlx5: Support multiple updates of steering rules in parallel")
    Signed-off-by: default avatarMaor Gottlieb <maorg@mellanox.com>
    Signed-off-by: default avatarAlaa Hleihel <alaa@mellanox.com>
    Reviewed-by: default avatarMark Bloch <markb@mellanox.com>
    Signed-off-by: default avatarSaeed Mahameed <saeedm@mellanox.com>
    c1948390