1. 01 Jan, 2019 1 commit
    • Willem de Bruijn's avatar
      ip: validate header length on virtual device xmit · cb9f1b78
      Willem de Bruijn authored
      KMSAN detected read beyond end of buffer in vti and sit devices when
      passing truncated packets with PF_PACKET. The issue affects additional
      ip tunnel devices.
      
      Extend commit 76c0ddd8 ("ip6_tunnel: be careful when accessing the
      inner header") and commit ccfec9e5
      
       ("ip_tunnel: be careful when
      accessing the inner header").
      
      Move the check to a separate helper and call at the start of each
      ndo_start_xmit function in net/ipv4 and net/ipv6.
      
      Minor changes:
      - convert dev_kfree_skb to kfree_skb on error path,
        as dev_kfree_skb calls consume_skb which is not for error paths.
      - use pskb_network_may_pull even though that is pedantic here,
        as the same as pskb_may_pull for devices without llheaders.
      - do not cache ipv6 hdrs if used only once
        (unsafe across pskb_may_pull, was more relevant to earlier patch)
      Reported-by: default avatarsyzbot <syzkaller@googlegroups.com>
      Signed-off-by: default avatarWillem de Bruijn <willemb@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      cb9f1b78
  2. 21 Dec, 2018 1 commit
    • Eric Dumazet's avatar
      ipv6: tunnels: fix two use-after-free · cbb49697
      Eric Dumazet authored
      xfrm6_policy_check() might have re-allocated skb->head, we need
      to reload ipv6 header pointer.
      
      sysbot reported :
      
      BUG: KASAN: use-after-free in __ipv6_addr_type+0x302/0x32f net/ipv6/addrconf_core.c:40
      Read of size 4 at addr ffff888191b8cb70 by task syz-executor2/1304
      
      CPU: 0 PID: 1304 Comm: syz-executor2 Not tainted 4.20.0-rc7+ #356
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       <IRQ>
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0x244/0x39d lib/dump_stack.c:113
       print_address_description.cold.7+0x9/0x1ff mm/kasan/report.c:256
       kasan_report_error mm/kasan/report.c:354 [inline]
       kasan_report.cold.8+0x242/0x309 mm/kasan/report.c:412
       __asan_report_load4_noabort+0x14/0x20 mm/kasan/report.c:432
       __ipv6_addr_type+0x302/0x32f net/ipv6/addrconf_core.c:40
       ipv6_addr_type include/net/ipv6.h:403 [inline]
       ip6_tnl_get_cap+0x27/0x190 net/ipv6/ip6_tunnel.c:727
       ip6_tnl_rcv_ctl+0xdb/0x2a0 net/ipv6/ip6_tunnel.c:757
       vti6_rcv+0x336/0x8f3 net/ipv6/ip6_vti.c:321
       xfrm6_ipcomp_rcv+0x1a5/0x3a0 net/ipv6/xfrm6_protocol.c:132
       ip6_protocol_deliver_rcu+0x372/0x1940 net/ipv6/ip6_input.c:394
       ip6_input_finish+0x84/0x170 net/ipv6/ip6_input.c:434
       NF_HOOK include/linux/netfilter.h:289 [inline]
       ip6_input+0xe9/0x600 net/ipv6/ip6_input.c:443
      IPVS: ftp: loaded support on port[0] = 21
       ip6_mc_input+0x514/0x11c0 net/ipv6/ip6_input.c:537
       dst_input include/net/dst.h:450 [inline]
       ip6_rcv_finish+0x17a/0x330 net/ipv6/ip6_input.c:76
       NF_HOOK include/linux/netfilter.h:289 [inline]
       ipv6_rcv+0x115/0x640 net/ipv6/ip6_input.c:272
       __netif_receive_skb_one_core+0x14d/0x200 net/core/dev.c:4973
       __netif_receive_skb+0x2c/0x1e0 net/core/dev.c:5083
       process_backlog+0x24e/0x7a0 net/core/dev.c:5923
       napi_poll net/core/dev.c:6346 [inline]
       net_rx_action+0x7fa/0x19b0 net/core/dev.c:6412
       __do_softirq+0x308/0xb7e kernel/softirq.c:292
       do_softirq_own_stack+0x2a/0x40 arch/x86/entry/entry_64.S:1027
       </IRQ>
       do_softirq.part.14+0x126/0x160 kernel/softirq.c:337
       do_softirq+0x19/0x20 kernel/softirq.c:340
       netif_rx_ni+0x521/0x860 net/core/dev.c:4569
       dev_loopback_xmit+0x287/0x8c0 net/core/dev.c:3576
       NF_HOOK include/linux/netfilter.h:289 [inline]
       ip6_finish_output2+0x193a/0x2930 net/ipv6/ip6_output.c:84
       ip6_fragment+0x2b06/0x3850 net/ipv6/ip6_output.c:727
       ip6_finish_output+0x6b7/0xc50 net/ipv6/ip6_output.c:152
       NF_HOOK_COND include/linux/netfilter.h:278 [inline]
       ip6_output+0x232/0x9d0 net/ipv6/ip6_output.c:171
       dst_output include/net/dst.h:444 [inline]
       ip6_local_out+0xc5/0x1b0 net/ipv6/output_core.c:176
       ip6_send_skb+0xbc/0x340 net/ipv6/ip6_output.c:1727
       ip6_push_pending_frames+0xc5/0xf0 net/ipv6/ip6_output.c:1747
       rawv6_push_pending_frames net/ipv6/raw.c:615 [inline]
       rawv6_sendmsg+0x3a3e/0x4b40 net/ipv6/raw.c:945
      kobject: 'queues' (0000000089e6eea2): kobject_add_internal: parent: 'tunl0', set: '<NULL>'
      kobject: 'queues' (0000000089e6eea2): kobject_uevent_env
       inet_sendmsg+0x1a1/0x690 net/ipv4/af_inet.c:798
      kobject: 'queues' (0000000089e6eea2): kobject_uevent_env: filter function caused the event to drop!
       sock_sendmsg_nosec net/socket.c:621 [inline]
       sock_sendmsg+0xd5/0x120 net/socket.c:631
       sock_write_iter+0x35e/0x5c0 net/socket.c:900
       call_write_iter include/linux/fs.h:1857 [inline]
       new_sync_write fs/read_write.c:474 [inline]
       __vfs_write+0x6b8/0x9f0 fs/read_write.c:487
      kobject: 'rx-0' (00000000e2d902d9): kobject_add_internal: parent: 'queues', set: 'queues'
      kobject: 'rx-0' (00000000e2d902d9): kobject_uevent_env
       vfs_write+0x1fc/0x560 fs/read_write.c:549
       ksys_write+0x101/0x260 fs/read_write.c:598
      kobject: 'rx-0' (00000000e2d902d9): fill_kobj_path: path = '/devices/virtual/net/tunl0/queues/rx-0'
       __do_sys_write fs/read_write.c:610 [inline]
       __se_sys_write fs/read_write.c:607 [inline]
       __x64_sys_write+0x73/0xb0 fs/read_write.c:607
       do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
      kobject: 'tx-0' (00000000443b70ac): kobject_add_internal: parent: 'queues', set: 'queues'
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      RIP: 0033:0x457669
      Code: fd b3 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 cb b3 fb ff c3 66 2e 0f 1f 84 00 00 00 00
      RSP: 002b:00007f9bd200bc78 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
      RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000457669
      RDX: 000000000000058f RSI: 00000000200033c0 RDI: 0000000000000003
      kobject: 'tx-0' (00000000443b70ac): kobject_uevent_env
      RBP: 000000000072bf00 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000246 R12: 00007f9bd200c6d4
      R13: 00000000004c2dcc R14: 00000000004da398 R15: 00000000ffffffff
      
      Allocated by task 1304:
       save_stack+0x43/0xd0 mm/kasan/kasan.c:448
       set_track mm/kasan/kasan.c:460 [inline]
       kasan_kmalloc+0xc7/0xe0 mm/kasan/kasan.c:553
       __do_kmalloc_node mm/slab.c:3684 [inline]
       __kmalloc_node_track_caller+0x50/0x70 mm/slab.c:3698
       __kmalloc_reserve.isra.41+0x41/0xe0 net/core/skbuff.c:140
       __alloc_skb+0x155/0x760 net/core/skbuff.c:208
      kobject: 'tx-0' (00000000443b70ac): fill_kobj_path: path = '/devices/virtual/net/tunl0/queues/tx-0'
       alloc_skb include/linux/skbuff.h:1011 [inline]
       __ip6_append_data.isra.49+0x2f1a/0x3f50 net/ipv6/ip6_output.c:1450
       ip6_append_data+0x1bc/0x2d0 net/ipv6/ip6_output.c:1619
       rawv6_sendmsg+0x15ab/0x4b40 net/ipv6/raw.c:938
       inet_sendmsg+0x1a1/0x690 net/ipv4/af_inet.c:798
       sock_sendmsg_nosec net/socket.c:621 [inline]
       sock_sendmsg+0xd5/0x120 net/socket.c:631
       ___sys_sendmsg+0x7fd/0x930 net/socket.c:2116
       __sys_sendmsg+0x11d/0x280 net/socket.c:2154
       __do_sys_sendmsg net/socket.c:2163 [inline]
       __se_sys_sendmsg net/socket.c:2161 [inline]
       __x64_sys_sendmsg+0x78/0xb0 net/socket.c:2161
       do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      kobject: 'gre0' (00000000cb1b2d7b): kobject_add_internal: parent: 'net', set: 'devices'
      
      Freed by task 1304:
       save_stack+0x43/0xd0 mm/kasan/kasan.c:448
       set_track mm/kasan/kasan.c:460 [inline]
       __kasan_slab_free+0x102/0x150 mm/kasan/kasan.c:521
       kasan_slab_free+0xe/0x10 mm/kasan/kasan.c:528
       __cache_free mm/slab.c:3498 [inline]
       kfree+0xcf/0x230 mm/slab.c:3817
       skb_free_head+0x93/0xb0 net/core/skbuff.c:553
       pskb_expand_head+0x3b2/0x10d0 net/core/skbuff.c:1498
       __pskb_pull_tail+0x156/0x18a0 net/core/skbuff.c:1896
       pskb_may_pull include/linux/skbuff.h:2188 [inline]
       _decode_session6+0xd11/0x14d0 net/ipv6/xfrm6_policy.c:150
       __xfrm_decode_session+0x71/0x140 net/xfrm/xfrm_policy.c:3272
      kobject: 'gre0' (00000000cb1b2d7b): kobject_uevent_env
       __xfrm_policy_check+0x380/0x2c40 net/xfrm/xfrm_policy.c:3322
       __xfrm_policy_check2 include/net/xfrm.h:1170 [inline]
       xfrm_policy_check include/net/xfrm.h:1175 [inline]
       xfrm6_policy_check include/net/xfrm.h:1185 [inline]
       vti6_rcv+0x4bd/0x8f3 net/ipv6/ip6_vti.c:316
       xfrm6_ipcomp_rcv+0x1a5/0x3a0 net/ipv6/xfrm6_protocol.c:132
       ip6_protocol_deliver_rcu+0x372/0x1940 net/ipv6/ip6_input.c:394
       ip6_input_finish+0x84/0x170 net/ipv6/ip6_input.c:434
       NF_HOOK include/linux/netfilter.h:289 [inline]
       ip6_input+0xe9/0x600 net/ipv6/ip6_input.c:443
       ip6_mc_input+0x514/0x11c0 net/ipv6/ip6_input.c:537
       dst_input include/net/dst.h:450 [inline]
       ip6_rcv_finish+0x17a/0x330 net/ipv6/ip6_input.c:76
       NF_HOOK include/linux/netfilter.h:289 [inline]
       ipv6_rcv+0x115/0x640 net/ipv6/ip6_input.c:272
       __netif_receive_skb_one_core+0x14d/0x200 net/core/dev.c:4973
       __netif_receive_skb+0x2c/0x1e0 net/core/dev.c:5083
       process_backlog+0x24e/0x7a0 net/core/dev.c:5923
      kobject: 'gre0' (00000000cb1b2d7b): fill_kobj_path: path = '/devices/virtual/net/gre0'
       napi_poll net/core/dev.c:6346 [inline]
       net_rx_action+0x7fa/0x19b0 net/core/dev.c:6412
       __do_softirq+0x308/0xb7e kernel/softirq.c:292
      
      The buggy address belongs to the object at ffff888191b8cac0
       which belongs to the cache kmalloc-512 of size 512
      The buggy address is located 176 bytes inside of
       512-byte region [ffff888191b8cac0, ffff888191b8ccc0)
      The buggy address belongs to the page:
      page:ffffea000646e300 count:1 mapcount:0 mapping:ffff8881da800940 index:0x0
      flags: 0x2fffc0000000200(slab)
      raw: 02fffc0000000200 ffffea0006eaaa48 ffffea00065356c8 ffff8881da800940
      raw: 0000000000000000 ffff888191b8c0c0 0000000100000006 0000000000000000
      page dumped because: kasan: bad access detected
      kobject: 'queues' (000000005fd6226e): kobject_add_internal: parent: 'gre0', set: '<NULL>'
      
      Memory state around the buggy address:
       ffff888191b8ca00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
       ffff888191b8ca80: fc fc fc fc fc fc fc fc fb fb fb fb fb fb fb fb
      >ffff888191b8cb00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                                                                   ^
       ffff888191b8cb80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
       ffff888191b8cc00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      
      Fixes: 0d3c703a ("ipv6: Cleanup IPv6 tunnel receive path")
      Fixes: ed1efb2a
      
       ("ipv6: Add support for IPsec virtual tunnel interfaces")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Cc: Steffen Klassert <steffen.klassert@secunet.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      cbb49697
  3. 30 Aug, 2018 1 commit
  4. 20 Aug, 2018 1 commit
    • Haishuang Yan's avatar
      ip6_vti: fix a null pointer deference when destroy vti6 tunnel · 9c86336c
      Haishuang Yan authored
      If load ip6_vti module and create a network namespace when set
      fb_tunnels_only_for_init_net to 1, then exit the namespace will
      cause following crash:
      
      [ 6601.677036] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
      [ 6601.679057] PGD 8000000425eca067 P4D 8000000425eca067 PUD 424292067 PMD 0
      [ 6601.680483] Oops: 0000 [#1] SMP PTI
      [ 6601.681223] CPU: 7 PID: 93 Comm: kworker/u16:1 Kdump: loaded Tainted: G            E     4.18.0+ #3
      [ 6601.683153] Hardware name: Fedora Project OpenStack Nova, BIOS seabios-1.7.5-11.el7 04/01/2014
      [ 6601.684919] Workqueue: netns cleanup_net
      [ 6601.685742] RIP: 0010:vti6_exit_batch_net+0x87/0xd0 [ip6_vti]
      [ 6601.686932] Code: 7b 08 48 89 e6 e8 b9 ea d3 dd 48 8b 1b 48 85 db 75 ec 48 83 c5 08 48 81 fd 00 01 00 00 75 d5 49 8b 84 24 08 01 00 00 48 89 e6 <48> 8b 78 08 e8 90 ea d3 dd 49 8b 45 28 49 39 c6 4c 8d 68 d8 75 a1
      [ 6601.690735] RSP: 0018:ffffa897c2737de0 EFLAGS: 00010246
      [ 6601.691846] RAX: 0000000000000000 RBX: 0000000000000000 RCX: dead000000000200
      [ 6601.693324] RDX: 0000000000000015 RSI: ffffa897c2737de0 RDI: ffffffff9f2ea9e0
      [ 6601.694824] RBP: 0000000000000100 R08: 0000000000000000 R09: 0000000000000000
      [ 6601.696314] R10: 0000000000000001 R11: 0000000000000000 R12: ffff8dc323c07e00
      [ 6601.697812] R13: ffff8dc324a63100 R14: ffffa897c2737e30 R15: ffffa897c2737e30
      [ 6601.699345] FS:  0000000000000000(0000) GS:ffff8dc33fdc0000(0000) knlGS:0000000000000000
      [ 6601.701068] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [ 6601.702282] CR2: 0000000000000008 CR3: 0000000424966002 CR4: 00000000001606e0
      [ 6601.703791] Call Trace:
      [ 6601.704329]  cleanup_net+0x1b4/0x2c0
      [ 6601.705268]  process_one_work+0x16c/0x370
      [ 6601.706145]  worker_thread+0x49/0x3e0
      [ 6601.706942]  kthread+0xf8/0x130
      [ 6601.707626]  ? rescuer_thread+0x340/0x340
      [ 6601.708476]  ? kthread_bind+0x10/0x10
      [ 6601.709266]  ret_from_fork+0x35/0x40
      
      Reproduce:
      modprobe ip6_vti
      echo 1 > /proc/sys/net/core/fb_tunnels_only_for_init_net
      unshare -n
      exit
      
      This because ip6n->tnls_wc[0] point to fallback device in default, but
      in non-default namespace, ip6n->tnls_wc[0] will be NULL, so add the NULL
      check comparatively.
      
      Fixes: e2948e5a
      
       ("ip6_vti: fix creating fallback tunnel device for vti6")
      Signed-off-by: default avatarHaishuang Yan <yanhaishuang@cmss.chinamobile.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      9c86336c
  5. 19 Aug, 2018 1 commit
  6. 18 Aug, 2018 1 commit
  7. 07 Jun, 2018 1 commit
  8. 01 May, 2018 1 commit
  9. 27 Apr, 2018 1 commit
  10. 05 Apr, 2018 1 commit
  11. 27 Mar, 2018 1 commit
  12. 19 Mar, 2018 3 commits
    • Stefano Brivio's avatar
      vti6: Fix dev->max_mtu setting · f8a554b4
      Stefano Brivio authored
      We shouldn't allow a tunnel to have IP_MAX_MTU as MTU, because
      another IPv6 header is going on top of our packets. Without this
      patch, we might end up building packets bigger than IP_MAX_MTU.
      
      Fixes: b96f9afe
      
       ("ipv4/6: use core net MTU range checking")
      Signed-off-by: default avatarStefano Brivio <sbrivio@redhat.com>
      Acked-by: default avatarSabrina Dubroca <sd@queasysnail.net>
      Signed-off-by: default avatarSteffen Klassert <steffen.klassert@secunet.com>
      f8a554b4
    • Stefano Brivio's avatar
      vti6: Keep set MTU on link creation or change, validate it · 7a67e69a
      Stefano Brivio authored
      In vti6_link_config(), if MTU is already given on link creation
      or change, validate and use it instead of recomputing it. To do
      that, we need to propagate the knowledge that MTU was set by
      userspace all the way down to vti6_link_config().
      
      To keep this simple, vti6_dev_init() sets the new 'keep_mtu'
      argument of vti6_link_config() to true: on initialization, we
      don't have convenient access to netlink attributes there, but we
      will anyway check whether dev->mtu is set in vti6_link_config().
      If it's non-zero, it was set to the value of the IFLA_MTU
      attribute during creation. Otherwise, determine a reasonable
      value.
      
      Fixes: ed1efb2a ("ipv6: Add support for IPsec virtual tunnel interfaces")
      Fixes: 53c81e95
      
       ("ip6_vti: adjust vti mtu according to mtu of lower device")
      Signed-off-by: default avatarStefano Brivio <sbrivio@redhat.com>
      Acked-by: default avatarSabrina Dubroca <sd@queasysnail.net>
      Signed-off-by: default avatarSteffen Klassert <steffen.klassert@secunet.com>
      7a67e69a
    • Stefano Brivio's avatar
      vti6: Properly adjust vti6 MTU from MTU of lower device · c6741fbe
      Stefano Brivio authored
      If a lower device is found, we don't need to subtract
      LL_MAX_HEADER to calculate our MTU: just use its MTU, the link
      layer headers are already taken into account by it.
      
      If the lower device is not found, start from ETH_DATA_LEN
      instead, and only in this case subtract a worst-case
      LL_MAX_HEADER.
      
      We then need to subtract our additional IPv6 header from the
      calculation.
      
      While at it, note that vti6 doesn't have a hardware header, so
      it doesn't need to set dev->hard_header_len. And as
      vti6_link_config() now always sets the MTU, there's no need to
      set a default value in vti6_dev_setup().
      
      This makes the behaviour consistent with IPv4 vti, after
      commit a3245236 ("vti4: Don't count header length twice."),
      which was accidentally reverted by merge commit f895f0cf
      ("Merge branch 'master' of
      git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec").
      
      While commit 53c81e95 ("ip6_vti: adjust vti mtu according to
      mtu of lower device") improved on the original situation, this
      was still not ideal. As reported in that commit message itself,
      if we start from an underlying veth MTU of 9000, we end up with
      an MTU of 8832, that is, 9000 - LL_MAX_HEADER - sizeof(ipv6hdr).
      This should simply be 8880, or 9000 - sizeof(ipv6hdr) instead:
      we found the lower device (veth) and we know we don't have any
      additional link layer header, so there's no need to subtract an
      hypothetical worst-case number.
      
      Fixes: 53c81e95
      
       ("ip6_vti: adjust vti mtu according to mtu of lower device")
      Signed-off-by: default avatarStefano Brivio <sbrivio@redhat.com>
      Acked-by: default avatarSabrina Dubroca <sd@queasysnail.net>
      Signed-off-by: default avatarSteffen Klassert <steffen.klassert@secunet.com>
      c6741fbe
  13. 04 Mar, 2018 1 commit
  14. 27 Feb, 2018 1 commit
  15. 25 Jan, 2018 1 commit
  16. 20 Dec, 2017 1 commit
    • Alexey Kodanev's avatar
      ip6_vti: adjust vti mtu according to mtu of lower device · 53c81e95
      Alexey Kodanev authored
      
      
      LTP/udp6_ipsec_vti tests fail when sending large UDP datagrams over
      ip6_vti that require fragmentation and the underlying device has an
      MTU smaller than 1500 plus some extra space for headers. This happens
      because ip6_vti, by default, sets MTU to ETH_DATA_LEN and not updating
      it depending on a destination address or link parameter. Further
      attempts to send UDP packets may succeed because pmtu gets updated on
      ICMPV6_PKT_TOOBIG in vti6_err().
      
      In case the lower device has larger MTU size, e.g. 9000, ip6_vti works
      but not using the possible maximum size, output packets have 1500 limit.
      
      The above cases require manual MTU setup after ip6_vti creation. However
      ip_vti already updates MTU based on lower device with ip_tunnel_bind_dev().
      
      Here is the example when the lower device MTU is set to 9000:
      
        # ip a sh ltp_ns_veth2
            ltp_ns_veth2@if7: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 ...
              inet 10.0.0.2/24 scope global ltp_ns_veth2
              inet6 fd00::2/64 scope global
      
        # ip li add vti6 type vti6 local fd00::2 remote fd00::1
        # ip li show vti6
            vti6@NONE: <POINTOPOINT,NOARP> mtu 1500 ...
              link/tunnel6 fd00::2 peer fd00::1
      
      After the patch:
        # ip li add vti6 type vti6 local fd00::2 remote fd00::1
        # ip li show vti6
            vti6@NONE: <POINTOPOINT,NOARP> mtu 8832 ...
              link/tunnel6 fd00::2 peer fd00::1
      Reported-by: default avatarPetr Vorel <pvorel@suse.cz>
      Signed-off-by: default avatarAlexey Kodanev <alexey.kodanev@oracle.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      53c81e95
  17. 26 Sep, 2017 1 commit
    • Alexey Kodanev's avatar
      vti: fix use after free in vti_tunnel_xmit/vti6_tnl_xmit · 36f6ee22
      Alexey Kodanev authored
      When running LTP IPsec tests, KASan might report:
      
      BUG: KASAN: use-after-free in vti_tunnel_xmit+0xeee/0xff0 [ip_vti]
      Read of size 4 at addr ffff880dc6ad1980 by task swapper/0/0
      ...
      Call Trace:
        <IRQ>
        dump_stack+0x63/0x89
        print_address_description+0x7c/0x290
        kasan_report+0x28d/0x370
        ? vti_tunnel_xmit+0xeee/0xff0 [ip_vti]
        __asan_report_load4_noabort+0x19/0x20
        vti_tunnel_xmit+0xeee/0xff0 [ip_vti]
        ? vti_init_net+0x190/0x190 [ip_vti]
        ? save_stack_trace+0x1b/0x20
        ? save_stack+0x46/0xd0
        dev_hard_start_xmit+0x147/0x510
        ? icmp_echo.part.24+0x1f0/0x210
        __dev_queue_xmit+0x1394/0x1c60
      ...
      Freed by task 0:
        save_stack_trace+0x1b/0x20
        save_stack+0x46/0xd0
        kasan_slab_free+0x70/0xc0
        kmem_cache_free+0x81/0x1e0
        kfree_skbmem+0xb1/0xe0
        kfree_skb+0x75/0x170
        kfree_skb_list+0x3e/0x60
        __dev_queue_xmit+0x1298/0x1c60
        dev_queue_xmit+0x10/0x20
        neigh_resolve_output+0x3a8/0x740
        ip_finish_output2+0x5c0/0xe70
        ip_finish_output+0x4ba/0x680
        ip_output+0x1c1/0x3a0
        xfrm_output_resume+0xc65/0x13d0
        xfrm_output+0x1e4/0x380
        xfrm4_output_finish+0x5c/0x70
      
      Can be fixed if we get skb->len before dst_output().
      
      Fixes: b9959fd3 ("vti: switch to new ip tunnel code")
      Fixes: 22e1b23d
      
       ("vti6: Support inter address family tunneling.")
      Signed-off-by: default avatarAlexey Kodanev <alexey.kodanev@oracle.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      36f6ee22
  18. 19 Sep, 2017 1 commit
    • Eric Dumazet's avatar
      ipv6: speedup ipv6 tunnels dismantle · bb401cae
      Eric Dumazet authored
      
      
      Implement exit_batch() method to dismantle more devices
      per round.
      
      (rtnl_lock() ...
       unregister_netdevice_many() ...
       rtnl_unlock())
      
      Tested:
      $ cat add_del_unshare.sh
      for i in `seq 1 40`
      do
       (for j in `seq 1 100` ; do unshare -n /bin/true >/dev/null ; done) &
      done
      wait ; grep net_namespace /proc/slabinfo
      
      Before patch :
      $ time ./add_del_unshare.sh
      net_namespace        110    267   5504    1    2 : tunables    8    4    0 : slabdata    110    267      0
      
      real    3m25.292s
      user    0m0.644s
      sys     0m40.153s
      
      After patch:
      
      $ time ./add_del_unshare.sh
      net_namespace        126    282   5504    1    2 : tunables    8    4    0 : slabdata    126    282      0
      
      real	1m38.965s
      user	0m0.688s
      sys	0m37.017s
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      bb401cae
  19. 18 Jul, 2017 1 commit
  20. 27 Jun, 2017 3 commits
  21. 07 Jun, 2017 1 commit
    • David S. Miller's avatar
      net: Fix inconsistent teardown and release of private netdev state. · cf124db5
      David S. Miller authored
      
      
      Network devices can allocate reasources and private memory using
      netdev_ops->ndo_init().  However, the release of these resources
      can occur in one of two different places.
      
      Either netdev_ops->ndo_uninit() or netdev->destructor().
      
      The decision of which operation frees the resources depends upon
      whether it is necessary for all netdev refs to be released before it
      is safe to perform the freeing.
      
      netdev_ops->ndo_uninit() presumably can occur right after the
      NETDEV_UNREGISTER notifier completes and the unicast and multicast
      address lists are flushed.
      
      netdev->destructor(), on the other hand, does not run until the
      netdev references all go away.
      
      Further complicating the situation is that netdev->destructor()
      almost universally does also a free_netdev().
      
      This creates a problem for the logic in register_netdevice().
      Because all callers of register_netdevice() manage the freeing
      of the netdev, and invoke free_netdev(dev) if register_netdevice()
      fails.
      
      If netdev_ops->ndo_init() succeeds, but something else fails inside
      of register_netdevice(), it does call ndo_ops->ndo_uninit().  But
      it is not able to invoke netdev->destructor().
      
      This is because netdev->destructor() will do a free_netdev() and
      then the caller of register_netdevice() will do the same.
      
      However, this means that the resources that would normally be released
      by netdev->destructor() will not be.
      
      Over the years drivers have added local hacks to deal with this, by
      invoking their destructor parts by hand when register_netdevice()
      fails.
      
      Many drivers do not try to deal with this, and instead we have leaks.
      
      Let's close this hole by formalizing the distinction between what
      private things need to be freed up by netdev->destructor() and whether
      the driver needs unregister_netdevice() to perform the free_netdev().
      
      netdev->priv_destructor() performs all actions to free up the private
      resources that used to be freed by netdev->destructor(), except for
      free_netdev().
      
      netdev->needs_free_netdev is a boolean that indicates whether
      free_netdev() should be done at the end of unregister_netdevice().
      
      Now, register_netdevice() can sanely release all resources after
      ndo_ops->ndo_init() succeeds, by invoking both ndo_ops->ndo_uninit()
      and netdev->priv_destructor().
      
      And at the end of unregister_netdevice(), we invoke
      netdev->priv_destructor() and optionally call free_netdev().
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      cf124db5
  22. 21 Apr, 2017 1 commit
  23. 24 Feb, 2017 1 commit
  24. 16 Feb, 2017 1 commit
  25. 27 Jan, 2017 1 commit
  26. 06 Jan, 2017 1 commit
  27. 18 Nov, 2016 1 commit
    • Alexey Dobriyan's avatar
      netns: make struct pernet_operations::id unsigned int · c7d03a00
      Alexey Dobriyan authored
      Make struct pernet_operations::id unsigned.
      
      There are 2 reasons to do so:
      
      1)
      This field is really an index into an zero based array and
      thus is unsigned entity. Using negative value is out-of-bound
      access by definition.
      
      2)
      On x86_64 unsigned 32-bit data which are mixed with pointers
      via array indexing or offsets added or subtracted to pointers
      are preffered to signed 32-bit data.
      
      "int" being used as an array index needs to be sign-extended
      to 64-bit before being used.
      
      	void f(long *p, int i)
      	{
      		g(p[i]);
      	}
      
        roughly translates to
      
      	movsx	rsi, esi
      	mov	rdi, [rsi+...]
      	call 	g
      
      MOVSX is 3 byte instruction which isn't necessary if the variable is
      unsigned because x86_64 is zero extending by default.
      
      Now, there is net_generic() function which, you guessed it right, uses
      "int" as an array index:
      
      	static inline void *net_generic(const struct net *net, int id)
      	{
      		...
      		ptr = ng->ptr[id - 1];
      		...
      	}
      
      And this function is used a lot, so those sign...
      c7d03a00
  28. 04 Nov, 2016 1 commit
    • Lorenzo Colitti's avatar
      net: inet: Support UID-based routing in IP protocols. · e2d118a1
      Lorenzo Colitti authored
      - Use the UID in routing lookups made by protocol connect() and
        sendmsg() functions.
      - Make sure that routing lookups triggered by incoming packets
        (e.g., Path MTU discovery) take the UID of the socket into
        account.
      - For packets not associated with a userspace socket, (e.g., ping
        replies) use UID 0 inside the user namespace corresponding to
        the network namespace the socket belongs to. This allows
        all namespaces to apply routing and iptables rules to
        kernel-originated traffic in that namespaces by matching UID 0.
        This is better than using the UID of the kernel socket that is
        sending the traffic, because the UID of kernel sockets created
        at namespace creation time (e.g., the per-processor ICMP and
        TCP sockets) is the UID of the user that created the socket,
        which might not be mapped in the namespace.
      
      Tested: compiles allnoconfig, allyesconfig, allmodconfig
      Tested: https://android-review.googlesource.com/253302
      
      Signed-off-by: default avatarLorenzo Colitti <lorenzo@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      e2d118a1
  29. 20 Oct, 2016 1 commit
    • Jarod Wilson's avatar
      ipv4/6: use core net MTU range checking · b96f9afe
      Jarod Wilson authored
      
      
      ipv4/ip_tunnel:
      - min_mtu = 68, max_mtu = 0xFFF8 - dev->hard_header_len - t_hlen
      - preserve all ndo_change_mtu checks for now to prevent regressions
      
      ipv6/ip6_tunnel:
      - min_mtu = 68, max_mtu = 0xFFF8 - dev->hard_header_len
      - preserve all ndo_change_mtu checks for now to prevent regressions
      
      ipv6/ip6_vti:
      - min_mtu = 1280, max_mtu = 65535
      - remove redundant vti6_change_mtu
      
      ipv6/sit:
      - min_mtu = 1280, max_mtu = 0xFFF8 - t_hlen
      - remove redundant ipip6_tunnel_change_mtu
      
      CC: netdev@vger.kernel.org
      CC: "David S. Miller" <davem@davemloft.net>
      CC: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
      CC: James Morris <jmorris@namei.org>
      CC: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
      CC: Patrick McHardy <kaber@trash.net>
      Signed-off-by: default avatarJarod Wilson <jarod@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      b96f9afe
  30. 11 Oct, 2016 1 commit
    • Nicolas Dichtel's avatar
      vti6: flush x-netns xfrm cache when vti interface is removed · 7f92083e
      Nicolas Dichtel authored
      This is the same fix than commit a5d0dc81
      
       ("vti: flush x-netns xfrm
      cache when vti interface is removed")
      
      This patch fixes a refcnt problem when a x-netns vti6 interface is removed:
      unregister_netdevice: waiting for vti6_test to become free. Usage count = 1
      
      Here is a script to reproduce the problem:
      
      ip link set dev ntfp2 up
      ip addr add dev ntfp2 2001::1/64
      ip link add vti6_test type vti6 local 2001::1 remote 2001::2 key 1
      ip netns add secure
      ip link set vti6_test netns secure
      ip netns exec secure ip link set vti6_test up
      ip netns exec secure ip link s lo up
      ip netns exec secure ip addr add dev vti6_test 2003::1/64
      ip -6 xfrm policy add dir out tmpl src 2001::1 dst 2001::2 proto esp \
      	   mode tunnel mark 1
      ip -6 xfrm policy add dir in tmpl src 2001::2 dst 2001::1 proto esp \
      	   mode tunnel mark 1
      ip xfrm state add src 2001::1 dst 2001::2 proto esp spi 1 mode tunnel \
      	   enc des3_ede 0x112233445566778811223344556677881122334455667788 mark 1
      ip xfrm state add src 2001::2 dst 2001::1 proto esp spi 1 mode tunnel \
      	   enc des3_ede 0x112233445566778811223344556677881122334455667788 mark 1
      ip netns exec secure  ping6 -c 4 2003::2
      ip netns del secure
      
      CC: Lance Richardson <lrichard@redhat.com>
      Signed-off-by: default avatarNicolas Dichtel <nicolas.dichtel@6wind.com>
      Acked-by: default avatarLance Richardson <lrichard@redhat.com>
      Signed-off-by: default avatarSteffen Klassert <steffen.klassert@secunet.com>
      7f92083e
  31. 21 Sep, 2016 1 commit
    • Nicolas Dichtel's avatar
      vti6: fix input path · 63c43787
      Nicolas Dichtel authored
      Since commit 1625f452, vti6 is broken, all input packets are dropped
      (LINUX_MIB_XFRMINNOSTATES is incremented).
      
      XFRM_TUNNEL_SKB_CB(skb)->tunnel.ip6 is set by vti6_rcv() before calling
      xfrm6_rcv()/xfrm6_rcv_spi(), thus we cannot set to NULL that value in
      xfrm6_rcv_spi().
      
      A new function xfrm6_rcv_tnl() that enables to pass a value to
      xfrm6_rcv_spi() is added, so that xfrm6_rcv() is not touched (this function
      is used in several handlers).
      
      CC: Alexey Kodanev <alexey.kodanev@oracle.com>
      Fixes: 1625f452
      
       ("net/xfrm_input: fix possible NULL deref of tunnel.ip6->parms.i_key")
      Signed-off-by: default avatarNicolas Dichtel <nicolas.dichtel@6wind.com>
      Signed-off-by: default avatarSteffen Klassert <steffen.klassert@secunet.com>
      63c43787
  32. 09 Sep, 2016 1 commit
  33. 11 Aug, 2016 1 commit
  34. 17 Feb, 2016 1 commit
  35. 08 Oct, 2015 1 commit
  36. 18 Sep, 2015 1 commit