1. 26 Mar, 2014 2 commits
    • xen-netback: Non-functional follow-up patch for grant mapping series · 0e59a4a5
      Zoltan Kiss authored

      Ian made some late comments about the grant mapping series, I incorporated the
      non-functional outcomes into this patch:
      
      - fix typos in a comment of xenvif_free(), and add another comment there
      - fix a typo in the comment of rx_drain_timeout_msecs
      - remove a stale comment before the call to xenvif_grant_handle_reset()
      Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • xen-netback: Stop using xenvif_tx_pending_slots_available · 869b9b19
      Zoltan Kiss authored

      Since the early days, TX stops if there aren't enough free pending slots to
      consume a maximum sized (slot-wise) packet. The reason for that is probably
      to avoid the case where we don't have enough free pending slots in the ring
      to finish the packet. But if we make sure that the pending ring has the
      same size as the shared ring, that shouldn't really happen. The frontend
      can only post packets which fit into the free space of the shared ring; if
      a packet doesn't fit, the frontend has to stop, as it can only increase
      req_prod when the whole packet fits onto the ring.
      This patch removes that check, makes sure the two rings have the same
      size, and removes a check from the callback. As we no longer stop the NAPI
      instance on this condition, we don't have to wake it up when we free
      pending slots.
      Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 25 Mar, 2014 2 commits
  3. 11 Mar, 2014 1 commit
  4. 07 Mar, 2014 9 commits
    • xen-netback: Aggregate TX unmap operations · e9275f5e
      Zoltan Kiss authored

      Unmapping causes TLB flushing, therefore we should do it in the largest
      possible batches. However, we shouldn't starve the guest for too long. So
      if the guest has space for at least two big packets and we don't have at
      least a quarter of the ring to unmap, delay it for at most 1 millisecond.
      Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • xen-netback: Timeout packets in RX path · 09350788
      Zoltan Kiss authored

      A malicious or buggy guest can leave its queue filled indefinitely, in
      which case the qdisc starts to queue packets for that VIF. If those
      packets came from another guest, it can block that guest's slots and
      prevent shutdown. To avoid that, we make sure the queue is drained every
      10 seconds.
      In the worst case, the qdisc queue usually takes 3 rounds to flush.
      Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • xen-netback: Handle guests with too many frags · e3377f36
      Zoltan Kiss authored

      The Xen network protocol had an implicit dependency on MAX_SKB_FRAGS.
      Netback has to handle guests sending up to XEN_NETBK_LEGACY_SLOTS_MAX
      slots. To achieve that:
      - create a new skb
      - map the leftover slots to its frags (no linear buffer here!)
      - chain it to the previous one through skb_shinfo(skb)->frag_list
      - map them
      - copy and coalesce the frags into a brand new skb and send it to the stack
      - unmap the two old skbs' pages
      
      It also introduces new stat counters, which help determine how often the
      guest sends a packet with more than MAX_SKB_FRAGS frags.
      
      NOTE: if bisect brought you here, you should apply the series up until
      "xen-netback: Timeout packets in RX path", otherwise malicious guests can block
      other guests by not releasing their sent packets.
      Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • xen-netback: Add stat counters for zerocopy · 1bb332af
      Zoltan Kiss authored

      These counters help determine how often the buffers had to be copied. They
      also help find out whether packets are leaked: if "sent != success + fail",
      there are probably packets that were never freed up properly.
      
      NOTE: if bisect brought you here, you should apply the series up until
      "xen-netback: Timeout packets in RX path", otherwise Windows guests can't work
      properly and malicious guests can block other guests by not releasing their sent
      packets.
      Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • xen-netback: Remove old TX grant copy definitions and fix indentations · 62bad319
      Zoltan Kiss authored

      These became obsolete with grant mapping. I've intentionally left the
      indentation this way, to improve the readability of the previous patches.
      
      NOTE: if bisect brought you here, you should apply the series up until
      "xen-netback: Timeout packets in RX path", otherwise Windows guests can't work
      properly and malicious guests can block other guests by not releasing their sent
      packets.
      Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • xen-netback: Introduce TX grant mapping · f53c3fe8
      Zoltan Kiss authored

      This patch introduces grant mapping on the netback TX path. It replaces
      grant copy operations, ditching grant copy coalescing along the way.
      Another solution for copy coalescing is introduced in "xen-netback: Handle
      guests with too many frags"; older guests and Windows can break before
      that patch is applied.
      There is a callback (xenvif_zerocopy_callback) from the core stack to
      release the slots back to the guest when kfree_skb or skb_orphan_frags is
      called. It feeds a separate dealloc thread, as scheduling the NAPI
      instance from there is inefficient, so we can't do the dealloc from the
      instance itself.
      Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • xen-netback: Handle foreign mapped pages on the guest RX path · 3e2234b3
      Zoltan Kiss authored

      The RX path needs to know whether the SKB fragments are stored on pages
      from another domain.
      Logically this patch should come after introducing the grant mapping
      itself, as it only makes sense after that; but to keep bisectability, I
      moved it here. It shouldn't change any functionality.
      xenvif_zerocopy_callback and ubuf_to_vif are just stubs here; they will be
      introduced properly later on.
      Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • xen-netback: Minor refactoring of netback code · 121fa4b7
      Zoltan Kiss authored

      This patch contains a few bits of refactoring before introducing the grant
      mapping changes:
      - introduce xenvif_tx_pending_slots_available(), as this is used several
        times and will be used more often
      - rename the thread to vifX.Y-guest-rx, to signify that it does RX work
        from the guest's point of view
      Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • xen-netback: Use skb->cb for pending_idx · 8f13dd96
      Zoltan Kiss authored

      Storing the pending_idx in the first byte of the linear buffer never
      looked good; skb->cb is a more proper place for it. It also prevents the
      header from being grant copied directly there, and since we don't have the
      pending_idx after we've copied the header, it's time to change it.
      It also introduces helpers for the RX side.
      Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  5. 06 Feb, 2014 1 commit
    • xen-netback: Fix Rx stall due to race condition · 9ab9831b
      Zoltan Kiss authored
      The recent patch to fix receive side flow control (11b57f90: xen-netback:
      stop vif thread spinning if frontend is unresponsive) solved the spinning
      thread problem, but it caused another one. The receive side can stall if:
      - [THREAD] xenvif_rx_action sets rx_queue_stopped to true
      - [INTERRUPT] interrupt happens, and sets rx_event to true
      - [THREAD] then xenvif_kthread sets rx_event to false
      - [THREAD] rx_work_todo doesn't return true anymore
      
      Also, if an interrupt is sent but there is still no room in the ring, it
      takes quite a long time until xenvif_rx_action realizes it. This patch
      ditches those two variables and reworks rx_work_todo. If the thread finds
      it can't fit more skbs into the ring, it saves the last slot estimation
      into rx_last_skb_slots, otherwise that stays 0. Then rx_work_todo checks
      whether:
      - there is something to send to the ring (like before)
      - there is space for the topmost packet in the queue
      
      I think that's a more natural and optimal thing to test than two bools
      which are set somewhere else.
      Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com>
      Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
      Acked-by: Wei Liu <wei.liu2@citrix.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  6. 14 Jan, 2014 1 commit
  7. 10 Jan, 2014 1 commit
    • xen-netback: stop vif thread spinning if frontend is unresponsive · 11b57f90
      Paul Durrant authored
      The recent patch to improve guest receive side flow control (ca2f09f2) had
      a slight flaw in the wait condition for the vif thread in that any
      remaining skbs in the guest receive side netback internal queue would
      prevent the thread from sleeping. An unresponsive frontend can lead to a
      permanently non-empty internal queue and thus the thread will spin. In
      this case the thread should really sleep until the frontend becomes
      responsive again.
      
      This patch adds an extra flag to the vif which is set if the shared ring
      is full and cleared when skbs are drained into the shared ring. Thus,
      if the thread runs, finds the shared ring full and can make no progress the
      flag remains set. If the flag remains set then the thread will sleep,
      regardless of a non-empty queue, until the next event from the frontend.
      Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
      Cc: Wei Liu <wei.liu2@citrix.com>
      Cc: Ian Campbell <ian.campbell@citrix.com>
      Cc: David Vrabel <david.vrabel@citrix.com>
      Acked-by: Wei Liu <wei.liu2@citrix.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  8. 30 Dec, 2013 1 commit
    • xen-netback: fix guest-receive-side array sizes · ac3d5ac2
      Paul Durrant authored

      The sizes chosen for the metadata and grant_copy_op arrays on the guest
      receive side are wrong:
      
      - The meta array is needlessly twice the ring size, when we only ever
        consume a single array element per RX ring slot
      - The grant_copy_op array is way too small. It's sized based on a bogus
        assumption: that at most two copy ops will be used per ring slot. This
        may have been true at some point in the past, but it's clear from
        looking at start_new_rx_buffer() that a new ring slot is only consumed
        if a frag would overflow the current slot (plus some other conditions),
        so the actual limit is MAX_SKB_FRAGS grant_copy_ops per ring slot.
      
      This patch fixes those two sizing issues and, because grant_copy_ops grows
      so much, it pulls it out into a separate chunk of vmalloc()ed memory.
      Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
      Acked-by: Wei Liu <wei.liu2@citrix.com>
      Cc: Ian Campbell <ian.campbell@citrix.com>
      Cc: David Vrabel <david.vrabel@citrix.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  9. 19 Dec, 2013 2 commits
  10. 17 Dec, 2013 1 commit
  11. 12 Dec, 2013 3 commits
    • xen-netback: fix gso_prefix check · a3314f3d
      Paul Durrant authored

      There is a mistake in checking the gso_prefix mask when passing large
      packets to a guest. The wrong shift is applied to the bit: the raw skb
      gso type is used rather than the translated one. This leads to large
      packets being handed to the guest without the GSO metadata. This patch
      fixes the check.
      
      The mistake manifested as errors whilst running Microsoft HCK large packet
      offload tests between a pair of Windows 8 VMs. I have verified this patch
      fixes those errors.
      Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
      Cc: Wei Liu <wei.liu2@citrix.com>
      Cc: Ian Campbell <ian.campbell@citrix.com>
      Cc: David Vrabel <david.vrabel@citrix.com>
      Acked-by: Ian Campbell <ian.campbell@citrix.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • xen-netback: napi: don't prematurely request a tx event · d9601a36
      Paul Durrant authored

      This patch changes the RING_FINAL_CHECK_FOR_REQUESTS in
      xenvif_build_tx_gops to a check for RING_HAS_UNCONSUMED_REQUESTS as the
      former call has the side effect of advancing the ring event pointer and
      therefore inviting another interrupt from the frontend before the napi
      poll has actually finished, thereby defeating the point of napi.
      
      The event pointer is updated by RING_FINAL_CHECK_FOR_REQUESTS in
      xenvif_poll, the napi poll function, if the work done is less than the
      budget i.e. when actually transitioning back to interrupt mode.
      Reported-by: Malcolm Crossley <malcolm.crossley@citrix.com>
      Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
      Cc: Wei Liu <wei.liu2@citrix.com>
      Cc: Ian Campbell <ian.campbell@citrix.com>
      Cc: David Vrabel <david.vrabel@citrix.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • xen-netback: napi: fix abuse of budget · 10574059
      Paul Durrant authored

      netback seems to be somewhat confused about the napi budget parameter. The
      parameter is supposed to limit the number of skbs processed in each poll,
      but netback has this confused with grant operations.
      
      This patch fixes that, properly limiting the work done in each poll. Note
      that this limit makes sure we do not process any more data from the shared
      ring than we intend to pass back from the poll. This is important to
      prevent tx_queue potentially growing without bound.
      Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
      Cc: Wei Liu <wei.liu2@citrix.com>
      Cc: Ian Campbell <ian.campbell@citrix.com>
      Cc: David Vrabel <david.vrabel@citrix.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  12. 11 Dec, 2013 1 commit
    • xen-netback: make sure skb linear area covers checksum field · d52eb0d4
      Paul Durrant authored

      skb_partial_csum_set requires that the linear area of the skb covers the
      checksum field. The checksum setup code in netback was only doing that
      pullup in the case when the pseudo header checksum was being recalculated,
      though. This patch makes the pullup unconditional. (I pull up the whole
      transport header just for simplicity; the requirement is only for the
      check field, but in the case of UDP this is the last field in the header
      and in the case of TCP it's the last but one.)
      
      The lack of pullup manifested as failures running Microsoft HCK network
      tests on a pair of Windows 8 VMs and it has been verified that this patch
      fixes the problem.
      Suggested-by: Jan Beulich <jbeulich@suse.com>
      Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
      Cc: Wei Liu <wei.liu2@citrix.com>
      Cc: Ian Campbell <ian.campbell@citrix.com>
      Cc: David Vrabel <david.vrabel@citrix.com>
      Reviewed-by: Jan Beulich <jbeulich@suse.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  13. 10 Dec, 2013 1 commit
    • xen-netback: improve guest-receive-side flow control · ca2f09f2
      Paul Durrant authored

      The way that flow control works without this patch is that, in start_xmit()
      the code uses xenvif_count_skb_slots() to predict how many slots
      xenvif_gop_skb() will consume and then adds this to a 'req_cons_peek'
      counter which it then uses to determine if the shared ring has that amount
      of space available by checking whether 'req_prod' has passed that value.
      If the ring doesn't have space the tx queue is stopped.
      xenvif_gop_skb() will then consume slots and update 'req_cons' and issue
      responses, updating 'rsp_prod' as it goes. The frontend will consume those
      responses and post new requests, by updating req_prod. So, req_prod chases
      req_cons which chases rsp_prod, and can never exceed that value. Thus if
      xenvif_count_skb_slots() ever returns a number of slots greater than
      xenvif_gop_skb() uses, req_cons_peek will get to a value that req_prod cannot
      possibly achieve (since it's limited by the 'real' req_cons) and, if this
      happens enough times, req_cons_peek gets more than a ring size ahead of
      req_cons and the tx queue then remains stopped forever waiting for an
      unachievable amount of space to become available in the ring.
      
      Having two routines trying to calculate the same value is always going to be
      fragile, so this patch does away with that. All we essentially need to do is
      make sure that we have 'enough stuff' on our internal queue without letting
      it build up uncontrollably. So start_xmit() makes a cheap optimistic check
      of how much space is needed for an skb and only turns the queue off if that
      is unachievable. net_rx_action() is the place where we could do with an
      accurate prediction but, since that has proven tricky to calculate, a
      cheap worst-case (but not too bad) estimate is all we really need, since
      the only thing we *must* prevent is xenvif_gop_skb() consuming more slots
      than are available.
      
      Without this patch I can trivially stall netback permanently by just doing
      a large guest to guest file copy between two Windows Server 2008R2 VMs on a
      single host.
      
      Patch tested with frontends in:
      - Windows Server 2008R2
      - CentOS 6.0
      - Debian Squeeze
      - Debian Wheezy
      - SLES11
      Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
      Cc: Wei Liu <wei.liu2@citrix.com>
      Cc: Ian Campbell <ian.campbell@citrix.com>
      Cc: David Vrabel <david.vrabel@citrix.com>
      Cc: Annie Li <annie.li@oracle.com>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Acked-by: Wei Liu <wei.liu2@citrix.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  14. 06 Dec, 2013 1 commit
  15. 28 Nov, 2013 1 commit
  16. 29 Oct, 2013 1 commit
  17. 17 Oct, 2013 3 commits
  18. 08 Oct, 2013 1 commit
  19. 30 Sep, 2013 1 commit
    • xen-netback: improve ring efficiency for guest RX · 4f0581d2
      Wei Liu authored
      There was a bug that netback routines netbk/xenvif_skb_count_slots and
      netbk/xenvif_gop_frag_copy disagreed with each other, which caused
      netback to push wrong number of responses to netfront, which caused
      netfront to eventually crash. The bug was fixed in 6e43fc04
      ("xen-netback: count number required slots for an skb more carefully").
      
      Commit 6e43fc04 focused on backport-ability. The drawback of the existing
      packing scheme is that the ring is not used efficiently, as stated in
      6e43fc04.
      
      skb->data like:
          |        1111|222222222222|3333        |
      
      is arranged as:
          |1111        |222222222222|3333        |
      
      If we can do this:
          |111122222222|22223333    |
      That would save one ring slot, which improves ring efficiency.
      
      This patch effectively reverts 6e43fc04. That patch made count_slots
      agree with gop_frag_copy, while this patch goes the other way around --
      make gop_frag_copy agree with count_slots. The end result is that they
      still agree with each other, and the ring is now arranged like:
          |111122222222|22223333    |
      
      The patch that improves packing was first posted by Xi Xiong and Matt
      Wilson. I only rebased it on top of net-next and rewrote the commit
      message, so I retain all their SoBs. For more information about the
      original bug please refer to the email listed below and the commit
      message of 6e43fc04.
      
      Original patch:
      http://lists.xen.org/archives/html/xen-devel/2013-07/msg00760.html
      
      Signed-off-by: Xi Xiong <xixiong@amazon.com>
      Reviewed-by: Matt Wilson <msw@amazon.com>
      [ msw: minor code cleanups, rewrote commit message, adjusted code
        to count RX slots instead of meta structures ]
      Signed-off-by: Matt Wilson <msw@amazon.com>
      Cc: Annie Li <annie.li@oracle.com>
      Cc: Wei Liu <wei.liu2@citrix.com>
      Cc: Ian Campbell <Ian.Campbell@citrix.com>
      [ liuw: rebased on top of net-next tree, rewrote commit message, coding
        style cleanup. ]
      Signed-off-by: Wei Liu <wei.liu2@citrix.com>
      Cc: David Vrabel <david.vrabel@citrix.com>
      Acked-by: Ian Campbell <Ian.Campbell@citrix.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  20. 13 Sep, 2013 1 commit
    • xen-netback: count number required slots for an skb more carefully · 6e43fc04
      David Vrabel authored

      When a VM is providing an iSCSI target and the LUN is used by the
      backend domain, the generated skbs for direct I/O writes to the disk
      have large, multi-page skb->data but no frags.
      
      With some lengths and starting offsets, xen_netbk_count_skb_slots()
      would be one short because the simple calculation of
      DIV_ROUND_UP(skb_headlen(), PAGE_SIZE) was not accounting for the
      decisions made by start_new_rx_buffer() which does not guarantee
      responses are fully packed.
      
      For example, a skb with length < 2 pages but which spans 3 pages would
      be counted as requiring 2 slots but would actually use 3 slots.
      
      skb->data:
      
          |        1111|222222222222|3333        |
      
      Fully packed, this would need 2 slots:
      
          |111122222222|22223333    |
      
      But because the 2nd page wholly fits into a slot, it is not split across
      slots and goes into a slot of its own:
      
          |1111        |222222222222|3333        |
      
      Miscounting the number of slots means netback may push more responses
      than the number of available requests.  This will cause the frontend
      to get very confused and report "Too many frags/slots".  The frontend
      never recovers and will eventually BUG.
      
      Fix this by counting the number of required slots more carefully.  In
      xen_netbk_count_skb_slots(), more closely follow the algorithm used by
      xen_netbk_gop_skb() by introducing xen_netbk_count_frag_slots() which
      is the dry-run equivalent of netbk_gop_frag_copy().
      Signed-off-by: David Vrabel <david.vrabel@citrix.com>
      Acked-by: Ian Campbell <ian.campbell@citrix.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  21. 29 Aug, 2013 3 commits
  22. 01 Jul, 2013 1 commit
  23. 24 Jun, 2013 1 commit