1. 17 Dec, 2020 3 commits
    • Alexandru Elisei's avatar
      arm64: gic: Use IPI test checking for the LPI tests · 906a8967
      Alexandru Elisei authored
      The LPI code validates a result similarly to the IPI tests, by checking if
      the target CPU received the interrupt with the expected interrupt number.
      However, the LPI tests invent their own way of checking the test results by
      creating a global struct (lpi_stats), using a separate interrupt handler
      (lpi_handler) and test function (check_lpi_stats).
      
      There are several areas that can be improved in the LPI code, which are
      already covered by the IPI tests:
      
      - check_lpi_stats() doesn't take into account that the target CPU can
        receive the correct interrupt multiple times.
      - check_lpi_stats() doesn't take into the account the scenarios where all
        online CPUs can receive the interrupt, but the target CPU is the last CPU
        that touches lpi_stats.observed.
      - Insufficient or missing memory synchronization.
      
      Instead of duplicating code, let's convert the LPI tests to use
      check_acked() and the same interrupt handler as the IPI tests, which has
      been renamed to irq_handler() to avoid any confusion.
      
      check_lpi_stats() has been replaced with check_acked() which, together with
      using irq_handler(), instantly gives us more correctness checks and proper
      memory synchronization between threads. lpi_stats.expected has been
      replaced by the CPU mask and the expected interrupt number arguments to
      check_acked(), with no change in semantics.
      
      lpi_handler() aborted the test if the interrupt number was not an LPI. This
      was changed in favor of allowing the test to continue, as it will fail in
      check_acked(), but possibly print information useful for debugging. If the
      test receives spurious interrupts, those are reported via report_info() at
      the end of the test for consistency with the IPI tests, which don't treat
      spurious interrupts as critical errors.
      
      In the spirit of code reuse, secondary_lpi_tests() has been replaced with
      ipi_recv() because the two are now identical; ipi_recv() has been renamed
      to irq_recv(), similarly to irq_handler(), to avoid confusion.
      
      CC: Eric Auger <eric.auger@redhat.com>
      Signed-off-by: Alexandru Elisei's avatarAlexandru Elisei <alexandru.elisei@arm.com>
      906a8967
    • Alexandru Elisei's avatar
      lib: arm64: gic-v3-its: Add wmb() barrier before INT command · 23018c10
      Alexandru Elisei authored
      The ITS tests use the INT command like an SGI. The its_send_int() function
      kicks a CPU and then the test checks that the interrupt was observed as
      expected in check_lpi_stats(). This is done by using lpi_stats.observed and
      lpi_stats.expected, where the target CPU only writes to lpi_stats.observed,
      and the source CPU reads it and compares the values with
      lpi_stats.expected.
      
      The fact that the target CPU doesn't read data written by the source CPU
      means that we don't need to do inter-processor memory synchronization
      for that between the two at the moment.
      
      The acked array is used by its-pending-migration test, but the reset value
      for acked (zero) is the same as the initialization value for static
      variables, so memory synchronization is again not needed.
      
      However, that is all about to change when we modify all ITS tests to use
      the same functions as the IPI tests. Add a write memory barrier to
      its_send_int(), similar to the gicv3_ipi_send_mask(), which has similar
      semantics.
      Suggested-by: default avatarAuger Eric <eric.auger@redhat.com>
      Signed-off-by: Alexandru Elisei's avatarAlexandru Elisei <alexandru.elisei@arm.com>
      23018c10
    • Alexandru Elisei's avatar
      arm64: gic: its-trigger: Don't trigger the LPI while it is pending · 83513394
      Alexandru Elisei authored
      The its-trigger test checks that LPI 8195 is not delivered to the CPU while
      it is disabled at the ITS level. After that it is re-enabled and the test
      checks that the interrupt is properly asserted. After it's re-enabled and
      before the stats are examined, the test triggers the interrupt again, which
      can lead to the same interrupt being delivered twice: once after the
      configuration invalidation and before the INT command, and once after the
      INT command.
      
      Get rid of the INT command after the interrupt is re-enabled to prevent the
      LPI from being asserted twice and add a separate check to test that the INT
      command still works for the now re-enabled LPI 8195.
      
      CC: Auger Eric <eric.auger@redhat.com>
      Suggested-by: default avatarZenghui Yu <yuzenghui@huawei.com>
      83513394
  2. 10 Dec, 2020 8 commits
    • Alexandru Elisei's avatar
      arm/arm64: gic: Make check_acked() more generic · b805a222
      Alexandru Elisei authored
      Testing that an interrupt is received as expected is done in three places:
      in check_ipi_sender(), check_irqnr() and check_acked(). check_irqnr()
      compares the interrupt ID with IPI_IRQ and records a failure in bad_irq,
      and check_ipi_sender() compares the sender with IPI_SENDER and writes to
      bad_sender when they don't match.
      
      Let's move all the checks to check_acked() by renaming
      bad_sender->irq_sender and bad_irq->irq_number and changing their semantics
      so they record the interrupt sender, respectively the irq number.
      check_acked() now takes two new parameters: the expected interrupt number
      and sender.
      
      This has two distinct advantages:
      
      1. check_acked() and ipi_handler() can now be used for interrupts other
         than IPIs.
      2. Correctness checks are consolidated in one function.
      
      CC: Andre Przywara <andre.przywara@arm.com>
      Reviewed-by: default avatarEric Auger <eric.auger@redhat.com>
      Signed-off-by: Alexandru Elisei's avatarAlexandru Elisei <alexandru.elisei@arm.com>
      b805a222
    • Alexandru Elisei's avatar
      arm/arm64: gic: Split check_acked() into two functions · 562a6e4c
      Alexandru Elisei authored
      check_acked() has several peculiarities: is the only function among the
      check_* functions which calls report() directly, it does two things
      (waits for interrupts and checks for misfired interrupts) and it also
      mixes printf, report_info and report calls.
      
      check_acked() also reports a pass and returns as soon all the target CPUs
      have received interrupts, However, a CPU not having received an interrupt
      *now* does not guarantee not receiving an erroneous interrupt if we wait
      long enough.
      
      Rework the function by splitting it into two separate functions, each with
      a single responsibility: wait_for_interrupts(), which waits for the
      expected interrupts to fire, and check_acked() which checks that interrupts
      have been received as expected.
      
      wait_for_interrupts() also waits an extra 100 milliseconds after the
      expected interrupts have been received in an effort to make sure we don't
      miss misfiring interrupts.
      
      Splitting check_acked() into two functions will also allow us to
      customize the behavior of each function in the future more easily
      without using an unnecessarily long list of arguments for check_acked().
      
      CC: Andre Przywara <andre.przywara@arm.com>
      Signed-off-by: Alexandru Elisei's avatarAlexandru Elisei <alexandru.elisei@arm.com>
      562a6e4c
    • Alexandru Elisei's avatar
      arm/arm64: gic: Wait for writes to acked or spurious to complete · 45019d09
      Alexandru Elisei authored
      The IPI test has two parts: in the first part, it tests that the sender CPU
      can send an IPI to itself (ipi_test_self()), and in the second part it
      sends interrupts to even-numbered CPUs (ipi_test_smp()). When acknowledging
      an interrupt, if we read back a spurious interrupt ID (1023), the handler
      increments the index in the static array spurious corresponding to the CPU
      ID that the handler is running on; if we get the expected interrupt ID, we
      increment the same index in the acked array.
      
      Reads of the spurious and acked arrays are synchronized with writes
      performed before sending the IPI. The synchronization is done either in the
      IPI sender function (GICv3), either by creating a data dependency (GICv2).
      
      At the end of the test, the sender CPU reads from the acked and spurious
      arrays to check against the expected behaviour. We need to make sure the
      that writes in ipi_handler() are observable by the sender CPU. Use a DSB
      ISHST to make sure that the writes have completed.
      
      One might rightfully argue that there are no guarantees regarding when the
      DSB instruction completes, just like there are no guarantees regarding when
      the value is observed by the other CPUs. However, let's do our best and
      instruct the CPU to complete the memory access when we know that it will be
      needed.
      
      We still need to follow the message passing pattern for the acked,
      respectively bad_irq and bad_sender, because DSB guarantees that all memory
      accesses that come before the barrier have completed, not that they have
      completed in program order.
      Reviewed-by: default avatarEric Auger <eric.auger@redhat.com>
      45019d09
    • Alexandru Elisei's avatar
      arm/arm64: gic: Check spurious and bad_sender in the active test · 955182be
      Alexandru Elisei authored
      The gicv{2,3}-active test sends an IPI from the boot CPU to itself, then
      checks that the interrupt has been received as expected. The
      ipi_clear_active_handler() clears the active state of the interrupt with a
      write to the GICD_ICACTIVER register instead of writing the to EOI
      register.
      
      When acknowledging the interrupt it is possible to get back an spurious
      interrupt ID (ID 1023), and the interrupt handler increments the number of
      spurious interrupts received on the current processor. However, this is not
      checked at the end of the test. Let's also check for spurious interrupts,
      like the IPI test does.
      
      For IPIs on GICv2, the value returned by a read of the GICC_IAR register
      performed when acknowledging the interrupt also contains the sender CPU
      ID. Add a check for that too.
      Reviewed-by: default avatarEric Auger <eric.auger@redhat.com>
      955182be
    • Alexandru Elisei's avatar
      arm/arm64: gic: Use correct memory ordering for the IPI test · 0f541ddf
      Alexandru Elisei authored
      The IPI test works by sending IPIs to even numbered CPUs from the
      IPI_SENDER CPU (CPU1), and then checking that the other CPUs received the
      interrupts as expected. The check is done in check_acked() by the
      IPI_SENDER CPU with the help of three arrays:
      
      - acked, where acked[i] == 1 means that CPU i received the interrupt.
      - bad_irq, where bad_irq[i] == -1 means that the interrupt received by CPU
        i had the expected interrupt number (IPI_IRQ).
      - bad_sender, where bad_sender[i] == -1 means that the interrupt received
        by CPU i was from the expected sender (IPI_SENDER, GICv2 only).
      
      The assumption made by check_acked() is that if a CPU acked an interrupt,
      then bad_sender and bad_irq have also been updated. This is a common
      inter-thread communication pattern called message passing.  For message
      passing to work correctly on weakly consistent memory model architectures,
      like arm and arm64, barriers or address dependencies are required. This is
      described in ARM DDI 0487F.b, in "Armv7 compatible approaches for ordering,
      using DMB and DSB barriers" (page K11-7993), in the section with a single
      observer, which is in our case the IPI_SENDER CPU.
      
      The IPI test attempts to enforce the correct ordering using memory
      barriers, but it's not enough. For example, the program execution below is
      valid from an architectural point of view:
      
      3 online CPUs, initial state (from stats_reset()):
      
      acked[2] = 0;
      bad_sender[2] = -1;
      bad_irq[2] = -1;
      
      CPU1 (in check_acked())		| CPU2 (in ipi_handler())
      				|
      smp_rmb() // DMB ISHLD		| acked[2]++;
      read 1 from acked[2]		|
      nr_pass++ // nr_pass = 3	|
      read -1 from bad_sender[2]	|
      read -1 from bad_irq[2]		|
      				| // in check_ipi_sender()
      				| bad_sender[2] = <bad ipi sender>
      				| // in check_irqnr()
      				| bad_irq[2] = <bad irq number>
      				| smp_wmb() // DMB ISHST
      nr_pass == nr_cpus, return	|
      
      In this scenario, CPU1 will read the updated acked value, but it will read
      the initial bad_sender and bad_irq values. This is permitted because the
      memory barriers do not create a data dependency between the value read from
      acked and the values read from bad_rq and bad_sender on CPU1, respectively
      between the values written to acked, bad_sender and bad_irq on CPU2.
      
      To avoid this situation, let's reorder the barriers and accesses to the
      arrays to create the needed dependencies that ensure that message passing
      behaves as expected.
      
      In the interrupt handler, the writes to bad_sender and bad_irq are
      reordered before the write to acked and a smp_wmb() barrier is added. This
      ensures that if other PEs observe the write to acked, then they will also
      observe the writes to the other two arrays.
      
      In check_acked(), put the smp_rmb() barrier after the read from acked to
      ensure that the subsequent reads from bad_sender, respectively bad_irq,
      aren't reordered locally by the PE.
      
      With these changes, the expected ordering of accesses is respected and we
      end up with the pattern described in the Arm ARM and also in the Linux
      litmus test MP+fencewmbonceonce+fencermbonceonce.litmus from
      tools/memory-model/litmus-tests. More examples and explanations can be
      found in the Linux source tree, in Documentation/memory-barriers.txt, in
      the sections "SMP BARRIER PAIRING" and "READ MEMORY BARRIERS VS LOAD
      SPECULATION".
      
      For consistency with ipi_handler(), the array accesses in
      ipi_clear_active_handler() have also been reordered. This shouldn't affect
      the functionality of that test.
      Signed-off-by: Alexandru Elisei's avatarAlexandru Elisei <alexandru.elisei@arm.com>
      0f541ddf
    • Alexandru Elisei's avatar
      arm/arm64: gic: Remove unnecessary synchronization with stats_reset() · bd37d4b1
      Alexandru Elisei authored
      The GICv3 driver executes a DSB barrier before sending an IPI, which
      ensures that memory accesses have completed. This removes the need to
      enforce ordering with respect to stats_reset() in the IPI handler.
      
      For GICv2, we still need the DMB to ensure ordering between the read of the
      GICC_IAR MMIO register and the read from the acked array. It also matches
      what the Linux GICv2 driver does in gic_handle_irq().
      Reviewed-by: default avatarEric Auger <eric.auger@redhat.com>
      bd37d4b1
    • Alexandru Elisei's avatar
      arm/arm64: gic: Remove SMP synchronization from ipi_clear_active_handler() · 2456844d
      Alexandru Elisei authored
      The gicv{2,3}-active test sends an IPI from the boot CPU to itself, then
      checks that the interrupt has been received as expected. There is no need
      to use inter-processor memory synchronization primitives on code that runs
      on the same CPU, so remove the unneeded memory barriers.
      Reviewed-by: default avatarEric Auger <eric.auger@redhat.com>
      2456844d
    • Alexandru Elisei's avatar
      lib: arm/arm64: gicv2: Add missing barrier when sending IPIs · e401b787
      Alexandru Elisei authored
      GICv2 generates IPIs with a MMIO write to the GICD_SGIR register. A common
      pattern for IPI usage is for the IPI receiver to read data written to
      memory by the sender. The armv7 and armv8 architectures implement a
      weakly-ordered memory model, which means that barriers are required to make
      sure that the expected values are observed.
      
      It turns out that because the receiver CPU must observe the write to memory
      that generated the IPI when reading the GICC_IAR MMIO register, we only
      need to ensure ordering of memory accesses, and not completion. Use a
      smp_wmb (DMB ISHST) barrier before sending the IPI.
      
      This also matches what the Linux GICv2 irqchip driver does (more details
      in commit 8adbf57fc429 ("irqchip: gic: use dmb ishst instead of dsb when
      raising a softirq")).
      
      The gicv2_ipi_send_self() function sends an IPI from a CPU to itself.
      The tests that use this function rely on the interrupt handler to record
      information about the interrupt by using several arrays. It is possible
      for the compiler to infer that the arrays won't be changed during normal
      program flow and try to perform harmful optimizations (like stashing a
      previous read in a register and reusing it). To prevent this, for GICv2,
      a compile barrier is added to gicv2_ipi_send_self(). For GICv3, the
      wmb() barrier in gic_ipi_send_single() (which is also used when a CPU
      sends an IPI to itself) already implies a compiler barrier.
      Reviewed-by: default avatarEric Auger <eric.auger@redhat.com>
      e401b787
  3. 09 Dec, 2020 1 commit
    • Alexandru Elisei's avatar
      lib: arm/arm64: gicv3: Add missing barrier when sending IPIs · 11ad5480
      Alexandru Elisei authored
      One common usage for IPIs is for one CPU to write to a shared memory
      location, send the IPI to kick another CPU, and the receiver to read from
      the same location. Proper synchronization is needed to make sure that the
      IPI receiver reads the most recent value and not stale data (for example,
      the write from the sender CPU might still be in a store buffer).
      
      For GICv3, IPIs are generated with a write to the ICC_SGI1R_EL1 register.
      To make sure the memory stores are observable by other CPUs, we need a
      wmb() barrier (DSB ST), which waits for stores to complete.
      
      From the definition of DSB from ARM DDI 0487F.b, page B2-139:
      
      "In addition, no instruction that appears in program order after the DSB
      instruction can alter any state of the system or perform any part of its
      functionality until the DSB completes other than:
      
      - Being fetched from memory and decoded.
      - Reading the general-purpose, SIMD and floating-point, Special-purpose, or
      System registers that are directly or indirectly read without causing
      side-effects."
      
      Similar definition for armv7 (ARM DDI 0406C.d, page A3-150).
      
      The DSB instruction is enough to prevent reordering of the GIC register
      write which comes in program order after the memory access.
      
      This also matches what the Linux GICv3 irqchip driver does (commit
      21ec30c0ef52 ("irqchip/gic-v3: Use wmb() instead of smb_wmb() in
      gic_raise_softirq()")).
      Reviewed-by: default avatarEric Auger <eric.auger@redhat.com>
      11ad5480
  4. 19 Nov, 2020 20 commits
  5. 11 Nov, 2020 6 commits
  6. 19 Oct, 2020 2 commits