- 23 May, 2018 3 commits
-
-
Andre Przywara authored
The header files in arm/aarch*/include/asm/ are directly copied from Linux, so we can't just put our own definitions in there. Move the GICv2M MMIO frame size into a more private header, to avoid breaking the build once the header files are synced from Linux. Signed-off-by:
Andre Przywara <andre.przywara@arm.com> Signed-off-by:
Will Deacon <will.deacon@arm.com>
-
Andre Przywara authored
Currently we accidentally overlap the GICv2m MMIO frame with the CPU interface region. Fix this by moving the v2m frame below the CPUI region. Signed-off-by:
Andre Przywara <andre.przywara@arm.com> Signed-off-by:
Will Deacon <will.deacon@arm.com>
-
Andre Przywara authored
The KVM_VGIC_V3_ITS_SIZE macro from the Linux API header file already covers the doorbell page, so we don't need to add that extra page size in our code. Signed-off-by:
Andre Przywara <andre.przywara@arm.com> Signed-off-by:
Will Deacon <will.deacon@arm.com>
-
- 06 Apr, 2018 3 commits
-
-
Jean-Philippe Brucker authored
Vhost supports a single eventfd as the kick mechanism. Registering a second one will override the first. To ensure vhost works with our virtio-pci, only register the kick eventfd that is used by the guest. Fixes: a508ea95 ("virtio/pci: Use port I/O for configuration registers by default") Signed-off-by:
Jean-Philippe Brucker <jean-philippe.brucker@arm.com> Signed-off-by:
Will Deacon <will.deacon@arm.com>
-
Jean-Philippe Brucker authored
virtio/pci.c registers a notification ioeventfd on both PIO and MMIO buses. But architectures other than x86 cannot differentiate MMIO from PIO traps, and the kernel always calls kvm_io_bus_read/write with KVM_MMIO_BUS as argument. As a result kvmtool's ioeventfd isn't used with virtio PCI, because the kernel can't find it and all accesses to the doorbell return to userspace. To fix it, don't set the PIO flag if the architecture doesn't support it. Fixes: a508ea95 ("virtio/pci: Use port I/O for configuration registers by default") Signed-off-by:
Jean-Philippe Brucker <jean-philippe.brucker@arm.com> Signed-off-by:
Will Deacon <will.deacon@arm.com>
-
Jean-Philippe Brucker authored
With vhost, the USER_POLL flag isn't passed to ioeventfd__add_event, so the function returns early and doesn't add the new event to the used_ioevents list. As a result ioeventfd__del_event doesn't remove the KVM event or free the structure. Always add the event to the list. Signed-off-by:
Jean-Philippe Brucker <jean-philippe.brucker@arm.com> Signed-off-by:
Will Deacon <will.deacon@arm.com>
-
- 19 Mar, 2018 2 commits
-
-
Jean-Philippe Brucker authored
The wmb() in next_desc seems out of place and the comments are inaccurate. Remove the unnecessary barrier and clean up next_desc(). next_desc() is called by virt_queue__get_head_iov() when filling the iov with descriptor addresses. It reads the descriptor's flags and next index. virt_queue__get_head_iov() only reads the direct and indirect descriptors, and doesn't write any shared memory apart from the iov and cursors that will be read by the caller. As far as I can see, vhost (the kernel implementation of virtio device) does well without any barrier here, so I think it might be safe to remove. Signed-off-by:
Jean-Philippe Brucker <jean-philippe.brucker@arm.com> Signed-off-by:
Will Deacon <will.deacon@arm.com>
-
Jean-Philippe Brucker authored
One barrier seems to be missing from kvmtool's virtio implementation, between virt_queue__available() and virt_queue__pop(). In the following scenario "avail" represents the shared "available" structure in the virtio queue:

    Guest                           | Host
                                    |
    avail.ring[shadow] = desc_idx   | while (avail.idx != shadow)
    smp_wmb()                       |     /* missing smp_rmb() */
    avail.idx = ++shadow            |     desc_idx = avail.ring[shadow++]

If the host observes the avail.idx write before the avail.ring update, then it will fetch the wrong desc_idx. Add the missing barrier. This seems to fix the horrible bug I'm often seeing when running netperf in a guest (virtio-net + tap) on AMD Seattle. The TX thread reads the wrong descriptor index and either faults when accessing the TX buffer, or pushes the wrong index to the used ring. In that case the guest complains that "id %u is not a head!" and stops the queue. Signed-off-by:
Jean-Philippe Brucker <jean-philippe.brucker@arm.com> Signed-off-by:
Will Deacon <will.deacon@arm.com>
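The ordering problem described in this commit can be sketched with C11 atomics. This is only an illustration and not kvmtool's code: `struct avail_ring`, `QUEUE_SIZE`, `push_avail()` and `pop_avail()` are hypothetical names, and the C11 fences stand in for smp_wmb() and smp_rmb().

```c
#include <stdatomic.h>
#include <stdint.h>

#define QUEUE_SIZE 8 /* hypothetical ring size, chosen for the example */

/* Simplified stand-in for the shared "available" structure. */
struct avail_ring {
	_Atomic uint16_t idx;
	uint16_t ring[QUEUE_SIZE];
};

/* Guest side: publish the ring entry first, then the new index. */
static void push_avail(struct avail_ring *avail, uint16_t *shadow,
		       uint16_t desc_idx)
{
	avail->ring[*shadow % QUEUE_SIZE] = desc_idx;
	/* Plays the role of smp_wmb(): ring write is ordered before idx. */
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&avail->idx, ++(*shadow), memory_order_relaxed);
}

/* Host side: returns 1 and fills *desc_idx, or 0 if the ring is empty. */
static int pop_avail(struct avail_ring *avail, uint16_t *shadow,
		     uint16_t *desc_idx)
{
	if (atomic_load_explicit(&avail->idx, memory_order_relaxed) == *shadow)
		return 0;

	/* Plays the role of the missing smp_rmb(): idx is read before ring. */
	atomic_thread_fence(memory_order_acquire);

	*desc_idx = avail->ring[(*shadow)++ % QUEUE_SIZE];
	return 1;
}
```

Without the acquire fence the host may read the ring slot before the guest's write to it becomes visible, which matches the wrong desc_idx symptom reported above.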
-
- 29 Jan, 2018 3 commits
-
-
Jean-Philippe Brucker authored
Modern virtio PCI is allowed to use both memory and I/O BARs for the config space, but legacy devices must use I/O for BAR0, as specified by Virtio v1.0 cs04: 4.1.5.1.1.1 Legacy Interface: A Note on Device Layout Detection "Transitional devices MUST expose the Legacy Interface in I/O space in BAR0." What virtio calls "I/O space" is most certainly port I/O, as hinted by the discussion in 4.1.4 Virtio Structure PCI Capabilities, where it distinguishes "memory BARs" from "I/O BARs". This is also the conclusion made by SeaBIOS [1], which only looks for port I/O in BAR0 when driving a transitional device. I think MMIO was made the default by a463650c ("kvm tools: pci: add MMIO interface to virtio-pci devices") to support ARM targets, but we support PIO as well as MMIO nowadays. So let's make the legacy virtio implementation comply with the specification and use port I/O for BAR0. [1] https://patchwork.kernel.org/patch/10038927/ Signed-off-by:
Jean-Philippe Brucker <jean-philippe.brucker@arm.com> Signed-off-by:
Will Deacon <will.deacon@arm.com>
-
Jean-Philippe Brucker authored
Bad things happen when the VIRTIO_RING_F_EVENT_IDX feature isn't negotiated and we try to write the avail_event anyway. SeaBIOS, for example, stores internal data where avail_event should be [1]. Technically the Virtio specification doesn't forbid the device from writing the avail_event, and it's up to the driver to reserve space for it ("the transitional driver [...] MUST allocate the total number of bytes for the virtqueue according to [formula containing the avail event]"). But it doesn't hurt us to avoid writing avail_event, and kvmtool needs changes for interrupt suppression anyway, in order to comply with the spec. Indeed Virtio 1.0 cs04 says, in 2.4.7.2 Device Requirements: Virtqueue Interrupt Suppression: """ If the VIRTIO_F_EVENT_IDX feature bit is not negotiated: * The device MUST ignore the used_event value. * After the device writes a descriptor index into the used ring: - If flags is 1, the device SHOULD NOT send an interrupt. """ So let's do that. [1] https://patchwork.kernel.org/patch/10038931/ Signed-off-by:
Jean-Philippe Brucker <jean-philippe.brucker@arm.com> Signed-off-by:
Will Deacon <will.deacon@arm.com>
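A minimal sketch of the interrupt decision quoted from the spec, assuming a hypothetical helper `should_signal()` that receives the negotiated state and the ring fields as plain arguments. The event-index comparison follows the same arithmetic as vring_need_event() in the Linux virtio ring header; everything else is illustrative and not taken from kvmtool.

```c
#include <stdbool.h>
#include <stdint.h>

/* Flag value defined by the virtio ring ABI. */
#define VRING_AVAIL_F_NO_INTERRUPT 1

/* Same arithmetic as vring_need_event() in linux/virtio_ring.h. */
static bool need_event(uint16_t event_idx, uint16_t new_idx, uint16_t old_idx)
{
	return (uint16_t)(new_idx - event_idx - 1) < (uint16_t)(new_idx - old_idx);
}

/*
 * Hypothetical helper: decide whether the device should interrupt the
 * guest after adding entries to the used ring.
 */
static bool should_signal(bool event_idx_negotiated, uint16_t avail_flags,
			  uint16_t used_event, uint16_t new_idx,
			  uint16_t old_idx)
{
	/* Feature not negotiated: ignore used_event, honour the flags only. */
	if (!event_idx_negotiated)
		return !(avail_flags & VRING_AVAIL_F_NO_INTERRUPT);

	/* Feature negotiated: interrupt only when used_event was crossed. */
	return need_event(used_event, new_idx, old_idx);
}
```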
-
Jean-Philippe Brucker authored
We're going to need the feature bits negotiated between host and guest in the core code. Save them in the virtio_device structure. Signed-off-by:
Jean-Philippe Brucker <jean-philippe.brucker@arm.com> Signed-off-by:
Will Deacon <will.deacon@arm.com>
-
- 14 Dec, 2017 1 commit
-
-
Jean-Philippe Brucker authored
When characters are input on the console before virtio_console is initialized, the term.c poll thread will get stuck in virtio_console__inject_interrupt, because it ends up doing pthread_cond_wait on the uninitialized poll_cond, which will hang indefinitely. As a result it becomes impossible to input characters into the guest, even when using serial instead of virtio console. Initialize poll_cond statically to prevent this race. Signed-off-by:
Jean-Philippe Brucker <jean-philippe.brucker@arm.com> Signed-off-by:
Will Deacon <will.deacon@arm.com>
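Static initialization of a condition variable can be sketched as follows. Only the name poll_cond comes from the commit message; `poll_lock`, `device_ready`, `mark_ready()` and `wait_ready()` are hypothetical and exist just to make the example self-contained.

```c
#include <pthread.h>

/* Valid from program start, so an early waiter never sees garbage. */
static pthread_mutex_t poll_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t poll_cond = PTHREAD_COND_INITIALIZER;
static int device_ready; /* hypothetical predicate guarded by poll_lock */

/* Called once the device is initialized. */
static void mark_ready(void)
{
	pthread_mutex_lock(&poll_lock);
	device_ready = 1;
	pthread_cond_signal(&poll_cond);
	pthread_mutex_unlock(&poll_lock);
}

/* Safe to call at any time, even before device initialization. */
static int wait_ready(void)
{
	pthread_mutex_lock(&poll_lock);
	while (!device_ready)
		pthread_cond_wait(&poll_cond, &poll_lock);
	pthread_mutex_unlock(&poll_lock);
	return device_ready;
}
```

Because the objects are initialized statically, there is no window in which another thread can wait on an uninitialized condition variable.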
-
- 03 Nov, 2017 4 commits
-
-
Andre Przywara authored
Commit f6108d72 ("Add GICv2m support") introduced a bool return type, but did not include the required header (this was probably part of a former prerequisite series). Fix this by including the header. Fixes: f6108d72 ("Add GICv2m support") Signed-off-by:
Andre Przywara <andre.przywara@arm.com> Signed-off-by:
Will Deacon <will.deacon@arm.com>
-
Jean-Philippe Brucker authored
GICv2m is a small extension to the GICv2 architecture, specified in the Server Base System Architecture (SBSA). It adds a set of registers to convert MSIs into SPIs, effectively enabling MSI support for pre-GICv3 platforms. Implement a GICv2m emulation entirely in userspace. Add a thin translation layer in irq.c to catch the MSI->SPI routing setup of the guest, and then transform irqfd injection of MSI into the associated SPI. There shouldn't be any significant runtime overhead compared to gicv3-its. The device can be enabled by passing "--irqchip gicv2m" to kvmtool. Signed-off-by:
Jean-Philippe Brucker <jean-philippe.brucker@arm.com> Signed-off-by:
Will Deacon <will.deacon@arm.com>
-
Jean-Philippe Brucker authored
When kvm_pause is called early (from taking the rwlock), it segfaults because the CPU array is initialized slightly later. Fix this. This doesn't happen at the moment but the gicv2m patch will register an MMIO region, which requires br_write_lock. gicv2m is instantiated by kvm__arch_init from within core_init (level 0). The CPU array is initialized later in base_init (level 1). Signed-off-by:
Jean-Philippe Brucker <jean-philippe.brucker@arm.com> Signed-off-by:
Will Deacon <will.deacon@arm.com>
-
Jean-Philippe Brucker authored
Commit 5857730c ("builtin-run: Pass console= parameter based on active console") adds a console parameter to the kernel command line, but doesn't account for x86 kvm__arch_set_cmdline populating real_cmdline without adding a space. Fix the concatenation. Signed-off-by:
Jean-Philippe Brucker <jean-philippe.brucker@arm.com> Signed-off-by:
Will Deacon <will.deacon@arm.com>
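The concatenation issue can be illustrated with a small helper that inserts a separating space only when one is needed. `cmdline_append()` and the strings used with it are hypothetical and not taken from kvmtool.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: append `param` to the NUL-terminated `cmdline`
 * (capacity `size`), adding one space if the existing text does not
 * already end with one.
 */
static void cmdline_append(char *cmdline, size_t size, const char *param)
{
	size_t len = strlen(cmdline);

	if (len > 0 && cmdline[len - 1] != ' ' && len + 1 < size) {
		cmdline[len++] = ' ';
		cmdline[len] = '\0';
	}

	/* snprintf truncates instead of overflowing the buffer. */
	snprintf(cmdline + len, size - len, "%s", param);
}
```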
-
- 25 Oct, 2017 1 commit
-
-
Will Deacon authored
x86 already does this in the backend, but doing it in the generic code means that it is possible to boot a defconfig arm64 kernel under kvmtool without having to specify any additional parameters at all. Signed-off-by:
Will Deacon <will.deacon@arm.com>
-
- 24 Oct, 2017 1 commit
-
-
Wei Chen authored
In kvmtool, the terminal has at most 4 term-devices, each of which can connect to a serial8250 or virtio console port. kvmtool runs a polling thread that detects incoming data on these term-devices and forwards it to the guest through the serial8250 or virtio console ports. On x86, kvmtool allows reading data from all 4 term-devices, but on ARM only the first term-device is read and data from the others is ignored. We are currently adding kvmtool support to runv (a kind of hyper container) together with the Hyperhq developers. That setup uses 3 serial ports in the guest to communicate with the host (container runtime). This works fine on x86 but not on ARM, because terminal 2 carries the control messages and is unidirectional there. Change kvm__arch_read_term for ARM to allow reading data from all term-devices. Signed-off-by:
Wei Chen <Wei.Chen@arm.com> Signed-off-by:
Will Deacon <will.deacon@arm.com>
-
- 09 Oct, 2017 1 commit
-
-
Will Deacon authored
Fully fledged bootloaders should really be populating this from within the guest using virtio-rng, but having a way to specify it on the cmdline is useful for developers or users without a bootloader. Signed-off-by:
Will Deacon <will.deacon@arm.com>
-
- 30 Aug, 2017 4 commits
-
-
Marc Zyngier authored
At the moment we use the linker to convert the compiled guest_init binary into an ELF object file, so it can be embedded into the kvmtool binary and accessed later easily at runtime. Now this has two problems: 1) This approach does not work for MIPS, because the linker defaults to a different ABI than the compiler, so the GCC generated object files are not compatible with this converted binary. 2) The size symbol as it's used at the moment in the object file is subject to relocation, which leads to wrong results when using PIE builds, which is now the default for some distributions. Fix those two problems at once by using some shell tools to create a C source file containing the guest_init binary, which then gets compiled into a proper object file with the normal compiler and its flags. The size of the guest init binaries is now simply a variable, which does not get mangled at all. Signed-off-by:
Marc Zyngier <marc.zyngier@arm.com> Signed-off-by:
Andre Przywara <andre.przywara@arm.com> Signed-off-by:
Will Deacon <will.deacon@arm.com>
-
Andre Przywara authored
So far the generation of the guest_init binaries is not properly modelled in the Makefile: the intermediate object files are not targets. This leads to failures when those files get deleted. So (also in preparation for the upcoming rework) rework the dependency chain to have those intermediate files covered as well, which involves splitting the generation into two steps. On the way use automatic variables where applicable and remove the explicit listing of the guest_init targets, which are now covered by the final $(GUEST_OBJS) targets. Signed-off-by:
Andre Przywara <andre.przywara@arm.com> Signed-off-by:
Will Deacon <will.deacon@arm.com>
-
Wei Chen authored
UFO support was removed by Linux commit fb652fdfe83710da0ca13448a41b7ed027d0a984: https://www.spinics.net/lists/netdev/msg443562.html As a result, using tap mode for networking (--network mode=tap,tapif=...) fails with the following error: "Warning: Config tap device TUNSETOFFLOAD error You have requested a TAP device, but creation of one has failed because: Invalid argument" On a recent kernel we therefore have to drop TUN_F_UFO from the TAP init, but doing so unconditionally would lose the UFO feature on older kernels that lack the above commit. So check whether the kernel's tap driver supports UFO. The tap UFO state is used in get_host_features to return the correct VIRTIO_NET features, so deferring the check to virtio_net__tap_init would be too late. Split the tap creation code out of tap_init into a standalone function, which virtio_net_init calls to create the tap device and check the tap UFO support status at the very beginning. Signed-off-by:
Wei Chen <Wei.Chen@arm.com> Signed-off-by:
Will Deacon <will.deacon@arm.com>
-
Thomas Petazzoni authored
Since kernel commit 25dc1d6cc3082aab293e5dad47623b550f7ddd2a ("x86: stop exporting msr-index.h to userland"), <asm/msr-index.h> is no longer exported to userspace. Therefore, any toolchain built with kernel headers >= 4.12 will no longer have this header file, causing a build failure in kvmtool. As a replacement, this patch includes inside x86/kvm-cpu.c the necessary MSR_* definitions. Reviewed-by:
Riku Voipio <riku.voipio@linaro.org> Signed-off-by:
Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by:
Will Deacon <will.deacon@arm.com>
-
- 09 Jun, 2017 17 commits
-
-
Will Deacon authored
GCC 7 warns about truncating the mpidr when we print the cpu_name into the device tree: arm/fdt.c: In function ‘setup_fdt’: arm/fdt.c:58:45: error: ‘%lx’ directive output may be truncated writing between 1 and 10 bytes into a region of size 7 [-Werror=format-truncation=] snprintf(cpu_name, CPU_NAME_MAX_LEN, "cpu@%lx", mpidr); Fix this by bumping the buffer to 15 bytes. We really only need 11 bytes, but GCC isn't smart enough to identify that we mask out the top bits of the MPIDR and the analysis just seems to be based on types. Signed-off-by:
Will Deacon <will.deacon@arm.com>
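The buffer arithmetic can be checked with a short sketch. `format_cpu_name()` is a hypothetical wrapper, and the 24-bit value passed in below is an assumption made to match the "11 bytes" figure (4 characters for "cpu@", 6 hex digits, 1 terminator).

```c
#include <stdio.h>

/* Buffer size after the fix described above. */
#define CPU_NAME_MAX_LEN 15

/*
 * Hypothetical wrapper around the snprintf() call from the warning.
 * Returns the number of characters the full name needs, excluding
 * the terminating NUL.
 */
static int format_cpu_name(char *buf, size_t len, unsigned long mpidr)
{
	return snprintf(buf, len, "cpu@%lx", mpidr);
}
```

With a masked 24-bit MPIDR the longest name is "cpu@ffffff", that is 10 characters plus the terminator, so the 15-byte buffer leaves headroom while also satisfying the compiler's type-based analysis.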
-
Jeremy Linton authored
makedev() should be sourced from sys/sysmacros.h rather than sys/types.h. This is because glibc is moving away from having it available in types.h. https://patchwork.ozlabs.org/patch/611994/ Signed-off-by:
Jeremy Linton <jeremy.linton@arm.com> Signed-off-by:
Will Deacon <will.deacon@arm.com>
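A small sketch of the include change, assuming a glibc or musl system where sys/sysmacros.h is available. `example_dev()` and the major/minor numbers 5 and 1 are purely illustrative.

```c
#include <sys/types.h>     /* dev_t */
#include <sys/sysmacros.h> /* makedev(), major(), minor() */

/* Hypothetical helper building a device number from major 5, minor 1. */
static dev_t example_dev(void)
{
	return makedev(5, 1);
}
```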
-
Andre Przywara authored
With everything in place for the ITS emulation add a new option to the --irqchip parameter to allow the user to specify --irqchip=gicv3-its to enable the ITS emulation. This will trigger creating the FDT node and an ITS register frame to tell the kernel we want ITS emulation in the guest. Acked-by:
Marc Zyngier <marc.zyngier@arm.com> Signed-off-by:
Andre Przywara <andre.przywara@arm.com> Signed-off-by:
Will Deacon <will.deacon@arm.com>
-
Andre Przywara authored
For ITS emulation we need the device ID along with the MSI payload and doorbell address to identify an MSI, so we need to put it in the GSI IRQ routing table too. There is a per-VM capability by which the kernel signals the need for a device ID, so check this and put the device ID into the routing table if needed. For PCI devices we take the bus/device/function triplet and pass that to the routing setup call. Acked-by:
Marc Zyngier <marc.zyngier@arm.com> Signed-off-by:
Andre Przywara <andre.przywara@arm.com> Signed-off-by:
Will Deacon <will.deacon@arm.com>
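The bus/device/function triplet is conventionally packed into a 16-bit identifier as shown below. `pci_bdf_to_devid()` is a hypothetical name, and whether kvmtool builds the value exactly this way is an assumption; the bit layout itself is the standard PCI one (8 bits bus, 5 bits device, 3 bits function).

```c
#include <stdint.h>

/*
 * Hypothetical helper: pack a PCI bus/device/function triplet into the
 * standard 16-bit layout (bus[15:8], device[7:3], function[2:0]).
 */
static uint32_t pci_bdf_to_devid(uint8_t bus, uint8_t dev, uint8_t fn)
{
	return ((uint32_t)bus << 8) |
	       ((uint32_t)(dev & 0x1f) << 3) |
	       (uint32_t)(fn & 0x7);
}
```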
-
Andre Przywara authored
Since we will soon start using GSI routing on ARM platforms too, we have to set up the initial SPI routing table. Before the first call to KVM_SET_GSI_ROUTING, the kernel holds this table internally, but this is overwritten with the ioctl, so we have to explicitly set it up here. The routing is actually not used for IRQs triggered by KVM_IRQ_LINE, but it needs to be here anyway. We use a simple 1:1 mapping. Acked-by:
Marc Zyngier <marc.zyngier@arm.com> Signed-off-by:
Andre Przywara <andre.przywara@arm.com> Signed-off-by:
Will Deacon <will.deacon@arm.com>
-
Andre Przywara authored
The ITS emulation requires a unique device ID to be passed along the MSI payload when kvmtool wants to trigger an MSI in the guest. According to the proposed changes to the interface add the PCI bus/device/function triple to the structure passed with the ioctl. Check the respective capability before actually adding the device ID to the kvm_msi struct. Acked-by:
Marc Zyngier <marc.zyngier@arm.com> Signed-off-by:
Andre Przywara <andre.przywara@arm.com> Signed-off-by:
Will Deacon <will.deacon@arm.com>
-
Andre Przywara authored
KVM capabilities can be per-VM; in this case the ioctl should be issued on the VM file descriptor, not on the system fd. Since this feature is guarded by a (system) capability itself, wrap the call into a function of its own. Acked-by:
Marc Zyngier <marc.zyngier@arm.com> Signed-off-by:
Andre Przywara <andre.przywara@arm.com> Signed-off-by:
Will Deacon <will.deacon@arm.com>
-
Andre Przywara authored
The ARM GICv3 ITS requires a separate device tree node to describe the ITS. Add this as a child to the GIC interrupt controller node to let a guest discover and use the ITS if the user requests it. Since we now need to specify #address-cells for the GIC node, we have to add two zeroes to the interrupt map to match that. Acked-by:
Marc Zyngier <marc.zyngier@arm.com> Signed-off-by:
Andre Przywara <andre.przywara@arm.com> Signed-off-by:
Will Deacon <will.deacon@arm.com>
-
Andre Przywara authored
The GICv3 ITS expects a separate 64K page to hold ITS registers. Add a function to reserve such a page in the guest's I/O memory and use that for the ITS vGIC type. To cover the 64K page with the MSI doorbell (which directly follows the page with the register frames), we reserve this as well, although the guest is never expected to write into this. Acked-by:
Marc Zyngier <marc.zyngier@arm.com> Signed-off-by:
Andre Przywara <andre.przywara@arm.com> Signed-off-by:
Will Deacon <will.deacon@arm.com>
-
Vladimir Murzin authored
KVM/arm recently got support for vGICv3 (and vITS), which is evident in the updated header file. So as now ARM has feature parity when it comes to the GIC emulation, we can remove the special defines we had in place to allow compilation for ARM(32). For simplicity we now use 64K sized GIC regions everywhere, as GICv3 mandates them. [Andre: some update, reword commit message] Acked-by:
Marc Zyngier <marc.zyngier@arm.com> Signed-off-by:
Vladimir Murzin <vladimir.murzin@arm.com> Acked-by:
Marc Zyngier <marc.zyngier@arm.com> Signed-off-by:
Andre Przywara <andre.przywara@arm.com> Signed-off-by:
Will Deacon <will.deacon@arm.com>
-
Andre Przywara authored
The GICv3 ITS emulation brings some additions to the headers, so let's update kvmtool's version of the headers to Linux' v4.11-rc7-57. Acked-by:
Marc Zyngier <marc.zyngier@arm.com> Signed-off-by:
Andre Przywara <andre.przywara@arm.com> Signed-off-by:
Will Deacon <will.deacon@arm.com>
-
Andre Przywara authored
If we need to inject an MSI into the guest, we rely at the moment on a working GSI MSI routing functionality. However we can get away without IRQ routing, if the host supports MSI injection via the KVM_SIGNAL_MSI ioctl. So we try the GSI routing first, but if that fails due to a missing IRQ routing functionality, we fall back to KVM_SIGNAL_MSI (if that is supported). Acked-by:
Marc Zyngier <marc.zyngier@arm.com> Signed-off-by:
Andre Przywara <andre.przywara@arm.com> Signed-off-by:
Will Deacon <will.deacon@arm.com>
-
Andre Przywara authored
Currently we deny any VHOST_* functionality if the architecture supports guests with different endianness than the host. Most of the time even on those architectures the endianness of guest and host are the same, though, so we are denying the glory of VHOST needlessly. Switch from compile time determination to a run time scheme, which takes the actual endianness of the guest into account. For this we change the semantics of VIRTIO_ENDIAN_HOST to return the actual endianness of the host (the endianness of kvmtool at compile time, really). The actual check in vhost_net now compares this against the guest endianness. This enables vhost support on ARM and ARM64. Acked-by:
Marc Zyngier <marc.zyngier@arm.com> Signed-off-by:
Andre Przywara <andre.przywara@arm.com> Signed-off-by:
Will Deacon <will.deacon@arm.com>
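The run-time comparison can be sketched as below. `enum endian`, `host_endian()` and `vhost_allowed()` are hypothetical names; the commit derives the host endianness at compile time, whereas this sketch probes it at run time to stay self-contained.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical endianness tags, not kvmtool's actual constants. */
enum endian { ENDIAN_LE, ENDIAN_BE };

/* Probe the byte order of the machine running this code. */
static enum endian host_endian(void)
{
	const uint16_t probe = 1;

	/* On little-endian hosts the low byte is stored first. */
	return *(const uint8_t *)&probe ? ENDIAN_LE : ENDIAN_BE;
}

/* vhost is usable only when guest and host agree on byte order. */
static bool vhost_allowed(enum endian guest)
{
	return guest == host_endian();
}
```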
-
Andre Przywara authored
When we set up GSI routing to map MSIs to KVM's GSI numbers, we write the current device's MSI setup into the kernel routing table. However the device driver in the guest can use PCI configuration space accesses to change the MSI configuration (address and/or payload data). Whenever this happens after we have already set up the routing table, we must amend the previously sent data. So when MSI-X PCI config space accesses write address or payload, find the associated GSI number and the matching routing table entry and update the kernel routing table (only if the data has changed). This fixes vhost-net, where the queue's IRQFD was set up before the MSI vectors. To avoid issues, we ignore writes to the PBA region. The spec says: "Software should never write, and should only read Pending Bits. If software writes to Pending Bits, the result is undefined." Acked-by:
Marc Zyngier <marc.zyngier@arm.com> Signed-off-by:
Andre Przywara <andre.przywara@arm.com> Signed-off-by:
Will Deacon <will.deacon@arm.com>
-
Andre Przywara authored
The current IRQ routing code in x86/irq.c is mostly implementing a generic KVM interface which other architectures may use too. Move the code to set up an MSI route into the generic irq.c file and guard it with the KVM_CAP_IRQ_ROUTING capability to return an error if the kernel does not support interrupt routing. This also removes the dummy implementations for all other architectures and only leaves the x86 specific code in x86/irq.c. Acked-by:
Marc Zyngier <marc.zyngier@arm.com> Signed-off-by:
Andre Przywara <andre.przywara@arm.com> Acked-by:
Marc Zyngier <marc.zyngier@arm.com> Signed-off-by:
Will Deacon <will.deacon@arm.com>
-
Andre Przywara authored
As KVM supports only one (v)GIC per guest and it's hard to imagine that we will ever need more than that, let's simplify the FDT generation by not passing that single, constant phandle around. Let's just reference that one global symbol from enum phandles instead. Acked-by:
Marc Zyngier <marc.zyngier@arm.com> Signed-off-by:
Andre Przywara <andre.przywara@arm.com> Signed-off-by:
Will Deacon <will.deacon@arm.com>
-
Andre Przywara authored
The current implementation of fdt__alloc_phandle() suffers from being implemented in a static inline function situated in a header file. This will only create expected results within a single compilation unit. It seems a bit over the top to use a function to allocate phandles, when at the end of the day a phandle is just a unique identifier. To simplify things - especially with upcoming patches - we just introduce an enum per architecture to hold all possible phandle sources and use that instead of the dynamic allocation. Acked-by:
Marc Zyngier <marc.zyngier@arm.com> Signed-off-by:
Andre Przywara <andre.przywara@arm.com> Signed-off-by:
Will Deacon <will.deacon@arm.com>
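An enum of phandle sources might look like the sketch below. The enumerator names are hypothetical and only `enum phandles` is mentioned in the log; the point is that every compilation unit sees the same constants, which a counter inside a static inline function in a header cannot guarantee.

```c
/*
 * Hypothetical per-architecture list of phandle sources. Zero is kept
 * unused so that every real phandle is a non-zero, unique value.
 */
enum phandles {
	PHANDLE_RESERVED = 0,
	PHANDLE_GIC,  /* interrupt controller node (illustrative) */
	PHANDLE_MSI,  /* MSI controller node (illustrative) */
	PHANDLES_MAX
};
```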
-