- 07 Feb, 2020 2 commits
-
-
Al Viro authored
The former contains nothing but a pointer to an array of the latter... Signed-off-by:
Al Viro <viro@zeniv.linux.org.uk>
-
Eric Sandeen authored
Unused now. Signed-off-by:
Eric Sandeen <sandeen@redhat.com> Acked-by:
David Howells <dhowells@redhat.com> Signed-off-by:
Al Viro <viro@zeniv.linux.org.uk>
-
- 06 Feb, 2020 1 commit
-
-
Christoph Hellwig authored
Since the need for a special flag to support SCSI passthrough on a block device was added in May 2017 the SCSI passthrough support in virtio-blk has been disabled. It has always been a bad idea (just ask the original author..) and we have virtio-scsi for proper passthrough. The feature also never made it into the virtio 1.0 or later specifications. Signed-off-by:
Christoph Hellwig <hch@lst.de> Signed-off-by:
Michael S. Tsirkin <mst@redhat.com> Reviewed-by:
Hannes Reinecke <hare@suse.de> Reviewed-by:
Stefan Hajnoczi <stefanha@redhat.com>
-
- 05 Feb, 2020 37 commits
-
-
Miaohe Lin authored
The function vmx_decache_cr0_guest_bits() is only called below its implementation, so its forward declaration is meaningless and should be removed. Signed-off-by:
Miaohe Lin <linmiaohe@huawei.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Sean Christopherson authored
Re-add code to mark CR4.UMIP as reserved if UMIP is not supported by the host. The UMIP handling was unintentionally dropped during a recent refactoring. Not flagging CR4.UMIP allows the guest to set its CR4.UMIP regardless of host support or userspace desires. On CPUs with UMIP support, including emulated UMIP, this allows the guest to enable UMIP against the wishes of the userspace VMM. On CPUs without any form of UMIP, this results in a failed VM-Enter due to invalid guest state. Fixes: 345599f9 ("KVM: x86: Add macro to ensure reserved cr4 bits checks stay in sync") Signed-off-by:
Sean Christopherson <sean.j.christopherson@intel.com> Reviewed-by:
Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Paolo Bonzini authored
Three of the feature bits in vmxfeatures.h have names that are different from the Intel SDM. The names have been adjusted recently in KVM but they were using the old name in the tip tree's x86/cpu branch. Adjust for consistency. Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Paolo Bonzini authored
Userspace that does not know about the AMD_IBRS bit might still allow the guest to protect itself with MSR_IA32_SPEC_CTRL using the Intel SPEC_CTRL bit. However, svm.c disallows this and will cause a #GP in the guest when writing to the MSR. Fix this by loosening the test and allowing the Intel CPUID bit, and in fact allow the AMD_STIBP bit as well since it allows writing to MSR_IA32_SPEC_CTRL too. Reported-by:
Zhiyi Guo <zhguo@redhat.com> Analyzed-by:
Dr. David Alan Gilbert <dgilbert@redhat.com> Analyzed-by:
Laszlo Ersek <lersek@redhat.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Eric Hankland authored
Correct the logic in intel_pmu_set_msr() for fixed and general purpose counters. This was recently changed to set pmc->counter without taking into account the value of pmc_read_counter(), which will be incorrect if the counter is currently running and non-zero. This change restores the old logic, which accounted for the value of currently running counters. Signed-off-by:
Eric Hankland <ehankland@google.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
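As a rough illustration of the counter fix described above, the following sketch models a counter whose visible value is a saved software value plus whatever a running hardware event has accumulated. The struct, its fields and the function names are hypothetical and are not KVM's actual code; only the idea of adjusting the saved value by the difference between the written value and the current read comes from the commit message.

```c
#include <stdint.h>

/* Hypothetical model of a virtual PMU counter (not the kernel's struct). */
struct pmc_sketch {
    uint64_t counter;  /* value saved by software */
    uint64_t running;  /* events accumulated by a running event, assumed */
};

/* The guest-visible value includes what the running event has counted. */
uint64_t pmc_sketch_read(const struct pmc_sketch *pmc)
{
    return pmc->counter + pmc->running;
}

/*
 * Write a new guest-visible value. Assigning pmc->counter = data directly
 * would ignore pmc->running; adjusting by the delta makes the next read
 * return exactly data (unsigned arithmetic wraps as intended).
 */
void pmc_sketch_write(struct pmc_sketch *pmc, uint64_t data)
{
    pmc->counter += data - pmc_sketch_read(pmc);
}
```

With a saved value of 10 and 5 running events, writing 100 leaves the saved value at 95, so a subsequent read yields 100.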
-
Vitaly Kuznetsov authored
Sane L1 hypervisors are not supposed to turn any of the unsupported VMX controls on for their guests and nested_vmx_check_controls() checks for that. This is, however, not the case for the controls which are supported on the host but are missing in enlightened VMCS and when eVMCS is in use. It would certainly be possible to add these missing checks to nested_check_vm_execution_controls()/_vm_exit_controls()/.. but it seems preferable to keep eVMCS-specific stuff in eVMCS and reduce the impact on non-eVMCS guests by doing fewer unrelated checks. Create a separate nested_evmcs_check_controls() for this purpose. Signed-off-by:
Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Vitaly Kuznetsov authored
With fine grained VMX feature enablement QEMU>=4.2 tries to do KVM_SET_MSRS with default (matching CPU model) values and in case eVMCS is also enabled, fails. It would be possible to drop VMX feature filtering completely and make this a guest's responsibility: if it decides to use eVMCS it should know which fields are available and which are not. Hyper-V mostly complies with this, however, there are some problematic controls which Hyper-V enables: SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES and VM_{ENTRY,EXIT}_LOAD_IA32_PERF_GLOBAL_CTRL. As there are no corresponding fields in eVMCS, we can't handle this properly in KVM. This is a Hyper-V issue. Move VMX controls sanitization from nested_enable_evmcs() to vmx_get_msr(), and do the bare minimum (only clear controls which are known to cause issues). This allows userspace to keep setting controls it wants and at the same time hides them from the guest. Signed-off-by:
Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Ben Gardon authored
Separate the functions for generating MMIO page table entries from the function that inserts them into the paging structure. This refactoring will facilitate changes to the MMU synchronization model to use atomic compare / exchanges (which are not guaranteed to succeed) instead of a monolithic MMU lock. No functional change expected. Tested by running kvm-unit-tests on an Intel Haswell machine. This commit introduced no new failures. Signed-off-by:
Ben Gardon <bgardon@google.com> Reviewed-by:
Oliver Upton <oupton@google.com> Reviewed-by:
Peter Shier <pshier@google.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
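The motivation given above (building an entry separately from installing it, so that installation can later become a compare/exchange that may fail) can be sketched in user-space C11. Everything here is an assumption made for illustration: the function names, the entry layout and the use of `<stdatomic.h>` are not taken from the KVM MMU.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Step 1: compute the entry value without touching the page table.
 * The layout (frame number shifted by 12, access bits in the low bits)
 * is a made-up example.
 */
uint64_t make_entry_sketch(uint64_t gfn, unsigned int access)
{
    return (gfn << 12) | (access & 0x7u);
}

/*
 * Step 2: try to install the entry. The exchange succeeds only if the
 * slot still holds expected_old; otherwise another writer got there
 * first and the caller has to decide whether to retry.
 */
bool try_install_sketch(_Atomic uint64_t *slot, uint64_t expected_old,
                        uint64_t new_entry)
{
    return atomic_compare_exchange_strong(slot, &expected_old, new_entry);
}
```

Keeping the first step free of side effects is what lets a failed second step be retried or abandoned without undoing anything.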
-
Ben Gardon authored
There are several functions which pass an access permission mask for SPTEs as an unsigned. This works, but checkpatch complains about it. Switch the occurrences of unsigned to unsigned int to satisfy checkpatch. No functional change expected. Tested by running kvm-unit-tests on an Intel Haswell machine. This commit introduced no new failures. Signed-off-by:
Ben Gardon <bgardon@google.com> Reviewed-by:
Oliver Upton <oupton@google.com> Reviewed-by:
Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Sean Christopherson authored
The blurb pertaining to the return value of nested_vmx_load_cr3() no longer matches reality. Remove it entirely, as the behavior it is attempting to document is quite obvious when reading the actual code. Signed-off-by:
Sean Christopherson <sean.j.christopherson@intel.com> Reviewed-by:
Krish Sadhukhan <krish.sadhukhan@oracle.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Sean Christopherson authored
Fold kvm_mips_comparecount_func() into kvm_mips_comparecount_wakeup() to eliminate the nondescript function name as well as its unnecessary cast of a vcpu to "unsigned long" and back to a vcpu. Presumably func() was used as a callback at some point during pre-upstream development, as wakeup() is the only user of func() and has been the only user since both were introduced by commit 669e846e ("KVM/MIPS32: MIPS arch specific APIs for KVM"). Cc: Davidlohr Bueso <dbueso@suse.de> Signed-off-by:
Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Sean Christopherson authored
Hoist kvm_mips_comparecount_wakeup() above its only user, kvm_arch_vcpu_create() to fix a compilation error due to referencing an undefined function. Fixes: d11dfed5 ("KVM: MIPS: Move all vcpu init code into kvm_arch_vcpu_create()") Reported-by:
kbuild test robot <lkp@intel.com> Signed-off-by:
Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Thadeu Lima de Souza Cascardo authored
kvm_setup_pv_tlb_flush will waste memory and print a misleading message when KVM paravirtualization is not available. The Intel SDM says that when cpuid is used with EAX higher than the maximum supported value for basic or extended function, the data for the highest supported basic function will be returned. So, in some systems, kvm_arch_para_features will return bogus data, causing kvm_setup_pv_tlb_flush to detect support for pv tlb flush. Testing for kvm_para_available will work as it checks for the hypervisor signature. Besides, when the "nopv" command line parameter is used, it should not continue either, as kvm_guest_init will not be called in that case. Signed-off-by:
Thadeu Lima de Souza Cascardo <cascardo@canonical.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Sean Christopherson authored
Take a u64 instead of an unsigned long in kvm_dr7_valid() to fix a build warning on i386 due to right-shifting a 32-bit value by 32 when checking for bits being set in dr7[63:32]. Alternatively, the warning could be resolved by rewriting the check to use an i386-friendly method, but taking a u64 fixes another oddity on 32-bit KVM. Because KVM implements natural width VMCS fields as u64s to avoid layout issues between 32-bit and 64-bit, a devious guest can stuff vmcs12->guest_dr7 with a 64-bit value even when both the guest and host are 32-bit kernels. KVM eventually drops vmcs12->guest_dr7[63:32] when propagating vmcs12->guest_dr7 to vmcs02, but ideally KVM would not rely on that behavior for correctness. Cc: Jim Mattson <jmattson@google.com> Cc: Krish Sadhukhan <krish.sadhukhan@oracle.com> Fixes: ecb697d10f70 ("KVM: nVMX: Check GUEST_DR7 on vmentry of nested guests") Reported-by:
Randy Dunlap <rdunlap@infradead.org> Signed-off-by:
Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
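The width issue behind the dr7 change can be shown with a small stand-alone sketch. The function name below is hypothetical and this is not the kernel's kvm_dr7_valid(); it only illustrates why a fixed 64-bit parameter keeps the shift well defined where `unsigned long` may be 32 bits wide.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the check described above: the value is
 * accepted only if bits 63:32 are clear. With a uint64_t argument the
 * shift by 32 is always smaller than the operand width, which is not
 * true for a 32-bit unsigned long.
 */
bool dr7_upper_bits_clear_sketch(uint64_t dr7)
{
    return (dr7 >> 32) == 0;
}
```

A caller holding a 64-bit field, such as the vmcs12->guest_dr7 value mentioned above, can pass it through unchanged, so upper bits are checked rather than silently truncated.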
-
Paolo Bonzini authored
Commit 53fafdbb ("KVM: x86: switch KVMCLOCK base to monotonic raw clock") changed kvmclock to use tkr_raw instead of tkr_mono. However, the default kvmclock_offset for the VM was still based on the monotonic clock and, if the raw clock drifted enough from the monotonic clock, this could cause a negative system_time to be written to the guest's struct pvclock. RHEL5 does not like it and (if it boots fast enough to observe a negative time value) it hangs. There is another thing to be careful about: getboottime64 returns the host boot time with tkr_mono frequency, and subtracting the tkr_raw-based kvmclock value will cause the wallclock to be off if tkr_raw drifts from tkr_mono. To avoid this, compute the wallclock delta from the current time instead of being clever and using getboottime64. Fixes: 53fafdbb ("KVM: x86: switch KVMCLOCK base to monotonic raw clock") Cc: stable@vger.kernel.org Reviewed-by:
Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Paolo Bonzini authored
We will need a copy of tk->offs_boot in the next patch. Store it and clean up the struct: instead of storing tk->tkr_xxx.base with the tk->offs_boot included, store the raw value in struct pvclock_clock and sum it in do_monotonic_raw and do_realtime. tk->tkr_xxx.xtime_nsec also moves to struct pvclock_clock. While at it, fix a (usually harmless) typo in do_monotonic_raw, which was using gtod->clock.shift instead of gtod->raw_clock.shift. Fixes: 53fafdbb ("KVM: x86: switch KVMCLOCK base to monotonic raw clock") Cc: stable@vger.kernel.org Reviewed-by:
Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Miaohe Lin authored
The declaration of nested_vmx_run() appears below its implementation, so it is meaningless and should be removed. Signed-off-by:
Miaohe Lin <linmiaohe@huawei.com> Reviewed-by:
Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Paolo Bonzini authored
SVM is now able to disable AVIC dynamically whenever the in-kernel PIT sets up an ack notifier, so we can enable it even if in-kernel IOAPIC/PIC/PIT are in use. Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Suravee Suthikulpanit authored
In-kernel IOAPIC does not receive EOI with AMD SVM AVIC since the processor accelerates writes to the APIC EOI register and does not trap if the interrupt is edge-triggered. Work around this by lazily checking for a pending APIC EOI when setting a new IOAPIC irq, and updating the IOAPIC EOI if no APIC EOI is pending. Signed-off-by:
Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Suravee Suthikulpanit authored
Refactor code for handling IOAPIC EOI for subsequent patch. There is no functional change. Signed-off-by:
Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Suravee Suthikulpanit authored
AMD SVM AVIC accelerates EOI writes and does not trap. This causes in-kernel PIT re-injection mode to fail since it relies on the irq-ack notifier mechanism. So, APICv is activated only when the in-kernel PIT is in discard mode, e.g. with the QEMU option -global kvm-pit.lost_tick_policy=discard. Also, introduce the APICV_INHIBIT_REASON_PIT_REINJ bit to be used for this reason. Suggested-by:
Paolo Bonzini <pbonzini@redhat.com> Signed-off-by:
Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Suravee Suthikulpanit authored
AMD AVIC does not support ExtINT. Therefore, AVIC must be temporarily deactivated, falling back to legacy interrupt injection via vINTR and interrupt window. Also, introduce APICV_INHIBIT_REASON_IRQWIN to be used for this reason. Signed-off-by:
Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> [Rename svm_request_update_avic to svm_toggle_avic_for_extint. - Paolo] Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Suravee Suthikulpanit authored
Since AVIC does not currently work w/ nested virtualization, deactivate AVIC for the guest if setting CPUID Fn80000001_ECX[SVM] (i.e. indicate support for SVM, which is needed for nested virtualization). Also, introduce a new APICV_INHIBIT_REASON_NESTED bit to be used for this reason. Suggested-by:
Alexander Graf <graf@amazon.com> Signed-off-by:
Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Suravee Suthikulpanit authored
Since disabling APICv has to be done for all vcpus on AMD-based systems, adopt the newly introduced kvm_request_apicv_update() interface, and introduce a new APICV_INHIBIT_REASON_HYPERV. Also, remove kvm_vcpu_deactivate_apicv() since it is no longer used. Cc: Roman Kagan <rkagan@virtuozzo.com> Signed-off-by:
Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Suravee Suthikulpanit authored
Add the necessary logic to (de)activate AVIC at runtime. Signed-off-by:
Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Suravee Suthikulpanit authored
AMD SVM AVIC needs to update the APIC backing page mapping before changing APICv mode. Introduce a struct kvm_x86_ops.pre_update_apicv_exec_ctrl function hook to be called prior to sending the KVM APICv update request to each vcpu. Signed-off-by:
Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Suravee Suthikulpanit authored
Inhibit reason bits are used to determine if APICv deactivation is applicable for a particular hardware virtualization architecture. Signed-off-by:
Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Suravee Suthikulpanit authored
Refactor avic_init_access_page() into avic_update_access_page(), since activating/deactivating AVIC requires setting/unsetting the memory region used for the virtual APIC backing page (APIC_ACCESS_PAGE_PRIVATE_MEMSLOT). Signed-off-by:
Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Suravee Suthikulpanit authored
Introduce an interface to (de)activate posted interrupts, and implement SVM hooks to toggle AMD IOMMU guest virtual APIC mode. Signed-off-by:
Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Suravee Suthikulpanit authored
Add trace points when sending request to (de)activate APICv. Suggested-by:
Alexander Graf <graf@amazon.com> Signed-off-by:
Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Suravee Suthikulpanit authored
Certain conditions require APICv to be temporarily deactivated at runtime. The current implementation only supports run-time deactivation of APICv when Hyper-V SynIC is enabled, which is not temporary. In addition, for AMD, when APICv is (de)activated at runtime, all vcpus in the VM have to operate in the same mode. Thus the requesting vcpu must notify the others. So, introduce the following: * A new KVM_REQ_APICV_UPDATE request bit * Interfaces to request all vcpus to update APICv status * A new interface to update APICv-related parameters for each vcpu Signed-off-by:
Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Paolo Bonzini authored
It is unused now. Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Suravee Suthikulpanit authored
There are several reasons for which a VM needs to deactivate APICv, e.g. disabling APICv via a parameter during module loading, or enabling Hyper-V SynIC support. Additional inhibit reasons will be introduced later on when dynamic APICv is supported. Introduce KVM APICv inhibit reason bits along with a new variable, apicv_inhibit_reasons, to help keep track of APICv state for each VM. Initially, the APICV_INHIBIT_REASON_DISABLE bit is used to indicate the case where APICv is disabled during KVM module load (e.g. insmod kvm_amd avic=0 or insmod kvm_intel enable_apicv=0). Signed-off-by:
Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> [Do not use get_enable_apicv; consider irqchip_split in svm.c. - Paolo] Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
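The inhibit-reason scheme described above amounts to a per-VM bitmask in which APICv stays active only while no reason bit is set. The sketch below illustrates that idea under stated assumptions: the reason names mirror the ones mentioned in this log, but their numeric order, the struct and the helper functions are hypothetical rather than the kernel's definitions.

```c
#include <stdbool.h>

/* Bit positions are assumed for illustration; only the names come from the log. */
enum apicv_inhibit_sketch {
    SKETCH_REASON_DISABLE,
    SKETCH_REASON_HYPERV,
    SKETCH_REASON_NESTED,
    SKETCH_REASON_IRQWIN,
    SKETCH_REASON_PIT_REINJ,
};

/* Hypothetical per-VM state holding the set of active inhibit reasons. */
struct vm_sketch {
    unsigned long apicv_inhibit_reasons;
};

/* APICv may be used only while no inhibit reason is recorded. */
bool apicv_sketch_active(const struct vm_sketch *vm)
{
    return vm->apicv_inhibit_reasons == 0;
}

/* Record or clear a single reason without disturbing the others. */
void apicv_sketch_set_inhibit(struct vm_sketch *vm,
                              enum apicv_inhibit_sketch reason, bool inhibit)
{
    if (inhibit)
        vm->apicv_inhibit_reasons |= 1UL << reason;
    else
        vm->apicv_inhibit_reasons &= ~(1UL << reason);
}
```

Tracking each reason as its own bit means a temporary inhibit, such as an interrupt window, can be lifted without re-enabling APICv while a permanent one is still in force.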
-
Suravee Suthikulpanit authored
Refactor code into a helper function for setting lapic parameters when activating/deactivating APICv, and export the function for subsequent usage. Signed-off-by:
Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Max Filippov authored
Drop redundant result moving from inline assembly, use a1 and b1 values as return value and errno value respectively. Signed-off-by:
Max Filippov <jcmvbkbc@gmail.com>
-
Max Filippov authored
Allow vectors to be either merged into the kernel .text or put at a fixed virtual address independently of XIP option. Drop option that puts vectors at a fixed offset from the kernel text. Add choice to Kconfig. Vectors at fixed virtual address may be useful for XIP-aware MTD support and for noMMU configurations with available IRAM. Configurations without VECBASE register must put their vectors at specific locations regardless of the selected option. All other configurations should happily use merged vectors. Signed-off-by:
Max Filippov <jcmvbkbc@gmail.com>
-
Max Filippov authored
There's no real dependency between SMP and XIP, so allow them to be selected together. Always define 2- and 4-argument SECTION_VECTOR macros, always use the 4-argument macro for the secondary reset vector and always define a relocation entry for it. Signed-off-by:
Max Filippov <jcmvbkbc@gmail.com>
-