author | Linus Torvalds | 2021-09-07 13:40:51 -0700 |
---|---|---|
committer | Linus Torvalds | 2021-09-07 13:40:51 -0700 |
commit | 192ad3c27a4895ee4b2fa31c5b54a932f5bb08c1 (patch) | |
tree | 5f818faaca9a304997d745aba9c19dbfedf5415a /Documentation/virt | |
parent | a2b28235335fee2586b4bd16448fb59ed6c80eef (diff) | |
parent | 109bbba5066b42431399b40e947243f049d8dc8d (diff) |
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM updates from Paolo Bonzini:
"ARM:
- Page ownership tracking between host EL1 and EL2
- Rely on userspace page tables to create large stage-2 mappings
- Fix incompatibility between pKVM and kmemleak
- Fix the PMU reset state, and improve the performance of the virtual
PMU
- Move over to the generic KVM entry code
- Address PSCI reset issues w.r.t. save/restore
- Preliminary rework for the upcoming pKVM fixed feature
- A bunch of MM cleanups
- a vGIC fix for timer spurious interrupts
- Various cleanups
s390:
- enable interpretation of specification exceptions
- fix a vcpu_idx vs vcpu_id mixup
x86:
- fast (lockless) page fault support for the new MMU
- new MMU now the default
- increased maximum allowed VCPU count
- allow inhibit IRQs on KVM_RUN while debugging guests
- let Hyper-V-enabled guests run with virtualized LAPIC as long as
they do not enable the Hyper-V "AutoEOI" feature
- fixes and optimizations for the toggling of AMD AVIC (virtualized
LAPIC)
- tuning for the case when two-dimensional paging (EPT/NPT) is
disabled
- bugfixes and cleanups, especially with respect to vCPU reset and
choosing a paging mode based on CR0/CR4/EFER
- support for 5-level page table on AMD processors
Generic:
- MMU notifier invalidation callbacks do not take mmu_lock unless
necessary
- improved caching of LRU kvm_memory_slot
- support for histogram statistics
- add statistics for halt polling and remote TLB flush requests"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (210 commits)
KVM: Drop unused kvm_dirty_gfn_invalid()
KVM: x86: Update vCPU's hv_clock before back to guest when tsc_offset is adjusted
KVM: MMU: mark role_regs and role accessors as maybe unused
KVM: MIPS: Remove a "set but not used" variable
x86/kvm: Don't enable IRQ when IRQ enabled in kvm_wait
KVM: stats: Add VM stat for remote tlb flush requests
KVM: Remove unnecessary export of kvm_{inc,dec}_notifier_count()
KVM: x86/mmu: Move lpage_disallowed_link further "down" in kvm_mmu_page
KVM: x86/mmu: Relocate kvm_mmu_page.tdp_mmu_page for better cache locality
Revert "KVM: x86: mmu: Add guest physical address check in translate_gpa()"
KVM: x86/mmu: Remove unused field mmio_cached in struct kvm_mmu_page
kvm: x86: Increase KVM_SOFT_MAX_VCPUS to 710
kvm: x86: Increase MAX_VCPUS to 1024
kvm: x86: Set KVM_MAX_VCPU_ID to 4*KVM_MAX_VCPUS
KVM: VMX: avoid running vmx_handle_exit_irqoff in case of emulation
KVM: x86/mmu: Don't freak out if pml5_root is NULL on 4-level host
KVM: s390: index kvm->arch.idle_mask by vcpu_idx
KVM: s390: Enable specification exception interpretation
KVM: arm64: Trim guest debug exception handling
KVM: SVM: Add 5-level page table support for SVM
...
Diffstat (limited to 'Documentation/virt')
-rw-r--r-- | Documentation/virt/kvm/api.rst | 36 |
-rw-r--r-- | Documentation/virt/kvm/locking.rst | 6 |
2 files changed, 34 insertions, 8 deletions
diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index c6212c2d5fe3..a6729c8cf063 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -3357,6 +3357,7 @@ flags which can include the following:
   - KVM_GUESTDBG_INJECT_DB: inject DB type exception [x86]
   - KVM_GUESTDBG_INJECT_BP: inject BP type exception [x86]
   - KVM_GUESTDBG_EXIT_PENDING: trigger an immediate guest exit [s390]
+  - KVM_GUESTDBG_BLOCKIRQ: avoid injecting interrupts/NMI/SMI [x86]

 For example KVM_GUESTDBG_USE_SW_BP indicates that software breakpoints
 are enabled in memory so we need to ensure breakpoint exceptions are
@@ -5208,6 +5209,9 @@ by a string of size ``name_size``.
     #define KVM_STATS_TYPE_CUMULATIVE (0x0 << KVM_STATS_TYPE_SHIFT)
     #define KVM_STATS_TYPE_INSTANT (0x1 << KVM_STATS_TYPE_SHIFT)
     #define KVM_STATS_TYPE_PEAK (0x2 << KVM_STATS_TYPE_SHIFT)
+    #define KVM_STATS_TYPE_LINEAR_HIST (0x3 << KVM_STATS_TYPE_SHIFT)
+    #define KVM_STATS_TYPE_LOG_HIST (0x4 << KVM_STATS_TYPE_SHIFT)
+    #define KVM_STATS_TYPE_MAX KVM_STATS_TYPE_LOG_HIST

     #define KVM_STATS_UNIT_SHIFT 4
     #define KVM_STATS_UNIT_MASK (0xF << KVM_STATS_UNIT_SHIFT)
@@ -5215,18 +5219,20 @@ by a string of size ``name_size``.
     #define KVM_STATS_UNIT_BYTES (0x1 << KVM_STATS_UNIT_SHIFT)
     #define KVM_STATS_UNIT_SECONDS (0x2 << KVM_STATS_UNIT_SHIFT)
     #define KVM_STATS_UNIT_CYCLES (0x3 << KVM_STATS_UNIT_SHIFT)
+    #define KVM_STATS_UNIT_MAX KVM_STATS_UNIT_CYCLES

     #define KVM_STATS_BASE_SHIFT 8
     #define KVM_STATS_BASE_MASK (0xF << KVM_STATS_BASE_SHIFT)
     #define KVM_STATS_BASE_POW10 (0x0 << KVM_STATS_BASE_SHIFT)
     #define KVM_STATS_BASE_POW2 (0x1 << KVM_STATS_BASE_SHIFT)
+    #define KVM_STATS_BASE_MAX KVM_STATS_BASE_POW2

     struct kvm_stats_desc {
         __u32 flags;
         __s16 exponent;
         __u16 size;
         __u32 offset;
-        __u32 unused;
+        __u32 bucket_size;
         char name[];
     };

@@ -5237,21 +5243,35 @@ The following flags are supported:
 Bits 0-3 of ``flags`` encode the type:

   * ``KVM_STATS_TYPE_CUMULATIVE``
-    The statistics data is cumulative. The value of data can only be increased.
+    The statistics reports a cumulative count. The value of data can only be increased.
     Most of the counters used in KVM are of this type.
     The corresponding ``size`` field for this type is always 1.
     All cumulative statistics data are read/write.
   * ``KVM_STATS_TYPE_INSTANT``
-    The statistics data is instantaneous. Its value can be increased or
+    The statistics reports an instantaneous value. Its value can be increased or
     decreased. This type is usually used as a measurement of some resources,
     like the number of dirty pages, the number of large pages, etc.
     All instant statistics are read only.
     The corresponding ``size`` field for this type is always 1.
   * ``KVM_STATS_TYPE_PEAK``
-    The statistics data is peak. The value of data can only be increased, and
-    represents a peak value for a measurement, for example the maximum number
+    The statistics data reports a peak value, for example the maximum number
     of items in a hash table bucket, the longest time waited and so on.
+    The value of data can only be increased.
     The corresponding ``size`` field for this type is always 1.
+  * ``KVM_STATS_TYPE_LINEAR_HIST``
+    The statistic is reported as a linear histogram. The number of
+    buckets is specified by the ``size`` field. The size of buckets is specified
+    by the ``hist_param`` field. The range of the Nth bucket (1 <= N < ``size``)
+    is [``hist_param``*(N-1), ``hist_param``*N), while the range of the last
+    bucket is [``hist_param``*(``size``-1), +INF). (+INF means positive infinity
+    value.) The bucket value indicates how many samples fell in the bucket's range.
+  * ``KVM_STATS_TYPE_LOG_HIST``
+    The statistic is reported as a logarithmic histogram. The number of
+    buckets is specified by the ``size`` field. The range of the first bucket is
+    [0, 1), while the range of the last bucket is [pow(2, ``size``-2), +INF).
+    Otherwise, The Nth bucket (1 < N < ``size``) covers
+    [pow(2, N-2), pow(2, N-1)). The bucket value indicates how many samples fell
+    in the bucket's range.

 Bits 4-7 of ``flags`` encode the unit:

@@ -5286,9 +5306,9 @@ unsigned 64bit data.
 The ``offset`` field is the offset from the start of Data Block to the start of
 the corresponding statistics data.

-The ``unused`` field is reserved for future support for other types of
-statistics data, like log/linear histogram. Its value is always 0 for the types
-defined above.
+The ``bucket_size`` field is used as a parameter for histogram statistics data.
+It is only used by linear histogram statistics data, specifying the size of a
+bucket.

 The ``name`` field is the name string of the statistics data. The name string
 starts at the end of ``struct kvm_stats_desc``. The maximum length including
diff --git a/Documentation/virt/kvm/locking.rst b/Documentation/virt/kvm/locking.rst
index 88fa495abbac..5d27da356836 100644
--- a/Documentation/virt/kvm/locking.rst
+++ b/Documentation/virt/kvm/locking.rst
@@ -21,6 +21,12 @@ The acquisition orders for mutexes are as follows:
   can be taken inside a kvm->srcu read-side critical section,
   while kvm->slots_lock cannot.

+- kvm->mn_active_invalidate_count ensures that pairs of
+  invalidate_range_start() and invalidate_range_end() callbacks
+  use the same memslots array. kvm->slots_lock and kvm->slots_arch_lock
+  are taken on the waiting side in install_new_memslots, so MMU notifiers
+  must not take either kvm->slots_lock or kvm->slots_arch_lock.
+
 On x86:

 - vcpu->mutex is taken outside kvm->arch.hyperv.hv_lock