| Commit message (Collapse) | Author | Age | Files | Lines |
|\
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Pull networking updates from David Miller:
"Here is my initial pull request for the networking subsystem during
this merge window:
1) Support for ESN in AH (RFC 4302) from Fan Du.
2) Add full kernel doc for ethtool command structures, from Ben
Hutchings.
3) Add BCM7xxx PHY driver, from Florian Fainelli.
4) Export computed TCP rate information in netlink socket dumps, from
Eric Dumazet.
5) Allow IPSEC SA to be dumped partially using a filter, from Nicolas
Dichtel.
6) Convert many drivers to pci_enable_msix_range(), from Alexander
Gordeev.
7) Record SKB timestamps more efficiently, from Eric Dumazet.
8) Switch to microsecond resolution for TCP round trip times, also
from Eric Dumazet.
9) Clean up and fix 6lowpan fragmentation handling by making use of
the existing inet_frag api for it's implementation.
10) Add TX grant mapping to xen-netback driver, from Zoltan Kiss.
11) Auto size SKB lengths when composing netlink messages based upon
past message sizes used, from Eric Dumazet.
12) qdisc dumps can take a long time, add a cond_resched(), From Eric
Dumazet.
13) Sanitize netpoll core and drivers wrt. SKB handling semantics.
Get rid of never-used-in-tree netpoll RX handling. From Eric W
Biederman.
14) Support inter-address-family and namespace changing in VTI tunnel
driver(s). From Steffen Klassert.
15) Add Altera TSE driver, from Vince Bridgers.
16) Optimizing csum_replace2() so that it doesn't adjust the checksum
by checksumming the entire header, from Eric Dumazet.
17) Expand BPF internal implementation for faster interpreting, more
direct translations into JIT'd code, and much cleaner uses of BPF
filtering in non-socket ocntexts. From Daniel Borkmann and Alexei
Starovoitov"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1976 commits)
netpoll: Use skb_irq_freeable to make zap_completion_queue safe.
net: Add a test to see if a skb is freeable in irq context
qlcnic: Fix build failure due to undefined reference to `vxlan_get_rx_port'
net: ptp: move PTP classifier in its own file
net: sxgbe: make "core_ops" static
net: sxgbe: fix logical vs bitwise operation
net: sxgbe: sxgbe_mdio_register() frees the bus
Call efx_set_channels() before efx->type->dimension_resources()
xen-netback: disable rogue vif in kthread context
net/mlx4: Set proper build dependancy with vxlan
be2net: fix build dependency on VxLAN
mac802154: make csma/cca parameters per-wpan
mac802154: allow only one WPAN to be up at any given time
net: filter: minor: fix kdoc in __sk_run_filter
netlink: don't compare the nul-termination in nla_strcmp
can: c_can: Avoid led toggling for every packet.
can: c_can: Simplify TX interrupt cleanup
can: c_can: Store dlc private
can: c_can: Reduce register access
can: c_can: Make the code readable
...
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
This patch adds a jited flag into sk_filter struct in order to indicate
whether a filter is currently jited or not. The size of sk_filter is
not being expanded as the 32 bit 'len' member allows upper bits to be
reused since a filter can currently only grow as large as BPF_MAXINSNS.
Therefore, there's enough room also for other in future needed flags to
reuse 'len' field if necessary. The jited flag also allows for having
alternative interpreter functions running as currently, we can only
detect jit compiled filters by testing fp->bpf_func to not equal the
address of sk_run_filter().
Joint work with Alexei Starovoitov.
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Cc: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| |\
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Conflicts:
drivers/net/ethernet/marvell/mvneta.c
The mvneta.c conflict is a case of overlapping changes,
a conversion to devm_ioremap_resource() vs. a conversion
to netdev_alloc_pcpu_stats.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
The packet hash can be considered a property of the packet, not just
on RX path.
This patch changes name of rxhash and l4_rxhash skbuff fields to be
hash and l4_hash respectively. This includes changing uses of the
field in the code which don't call the access functions.
Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Mahesh Bandewar <maheshb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|\ \ \
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Pull kvm updates from Paolo Bonzini:
"PPC and ARM do not have much going on this time. Most of the cool
stuff, instead, is in s390 and (after a few releases) x86.
ARM has some caching fixes and PPC has transactional memory support in
guests. MIPS has some fixes, with more probably coming in 3.16 as
QEMU will soon get support for MIPS KVM.
For x86 there are optimizations for debug registers, which trigger on
some Windows games, and other important fixes for Windows guests. We
now expose to the guest Broadwell instruction set extensions and also
Intel MPX. There's also a fix/workaround for OS X guests, nested
virtualization features (preemption timer), and a couple kvmclock
refinements.
For s390, the main news is asynchronous page faults, together with
improvements to IRQs (floating irqs and adapter irqs) that speed up
virtio devices"
* tag 'kvm-3.15-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (96 commits)
KVM: PPC: Book3S HV: Save/restore host PMU registers that are new in POWER8
KVM: PPC: Book3S HV: Fix decrementer timeouts with non-zero TB offset
KVM: PPC: Book3S HV: Don't use kvm_memslots() in real mode
KVM: PPC: Book3S HV: Return ENODEV error rather than EIO
KVM: PPC: Book3S: Trim top 4 bits of physical address in RTAS code
KVM: PPC: Book3S HV: Add get/set_one_reg for new TM state
KVM: PPC: Book3S HV: Add transactional memory support
KVM: Specify byte order for KVM_EXIT_MMIO
KVM: vmx: fix MPX detection
KVM: PPC: Book3S HV: Fix KVM hang with CONFIG_KVM_XICS=n
KVM: PPC: Book3S: Introduce hypervisor call H_GET_TCE
KVM: PPC: Book3S HV: Fix incorrect userspace exit on ioeventfd write
KVM: s390: clear local interrupts at cpu initial reset
KVM: s390: Fix possible memory leak in SIGP functions
KVM: s390: fix calculation of idle_mask array size
KVM: s390: randomize sca address
KVM: ioapic: reinject pending interrupts on KVM_SET_IRQCHIP
KVM: Bump KVM_MAX_IRQ_ROUTES for s390
KVM: s390: irq routing for adapter interrupts.
KVM: s390: adapter interrupt sources
...
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
kvm_x86_ops is still NULL at this point. Since kvm_init_msr_list
cannot fail, it is safe to initialize it before the call.
Fixes: 93c4adc7afedf9b0ec190066d45b6d67db5270da
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Tested-by: Jet Chen <jet.chen@intel.com>
Cc: kvm@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
When doing nested virtualization, we may be able to read BNDCFGS but
still not be allowed to write to GUEST_BNDCFGS in the VMCS. Guard
writes to the field with vmx_mpx_supported(), and similarly hide the
MSR from userspace if the processor does not support the field.
We could work around this with the generic MSR save/load machinery,
but there is only a limited number of MSR save/load slots and it is
not really worthwhile to waste one for a scenario that should not
happen except in the nested virtualization case.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
This is simple to do, the "host" BNDCFGS is either 0 or the guest value.
However, both controls have to be present. We cannot provide MPX if
we only have one of the "load BNDCFGS" or "clear BNDCFGS" controls.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
XSAVE support for KVM is already using host_xcr0 & KVM_SUPPORTED_XCR0 as
a "dynamic" version of KVM_SUPPORTED_XCR0.
However, this is not enough because the MPX bits should not be presented
to the guest unless kvm_x86_ops confirms the support. So, replace all
instances of host_xcr0 & KVM_SUPPORTED_XCR0 with a new function
kvm_supported_xcr0() that also has this check.
Note that here:
if (xstate_bv & ~KVM_SUPPORTED_XCR0)
return -EINVAL;
if (xstate_bv & ~host_cr0)
return -EINVAL;
the code is equivalent to
if ((xstate_bv & ~KVM_SUPPORTED_XCR0) ||
(xstate_bv & ~host_cr0)
return -EINVAL;
i.e. "xstate_bv & (~KVM_SUPPORTED_XCR0 | ~host_cr0)" which is in turn
equal to "xstate_bv & ~(KVM_SUPPORTED_XCR0 & host_cr0)". So we should
also use the new function there.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Add emulation for 0x66 prefixed instruction of 0f 28 opcode
that has been added earlier.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
HCK memory driver test fails when testing 32-bit Windows 8.1
with baloon driver.
tracing KVM shows error:
reason EXIT_ERR rip 0x81c18326 info 0 0
x/10i 0x81c18326-20
0x0000000081c18312: add %al,(%eax)
0x0000000081c18314: add %cl,-0x7127711d(%esi)
0x0000000081c1831a: rolb $0x0,0x80ec(%ecx)
0x0000000081c18321: and $0xfffffff0,%esp
0x0000000081c18324: mov %esp,%esi
0x0000000081c18326: movaps %xmm0,(%esi)
0x0000000081c18329: movaps %xmm1,0x10(%esi)
0x0000000081c1832d: movaps %xmm2,0x20(%esi)
0x0000000081c18331: movaps %xmm3,0x30(%esi)
0x0000000081c18335: movaps %xmm4,0x40(%esi)
which points to MOVAPS instruction currently no emulated by KVM.
Fix it by adding appropriate entries to opcode table in KVM's emulator.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Both QEMU and KVM have already accumulated a significant number of
optimizations based on the hard-coded assumption that ioapic polarity
will always use the ActiveHigh convention, where the logical and
physical states of level-triggered irq lines always match (i.e.,
active(asserted) == high == 1, inactive == low == 0). QEMU guests
are expected to follow directions given via ACPI and configure the
ioapic with polarity 0 (ActiveHigh). However, even when misbehaving
guests (e.g. OS X <= 10.9) set the ioapic polarity to 1 (ActiveLow),
QEMU will still use the ActiveHigh signaling convention when
interfacing with KVM.
This patch modifies KVM to completely ignore ioapic polarity as set by
the guest OS, enabling misbehaving guests to work alongside those which
comply with the ActiveHigh polarity specified by QEMU's ACPI tables.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Gabriel L. Somlo <somlo@cmu.edu>
[Move documentation to KVM_IRQ_LINE, add ia64. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
When not running in guest-debug mode (i.e. the guest controls the debug
registers, having to take an exit for each DR access is a waste of time.
If the guest gets into a state where each context switch causes DR to be
saved and restored, this can take away as much as 40% of the execution
time from the guest.
If the guest is running with vcpu->arch.db == vcpu->arch.eff_db, we
can let it write freely to the debug registers and reload them on the
next exit. We still need to exit on the first access, so that the
KVM_DEBUGREG_WONT_EXIT flag is set in switch_db_regs; after that, further
accesses to the debug registers will not cause a vmexit.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Unlike other intercepts, debug register intercepts will be modified
in hot paths if the guest OS is bad or otherwise gets tricked into
doing so.
Avoid calling recalc_intercepts 16 times for debug registers.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
When preparing the VMCS02, the CPU-based execution controls is computed
by vmx_exec_control. Turn off DR access exits there, too, if the
KVM_DEBUGREG_WONT_EXIT bit is set in switch_db_regs.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
When not running in guest-debug mode (i.e. the guest controls the debug
registers, having to take an exit for each DR access is a waste of time.
If the guest gets into a state where each context switch causes DR to be
saved and restored, this can take away as much as 40% of the execution
time from the guest.
If the guest is running with vcpu->arch.db == vcpu->arch.eff_db, we
can let it write freely to the debug registers and reload them on the
next exit. We still need to exit on the first access, so that the
KVM_DEBUGREG_WONT_EXIT flag is set in switch_db_regs; after that, further
accesses to the debug registers will not cause a vmexit.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
When not running in guest-debug mode, the guest controls the debug
registers and having to take an exit for each DR access is a waste
of time. If the guest gets into a state where each context switch
causes DR to be saved and restored, this can take away as much as 40%
of the execution time from the guest.
After this patch, VMX- and SVM-specific code can set a flag in
switch_db_regs, telling vcpu_enter_guest that on the next exit the debug
registers might be dirty and need to be reloaded (syncing will be taken
care of by a new callback in kvm_x86_ops). This flag can be set on the
first access to a debug registers, so that multiple accesses to the
debug registers only cause one vmexit.
Note that since the guest will be able to read debug registers and
enable breakpoints in DR7, we need to ensure that they are synchronized
on entry to the guest---including DR6 that was not synced before.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
The next patch will add another bit that we can test with the
same "if".
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Currently, this works even if the bit is not in "min", because the bit is always
set in MSR_IA32_VMX_ENTRY_CTLS. Mention it for the sake of documentation, and
to avoid surprises if we later switch to MSR_IA32_VMX_TRUE_ENTRY_CTLS.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
It's no longer possible to enter enable_irq_window in guest mode when
L1 intercepts external interrupts and we are entering L2. This is now
caught in vcpu_enter_guest. So we can remove the check from the VMX
version of enable_irq_window, thus the need to return an error code from
both enable_irq_window and enable_nmi_window.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
According to SDM 27.2.3, IDT vectoring information will not be valid on
vmexits caused by external NMIs. So we have to avoid creating such
scenarios by delaying EXIT_REASON_EXCEPTION_NMI injection as long as we
have a pending interrupt because that one would be migrated to L1's IDT
vectoring info on nested exit.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
We cannot rely on the hardware-provided preemption timer support because
we are holding L2 in HLT outside non-root mode. Furthermore, emulating
the preemption will resolve tick rate errata on older Intel CPUs.
The emulation is based on hrtimer which is started on L2 entry, stopped
on L2 exit and evaluated via the new check_nested_events hook. As we no
longer rely on hardware features, we can enable both the preemption
timer support and value saving unconditionally.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Move the check for leaving L2 on pending and intercepted IRQs or NMIs
from the *_allowed handler into a dedicated callback. Invoke this
callback at the relevant points before KVM checks if IRQs/NMIs can be
injected. The callback has the task to switch from L2 to L1 if needed
and inject the proper vmexit events.
The rework fixes L2 wakeups from HLT and provides the foundation for
preemption timer emulation.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
| |\ \ \
| | | | |
| | | | |
| | | | | |
git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into kvm-next
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
commit 0061d53daf26f introduced a mechanism to execute a global clock
update for a vm. We can apply this periodically in order to propagate
host NTP corrections. Also, if all vcpus of a vm are pinned, then
without an additional trigger, no guest NTP corrections can propagate
either, as the current trigger is only vcpu cpu migration.
Signed-off-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
When we update a vcpu's local clock it may pick up an NTP correction.
We can't wait an indeterminate amount of time for other vcpus to pick
up that correction, so commit 0061d53daf26f introduced a global clock
update. However, we can't request a global clock update on every vcpu
load either (which is what happens if the tsc is marked as unstable).
The solution is to rate-limit the global clock updates. Marcelo
calculated that we should delay the global clock updates no more
than 0.1s as follows:
Assume an NTP correction c is applied to one vcpu, but not the other,
then in n seconds the delta of the vcpu system_timestamps will be
c * n. If we assume a correction of 500ppm (worst-case), then the two
vcpus will diverge 50us in 0.1s, which is a considerable amount.
Signed-off-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
Commit e504c9098ed6 (kvm, vmx: Fix lazy FPU on nested guest, 2013-11-13)
highlighted a real problem, but the fix was subtly wrong.
nested_read_cr0 is the CR0 as read by L2, but here we want to look at
the CR0 value reflecting L1's setup. In other words, L2 might think
that TS=0 (so nested_read_cr0 has the bit clear); but if L1 is actually
running it with TS=1, we should inject the fault into L1.
The effective value of CR0 in L2 is contained in vmcs12->guest_cr0, use
it.
Fixes: e504c9098ed6acd9e1079c5e10e4910724ad429f
Reported-by: Kashyap Chamarty <kchamart@redhat.com>
Reported-by: Stefan Bader <stefan.bader@canonical.com>
Tested-by: Kashyap Chamarty <kchamart@redhat.com>
Tested-by: Anthoine Bourgeois <bourgeois@bertin.fr>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
No need to scan the entire VCPU array.
Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
emulator_cmpxchg_emulated writes to guest memory, therefore it should
update the dirty bitmap accordingly.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Reviewed-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
From 44c2abca2c2eadc6f2f752b66de4acc8131880c4 Mon Sep 17 00:00:00 2001
From: Liu Jinsong <jinsong.liu@intel.com>
Date: Mon, 24 Feb 2014 18:12:31 +0800
Subject: [PATCH v5 3/3] KVM: x86: Enable Intel MPX for guest
This patch enable Intel MPX feature to guest.
Signed-off-by: Xudong Hao <xudong.hao@intel.com>
Signed-off-by: Liu Jinsong <jinsong.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
From 5d5a80cd172ea6fb51786369bcc23356b1e9e956 Mon Sep 17 00:00:00 2001
From: Liu Jinsong <jinsong.liu@intel.com>
Date: Mon, 24 Feb 2014 18:11:55 +0800
Subject: [PATCH v5 2/3] KVM: x86: add MSR_IA32_BNDCFGS to msrs_to_save
Add MSR_IA32_BNDCFGS to msrs_to_save, and corresponding logic
to kvm_get/set_msr().
Signed-off-by: Liu Jinsong <jinsong.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
From caddc009a6d2019034af8f2346b2fd37a81608d0 Mon Sep 17 00:00:00 2001
From: Liu Jinsong <jinsong.liu@intel.com>
Date: Mon, 24 Feb 2014 18:11:11 +0800
Subject: [PATCH v5 1/3] KVM: x86: Intel MPX vmx and msr handle
This patch handle vmx and msr of Intel MPX feature.
Signed-off-by: Xudong Hao <xudong.hao@intel.com>
Signed-off-by: Liu Jinsong <jinsong.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
From 00c920c96127d20d4c3bb790082700ae375c39a0 Mon Sep 17 00:00:00 2001
From: Liu Jinsong <jinsong.liu@intel.com>
Date: Fri, 21 Feb 2014 23:47:18 +0800
Subject: [PATCH] KVM: x86: Fix xsave cpuid exposing bug
EBX of cpuid(0xD, 0) is dynamic per XCR0 features enable/disable.
Bit 63 of XCR0 is reserved for future expansion.
Signed-off-by: Liu Jinsong <jinsong.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
From 0750e335eb5860b0b483e217e8a08bd743cbba16 Mon Sep 17 00:00:00 2001
From: Liu Jinsong <jinsong.liu@intel.com>
Date: Thu, 20 Feb 2014 17:39:32 +0800
Subject: [PATCH] KVM: x86: expose ADX feature to guest
ADCX and ADOX instructions perform an unsigned addition with Carry flag and
Overflow flag respectively.
Signed-off-by: Xudong Hao <xudong.hao@intel.com>
Signed-off-by: Liu Jinsong <jinsong.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
From 24ffdce9efebf13c6ed4882f714b2b57ef1141eb Mon Sep 17 00:00:00 2001
From: Liu Jinsong <jinsong.liu@intel.com>
Date: Thu, 20 Feb 2014 17:38:26 +0800
Subject: [PATCH] KVM: x86: expose new instruction RDSEED to guest
RDSEED instruction return a random number, which supplied by a
cryptographically secure, deterministic random bit generator(DRBG).
Signed-off-by: Xudong Hao <xudong.hao@intel.com>
Signed-off-by: Liu Jinsong <jinsong.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
These days hv_clock allocation is memblock based (i.e. the percpu
allocator is not involved), which means that the physical address
of each of the per-cpu hv_clock areas is guaranteed to remain
unchanged through all its lifetime and we do not need to update
its location after CPU bring-up.
Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
We should open NMI window right after an iret, but SVM exits before it.
We wanted to single step using the trap flag and then open it.
(or we could emulate the iret instead)
We don't do it since commit 3842d135ff2 (likely), because the iret exit
handler does not request an event, so NMI window remains closed until
the next exit.
Fix this by making KVM_REQ_EVENT request in the iret handler.
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
When this was introduced, kvm_flush_remote_tlbs() could be called
without holding mmu_lock. It is now acknowledged that the function
must be called before releasing mmu_lock, and all callers have already
been changed to do so.
There is no need to use smp_mb() and cmpxchg() any more.
Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
| |\ \ \ \
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD
Two new features are added by this patch set:
- The floating interrupt controller (flic) that allows us to inject,
clear and inspect non-vcpu local interrupts. This also gives us an
opportunity to fix deficiencies in our existing interrupt definitions.
- Support for asynchronous page faults via the pfault mechanism. Testing
show significant guest performance improvements under host swap.
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
By setting a Kconfig option, the architecture can control when
guest notifications will be presented by the apf backend.
There is the default batch mechanism, working as before, where the vcpu
thread should pull in this information.
Opposite to this, there is now the direct mechanism, that will push the
information to the guest.
This way s390 can use an already existing architecture interface.
Still the vcpu thread should call check_completion to cleanup leftovers.
Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
Remove unused last_kernel_ns variable.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|\ \ \ \ \ \
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 old platform removal from Peter Anvin:
"This patchset removes support for several completely obsolete
platforms, where the maintainers either have completely vanished or
acked the removal. For some of them it is questionable if there even
exists functional specimens of the hardware"
Geert Uytterhoeven apparently thought this was a April Fool's pull request ;)
* 'x86-nuke-platforms-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86, platforms: Remove NUMAQ
x86, platforms: Remove SGI Visual Workstation
x86, apic: Remove support for IBM Summit/EXA chipset
x86, apic: Remove support for ia32-based Unisys ES7000
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
The NUMAQ support seems to be unmaintained, remove it.
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: David Rientjes <rientjes@google.com>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Link: http://lkml.kernel.org/r/n/530CFD6C.7040705@zytor.com
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
The SGI Visual Workstation seems to be dead; remove support so we
don't have to continue maintaining it.
Cc: Andrey Panin <pazke@donpac.ru>
Cc: Michael Reed <mdr@sgi.com>
Link: http://lkml.kernel.org/r/530CFD6C.7040705@zytor.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
There should no longer be any IBM x440 systems or those using the
Summit/EXA chipset out in the wild, so remove support for it.
We've done our due diligence in reaching out to any contact information
listed for this chipset and no indication was given that it should be
kept around.
Signed-off-by: David Rientjes <rientjes@google.com>
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
There should no longer be any ia32-based Unisys ES7000 systems out in
the wild, so remove support for it.
We've done our due diligence in reaching out to any contact information
listed for this system and no indication was given that it should be
kept around.
Signed-off-by: David Rientjes <rientjes@google.com>
|
|\ \ \ \ \ \ \
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | | |
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 vdso changes from Peter Anvin:
"This is the revamp of the 32-bit vdso and the associated cleanups.
This adds timekeeping support to the 32-bit vdso that we already have
in the 64-bit vdso. Although 32-bit x86 is legacy, it is likely to
remain in the embedded space for a very long time to come.
This removes the traditional COMPAT_VDSO support; the configuration
variable is reused for simply removing the 32-bit vdso, which will
produce correct results but obviously suffer a performance penalty.
Only one beta version of glibc was affected, but that version was
unfortunately included in one OpenSUSE release.
This is not the end of the vdso cleanups. Stefani and Andy have
agreed to continue work for the next kernel cycle; in fact Andy has
already produced another set of cleanups that came too late for this
cycle.
An incidental, but arguably important, change is that this ensures
that unused space in the VVAR page is properly zeroed. It wasn't
before, and would contain whatever garbage was left in memory by BIOS
or the bootloader. Since the VVAR page is accessible to user space
this had the potential of information leaks"
* 'x86-vdso-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (23 commits)
x86, vdso: Fix the symbol versions on the 32-bit vDSO
x86, vdso, build: Don't rebuild 32-bit vdsos on every make
x86, vdso: Actually discard the .discard sections
x86, vdso: Fix size of get_unmapped_area()
x86, vdso: Finish removing VDSO32_PRELINK
x86, vdso: Move more vdso definitions into vdso.h
x86: Load the 32-bit vdso in place, just like the 64-bit vdsos
x86, vdso32: handle 32 bit vDSO larger one page
x86, vdso32: Disable stack protector, adjust optimizations
x86, vdso: Zero-pad the VVAR page
x86, vdso: Add 32 bit VDSO time support for 64 bit kernel
x86, vdso: Add 32 bit VDSO time support for 32 bit kernel
x86, vdso: Patch alternatives in the 32-bit VDSO
x86, vdso: Introduce VVAR marco for vdso32
x86, vdso: Cleanup __vdso_gettimeofday()
x86, vdso: Replace VVAR(vsyscall_gtod_data) by gtod macro
x86, vdso: __vdso_clock_gettime() cleanup
x86, vdso: Revamp vclock_gettime.c
mm: Add new func _install_special_mapping() to mmap.c
x86, vdso: Make vsyscall_gtod_data handling x86 generic
...
|
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | | |
The new symbols provide the same API as the 64-bit variants, so they
should have the same symbol version name. This can't break
userspace, since these symbols are new for 32-bit Linux.
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Cc: Stefani Seibold <stefani@seibold.net>
Link: http://lkml.kernel.org/r/0a869bce03d25619565b1eee7d69a4fd15fd203a.1396124118.git.luto@amacapital.net
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
|
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | | |
vdso32/vclock_gettime.o was confusing kbuild.
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Cc: Stefani Seibold <stefani@seibold.net>
Link: http://lkml.kernel.org/r/d741449340642213744dd659471a35bb970a0c4c.1395789923.git.luto@amacapital.net
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
|
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | | |
The .discard/.discard.* sections are used to generate intermediate
results for the assembler (effectively "test assembly".) The output
is waste and should not be retained.
Cc: Stefani Seibold <stefani@seibold.net>
Cc: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Link: http://lkml.kernel.org/n/tip-psizrnant8x3nrhbgvq2vekr@git.kernel.org
|