Age | Commit message (Collapse) | Author |
|
When an MSI doorbell is located downstream of an IOMMU, attaching
devices to a DMA ops domain and switching on translation leads to a rude
shock when their attempt to write to the physical address returned by
the irqchip driver faults (or worse, writes into some already-mapped
buffer) and no interrupt is forthcoming.
Address this by adding a hook for relevant irqchip drivers to call from
their compose_msi_msg() callback, to swizzle the physical address with
an appropriatly-mapped IOVA for any device attached to one of our DMA
ops domains.
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
|
|
For non-aperture-based IOMMUs, the domain geometry seems to have become
the de-facto way of indicating the input address space size. That is
quite a useful thing from the users' perspective, so let's do the same.
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
|
|
With everything else now in place, fill in an of_xlate callback and the
appropriate registration to plumb into the generic configuration
machinery, and watch everything just work.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
|
|
Document how the generic "iommus" binding should be used to describe ARM
SMMU stream IDs instead of the old "mmu-masters" binding.
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
|
|
In the final step of preparation for full generic configuration support,
swap our fixed-size master_cfg for the generic iommu_fwspec. For the
legacy DT bindings, the driver simply gets to act as its own 'firmware'.
Farewell, arbitrary MAX_MASTER_STREAMIDS!
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
|
|
Stream Match Registers are one of the more awkward parts of the SMMUv2
architecture; there are typically never enough to assign one to each
stream ID in the system, and configuring them such that a single ID
matches multiple entries is catastrophically bad - at best, every
transaction raises a global fault; at worst, they go *somewhere*.
To address the former issue, we can mask ID bits such that a single
register may be used to match multiple IDs belonging to the same device
or group, but doing so also heightens the risk of the latter problem
(which can be nasty to debug).
Tackle both problems at once by replacing the simple bitmap allocator
with something much cleverer. Now that we have convenient in-memory
representations of the stream mapping table, it becomes straightforward
to properly validate new SMR entries against the current state, opening
the door to arbitrary masking and SMR sharing.
Another feature which falls out of this is that with IDs shared by
separate devices being automatically accounted for, simply associating a
group pointer with the S2CR offers appropriate group allocation almost
for free, so hook that up in the process.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
|
|
We iterate over the SMEs associated with a master config quite a lot in
various places, and are about to do so even more. Let's wrap the idiom
in a handy iterator macro before the repetition gets out of hand.
Tested-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
|
|
Simplify things somewhat by stashing our arm_smmu_device instance in
drvdata, so that it's readily available to our driver model callbacks.
Then we can excise the private list entirely, since the driver core
already has a perfectly good list of SMMU devices we can use in the one
instance we actually need to. Finally, make a further modest code saving
with the relatively new of_device_get_match_data() helper.
Tested-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
|
|
To be able to support the generic bindings and handle of_xlate() calls,
we need to be able to associate SMMUs and stream IDs directly with
devices *before* allocating IOMMU groups. Furthermore, to support real
default domains with multi-device groups we also have to handle domain
attach on a per-device basis, as the "whole group at a time" assumption
fails to properly handle subsequent devices added to a group after the
first has already triggered default domain creation and attachment.
To that end, use the now-vacant dev->archdata.iommu field for easy
config and SMMU instance lookup, and unify config management by chopping
down the platform-device-specific tree and probing the "mmu-masters"
property on-demand instead. This may add a bit of one-off overhead to
initially adding a new device, but we're about to deprecate that binding
in favour of the inherently-more-efficient generic ones anyway.
For the sake of simplicity, this patch does temporarily regress the case
of aliasing PCI devices by losing the duplicate stream ID detection that
the previous per-group config had. Stay tuned, because we'll be back to
fix that in a better and more general way momentarily...
Tested-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
|
|
Making S2CRs first-class citizens within the driver with a high-level
representation of their state offers a neat solution to a few problems:
Firstly, the information about which context a device's stream IDs are
associated with is already present by necessity in the S2CR. With that
state easily accessible we can refer directly to it and obviate the need
to track an IOMMU domain in each device's archdata (its earlier purpose
of enforcing correct attachment of multi-device groups now being handled
by the IOMMU core itself).
Secondly, the core API now deprecates explicit domain detach and expects
domain attach to move devices smoothly from one domain to another; for
SMMUv2, this notion maps directly to simply rewriting the S2CRs assigned
to the device. By giving the driver a suitable abstraction of those
S2CRs to work with, we can massively reduce the overhead of the current
heavy-handed "detach, free resources, reallocate resources, attach"
approach.
Thirdly, making the software state hardware-shaped and attached to the
SMMU instance once again makes suspend/resume of this register group
that much simpler to implement in future.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
|
|
In order to consider SMR masking, we really want to be able to validate
ID/mask pairs against existing SMR contents to prevent stream match
conflicts, which at best would cause transactions to fault unexpectedly,
and at worst lead to silent unpredictable behaviour. With our SMMU
instance data holding only an allocator bitmap, and the SMR values
themselves scattered across master configs hanging off devices which we
may have no way of finding, there's essentially no way short of digging
everything back out of the hardware. Similarly, the thought of power
management ops to support suspend/resume faces the exact same problem.
By massaging the software state into a closer shape to the underlying
hardware, everything comes together quite nicely; the allocator and the
high-level view of the data become a single centralised state which we
can easily keep track of, and to which any updates can be validated in
full before being synchronised to the hardware itself.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
|
|
Rather than assuming fixed worst-case values for stream IDs and SMR
masks, keep track of whatever implemented bits the hardware actually
reports. This also obviates the slightly questionable validation of SMR
fields in isolation - rather than aborting the whole SMMU probe for a
hardware configuration which is still architecturally valid, we can
simply refuse masters later if they try to claim an unrepresentable ID
or mask (which almost certainly implies a DT error anyway).
Acked-by: Will Deacon <will.deacon@arm.com>
Tested-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
|
|
Implement the SMMUv3 equivalent of d346180e70b9 ("iommu/arm-smmu: Treat
all device transactions as unprivileged"), so that once again those
pesky DMA controllers with their privileged instruction fetches don't
unexpectedly fault in stage 1 domains due to VMSAv8 rules.
Acked-by: Will Deacon <will.deacon@arm.com>
Tested-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
|
|
With the device <-> stream ID relationship suitably abstracted and
of_xlate() hooked up, the PCI dependency now looks, and is, entirely
arbitrary. Any bus using the of_dma_configure() mechanism will work,
so extend support to the platform and AMBA buses which do just that.
Acked-by: Will Deacon <will.deacon@arm.com>
Tested-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
|
|
Now that we can properly describe the mapping between PCI RIDs and
stream IDs via "iommu-map", and have it fed it to the driver
automatically via of_xlate(), rework the SMMUv3 driver to benefit from
that, and get rid of the current misuse of the "iommus" binding.
Since having of_xlate wired up means that masters will now be given the
appropriate DMA ops, we also need to make sure that default domains work
properly. This necessitates dispensing with the "whole group at a time"
notion for attaching to a domain, as devices which share a group get
attached to the group's default domain one by one as they are initially
probed.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
|
|
Unlike SMMUv2, SMMUv3 has no easy way to bypass unknown stream IDs,
other than allocating and filling in the entire stream table with bypass
entries, which for some configurations would waste *gigabytes* of RAM.
Otherwise, all transactions on unknown stream IDs will simply be aborted
with a C_BAD_STREAMID event.
Rather than render the system unusable in the case of an invalid DT,
avoid enabling the SMMU altogether such that everything bypasses
(though letting the explicit disable_bypass option take precedence).
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
|
|
We're about to ratify our use of the generic binding, so document it.
CC: Rob Herring <robh+dt@kernel.org>
CC: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
|
|
Introduce a common structure to hold the per-device firmware data that
most IOMMU drivers need to keep track of. This enables us to configure
much of that data from common firmware code, and consolidate a lot of
the equivalent implementations, device look-up tables, etc. which are
currently strewn across IOMMU drivers.
This will also be enable us to address the outstanding "multiple IOMMUs
on the platform bus" problem by tweaking IOMMU API calls to prefer
dev->fwspec->ops before falling back to dev->bus->iommu_ops, and thus
gracefully handle those troublesome systems which we currently cannot.
As the first user, hook up the OF IOMMU configuration mechanism. The
driver-defined nature of DT cells means that we still need the drivers
to translate and add the IDs themselves, but future users such as the
much less free-form ACPI IORT will be much simpler and self-contained.
CC: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Suggested-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
|
|
Now that we have a way to pick up the RID translation and target IOMMU,
hook up of_iommu_configure() to bring PCI devices into the of_xlate
mechanism and allow them IOMMU-backed DMA ops without the need for
driver-specific handling.
Reviewed-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
|
|
The PCI msi-map code is already doing double-duty translating IDs and
retrieving MSI parents, which unsurprisingly is the same functionality
we need for the identically-formatted PCI iommu-map property. Drag the
core parsing routine up yet another layer into the general OF-PCI code,
and further generalise it for either kind of lookup in either flavour
of map property.
Acked-by: Rob Herring <robh+dt@kernel.org>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Tested-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
|
|
The existing IOMMU bindings are able to specify the relationship between
masters and IOMMUs, but they are insufficient for describing the general
case of hotpluggable busses such as PCI where the set of masters is not
known until runtime, and the relationship between masters and IOMMUs is
a property of the integration of the system.
This patch adds a generic binding for mapping PCI devices to IOMMUs,
using a new iommu-map property (specific to PCI*) which may be used to
map devices (identified by their Requester ID) to sideband data for the
IOMMU which they master through.
Acked-by: Rob Herring <robh@kernel.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
|
|
The cmdq lock is taken whenever we issue commands into the command queue,
which can occur in IRQ context (as a result of unmap) or in process
context (as a result of a threaded IRQ handler or device probe).
This can lead to a theoretical deadlock if the interrupt handler
performing the unmap hits whilst the lock is taken, so explicitly use
the {irqsave,irqrestore} spin_lock accessors for the cmdq lock.
Tested-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
|
|
When the SMMUv3 driver attempts to send a command, it adds an entry to the
command queue. This is a circular buffer, where both the producer and
consumer have a wrap bit. When producer.index == consumer.index and
producer.wrap == consumer.wrap, the list is empty. When producer.index ==
consumer.index and producer.wrap != consumer.wrap, the list is full.
If the list is full when the driver needs to add a command, it waits for
the SMMU to consume one command, and advance the consumer pointer. The
problem is that we currently rely on "X before Y" operation to know if
entries have been consumed, which is a bit fiddly since it only makes
sense when the distance between X and Y is less than or equal to the size
of the queue. At the moment when the list is full, we use "Consumer before
Producer + 1", which is out of range and returns a value opposite to what
we expect: when the queue transitions to not full, we stay in the polling
loop and time out, printing an error.
Given that the actual bug was difficult to determine, simplify the polling
logic by relying exclusively on queue_full and queue_empty, that don't
have this range constraint. Polling the queue is now straightforward:
* When we want to add a command and the list is full, wait until it isn't
full and retry.
* After adding a sync, wait for the list to be empty before returning.
Suggested-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
|
|
Fill in the last bits of machinery required to drive a stage 1 context
bank in v7 short descriptor format. By default we'll prefer to use it
only when the CPUs are also using the same format, such that we're
guaranteed that everything will be strictly 32-bit.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
|
|
SMMUv3 only sends interrupts for event queues (EVTQ and PRIQ) when they
transition from empty to non-empty. At the moment, if the SMMU adds new
items to a queue before the event thread finished consuming a previous
batch, the driver ignores any new item. The queue is then stuck in
non-empty state and all subsequent events will be lost.
As an example, consider the following flow, where (P, C) is the SMMU view
of producer/consumer indices, and (p, c) the driver view.
P C | p c
1. SMMU appends a PPR to the PRI queue, 1 0 | 0 0
sends an MSI
2. PRIQ handler is called. 1 0 | 1 0
3. SMMU appends a PPR to the PRI queue. 2 0 | 1 0
4. PRIQ thread removes the first element. 2 1 | 1 1
5. PRIQ thread believes that the queue is empty, goes into idle
indefinitely.
To avoid this, always synchronize the producer index and drain the queue
once before leaving an event handler. In order to prevent races on the
local producer index, move all event queue handling into the threads.
Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
|
|
There is no need to call devm_free_irq when driver detach.
devres_release_all which is called after 'drv->remove' will
release all managed resources.
Signed-off-by: Peng Fan <van.freenix@gmail.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
|
|
|
|
Pull drm fixes from Dave Airlie:
"A bunch of fixes covering i915, amdgpu, one tegra and some core DRM
ones. Nothing too strange at this point"
* tag 'drm-fixes-for-4.8-rc4' of git://people.freedesktop.org/~airlied/linux: (21 commits)
drm/atomic: Don't potentially reset color_mgmt_changed on successive property updates.
drm: Protect fb_defio in drivers with CONFIG_KMS_FBDEV_EMULATION
drm/amdgpu: skip TV/CV in display parsing
drm/amdgpu: avoid a possible array overflow
drm/amdgpu: fix lru size grouping v2
drm/tegra: dsi: Enhance runtime power management
drm/i915: Fix botched merge that downgrades CSR versions.
drm/i915/skl: Ensure pipes with changed wms get added to the state
drm/i915/gen9: Only copy WM results for changed pipes to skl_hw
drm/i915/skl: Add support for the SAGV, fix underrun hangs
drm/i915/gen6+: Interpret mailbox error flags
drm/i915: Reattach comment, complete type specification
drm/i915: Unconditionally flush any chipset buffers before execbuf
drm/i915/gen9: Drop invalid WARN() during data rate calculation
drm/i915/gen9: Initialize intel_state->active_crtcs during WM sanitization (v2)
drm: Reject page_flip for !DRIVER_MODESET
drm/amdgpu: fix timeout value check in amd_sched_job_recovery
drm/amdgpu: fix sdma_v2_4_ring_test_ib
drm/amdgpu: fix amdgpu_move_blit on 32bit systems
drm/radeon: fix radeon_move_blit on 32bit systems
...
|
|
property updates.
Due to assigning the 'replaced' value instead of or'ing it,
if drm_atomic_crtc_set_property() gets called multiple times,
the last call will define the color_mgmt_changed flag, so
a non-updating call to a property can reset the flag and
prevent actual hw state updates required by preceding
property updates.
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: <stable@vger.kernel.org> # v4.6+
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Thomas Gleixner:
"A few fixes from the perf departement
- prevent a imbalanced preemption disable in the events teardown code
- prevent out of bound acces in perf userspace
- make perf tools compile with UCLIBC again
- a fix for the userspace unwinder utility"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/core: Use this_cpu_ptr() when stopping AUX events
perf evsel: Do not access outside hw cache name arrays
tools lib: Reinstate strlcpy() header guard with __UCLIBC__
perf unwind: Use addr_location::addr instead of ip for entries
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fix from Thomas Gleixner:
"A single bugfix to prevent irq remapping when the ioapic is disabled"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/apic: Do not init irq remapping if ioapic is disabled
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq fixes from Thomas Gleixner:
"This lot provides:
- plug a hotplug race in the new affinity infrastructure
- a fix for the trigger type of chained interrupts
- plug a potential memory leak in the core code
- a few fixes for ARM and MIPS GICs"
* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irqchip/mips-gic: Implement activate op for device domain
irqchip/mips-gic: Cleanup chip and handler setup
genirq/affinity: Use get/put_online_cpus around cpumask operations
genirq: Fix potential memleak when failing to get irq pm
irqchip/gicv3-its: Disable the ITS before initializing it
irqchip/gicv3: Remove disabling redistributor and group1 non-secure interrupts
irqchip/gic: Allow self-SGIs for SMP on UP configurations
genirq: Correctly configure the trigger on chained interrupts
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer fixes from Thomas Gleixner:
"A few updates for timers & co:
- prevent a livelock in the timekeeping code when debugging is
enabled
- prevent out of bounds access in the timekeeping debug code
- various fixes in clocksource drivers
- a new maintainers entry"
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
clocksource/drivers/sun4i: Clear interrupts after stopping timer in probe function
drivers/clocksource/pistachio: Fix memory corruption in init
clocksource/drivers/timer-atmel-pit: Enable mck clock
clocksource/drivers/pxa: Fix include files for compilation
MAINTAINERS: Add ARM ARCHITECTED TIMER entry
timekeeping: Cap array access in timekeeping_debug
timekeeping: Avoid taking lock in NMI path with CONFIG_DEBUG_TIMEKEEPING
|
|
Pull KVM fixes from Paolo Bonzini:
"ARM:
- fixes for ITS init issues, error handling, IRQ leakage, race
conditions
- an erratum workaround for timers
- some removal of misleading use of errors and comments
- a fix for GICv3 on 32-bit guests
MIPS:
- fix for where the guest could wrongly map the first page of
physical memory
x86:
- nested virtualization fixes"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
MIPS: KVM: Check for pfn noslot case
kvm: nVMX: fix nested tsc scaling
KVM: nVMX: postpone VMCS changes on MSR_IA32_APICBASE write
KVM: nVMX: fix msr bitmaps to prevent L2 from accessing L0 x2APIC
arm64: KVM: report configured SRE value to 32-bit world
arm64: KVM: remove misleading comment on pmu status
KVM: arm/arm64: timer: Workaround misconfigured timer interrupt
arm64: Document workaround for Cortex-A72 erratum #853709
KVM: arm/arm64: Change misleading use of is_error_pfn
KVM: arm64: ITS: avoid re-mapping LPIs
KVM: arm64: check for ITS device on MSI injection
KVM: arm64: ITS: move ITS registration into first VCPU run
KVM: arm64: vgic-its: Make updates to propbaser/pendbaser atomic
KVM: arm64: vgic-its: Plug race in vgic_put_irq
KVM: arm64: vgic-its: Handle errors from vgic_add_lpi
KVM: arm64: ITS: return 1 on successful MSI injection
|
|
Merge fixes from Andrew Morton:
"11 fixes"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
mm: silently skip readahead for DAX inodes
dax: fix device-dax region base
fs/seq_file: fix out-of-bounds read
mm: memcontrol: avoid unused function warning
mm: clarify COMPACTION Kconfig text
treewide: replace config_enabled() with IS_ENABLED() (2nd round)
printk: fix parsing of "brl=" option
soft_dirty: fix soft_dirty during THP split
sysctl: handle error writing UINT_MAX to u32 fields
get_maintainer: quiet noisy implicit -f vcs_file_exists checking
byteswap: don't use __builtin_bswap*() with sparse
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull ARM64 fix from Catalin Marinas:
"ARM64 fix to avoid potential TLB conflict when CONFIG_RANDOMIZE_BASE
is enabled"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: avoid TLB conflict with CONFIG_RANDOMIZE_BASE
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma
Pull rdma fixes from Doug Ledford:
"Round one of 4.8 rc fixes.
This should be the bulk of the -rc fixes for 4.8. I only have a few
things that are still outstanding (two ipoib bugs for which the
solution is not yet fully known, and a few queued items that came in
after my last push and I didn't want to delay this pull request for
late comers again).
Even though the patch count is kind of high, everything is minor fixes
so the overall churn is pretty low.
Summary:
- minor fixes to cxgb4
- minor fixes to mlx4
- one minor fix each to core, rxe, isert, srpt, mlx5, ocrdma, and usnic
- six or so fixes to i40iw fixes
- the rest are hfi1 fixes"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (34 commits)
i40iw: Send last streaming mode message for loopback connections
IB/srpt: Update sport->port_guid with each port refresh
RDMA/ocrdma: Fix the max_sge reported from FW
i40iw: Avoid writing to freed memory
i40iw: Fix double free of allocated_buffer
IB/mlx5: Remove superfluous include of io-mapping.h
i40iw: Do not set self-referencing pointer to NULL after kfree
i40iw: Add missing NULL check for MPA private data
iw_cxgb4: Fix cxgb4 arm CQ logic w/IB_CQ_REPORT_MISSED_EVENTS
i40iw: Add missing check for interface already open
i40iw: Protect req_resource_num update
i40iw: Change mem_resources pointer to a u8
IB/core: Use memdup_user() rather than duplicating its implementation
IB/qib: Use memdup_user() rather than duplicating its implementation
iw_cxgb4: use the MPA initiator's IRD if < our ORD
iw_cxgb4: limit IRD/ORD advertised to ULP by device max.
IB/hfi1: Fix mm_struct use after free
IB/rdmvat: Fix double vfree() in rvt_create_qp() error path
IB/hfi1: Improve J_KEY generation
IB/hfi1: Return invalid field for non-QSFP CableInfo queries
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"Here are a bunch of fixes as you can see in diffstat.
One core change in ASoC is about the unexpected unbinding error, and
another about debugfs cleanup.
The rest are wide-spread driver-specific fixes: a series of LINE6 USB
fixes, a HD-audio quirk, and various ASoC fixes including OMAP boot
fixes and Intel SKL fixes"
* tag 'sound-4.8-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (22 commits)
ALSA: hda/realtek - fix headset mic detection for MSI MS-B120
ASoC: omap-mcpdm: Fix irq resource handling
ASoC: max98371: Add terminate entry for i2c_device_id tables
ALSA: line6: Fix POD sysfs attributes segfault
ALSA: line6: Give up on the lock while URBs are released.
ALSA: line6: Remove double line6_pcm_release() after failed acquire.
ASoC: omap-abe-twl6040: Correct dmic-codec device registration
ASoC: core: Clean up DAPM before the card debugfs
ASoC: omap-mcpdm: Drop pdmclk clock handling
ASoC: atmel_ssc_dai: Don't unconditionally reset SSC on stream startup
ASoC: compress: Fix leak of a widget list in soc_compr_open_fe
ASoC: Intel: Skylake: Fix error return code in skl_probe()
ASoC: wm2000: Fix return of uninitialised varible
ASoC: Fix leak of rtd in soc_bind_dai_link
ASoC: da7213: Default to 64 BCLKs per WCLK to support all formats
ASoC: nau8825: fix static check error about semaphone control
ASoC: nau8825: fix bug in playback when suspend
ASoC: samsung: Fix clock handling in S3C24XX_UDA134X card
ASoC: simple-card-utils: add missing MODULE_xxx()
ASoC: Intel: Skylake: Check list empty while getting module info
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs
Pull btrfs fixes from Chris Mason:
"We've queued up a few different fixes in here. These range from
enospc corners to fsync and quota fixes, and a few targeted at error
handling for corrupt metadata/fuzzing"
* 'for-linus-4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
Btrfs: fix lockdep warning on deadlock against an inode's log mutex
Btrfs: detect corruption when non-root leaf has zero item
Btrfs: check btree node's nritems
btrfs: don't create or leak aliased root while cleaning up orphans
Btrfs: fix em leak in find_first_block_group
btrfs: do not background blkdev_put()
Btrfs: clarify do_chunk_alloc()'s return value
btrfs: fix fsfreeze hang caused by delayed iputs deal
btrfs: update btrfs_space_info's bytes_may_use timely
btrfs: divide btrfs_update_reserved_bytes() into two functions
btrfs: use correct offset for reloc_inode in prealloc_file_extent_cluster()
btrfs: qgroup: Fix qgroup incorrectness caused by log replay
btrfs: relocation: Fix leaking qgroups numbers on data extents
btrfs: qgroup: Refactor btrfs_qgroup_insert_dirty_extent()
btrfs: waiting on qgroup rescan should not always be interruptible
btrfs: properly track when rescan worker is running
btrfs: flush_space: treat return value of do_chunk_alloc properly
Btrfs: add ASSERT for block group's memory leak
btrfs: backref: Fix soft lockup in __merge_refs function
Btrfs: fix memory leak of reloc_root
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm
Pull dlm fix from David Teigland:
"This fixes a bug introduced by recent debugfs cleanup"
* tag 'dlm-4.8-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm:
dlm: fix malfunction of dlm_tool caused by debugfs changes
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm
Pull device mapper fixes from Mike Snitzer:
- another stable fix for DM flakey (that tweaks the previous fix that
didn't factor in expected 'drop_writes' behavior for read IO).
- a dm-log bio operation flags fix for the broader block changes that
were merged during the 4.8 merge window.
* tag 'dm-4.8-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
dm log: fix unitialized bio operation flags
dm flakey: fix reads to be issued if drop_writes configured
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull IOMMU fixes from Joerg Roedel:
"Fixes from Will Deacon:
- fix a couple of thinkos in the CMDQ error handling and
short-descriptor page table code that have been there since day one
- disable stalling faults, since they may result in hardware deadlock
- fix an accidental BUG() when passing disable_bypass=1 on the
cmdline"
* tag 'iommu-fixes-v4.8-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
iommu/arm-smmu: Don't BUG() if we find aborting STEs with disable_bypass
iommu/arm-smmu: Disable stalling faults for all endpoints
iommu/arm-smmu: Fix CMDQ error handling
iommu/io-pgtable-arm-v7s: Fix attributes when splitting blocks
|
|
Pull block fixes from Jens Axboe:
"Here's a set of block fixes for the current 4.8-rc release. This
contains:
- a fix for a secure erase regression, from Adrian.
- a fix for an mmc use-after-free bug regression, also from Adrian.
- potential zero pointer deference in bdev freezing, from Andrey.
- a race fix for blk_set_queue_dying() from Bart.
- a set of xen blkfront fixes from Bob Liu.
- three small fixes for bcache, from Eric and Kent.
- a fix for a potential invalid NVMe state transition, from Gabriel.
- blk-mq CPU offline fix, preventing us from issuing and completing a
request on the wrong queue. From me.
- revert two previous floppy changes, since they caused a user
visibile regression. A better fix is in the works.
- ensure that we don't send down bios that have more than 256
elements in them. Fixes a crash with bcache, for example. From
Ming.
- a fix for deferencing an error pointer with cgroup writeback.
Fixes a regression. From Vegard"
* 'for-linus' of git://git.kernel.dk/linux-block:
mmc: fix use-after-free of struct request
Revert "floppy: refactor open() flags handling"
Revert "floppy: fix open(O_ACCMODE) for ioctl-only open"
fs/block_dev: fix potential NULL ptr deref in freeze_bdev()
blk-mq: improve warning for running a queue on the wrong CPU
blk-mq: don't overwrite rq->mq_ctx
block: make sure a big bio is split into at most 256 bvecs
nvme: Fix nvme_get/set_features() with a NULL result pointer
bdev: fix NULL pointer dereference
xen-blkfront: free resources if xlvbd_alloc_gendisk fails
xen-blkfront: introduce blkif_set_queue_limits()
xen-blkfront: fix places not updated after introducing 64KB page granularity
bcache: pr_err: more meaningful error message when nr_stripes is invalid
bcache: RESERVE_PRIO is too small by one when prio_buckets() is a power of two.
bcache: register_bcache(): call blkdev_put() when cache_alloc() fails
block: Fix race triggered by blk_set_queue_dying()
block: Fix secure erase
nvme: Prevent controller state invalid transition
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Pull input subsystem fixes from Dmitry Torokhov:
"Simply small driver fixups"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: ads7846 - remove redundant regulator_disable call
Input: synaptics-rmi4 - fix register descriptor subpacket map construction
Input: tegra-kbc - fix inverted reset logic
Input: silead - use devm_gpiod_get
Input: i8042 - set up shared ps2_cmd_mutex for AUX ports
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull PCI fixes from Bjorn Helgaas:
"Resource management:
- Update "pci=resource_alignment" documentation (Mathias Koehrer)
MSI:
- Use positive flags in pci_alloc_irq_vectors() (Christoph Hellwig)
- Call pci_intx() when using legacy interrupts in pci_alloc_irq_vectors() (Christoph Hellwig)
Intel VMD host bridge driver:
- Fix infinite loop executing irq's (Keith Busch)"
* tag 'pci-v4.8-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
x86/PCI: VMD: Fix infinite loop executing irq's
PCI: Call pci_intx() when using legacy interrupts in pci_alloc_irq_vectors()
PCI: Use positive flags in pci_alloc_irq_vectors()
PCI: Update "pci=resource_alignment" documentation
|
|
For DAX inodes we need to be careful to never have page cache pages in
the mapping->page_tree. This radix tree should be composed only of DAX
exceptional entries and zero pages.
ltp's readahead02 test was triggering a warning because we were trying
to insert a DAX exceptional entry but found that a page cache page had
already been inserted into the tree. This page was being inserted into
the radix tree in response to a readahead(2) call.
Readahead doesn't make sense for DAX inodes, but we don't want it to
report a failure either. Instead, we just return success and don't do
any work.
Link: http://lkml.kernel.org/r/20160824221429.21158-1-ross.zwisler@linux.intel.com
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Reported-by: Jeff Moyer <jmoyer@redhat.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Jan Kara <jack@suse.com>
Cc: <stable@vger.kernel.org> [4.5+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
The data offset for a dax region needs to account for a reservation in
the resource range. Otherwise, device-dax is allowing mappings directly
into the memmap or device-info-block area with crash signatures like the
following:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
IP: get_zone_device_page+0x11/0x30
Call Trace:
follow_devmap_pmd+0x298/0x2c0
follow_page_mask+0x275/0x530
__get_user_pages+0xe3/0x750
__gfn_to_pfn_memslot+0x1b2/0x450 [kvm]
tdp_page_fault+0x130/0x280 [kvm]
kvm_mmu_page_fault+0x5f/0xf0 [kvm]
handle_ept_violation+0x94/0x180 [kvm_intel]
vmx_handle_exit+0x1d3/0x1440 [kvm_intel]
kvm_arch_vcpu_ioctl_run+0x81d/0x16a0 [kvm]
kvm_vcpu_ioctl+0x33c/0x620 [kvm]
do_vfs_ioctl+0xa2/0x5d0
SyS_ioctl+0x79/0x90
entry_SYSCALL_64_fastpath+0x1a/0xa4
Fixes: ab68f2622136 ("/dev/dax, pmem: direct access to persistent memory")
Link: http://lkml.kernel.org/r/147205536732.1606.8994275381938837346.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Reported-by: Abhilash Kumar Mulumudi <m.abhilash-kumar@hpe.com>
Reported-by: Toshi Kani <toshi.kani@hpe.com>
Tested-by: Toshi Kani <toshi.kani@hpe.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
seq_read() is a nasty piece of work, not to mention buggy.
It has (I think) an old bug which allows unprivileged userspace to read
beyond the end of m->buf.
I was getting these:
BUG: KASAN: slab-out-of-bounds in seq_read+0xcd2/0x1480 at addr ffff880116889880
Read of size 2713 by task trinity-c2/1329
CPU: 2 PID: 1329 Comm: trinity-c2 Not tainted 4.8.0-rc1+ #96
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.3-0-ge2fc41e-prebuilt.qemu-project.org 04/01/2014
Call Trace:
kasan_object_err+0x1c/0x80
kasan_report_error+0x2cb/0x7e0
kasan_report+0x4e/0x80
check_memory_region+0x13e/0x1a0
kasan_check_read+0x11/0x20
seq_read+0xcd2/0x1480
proc_reg_read+0x10b/0x260
do_loop_readv_writev.part.5+0x140/0x2c0
do_readv_writev+0x589/0x860
vfs_readv+0x7b/0xd0
do_readv+0xd8/0x2c0
SyS_readv+0xb/0x10
do_syscall_64+0x1b3/0x4b0
entry_SYSCALL64_slow_path+0x25/0x25
Object at ffff880116889100, in cache kmalloc-4096 size: 4096
Allocated:
PID = 1329
save_stack_trace+0x26/0x80
save_stack+0x46/0xd0
kasan_kmalloc+0xad/0xe0
__kmalloc+0x1aa/0x4a0
seq_buf_alloc+0x35/0x40
seq_read+0x7d8/0x1480
proc_reg_read+0x10b/0x260
do_loop_readv_writev.part.5+0x140/0x2c0
do_readv_writev+0x589/0x860
vfs_readv+0x7b/0xd0
do_readv+0xd8/0x2c0
SyS_readv+0xb/0x10
do_syscall_64+0x1b3/0x4b0
return_from_SYSCALL_64+0x0/0x6a
Freed:
PID = 0
(stack is not available)
Memory state around the buggy address:
ffff88011688a000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
ffff88011688a080: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>ffff88011688a100: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
^
ffff88011688a180: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
ffff88011688a200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==================================================================
Disabling lock debugging due to kernel taint
This seems to be the same thing that Dave Jones was seeing here:
https://lkml.org/lkml/2016/8/12/334
There are multiple issues here:
1) If we enter the function with a non-empty buffer, there is an attempt
to flush it. But it was not clearing m->from after doing so, which
means that if we try to do this flush twice in a row without any call
to traverse() in between, we are going to be reading from the wrong
place -- the splat above, fixed by this patch.
2) If there's a short write to userspace because of page faults, the
buffer may already contain multiple lines (i.e. pos has advanced by
more than 1), but we don't save the progress that was made so the
next call will output what we've already returned previously. Since
that is a much less serious issue (and I have a headache after
staring at seq_read() for the past 8 hours), I'll leave that for now.
Link: http://lkml.kernel.org/r/1471447270-32093-1-git-send-email-vegard.nossum@oracle.com
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
Reported-by: Dave Jones <davej@codemonkey.org.uk>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
A bugfix in v4.8-rc2 introduced a harmless warning when
CONFIG_MEMCG_SWAP is disabled but CONFIG_MEMCG is enabled:
mm/memcontrol.c:4085:27: error: 'mem_cgroup_id_get_online' defined but not used [-Werror=unused-function]
static struct mem_cgroup *mem_cgroup_id_get_online(struct mem_cgroup *memcg)
This moves the function inside of the #ifdef block that hides the
calling function, to avoid the warning.
Fixes: 1f47b61fb407 ("mm: memcontrol: fix swap counter leak on swapout from offline cgroup")
Link: http://lkml.kernel.org/r/20160824113733.2776701-1-arnd@arndb.de
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Vladimir Davydov <vdavydov@virtuozzo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
The current wording of the COMPACTION Kconfig help text doesn't
emphasise that disabling COMPACTION might cripple the page allocator
which relies on the compaction quite heavily for high order requests and
an unexpected OOM can happen with the lack of compaction. Make sure we
are vocal about that.
Link: http://lkml.kernel.org/r/20160823091726.GK23577@dhcp22.suse.cz
Signed-off-by: Michal Hocko <mhocko@suse.com>
Cc: Markus Trippelsdorf <markus@trippelsdorf.de>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Joonsoo Kim <js1304@gmail.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|