path: root/arch/s390/pci/pci_dma.c
Age  Commit message  Author
2023-10-25  s390/pci: fix iommu bitmap allocation  Niklas Schnelle
commit c1ae1c59c8c6e0b66a718308c623e0cb394dab6b upstream. Since the fixed commits both zdev->iommu_bitmap and zdev->lazy_bitmap are allocated as vzalloc(zdev->iommu_pages / 8). The problem is that zdev->iommu_bitmap is a pointer to unsigned long but the above only yields an allocation that is a multiple of sizeof(unsigned long) which is 8 on s390x if the number of IOMMU pages is a multiple of 64. This in turn is the case only if the effective IOMMU aperture is a multiple of 64 * 4K = 256K. This is usually the case and so didn't cause visible issues since both the virt_to_phys(high_memory) reduced limit and hardware limits use nice numbers. Under KVM, and in particular with QEMU limiting the IOMMU aperture to the vfio DMA limit (default 65535), it is possible for the reported aperture not to be a multiple of 256K however. In this case we end up with an iommu_bitmap whose allocation is not a multiple of 8 causing bitmap operations to access it out of bounds. Sadly we can't just fix this in the obvious way and use bitmap_zalloc() because for large RAM systems (tested on 8 TiB) the zdev->iommu_bitmap grows too large for kmalloc(). So add our own bitmap_vzalloc() wrapper. This might be a candidate for common code, but this area of code will be replaced by the upcoming conversion to use the common code DMA API on s390 so just add a local routine. Fixes: 224593215525 ("s390/pci: use virtual memory for iommu bitmap") Fixes: 13954fd6913a ("s390/pci_dma: improve lazy flush for unmap") Cc: stable@vger.kernel.org Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
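The size mismatch described above can be illustrated with a small user-space sketch. It is not the kernel's bitmap_vzalloc(); the helper names and SKETCH_BITS_PER_LONG are hypothetical, and the sketch only compares the "bits / 8" byte count with a count rounded up to whole unsigned longs.

```c
#include <limits.h>
#include <stddef.h>

/* Bits held by one unsigned long (64 on s390x). Hypothetical name. */
#define SKETCH_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Byte count as used before the fix: one bit per IOMMU page, divided by 8. */
static size_t bitmap_bytes_naive(unsigned long nbits)
{
	return nbits / 8;
}

/* Byte count rounded up to whole unsigned longs, which bitmap helpers assume. */
static size_t bitmap_bytes_rounded(unsigned long nbits)
{
	size_t longs = (nbits + SKETCH_BITS_PER_LONG - 1) / SKETCH_BITS_PER_LONG;

	return longs * sizeof(unsigned long);
}
```

With 65535 pages the naive size is 8191 bytes, which is not a multiple of sizeof(unsigned long), while the rounded size is 8192 bytes.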
2022-09-14  s390/pci: convert high_memory to physical address  Niklas Schnelle
We use high_memory as a measure of the amount of memory available in determining the required minimum size of our IOVA space with the assumption that one rarely maps more than the available memory for DMA. In special cases like mapping significant amounts of memory more than once this can still be tuned with the s390_iommu_aperture kernel parameter. In this use case high_memory is treated as a physical address. As high_memory is a virtual address however this means we need to convert it using virt_to_phys() before use. Note that at the moment physical and virtual addresses are identical so this mismatch does not currently cause trouble. Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2021-12-06  s390/pci: use physical addresses in DMA tables  Niklas Schnelle
The entries in the DMA translation tables for our IOMMU must specify physical addresses of either the next level table or the final page to be mapped for DMA. Currently however the code simply passes the virtual addresses of both. On the other hand we still need to walk the tables via their virtual addresses so we need to do a virt_to_phys() when setting the entries and a phys_to_virt() when getting them. Similarly when passing the I/O translation anchor to the hardware we must also specify its physical address. As the DMA and IOMMU APIs we are implementing already use the correct phys_addr_t type for the address to be mapped let's also thread this through instead of treating it as just an unsigned long. Note: this currently doesn't fix a real bug, since virtual addresses are identical to physical ones. Reviewed-by: Pierre Morel <pmorel@linux.ibm.com> Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
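As a rough model of the conversion at the table boundary: entries hold physical addresses, so a virtual address is converted to physical when an entry is stored and back to virtual when the table is walked. The fixed offset and all sk_ names below are hypothetical; real entries also carry flag bits, which this sketch ignores.

```c
#include <stdint.h>

/* Hypothetical constant gap between virtual and physical addresses. */
#define SK_VIRT_OFFSET 0x1000000ULL

static uint64_t sk_virt_to_phys(uint64_t virt)
{
	return virt - SK_VIRT_OFFSET;
}

static uint64_t sk_phys_to_virt(uint64_t phys)
{
	return phys + SK_VIRT_OFFSET;
}

/* Store: the hardware-visible entry must contain a physical address. */
static void sk_set_entry(uint64_t *entry, uint64_t next_virt)
{
	*entry = sk_virt_to_phys(next_virt);
}

/* Load: software walks the tables through virtual addresses. */
static uint64_t sk_get_entry(const uint64_t *entry)
{
	return sk_phys_to_virt(*entry);
}
```

When the offset is zero, as the commit notes is currently the case, both conversions are the identity and the old code happens to work.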
2021-10-26  s390/pci: add s390_iommu_aperture kernel parameter  Niklas Schnelle
Some applications map the same memory area for DMA multiple times while also mapping significant amounts of memory. With our current DMA code these applications will run out of DMA addresses after mapping half of the available memory because the number of DMA mappings is constrained by the number of concurrently active DMA addresses we support which in turn is limited by the minimum of hardware constraints and high_memory. Limiting the number of active DMA addresses to high_memory is only a heuristic to save memory used by the iommu_bitmap and DMA page tables however. This was added under the assumption that it rarely makes sense to DMA map more than system memory. To accommodate special applications which insist on double mapping, which works on other platforms, allow specifying a factor of how many times installed memory is available as DMA address space. Use 0 as a special value to apply no constraints beyond what hardware dictates at the expense of significantly more memory use. Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Reviewed-by: Pierre Morel <pmorel@linux.ibm.com> Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
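The sizing rule described above might be sketched as follows. This is an illustration under assumptions, not the kernel code: sk_effective_aperture() and its parameters are hypothetical names, and the overflow handling is a choice made for the sketch.

```c
#include <stdint.h>

/*
 * Pick the DMA aperture size: the hardware limit capped by
 * factor * installed memory. A factor of 0 means "hardware limit only".
 */
static uint64_t sk_effective_aperture(uint64_t hw_limit, uint64_t mem_size,
				      uint32_t factor)
{
	uint64_t wanted;

	if (factor == 0)
		return hw_limit;
	/* If the product would overflow it certainly exceeds the hardware limit. */
	if (mem_size > UINT64_MAX / factor)
		return hw_limit;
	wanted = mem_size * factor;
	return wanted < hw_limit ? wanted : hw_limit;
}
```

A factor of 2 would give an application that maps all of memory twice enough DMA address space, at the cost of a larger iommu_bitmap and more DMA page tables.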
2021-09-02  Merge tag 'dma-mapping-5.15' of git://git.infradead.org/users/hch/dma-mapping  Linus Torvalds
Pull dma-mapping updates from Christoph Hellwig: - fix debugfs initialization order (Anthony Iliopoulos) - use memory_intersects() directly (Kefeng Wang) - allow to return specific errors from ->map_sg (Logan Gunthorpe, Martin Oliveira) - turn the dma_map_sg return value into an unsigned int (me) - provide a common global coherent pool implementation (me) * tag 'dma-mapping-5.15' of git://git.infradead.org/users/hch/dma-mapping: (31 commits) hexagon: use the generic global coherent pool dma-mapping: make the global coherent pool conditional dma-mapping: add a dma_init_global_coherent helper dma-mapping: simplify dma_init_coherent_memory dma-mapping: allow using the global coherent pool for !ARM ARM/nommu: use the generic dma-direct code for non-coherent devices dma-direct: add support for dma_coherent_default_memory dma-mapping: return an unsigned int from dma_map_sg{,_attrs} dma-mapping: disallow .map_sg operations from returning zero on error dma-mapping: return error code from dma_dummy_map_sg() x86/amd_gart: don't set failed sg dma_address to DMA_MAPPING_ERROR x86/amd_gart: return error code from gart_map_sg() xen: swiotlb: return error code from xen_swiotlb_map_sg() parisc: return error code from .map_sg() ops sparc/iommu: don't set failed sg dma_address to DMA_MAPPING_ERROR sparc/iommu: return error codes from .map_sg() ops s390/pci: don't set failed sg dma_address to DMA_MAPPING_ERROR s390/pci: return error code from s390_dma_map_sg() powerpc/iommu: don't set failed sg dma_address to DMA_MAPPING_ERROR powerpc/iommu: return error code from .map_sg() ops ...
2021-08-25  s390/pci: improve DMA translation init and exit  Niklas Schnelle
Currently zpci_dma_init_device()/zpci_dma_exit_device() is called as part of zpci_enable_device()/zpci_disable_device() and errors for zpci_dma_exit_device() are always ignored even if we could abort. Improve upon this by moving zpci_dma_exit_device() out of zpci_disable_device() and check for errors whenever we have a way to abort the current operation. Note that for example in zpci_event_hard_deconfigured() the device is expected to be gone so we really can't abort and proceed even in case of error. Similarly move the cc == 3 special case out of zpci_unregister_ioat() and into the callers, allowing them to abort when finding an already disabled device precludes proceeding with the operation. While we are at it log IOAT register/unregister errors in the s390 debugfs log. Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2021-08-09  s390/pci: don't set failed sg dma_address to DMA_MAPPING_ERROR  Logan Gunthorpe
Setting the ->dma_address to DMA_MAPPING_ERROR is not part of the ->map_sg calling convention, so remove it. Link: https://lore.kernel.org/linux-mips/20210716063241.GC13345@lst.de/ Suggested-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Cc: Niklas Schnelle <schnelle@linux.ibm.com> Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2021-08-09  s390/pci: return error code from s390_dma_map_sg()  Martin Oliveira
The .map_sg() op now expects an error code instead of zero on failure. So propagate the error from __s390_dma_map_sg() up. __s390_dma_map_sg() returns either -ENOMEM on allocation failure or -EINVAL which is the same as what's expected by dma_map_sgtable(). Signed-off-by: Martin Oliveira <martin.oliveira@eideticom.com> Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Acked-by: Niklas Schnelle <schnelle@linux.ibm.com> Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-10-06  dma-mapping: split <linux/dma-mapping.h>  Christoph Hellwig
Split out all the bits that are purely for dma_map_ops implementations and related code into a new <linux/dma-map-ops.h> header so that they don't get pulled into all the drivers. That also means the architecture specific <asm/dma-mapping.h> is not pulled in by <linux/dma-mapping.h> any more, which leads to missing includes that were pulled in by the x86 or arm versions in a few not overly portable drivers. Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-09-25  dma-mapping: add a new dma_alloc_pages API  Christoph Hellwig
This API is the equivalent of alloc_pages, except that the returned memory is guaranteed to be DMA addressable by the passed in device. The implementation will also be used to provide a more sensible replacement for DMA_ATTR_NON_CONSISTENT flag. Additionally dma_alloc_noncoherent is switched over to use dma_alloc_pages as its backend. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> (MIPS part)
2020-09-03  dma-mapping: introduce dma_get_seg_boundary_nr_pages()  Nicolin Chen
We found that callers of dma_get_seg_boundary mostly do an ALIGN with page mask and then do a page shift to get number of pages: ALIGN(boundary + 1, 1 << shift) >> shift However, the boundary might be as large as ULONG_MAX, which means that a device has no specific boundary limit. So either "+ 1" or passing it to ALIGN() would potentially overflow. According to kernel defines: #define ALIGN_MASK(x, mask) (((x) + (mask)) & ~(mask)) #define ALIGN(x, a) ALIGN_MASK(x, (typeof(x))(a) - 1) We can simplify the logic here into a helper function doing: ALIGN(boundary + 1, 1 << shift) >> shift = ALIGN_MASK(b + 1, (1 << s) - 1) >> s = {[b + 1 + (1 << s) - 1] & ~[(1 << s) - 1]} >> s = [b + 1 + (1 << s) - 1] >> s = [b + (1 << s)] >> s = (b >> s) + 1 This patch introduces and applies dma_get_seg_boundary_nr_pages() as an overflow-free helper for the dma_get_seg_boundary() callers to get numbers of pages. It also takes care of the NULL dev case for non-DMA API callers. Suggested-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Nicolin Chen <nicoleotsuka@gmail.com> Acked-by: Niklas Schnelle <schnelle@linux.ibm.com> Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc) Signed-off-by: Christoph Hellwig <hch@lst.de>
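The derivation above can be checked with a user-space sketch. The two functions below are illustrative stand-ins with hypothetical names, not the kernel helper itself (which also handles the NULL dev case mentioned above).

```c
#include <limits.h>

/* Open-coded form: "boundary + 1" wraps to 0 when boundary == ULONG_MAX. */
static unsigned long sk_boundary_pages_naive(unsigned long boundary,
					     unsigned int shift)
{
	unsigned long mask = (1UL << shift) - 1;

	return (((boundary + 1) + mask) & ~mask) >> shift;
}

/* Simplified form from the derivation: (b >> s) + 1, no intermediate overflow. */
static unsigned long sk_boundary_pages(unsigned long boundary,
				       unsigned int shift)
{
	return (boundary >> shift) + 1;
}
```

For a 64K boundary mask (0xffff) and a page shift of 12 both forms give 16 pages; for ULONG_MAX the open-coded form collapses to 0 while the simplified form still returns the intended page count.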
2019-09-19  Merge tag 'dma-mapping-5.4' of git://git.infradead.org/users/hch/dma-mapping  Linus Torvalds
Pull dma-mapping updates from Christoph Hellwig: - add dma-mapping and block layer helpers to take care of IOMMU merging for mmc plus subsequent fixups (Yoshihiro Shimoda) - rework handling of the pgprot bits for remapping (me) - take care of the dma direct infrastructure for swiotlb-xen (me) - improve the dma noncoherent remapping infrastructure (me) - better defaults for ->mmap, ->get_sgtable and ->get_required_mask (me) - cleanup mmaping of coherent DMA allocations (me) - various misc cleanups (Andy Shevchenko, me) * tag 'dma-mapping-5.4' of git://git.infradead.org/users/hch/dma-mapping: (41 commits) mmc: renesas_sdhi_internal_dmac: Add MMC_CAP2_MERGE_CAPABLE mmc: queue: Fix bigger segments usage arm64: use asm-generic/dma-mapping.h swiotlb-xen: merge xen_unmap_single into xen_swiotlb_unmap_page swiotlb-xen: simplify cache maintainance swiotlb-xen: use the same foreign page check everywhere swiotlb-xen: remove xen_swiotlb_dma_mmap and xen_swiotlb_dma_get_sgtable xen: remove the exports for xen_{create,destroy}_contiguous_region xen/arm: remove xen_dma_ops xen/arm: simplify dma_cache_maint xen/arm: use dev_is_dma_coherent xen/arm: consolidate page-coherent.h xen/arm: use dma-noncoherent.h calls for xen-swiotlb cache maintainance arm: remove wrappers for the generic dma remap helpers dma-mapping: introduce a dma_common_find_pages helper dma-mapping: always use VM_DMA_COHERENT for generic DMA remap vmalloc: lift the arm flag for coherent mappings to common code dma-mapping: provide a better default ->get_required_mask dma-mapping: remove the dma_declare_coherent_memory export remoteproc: don't allow modular build ...
2019-09-04  dma-mapping: explicitly wire up ->mmap and ->get_sgtable  Christoph Hellwig
While the default ->mmap and ->get_sgtable implementations work for the majority of our dma_map_ops implementations they are inherently unsafe for others that don't use the page allocator or CMA and/or use their own way of remapping not covered by the common code. So remove the defaults if these methods are not wired up, but instead wire up the default implementations for all safe instances. Fixes: e1c7e324539a ("dma-mapping: always provide the dma_map_ops based implementation") Signed-off-by: Christoph Hellwig <hch@lst.de>
2019-08-29  s390/pci: avoid using strncmp with hardcoded length  Vasily Gorbik
Command line option values passed to __setup callbacks are always null-terminated and "s390_iommu=" may only accept "strict" as value. So replace strncmp with strcmp. While at it also make s390_iommu_setup return 1, which means this command line option is handled by this callback. Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
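A minimal user-space sketch of the resulting callback shape follows. The names sketch_iommu_setup and sketch_iommu_strict are hypothetical stand-ins; the real callback is registered with __setup(), which this sketch leaves out.

```c
#include <string.h>

/* Hypothetical flag standing in for the driver's "strict" setting. */
static int sketch_iommu_strict;

/*
 * Option values are null-terminated, so a full strcmp() is enough and
 * avoids matching longer values such as "strictly" by prefix.
 */
static int sketch_iommu_setup(const char *str)
{
	if (!strcmp(str, "strict"))
		sketch_iommu_strict = 1;
	return 1; /* 1 tells the caller the option was handled here */
}
```

By contrast, strncmp(str, "strict", 6) would also accept "strictly", since only the first six characters are compared.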
2018-12-20  dma-mapping: zero memory returned from dma_alloc_*  Christoph Hellwig
If we want to map memory from the DMA allocator to userspace it must be zeroed at allocation time to prevent stale data leaks. We already do this on most common architectures, but some architectures don't do this yet. Fix them up, either by passing __GFP_ZERO when we use the normal page allocator or doing a manual memset otherwise. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> [m68k] Acked-by: Sam Ravnborg <sam@ravnborg.org> [sparc]
2018-12-06  s390: remove the mapping_error dma_map_ops method  Christoph Hellwig
S390 already returns (~(dma_addr_t)0x0) on mapping failures, so we can switch over to returning DMA_MAPPING_ERROR and let the core dma-mapping code handle the rest. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-05-08  dma-debug: move initialization to common code  Christoph Hellwig
Most mainstream architectures are using 65536 entries, so let's stick to that. If someone is really desperate to override it that can still be done through <asm/dma-mapping.h>, but I'd rather see a really good rationale for that. dma_debug_init is now called as a core_initcall, which for many architectures means much earlier, and provides dma-debug functionality earlier in the boot process. This should be safe as it only relies on the memory allocator already being available. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Marek Szyprowski <m.szyprowski@samsung.com> Reviewed-by: Robin Murphy <robin.murphy@arm.com>
2018-05-07  PCI: remove PCI_DMA_BUS_IS_PHYS  Christoph Hellwig
This was used by the ide, scsi and networking code in the past to determine if they should bounce payloads. Now that the dma mapping code always has to support dma to all physical memory (thanks to swiotlb for non-iommu systems) there is no need for this crude hack any more. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Palmer Dabbelt <palmer@sifive.com> (for riscv) Reviewed-by: Jens Axboe <axboe@kernel.dk>
2017-12-13  s390/pci: handle insufficient resources during dma tlb flush  Sebastian Ott
In a virtualized setup lazy flushing can lead to the hypervisor running out of resources when lots of guest pages need to be pinned. In this situation simply trigger a global flush to give the hypervisor a chance to free some of these resources. Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Reviewed-by: Pierre Morel <pmorel@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-11-24  s390: pci: add SPDX identifiers to the remaining files  Greg Kroah-Hartman
It's good to have SPDX identifiers in all files to make it easier to audit the kernel tree for correct licenses. Update the arch/s390/pci/ files with the correct SPDX license identifier based on the license text in the file itself. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. This work is based on a script and data from Thomas Gleixner, Philippe Ombredanne, and Kate Stewart. Cc: Sebastian Ott <sebott@linux.vnet.ibm.com> Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Kate Stewart <kstewart@linuxfoundation.org> Cc: Philippe Ombredanne <pombredanne@nexb.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-07-06  Merge tag 'dma-mapping-4.13' of git://git.infradead.org/users/hch/dma-mapping  Linus Torvalds
Pull dma-mapping infrastructure from Christoph Hellwig: "This is the first pull request for the new dma-mapping subsystem In this new subsystem we'll try to properly maintain all the generic code related to dma-mapping, and will further consolidate arch code into common helpers. This pull request contains: - removal of the DMA_ERROR_CODE macro, replacing it with calls to ->mapping_error so that the dma_map_ops instances are more self contained and can be shared across architectures (me) - removal of the ->set_dma_mask method, which duplicates the ->dma_capable one in terms of functionality, but requires more duplicate code. - various updates for the coherent dma pool and related arm code (Vladimir) - various smaller cleanups (me)" * tag 'dma-mapping-4.13' of git://git.infradead.org/users/hch/dma-mapping: (56 commits) ARM: dma-mapping: Remove traces of NOMMU code ARM: NOMMU: Set ARM_DMA_MEM_BUFFERABLE for M-class cpus ARM: NOMMU: Introduce dma operations for noMMU drivers: dma-mapping: allow dma_common_mmap() for NOMMU drivers: dma-coherent: Introduce default DMA pool drivers: dma-coherent: Account dma_pfn_offset when used with device tree dma: Take into account dma_pfn_offset dma-mapping: replace dmam_alloc_noncoherent with dmam_alloc_attrs dma-mapping: remove dmam_free_noncoherent crypto: qat - avoid an uninitialized variable warning au1100fb: remove a bogus dma_free_nonconsistent call MAINTAINERS: add entry for dma mapping helpers powerpc: merge __dma_set_mask into dma_set_mask dma-mapping: remove the set_dma_mask method powerpc/cell: use the dma_supported method for ops switching powerpc/cell: clean up fixed mapping dma_ops initialization tile: remove dma_supported and mapping_error methods xen-swiotlb: remove xen_swiotlb_set_dma_mask arm: implement ->dma_supported instead of ->set_dma_mask mips/loongson64: implement ->dma_supported instead of ->set_dma_mask ...
2017-06-28  s390: implement ->mapping_error  Christoph Hellwig
s390 can also use noop_dma_ops, and while that currently does not return errors it will do so in the future. Implementing the mapping_error method is the proper way to have per-ops error conditions. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
2017-06-28  s390/pci: improve unreg_ioat error handling  Sebastian Ott
DMA tables are freed in zpci_dma_exit_device regardless of the return code of zpci_unregister_ioat. This could lead to a use after free. On the other hand during function hot-unplug, zpci_unregister_ioat will always fail since the function is already gone. So let zpci_unregister_ioat report success when the function is gone but don't cleanup the dma table when a function could still have it in access. Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-01-24  treewide: Constify most dma_map_ops structures  Bart Van Assche
Most dma_map_ops structures are never modified. Constify these structures such that these can be write-protected. This patch has been generated as follows: git grep -l 'struct dma_map_ops' | xargs -d\\n sed -i \ -e 's/struct dma_map_ops/const struct dma_map_ops/g' \ -e 's/const struct dma_map_ops {/struct dma_map_ops {/g' \ -e 's/^const struct dma_map_ops;$/struct dma_map_ops;/' \ -e 's/const const struct dma_map_ops /const struct dma_map_ops /g'; sed -i -e 's/const \(struct dma_map_ops intel_dma_ops\)/\1/' \ $(git grep -l 'struct dma_map_ops intel_dma_ops'); sed -i -e 's/const \(struct dma_map_ops dma_iommu_ops\)/\1/' \ $(git grep -l 'struct dma_map_ops' | grep ^arch/powerpc); sed -i -e '/^struct vmd_dev {$/,/^};$/ s/const \(struct dma_map_ops[[:blank:]]dma_ops;\)/\1/' \ -e '/^static void vmd_setup_dma_ops/,/^}$/ s/const \(struct dma_map_ops \*dest\)/\1/' \ -e 's/const \(struct dma_map_ops \*dest = \&vmd->dma_ops\)/\1/' \ drivers/pci/host/*.c sed -i -e '/^void __init pci_iommu_alloc(void)$/,/^}$/ s/dma_ops->/intel_dma_ops./' arch/ia64/kernel/pci-dma.c sed -i -e 's/static const struct dma_map_ops sn_dma_ops/static struct dma_map_ops sn_dma_ops/' arch/ia64/sn/pci/pci_dma.c sed -i -e 's/(const struct dma_map_ops \*)//' drivers/misc/mic/bus/vop_bus.c Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Juergen Gross <jgross@suse.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: linux-arch@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: Russell King <linux@armlinux.org.uk> Cc: x86@kernel.org Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-13  Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux  Linus Torvalds
Pull s390 updates from Martin Schwidefsky: "The main bulk of the s390 patches for the 4.10 merge window: - Add support for the contiguous memory allocator. - The recovery for I/O errors in the dasd device driver is improved, the driver will now remove channel paths that are not working properly. - Additional fields are added to /proc/sysinfo, the extended partition name and the partition UUID. - New naming for PCI devices with system defined UIDs. - The last few remaining alloc_bootmem calls are converted to memblock. - The thread_info structure is stripped down and moved to the task_struct. The only field left in thread_info is the flags field. - Rework of the arch topology code to fix a fake numa issue. - Refactoring of the atomic primitives and add a new preempt_count implementation. - Clocksource steering for the STP sync check offsets. - The s390 specific headers are changed to make them usable with CLANG. - Bug fixes and cleanup" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (70 commits) s390/cpumf: Use configuration level indication for sampling data s390: provide memmove implementation s390: cleanup arch/s390/kernel Makefile s390: fix initrd corruptions with gcov/kcov instrumented kernels s390: exclude early C code from gcov profiling s390/dasd: channel path aware error recovery s390/dasd: extend dasd path handling s390: remove unused labels from entry.S s390/vmlogrdr: fix IUCV buffer allocation s390/crypto: unlock on error in prng_tdes_read() s390/sysinfo: show partition extended name and UUID if available s390/numa: pin all possible cpus to nodes early s390/numa: establish cpu to node mapping early s390/topology: use cpu_topology array instead of per cpu variable s390/smp: initialize cpu_present_mask in setup_arch s390/topology: always use s390 specific sched_domain_topology_level s390/smp: use smp_get_base_cpu() helper function s390/numa: always use logical cpu and core ids s390: Remove VLAIS in ptff() and clear_table() s390: fix machine check panic stack switch ...
2016-11-17  s390/pci_dma: remove memset from dma_alloc  Sebastian Ott
Get rid of a useless memset from dma_alloc. Users of dma_alloc who want zero initialized memory can get it by specifying __GFP_ZERO or use one of the zalloc variants. Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-11-17  s390/pci_dma: make lazy flush independent from the tlb_refresh bit  Sebastian Ott
We have 2 strategies to reduce the number of RPCIT instructions: * A HW feature indicated via the tlb_refresh bit allows us to omit RPCIT for invalid -> valid translation-table entry updates. * With "lazy flush" we omit RPCIT for valid -> invalid updates until we run out of dma addresses. When we have to reuse dma addresses we issue a global tlb flush using only one RPCIT instruction. Currently lazy flushing depends on tlb_refresh. Since there is no technical reason for this remove this dependency. Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-11-17  s390/pci: fix dma address calculation in map_sg  Sebastian Ott
__s390_dma_map_sg maps a dma-contiguous area. Although we only map whole pages we have to take into account that the area doesn't start or stop at a page boundary because we use the dma address to loop over the individual sg entries. Failing to do that might lead to an access of the wrong sg entry. Fixes: ee877b81c6b9 ("s390/pci_dma: improve map_sg") Reported-and-tested-by: Christoph Raisch <raisch@de.ibm.com> Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-11-11  s390: pci: don't print uninitialized data for debugging  Arnd Bergmann
gcc correctly warns about an incorrect use of the 'pa' variable in case we pass an empty scatterlist to __s390_dma_map_sg: arch/s390/pci/pci_dma.c: In function '__s390_dma_map_sg': arch/s390/pci/pci_dma.c:309:13: warning: 'pa' may be used uninitialized in this function [-Wmaybe-uninitialized] This adds a bogus initialization to the function to sanitize the debug output. I would have preferred a solution without the initialization, but I only got the report from the kbuild bot after turning on the warning again, and didn't manage to reproduce it myself. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-09-22  s390/pci_dma: improve lazy flush for unmap  Sebastian Ott
Lazy unmap (defer tlb flush after unmap until dma address reuse) can greatly reduce the number of RPCIT instructions in the best case. In reality we are often far away from the best case scenario because our implementation suffers from the following problem: To create dma addresses we maintain an iommu bitmap and a pointer into that bitmap to mark the start of the next search. That pointer moves from the start to the end of that bitmap and we issue a global tlb flush once that pointer wraps around. To prevent address reuse before we issue the tlb flush we even have to move the next pointer during unmaps - when clearing a bit > next. This could lead to a situation where we only use the rear part of that bitmap and issue more tlb flushes than expected. To fix this we no longer clear bits during unmap but maintain a 2nd bitmap which we use to mark addresses that can't be reused until we issue the global tlb flush after wrap around. Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
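The two-bitmap scheme described above might look roughly like the following toy allocator. It is a sketch under simplifying assumptions: a four-slot aperture, bool arrays instead of real bitmaps, single-page allocations, and hypothetical sk_ names throughout. The flush counter merely stands in for the global RPCIT.

```c
#include <stdbool.h>
#include <string.h>

#define SK_PAGES 4 /* tiny aperture, for illustration only */

struct sk_iommu {
	bool used[SK_PAGES];  /* stands in for iommu_bitmap */
	bool lazy[SK_PAGES];  /* stands in for lazy_bitmap: unmapped, flush pending */
	unsigned int next;    /* where the next search starts */
	unsigned int flushes; /* number of global TLB flushes issued */
};

/* First free slot in [from, SK_PAGES), or SK_PAGES if there is none. */
static unsigned int sk_find(const struct sk_iommu *io, unsigned int from)
{
	for (unsigned int i = from; i < SK_PAGES; i++)
		if (!io->used[i])
			return i;
	return SK_PAGES;
}

/* Allocate one slot; returns -1 when the aperture is exhausted. */
static int sk_alloc(struct sk_iommu *io)
{
	unsigned int i = sk_find(io, io->next);

	if (i == SK_PAGES) {
		/* Wrap around: one global flush, then pending slots become reusable. */
		io->flushes++;
		for (unsigned int j = 0; j < SK_PAGES; j++) {
			if (io->lazy[j])
				io->used[j] = false;
			io->lazy[j] = false;
		}
		i = sk_find(io, 0);
		if (i == SK_PAGES)
			return -1;
	}
	io->used[i] = true;
	io->next = i + 1;
	return (int)i;
}

/* Unmap: the slot stays marked as used and is only recorded as pending. */
static void sk_free(struct sk_iommu *io, unsigned int i)
{
	io->lazy[i] = true;
}
```

Because unmapping never clears a bit in the main bitmap or moves the search pointer, freed slots are reused only after the wrap-around flush, and the whole aperture is consumed between two flushes.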
2016-09-22  s390/pci_dma: split dma_update_trans  Sebastian Ott
Split dma_update_trans into __dma_update_trans which handles updating the dma translation tables and __dma_purge_tlb which takes care of purging associated entries in the dma translation lookaside buffer. The map_sg API makes use of this split approach by calling __dma_update_trans once per physically contiguous address range but __dma_purge_tlb only once per dma contiguous address range. This results in fewer invocations of the expensive RPCIT instruction when using map_sg. Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-09-22  s390/pci_dma: improve map_sg  Sebastian Ott
Our map_sg implementation mapped sg entries independently of each other. For ease of use and possible performance improvements this patch changes the implementation to try to map as many (likely physically non-contiguous) sglist entries as possible into a contiguous DMA segment. Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-09-22  s390/pci_dma: simplify dma address calculation  Sebastian Ott
Simplify the code we use to calculate dma addresses by putting everything related in a dma_alloc_address function. Also provide a dma_free_address counterpart. Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-09-22  s390/pci_dma: remove dma address range check  Sebastian Ott
We calculate dma addresses using an iommu bitmap. Since commit 69eea95c ("s390/pci_dma: fix DMA table corruption with > 4 TB main memory") we've made sure that addresses created using that bitmap are below the maximum reported by firmware. Thus the additional check for that address to be within range can be removed. Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-08-04dma-mapping: use unsigned long for dma_attrsKrzysztof Kozlowski
The dma-mapping core and the implementations do not change the DMA attributes passed by pointer. Thus the pointer can point to const data. However the attributes do not have to be a bitfield. Instead unsigned long will do fine: 1. This is just simpler. Both in terms of reading the code and setting attributes. Instead of initializing local attributes on the stack and passing pointer to it to dma_set_attr(), just set the bits. 2. It brings safeness and checking for const correctness because the attributes are passed by value. Semantic patches for this change (at least most of them): virtual patch virtual context @r@ identifier f, attrs; @@ f(..., - struct dma_attrs *attrs + unsigned long attrs , ...) { ... } @@ identifier r.f; @@ f(..., - NULL + 0 ) and // Options: --all-includes virtual patch virtual context @r@ identifier f, attrs; type t; @@ t f(..., struct dma_attrs *attrs); @@ identifier r.f; @@ f(..., - NULL + 0 ) Link: http://lkml.kernel.org/r/1468399300-5399-2-git-send-email-k.kozlowski@samsung.com Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com> Acked-by: Vineet Gupta <vgupta@synopsys.com> Acked-by: Robin Murphy <robin.murphy@arm.com> Acked-by: Hans-Christian Noren Egtvedt <egtvedt@samfundet.no> Acked-by: Mark Salter <msalter@redhat.com> [c6x] Acked-by: Jesper Nilsson <jesper.nilsson@axis.com> [cris] Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> [drm] Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> Acked-by: Joerg Roedel <jroedel@suse.de> [iommu] Acked-by: Fabien Dessenne <fabien.dessenne@st.com> [bdisp] Reviewed-by: Marek Szyprowski <m.szyprowski@samsung.com> [vb2-core] Acked-by: David Vrabel <david.vrabel@citrix.com> [xen] Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> [xen swiotlb] Acked-by: Joerg Roedel <jroedel@suse.de> [iommu] Acked-by: Richard Kuo <rkuo@codeaurora.org> [hexagon] Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> [m68k] Acked-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> [s390] Acked-by: Bjorn Andersson <bjorn.andersson@linaro.org> Acked-by: Hans-Christian Noren Egtvedt <egtvedt@samfundet.no> [avr32] Acked-by: Vineet Gupta <vgupta@synopsys.com> [arc] Acked-by: Robin Murphy <robin.murphy@arm.com> [arm64 and dma-iommu] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
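The by-value attribute passing described in this commit can be sketched in plain C. This is only an illustration: the `EX_ATTR_*` bits and both function names are hypothetical and are not the kernel's definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical attribute bits, named for illustration only. */
#define EX_ATTR_WEAK_ORDERING	(1UL << 0)
#define EX_ATTR_SKIP_CPU_SYNC	(1UL << 1)

/* Attributes arrive by value, so the callee cannot alter the caller's copy. */
bool ex_map_skips_sync(unsigned long attrs)
{
	return (attrs & EX_ATTR_SKIP_CPU_SYNC) != 0;
}

/* Callers simply set bits; 0 takes the place of the former NULL pointer. */
unsigned long ex_build_attrs(void)
{
	unsigned long attrs = 0;

	attrs |= EX_ATTR_SKIP_CPU_SYNC;
	assert((attrs & EX_ATTR_WEAK_ORDERING) == 0);
	return attrs;
}
```

A caller with no attributes passes a literal 0, which is what the second rule of each semantic patch rewrites NULL into.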
2016-06-13s390/pci: ensure to not cross a dma segment boundarySebastian Ott
When we use the iommu_area_alloc helper to get dma addresses we specify the boundary_size parameter but not the offset (called shift in this context). As long as the offset (start_dma) is a multiple of the boundary we're ok (on current machines start_dma always seems to be 4GB). Don't leave this to chance and specify the offset for iommu_area_alloc. Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
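Why the offset matters can be shown with a small boundary check. This is a sketch under assumptions: `ex_crosses_boundary` is a hypothetical stand-in for the check an allocator performs, all values are counted in pages, and the boundary size is assumed to be a power of two.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Returns true if a range of nr pages starting at bitmap index `index`
 * would cross a segment boundary, given that bitmap index 0 corresponds
 * to page number `shift` in the DMA address space.
 */
bool ex_crosses_boundary(unsigned long index, unsigned long nr,
			 unsigned long shift, unsigned long boundary_size)
{
	unsigned long pos;

	/* The mask arithmetic below only works for powers of two. */
	assert(boundary_size && !(boundary_size & (boundary_size - 1)));

	/* Position of the range start within its segment. */
	pos = (shift + index) & (boundary_size - 1);
	return pos + nr > boundary_size;
}
```

With a shift that is a multiple of the boundary the result is the same as with shift 0, which is why the missing offset went unnoticed; any other shift changes which ranges cross.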
2016-06-13s390/pci: ensure page aligned dma start addressSebastian Ott
We don't have an architectural guarantee on the value of the dma offset but rely on it to be at least page aligned. Enforce page alignment of start_dma. Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
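Enforcing page alignment amounts to rounding an address to a page boundary. The sketch below assumes 4 KiB pages and assumes rounding up; `ex_page_align` and `EX_PAGE_SIZE` are hypothetical names, not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

#define EX_PAGE_SIZE	UINT64_C(4096)	/* assumed page size */

/* Round addr up to the next page boundary (no-op if already aligned). */
uint64_t ex_page_align(uint64_t addr)
{
	/* Guard against wrap-around at the very top of the address space. */
	assert(addr <= UINT64_MAX - (EX_PAGE_SIZE - 1));
	return (addr + EX_PAGE_SIZE - 1) & ~(EX_PAGE_SIZE - 1);
}
```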
2016-04-21s390/pci: fix use after free in dma_initSebastian Ott
After a failure during registration of the dma_table (because of the function being in error state) we free its memory but don't reset the associated pointer to zero. When we then receive a notification from firmware (about the function being in error state) we'll try to walk and free the dma_table again. Fix this by resetting the dma_table pointer. In addition to that make sure that we free the iommu_bitmap when appropriate. Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
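The pattern behind this fix, resetting a pointer after freeing it so that a later cleanup pass is harmless, can be sketched in user-space C. The structure and function names are hypothetical, and `register_fails` merely simulates the function being in error state.

```c
#include <assert.h>
#include <stdlib.h>

struct ex_dev {
	unsigned long *dma_table;
	unsigned long *iommu_bitmap;
};

/* Safe to call more than once: freed pointers are reset to NULL. */
void ex_dma_exit(struct ex_dev *dev)
{
	assert(dev != NULL);
	free(dev->dma_table);
	dev->dma_table = NULL;
	free(dev->iommu_bitmap);
	dev->iommu_bitmap = NULL;
}

/* Returns 0 on success, -1 on failure with all resources released. */
int ex_dma_init(struct ex_dev *dev, int register_fails)
{
	dev->dma_table = calloc(512, sizeof(*dev->dma_table));
	dev->iommu_bitmap = calloc(64, sizeof(*dev->iommu_bitmap));
	if (!dev->dma_table || !dev->iommu_bitmap || register_fails) {
		ex_dma_exit(dev);	/* frees both and clears the pointers */
		return -1;
	}
	return 0;
}
```

Because `free(NULL)` is defined to do nothing, a second cleanup triggered by a later notification finds NULL pointers and does not touch freed memory.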
2016-03-20Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhostLinus Torvalds
Pull virtio/vhost updates from Michael Tsirkin: "New features, performance improvements, cleanups: - basic polling support for vhost - rework virtio to optionally use DMA API, fixing it on Xen - balloon stats gained a new entry - using the new napi_alloc_skb speeds up virtio net - virtio blk stats can now be read while another VCPU is busy inflating or deflating the balloon plus misc cleanups in various places" * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: virtio_net: replace netdev_alloc_skb_ip_align() with napi_alloc_skb() vhost_net: basic polling support vhost: introduce vhost_vq_avail_empty() vhost: introduce vhost_has_work() virtio_balloon: Allow to resize and update the balloon stats in parallel virtio_balloon: Use a workqueue instead of "vballoon" kthread virtio/s390: size of SET_IND payload virtio/s390: use dev_to_virtio vhost: rename vhost_init_used() vhost: rename cross-endian helpers virtio_blk: VIRTIO_BLK_F_WCE->VIRTIO_BLK_F_FLUSH vring: Use the DMA API on Xen virtio_pci: Use the DMA API if enabled virtio_mmio: Use the DMA API if enabled virtio: Add improved queue allocation API virtio_ring: Support DMA APIs vring: Introduce vring_use_dma_api() s390/dma: Allow per device dma ops alpha/dma: use common noop dma ops dma: Provide simple noop dma ops
2016-03-02s390/dma: Allow per device dma opsChristian Borntraeger
As virtio-ccw will have dma ops, we can no longer default to the zPCI ones. Make use of dev_archdata to keep the dma_ops per device. The pci devices now use that to override the default, and the default is changed to use the noop ops for everything that does not specify a device specific one. To compile without PCI support we will enable HAS_DMA all the time, via the default config in lib/Kconfig. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: Joerg Roedel <jroedel@suse.de> Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com> Acked-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
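The lookup described here, a per-device override with a no-op default, might look roughly like the following. All names are hypothetical and the ops structure is reduced to a label for illustration.

```c
#include <assert.h>
#include <stddef.h>

struct ex_dma_ops {
	const char *name;
};

struct ex_device {
	const struct ex_dma_ops *dma_ops;	/* NULL means "use the default" */
};

const struct ex_dma_ops ex_noop_ops = { "noop" };
const struct ex_dma_ops ex_pci_ops = { "zpci" };

/* Devices that set their own ops get them; everything else gets the noop ops. */
const struct ex_dma_ops *ex_get_dma_ops(const struct ex_device *dev)
{
	assert(dev != NULL);
	if (dev->dma_ops)
		return dev->dma_ops;
	return &ex_noop_ops;
}
```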
2016-02-23s390/pci: remove pdev pointer from arch dataSebastian Ott
For each PCI function we need to maintain arch specific data in struct zpci_dev which also contains a pointer to struct pci_dev. When a function is registered or deregistered (which is triggered by PCI common code) we need to adjust that pointer which could interfere with the machine check handler (triggered by FW) using zpci_dev->pdev. Since multiple instances of the same pdev could exist at a time this can't be solved with locking. Fix that by ditching the pdev pointer and use a bus walk to reach struct pci_dev (only one instance of a pdev can be registered at the bus at a time). Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-01-13Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linuxLinus Torvalds
Pull s390 updates from Martin Schwidefsky: "Among the traditional bug fixes and cleanups are some improvements: - A tool to generate the facility lists, generating the bit fields by hand has been a source of bugs in the past - The spinlock loop is reordered to avoid bursts of hypervisor calls - Add support for the open-for-business interface to the service element - The get_cpu call is added to the vdso - A set of tracepoints is defined for the common I/O layer - The deprecated sclp_cpi module is removed - Update default configuration" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (56 commits) s390/sclp: fix possible control register corruption s390: fix normalization bug in exception table sorting s390/configs: update default configurations s390/vdso: optimize getcpu system call s390: drop smp_mb in vdso_init s390: rename struct _lowcore to struct lowcore s390/mem_detect: use unsigned longs s390/ptrace: get rid of long longs in psw_bits s390/sysinfo: add missing SYSIB 1.2.2 multithreading fields s390: get rid of CONFIG_SCHED_MC and CONFIG_SCHED_BOOK s390/Kconfig: remove pointless 64 bit dependencies s390/dasd: fix failfast for disconnected devices s390/con3270: testing return kzalloc retval s390/hmcdrv: constify hmcdrv_ftp_ops structs s390/cio: add NULL test s390/cio: Change I/O instructions from inline to normal functions s390/cio: Introduce common I/O layer tracepoints s390/cio: Consolidate inline assemblies and related data definitions s390/cio: Fix incorrect xsch opcode specification s390/cio: Remove unused inline assemblies ...
2016-01-09[s390] page_to_phys() always returns a multiple of PAGE_SIZEAl Viro
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-11-27s390/pci_dma: fix DMA table corruption with > 4 TB main memoryGerald Schaefer
DMA addresses returned from map_page() are calculated by using an iommu bitmap plus a start_dma offset. The size of this bitmap is based on the main memory size. If we have more than (4 TB - start_dma) main memory, the DMA address calculation will also produce addresses > 4 TB. Such addresses cannot be inserted in the 3-level DMA page table, instead the entries modulo 4 TB will be overwritten. Fix this by restricting the iommu bitmap size to (4 TB - start_dma). Also set zdev->end_dma to the actual end address of the usable range, instead of the theoretical maximum as reported by the hardware, which fixes a sanity check in dma_map() and also the IOMMU API domain geometry aperture calculation. Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Reviewed-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
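The size restriction can be worked through numerically. The sketch assumes the 3-level table covers 4 TB (2^42 bytes) as stated above; the function names are hypothetical and the hardware-reported limit is left out for brevity.

```c
#include <assert.h>
#include <stdint.h>

#define EX_TABLE_LIMIT	(UINT64_C(1) << 42)	/* 4 TB covered by the table */

/* Usable IOVA space: main memory size, capped at (4 TB - start_dma). */
uint64_t ex_iommu_size(uint64_t mem_size, uint64_t start_dma)
{
	uint64_t max;

	assert(start_dma < EX_TABLE_LIMIT);
	max = EX_TABLE_LIMIT - start_dma;
	return mem_size < max ? mem_size : max;
}

/* Last usable DMA address, rather than the theoretical hardware maximum. */
uint64_t ex_end_dma(uint64_t mem_size, uint64_t start_dma)
{
	return start_dma + ex_iommu_size(mem_size, start_dma) - 1;
}
```

For example, with start_dma at 4 GB and 8 TB of main memory the usable size becomes 4 TB - 4 GB, so the highest DMA address is 4 TB - 1 and no entry wraps modulo 4 TB.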
2015-11-09s390/pci_dma: improve debugging of errors during dma mapSebastian Ott
Improve debugging to find out what went wrong during a failed dma map/unmap operation. Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2015-11-09s390/pci_dma: handle dma table failuresSebastian Ott
We use lazy allocation for translation table entries but don't handle allocation (and other) failures during translation table updates. Handle these failures and undo translation table updates when it's meaningful. Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
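One way to undo a partially applied range update is to walk back over the entries already written. The following is a simplified model with a flat table and a simulated failure point; every name in it is hypothetical, and it does not reproduce the real multi-level table walk.

```c
#include <assert.h>
#include <stddef.h>

#define EX_ENTRIES	16
#define EX_PAGE_SIZE	4096UL
#define EX_VALID	1UL
#define EX_INVALID	0UL

struct ex_table {
	unsigned long entry[EX_ENTRIES];
	size_t fail_at;		/* index at which an update fails (simulated) */
};

/* Single entry update; fails at the simulated index. */
static int ex_set_entry(struct ex_table *t, size_t idx, unsigned long pa)
{
	if (idx == t->fail_at)
		return -1;
	t->entry[idx] = pa | EX_VALID;
	return 0;
}

/* Map nr pages; on failure, invalidate the entries written so far. */
int ex_map_range(struct ex_table *t, size_t first, size_t nr, unsigned long pa)
{
	size_t i;

	assert(first <= EX_ENTRIES && nr <= EX_ENTRIES - first);
	for (i = 0; i < nr; i++) {
		if (ex_set_entry(t, first + i, pa + i * EX_PAGE_SIZE))
			goto undo;
	}
	return 0;
undo:
	while (i-- > 0)
		t->entry[first + i] = EX_INVALID;
	return -1;
}
```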
2015-11-09s390/pci_dma: unify label of invalid translation table entriesSebastian Ott
Newly allocated translation table entries are flagged as invalid and protected. If an existing translation table entry is invalidated, the protection flag is left unchanged. If a page (with invalid and protection flag set) is accessed it's undefined which type of exception we'll receive. Make sure to always set the invalid flag only. Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2015-10-06iommu/s390: Add iommu api for s390 pci devicesGerald Schaefer
This adds an IOMMU API implementation for s390 PCI devices. Reviewed-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2015-09-10dma-mapping: consolidate dma_set_maskChristoph Hellwig
Almost everyone implements dma_set_mask the same way, although sometimes that's hidden in ->set_dma_mask methods. This patch consolidates those into a common implementation that either calls ->set_dma_mask if present or otherwise uses the default implementation. Some architectures used to only call ->set_dma_mask after the initial checks, and those instances have been fixed to do the full work. h8300 implemented dma_set_mask bogusly as a no-op and has been fixed. Unfortunately some architectures overload unrelated semantics like changing the dma_ops into it so we still need to allow for an architecture override for now. [jcmvbkbc@gmail.com: fix xtensa] Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Russell King <linux@arm.linux.org.uk> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Michal Simek <monstr@monstr.eu> Cc: Jonas Bonn <jonas@southpole.se> Cc: Chris Metcalf <cmetcalf@ezchip.com> Cc: Guan Xuetao <gxt@mprc.pku.edu.cn> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Andy Shevchenko <andy.shevchenko@gmail.com> Signed-off-by: Max Filippov <jcmvbkbc@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
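The consolidated dispatch, call the method if present and otherwise run the default path, can be sketched as follows. The structures, the `required_mask` support check, and `ex_clamping_set_mask` are hypothetical simplifications for illustration, not the kernel's definitions.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

struct ex_device;

struct ex_dma_ops {
	int (*set_dma_mask)(struct ex_device *dev, uint64_t mask);
};

struct ex_device {
	const struct ex_dma_ops *ops;
	uint64_t dma_mask;
	uint64_t required_mask;	/* smallest mask the device can work with */
};

int ex_dma_supported(const struct ex_device *dev, uint64_t mask)
{
	return mask >= dev->required_mask;
}

/* Common implementation: the ops method wins, else the default checks run. */
int ex_dma_set_mask(struct ex_device *dev, uint64_t mask)
{
	assert(dev != NULL);
	if (dev->ops && dev->ops->set_dma_mask)
		return dev->ops->set_dma_mask(dev, mask);
	if (!ex_dma_supported(dev, mask))
		return -EIO;
	dev->dma_mask = mask;
	return 0;
}

/* Example override that limits every device to 32-bit addressing. */
int ex_clamping_set_mask(struct ex_device *dev, uint64_t mask)
{
	dev->dma_mask = mask & UINT64_C(0xffffffff);
	return 0;
}
```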
2015-07-22s390/pci: inline get_zdevSebastian Ott
Inline get_zdev to save ~200 bytes of kernel text for CONFIG_PCI=y. Also rename the function to to_zpci to make clear that we don't do reference counting here. Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>