path: root/arch/arm/mm/cache-v7.S
2022-02-09  ARM: cacheflush: avoid clobbering the frame pointer  (Ard Biesheuvel)

Thumb2 uses R7 rather than R11 as the frame pointer, and even if we rarely use a frame pointer to begin with when building in Thumb2 mode, there are cases where it is required by the compiler (Clang when inserting profiling hooks via -pg).

However, preserving and restoring the frame pointer is risky, as any unhandled exceptions raised in the meantime will produce a bogus backtrace, and it would be better not to touch the frame pointer at all. This is the case even when CONFIG_FRAME_POINTER is not set, as the unwind directive used by the unwinder may also use R7 or R11 as the unwind anchor, even if the frame pointer is not managed strictly according to the frame pointer ABI.

So let's tweak the cacheflush asm code not to clobber R7 or R11 at all, so that we can drop R7 from the clobber lists of the inline asm blocks that call these routines, and remove the code that preserves/restores R11.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
2021-03-09  ARM: 9059/1: cache-v7: get rid of mini-stack  (Ard Biesheuvel)

Now that we have reduced the number of registers that we need to preserve when calling v7_invalidate_l1 from the boot code, we can use scratch registers to preserve the remaining ones, and get rid of the mini stack entirely. This works around any issues regarding cache behavior in relation to the uncached accesses to this memory, which is hard to get right in the general case (i.e., both bare metal and under virtualization).

While at it, switch v7_invalidate_l1 to using ip as a scratch register instead of r4. This makes the function AAPCS compliant, and removes the need to stash r4 in ip across the call.

Acked-by: Nicolas Pitre <nico@fluxnic.net>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2021-03-09  ARM: 9058/1: cache-v7: refactor v7_invalidate_l1 to avoid clobbering r5/r6  (Ard Biesheuvel)

The cache invalidation code in v7_invalidate_l1 can be tweaked to re-read the associativity from CCSIDR, and keep the way identifier component in a single register that is assigned in the outer loop. This way, we need two fewer registers.

Given that the number of sets is typically much larger than the associativity, rearrange the code so that the outer loop has the smaller number of iterations, ensuring that the re-read of CCSIDR only occurs a handful of times in practice.

Fix the whitespace while at it, and update the comment to indicate that this code is no longer a clone of anything else.

Acked-by: Nicolas Pitre <nico@fluxnic.net>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2021-03-09  ARM: 9057/1: cache-v7: add missing ISB after cache level selection  (Ard Biesheuvel)

A write to CSSELR needs to complete before its results can be observed via CCSIDR, so add an ISB to ensure that this is the case.

Acked-by: Nicolas Pitre <nico@fluxnic.net>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
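A minimal sketch of the fixed sequence (register choice is illustrative; CSSELR is the cache size selection register, CCSIDR the cache size ID register):

    mcr p15, 2, r10, c0, c0, 0   @ write CSSELR to select the cache level
    isb                          @ complete the CSSELR write before...
    mrc p15, 1, r1, c0, c0, 0    @ ...reading CCSIDR for the selected level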
2019-12-08  sched/rt, ARM: Use CONFIG_PREEMPTION  (Thomas Gleixner)

CONFIG_PREEMPTION is selected by CONFIG_PREEMPT and by CONFIG_PREEMPT_RT. Both PREEMPT and PREEMPT_RT require the same functionality, which today depends on CONFIG_PREEMPT.

Switch the entry code and cache code over to use CONFIG_PREEMPTION, and add output in show_stack() for PREEMPT_RT.

[bigeasy: +traps.c]

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20191015191821.11479-2-bigeasy@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-07-08  Merge tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm  (Linus Torvalds)

Pull ARM updates from Russell King:

 - Add a "cut here" to make it clearer where oops dumps should be cut from - we already have a marker for the end of the dumps.
 - Add logging severity to show_pte()
 - Drop unnecessary common-page-size linker flag
 - Errata workarounds for Cortex A12 857271, Cortex A17 857272 and Cortex A7 814220.
 - Remove some unused variables that had started to provoke a compiler warning.

* tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm:
  ARM: 8863/1: stm32: select ARM errata 814220
  ARM: 8862/1: errata: 814220-B-Cache maintenance by set/way operations can execute out of order
  ARM: 8865/1: mm: remove unused variables
  ARM: 8864/1: Add workaround for I-Cache line size mismatch between CPU cores
  ARM: 8861/1: errata: Workaround errata A12 857271 / A17 857272
  ARM: 8860/1: VDSO: Drop implicit common-page-size linker flag
  ARM: arrange show_pte() to issue severity-based messages
  ARM: add "8<--- cut here ---" to kernel dumps
2019-06-21  ARM: 8862/1: errata: 814220-B-Cache maintenance by set/way operations can execute out of order  (Benjamin Gaignard)

The ARMv7 ARM states that all cache and branch predictor maintenance operations that do not specify an address execute, relative to each other, in program order. However, because of this erratum, an L2 set/way cache maintenance operation can overtake an L1 set/way cache maintenance operation, which would cause data corruption. This erratum affects the Cortex-A7 and is present in r0p2, r0p3, r0p4 and r0p5.

This patch adds the software workaround: a DSB before changing cache levels, as described in the documentation for ARM erratum 814220.

Signed-off-by: Jason Liu <r64343@freescale.com>
Signed-off-by: Benjamin Gaignard <benjamin.gaignard@linaro.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
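A sketch of the workaround's shape in the set/way flush loop, assuming the CONFIG_ARM_ERRATA_814220 option this series introduces:

    @ finished issuing DCCISW ops for the current cache level
    #ifdef CONFIG_ARM_ERRATA_814220
        dsb                          @ keep L1 and L2 set/way maintenance in order
    #endif
    @ move on to the next level (write CSSELR, isb, read CCSIDR, ...)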
2019-06-20  ARM: 8864/1: Add workaround for I-Cache line size mismatch between CPU cores  (Marek Szyprowski)

Some big.LITTLE systems have an I-Cache line size mismatch between LITTLE and big cores. This patch adds a workaround for proper I-Cache support on such systems. Without it, some classes of userspace code (typically self-modifying) might suffer from random SIGILL failures.

A similar workaround already exists for the ARM64 architecture; it was added by commit 116c81f427ff ("arm64: Work around systems with mismatched cache line sizes").

Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2019-06-19  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500  (Thomas Gleixner)

Based on 2 normalized pattern(s):

  this program is free software you can redistribute it and or modify it under the terms of the gnu general public license version 2 as published by the free software foundation

  this program is free software you can redistribute it and or modify it under the terms of the gnu general public license version 2 as published by the free software foundation

extracted by the scancode license scanner the SPDX license identifier

  GPL-2.0-only

has been chosen to replace the boilerplate/reference in 4122 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Enrico Weigelt <info@metux.net>
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Allison Randal <allison@lohutok.net>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190604081206.933168790@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-12-04  ARM: 8814/1: mm: improve/fix ARM v7_dma_inv_range() unaligned address handling  (Chris Cole)

This patch addresses possible memory corruption when v7_dma_inv_range(start_address, end_address) address parameters are not aligned to whole cache lines. This function issues "invalidate" cache management operations to all cache lines from start_address (inclusive) to end_address (exclusive). When start_address and/or end_address are not aligned, the start and/or end cache lines are first issued a "clean & invalidate" operation. The assumption is that this is done to ensure that any dirty data addresses outside the address range (but part of the first or last cache lines) are cleaned/flushed so that data is not lost, which could happen if just an invalidate were issued.

The problem is that these first/last partial cache lines are issued "clean & invalidate" and then "invalidate". This second "invalidate" is not required and, worse, can cause "lost" writes to addresses outside the address range but part of the cache line. If another component writes to its part of the cache line between the "clean & invalidate" and "invalidate" operations, the write can get lost. This fix is to remove the extra "invalidate" operation when unaligned addresses are used.

A kernel module is available that has a stress test to reproduce the issue and a unit test of the updated v7_dma_inv_range(). It can be downloaded from http://ftp.sageembedded.com/outgoing/linux/cache-test-20181107.tgz.

v7_dma_inv_range() is called by dmac_[un]map_area(addr, len, direction) when the direction is DMA_FROM_DEVICE. One can (I believe) successfully argue that DMA from a device to main memory should use buffers aligned to cache line size, because the "clean & invalidate" might overwrite data that the device just wrote using DMA. But if a driver does use unaligned buffers, at least this fix will prevent memory corruption outside the buffer.

Signed-off-by: Chris Cole <chris@sageembedded.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
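A sketch of the fixed routine's shape (modeled on cache-v7.S; register allocation illustrative): the partial first/last lines get a clean & invalidate only, and the loop start is advanced past the first line so it is not invalidated a second time.

    v7_dma_inv_range:
        dcache_line_size r2, r3      @ r2 = D-cache line size (from CTR)
        sub     r3, r2, #1           @ line mask
        tst     r0, r3               @ start address unaligned?
        bic     r0, r0, r3           @ align start down to a line
        mcrne   p15, 0, r0, c7, c14, 1 @ DCCIMVAC: clean & invalidate first line
        addne   r0, r0, r2           @ skip it: do NOT invalidate it again below
        tst     r1, r3               @ end address unaligned?
        bic     r1, r1, r3           @ align end down to a line
        mcrne   p15, 0, r1, c7, c14, 1 @ DCCIMVAC: clean & invalidate last line
        cmp     r0, r1
    1:  mcrlo   p15, 0, r0, c7, c6, 1  @ DCIMVAC: invalidate whole lines only
        addlo   r0, r0, r2
        cmplo   r0, r1
        blo     1b
        dsb     st
        bx      lr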
2017-12-17  ARM: 8729/1: Hook B15 readahead cache functions based on processor  (Florian Fainelli)

If we detect that we are running on a Broadcom Brahma-B15 CPU, and CONFIG_CACHE_B15_RAC is enabled, make sure that we pick up the b15_cache_fns function operations. If CONFIG_CACHE_B15_RAC is enabled, but we are not running on a Broadcom Brahma-B15 CPU, we will fall back to calling into the regular v7_cache_fns with no cost. If CONFIG_CACHE_B15_RAC is disabled, there is no cost and we just use the regular v7_cache_fns.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2017-02-27  scripts/spelling.txt: add "swith" pattern and fix typo instances  (Masahiro Yamada)

Fix typos and add the following to the scripts/spelling.txt:

  swith||switch
  swithable||switchable
  swithed||switched
  swithing||switching

While we are here, fix the "update" to "updates" in the touched hunk in drivers/net/wireless/marvell/mwifiex/wmm.c.

Link: http://lkml.kernel.org/r/1481573103-11329-2-git-send-email-yamada.masahiro@socionext.com
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-04-14  ARM: cache-v7: optimise test for Cortex A9 r0pX devices  (Russell King)

Eliminate one unnecessary instruction from this test by pre-shifting the Cortex A9 ID - we can shift the actual ID in the teq instruction, thereby losing the pX bit of the ID at no cost.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2015-04-14  ARM: cache-v7: optimise branches in v7_flush_cache_louis  (Russell King)

Optimise the branches such that for the majority of unaffected devices, we avoid needing to execute the errata work-around code path by branching to start_flush_levels early.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2015-04-14  ARM: cache-v7: consolidate initialisation of cache level index  (Russell King)

v7_flush_cache_louis and v7_flush_dcache_all both begin the flush_levels loop with r10 initialised to zero. In each case, this is done immediately prior to entering the loop. Branch to this instruction in v7_flush_dcache_all from v7_flush_cache_louis and eliminate the unnecessary initialisation in v7_flush_cache_louis.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2015-04-14  ARM: cache-v7: shift CLIDR to extract appropriate field before masking  (Russell King)

Rather than have code which masks and then shifts, such as:

    mrc     p15, 1, r0, c0, c0, 1
    ALT_SMP(ands r3, r0, #7 << 21)
    ALT_UP( ands r3, r0, #7 << 27)
    ALT_SMP(mov  r3, r3, lsr #20)
    ALT_UP( mov  r3, r3, lsr #26)

re-arrange this as a shift and then mask. The masking is the same for each field which we want to extract, so this allows the mask to be shared amongst code paths:

    mrc     p15, 1, r0, c0, c0, 1
    ALT_SMP(mov  r3, r0, lsr #20)
    ALT_UP( mov  r3, r0, lsr #26)
    ands    r3, r3, #7 << 1

Use this method for the LoUIS, LoUU and LoC fields.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2015-04-14  ARM: cache-v7: use movw/movt instructions  (Russell King)

We always build cache-v7.S for ARMv7, so we can use the ARMv7 16-bit move instructions to load large constants, rather than using constants in a literal pool.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
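For instance, instead of an LDR from a literal pool, a 32-bit constant (value illustrative) can be materialized in two instructions:

    movw    r3, #0x7fff          @ r3 = 0x00007fff (bottom 16 bits)
    movt    r3, #0x0123          @ r3 = 0x01237fff (top 16 bits)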
2014-07-18  ARM: convert all "mov.* pc, reg" to "bx reg" for ARMv6+  (Russell King)

ARMv6 and greater introduced a new instruction ("bx") which can be used to return from function calls. Recent CPUs perform better when the "bx lr" instruction is used rather than the "mov pc, lr" instruction, and this sequence is strongly recommended to be used by the ARM architecture manual (section A.4.1.1).

We provide a new macro "ret" with all its variants for the condition code which will resolve to the appropriate instruction.

Rather than doing this piecemeal, and miss some instances, change all the "mov pc" instances to use the new macro, with the exception of the "movs" instruction and the kprobes code. This allows us to detect the "mov pc, lr" case and fix it up - and also gives us the possibility of deploying this for other registers depending on the CPU selection.

Reported-by: Will Deacon <will.deacon@arm.com>
Tested-by: Stephen Warren <swarren@nvidia.com> # Tegra Jetson TK1
Tested-by: Robert Jarzmik <robert.jarzmik@free.fr> # mioa701_bootresume.S
Tested-by: Andrew Lunn <andrew@lunn.ch> # Kirkwood
Tested-by: Shawn Guo <shawn.guo@freescale.com>
Tested-by: Tony Lindgren <tony@atomide.com> # OMAPs
Tested-by: Gregory CLEMENT <gregory.clement@free-electrons.com> # Armada XP, 375, 385
Acked-by: Sekhar Nori <nsekhar@ti.com> # DaVinci
Acked-by: Christoffer Dall <christoffer.dall@linaro.org> # kvm/hyp
Acked-by: Haojian Zhuang <haojian.zhuang@gmail.com> # PXA3xx
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> # Xen
Tested-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> # ARMv7M
Tested-by: Simon Horman <horms+renesas@verge.net.au> # Shmobile
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
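A condensed sketch of what such a "ret" macro can look like; the in-tree version also generates all the condition-code variants, so this is illustrative only:

    .macro  ret, reg
    #if __LINUX_ARM_ARCH__ < 6
        mov     pc, \reg             @ pre-v6 cores have no bx
    #else
        .ifeqs  "\reg", "lr"
        bx      \reg                 @ preferred return on v6+
        .else
        mov     pc, \reg             @ bx is only a win for returns via lr
        .endif
    #endif
    .endm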
2014-05-25  ARM: 8055/1: cacheflush: use -st dsb option for ensuring completion  (Will Deacon)

dsb st can be used to ensure completion of pending cache maintenance operations, so use it for the v7 cache maintenance operations.

Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
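A minimal sketch of the pattern at the end of a by-MVA maintenance loop (loop registers illustrative):

    1:  mcr     p15, 0, r0, c7, c14, 1 @ clean & invalidate D line by MVA
        add     r0, r0, r2
        cmp     r0, r1
        blo     1b
        dsb     st                   @ waits only for stores; enough to ensure
                                     @ completion of the cache maintenance
        bx      lr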
2013-12-29  ARM: 7919/1: mm: refactor v7 cache cleaning ops to use way/index sequence  (Lorenzo Pieralisi)

Set-associative caches on all v7 implementations map the index bits to physical address LSBs and tag bits to MSBs. As the last level of cache on current and upcoming ARM systems grows in size, this means that under normal DRAM controller configurations, the current v7 cache flush routine using set/way operations triggers a DRAM memory controller precharge/activate for every cache line writeback, since the routine cleans lines by first fixing the index and then looping through ways (index bits are mapped to lower physical addresses on all v7 cache implementations; this means that, with last level cache sizes in the order of MBytes, lines belonging to the same set but different ways map to different DRAM pages).

Given the random content of cache tags, swapping the order of the index and way loops does not prevent DRAM page precharge and activate cycles but at least, on average, improves the chances that either multiple lines hit the same page or multiple lines belong to different DRAM banks, improving throughput significantly.

This patch swaps the inner loops in the v7 cache flushing routine to carry out the clean operations first on all sets belonging to a given way (looping through sets) and then decrementing the way.

Benchmarks showed that swapping the order in which sets and ways are decremented in the v7 cache flushing routine, which uses set/way operations, significantly reduces the time required to flush caches, owing to improved writeback throughput to the DRAM controller. Benchmark results vary and depend heavily on the last level cache tag RAM content when the cache is cleaned and invalidated, ranging from 2x throughput when all tag RAM entries contain dirty lines mapping to sequential pages of RAM, to 1x (ie no improvement) when all tag RAM accesses trigger a DRAM precharge/activate cycle, as the current code implies on most DRAM controller configurations.

Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Nicolas Pitre <nico@linaro.org>
Acked-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Reviewed-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
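The reordered loop nest, sketched from the shape of v7_flush_dcache_all (register roles and label names are illustrative): the way is now the outer loop variable, so each inner pass walks sequential set indexes, i.e. sequential DRAM addresses.

    loop_ways:
        mov     r9, r7               @ r9 = max set index for this level
    loop_sets:
        orr     r11, r10, r4, lsl r5 @ merge cache level and way into r11
        orr     r11, r11, r9, lsl r2 @ merge set index into r11
        mcr     p15, 0, r11, c7, c14, 2 @ DCCISW: clean & invalidate by set/way
        subs    r9, r9, #1           @ next set (inner loop)
        bge     loop_sets
        subs    r4, r4, #1           @ next way (outer loop)
        bge     loop_ways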
2013-08-12  ARM: mm: use inner-shareable barriers for TLB and user cache operations  (Will Deacon)

System-wide barriers aren't required for situations where we only need to make visibility and ordering guarantees in the inner-shareable domain (i.e. we are not dealing with devices or potentially incoherent CPUs).

This patch changes the v7 TLB operations, coherent_user_range and dcache_clean_area functions to use inner-shareable barriers. For cache maintenance, only the store access type is required to ensure completion.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
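A minimal sketch of the relaxed barrier at the end of such a routine (registers and label illustrative; DCCMVAU cleans a D-cache line to the point of unification):

    1:  mcr     p15, 0, r12, c7, c11, 1 @ DCCMVAU: clean D line to PoU
        add     r12, r12, r2
        cmp     r12, r1
        blo     1b
        dsb     ishst                @ inner-shareable, stores only: cheaper
                                     @ than a full system dsb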
2013-06-17  ARM: 7752/1: errata: LoUIS bit field in CLIDR register is incorrect  (Jon Medhurst)

On Cortex-A9 before version r1p0, the LoUIS bit field of the CLIDR register returns zero when it should return one. This leads to cache maintenance operations which rely on this value to not function as intended, causing data corruption. The workaround for this errata is to detect affected CPUs and correct the LoUIS value read.

Acked-by: Will Deacon <will.deacon@arm.com>
Acked-by: Nicolas Pitre <nico@linaro.org>
Cc: stable@vger.kernel.org
Signed-off-by: Jon Medhurst <tixy@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
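One plausible shape for the detection (sketch only; the in-tree workaround is guarded by CONFIG_ARM_ERRATA_643719 and folded into v7_flush_dcache_louis, and the register roles here are illustrative):

    #ifdef CONFIG_ARM_ERRATA_643719
        mrc     p15, 0, r2, c0, c0, 0 @ read MIDR
        movw    r1, #0xc090          @ Cortex-A9 r0pX ID pattern...
        movt    r1, #0x410f          @ ...in r1 = 0x410fc090
        bic     r2, r2, #0x0000000f  @ ignore the minor revision (pX)
        teq     r2, r1               @ affected r0pX core?
        moveq   r3, #1 << 1          @ if so, force LoUIS to 1 (field is kept
    #endif                           @ pre-shifted left by 1 at this point)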
2013-02-11  arm: Add v7_invalidate_l1 to cache-v7.S  (Dinh Nguyen)

mach-socfpga is another platform that needs to use v7_invalidate_l1 to bring up additional cores. There was a comment that the ideal place for v7_invalidate_l1 should be in arm/mm/cache-v7.S.

Signed-off-by: Dinh Nguyen <dinguyen@altera.com>
Acked-by: Simon Horman <horms+renesas@verge.net.au>
Acked-by: Stephen Warren <swarren@nvidia.com>
Reviewed-by: Pavel Machek <pavel@denx.de>
Reviewed-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Tested-by: Pavel Machek <pavel@denx.de>
Tested-by: Stephen Warren <swarren@nvidia.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Olof Johansson <olof@lixom.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Rob Herring <rob.herring@calxeda.com>
Cc: Sascha Hauer <kernel@pengutronix.de>
Cc: Magnus Damm <magnus.damm@gmail.com>
Signed-off-by: Olof Johansson <olof@lixom.net>
2012-12-20  ARM: 7606/1: cache: flush to LoUU instead of LoUIS on uniprocessor CPUs  (Will Deacon)

flush_cache_louis flushes the D-side caches to the point of unification inner-shareable. On uniprocessor CPUs, this is defined as zero and therefore no flushing will take place. Rather than invent a new interface for UP systems, use our SMP_ON_UP patching code to read the LoUU from the CLIDR instead.

Cc: <stable@vger.kernel.org>
Cc: Lorenzo Pieralisi <Lorenzo.Pieralisi@arm.com>
Tested-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2012-10-11  Merge branch 'fixes' into for-linus  (Russell King)

Conflicts:
	arch/arm/kernel/smp.c
2012-09-28  ARM: 7541/1: Add ARM ERRATA 775420 workaround  (Simon Horman)

Workaround for the 775420 Cortex-A9 (r2p2, r2p6, r2p8, r2p10, r3p0) erratum. In case a data cache maintenance operation aborts with an MMU exception, it might cause the processor to deadlock. This workaround puts a DSB before executing an ISB if an abort may occur on cache maintenance.

Based on work by Kouei Abe and feedback from Catalin Marinas.

Signed-off-by: Kouei Abe <kouei.abe.cp@rms.renesas.com>
[ horms@verge.net.au: Changed to implementation suggested by catalin.marinas@arm.com ]
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
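The workaround's shape in cache-v7.S terms (sketch, assuming the CONFIG_ARM_ERRATA_775420 option this patch adds; exact placement in the flush loop is illustrative):

    #ifdef CONFIG_ARM_ERRATA_775420
        dsb                          @ 775420: drain pending maintenance before
                                     @ the ISB, so an abort cannot deadlock
    #endif
        isb                          @ re-synchronise after the CSSELR update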
2012-09-25  ARM: mm: rename jump labels in v7_flush_dcache_all function  (Lorenzo Pieralisi)

This patch renames jump labels in v7_flush_dcache_all in order to define a specific flush cache levels entry point.

Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Tested-by: Shawn Guo <shawn.guo@linaro.org>
2012-09-25  ARM: mm: implement LoUIS API for cache maintenance ops  (Lorenzo Pieralisi)

The ARM v7 architecture introduced the concept of cache levels and related control registers. New processors like A7 and A15 embed an L2 unified cache controller that becomes part of the cache level hierarchy. Some operations in the kernel like cpu_suspend and __cpu_disable do not require a flush of the entire cache hierarchy to DRAM, but just of the cache levels belonging to the Level of Unification Inner Shareable (LoUIS), which in most ARM v7 systems corresponds to L1.

The current cache flushing API used in cpu_suspend and __cpu_disable, flush_cache_all(), ends up flushing the whole cache hierarchy, since for v7 it cleans and invalidates all cache levels up to the Level of Coherency (LoC), which cripples system performance when used in hot paths like hotplug and cpuidle. Therefore a new kernel cache maintenance API must be added to cope with the latest ARM system requirements.

This patch adds flush_cache_louis() to the ARM kernel cache maintenance API. This function cleans and invalidates all data cache levels up to the Level of Unification Inner Shareable (LoUIS) and invalidates the instruction cache for processors that support it (v7 and later). This patch also creates an alias of the cache LoUIS function to flush_kern_all for all processor versions prior to v7, so that the current cache flushing behaviour is unchanged for those processors.

v7 cache maintenance code implements a cache LoUIS function that cleans and invalidates the D-cache up to LoUIS and invalidates the I-cache, according to the new API.

Reviewed-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Reviewed-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Tested-by: Shawn Guo <shawn.guo@linaro.org>
2012-05-02  ARM: 7408/1: cacheflush: return error to userspace when flushing syscall fails  (Will Deacon)

The cacheflush syscall can fail for two reasons:

 (1) The arguments are invalid (nonsensical address range or no VMA)

 (2) The region generates a translation fault on a VIPT or PIPT cache

This patch allows do_cache_op to return an error code to userspace in the case of the above. The various coherent_user_range implementations are modified to return 0 in the case of VIVT caches or -EFAULT in the case of an abort on v6/v7 cores.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
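A sketch of the v7 side, assuming the kernel's USER() exception-table annotation (registers illustrative): the faulting cache op is wired to a fixup label that returns the error code.

    1:
     USER(  mcr p15, 0, r12, c7, c11, 1 )  @ DCCMVAU; may fault on a bad page
        add     r12, r12, r2
        cmp     r12, r1
        blo     1b
        mov     r0, #0                   @ success
        mov     pc, lr
    9001:
        mov     r0, #-EFAULT             @ translation fault: report to caller
        mov     pc, lr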
2012-02-15  ARM: 7325/1: fix v7 boot with lockdep enabled  (Rabin Vincent)

Bootup with lockdep enabled has been broken on v7 since b46c0f74657d ("ARM: 7321/1: cache-v7: Disable preemption when reading CCSIDR"). This is because v7_setup (which is called very early during boot) calls v7_flush_dcache_all, and the save_and_disable_irqs added by that patch ends up attempting to call into lockdep C code (trace_hardirqs_off()) when we are in no position to execute it (no stack, MMU off).

Fix this by using a notrace variant of save_and_disable_irqs. The code already uses the notrace variant of restore_irqs.

Reviewed-by: Nicolas Pitre <nico@linaro.org>
Acked-by: Stephen Boyd <sboyd@codeaurora.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: stable@vger.kernel.org
Signed-off-by: Rabin Vincent <rabin@rab.in>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2012-02-09  ARM: 7321/1: cache-v7: Disable preemption when reading CCSIDR  (Stephen Boyd)

armv7's flush_cache_all() flushes caches via set/way. To determine the cache attributes (line size, number of sets, etc.) the assembly first writes the CSSELR register to select a cache level and then reads the CCSIDR register. The CSSELR register is banked per-cpu and is used to determine which cache level CCSIDR reads. If the task is migrated between when the CSSELR is written and the CCSIDR is read, the CCSIDR value may be for an unexpected cache level (for example L1 instead of L2) and incorrect cache flushing could occur.

Disable interrupts across the write and read so that the correct cache attributes are read and used for the cache flushing routine. We disable interrupts instead of disabling preemption because the critical section is only 3 instructions and we want to call v7_flush_dcache_all from __v7_setup, which doesn't have a full kernel stack with a struct thread_info.

This fixes a problem we see in scm_call() when flush_cache_all() is called from preemptible context and sometimes the L2 cache is not properly flushed out.

Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Nicolas Pitre <nico@linaro.org>
Cc: stable@vger.kernel.org
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
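The protected sequence, sketched with explicit instructions (register roles illustrative; the kernel uses its save_and_disable_irqs/restore_irqs assembler macros for the CPSR handling):

    mrs     r9, cpsr             @ save_and_disable_irqs: remember CPSR...
    cpsid   i                    @ ...and mask IRQs: no migration window
    mcr     p15, 2, r10, c0, c0, 0 @ CSSELR: select cache level (banked per CPU)
    isb                          @ synchronise the selection
    mrc     p15, 1, r1, c0, c0, 0  @ CCSIDR: attributes for the selected level
    msr     cpsr_c, r9           @ restore_irqs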
2011-09-17  ARM: 7091/1: errata: D-cache line maintenance operation by MVA may not succeed  (Will Deacon)

This patch implements a workaround for erratum 764369 affecting Cortex-A9 MPCore with two or more processors (all current revisions). Under certain timing circumstances, a data cache line maintenance operation by MVA targeting an Inner Shareable memory region may fail to proceed up to either the Point of Coherency or the Point of Unification of the system. This workaround adds a DSB instruction before the relevant cache maintenance functions and sets a specific bit in the diagnostic control register of the SCU.

Cc: <stable@kernel.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
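A sketch of the added barrier in the by-MVA routines (the SCU diagnostic bit is set separately in platform code; ALT_SMP/ALT_UP patch the barrier out on UP systems):

    #ifdef CONFIG_ARM_ERRATA_764369
        ALT_SMP(W(dsb))              @ 764369: drain before the by-MVA op
        ALT_UP(W(nop))
    #endif
        mcr     p15, 0, r0, c7, c14, 1 @ DCCIMVAC: clean & invalidate by MVA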
2011-07-07  ARM: mm: cache-v7: Use the new processor struct macros  (Dave Martin)

Signed-off-by: Dave Martin <dave.martin@linaro.org>
2011-05-26  ARM: 6941/1: cache: ensure MVA is cacheline aligned in flush_kern_dcache_area  (Will Deacon)

The v6 and v7 implementations of flush_kern_dcache_area do not align the passed MVA to the size of a cacheline in the data cache. If a misaligned address is used, only a subset of the requested area will be flushed. This has been observed to cause failures in SMP boot where the secondary_data initialised by the primary CPU is not cacheline aligned, causing the secondary CPUs to read incorrect values for their pgd and stack pointers.

This patch ensures that the base address is cacheline aligned before flushing the d-cache.

Cc: <stable@kernel.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
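The fixed routine's shape for v7 (sketch following cache-v7.S conventions; dcache_line_size is the existing helper macro that derives the line size from CTR):

    ENTRY(v7_flush_kern_dcache_area)
        dcache_line_size r2, r3
        add     r1, r0, r1           @ end = base + size
        sub     r3, r2, #1
        bic     r0, r0, r3           @ align base down to a D-cache line
    1:
        mcr     p15, 0, r0, c7, c14, 1 @ DCCIMVAC: clean & invalidate to PoC
        add     r0, r0, r2
        cmp     r0, r1
        blo     1b
        dsb
        mov     pc, lr
    ENDPROC(v7_flush_kern_dcache_area)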
2011-03-31  Fix common misspellings  (Lucas De Marchi)

Fixes generated by 'codespell' and manually reviewed.

Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
2010-12-12  ARM: 6528/1: Use CTR for the I-cache line size on ARMv7  (Catalin Marinas)

The current implementation of the v7_coherent_*_range function assumes that the D and I cache lines have the same size, which is incorrect architecturally. This patch adds the icache_line_size macro which reads the CTR register. The main loop in v7_coherent_*_range is split into two independent loops, for the D and I caches. This also has the performance advantage that the DSB is moved outside the main loop.

Reported-by: Kevin Sapp <ksapp@quicinc.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
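The helper's shape (sketch; the IminLine field in the bottom bits of CTR encodes log2 of the I-cache line size in words):

    .macro  icache_line_size, reg, tmp
        mrc     p15, 0, \tmp, c0, c0, 1 @ read CTR
        and     \tmp, \tmp, #0xf     @ IminLine: log2(line size in words)
        mov     \reg, #4             @ bytes per word
        mov     \reg, \reg, lsl \tmp @ I-cache line size in bytes
    .endm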
2010-10-04  ARM: 6405/1: Handle __flush_icache_all for CONFIG_SMP_ON_UP  (Tony Lindgren)

Do this by adding flush_icache_all to cache_fns for ARMv6 and 7. As flush_icache_all may need to be called from flush_kern_cache_all, add it as the first entry in the cache_fns.

Note that we can now remove the ARM_ERRATA_411920 dependency on !SMP, so it can be selected on UP ARMv6 processors such as omap2.

Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Anand Gadiyar <gadiyar@ti.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-10-04  ARM: Allow SMP kernels to boot on UP systems  (Russell King)

UP systems do not implement all the instructions that SMP systems have, so in order to boot a SMP kernel on a UP system, we need to rewrite parts of the kernel. Do this using an 'alternatives' scheme, where the kernel code and data is modified prior to initialization to replace the SMP instructions, thereby rendering the problematical code ineffectual. We use the linker to generate a list of 32-bit word locations and their replacement values, and run through these replacements when we detect a UP system.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-05-20  ARM: 6139/1: ARMv7: Use the Inner Shareable I-cache on MP  (Santosh Shilimkar)

This patch fixes flush_cache_all for ARMv7 SMP. It was missing from commit b8349b569aae661dea9d59d7d2ee587ccea3336c.

Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: <stable@kernel.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-05-08  ARM: 6112/1: Use the Inner Shareable I-cache and BTB ops on ARMv7 SMP  (Catalin Marinas)

The standard I-cache Invalidate All (ICIALLU) and Branch Predictor Invalidate All (BPIALL) operations are not automatically broadcast to the other CPUs in an ARMv7 MP system. The patch adds the Inner Shareable variants, ICIALLUIS and BPIALLIS, if ARMv7 and SMP.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
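The substitution, sketched in the era's #ifdef style (the write-ignored register value is illustrative):

    #ifdef CONFIG_SMP
        mcr     p15, 0, r0, c7, c1, 0  @ ICIALLUIS: I-cache inv, inner shareable
        mcr     p15, 0, r0, c7, c1, 6  @ BPIALLIS: BTB inv, inner shareable
    #else
        mcr     p15, 0, r0, c7, c5, 0  @ ICIALLU: local I-cache invalidate
        mcr     p15, 0, r0, c7, c5, 6  @ BPIALL: local BTB invalidate
    #endif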
2010-02-15  ARM: dma-mapping: fix for speculative prefetching  (Russell King)

ARMv6 and ARMv7 CPUs can perform speculative prefetching, which makes DMA cache coherency handling slightly more interesting. Rather than being able to rely upon the CPU not accessing the DMA buffer until DMA has completed, we now must expect that the cache could be loaded with possibly stale data from the DMA buffer.

Where DMA involves data being transferred to the device, we clean the cache before handing it over for DMA, otherwise we invalidate the buffer to get rid of potential writebacks. On DMA completion, if data was transferred from the device, we invalidate the buffer to get rid of any stale speculative prefetches.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Tested-By: Santosh Shilimkar <santosh.shilimkar@ti.com>
2010-02-15  ARM: dma-mapping: remove dmac_clean_range and dmac_inv_range  (Russell King)

These are now unused, and so can be removed.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Tested-By: Santosh Shilimkar <santosh.shilimkar@ti.com>
2010-02-15  ARM: dma-mapping: provide per-cpu type map/unmap functions  (Russell King)

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Tested-By: Santosh Shilimkar <santosh.shilimkar@ti.com>
2009-12-14  ARM: add size argument to __cpuc_flush_dcache_page  (Russell King)

... and rename the function since it no longer operates on just pages.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2009-10-07  ARM: 5746/1: Handle possible translation errors in ARMv6/v7 coherent_user_range  (Catalin Marinas)

This is needed because applications using the sys_cacheflush system call can pass a memory range which isn't mapped yet even though the corresponding vma is valid. The patch also adds unwinding annotations for correct backtraces from the coherent_user_range() functions.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2009-07-24  Thumb-2: Implement the unified arch/arm/mm support  (Catalin Marinas)

This patch adds the ARM/Thumb-2 unified support to the arch/arm/mm/* files.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2008-11-06  ARMv7: Add extra barriers for flush_cache_all compressed/head.S  (Catalin Marinas)

The flush_cache_all function on ARMv7 is implemented as a series of cache operations by set/way. These are not guaranteed to be ordered with previous memory accesses, requiring a DMB. This patch also adds barriers for the TLB operations in compressed/head.S.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2008-09-01  [ARM] 5227/1: Add the ENDPROC declarations to the .S files  (Catalin Marinas)

This declaration specifies the "function" type and size for various assembly functions, mainly needed for generating the correct branch instructions in Thumb-2.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2007-05-08  [ARM] armv7: add support for ARMv7 cores.  (Catalin Marinas)

This patch adds support for the ARMv7 cores.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>