aboutsummaryrefslogtreecommitdiff
path: root/drivers/perf
AgeCommit message (Collapse)Author
2022-03-25Merge tag 'riscv-for-linus-5.18-mw0' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V updates from Palmer Dabbelt: - Support for Sv57-based virtual memory. - Various improvements for the MicroChip PolarFire SOC and the associated Icicle dev board, which should allow upstream kernels to boot without any additional modifications. - An improved memmove() implementation. - Support for the new Ssconfpmf and SBI PMU extensions, which allows for a much more useful perf implementation on RISC-V systems. - Support for restartable sequences. * tag 'riscv-for-linus-5.18-mw0' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (36 commits) rseq/selftests: Add support for RISC-V RISC-V: Add support for restartable sequence MAINTAINERS: Add entry for RISC-V PMU drivers Documentation: riscv: Remove the old documentation RISC-V: Add sscofpmf extension support RISC-V: Add perf platform driver based on SBI PMU extension RISC-V: Add RISC-V SBI PMU extension definitions RISC-V: Add a simple platform driver for RISC-V legacy perf RISC-V: Add a perf core library for pmu drivers RISC-V: Add CSR encodings for all HPMCOUNTERS RISC-V: Remove the current perf implementation RISC-V: Improve /proc/cpuinfo output for ISA extensions RISC-V: Do no continue isa string parsing without correct XLEN RISC-V: Implement multi-letter ISA extension probing framework RISC-V: Extract multi-letter extension names from "riscv, isa" RISC-V: Minimal parser for "riscv, isa" strings RISC-V: Correctly print supported extensions riscv: Fixed misaligned memory access. Fixed pointer comparison. MAINTAINERS: update riscv/microchip entry riscv: dts: microchip: add new peripherals to icicle kit device tree ...
2022-03-24Merge tag 'iommu-updates-v5.18' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull iommu updates from Joerg Roedel: - IOMMU Core changes: - Removal of aux domain related code as it is basically dead and will be replaced by iommu-fd framework - Split of iommu_ops to carry domain-specific call-backs separatly - Cleanup to remove useless ops->capable implementations - Improve 32-bit free space estimate in iova allocator - Intel VT-d updates: - Various cleanups of the driver - Support for ATS of SoC-integrated devices listed in ACPI/SATC table - ARM SMMU updates: - Fix SMMUv3 soft lockup during continuous stream of events - Fix error path for Qualcomm SMMU probe() - Rework SMMU IRQ setup to prepare the ground for PMU support - Minor cleanups and refactoring - AMD IOMMU driver: - Some minor cleanups and error-handling fixes - Rockchip IOMMU driver: - Use standard driver registration - MSM IOMMU driver: - Minor cleanup and change to standard driver registration - Mediatek IOMMU driver: - Fixes for IOTLB flushing logic * tag 'iommu-updates-v5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (47 commits) iommu/amd: Improve amd_iommu_v2_exit() iommu/amd: Remove unused struct fault.devid iommu/amd: Clean up function declarations iommu/amd: Call memunmap in error path iommu/arm-smmu: Account for PMU interrupts iommu/vt-d: Enable ATS for the devices in SATC table iommu/vt-d: Remove unused function intel_svm_capable() iommu/vt-d: Add missing "__init" for rmrr_sanity_check() iommu/vt-d: Move intel_iommu_ops to header file iommu/vt-d: Fix indentation of goto labels iommu/vt-d: Remove unnecessary prototypes iommu/vt-d: Remove unnecessary includes iommu/vt-d: Remove DEFER_DEVICE_DOMAIN_INFO iommu/vt-d: Remove domain and devinfo mempool iommu/vt-d: Remove iova_cache_get/put() iommu/vt-d: Remove finding domain in dmar_insert_one_dev_info() iommu/vt-d: Remove intel_iommu::domains iommu/mediatek: Always tlb_flush_all when each PM resume iommu/mediatek: Add tlb_lock in tlb_flush_all iommu/mediatek: Remove the power status checking in tlb flush all ...
2022-03-21RISC-V: Add sscofpmf extension supportAtish Patra
The sscofpmf extension allows counter overflow and filtering for programmable counters. Enable the perf driver to handle the overflow interrupt. The overflow interrupt is a hart local interrupt. Thus, per cpu overflow interrupts are setup as a child under the root INTC irq domain. Signed-off-by: Atish Patra <atish.patra@wdc.com> Signed-off-by: Atish Patra <atishp@rivosinc.com> Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-03-21RISC-V: Add perf platform driver based on SBI PMU extensionAtish Patra
RISC-V SBI specification added a PMU extension that allows to configure start/stop any pmu counter. The RISC-V perf can use most of the generic perf features except interrupt overflow and event filtering based on privilege mode which will be added in future. It also allows to monitor a handful of firmware counters that can provide insights into firmware activity during a performance analysis. Signed-off-by: Atish Patra <atish.patra@wdc.com> Signed-off-by: Atish Patra <atishp@rivosinc.com> Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-03-21RISC-V: Add a simple platform driver for RISC-V legacy perfAtish Patra
The old RISC-V perf implementation allowed counting of only cycle/instruction counters using perf. Restore that feature by implementing a simple platform driver under a separate config to provide backward compatibility. Any existing software stack will continue to work as it is. However, it provides an easy way out in future where we can remove the legacy driver. Reviewed-by: Anup Patel <anup@brainfault.org> Signed-off-by: Atish Patra <atish.patra@wdc.com> Signed-off-by: Atish Patra <atishp@rivosinc.com> Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-03-21RISC-V: Add a perf core library for pmu driversAtish Patra
Implement a perf core library that can support all the essential perf features in future. It can also accommodate any type of PMU implementation in future. Currently, both SBI based perf driver and legacy driver implemented uses the library. Most of the common perf functionalities are kept in this core library wile PMU specific driver can implement PMU specific features. For example, the SBI specific functionality will be implemented in the SBI specific driver. Reviewed-by: Anup Patel <anup@brainfault.org> Signed-off-by: Atish Patra <atish.patra@wdc.com> Signed-off-by: Atish Patra <atishp@rivosinc.com> Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-03-09perf/marvell: Fix !CONFIG_OF build for CN10K DDR PMU driverWill Deacon
When compiling the Marvell CN10K DDR PMU driver with CONFIG_OF=n, the build fails: | drivers/perf/marvell_cn10k_ddr_pmu.c:723:35: error: 'cn10k_ddr_pmu_of_match' undeclared here (not in a function); did you mean 'cn10k_ddr_pmu_driver'? Use `of_match_ptr()` to avoid referencing the non-existent match table in this configuration. Link: https://lore.kernel.org/r/202203091424.Vfe8J4W9-lkp@intel.com Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Will Deacon <will@kernel.org>
2022-03-08Merge branch 'for-next/perf-m1' into for-next/perfWill Deacon
Support for the CPU PMUs on the Apple M1. * for-next/perf-m1: drivers/perf: Add Apple icestorm/firestorm CPU PMU driver drivers/perf: arm_pmu: Handle 47 bit counters irqchip/apple-aic: Move PMU-specific registers to their own include file arm64: dts: apple: Add t8303 PMU nodes arm64: dts: apple: Add t8103 PMU interrupt affinities irqchip/apple-aic: Wire PMU interrupts irqchip/apple-aic: Parse FIQ affinities from device-tree dt-bindings: apple,aic: Add affinity description for per-cpu pseudo-interrupts dt-bindings: apple,aic: Add CPU PMU per-cpu pseudo-interrupts dt-bindings: arm-pmu: Document Apple PMU compatible strings
2022-03-08drivers/perf: Add Apple icestorm/firestorm CPU PMU driverMarc Zyngier
Add a new, weird and wonderful driver for the equally weird Apple PMU HW. Although the PMU itself is functional, we don't know much about the events yet, so this can be considered as yet another random number generator... Nonetheless, it can reliably count at least cycles and instructions in the usually wonky big-little way. For anything else, it of course supports raw event numbers. Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Will Deacon <will@kernel.org>
2022-03-08drivers/perf: arm_pmu: Handle 47 bit countersMarc Zyngier
The current ARM PMU framework can only deal with 32 or 64bit counters. Teach it about a 47bit flavour. Yes, this is odd. Reviewed-by: Hector Martin <marcan@marcan.st> Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Will Deacon <will@kernel.org>
2022-03-08perf/marvell: cn10k DDR perf event core ownershipBharat Bhushan
As DDR perf event counters are not per core, so they should be accessed only by one core at a time. Select new core when previously owning core is going offline. Signed-off-by: Bharat Bhushan <bbhushan2@marvell.com> Reviewed-by: Bhaskara Budiredla <bbudiredla@marvell.com> Link: https://lore.kernel.org/r/20220211045346.17894-5-bbhushan2@marvell.com Signed-off-by: Will Deacon <will@kernel.org>
2022-03-08perf/marvell: cn10k DDR perfmon event overflow handlingBharat Bhushan
CN10k DSS h/w perfmon does not support event overflow interrupt, so periodic timer is being used. Each event counter is 48bit, which in worst case scenario can increment at maximum 5.6 GT/s. At this rate it may take many hours to overflow these counters. Therefore polling period for overflow is set to 100 sec, which can be changed using sysfs parameter. Two fixed event counters starts counting from zero on overflow, so overflow condition is when new count less than previous count. While eight programmable event counters freezes at maximum value. Also individual counter cannot be restarted, so need to restart all eight counters. Signed-off-by: Bharat Bhushan <bbhushan2@marvell.com> Reviewed-by: Bhaskara Budiredla <bbudiredla@marvell.com> Link: https://lore.kernel.org/r/20220211045346.17894-4-bbhushan2@marvell.com Signed-off-by: Will Deacon <will@kernel.org>
2022-03-08perf/marvell: CN10k DDR performance monitor supportBharat Bhushan
Marvell CN10k DRAM Subsystem (DSS) supports eight event counters for monitoring performance and software can program each counter to monitor any of the defined performance event. Performance events are for interface between the DDR controller and the PHY, interface between the DDR Controller and the CHI interconnect, or within the DDR Controller. Additionally DSS also supports two fixed performance event counters, one for number of ddr reads and other for ddr writes. This patch add basic support for these performance monitoring events on CN10k. Signed-off-by: Bharat Bhushan <bbhushan2@marvell.com> Reviewed-by: Bhaskara Budiredla <bbudiredla@marvell.com> Link: https://lore.kernel.org/r/20220211045346.17894-3-bbhushan2@marvell.com Signed-off-by: Will Deacon <will@kernel.org>
2022-03-08perf/arm-cmn: Update watchpoint formatRobin Murphy
From CMN-650 onwards, some of the fields in the watchpoint config registers moved subtly enough to easily overlook. Watchpoint events are still only partially supported on newer IPs - which in itself deserves noting - but were not intended to become any *less* functional than on CMN-600. Fixes: 60d1504070c2 ("perf/arm-cmn: Support new IP features") Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/e1ce4c2f1e4f73ab1c60c3a85e4037cd62dd6352.1645727871.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2022-03-08perf/arm-cmn: Hide XP PUB events for CMN-600Robin Murphy
CMN-600 doesn't have XP events for the PUB channel, but we missed the appropriate check to avoid exposing them. Fixes: 60d1504070c2 ("perf/arm-cmn: Support new IP features") Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/4c108d39a0513def63acccf09ab52b328f242aeb.1645727871.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2022-02-15perf/smmuv3: Don't cast parameter in bit operationsAndy Shevchenko
While in this particular case it would not be a (critical) issue, the pattern itself is bad and error prone in case somebody blindly copies to their code. Don't cast parameter to unsigned long pointer in the bit operations. Instead copy to a local variable on stack of a proper type and use. Note, new compilers might warn on this line for potential outbound access. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/20220209184758.56578-1-andriy.shevchenko@linux.intel.com Signed-off-by: Will Deacon <will@kernel.org>
2022-02-15perf: replace bitmap_weight with bitmap_empty where appropriateYury Norov
In some places, drivers/perf code calls bitmap_weight() to check if any bit of a given bitmap is set. It's better to use bitmap_empty() in that case because bitmap_empty() stops traversing the bitmap as soon as it finds first set bit, while bitmap_weight() counts all bits unconditionally. Signed-off-by: Yury Norov <yury.norov@gmail.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20220210224933.379149-13-yury.norov@gmail.com Signed-off-by: Will Deacon <will@kernel.org>
2022-02-08perf: Replace acpi_bus_get_device()Rafael J. Wysocki
Replace acpi_bus_get_device() that is going to be dropped with acpi_fetch_acpi_dev(). No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/10025610.nUPlyArG6x@kreacher Signed-off-by: Will Deacon <will@kernel.org>
2022-02-08perf/marvell_cn10k: Fix unused variable warning when W=1 and CONFIG_OF=nWill Deacon
The kbuild helpfully reports that the Marvell CN10K TAD PMU driver emits a warning when building with W=1 and CONFIG_OF=n: | >> drivers/perf/marvell_cn10k_tad_pmu.c:371:34: warning: unused variable 'tad_pmu_of_match' [-Wunused-const-variable] static const struct of_device_id tad_pmu_of_match[] = { Guard the match table with CONFIG_OF to squash the warning. Link: https://lore.kernel.org/r/202201292349.zRQLcDDD-lkp@intel.com Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Will Deacon <will@kernel.org>
2022-02-08perf/arm-cmn: Make arm_cmn_debugfs staticRobin Murphy
Indeed our debugfs directory is driver-internal so should be static. Link: https://lore.kernel.org/r/202202030812.II1K2ZXf-lkp@intel.com Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Link: https://lore.kernel.org/r/ca9248caaae69b5134f69e085fe78905dfe74378.1643911278.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2022-02-08perf: MARVELL_CN10K_TAD_PMU should depend on ARCH_THUNDERGeert Uytterhoeven
The Marvell CN10K Last-Level cache Tag-and-data Units (LLC-TAD) performance monitor is only present on Marvell CN10K SoCs. Hence add a dependency on ARCH_THUNDER, to prevent asking the user about this driver when configuring a kernel without Cavium Thunder (incl. Marvell CN10K) SoC support. Fixes: 036a7584bede ("drivers: perf: Add LLC-TAD perf counter support") Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Link: https://lore.kernel.org/r/b4662a2c767d04cca19417e0c845edea2da262ad.1641995941.git.geert+renesas@glider.be Signed-off-by: Will Deacon <will@kernel.org>
2022-02-08perf/arm-ccn: Use platform_get_irq() to get the interruptLad Prabhakar
platform_get_resource(pdev, IORESOURCE_IRQ, ..) relies on static allocation of IRQ resources in DT core code, this causes an issue when using hierarchical interrupt domains using "interrupts" property in the node as this bypasses the hierarchical setup and messes up the irq chaining. In preparation for removal of static setup of IRQ resource from DT core code use platform_get_irq(). Link: https://lore.kernel.org/r/20211224161334.31123-1-prabhakar.mahadev-lad.rj@bp.renesas.com Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Signed-off-by: Will Deacon <will@kernel.org>
2022-01-13Merge tag 'irq-msi-2022-01-13' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull MSI irq updates from Thomas Gleixner: "Rework of the MSI interrupt infrastructure. This is a treewide cleanup and consolidation of MSI interrupt handling in preparation for further changes in this area which are necessary to: - address existing shortcomings in the VFIO area - support the upcoming Interrupt Message Store functionality which decouples the message store from the PCI config/MMIO space" * tag 'irq-msi-2022-01-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (94 commits) genirq/msi: Populate sysfs entry only once PCI/MSI: Unbreak pci_irq_get_affinity() genirq/msi: Convert storage to xarray genirq/msi: Simplify sysfs handling genirq/msi: Add abuse prevention comment to msi header genirq/msi: Mop up old interfaces genirq/msi: Convert to new functions genirq/msi: Make interrupt allocation less convoluted platform-msi: Simplify platform device MSI code platform-msi: Let core code handle MSI descriptors bus: fsl-mc-msi: Simplify MSI descriptor handling soc: ti: ti_sci_inta_msi: Remove ti_sci_inta_msi_domain_free_irqs() soc: ti: ti_sci_inta_msi: Rework MSI descriptor allocation NTB/msi: Convert to msi_on_each_desc() PCI: hv: Rework MSI handling powerpc/mpic_u3msi: Use msi_for_each-desc() powerpc/fsl_msi: Use msi_for_each_desc() powerpc/pasemi/msi: Convert to msi_on_each_dec() powerpc/cell/axon_msi: Convert to msi_on_each_desc() powerpc/4xx/hsta: Rework MSI handling ...
2022-01-04drivers: perf: marvell_cn10k: fix an IS_ERR() vs NULL checkDan Carpenter
The devm_ioremap() function does not return error pointers. It returns NULL. Fixes: 036a7584bede ("drivers: perf: Add LLC-TAD perf counter support") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Link: https://lore.kernel.org/r/20211217145907.GA16611@kili Signed-off-by: Will Deacon <will@kernel.org>
2022-01-04perf/smmuv3: Fix unused variable warning when CONFIG_OF=nWill Deacon
The kbuild robot reports that building the SMMUv3 PMU driver with CONFIG_OF=n results in a warning for W=1 builds: >> drivers/perf/arm_smmuv3_pmu.c:889:34: warning: unused variable 'smmu_pmu_of_match' [-Wunused-const-variable] static const struct of_device_id smmu_pmu_of_match[] = { ^ Guard the match table with #ifdef CONFIG_OF. Link: https://lore.kernel.org/r/202201041700.01KZEzhb-lkp@intel.com Fixes: 3f7be4356176 ("perf/smmuv3: Add devicetree support") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Will Deacon <will@kernel.org>
2021-12-16perf/smmuv3: Use msi_get_virq()Thomas Gleixner
Let the core code fiddle with the MSI descriptor retrieval. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20211210221815.029143589@linutronix.de
2021-12-14Merge branch 'for-next/perf-smmu' into for-next/perfWill Deacon
* for-next/perf-smmu: perf/smmuv3: Synthesize IIDR from CoreSight ID registers perf/smmuv3: Add devicetree support dt-bindings: Add Arm SMMUv3 PMCG binding
2021-12-14Merge branch 'for-next/perf-hisi' into for-next/perfWill Deacon
* for-next/perf-hisi: drivers/perf: hisi: Add driver for HiSilicon PCIe PMU docs: perf: Add description for HiSilicon PCIe PMU driver
2021-12-14Merge branch 'for-next/perf-cn10k' into for-next/perfWill Deacon
* for-next/perf-cn10k: dt-bindings: perf: Add YAML schemas for Marvell CN10K LLC-TAD pmu bindings drivers: perf: Add LLC-TAD perf counter support
2021-12-14drivers/perf: hisi: Add driver for HiSilicon PCIe PMUQi Liu
PCIe PMU Root Complex Integrated End Point(RCiEP) device is supported to sample bandwidth, latency, buffer occupation etc. Each PMU RCiEP device monitors multiple Root Ports, and each RCiEP is registered as a PMU in /sys/bus/event_source/devices, so users can select target PMU, and use filter to do further sets. Filtering options contains: event - select the event. port - select target Root Ports. Information of Root Ports are shown under sysfs. bdf - select requester_id of target EP device. trig_len - set trigger condition for starting event statistics. trig_mode - set trigger mode. 0 means starting to statistic when bigger than trigger condition, and 1 means smaller. thr_len - set threshold for statistics. thr_mode - set threshold mode. 0 means count when bigger than threshold, and 1 means smaller. Acked-by: Krzysztof Wilczyński <kw@linux.com> Reviewed-by: John Garry <john.garry@huawei.com> Signed-off-by: Qi Liu <liuqi115@huawei.com> Reviewed-by: Shaokun Zhang <zhangshaokun@hisilicon.com> Link: https://lore.kernel.org/r/20211202080633.2919-3-liuqi115@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2021-12-14drivers: perf: Add LLC-TAD perf counter supportBhaskara Budiredla
This driver adds support for Last-level cache tag-and-data unit (LLC-TAD) PMU that is featured in some of the Marvell's CN10K infrastructure silicons. The LLC is divided into 2N slices distributed across N Mesh tiles in a single-socket configuration. The driver always configures the same counter for all of the TADs. The user would end up effectively reserving one of eight counters in every TAD to look across all TADs. The occurrences of events are aggregated and presented to the user at the end of an application run. The driver does not provide a way for the user to partition TADs so that different TADs are used for different applications. The event counters are zeroed to start event counting to avoid any rollover issues. TAD perf counters are 64-bit, so it's not currently possible to overflow event counters at current mesh and core frequencies. To measure tad pmu events use perf tool stat command. For instance: perf stat -e tad_dat_msh_in_dss,tad_req_msh_out_any <application> perf stat -e tad_alloc_any,tad_hit_any,tad_tag_rd <application> Signed-off-by: Bhaskara Budiredla <bbudiredla@marvell.com> Link: https://lore.kernel.org/r/20211115043506.6679-2-bbudiredla@marvell.com Signed-off-by: Will Deacon <will@kernel.org>
2021-12-14perf/smmuv3: Synthesize IIDR from CoreSight ID registersRobin Murphy
The SMMU_PMCG_IIDR register was not present in older revisions of the Arm SMMUv3 spec. On Arm Ltd. implementations, the IIDR value consists of fields from several PIDR registers, allowing us to present a standardized identifier to userspace. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Link: https://lore.kernel.org/r/20211117144844.241072-4-jean-philippe@linaro.org Signed-off-by: Will Deacon <will@kernel.org>
2021-12-14perf/smmuv3: Add devicetree supportJean-Philippe Brucker
Add device-tree support to the SMMUv3 PMCG driver. Signed-off-by: Jay Chen <jkchen@linux.alibaba.com> Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/20211117144844.241072-3-jean-philippe@linaro.org Signed-off-by: Will Deacon <will@kernel.org>
2021-12-14perf/arm-cmn: Add debugfs topology infoRobin Murphy
In general, detailed performance analysis will require knoweldge of the the SoC beyond the CMN itself - e.g. which actual CPUs/peripherals/etc. are connected to each node. However for certain development and bringup tasks it can be useful to have a quick overview of the CMN internal topology to hand too. Add a debugfs file to map this out. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/159fd4d7e19fb3c8801a8cb64ee73ec50f55903c.1638530442.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2021-12-14perf/arm-cmn: Add CI-700 SupportRobin Murphy
Add the identifiers and events for the CI-700 coherent interconnect. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/28f566ab23a83733c6c9ef9414c010b760b4549c.1638530442.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2021-12-14perf/arm-cmn: Support new IP featuresRobin Murphy
The second generation of CMN IPs add new node types and significantly expand the configuration space with options for extra device ports on edge XPs, either plumbed into the regular DTM or with extra dedicated DTMs to monitor them, plus larger (and smaller) mesh sizes. Add basic support for pulling this new information out of the hardware, piping it around as necessary, and handling (most of) the new choices. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/e58b495bcc7deec3882be4bac910ed0bf6979674.1638530442.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2021-12-14perf/arm-cmn: Demarcate CMN-600 specificsRobin Murphy
In preparation for supporting newer CMN products, let's introduce a means to differentiate the features and events which are specific to a particular IP from those which remain common to the whole family. The newer designs have also smoothed off some of the rough edges in terms of discoverability, so separate out the parts of the flow which have effectively now become CMN-600 quirks. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/9f6368cdca4c821d801138939508a5bba54ccabb.1638530442.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2021-12-14perf/arm-cmn: Move group validation data off-stackRobin Murphy
With the value of CMN_MAX_DTMS increasing significantly, our validation data structure is set to get quite big. Technically we could pack it at least twice as densely, since we only need around 19 bits of information per DTM, but that makes the code even more mind-bogglingly impenetrable, and even half of "quite big" may still be uncomfortably large for a stack frame (~1KB). Just move it to an off-stack allocation instead. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/0cabff2e5839ddc0979e757c55515966f65359e4.1638530442.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2021-12-14perf/arm-cmn: Optimise DTC counter accessesRobin Murphy
In cases where we do know which DTC domain a node belongs to, we can skip initialising or reading the global count in DTCs where we know it won't change. The machinery to achieve that is mostly in place already, so finish hooking it up by converting the vestigial domain tracking to propagate suitable bitmaps all the way through to events. Note that this does not allow allocating such an unused counter to a different event on that DTC, because that is a flippin' nightmare. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/51d930fd945ef51c81f5889ccca055c302b0a1d0.1638530442.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2021-12-14perf/arm-cmn: Optimise DTM counter readsRobin Murphy
When multiple nodes of the same type are connected to the same XP (particularly in CAL configurations), it seems that they are likely to be consecutive in logical ID. Therefore, we're likely to gain a small benefit from an easy tweak to optimise out consecutive reads of the same set of DTM counters for an aggregated event. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/7777d77c2df17693cd3dabb6e268906e15238d82.1638530442.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2021-12-14perf/arm-cmn: Refactor DTM handlingRobin Murphy
Untangle DTMs from XPs into a dedicated abstraction. This helps make things a little more obvious and robust, but primarily paves the way for further development where new IPs can grow extra DTMs per XP. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/9cca18b1b98f482df7f1aaf3d3213e7f39500423.1638530442.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2021-12-14perf/arm-cmn: Streamline node iterationRobin Murphy
Refactor the places where we scan through the set of nodes to switch from explicit array indexing to pointer-based iteration. This leads to slightly simpler object code, but also makes the source less dense and more pleasant for further development. It also unearths an almost-bug in arm_cmn_event_init() where we've been depending on the "array index" of NULL relative to cmn->dns being a sufficiently large number, yuck. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/ee0c9eda9a643f46001ac43aadf3f0b1fd5660dd.1638530442.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2021-12-14perf/arm-cmn: Refactor node ID handlingRobin Murphy
Add a bit more abstraction for the places where we decompose node IDs. This will help keep things nice and manageable when we come to add yet more variables which affect the node ID format. Also use the opportunity to move the rest of the low-level node management helpers back up to the logical place they were meant to be - how they ended up buried right in the middle of the event-related definitions is somewhat of a mystery... Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/a2242a8c3c96056c13a04ae87bf2047e5e64d2d9.1638530442.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2021-12-14perf/arm-cmn: Drop compile-test restrictionRobin Murphy
Although CMN is currently (and overwhelmingly likely to remain) deployed in arm64-only (modulo userspace) systems, the 64-bit "dependency" for compile-testing was just laziness due to heavy reliance on readq/writeq accessors. Since we only need one extra include for robustness in that regard, let's pull that in, widen the compile-test coverage, and fix up the smattering of type laziness that that brings to light. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/baee9ee0d0bdad8aaeb70f5a4b98d8fd4b1f5786.1638530442.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2021-12-14perf/arm-cmn: Account for NUMA affinityRobin Murphy
On a system with multiple CMN meshes, ideally we'd want to access each PMU from within its own mesh, rather than with a long CML round-trip, wherever feasible. Since such a system is likely to be presented as multiple NUMA nodes, let's also hope a proximity domain is specified for each CMN programming interface, and use that to guide our choice of IRQ affinity to favour a node-local CPU where possible. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/32438b0d016e0649d882d47d30ac2000484287b9.1638530442.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2021-12-14perf/arm-cmn: Fix CPU hotplug unregistrationRobin Murphy
Attempting to migrate the PMU context after we've unregistered the PMU device, or especially if we never successfully registered it in the first place, is a woefully bad idea. It's also fundamentally pointless anyway. Make sure to unregister an instance from the hotplug handler *without* invoking the teardown callback. Fixes: 0ba64770a2f2 ("perf: Add Arm CMN-600 PMU driver") Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/2c221d745544774e4b07583b65b5d4d94f7e0fe4.1638530442.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2021-11-02Merge tag 'acpi-5.16-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI updates from Rafael Wysocki: "These update the ACPICA code in the kernel to the most recent upstream revision, address some issues related to the ACPI power resources management, simplify the enumeration of PCI devices having ACPI companions, add new quirks, fix assorted problems, update the ACPI-related information in maintainers and clean up code in several places. Specifics: - Update the ACPICA code in the kernel to upstream revision 20210930 including the following changes: - Fix system-wide resume issue caused by evaluating control methods too early in the resume path (Rafael Wysocki). - Add support for Windows 2020 _OSI string (Mario Limonciello). - Add Generic Port Affinity type for SRAT (Alison Schofield). - Add disassembly support for the NHLT ACPI table (Bob Moore). - Avoid flushing caches before entering C3 type of idle states on AMD processors (Deepak Sharma). - Avoid enumerating CPUs that are not present and not online-capable according to the platform firmware (Mario Limonciello). - Add DMI-based mechanism to quirk IRQ overrides and use it for two platforms (Hui Wang). - Change the configuration of unused ACPI device objects to reflect the D3cold power state after enumerating devices (Rafael Wysocki). - Update MAINTAINERS information regarding ACPI (Rafael Wysocki). - Fix typo in ACPI Kconfig (Masanari Iid). - Use sysfs_emit() instead of snprintf() in some places (Qing Wang). - Make the association of ACPI device objects with PCI devices more straightforward and simplify the code doing that for all devices in general (Rafael Wysocki). - Use acpi_device_adr() in acpi_find_child_device() instead of evaluating _ADR (Rafael Wysocki). - Drop duplicate device IDs from PNP device IDs list (Krzysztof Kozlowski). - Allow acpi_idle_play_dead() to use C3 on AMD processors (Richard Gong). - Use ACPI_COMPANION() to simplify code in some drivers (Rafael Wysocki). - Check the states of all ACPI power resources during initialization to avoid dealing with power resources in unknown states (Rafael Wysocki). - Fix ACPI power resource issues related to sharing wakeup power resources (Rafael Wysocki). - Avoid registering redundant suspend_ops (Rafael Wysocki). - Report battery charging state as "full" if it appears to be over the design capacity (André Almeida). - Quirk GK45 mini PC to skip reading _PSR in the AC driver (Stefan Schaeckeler). - Mark apei_hest_parse() static (Christoph Hellwig). - Relax platform response timeout to 1 second after instructing it to inject an error (Shuai Xue). - Make the PRM code handle memory allocation and remapping failures more gracefully and drop some unnecessary blank lines from that code (Aubrey Li). - Fix spelling mistake in the ACPI documentation (Colin Ian King)" * tag 'acpi-5.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (36 commits) ACPI: glue: Use acpi_device_adr() in acpi_find_child_device() perf: qcom_l2_pmu: ACPI: Use ACPI_COMPANION() directly ACPI: APEI: mark apei_hest_parse() static ACPI: APEI: EINJ: Relax platform response timeout to 1 second gpio-amdpt: ACPI: Use the ACPI_COMPANION() macro directly nouveau: ACPI: Use the ACPI_COMPANION() macro directly ACPI: resources: Add one more Medion model in IRQ override quirk ACPI: AC: Quirk GK45 to skip reading _PSR ACPI: PM: sleep: Do not set suspend_ops unnecessarily ACPI: PRM: Handle memory allocation and memory remap failure ACPI: PRM: Remove unnecessary blank lines ACPI: PM: Turn off wakeup power resources on _DSW/_PSW errors ACPI: PM: Fix sharing of wakeup power resources ACPI: PM: Turn off unused wakeup power resources ACPI: PM: Check states of power resources during initialization ACPI: replace snprintf() in "show" functions with sysfs_emit() ACPI: LPSS: Use ACPI_COMPANION() directly ACPI: scan: Release PM resources blocked by unused objects ACPI: battery: Accept charges over the design capacity as full ACPICA: Update version to 20210930 ...
2021-11-01Merge tag 'arm64-upstream' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 updates from Will Deacon: "There's the usual summary below, but the highlights are support for the Armv8.6 timer extensions, KASAN support for asymmetric MTE, the ability to kexec() with the MMU enabled and a second attempt at switching to the generic pfn_valid() implementation. Summary: - Support for the Arm8.6 timer extensions, including a self-synchronising view of the system registers to elide some expensive ISB instructions. - Exception table cleanup and rework so that the fixup handlers appear correctly in backtraces. - A handful of miscellaneous changes, the main one being selection of CONFIG_HAVE_POSIX_CPU_TIMERS_TASK_WORK. - More mm and pgtable cleanups. - KASAN support for "asymmetric" MTE, where tag faults are reported synchronously for loads (via an exception) and asynchronously for stores (via a register). - Support for leaving the MMU enabled during kexec relocation, which significantly speeds up the operation. - Minor improvements to our perf PMU drivers. - Improvements to the compat vDSO build system, particularly when building with LLVM=1. - Preparatory work for handling some Coresight TRBE tracing errata. - Cleanup and refactoring of the SVE code to pave the way for SME support in future. - Ensure SCS pages are unpoisoned immediately prior to freeing them when KASAN is enabled for the vmalloc area. - Try moving to the generic pfn_valid() implementation again now that the DMA mapping issue from last time has been resolved. - Numerous improvements and additions to our FPSIMD and SVE selftests" [ armv8.6 timer updates were in a shared branch and already came in through -tip in the timer pull - Linus ] * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (85 commits) arm64: Select POSIX_CPU_TIMERS_TASK_WORK arm64: Document boot requirements for FEAT_SME_FA64 arm64/sve: Fix warnings when SVE is disabled arm64/sve: Add stub for sve_max_virtualisable_vl() arm64: errata: Add detection for TRBE write to out-of-range arm64: errata: Add workaround for TSB flush failures arm64: errata: Add detection for TRBE overwrite in FILL mode arm64: Add Neoverse-N2, Cortex-A710 CPU part definition selftests: arm64: Factor out utility functions for assembly FP tests arm64: vmlinux.lds.S: remove `.fixup` section arm64: extable: add load_unaligned_zeropad() handler arm64: extable: add a dedicated uaccess handler arm64: extable: add `type` and `data` fields arm64: extable: use `ex` for `exception_table_entry` arm64: extable: make fixup_exception() return bool arm64: extable: consolidate definitions arm64: gpr-num: support W registers arm64: factor out GPR numbering helpers arm64: kvm: use kvm_exception_table_entry arm64: lib: __arch_copy_to_user(): fold fixups into body ...
2021-10-27perf: qcom_l2_pmu: ACPI: Use ACPI_COMPANION() directlyRafael J. Wysocki
The ACPI_HANDLE() macro is a wrapper arond the ACPI_COMPANION() macro and the ACPI handle produced by the former comes from the ACPI device object produced by the latter, so it is way more straightforward to evaluate the latter directly instead of passing the handle produced by the former to acpi_bus_get_device(). Modify l2_cache_pmu_probe_cluster() accordingly (no intentional functional impact). While at it, rename the ACPI device pointer to adev for more clarity. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-10-04drivers/perf: Improve build test coverageJohn Garry
Improve build test cover by allowing some drivers to build under COMPILE_TEST where possible. Some notes: - Mostly a dependency on CONFIG_ACPI is not really required for only building (but left untouched), but is required for TX2 which uses ACPI functions which have no stubs - XGENE required 64b dependency as it relies on some unsigned long perf struct fields being 64b - I don't see why TX2 requires NUMA to build, but left untouched - Added an explicit dependency on GENERIC_MSI_IRQ_DOMAIN for ARM_SMMU_V3_PMU, which is required for platform MSI functions Signed-off-by: John Garry <john.garry@huawei.com> Link: https://lore.kernel.org/r/1633085326-156653-3-git-send-email-john.garry@huawei.com Signed-off-by: Will Deacon <will@kernel.org>