Age | Commit message (Collapse) | Author |
|
Currently, when the hypervisor rejects a page during lock operation, the
VM treats pages differently according to the error-code: in certain
cases the page is immediately freed, and in others it is put on a
rejection list and only freed later.
The behavior does not make too much sense. If the page is freed
immediately it is very likely to be used again in the next batch of
allocations, and be rejected again.
In addition, for support of compaction and OOM notifiers, we wish to
separate the logic that communicates with the hypervisor (as well as
analyzes the status of each page) from the logic that allocates or free
pages.
Treat all errors the same way, queuing the pages on the refuse list.
Move to the next allocation size (4k) when too many pages are refused.
Free the refused pages when moving to the next size to avoid situations
in which too much memory is waiting to be freed on the refused list.
Reviewed-by: Xavier Deguillard <xdeguillard@vmware.com>
Signed-off-by: Nadav Amit <namit@vmware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
The current abstractions for batch vs single operations seem suboptimal
and complicate the implementation of additional features (OOM,
compaction).
The immediate problem of the current abstractions is that they cause
differences in how operations are handled when batching is on or off.
For example, the refused_alloc counter is not updated when batching is
on. These discrepancies are caused by code redundancies.
Instead, this patch presents three type of operations, according to
whether batching is on or off: (1) add page, (2) communication with the
hypervisor and (3) retrieving the status of a page.
To avoid the overhead of virtual functions, and since we do not expect
additional interfaces for communication with the hypervisor, we use
static keys instead of virtual functions.
Finally, while we are at it, change vmballoon_init_batching() to return
int instead of bool, to be consistent in the return type and avoid
potential coding errors.
Reviewed-by: Xavier Deguillard <xdeguillard@vmware.com>
Signed-off-by: Nadav Amit <namit@vmware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Splitting the allocations between sleeping and non-sleeping made some
sort of sense as long as rate-limiting was enabled. Now that it is
removed, we need to decide - either we want sleeping allocations or not.
Since no other Linux balloon driver (hv, Xen, virtio) uses sleeping
allocations, use the same approach.
We do distinguish, however, between 2MB allocations and 4kB allocations
and prevent reclamation on 2MB. In both cases, we avoid using emergency
low-memory pools, as it may cause undesired effects.
Reviewed-by: Xavier Deguillard <xdeguillard@vmware.com>
Signed-off-by: Nadav Amit <namit@vmware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
The use of accessors for batch entries complicates the code and makes it
less readable. Remove it an instead use bit-fields.
Reviewed-by: Xavier Deguillard <xdeguillard@vmware.com>
Signed-off-by: Nadav Amit <namit@vmware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
The lock and unlock code paths are very similar, so avoid the duplicate
code by merging them together.
Reviewed-by: Xavier Deguillard <xdeguillard@vmware.com>
Signed-off-by: Nadav Amit <namit@vmware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Now that we have a single point, unify the tracing and collecting the
statistics for commands and their failure. While it might somewhat
reduce the control over debugging, it cleans the code a lot.
Reviewed-by: Xavier Deguillard <xdeguillard@vmware.com>
Signed-off-by: Nadav Amit <namit@vmware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
By inlining the hypercall interface, we can unify several operations
into one central point in the code:
- Updating the target.
- Updating when a reset is needed.
- Update statistics (which will be done later in the patch-set).
- Print debug-messages (although they cannot be enabled as selectively).
Reviewed-by: Xavier Deguillard <xdeguillard@vmware.com>
Signed-off-by: Nadav Amit <namit@vmware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
of_node_put and put_device has taken the null pointer check into account.
So it is safe to remove the duplicated check.
Signed-off-by: zhong jiang <zhongjiang@huawei.com>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
While we updated the coresight DT bindings, some of the
new examples were not updated due to the order in which they
were merged. Let us update all the missed out ones to the
new bindings to avoid confusion.
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Rob Herring <robh@kernel.org>
Cc: Frank Rowand <frowand.list@gmail.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
TPIU component has an input port. The example uses out-ports
which is wrong. Let us fix it.
Reported-by: Leo Yan <leo.yan@linaro.org>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Use CLAIM protocol to make sure the device is available for use.
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Use the CLAIM protocol to grab the ownership of the component when
in use.
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Use the CLAIM protocol to grab the ownership of the component.
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Use the CLAIM tags to grab the device for self-hosted usage.
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Coresight architecture defines CLAIM tags for a device to negotiate
control of the components (external agent vs self-hosted). Each device
has a pair of registers (CLAIMSET & CLAIMCLR) for managing the CLAIM
tags. However, the protocol for the CLAIM tags is IMPLEMENTATION DEFINED.
PSCI has recommendations for the use of the CLAIM tags to negotiate
controls for external agent vs self-hosted use. This patch implements
the recommended protocol by PSCI.
The claim/disclaim operations are performed from the device specific
drivers. The disadvantage is that the calls are sprinkled in each driver,
but this makes the operation much simpler.
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
When a replicator port is enabled, we block the traffic
on the other port and route all traffic to the new enabled
port. If there are two active trace sessions each targeting
the two different paths from the replicator, the second session
will disable the first session and route all the data to the
second path.
ETR
/
e.g, replicator
\
ETB
If CPU0 is operated in sysfs mode to ETR and CPU1 is operated
in perf mode to ETB, depending on the order in which the
replicator is enabled one device is blocked.
Ideally we need trace-id for the session to make the
right choice. That implies we need a trace-id allocation
logic for the coresight subsystem and use that to route
the traffic. The short term solution is to only manage
the "target port" and leave the other port untouched.
That leaves both the paths unaffected, except that some
unwanted traffic may be pushed to the paths (if the Trace-IDs
are not far enough), which is still fine and can be filtered
out while processing rather than silently blocking the data.
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Prepare the etb10 driver to return errors in enabling
the device.
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Add support for reporting errors back from the SMP cross
function call for enabling ETM.
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Add support for handling errors in enabling the component.
The ETM is enabled via cross call to owner CPU. Make
necessary changes to report the error back from the cross
call.
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Prepare to handle errors in enabling the hardware and
report it back to the core driver.
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Make sure we honor the errors in CATU device and abort the operation.
While at it, delay setting the etr_buf for the session until we are
sure that we are indeed enabling the ETR.
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Refactor the tmc-etr enable operation to make it easier to
handle errors in enabling the hardware. We need to make
sure that the buffer is compatible with the ETR. This
patch re-arranges to make the error handling easier, by
deferring the hardware enablement until all the errors
are checked. This also avoids turning the CATU on/off
during a sysfs read session.
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
coresight_enable_path() enables the components in a trace
path from a given source to a sink, excluding the source.
The operation is performed in the reverse order; the sink
first and then backwards in the list. However, if we encounter
an error in enabling any of the component, we simply disable
all the components in the given path irrespective of whether
we enabled some of the components in the enable iteration.
This could interfere with another trace session if one of the
link devices is turned off (e.g, TMC-ETF). So, we need to
make sure that we only disable those components which were
actually enabled from the iteration.
This patch achieves the same by refactoring the coresight_disable_path
to accept a "node" to start from in the forward order, which can
then be used from the error path of coresight_enable_path().
With this change, we don't issue a disable call back for a component
which didn't get enabled. This change of behavior triggers
a bug in coresight_enable_link(), where we leave the refcount
on the device and will prevent the device from being enabled
forever. So, we also drop the refcount in the coresight_enable_link()
if the operation failed.
Also, with the refactoring, we always start after the first node (which
is the "SOURCE" device) for disabling the entire path. This implies,
we must not find a "SOURCE" in the middle of the path. Hence, added
a WARN_ON() to make sure the paths we get are sane, rather than
simply ignoring them.
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
>From the comment in the code, it claims the requirement for byte-address
alignment for RRP register: 'for 32-bit, 64-bit and 128-bit wide trace
memory, the four LSBs must be 0s. For 256-bit wide trace memory, the
five LSBs must be 0s'. This isn't consistent with the program, the
program sets five LSBs as zeros for 32/64/128-bit wide trace memory and
set six LSBs zeros for 256-bit wide trace memory.
After checking with the CoreSight Trace Memory Controller technical
reference manual (ARM DDI 0461B, section 3.3.4 RAM Read Pointer
Register), it proves the comment is right and the program does wrong
setting.
This patch fixes byte-address alignment for RRP by following correct
definition in the technical reference manual.
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
In ETB dump function tmc_etb_dump_hw() it has nested loops. The second
level loop is to iterate index in the range [0 .. drvdata->memwidth);
but the index isn't really used in the code, thus the second level
loop is useless.
This patch is to remove the second level loop; the refactor also reduces
indentation and we can use 'break' to replace 'goto' tag.
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
For non-VHE systems host kernel runs at EL1 and jumps to EL2 whenever
hypervisor code should be executed. In this case ETM4x driver must
restrict configuration to EL1 when it setups kernel tracing.
However, there is no separate hypervisor privilege level when VHE
is enabled, the host kernel runs at EL2.
This patch fixes configuration of TRCACATRn register for VHE systems
so that ETM_EXLEVEL_NS_HYP bit is used instead of ETM_EXLEVEL_NS_OS
to on/off kernel tracing. At the same time, it moves common code
to new helper.
Signed-off-by: Tomasz Nowicki <tnowicki@caviumnetworks.com>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Up until now the relative simplicity of enabling the ETB made it
possible to accommodate processing for both sysFS and perf methods.
But work on claimtags and CPU-wide trace scenarios is adding some
complexity, making the current code messy and hard to maintain.
As such follow what has been done for ETF and ETR components and split
function etb_enable() so that processing for both API can be done
cleanly.
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
This patch moves the etb_drvdata::mode from a locat_t to a simple u32,
as it is for the ETF and ETR drivers. This streamlines the code and adds
commonality with the other drivers when dealing with similar operations.
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Add support for using TMC-ETR as backend for ETM perf tracing.
We use software double buffering at the moment. i.e, the TMC-ETR
uses a separate buffer than the perf ring buffer. The data is
copied to the perf ring buffer once a session completes.
The TMC-ETR would try to match the larger of perf ring buffer
or the ETR buffer size configured via sysfs, scaling down to
a minimum limit of 1MB.
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
In coresight perf mode, we need to prepare the sink before
starting a session, which is done via set_buffer call back.
We then proceed to enable the tracing. If we fail to start
the session successfully, we leave the sink configuration
unchanged. In order to make the operation atomic and to
avoid yet another call back to clear the buffer, we get
rid of the "set_buffer" call back and pass the buffer details
via enable() call back to the sink.
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
We can always find the sink configuration for a given perf_output_handle.
Add a helper to retrieve the sink configuration for a given
perf_output_handle. This will be used to get rid of the set_buffer()
call back.
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Right now we issue an update_buffer() and reset_buffer() call backs
in succession when we stop tracing an event. The update_buffer is
supposed to check the status of the buffer and make sure the ring buffer
is updated with the trace data. And we store information about the
size of the data collected only to be consumed by the reset_buffer
callback which always follows the update_buffer. This was originally
designed for handling future IPs which could trigger a buffer overflow
interrupt. This patch gets rid of the reset_buffer callback altogether
and performs the actions in update_buffer, making it return the size
collected. We can always add the support for handling the overflow
interrupt case later.
This removes some not-so pretty hack (storing the new head in the
size field for snapshot mode) and cleans it up a little bit.
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Convert component enable/disable messages from dev_info to dev_dbg.
When used with perf, the components in the paths are enabled/disabled
during each schedule of the run, which can flood the dmesg with these
messages. Moreover, they are only useful for debug purposes. So,
convert such messages to dev_dbg() which can be turned on as
needed.
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Since the ETR now uses mode specific buffers, we can reliably
provide the trace data captured in sysfs mode, even when the ETR
is operating in PERF mode.
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Since the ETR could be driven either by SYSFS or by perf, it
becomes complicated how we deal with the buffers used for each
of these modes. The ETR driver cannot simply free the current
attached buffer without knowing the provider (i.e, sysfs vs perf).
To solve this issue, we provide:
1) the driver-mode specific etr buffer to be retained in the drvdata
2) the etr_buf for a session should be passed on when enabling the
hardware, which will be stored in drvdata->etr_buf. This will be
replaced (not free'd) as soon as the hardware is disabled, after
necessary sync operation.
The advantages of this are :
1) The common code path doesn't need to worry about how to dispose
an existing buffer, if it is about to start a new session with a
different buffer, possibly in a different mode.
2) The driver mode can control its buffers and can get access to the
saved session even when the hardware is operating in a different
mode. (e.g, we can still access a trace buffer from a sysfs mode
even if the etr is now used in perf mode, without disrupting the
current session.)
Towards this, we introduce a sysfs specific data which will hold the
etr_buf used for sysfs mode of operation, controlled solely by the
sysfs mode handling code.
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
We enable the trace path, before activating the source.
If we fail to enable the source, we must disable the path
to make sure it is available for another session.
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
At the moment, if there is no CPU specified for a given
event, we use cpu_online_mask and try to build path for
each of the CPUs in the mask. This could prevent any CPU
that is turned online later to be used for the tracing.
This patch changes to use the cpu_present_mask and tries
to build path for as much CPUs as possible ignoring the
failures in building path for some of the CPUs. If ever
we try to trace on those CPUs, we fail the operation.
Based on a patch from Mathieu Poirier.
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
We hold the read lock on CPU hotplug to simply copy the
online mask, which is not really needed. And this can
cause a lockdep warning, like :
[ 54.632093] ======================================================
[ 54.638207] WARNING: possible circular locking dependency detected
[ 54.644322] 4.18.0-rc3-00042-g2d39e6356bb7-dirty #309 Not tainted
[ 54.650350] ------------------------------------------------------
[ 54.656464] perf/2862 is trying to acquire lock:
[ 54.661031] 000000007e21d170 (&event->mmap_mutex){+.+.}, at: perf_event_set_output+0x98/0x138
[ 54.669486]
[ 54.669486] but task is already holding lock:
[ 54.675256] 000000001080eb1b (&cpuctx_mutex){+.+.}, at: perf_event_ctx_lock_nested+0xf8/0x1f0
[ 54.683704]
[ 54.683704] which lock already depends on the new lock.
[ 54.683704]
[ 54.691797]
[ 54.691797] the existing dependency chain (in reverse order) is:
[ 54.699201]
[ 54.699201] -> #3 (&cpuctx_mutex){+.+.}:
[ 54.704556] __mutex_lock+0x70/0x808
[ 54.708608] mutex_lock_nested+0x1c/0x28
[ 54.713005] perf_event_init_cpu+0x8c/0xd8
[ 54.717574] perf_event_init+0x194/0x1d4
[ 54.721971] start_kernel+0x2b8/0x42c
[ 54.726107]
[ 54.726107] -> #2 (pmus_lock){+.+.}:
[ 54.731114] __mutex_lock+0x70/0x808
[ 54.735165] mutex_lock_nested+0x1c/0x28
[ 54.739560] perf_event_init_cpu+0x30/0xd8
[ 54.744129] cpuhp_invoke_callback+0x84/0x248
[ 54.748954] _cpu_up+0xe8/0x1c8
[ 54.752576] do_cpu_up+0xa8/0xc8
[ 54.756283] cpu_up+0x10/0x18
[ 54.759731] smp_init+0xa0/0x114
[ 54.763438] kernel_init_freeable+0x120/0x288
[ 54.768264] kernel_init+0x10/0x108
[ 54.772230] ret_from_fork+0x10/0x18
[ 54.776279]
[ 54.776279] -> #1 (cpu_hotplug_lock.rw_sem){++++}:
[ 54.782492] cpus_read_lock+0x34/0xb0
[ 54.786631] etm_setup_aux+0x5c/0x308
[ 54.790769] rb_alloc_aux+0x1ec/0x300
[ 54.794906] perf_mmap+0x284/0x610
[ 54.798787] mmap_region+0x388/0x570
[ 54.802838] do_mmap+0x344/0x4f8
[ 54.806544] vm_mmap_pgoff+0xe4/0x110
[ 54.810682] ksys_mmap_pgoff+0xa8/0x240
[ 54.814992] sys_mmap+0x18/0x28
[ 54.818613] el0_svc_naked+0x30/0x34
[ 54.822661]
[ 54.822661] -> #0 (&event->mmap_mutex){+.+.}:
[ 54.828445] lock_acquire+0x48/0x68
[ 54.832409] __mutex_lock+0x70/0x808
[ 54.836459] mutex_lock_nested+0x1c/0x28
[ 54.840855] perf_event_set_output+0x98/0x138
[ 54.845680] _perf_ioctl+0x2a0/0x6a0
[ 54.849731] perf_ioctl+0x3c/0x68
[ 54.853526] do_vfs_ioctl+0xb8/0xa20
[ 54.857577] ksys_ioctl+0x80/0xb8
[ 54.861370] sys_ioctl+0xc/0x18
[ 54.864990] el0_svc_naked+0x30/0x34
[ 54.869039]
[ 54.869039] other info that might help us debug this:
[ 54.869039]
[ 54.876960] Chain exists of:
[ 54.876960] &event->mmap_mutex --> pmus_lock --> &cpuctx_mutex
[ 54.876960]
[ 54.887217] Possible unsafe locking scenario:
[ 54.887217]
[ 54.893073] CPU0 CPU1
[ 54.897552] ---- ----
[ 54.902030] lock(&cpuctx_mutex);
[ 54.905396] lock(pmus_lock);
[ 54.910911] lock(&cpuctx_mutex);
[ 54.916770] lock(&event->mmap_mutex);
[ 54.920566]
[ 54.920566] *** DEADLOCK ***
[ 54.920566]
[ 54.926424] 1 lock held by perf/2862:
[ 54.930042] #0: 000000001080eb1b (&cpuctx_mutex){+.+.}, at: perf_event_ctx_lock_nested+0xf8/0x1f0
Since we have per-cpu array for the paths, we simply don't care about
the number of online CPUs. This patch gets rid of the
{get/put}_online_cpus().
Reported-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
We create a coresight trace path for each online CPU when
we start the event. We rely on the number of online CPUs
and then go on to allocate an array matching the "number of
online CPUs" for holding the path and then uses normal
CPU id as the index to the array. This is problematic as
we could have some offline CPUs causing us to access beyond
the actual array size (e.g, on a dual SMP system, if CPU0 is
offline, CPU1 could be really accessing beyond the array).
The solution is to switch to per-cpu array for holding the path.
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
If the ETB is already enabled in sysfs mode, the ETB reports
success even if a perf mode is requested. Fix this by checking
the requested mode.
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
The coresight components could be operated either in sysfs mode or in perf
mode. For some of the components, the mode of operation doesn't matter as
they simply relay the data to the next component in the trace path. But for
sinks, they need to be able to provide the trace data back to the user.
Thus we need to make sure that "mode" is handled appropriately. e.g,
the sysfs mode could have multiple sources driving the trace data, while
perf mode doesn't allow sharing the sink.
The coresight_enable_sink() however doesn't really allow this check to
trigger as it skips the "enable_sink" callback if the component is
already enabled, irrespective of the mode. This could cause mixing
of data from different modes or even same mode (in perf), if the
sources are different. Also, if we fail to enable the sink while
enabling a path (where sink is the first component enabled),
we could end up in disabling the components in the "entire"
path which were not enabled in this trial, causing disruptions
in the existing trace paths.
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Use ERR_CAT inlined function to replace the ERR_PTR(PTR_ERR). It
make the code more concise.
Signed-off-by: zhong jiang <zhongjiang@huawei.com>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
The coresight drivers relied on default bindings for graph
in DT, while reusing the "reg" field of the "ports" to indicate
the actual hardware port number for the connections. This can
cause duplicate ports with same addresses, but different
direction. However, with the rules getting stricter for the
address mismatch with the label, it is no longer possible to use
the port address field for the hardware port number.
This patch introduces new DT binding rules for coresight
components, based on the same generic DT graph bindings, but
avoiding the address duplication.
- All output ports must be specified under a child node with
name "out-ports".
- All input ports must be specified under a childe node with
name "in-ports".
- Port address should match the hardware port number.
The support for legacy bindings is retained, with a warning.
Cc: Sudeep Holla <sudeep.holla@arm.com>
Cc: Rob Herring <robh@kernel.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
The platform code parses the component connections and populates
a platform-description of the output connections in arrays of fields
(which is never freed). This is later copied in the coresight_register
to a newly allocated area, represented by coresight_connection(s).
This patch cleans up the code dealing with connections by making
use of the "coresight_connection" structure right at the platform
code and lets the generic driver simply re-use information provided
by the platform.
Thus making it reader friendly as well as avoiding the wastage of
unused memory.
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Add a helper to check if the given endpoint is input.
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
When parsing the remote endpoint of an output port, we do :
rport = of_graph_get_remote_port(ep);
rparent = of_graph_get_remote_port_parent(ep);
and then parse the "remote_port" as if it was the remote endpoint,
which is wrong. The code worked fine because we used endpoint number
as the port number. Let us fix it and optimise a bit as:
remote_ep = of_graph_get_remote_endpoint(ep);
if (remote_ep)
remote_parent = of_graph_get_port_parent(remote_ep);
and then, parse the remote_ep for the port/endpoint details.
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
We don't drop the reference on the remote device while parsing the
connection, held by bus_find_device(). Fix this by duplicating the
device name and dropping the reference.
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Kim Phillips <kim.phillips@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
The coresight driver doesn't drop the references on the
remote endpoint/port nodes. Add the missing of_node_put()
calls.
Reported-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Refactor the of graph endpoint parsing code, to make the error
handling easier.
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 6403587a930c ("coresight: use put_device() instead of kfree()")
fixes the double freeing of resources and ensures that the device
refcount is dropped properly. Add a comment to explain this to
help the readers and prevent people trying to "unfix" it again.
While at it, rename the labels for better readability.
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Arvind Yadav <arvind.yadav.cs@gmail.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|