diff options
author | Vladimir Oltean | 2023-03-21 03:03:23 +0200 |
---|---|---|
committer | Greg Kroah-Hartman | 2023-04-06 12:10:37 +0200 |
commit | ec6cd79c4e54a2a5867e0497d269ad4f008216ba (patch) | |
tree | 0060e011490c15d427fca4b26e5c3f9487c0309d /drivers | |
parent | 39cd75f2f3a43c0e2f95749eb6dd6420c553f87d (diff) |
net: mscc: ocelot: fix stats region batching
[ Upstream commit 6acc72a43eac78a309160d0a7512bbc59bcdd757 ]
The blamed commit changed struct ocelot_stat_layout :: "u32 offset" to
"u32 reg".
However, "u32 reg" is not quite a register address, but an enum
ocelot_reg, which in itself encodes an enum ocelot_target target in the
upper bits, and an index into the ocelot->map[target][] array in the
lower bits.
So, whereas the previous code comparison between stats_layout[i].offset
and last + 1 was correct (because those "offsets" at the time were
32-bit relative addresses), the new code, comparing layout[i].reg to
last + 4 is not correct, because the "reg" here is an enum/index, not an
actual register address.
What we want to compare are indeed register addresses, but to do that,
we need to actually go through the same motions as
__ocelot_bulk_read_ix() itself.
With this bug, all statistics counters are deemed by
ocelot_prepare_stats_regions() as constituting their own region.
(Truncated) log on VSC9959 (Felix) below (prints added by me):
Before:
region of 1 contiguous counters starting with SYS:STAT:CNT[0x000]
region of 1 contiguous counters starting with SYS:STAT:CNT[0x001]
region of 1 contiguous counters starting with SYS:STAT:CNT[0x002]
...
region of 1 contiguous counters starting with SYS:STAT:CNT[0x041]
region of 1 contiguous counters starting with SYS:STAT:CNT[0x042]
region of 1 contiguous counters starting with SYS:STAT:CNT[0x080]
region of 1 contiguous counters starting with SYS:STAT:CNT[0x081]
...
region of 1 contiguous counters starting with SYS:STAT:CNT[0x0ac]
region of 1 contiguous counters starting with SYS:STAT:CNT[0x100]
region of 1 contiguous counters starting with SYS:STAT:CNT[0x101]
...
region of 1 contiguous counters starting with SYS:STAT:CNT[0x111]
After:
region of 67 contiguous counters starting with SYS:STAT:CNT[0x000]
region of 45 contiguous counters starting with SYS:STAT:CNT[0x080]
region of 18 contiguous counters starting with SYS:STAT:CNT[0x100]
Since commit d87b1c08f38a ("net: mscc: ocelot: use bulk reads for
stats") intended bulking as a performance improvement, and since now,
with trivial-sized regions, performance is even worse than without
bulking at all, this could easily qualify as a performance regression.
Fixes: d4c367650704 ("net: mscc: ocelot: keep ocelot_stat_layout by reg address, not offset")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Acked-by: Colin Foster <colin.foster@in-advantage.com>
Tested-by: Colin Foster <colin.foster@in-advantage.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Diffstat (limited to 'drivers')
-rw-r--r-- | drivers/net/ethernet/mscc/ocelot_stats.c | 3 |
1 files changed, 2 insertions, 1 deletions
diff --git a/drivers/net/ethernet/mscc/ocelot_stats.c b/drivers/net/ethernet/mscc/ocelot_stats.c index dbd20b125cea..0066219bb0e8 100644 --- a/drivers/net/ethernet/mscc/ocelot_stats.c +++ b/drivers/net/ethernet/mscc/ocelot_stats.c @@ -392,7 +392,8 @@ static int ocelot_prepare_stats_regions(struct ocelot *ocelot) if (!ocelot->stats_layout[i].reg) continue; - if (region && ocelot->stats_layout[i].reg == last + 4) { + if (region && ocelot->map[SYS][ocelot->stats_layout[i].reg & REG_MASK] == + ocelot->map[SYS][last & REG_MASK] + 4) { region->count++; } else { region = devm_kzalloc(ocelot->dev, sizeof(*region), |