From 4219196d1f662cb10a462eb9e076633a3fc31a15 Mon Sep 17 00:00:00 2001
From: Sukadev Bhattiprolu
Date: Wed, 16 Mar 2022 18:12:31 -0700
Subject: ibmvnic: fix race between xmit and reset

There is a race between the reset and transmit paths that can lead to
ibmvnic_xmit() accessing an scrq after it has been freed in the reset
path. It can result in a crash like:

Kernel attempted to read user page (0) - exploit attempt? (uid: 0)
BUG: Kernel NULL pointer dereference on read at 0x00000000
Faulting instruction address: 0xc0080000016189f8
Oops: Kernel access of bad area, sig: 11 [#1]
...
NIP [c0080000016189f8] ibmvnic_xmit+0x60/0xb60 [ibmvnic]
LR [c000000000c0046c] dev_hard_start_xmit+0x11c/0x280
Call Trace:
[c008000001618f08] ibmvnic_xmit+0x570/0xb60 [ibmvnic] (unreliable)
[c000000000c0046c] dev_hard_start_xmit+0x11c/0x280
[c000000000c9cfcc] sch_direct_xmit+0xec/0x330
[c000000000bfe640] __dev_xmit_skb+0x3a0/0x9d0
[c000000000c00ad4] __dev_queue_xmit+0x394/0x730
[c008000002db813c] __bond_start_xmit+0x254/0x450 [bonding]
[c008000002db8378] bond_start_xmit+0x40/0xc0 [bonding]
[c000000000c0046c] dev_hard_start_xmit+0x11c/0x280
[c000000000c00ca4] __dev_queue_xmit+0x564/0x730
[c000000000cf97e0] neigh_hh_output+0xd0/0x180
[c000000000cfa69c] ip_finish_output2+0x31c/0x5c0
[c000000000cfd244] __ip_queue_xmit+0x194/0x4f0
[c000000000d2a3c4] __tcp_transmit_skb+0x434/0x9b0
[c000000000d2d1e0] __tcp_retransmit_skb+0x1d0/0x6a0
[c000000000d2d984] tcp_retransmit_skb+0x34/0x130
[c000000000d310e8] tcp_retransmit_timer+0x388/0x6d0
[c000000000d315ec] tcp_write_timer_handler+0x1bc/0x330
[c000000000d317bc] tcp_write_timer+0x5c/0x200
[c000000000243270] call_timer_fn+0x50/0x1c0
[c000000000243704] __run_timers.part.0+0x324/0x460
[c000000000243894] run_timer_softirq+0x54/0xa0
[c000000000ea713c] __do_softirq+0x15c/0x3e0
[c000000000166258] __irq_exit_rcu+0x158/0x190
[c000000000166420] irq_exit+0x20/0x40
[c00000000002853c] timer_interrupt+0x14c/0x2b0
[c000000000009a00] decrementer_common_virt+0x210/0x220
--- interrupt: 900 at plpar_hcall_norets_notrace+0x18/0x2c

The immediate cause of the crash is the access of tx_scrq in the
following snippet during a reset, where tx_scrq can be either NULL or
an address that will soon be invalid:

	ibmvnic_xmit()
	{
		...
		tx_scrq = adapter->tx_scrq[queue_num];
		txq = netdev_get_tx_queue(netdev, queue_num);
		ind_bufp = &tx_scrq->ind_buf;

		if (test_bit(0, &adapter->resetting)) {
		...
	}

But beyond that, the call to ibmvnic_xmit() itself is not safe during a
reset, and the reset path attempts to avoid this by stopping the queue
in ibmvnic_cleanup(). However, just after the queue was stopped, an
in-flight ibmvnic_complete_tx() could have restarted the queue even as
the reset is progressing.

Since the queue was restarted, we could get a call to ibmvnic_xmit()
which can then access the bad tx_scrq (or other fields).

We cannot, however, simply have ibmvnic_complete_tx() check the
->resetting bit and skip starting the queue. This can race at the
"back-end" of a good reset which has just restarted the queue but has
not cleared the ->resetting bit yet. If we skip restarting the queue
because ->resetting is true, the queue would remain stopped
indefinitely, potentially leading to transmit timeouts.

In other words, ->resetting is too broad for this purpose. Instead, use
a new flag that indicates whether or not the queues are active. Only
the open/reset paths control when the queues are active.
ibmvnic_complete_tx() and others wake up the queue only if the queue is
marked active.

So we will have:
	A. reset/open thread in ibmvnic_cleanup() and __ibmvnic_open()

		->resetting = true
		->tx_queues_active = false
		disable tx queues
		...
		->tx_queues_active = true
		start tx queues

	B. Tx interrupt in ibmvnic_complete_tx():

		if (->tx_queues_active)
			netif_wake_subqueue();

To ensure that ->tx_queues_active and the state of the queues remain
consistent, we need a lock which:

	- must also be taken in the interrupt path (ibmvnic_complete_tx())
	- is shared across the multiple queues in the adapter (so the
	  queues don't become serialized)

Use rcu_read_lock() in the readers and have the reset thread call
synchronize_rcu() after updating the ->tx_queues_active state.

While here, consolidate a few boolean fields in ibmvnic_adapter for
better alignment.

Based on discussions with Brian King and Dany Madden.
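[Editor's note: the flag-plus-RCU pattern described above can be reduced to
a small self-contained sketch. It is only an illustration under assumed
names (struct my_dev, tx_active, my_wake_queue and my_quiesce_tx are
invented and are not part of the ibmvnic driver); the authoritative change
is the diff below.]

	#include <linux/netdevice.h>
	#include <linux/rcupdate.h>

	struct my_dev {
		struct net_device *netdev;
		bool tx_active;		/* written only by open/reset paths */
	};

	/* Reader side: runs in the xmit or tx-completion path. */
	static void my_wake_queue(struct my_dev *dev, int queue)
	{
		rcu_read_lock();
		if (dev->tx_active)	/* flag flipped only by open/reset */
			netif_wake_subqueue(dev->netdev, queue);
		rcu_read_unlock();
	}

	/* Writer side: runs in the reset path before tearing queues down. */
	static void my_quiesce_tx(struct my_dev *dev)
	{
		dev->tx_active = false;
		/* Wait for every in-flight rcu_read_lock() section, so
		 * nobody can wake a queue after this point.
		 */
		synchronize_rcu();
		netif_tx_disable(dev->netdev);
		/* ...the tx sub-CRQs can now be freed or rebuilt safely... */
	}

The reset path pays for one synchronize_rcu() per reset, while the hot
transmit and completion paths only pay for rcu_read_lock()/unlock(); this
avoids the cross-queue serialization that a single shared lock would cause,
which is the concern the list above calls out.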
Fixes: 7ed5b31f4a66 ("net/ibmvnic: prevent more than one thread from running in reset")
Reported-by: Vaishnavi Bhat
Signed-off-by: Sukadev Bhattiprolu
Signed-off-by: David S. Miller
---
 drivers/net/ethernet/ibm/ibmvnic.c | 63 ++++++++++++++++++++++++++++++--------
 drivers/net/ethernet/ibm/ibmvnic.h |  7 +++--
 2 files changed, 55 insertions(+), 15 deletions(-)

(limited to 'drivers/net')

diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
index b423e94956f1..b4804ce63151 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -1429,6 +1429,15 @@ static int __ibmvnic_open(struct net_device *netdev)
 		return rc;
 	}
 
+	adapter->tx_queues_active = true;
+
+	/* Since queues were stopped until now, there shouldn't be any
+	 * one in ibmvnic_complete_tx() or ibmvnic_xmit() so maybe we
+	 * don't need the synchronize_rcu()? Leaving it for consistency
+	 * with setting ->tx_queues_active = false.
+	 */
+	synchronize_rcu();
+
 	netif_tx_start_all_queues(netdev);
 
 	if (prev_state == VNIC_CLOSED) {
@@ -1603,6 +1612,14 @@ static void ibmvnic_cleanup(struct net_device *netdev)
 	struct ibmvnic_adapter *adapter = netdev_priv(netdev);
 
 	/* ensure that transmissions are stopped if called by do_reset */
+
+	adapter->tx_queues_active = false;
+
+	/* Ensure complete_tx() and ibmvnic_xmit() see ->tx_queues_active
+	 * update so they don't restart a queue after we stop it below.
+	 */
+	synchronize_rcu();
+
 	if (test_bit(0, &adapter->resetting))
 		netif_tx_disable(netdev);
 	else
@@ -1842,14 +1859,21 @@ static void ibmvnic_tx_scrq_clean_buffer(struct ibmvnic_adapter *adapter,
 		tx_buff->skb = NULL;
 		adapter->netdev->stats.tx_dropped++;
 	}
+
 	ind_bufp->index = 0;
+
 	if (atomic_sub_return(entries, &tx_scrq->used) <=
 	    (adapter->req_tx_entries_per_subcrq / 2) &&
-	    __netif_subqueue_stopped(adapter->netdev, queue_num) &&
-	    !test_bit(0, &adapter->resetting)) {
-		netif_wake_subqueue(adapter->netdev, queue_num);
-		netdev_dbg(adapter->netdev, "Started queue %d\n",
-			   queue_num);
+	    __netif_subqueue_stopped(adapter->netdev, queue_num)) {
+		rcu_read_lock();
+
+		if (adapter->tx_queues_active) {
+			netif_wake_subqueue(adapter->netdev, queue_num);
+			netdev_dbg(adapter->netdev, "Started queue %d\n",
+				   queue_num);
+		}
+
+		rcu_read_unlock();
 	}
 }
 
@@ -1904,11 +1928,12 @@ static netdev_tx_t ibmvnic_xmit(struct sk_buff *skb, struct net_device *netdev)
 	int index = 0;
 	u8 proto = 0;
 
-	tx_scrq = adapter->tx_scrq[queue_num];
-	txq = netdev_get_tx_queue(netdev, queue_num);
-	ind_bufp = &tx_scrq->ind_buf;
-
-	if (test_bit(0, &adapter->resetting)) {
+	/* If a reset is in progress, drop the packet since
+	 * the scrqs may get torn down. Otherwise use the
+	 * rcu to ensure reset waits for us to complete.
+	 */
+	rcu_read_lock();
+	if (!adapter->tx_queues_active) {
 		dev_kfree_skb_any(skb);
 
 		tx_send_failed++;
@@ -1917,6 +1942,10 @@ static netdev_tx_t ibmvnic_xmit(struct sk_buff *skb, struct net_device *netdev)
 		goto out;
 	}
 
+	tx_scrq = adapter->tx_scrq[queue_num];
+	txq = netdev_get_tx_queue(netdev, queue_num);
+	ind_bufp = &tx_scrq->ind_buf;
+
 	if (ibmvnic_xmit_workarounds(skb, netdev)) {
 		tx_dropped++;
 		tx_send_failed++;
@@ -1924,6 +1953,7 @@ static netdev_tx_t ibmvnic_xmit(struct sk_buff *skb, struct net_device *netdev)
 		ibmvnic_tx_scrq_flush(adapter, tx_scrq);
 		goto out;
 	}
+
 	if (skb_is_gso(skb))
 		tx_pool = &adapter->tso_pool[queue_num];
 	else
@@ -2078,6 +2108,7 @@ tx_err:
 		netif_carrier_off(netdev);
 	}
 out:
+	rcu_read_unlock();
 	netdev->stats.tx_dropped += tx_dropped;
 	netdev->stats.tx_bytes += tx_bytes;
 	netdev->stats.tx_packets += tx_packets;
@@ -3732,9 +3763,15 @@ restart_loop:
 			    (adapter->req_tx_entries_per_subcrq / 2) &&
 			    __netif_subqueue_stopped(adapter->netdev,
 						     scrq->pool_index)) {
-				netif_wake_subqueue(adapter->netdev, scrq->pool_index);
-				netdev_dbg(adapter->netdev, "Started queue %d\n",
-					   scrq->pool_index);
+				rcu_read_lock();
+				if (adapter->tx_queues_active) {
+					netif_wake_subqueue(adapter->netdev,
+							    scrq->pool_index);
+					netdev_dbg(adapter->netdev,
+						   "Started queue %d\n",
+						   scrq->pool_index);
+				}
+				rcu_read_unlock();
 			}
 		}
 
diff --git a/drivers/net/ethernet/ibm/ibmvnic.h b/drivers/net/ethernet/ibm/ibmvnic.h
index fa2d607a7b1b..8f5cefb932dd 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.h
+++ b/drivers/net/ethernet/ibm/ibmvnic.h
@@ -1006,11 +1006,14 @@ struct ibmvnic_adapter {
 	struct work_struct ibmvnic_reset;
 	struct delayed_work ibmvnic_delayed_reset;
 	unsigned long resetting;
-	bool napi_enabled, from_passive_init;
-	bool login_pending;
 	/* last device reset time */
 	unsigned long last_reset_time;
 
+	bool napi_enabled;
+	bool from_passive_init;
+	bool login_pending;
+	/* protected by rcu */
+	bool tx_queues_active;
 	bool failover_pending;
 	bool force_reset_recovery;
-- cgit v1.2.3

From 8d3ea3d402db94b61075617e71b67459a714a502 Mon Sep 17 00:00:00 2001
From: Jeremy Linton
Date: Wed, 9 Mar 2022 22:53:58 -0600
Subject: net: bcmgenet: Use stronger register read/writes to assure ordering

GCC 12 appears to be much smarter about its dependency tracking and is
aware that the relaxed variants are just normal loads and stores. This
is causing problems like:

[  210.074549] ------------[ cut here ]------------
[  210.079223] NETDEV WATCHDOG: enabcm6e4ei0 (bcmgenet): transmit queue 1 timed out
[  210.086717] WARNING: CPU: 1 PID: 0 at net/sched/sch_generic.c:529 dev_watchdog+0x234/0x240
[  210.095044] Modules linked in: genet(E) nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat]
[  210.146561] ACPI CPPC: PCC check channel failed for ss: 0. ret=-110
[  210.146927] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G         E     5.17.0-rc7G12+ #58
[  210.153226] CPPC Cpufreq:cppc_scale_freq_workfn: failed to read perf counters
[  210.161349] Hardware name: Raspberry Pi Foundation Raspberry Pi 4 Model B/Raspberry Pi 4 Model B, BIOS EDK2-DEV 02/08/2022
[  210.161353] pstate: 80400005 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[  210.161358] pc : dev_watchdog+0x234/0x240
[  210.161364] lr : dev_watchdog+0x234/0x240
[  210.161368] sp : ffff8000080a3a40
[  210.161370] x29: ffff8000080a3a40 x28: ffffcd425af87000 x27: ffff8000080a3b20
[  210.205150] x26: ffffcd425aa00000 x25: 0000000000000001 x24: ffffcd425af8ec08
[  210.212321] x23: 0000000000000100 x22: ffffcd425af87000 x21: ffff55b142688000
[  210.219491] x20: 0000000000000001 x19: ffff55b1426884c8 x18: ffffffffffffffff
[  210.226661] x17: 64656d6974203120 x16: 0000000000000001 x15: 6d736e617274203a
[  210.233831] x14: 2974656e65676d63 x13: ffffcd4259c300d8 x12: ffffcd425b07d5f0
[  210.241001] x11: 00000000ffffffff x10: ffffcd425b07d5f0 x9 : ffffcd4258bdad9c
[  210.248171] x8 : 00000000ffffdfff x7 : 000000000000003f x6 : 0000000000000000
[  210.255341] x5 : 0000000000000000 x4 : 0000000000000000 x3 : 0000000000001000
[  210.262511] x2 : 0000000000001000 x1 : 0000000000000005 x0 : 0000000000000044
[  210.269682] Call trace:
[  210.272133]  dev_watchdog+0x234/0x240
[  210.275811]  call_timer_fn+0x3c/0x15c
[  210.279489]  __run_timers.part.0+0x288/0x310
[  210.283777]  run_timer_softirq+0x48/0x80
[  210.287716]  __do_softirq+0x128/0x360
[  210.291392]  __irq_exit_rcu+0x138/0x140
[  210.295243]  irq_exit_rcu+0x1c/0x30
[  210.298745]  el1_interrupt+0x38/0x54
[  210.302334]  el1h_64_irq_handler+0x18/0x24
[  210.306445]  el1h_64_irq+0x7c/0x80
[  210.309857]  arch_cpu_idle+0x18/0x2c
[  210.313445]  default_idle_call+0x4c/0x140
[  210.317470]  cpuidle_idle_call+0x14c/0x1a0
[  210.321584]  do_idle+0xb0/0x100
[  210.324737]  cpu_startup_entry+0x30/0x8c
[  210.328675]  secondary_start_kernel+0xe4/0x110
[  210.333138]  __secondary_switched+0x94/0x98

The assumption when these accessors were relaxed seems to be that
device memory would be mapped non-reordering, and that other constructs
(spinlocks, etc.) would provide the barriers to assure that packet data
and in-memory rings/queues were ordered with respect to device register
reads/writes. That itself seems a bit sketchy, but the real problem
with GCC 12 is that it moves the actual reads/writes around at will, as
though they were independent operations, when in truth they are not;
the compiler simply can't know that. Looking at the assembly dumps for
many of these routines, it's possible to see very clean, but not
strictly program-order, sequences that the compiler is free to emit if
these aren't actually register read/write operations.

It's possible to suppress the timeout with a liberal sprinkling of
dma_mb()'s, but the device still seems unable to reliably send and
receive data. A better plan is to use the safer readl/writel
everywhere.

Since this partially reverts an older commit that chose the relaxed
variants for performance reasons, I would suggest that any performance
problems with this change be addressed by relaxing only the
performance-critical code paths after assuring proper barriers.
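[Editor's note: the ordering hazard described above is the classic
"fill the descriptor, then ring the doorbell" sequence. The sketch below
is illustrative only and is not bcmgenet code; MY_TX_PROD_INDEX,
struct my_tx_desc and my_queue_packet are invented names, and endianness
handling is omitted.]

	#include <linux/io.h>

	#define MY_TX_PROD_INDEX	0x10	/* hypothetical doorbell register */

	struct my_tx_desc {
		u32 addr_lo;
		u32 len_status;
	};

	static void my_queue_packet(void __iomem *regs, struct my_tx_desc *desc,
				    dma_addr_t dma, u32 len_status, u32 prod_idx)
	{
		/* Fill the DMA descriptor in coherent memory first. */
		desc->addr_lo = lower_32_bits(dma);
		desc->len_status = len_status;

		/* writel() contains a barrier that makes the descriptor
		 * stores visible to the device before the doorbell write;
		 * writel_relaxed() gives no such ordering, so the compiler
		 * (or a weakly ordered CPU) may let the doorbell overtake
		 * the descriptor stores.
		 */
		writel(prod_idx, regs + MY_TX_PROD_INDEX);
	}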
Fixes: 69d2ea9c79898 ("net: bcmgenet: Use correct I/O accessors")
Reported-by: Peter Robinson
Signed-off-by: Jeremy Linton
Acked-by: Peter Robinson
Tested-by: Peter Robinson
Acked-by: Florian Fainelli
Link: https://lore.kernel.org/r/20220310045358.224350-1-jeremy.linton@arm.com
Signed-off-by: Jakub Kicinski
---
 drivers/net/ethernet/broadcom/genet/bcmgenet.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

(limited to 'drivers/net')

diff --git a/drivers/net/ethernet/broadcom/genet/bcmgenet.c b/drivers/net/ethernet/broadcom/genet/bcmgenet.c
index 2da804f84b48..bd5998012a87 100644
--- a/drivers/net/ethernet/broadcom/genet/bcmgenet.c
+++ b/drivers/net/ethernet/broadcom/genet/bcmgenet.c
@@ -76,7 +76,7 @@ static inline void bcmgenet_writel(u32 value, void __iomem *offset)
 	if (IS_ENABLED(CONFIG_MIPS) && IS_ENABLED(CONFIG_CPU_BIG_ENDIAN))
 		__raw_writel(value, offset);
 	else
-		writel_relaxed(value, offset);
+		writel(value, offset);
 }
 
 static inline u32 bcmgenet_readl(void __iomem *offset)
@@ -84,7 +84,7 @@ static inline u32 bcmgenet_readl(void __iomem *offset)
 	if (IS_ENABLED(CONFIG_MIPS) && IS_ENABLED(CONFIG_CPU_BIG_ENDIAN))
 		return __raw_readl(offset);
 	else
-		return readl_relaxed(offset);
+		return readl(offset);
 }
 
 static inline void dmadesc_set_length_status(struct bcmgenet_priv *priv,
-- cgit v1.2.3

From 6b3c74550224c3be24c4cf6ab8c333602b458bff Mon Sep 17 00:00:00 2001
From: Yang Yingliang
Date: Sat, 19 Mar 2022 11:24:50 +0800
Subject: net: wwan: qcom_bam_dmux: fix wrong pointer passed to IS_ERR()

The error check after dma_request_chan() should test dmux->tx, not
dmux->rx.

Fixes: 21a0ffd9b38c ("net: wwan: Add Qualcomm BAM-DMUX WWAN network driver")
Signed-off-by: Yang Yingliang
Reviewed-by: Stephan Gerhold
Link: https://lore.kernel.org/r/20220319032450.3288224-1-yangyingliang@huawei.com
Signed-off-by: Paolo Abeni
---
 drivers/net/wwan/qcom_bam_dmux.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

(limited to 'drivers/net')

diff --git a/drivers/net/wwan/qcom_bam_dmux.c b/drivers/net/wwan/qcom_bam_dmux.c
index 5dfa2eba6014..17d46f4d2913 100644
--- a/drivers/net/wwan/qcom_bam_dmux.c
+++ b/drivers/net/wwan/qcom_bam_dmux.c
@@ -755,7 +755,7 @@ static int __maybe_unused bam_dmux_runtime_resume(struct device *dev)
 		return 0;
 
 	dmux->tx = dma_request_chan(dev, "tx");
-	if (IS_ERR(dmux->rx)) {
+	if (IS_ERR(dmux->tx)) {
 		dev_err(dev, "Failed to request TX DMA channel: %pe\n", dmux->tx);
 		dmux->tx = NULL;
 		bam_dmux_runtime_suspend(dev);
-- cgit v1.2.3

From 32d53c0aa3a7b727243473949bad2a830b908edc Mon Sep 17 00:00:00 2001
From: Alexander Lobakin
Date: Wed, 23 Mar 2022 13:43:52 +0100
Subject: ice: fix 'scheduling while atomic' on aux critical err interrupt

There's a kernel BUG splat on processing aux critical error interrupts
in ice_misc_intr():

[ 2100.917085] BUG: scheduling while atomic: swapper/15/0/0x00010000
...
[ 2101.060770] Call Trace:
[ 2101.063229]  <IRQ>
[ 2101.065252]  dump_stack+0x41/0x60
[ 2101.068587]  __schedule_bug.cold.100+0x4c/0x58
[ 2101.073060]  __schedule+0x6a4/0x830
[ 2101.076570]  schedule+0x35/0xa0
[ 2101.079727]  schedule_preempt_disabled+0xa/0x10
[ 2101.084284]  __mutex_lock.isra.7+0x310/0x420
[ 2101.088580]  ? ice_misc_intr+0x201/0x2e0 [ice]
[ 2101.093078]  ice_send_event_to_aux+0x25/0x70 [ice]
[ 2101.097921]  ice_misc_intr+0x220/0x2e0 [ice]
[ 2101.102232]  __handle_irq_event_percpu+0x40/0x180
[ 2101.106965]  handle_irq_event_percpu+0x30/0x80
[ 2101.111434]  handle_irq_event+0x36/0x53
[ 2101.115292]  handle_edge_irq+0x82/0x190
[ 2101.119148]  handle_irq+0x1c/0x30
[ 2101.122480]  do_IRQ+0x49/0xd0
[ 2101.125465]  common_interrupt+0xf/0xf
[ 2101.129146]  </IRQ>
...

As Andrew correctly mentioned previously[0], the following call ladder
happens:

	ice_misc_intr()			<- hardirq
	  ice_send_event_to_aux()
	    device_lock()
	      mutex_lock()
	        might_sleep()
	          might_resched()	<- oops

Add a new PF state bit which indicates that an aux critical error
occurred, and service it from ice_service_task() in process context.

The new ice_pf::oicr_err_reg is read-write in both hardirq and process
contexts, but it is only 3 bits of non-critical data, which probably
isn't worth explicit synchronization (the bits are even in the same
byte, [31:24]).

[0] https://lore.kernel.org/all/YeSRUVmrdmlUXHDn@lunn.ch

Fixes: 348048e724a0e ("ice: Implement iidc operations")
Signed-off-by: Alexander Lobakin
Tested-by: Michal Kubiak
Acked-by: Tony Nguyen
Signed-off-by: Jakub Kicinski
---
 drivers/net/ethernet/intel/ice/ice.h      |  2 ++
 drivers/net/ethernet/intel/ice/ice_main.c | 25 +++++++++++++++----------
 2 files changed, 17 insertions(+), 10 deletions(-)

(limited to 'drivers/net')

diff --git a/drivers/net/ethernet/intel/ice/ice.h b/drivers/net/ethernet/intel/ice/ice.h
index bea1d1e39fa2..2ca887076dd4 100644
--- a/drivers/net/ethernet/intel/ice/ice.h
+++ b/drivers/net/ethernet/intel/ice/ice.h
@@ -290,6 +290,7 @@ enum ice_pf_state {
 	ICE_LINK_DEFAULT_OVERRIDE_PENDING,
 	ICE_PHY_INIT_COMPLETE,
 	ICE_FD_VF_FLUSH_CTX,		/* set at FD Rx IRQ or timeout */
+	ICE_AUX_ERR_PENDING,
 	ICE_STATE_NBITS			/* must be last */
 };
 
@@ -559,6 +560,7 @@ struct ice_pf {
 	wait_queue_head_t reset_wait_queue;
 
 	u32 hw_csum_rx_error;
+	u32 oicr_err_reg;
 	u16 oicr_idx;		/* Other interrupt cause MSIX vector index */
 	u16 num_avail_sw_msix;	/* remaining MSIX SW vectors left unclaimed */
 	u16 max_pf_txqs;	/* Total Tx queues PF wide */
diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
index b7e8744b0c0a..296f9d5f7408 100644
--- a/drivers/net/ethernet/intel/ice/ice_main.c
+++ b/drivers/net/ethernet/intel/ice/ice_main.c
@@ -2255,6 +2255,19 @@ static void ice_service_task(struct work_struct *work)
 		return;
 	}
 
+	if (test_and_clear_bit(ICE_AUX_ERR_PENDING, pf->state)) {
+		struct iidc_event *event;
+
+		event = kzalloc(sizeof(*event), GFP_KERNEL);
+		if (event) {
+			set_bit(IIDC_EVENT_CRIT_ERR, event->type);
+			/* report the entire OICR value to AUX driver */
+			swap(event->reg, pf->oicr_err_reg);
+			ice_send_event_to_aux(pf, event);
+			kfree(event);
+		}
+	}
+
 	if (test_bit(ICE_FLAG_PLUG_AUX_DEV, pf->flags)) {
 		/* Plug aux device per request */
 		ice_plug_aux_dev(pf);
@@ -3041,17 +3054,9 @@ static irqreturn_t ice_misc_intr(int __always_unused irq, void *data)
 
 #define ICE_AUX_CRIT_ERR (PFINT_OICR_PE_CRITERR_M | PFINT_OICR_HMC_ERR_M | PFINT_OICR_PE_PUSH_M)
 	if (oicr & ICE_AUX_CRIT_ERR) {
-		struct iidc_event *event;
-
+		pf->oicr_err_reg |= oicr;
+		set_bit(ICE_AUX_ERR_PENDING, pf->state);
 		ena_mask &= ~ICE_AUX_CRIT_ERR;
-		event = kzalloc(sizeof(*event), GFP_ATOMIC);
-		if (event) {
-			set_bit(IIDC_EVENT_CRIT_ERR, event->type);
-			/* report the entire OICR value to AUX driver */
-			event->reg = oicr;
-			ice_send_event_to_aux(pf, event);
-			kfree(event);
-		}
 	}
 
 	/* Report any remaining unexpected interrupts */
-- cgit v1.2.3
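[Editor's note: the change above follows the common "latch in the hard-IRQ
handler, process in task context" pattern. The sketch below is a generic,
hedged illustration; my_pf, MY_ERR_PENDING, my_irq and my_service are
invented names and are not part of the ice driver.]

	#include <linux/bitops.h>
	#include <linux/interrupt.h>
	#include <linux/workqueue.h>

	enum { MY_ERR_PENDING, MY_STATE_NBITS };

	struct my_pf {
		DECLARE_BITMAP(state, MY_STATE_NBITS);
		u32 err_reg;			/* latched cause register */
		struct work_struct service;
	};

	static irqreturn_t my_irq(int irq, void *data)
	{
		struct my_pf *pf = data;

		pf->err_reg |= 0x1;		/* stand-in for the HW cause */
		set_bit(MY_ERR_PENDING, pf->state);	/* atomic, IRQ-safe */
		schedule_work(&pf->service);	/* defer the sleeping work */
		return IRQ_HANDLED;
	}

	static void my_service(struct work_struct *work)
	{
		struct my_pf *pf = container_of(work, struct my_pf, service);

		if (test_and_clear_bit(MY_ERR_PENDING, pf->state)) {
			/* Safe to sleep here: allocate with GFP_KERNEL,
			 * take mutexes, notify auxiliary drivers, etc.
			 */
			pr_info("handling latched error 0x%x\n", pf->err_reg);
			pf->err_reg = 0;
		}
	}

The hard-IRQ handler is reduced to a register OR plus an atomic set_bit(),
while anything that can sleep moves to the work item, which is exactly the
split the patch makes between ice_misc_intr() and ice_service_task().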
From 5a3156932da06f09953764de113419f254086faf Mon Sep 17 00:00:00 2001
From: Alexander Lobakin
Date: Wed, 23 Mar 2022 13:43:53 +0100
Subject: ice: don't allow to run ice_send_event_to_aux() in atomic ctx

ice_send_event_to_aux() eventually descends to mutex_lock() (and thus
might_sleep()), so it must not be called from non-task context.
However, at least two fixes have already been needed for bug splats
caused by this function being called from atomic context.

To make the emergency landings softer, bail out early when executed in
non-task context, emitting a warn splat only once. This way we trade
some potentially lost events for system stability and avoid any related
hangs and crashes.

Fixes: 348048e724a0e ("ice: Implement iidc operations")
Signed-off-by: Alexander Lobakin
Tested-by: Michal Kubiak
Reviewed-by: Maciej Fijalkowski
Acked-by: Tony Nguyen
Signed-off-by: Jakub Kicinski
---
 drivers/net/ethernet/intel/ice/ice_idc.c | 3 +++
 1 file changed, 3 insertions(+)

(limited to 'drivers/net')

diff --git a/drivers/net/ethernet/intel/ice/ice_idc.c b/drivers/net/ethernet/intel/ice/ice_idc.c
index fc3580167e7b..5559230eff8b 100644
--- a/drivers/net/ethernet/intel/ice/ice_idc.c
+++ b/drivers/net/ethernet/intel/ice/ice_idc.c
@@ -34,6 +34,9 @@ void ice_send_event_to_aux(struct ice_pf *pf, struct iidc_event *event)
 {
 	struct iidc_auxiliary_drv *iadrv;
 
+	if (WARN_ON_ONCE(!in_task()))
+		return;
+
 	if (!pf->adev)
 		return;
 
-- cgit v1.2.3
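[Editor's note: the guard added above is a general defensive pattern for any
API that may sleep. A minimal hedged sketch follows; my_notify and
my_bus_lock are invented names used only for illustration.]

	#include <linux/bug.h>		/* WARN_ON_ONCE() */
	#include <linux/mutex.h>
	#include <linux/preempt.h>	/* in_task() */

	static DEFINE_MUTEX(my_bus_lock);

	static void my_notify(int event)
	{
		/* We sleep below: warn once and bail out instead of
		 * crashing if a caller invokes this from IRQ/softirq
		 * context.
		 */
		if (WARN_ON_ONCE(!in_task()))
			return;

		mutex_lock(&my_bus_lock);	/* may sleep */
		/* ...deliver the event to the auxiliary driver here... */
		mutex_unlock(&my_bus_lock);
	}

The trade-off is explicit in the commit message: an event arriving from
atomic context is dropped (with a one-time warning) rather than risking a
"scheduling while atomic" crash.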