Bug #13003
Malicious Driver Detection event on ixl driver
Status: open
Description
There have been a handful of reports of MDD events happening with the Intel X710 NIC. The system logs show the following:
ixl10: Malicious Driver Detection event 2 on TX queue 7, pf number 0
ixl10: MDD TX event is for this function!
ixl10: WARNING: queue 7 appears to be hung!
ixl10: Malicious Driver Detection event 2 on TX queue 4, pf number 0
ixl10: WARNING: queue 4 appears to be hung!
and
Oct 29 09:47:08 kernel ixl1: Malicious Driver Detection event 2 on TX queue 769, pf number 1 (PF-1)
Oct 29 09:37:28 kernel ixl1: Malicious Driver Detection event 2 on TX queue 773, pf number 1 (PF-1)
and https://forum.netgate.com/topic/158415/issues-with-an-intel-x710-and-pfsense-2-4-5-p1
Some info gathered from various reports and troubleshooting:
- Occurs anywhere from once a day to once a month.
- Occurs on pfSense 2.4.5p1 and 22.01.
- Occurs with PF traffic (SR-IOV not required to be enabled).
- Occurs with TSO/LRO disabled.
- Occurs with copper (RJ-45) and optical transceivers.
- Most of the issue reports have been from those running a bridge interface with ixl0 and ixl1. However, there have been multiple reports without using bridges as well.
Increasing the buffer size on the bridge reduced the frequency of the events (went from once a day to taking 5 days before it reoccurred).
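To compare how often the events occur across reports, the MDD lines can be tallied per interface and TX queue. The following is a minimal sketch, assuming the log wording matches the excerpts above; count_mdd_events is a hypothetical helper name, not part of pfSense or the ixl driver.

```shell
# Count MDD events per interface and TX queue from log text on stdin.
# Assumption: messages follow the wording shown in the reports above.
count_mdd_events() {
    # grep -o emits one match per event, even if several share a line.
    grep -o 'ixl[0-9]*: Malicious Driver Detection event [0-9]* on TX queue [0-9]*' |
        awk '{
                 sub(":", "", $1)               # "ixl10:" -> "ixl10"
                 count[$1 " queue " $NF]++      # last field is the queue number
             }
             END { for (k in count) print k, count[k] }' |
        LC_ALL=C sort
}
```

It could be fed from the system log, for example count_mdd_events < /var/log/system.log; the log path is an assumption and may differ between versions.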
Updated by Kris Phillips 3 months ago
I saw this occur on 21.05.2 on a 7100 that had two bridged ixl interfaces on an add-in card, so it potentially affects everything from 2.4.5p1 to 22.01.
Updated by Christoph Vieten 2 months ago
The same has now happened multiple times on 2.6.0 with an Intel X710-T4.
Updating the NVM from 8.15 to the latest 8.60 didn't fix the issue. Replacing the card with another X710 didn't help either.
sysctl -a | grep dev.ixl.0 | grep fw
dev.ixl.0.fw_lldp: 1
dev.ixl.0.fw_version: fw 8.6.68629 api 1.15 nvm 8.60 etid 8000bd5a oem 1.268.0
sysctl -a | grep dev.ixl.0.%desc
dev.ixl.0.%desc: Intel(R) Ethernet Controller X710/X557-AT 10GBASE-T - 2.3.1-k
It seems to affect only one of the four ports, apparently the one carrying the most traffic.
TSO is disabled via the checkbox, and net.inet.tcp.tso is set to 0 under System => Advanced => Tunable.
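When comparing firmware levels before and after an update, it can help to pull just the NVM version out of the fw_version string. This is a minimal sketch, assuming the field layout shown in the sysctl output above; nvm_version is a hypothetical helper name.

```shell
# Print the NVM version from an ixl fw_version string read on stdin.
# Assumption: the string contains "nvm <version>" as in the output above.
nvm_version() {
    awk '{
             for (i = 1; i < NF; i++)
                 if ($i == "nvm") { print $(i + 1); exit }
         }'
}
```

On the firewall itself this could be used as sysctl -n dev.ixl.0.fw_version | nvm_version, and the TSO tunable can be read back the same way with sysctl -n net.inet.tcp.tso.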
Updated by Kris Phillips 2 months ago
Christoph,
Were you running a bridge in your configuration? The original bug report seems to suggest that is the root cause.
Updated by Christoph Vieten 6 days ago
Hi Kris,
No, we aren't running a bridge at all. But we are running approx. 20 VLAN interfaces on the port that is affected.
It looks like when the issue occurs, you also cannot switch to other physical ports of any other adapter (we have three of those X710 quad-port cards in use).
But the other ports already in use (e.g. some 10G ports configured without VLAN assignments or with a smaller number of VLANs) aren't affected by the stuck driver / firmware and can still be used.
The last time the issue occurred, we migrated the top-traffic VLAN interfaces to separate ports, which resulted in a longer uptime until yesterday.
Has anyone tried the latest FreeBSD driver yet?
https://pkg.freebsd.org/FreeBSD:12:amd64/latest/All/intel-ix-kmod-3.3.24.pkg
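Since the affected port carries roughly 20 VLAN interfaces, it may be useful to see how VLANs are distributed across the physical ports. The sketch below counts VLAN interfaces per parent from ifconfig output; it assumes each VLAN interface prints a "parent interface: <name>" field, and count_vlans_per_parent is a hypothetical helper name.

```shell
# Count VLAN interfaces per parent NIC from ifconfig output on stdin.
# Assumption: VLAN interfaces report "parent interface: <name>".
count_vlans_per_parent() {
    sed -n 's/.*parent interface: \([^ ]*\).*/\1/p' |
        LC_ALL=C sort | uniq -c |
        awk '{ print $2, $1 }'      # "<parent> <number of VLANs>"
}
```

Usage would be ifconfig | count_vlans_per_parent, run before and after moving VLANs between ports.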
Updated by Kris Phillips 3 days ago
Hello Christoph,
I don't see any notes that it's been tested for this particular issue. However, the Intel ix driver was updated in 22.05. Have you tested whether this issue is gone in the latest RC? We expect 22.05 to be released very soon, so it might be worth a re-test on the latest.