Bug #13911

closed

Unnecessary delay when querying ``ixgbe(4)`` interfaces with SFP ports

Added by Steve Wheeler about 1 year ago. Updated 6 months ago.

Status: Resolved
Priority: Normal
Category: Hardware / Drivers
Target version:
Start date:
Due date:
% Done: 0%
Estimated time:
Plus Target Version: 23.09
Release Notes: Default
Affected Version: 2.6.0
Affected Architecture: amd64

Description

When queried with ifconfig -v, ixgbe NICs with SFP ports attempt to read the SFP module and wait about 1s before returning.

As a result, loading the dashboard with the interfaces widget, or the Interfaces Status page, is delayed by about 1s per affected interface.

For example:

[22.05.1-RELEASE][admin@8200-2.stevew.lan]/root: time ifconfig -vvvm ix0
ix0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
    description: WAN3
    options=e138bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,WOL_UCAST,WOL_MCAST,WOL_MAGIC,VLAN_HWFILTER,RXCSUM_IPV6,TXCSUM_IPV6>
    capabilities=f53fbb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,LRO,WOL_UCAST,WOL_MCAST,WOL_MAGIC,VLAN_HWFILTER,VLAN_HWTSO,NETMAP,RXCSUM_IPV6,TXCSUM_IPV6>
    ether 90:ec:77:47:5c:e6
    inet6 fe80::92ec:77ff:fe47:5ce6%ix0 prefixlen 64 scopeid 0x5
    media: Ethernet autoselect
    status: no carrier
    supported media:
        media autoselect
    nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
0.000u 1.130s 0:01.13 100.0%    147+208k 1+0io 0pf+0w
[22.05.1-RELEASE][admin@8200-2.stevew.lan]/root: time ifconfig -vvvm ix2
ix2: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
    description: WAN2
    options=e138bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,WOL_UCAST,WOL_MCAST,WOL_MAGIC,VLAN_HWFILTER,RXCSUM_IPV6,TXCSUM_IPV6>
    capabilities=f53fbb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,LRO,WOL_UCAST,WOL_MCAST,WOL_MAGIC,VLAN_HWFILTER,VLAN_HWTSO,NETMAP,RXCSUM_IPV6,TXCSUM_IPV6>
    ether 90:ec:77:47:5c:e5
    inet6 fe80::92ec:77ff:fe47:5ce5%ix2 prefixlen 64 scopeid 0x7
    media: Ethernet autoselect
    status: no carrier
    supported media:
        media autoselect
        media 10baseT/UTP
        media 100baseTX
        media 1000baseT
    nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
0.000u 0.413s 0:00.41 100.0%    149+212k 1+0io 0pf+0w

On the 8200, ix0 is an SFP+ port and ix2 is a combo port. The query takes 1.13s on ix0 versus 0.41s on ix2.
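
To repeat the timing comparison above across several interfaces, a minimal sketch along these lines could be used. It assumes Python 3 is available on the host and that ifconfig lives at /sbin/ifconfig; the function names time_command and time_interfaces are made up for illustration and are not part of pfSense.

```python
import subprocess
import sys
import time


def time_command(argv):
    """Run argv, discard its output, and return the wall-clock seconds it took."""
    start = time.perf_counter()
    subprocess.run(argv, stdout=subprocess.DEVNULL,
                   stderr=subprocess.DEVNULL, check=False)
    return time.perf_counter() - start


def time_interfaces(ifnames, ifconfig="/sbin/ifconfig"):
    """Return {ifname: seconds} for `ifconfig -v <ifname>` on each interface."""
    return {name: time_command([ifconfig, "-v", name]) for name in ifnames}


if __name__ == "__main__":
    # Interface names are taken from the command line, e.g. ix0 ix2.
    for name, seconds in time_interfaces(sys.argv[1:]).items():
        print(f"{name}: {seconds:.2f}s")
```

Running it with ix0 ix2 as arguments should show roughly the same gap as the time output above, if the host is affected.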

See: https://forum.netgate.com/topic/177305/high-cpu-usage-and-delay-calling-ifconfig-v-ix0-ix1-delays-dashboard-load

Specifically:

It's a driver bug...

https://github.com/opnsense/core/issues/5349
https://forum.opnsense.org/index.php?topic=25440.msg127981#msg127981
https://github.com/opnsense/src/commit/938a0467e8cbf9e47629aff57e04250e7f6be251

They fixed it a year ago in OPNsense. Is it fixed in pfSense Plus 23.01 and/or FreeBSD 14-Current?

It seems that the driver still tries to detect the module 10 times:
https://github.com/freebsd/freebsd-src/blob/main/sys/dev/ixgbe/ixgbe_phy.h#L150
https://github.com/freebsd/freebsd-src/blob/main/sys/dev/ixgbe/ixgbe_phy.c#L1996

Here is why the dashboard is slow:
https://github.com/pfsense/pfsense/blob/master/src/etc/inc/pfsense-utils.inc#L1695
https://github.com/pfsense/pfsense/search?q=get_interface_info

Actions #1

Updated by Steve Wheeler about 1 year ago

  • Subject changed from Unnecessary delay when querying ixbge NICs to Unnecessary delay when querying ixgbe NICs
Actions #2

Updated by LTC Tech about 1 year ago

It looks like this may not be fixed anytime soon. We moved from 22.05 to 23.01 and it is still happening. I compared the output of ifconfig with and without the verbose flag and found it to be the same for our NICs, so I made the following temporary patch for use with the System_Patches package. With this patch the Dashboard loads quickly and no longer causes high CPU usage.

--- a/src/etc/inc/pfsense-utils.inc
+++ b/src/etc/inc/pfsense-utils.inc
@@ -1790,7 +1790,7 @@
     if ($ifinfo['status'] == "up" && $swifinfo == NULL) {
         /* try to determine media with ifconfig */
         $ifconfiginfo = [];
-        exec("/sbin/ifconfig -v " . $ifinfo['if'], $ifconfiginfo);
+        exec("/sbin/ifconfig " . $ifinfo['if'], $ifconfiginfo);
         $wifconfiginfo = [];
         if (is_interface_wireless($ifdescr)) {
             exec("/sbin/ifconfig {$ifinfo['if']} list sta", $wifconfiginfo);

We do not have a module-capable NIC, so the patch does not limit functionality, at least for us.
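
The comparison mentioned in this comment (ifconfig output with and without the verbose flag) can be checked on other hardware before applying the patch. The following is only a sketch under assumptions: Python 3 on the host, ifconfig at /sbin/ifconfig, and hypothetical helper names ifconfig_output and output_diff.

```python
import difflib
import subprocess
import sys


def ifconfig_output(ifname, verbose, ifconfig="/sbin/ifconfig"):
    """Return the stdout of `ifconfig [-v] <ifname>` as text."""
    argv = [ifconfig] + (["-v"] if verbose else []) + [ifname]
    return subprocess.run(argv, capture_output=True, text=True,
                          check=False).stdout


def output_diff(plain, verbose):
    """Unified diff of the two outputs; an empty list means they are identical."""
    return list(difflib.unified_diff(
        plain.splitlines(), verbose.splitlines(),
        fromfile="ifconfig", tofile="ifconfig -v", lineterm=""))


if __name__ == "__main__" and len(sys.argv) > 1:
    name = sys.argv[1]
    diff = output_diff(ifconfig_output(name, False), ifconfig_output(name, True))
    print("\n".join(diff) if diff else f"{name}: -v adds nothing")
```

An empty diff suggests that dropping -v loses no information for that interface. Any lines prefixed with + are details the patched dashboard would no longer see.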

If the underlying delay and CPU usage from module detection in the driver cannot be fixed, could Netgate add a system tunable to turn off module detection on the dashboard?

Actions #3

Updated by Jim Pingle 12 months ago

  • Plus Target Version changed from 23.05 to 23.09

The delay is still present in the driver, but it's probably best to look into the driver changes over a longer term rather than this close to a release.

Actions #4

Updated by Jim Pingle 10 months ago

  • Target version changed from 2.7.0 to CE-Next
Actions #5

Updated by Jim Pingle 7 months ago

  • Assignee set to Kristof Provost
Actions #6

Updated by Jim Pingle 7 months ago

  • Target version changed from CE-Next to 2.8.0
Actions #7

Updated by Kristof Provost 7 months ago

  • Status changed from New to Feedback

I've merged a change to the i2c read function so that it only tries once (rather than 11 times) until we've identified an SFP. That should reduce the wait time per interface from 1.1 seconds to 0.1 seconds while keeping SFPs functional.
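
The arithmetic behind these figures can be sketched as follows. The 100 ms per attempt is an assumption inferred from the numbers in this comment (1.1 s over 11 attempts), not a value read from the driver source, and the function names are hypothetical.

```python
def probe_wait_ms(attempts, delay_per_attempt_ms=100):
    """Worst-case wait for one interface when no attempt identifies a module."""
    return attempts * delay_per_attempt_ms


def page_delay_ms(interfaces, attempts):
    """Delay added to one page load that queries each interface once."""
    return interfaces * probe_wait_ms(attempts)


print(probe_wait_ms(11))  # 1100 ms, the 1.1 s reported before the change
print(probe_wait_ms(1))   # 100 ms, the 0.1 s expected after the change
# Two affected ports is a made-up count, used only to show the scaling.
print(page_delay_ms(2, 11), page_delay_ms(2, 1))  # 2200 200
```

Because the delay is paid once per queried interface, the saving on the dashboard grows with the number of affected ports.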

Actions #8

Updated by Jim Pingle 7 months ago

  • Subject changed from Unnecessary delay when querying ixgbe NICs to Unnecessary delay when querying ``ixgbe(4)`` interfaces with SFP ports

Updating subject for release notes.

Actions #9

Updated by Steve Wheeler 6 months ago

  • Status changed from Feedback to Resolved

This looks good in current 23.09 builds.
Tested:

23.09-BETA (amd64)
built on Mon Oct 16 3:31:00 BST 2023
FreeBSD 14.0-CURRENT

Actions #10

Updated by Jim Pingle 6 months ago

  • Target version changed from 2.8.0 to 2.7.1