Regression #13381
Closed
Software VLAN tagging does not work on ``ixgbe(4)`` interfaces
Description
VLAN tagged traffic fails on an ix NIC if hardware vlan tagging is disabled.
For example:
[22.05-RELEASE][admin@4100.stevew.lan]/root: ping 10.101.0.12
PING 10.101.0.12 (10.101.0.12): 56 data bytes
64 bytes from 10.101.0.12: icmp_seq=0 ttl=64 time=0.435 ms
64 bytes from 10.101.0.12: icmp_seq=1 ttl=64 time=0.351 ms
64 bytes from 10.101.0.12: icmp_seq=2 ttl=64 time=0.359 ms
64 bytes from 10.101.0.12: icmp_seq=3 ttl=64 time=0.378 ms
^C
--- 10.101.0.12 ping statistics ---
4 packets transmitted, 4 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 0.351/0.381/0.435/0.033 ms
[22.05-RELEASE][admin@4100.stevew.lan]/root: ifconfig ix3
ix3: flags=28943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST,PPROMISC> metric 0 mtu 1500
description: WAN
options=8138b8<VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,WOL_UCAST,WOL_MCAST,WOL_MAGIC,VLAN_HWFILTER>
ether 90:ec:77:1f:8a:5f
inet6 fe80::92ec:77ff:fe1f:8a5f%ix3 prefixlen 64 scopeid 0x8
inet 172.21.16.232 netmask 0xffffff00 broadcast 172.21.16.255
inet 45.65.87.21 netmask 0xffffffc0 broadcast 45.65.87.63 vhid 1
carp: MASTER vhid 1 advbase 1 advskew 0
media: Ethernet autoselect (1000baseT <full-duplex,rxpause,txpause>)
status: active
nd6 options=23<PERFORMNUD,ACCEPT_RTADV,AUTO_LINKLOCAL>
[22.05-RELEASE][admin@4100.stevew.lan]/root: ifconfig ix3 -vlanhwtag
[22.05-RELEASE][admin@4100.stevew.lan]/root: ifconfig ix3
ix3: flags=28943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST,PPROMISC> metric 0 mtu 1500
description: WAN
options=8138a8<VLAN_MTU,JUMBO_MTU,VLAN_HWCSUM,WOL_UCAST,WOL_MCAST,WOL_MAGIC,VLAN_HWFILTER>
ether 90:ec:77:1f:8a:5f
inet6 fe80::92ec:77ff:fe1f:8a5f%ix3 prefixlen 64 scopeid 0x8
inet 172.21.16.232 netmask 0xffffff00 broadcast 172.21.16.255
inet 45.65.87.21 netmask 0xffffffc0 broadcast 45.65.87.63 vhid 1
carp: BACKUP vhid 1 advbase 1 advskew 0
media: Ethernet autoselect (1000baseT <full-duplex,rxpause,txpause>)
status: active
nd6 options=23<PERFORMNUD,ACCEPT_RTADV,AUTO_LINKLOCAL>
[22.05-RELEASE][admin@4100.stevew.lan]/root: ping 10.101.0.12
PING 10.101.0.12 (10.101.0.12): 56 data bytes
^C
--- 10.101.0.12 ping statistics ---
4 packets transmitted, 0 packets received, 100.0% packet loss
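For readers comparing the two ifconfig outputs above, the following is a minimal sketch (not part of the original report) that diffs the `options=` fields. The helper name `parse_options` is hypothetical; the two input strings are copied from the console output. It only confirms what the listing already shows: the single capability that changes is VLAN_HWTAGGING.

```python
import re

# Matches the first "options=<hex><NAME,NAME,...>" field in a line.
# Pass a single line: full ifconfig output also contains "nd6 options=...".
OPTIONS_RE = re.compile(r"options=([0-9a-f]+)<([^>]*)>")


def parse_options(line: str):
    """Return (bitmask, set of capability names) from an ifconfig options line."""
    m = OPTIONS_RE.search(line)
    if m is None:
        raise ValueError("no options= field found")
    return int(m.group(1), 16), set(m.group(2).split(","))


# Copied from the two "ifconfig ix3" outputs in the report.
before = ("options=8138b8<VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,"
          "WOL_UCAST,WOL_MCAST,WOL_MAGIC,VLAN_HWFILTER>")
after = ("options=8138a8<VLAN_MTU,JUMBO_MTU,VLAN_HWCSUM,"
         "WOL_UCAST,WOL_MCAST,WOL_MAGIC,VLAN_HWFILTER>")

before_bits, before_names = parse_options(before)
after_bits, after_names = parse_options(after)

changed_bits = before_bits ^ after_bits       # bits that differ
removed_names = before_names - after_names    # capabilities that disappeared

print(hex(changed_bits), removed_names)  # 0x10 {'VLAN_HWTAGGING'}
```

Everything else on the interface, including VLAN_HWFILTER, is unchanged between the working and the failing state.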
VLAN hardware tagging is enabled by default so this is not easy to hit.
It produces some unexpected behaviour. In a packet capture on the parent interface there is no outbound VLAN traffic shown at all.
Inbound VLAN traffic appears as double tagged with VLAN0 as the outer tag:
20:30:18.349238 00:51:82:11:22:02 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 68: vlan 0, p 0, ethertype 802.1Q, vlan 1001, p 0, ethertype ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.101.0.1 tell 10.101.0.12, length 46
VLAN0 is expected to be dropped.
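To make the double tagging concrete, here is a small sketch that walks the 802.1Q tag stack of an Ethernet header. It is illustrative only: the function name `parse_vlan_stack` is an assumption, and the header bytes are rebuilt by hand from the capture line above (payload omitted), not taken from a real pcap file.

```python
import struct

TPID_8021Q = 0x8100  # EtherType value that marks an 802.1Q tag


def parse_vlan_stack(frame: bytes):
    """Return ([(pcp, vid), ...], ethertype) for an Ethernet frame header."""
    offset = 12  # skip destination and source MAC addresses
    tags = []
    (ethertype,) = struct.unpack_from("!H", frame, offset)
    while ethertype == TPID_8021Q:
        (tci,) = struct.unpack_from("!H", frame, offset + 2)
        # PCP is the top 3 bits of the TCI, the VLAN ID is the low 12 bits.
        tags.append((tci >> 13, tci & 0x0FFF))
        offset += 4
        (ethertype,) = struct.unpack_from("!H", frame, offset)
    return tags, ethertype


# Header reconstructed from the capture line (assumed layout, no payload).
header = (
    bytes.fromhex("ffffffffffff")            # destination: broadcast
    + bytes.fromhex("005182112202")          # source MAC from the capture
    + struct.pack("!HH", TPID_8021Q, 0)      # outer tag: vlan 0, p 0
    + struct.pack("!HH", TPID_8021Q, 1001)   # inner tag: vlan 1001, p 0
    + struct.pack("!H", 0x0806)              # ARP
)

print(parse_vlan_stack(header))  # ([(0, 0), (0, 1001)], 2054)
```

The VLAN interface expects a single tag with ID 1001; with an extra outer tag of 0 in front, the frame no longer matches it.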
This behaviour appears to have been introduced by this commit:
https://github.com/pfsense/FreeBSD-src/commit/9c762cc125c0c2dae9fbf49cc526bb97c14b54a4
All snapshots after 20220314-1916 exhibit it.
Tested on a 4100 on 22.05 and in 2.7. The user who hit this initially is also using a C3K SoC device with the same on-board NICs.
Updated by Steve Wheeler over 2 years ago
It looks like this issue still happens in FreeBSD head, though unlike in pfSense (FreeBSD 12) we can see outbound traffic in packet captures. Replies still come back unexpectedly double tagged:
17:29:30.370205 90:ec:77:1f:8a:5f > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 1001, p 0, ethertype ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.101.0.12 tell 10.101.0.1, length 28
17:29:30.370787 00:51:82:11:22:02 > 90:ec:77:1f:8a:5f, ethertype 802.1Q (0x8100), length 64: vlan 0, p 0, ethertype 802.1Q, vlan 1001, p 0, ethertype ARP, Ethernet (len 6), IPv4 (len 4), Reply 10.101.0.12 is-at 00:51:82:11:22:02, length 42
Hence packets are still dropped and the connection fails.
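When checking captures like the one above by eye gets tedious, a rough sketch such as the following can flag affected lines in `tcpdump -e` text output. The function names are hypothetical and the regular expression assumes the `vlan <id>` wording tcpdump uses in the lines quoted in this issue.

```python
import re

# tcpdump -e prints one "vlan <id>" token per 802.1Q tag, outermost first.
VLAN_RE = re.compile(r"\bvlan (\d+)")


def vlan_stack(line: str):
    """Return the VLAN IDs found in a tcpdump line, outer tag first."""
    return [int(v) for v in VLAN_RE.findall(line)]


def has_spurious_outer_vlan0(line: str) -> bool:
    """True if the frame carries at least two tags and the outer one is VLAN 0."""
    stack = vlan_stack(line)
    return len(stack) >= 2 and stack[0] == 0


# The two lines from the FreeBSD head capture in this comment.
request = ("17:29:30.370205 90:ec:77:1f:8a:5f > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), "
           "length 46: vlan 1001, p 0, ethertype ARP, Ethernet (len 6), IPv4 (len 4), "
           "Request who-has 10.101.0.12 tell 10.101.0.1, length 28")
reply = ("17:29:30.370787 00:51:82:11:22:02 > 90:ec:77:1f:8a:5f, ethertype 802.1Q (0x8100), "
         "length 64: vlan 0, p 0, ethertype 802.1Q, vlan 1001, p 0, ethertype ARP, "
         "Ethernet (len 6), IPv4 (len 4), Reply 10.101.0.12 is-at 00:51:82:11:22:02, length 42")

for line in (request, reply):
    print(vlan_stack(line), has_spurious_outer_vlan0(line))
```

The outbound request carries only the expected tag, while the inbound reply shows the extra outer VLAN 0 tag.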
Updated by Steve Wheeler over 2 years ago
Tested: FreeBSD-14.0-CURRENT-amd64-20220729-467d3e2e8aa-257025-memstick.img
Updated by Kristof Provost over 2 years ago
I've been able to reproduce this (on pfsense/main).
That required the following:
ifconfig vlan create vlandev ix3 vlan 42
ifconfig vlan0 192.168.42.1/24 up
ifconfig ix3 -vlanhwtag
The traffic sent out through the interface is fine, but received traffic is incorrectly double-tagged: once with vlan 0, then with vlan 42 (i.e. the outer tag is 0, the inner is 42).
Reverting the listed patch (9c762cc125c0c2dae9fbf49cc526bb97c14b54a4) fixes the problem.
Interestingly the problem does not occur if the vlanhwtag feature is disabled before the vlan is created. I believe that to be an important clue.
Updated by Kristof Provost over 2 years ago
I proposed a patch in https://reviews.freebsd.org/D36139
It works for me, but I'd like the Intel people (and driver maintainers) to take a look before I commit it.
Updated by Steve Wheeler over 2 years ago
- Status changed from New to Waiting on Merge
This has now been committed upstream: https://github.com/freebsd/freebsd-src/commit/e7abb897018be34f039ad957562fdc2f38aa3562
Updated by Jim Pingle about 2 years ago
- Plus Target Version changed from 22.11 to 23.01
Updated by Steve Wheeler about 2 years ago
- Status changed from Waiting on Merge to Resolved
This fix is now merged into 23.01 and works in current snapshots:
[23.01-DEVELOPMENT][admin@4100.stevew.lan]/root: ping 10.101.0.10
PING 10.101.0.10 (10.101.0.10): 56 data bytes
64 bytes from 10.101.0.10: icmp_seq=0 ttl=64 time=0.412 ms
64 bytes from 10.101.0.10: icmp_seq=1 ttl=64 time=0.271 ms
^C
--- 10.101.0.10 ping statistics ---
2 packets transmitted, 2 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 0.271/0.341/0.412/0.070 ms
[23.01-DEVELOPMENT][admin@4100.stevew.lan]/root: ifconfig ix3.1001
ix3.1001: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
description: VLAN1001
options=4600003<RXCSUM,TXCSUM,RXCSUM_IPV6,TXCSUM_IPV6,NOMAP>
ether 90:ec:77:1f:8a:5f
inet6 fe80::92ec:77ff:fe1f:8a5f%ix3.1001 prefixlen 64 scopeid 0xe
inet 10.101.0.1 netmask 0xffffff00 broadcast 10.101.0.255
groups: vlan
vlan: 1001 vlanproto: 802.1q vlanpcp: 0 parent interface: ix3
media: Ethernet autoselect (1000baseT <full-duplex,rxpause,txpause>)
status: active
nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
[23.01-DEVELOPMENT][admin@4100.stevew.lan]/root: ifconfig ix3
ix3: flags=8863<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=4e138bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,WOL_UCAST,WOL_MCAST,WOL_MAGIC,VLAN_HWFILTER,RXCSUM_IPV6,TXCSUM_IPV6,NOMAP>
ether 90:ec:77:1f:8a:5f
inet6 fe80::92ec:77ff:fe1f:8a5f%ix3 prefixlen 64 scopeid 0x8
inet 172.21.16.232 netmask 0xffffff00 broadcast 172.21.16.255
media: Ethernet autoselect (1000baseT <full-duplex,rxpause,txpause>)
status: active
nd6 options=23<PERFORMNUD,ACCEPT_RTADV,AUTO_LINKLOCAL>
[23.01-DEVELOPMENT][admin@4100.stevew.lan]/root: ifconfig ix3 -vlanhwtag
[23.01-DEVELOPMENT][admin@4100.stevew.lan]/root: ping 10.101.0.10
PING 10.101.0.10 (10.101.0.10): 56 data bytes
64 bytes from 10.101.0.10: icmp_seq=0 ttl=64 time=0.421 ms
64 bytes from 10.101.0.10: icmp_seq=1 ttl=64 time=0.336 ms
64 bytes from 10.101.0.10: icmp_seq=2 ttl=64 time=0.344 ms
^C
--- 10.101.0.10 ping statistics ---
3 packets transmitted, 3 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 0.336/0.367/0.421/0.038 ms
Updated by Jim Pingle about 2 years ago
- Subject changed from Software vlan tagging is broken in ixgbe to Software VLAN tagging does not work on ``ixgbe(4)`` interfaces
Updating subject for release notes.
Updated by Nicolas Embriz almost 2 years ago
Steve Wheeler wrote:
VLAN tagged traffic fails on an ix NIC if hardware vlan tagging is disabled.
For example:
[...]
VLAN hardware tagging is enabled by default so this is not easy to hit.
It produces some unexpected behaviour. In a packet capture on the parent interface there is no outbound VLAN traffic shown at all.
Inbound VLAN traffic appears as double tagged with VLAN0 as the outer tag:
[...]
VLAN0 is expected to be dropped.
This behaviour appears to have been introduced by this commit:
https://github.com/pfsense/FreeBSD-src/commit/9c762cc125c0c2dae9fbf49cc526bb97c14b54a4
All snapshots after 20220314-1916 exhibit it.
Tested on a 4100 on 22.05 and in 2.7. The user who hit this initially is also using a C3K SoC device with the same on-board NICs.
Hi, I am having a similar problem with ixl interfaces. In my case, I can't reach any device in the VLAN, only the router (pfSense):
ixl1: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
description: LAN
options=a100b9<RXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,VLAN_HWFILTER,RXCSUM_IPV6>
ether 64:9d:99:b1:80:03
inet6 fe80::669d:99ff:feb1:8003%ixl1 prefixlen 64 scopeid 0x2
inet6 2a0e:97c0:620:affe::1 prefixlen 64
inet 192.168.0.1 netmask 0xffffff00 broadcast 192.168.0.255
media: Ethernet autoselect (10Gbase-Twinax <full-duplex>)
status: active
nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
plugged: SFP/SFP+/SFP28 1X Copper Passive (Copper pigtail)
ixl1.20: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
description: bhyve
options=200001<RXCSUM,RXCSUM_IPV6>
ether 64:9d:99:b1:80:03
inet6 fe80::669d:99ff:feb1:8003%ixl1.20 prefixlen 64 scopeid 0x11
inet 10.0.0.1 netmask 0xffffff00 broadcast 10.0.0.255
groups: vlan
vlan: 20 vlanpcp: 0 parent interface: ixl1
media: Ethernet autoselect (10Gbase-Twinax <full-duplex>)
status: active
nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>