Regression #13381

open

Software vlan tagging is broken in ixgbe

Added by Steve Wheeler 2 months ago. Updated 27 days ago.

Status: Waiting on Merge
Priority: Normal
Assignee: -
Category: Hardware / Drivers
Target version:
Start date:
Due date:
% Done: 0%
Estimated time:
Plus Target Version: 22.11
Release Notes: Default
Affected Version: 2.7.0
Affected Architecture: amd64

Description

VLAN-tagged traffic fails on an ix NIC if hardware VLAN tagging is disabled.
For example:

[22.05-RELEASE][admin@4100.stevew.lan]/root: ping 10.101.0.12
PING 10.101.0.12 (10.101.0.12): 56 data bytes
64 bytes from 10.101.0.12: icmp_seq=0 ttl=64 time=0.435 ms
64 bytes from 10.101.0.12: icmp_seq=1 ttl=64 time=0.351 ms
64 bytes from 10.101.0.12: icmp_seq=2 ttl=64 time=0.359 ms
64 bytes from 10.101.0.12: icmp_seq=3 ttl=64 time=0.378 ms
^C
--- 10.101.0.12 ping statistics ---
4 packets transmitted, 4 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 0.351/0.381/0.435/0.033 ms
[22.05-RELEASE][admin@4100.stevew.lan]/root: ifconfig ix3
ix3: flags=28943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST,PPROMISC> metric 0 mtu 1500
    description: WAN
    options=8138b8<VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,WOL_UCAST,WOL_MCAST,WOL_MAGIC,VLAN_HWFILTER>
    ether 90:ec:77:1f:8a:5f
    inet6 fe80::92ec:77ff:fe1f:8a5f%ix3 prefixlen 64 scopeid 0x8
    inet 172.21.16.232 netmask 0xffffff00 broadcast 172.21.16.255
    inet 45.65.87.21 netmask 0xffffffc0 broadcast 45.65.87.63 vhid 1
    carp: MASTER vhid 1 advbase 1 advskew 0
    media: Ethernet autoselect (1000baseT <full-duplex,rxpause,txpause>)
    status: active
    nd6 options=23<PERFORMNUD,ACCEPT_RTADV,AUTO_LINKLOCAL>
[22.05-RELEASE][admin@4100.stevew.lan]/root: ifconfig ix3 -vlanhwtag
[22.05-RELEASE][admin@4100.stevew.lan]/root: ifconfig ix3
ix3: flags=28943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST,PPROMISC> metric 0 mtu 1500
    description: WAN
    options=8138a8<VLAN_MTU,JUMBO_MTU,VLAN_HWCSUM,WOL_UCAST,WOL_MCAST,WOL_MAGIC,VLAN_HWFILTER>
    ether 90:ec:77:1f:8a:5f
    inet6 fe80::92ec:77ff:fe1f:8a5f%ix3 prefixlen 64 scopeid 0x8
    inet 172.21.16.232 netmask 0xffffff00 broadcast 172.21.16.255
    inet 45.65.87.21 netmask 0xffffffc0 broadcast 45.65.87.63 vhid 1
    carp: BACKUP vhid 1 advbase 1 advskew 0
    media: Ethernet autoselect (1000baseT <full-duplex,rxpause,txpause>)
    status: active
    nd6 options=23<PERFORMNUD,ACCEPT_RTADV,AUTO_LINKLOCAL>
[22.05-RELEASE][admin@4100.stevew.lan]/root: ping 10.101.0.12
PING 10.101.0.12 (10.101.0.12): 56 data bytes
^C
--- 10.101.0.12 ping statistics ---
4 packets transmitted, 0 packets received, 100.0% packet loss

VLAN hardware tagging is enabled by default so this is not easy to hit.
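
The two `options=` values in the `ifconfig` output above differ by a single bit. As a quick sanity check, the sketch below XORs them and names the changed capability. The bit values are what I understand FreeBSD's `net/if.h` to define (an assumption worth verifying against your source tree), and the `IFCAP` dictionary is an illustrative name, not part of any API.

```python
# Capability bits as I understand them from FreeBSD's net/if.h
# (assumed values; verify against your own source tree).
IFCAP = {
    0x00008: "VLAN_MTU",
    0x00010: "VLAN_HWTAGGING",
    0x00020: "JUMBO_MTU",
    0x00080: "VLAN_HWCSUM",
    0x10000: "VLAN_HWFILTER",
}

before = 0x8138B8  # options= with hardware tagging enabled
after = 0x8138A8   # options= after `ifconfig ix3 -vlanhwtag`

# XOR leaves only the bits that differ between the two values.
changed = before ^ after
changed_names = [name for bit, name in IFCAP.items() if changed & bit]
print(hex(changed), changed_names)  # prints: 0x10 ['VLAN_HWTAGGING']
```

This only confirms that `-vlanhwtag` cleared the one flag expected and that `VLAN_HWFILTER` stayed set, which matches the `ifconfig` output.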

This produces some unexpected behaviour. In a packet capture on the parent interface, no outbound VLAN traffic is shown at all.
Inbound VLAN traffic appears double tagged, with VLAN0 as the outer tag:

20:30:18.349238 00:51:82:11:22:02 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 68: vlan 0, p 0, ethertype 802.1Q, vlan 1001, p 0, ethertype ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.101.0.1 tell 10.101.0.12, length 46

VLAN0 is expected to be dropped.
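
To make the double tagging concrete, here is a minimal sketch that walks the 802.1Q tag stack of a raw Ethernet header. The frame is built by hand to mirror the capture above (outer tag 0, inner tag 1001, ARP payload type). `parse_vlan_stack` is a hypothetical helper for illustration, not code from the driver or from the proposed patch.

```python
import struct

TPID_8021Q = 0x8100  # EtherType value that marks an 802.1Q tag


def parse_vlan_stack(frame: bytes):
    """Return (VLAN IDs, outermost first; EtherType after the last tag)."""
    offset = 12  # skip the destination and source MAC addresses
    vids = []
    (ethertype,) = struct.unpack_from("!H", frame, offset)
    while ethertype == TPID_8021Q:
        (tci,) = struct.unpack_from("!H", frame, offset + 2)
        vids.append(tci & 0x0FFF)  # the low 12 bits of the TCI are the VLAN ID
        offset += 4  # each tag is 2 bytes of TPID plus 2 bytes of TCI
        (ethertype,) = struct.unpack_from("!H", frame, offset)
    return vids, ethertype


# Header shaped like the captured frame: broadcast destination,
# outer tag with VLAN 0, inner tag with VLAN 1001, then ARP (0x0806).
dst = bytes.fromhex("ffffffffffff")
src = bytes.fromhex("005182112202")
frame = dst + src + struct.pack("!HHHHH", 0x8100, 0, 0x8100, 1001, 0x0806)

vids, ethertype = parse_vlan_stack(frame)
print(vids, hex(ethertype))  # prints: [0, 1001] 0x806
```

A correctly received frame on this VLAN would yield a single tag, `[1001]`, so the leading `0` is the anomaly to look for.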

This behaviour appears to have been introduced by this commit:
https://github.com/pfsense/FreeBSD-src/commit/9c762cc125c0c2dae9fbf49cc526bb97c14b54a4
All snapshots after 20220314-1916 exhibit it.

Tested on a 4100 on 22.05 and in 2.7. The user who hit this initially is also using a C3K SoC device with the same on-board NICs.

See: https://forum.netgate.com/topic/173149/pfsense-22-05-breaks-vlans-restoring-pfsense-22-01-fixes-the-issue

Actions #1

Updated by Steve Wheeler about 2 months ago

It looks like this issue still happens in FreeBSD Head, though unlike in pfSense (FreeBSD 12) we can see outbound traffic in packet captures. Replies still come back unexpectedly double tagged:

17:29:30.370205 90:ec:77:1f:8a:5f > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 1001, p 0, ethertype ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.101.0.12 tell 10.101.0.1, length 28
17:29:30.370787 00:51:82:11:22:02 > 90:ec:77:1f:8a:5f, ethertype 802.1Q (0x8100), length 64: vlan 0, p 0, ethertype 802.1Q, vlan 1001, p 0, ethertype ARP, Ethernet (len 6), IPv4 (len 4), Reply 10.101.0.12 is-at 00:51:82:11:22:02, length 42

Hence packets are still dropped and the connection fails.
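
For anyone checking their own capture, the sketch below scans `tcpdump -e` text output for frames whose outer tag is VLAN 0 followed by a second tag. The two sample lines are the ones quoted above; the `vlan_tags` helper and the regular expression are illustrative assumptions about the text format, not a general tcpdump parser.

```python
import re

# Matches each "vlan <id>" field that tcpdump prints per 802.1Q tag.
VLAN_RE = re.compile(r"\bvlan (\d+)")


def vlan_tags(line: str):
    """Return the VLAN IDs printed on one tcpdump line, outermost first."""
    return [int(v) for v in VLAN_RE.findall(line)]


lines = [
    "17:29:30.370205 90:ec:77:1f:8a:5f > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 1001, p 0, ethertype ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.101.0.12 tell 10.101.0.1, length 28",
    "17:29:30.370787 00:51:82:11:22:02 > 90:ec:77:1f:8a:5f, ethertype 802.1Q (0x8100), length 64: vlan 0, p 0, ethertype 802.1Q, vlan 1001, p 0, ethertype ARP, Ethernet (len 6), IPv4 (len 4), Reply 10.101.0.12 is-at 00:51:82:11:22:02, length 42",
]

# Flag frames carrying more than one tag where the outer tag is VLAN 0.
suspicious = [l for l in lines if len(vlan_tags(l)) > 1 and vlan_tags(l)[0] == 0]
for l in suspicious:
    print(vlan_tags(l), l.split(" ", 1)[0])
```

With the sample above, only the reply is flagged: the request carries the single tag 1001, while the reply carries 0 then 1001.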

Actions #2

Updated by Steve Wheeler about 2 months ago

Tested: FreeBSD-14.0-CURRENT-amd64-20220729-467d3e2e8aa-257025-memstick.img

Actions #3

Updated by Kristof Provost about 2 months ago

I've been able to reproduce this (on pfsense/main).

That required the following:

ifconfig vlan create vlandev ix3 vlan 42
ifconfig vlan0 192.168.42.1/24 up
ifconfig ix3 -vlanhwtag

The traffic sent out through the interface is fine, but received traffic is incorrectly double-tagged: once with vlan 0, then with vlan 42 (i.e. the outer tag is 0 and the inner tag is 42).

Reverting the listed patch (9c762cc125c0c2dae9fbf49cc526bb97c14b54a4) fixes the problem.

Interestingly the problem does not occur if the vlanhwtag feature is disabled before the vlan is created. I believe that to be an important clue.

Actions #4

Updated by Kristof Provost about 2 months ago

I proposed a patch in https://reviews.freebsd.org/D36139
It works for me, but I'd like the Intel people (and driver maintainers) to take a look before I commit it.

Actions #5

Updated by Steve Wheeler 27 days ago

  • Status changed from New to Waiting on Merge