Bug #16599

open

Wireguard can use the wrong gateway under certain circumstances

Added by Allan Hsu 3 days ago. Updated 3 days ago.

Status: New
Priority: Normal
Category: WireGuard
Target version: -
Start date:
Due date:
% Done: 0%
Estimated time:
Plus Target Version:
Release Notes: Default
Affected Version: 2.8.1
Affected Architecture:

Description

My current pfSense environment has three WAN connections:
  • WAN1: Fiber ONT directly connected to interface ix0
  • WAN2: Cable modem connected via VLAN at ixl0.200
  • WAN3: Unifi 5G Max device connected via GRE tunnel gre0 over ixl0.2 as described here

All three WANs provide both IPv4 and IPv6 connectivity.

There are three gateway groups using WAN 1 and 2, duplicated across IPv4 and IPv6:
  • Gateway Group 1, using global behaviour
    1. Tier 1: WAN1
    2. Tier 2: WAN2
  • Gateway Group 2, set to keep states on gateway recovery
    1. Tier 1: WAN1
    2. Tier 2: WAN2
  • Gateway Group 3, using global behaviour
    1. Tier 1: WAN2
    2. Tier 2: WAN1
Global behaviour is set to:
  • State Killing on Gateway Recovery: "Kill all states for lower-priority gateways"
  • State Killing on Gateway Failure: "Kill states for gateways which are down"

The default gateway in System->Routing->Gateways is set to use the IPv4 and IPv6 versions of Group 1.

WAN3 is not part of any gateway group (it is set to "Never" in all of them) and is currently used only for manual testing. In the scenarios described below, no traffic has been routed through it by any firewall rule that has been active since boot.
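To make the expected behaviour concrete, here is a minimal sketch of tier-based selection for the groups listed above. It is a simplified model for illustration only, not pfSense's actual implementation: the names `GATEWAY_GROUPS` and `expected_gateway` are hypothetical, and state-killing behaviour is ignored. The point it shows is that WAN3, being set to "Never" everywhere, should not be a candidate under any combination of outages.

```python
from typing import Dict, Optional, Set

# Hypothetical model of the gateway groups in this report: gateway -> tier.
# A gateway set to "Never" (WAN3) is simply absent from every group.
GATEWAY_GROUPS: Dict[str, Dict[str, int]] = {
    "Group1": {"WAN1": 1, "WAN2": 2},
    "Group2": {"WAN1": 1, "WAN2": 2},
    "Group3": {"WAN2": 1, "WAN1": 2},
}


def expected_gateway(group: str, online: Set[str]) -> Optional[str]:
    """Return the online group member with the lowest tier, or None."""
    members = GATEWAY_GROUPS[group]
    candidates = [gw for gw in members if gw in online]
    if not candidates:
        return None
    return min(candidates, key=lambda gw: members[gw])


# Default gateway is Group 1 in this setup.
print(expected_gateway("Group1", {"WAN1", "WAN2", "WAN3"}))  # WAN1
print(expected_gateway("Group1", {"WAN2", "WAN3"}))          # WAN2 (WAN1 outage)
print(expected_gateway("Group1", {"WAN3"}))                  # None, never WAN3
```

Under this model, WAN1 recovering should make it the expected gateway for Group 1 again, and no outage should ever select WAN3.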

There are two Wireguard tunnels configured:
  1. WG1: peer endpoint configured via hostname, typically connected via IPv6
  2. WG2: peer endpoint configured via static IPv4 address.
There are two situations in which I have noticed Wireguard using the wrong gateway in this setup:
  1. Wireguard fails to switch back to WAN1 after a WAN1 outage
    Whenever WAN1 has an outage, both Wireguard connections gracefully switch over to WAN2, but they remain "stuck" on WAN2 after WAN1 recovers until Wireguard is manually restarted via the web UI.
  2. WG1 switches to WAN3 after physically unplugging the SFP+ module plugged into ixl0.
    This was discovered by chance while cleaning up some wire management and briefly unplugging the SFP+ module in ixl0. This is particularly weird because unplugging ixl0 does not affect the WAN1 connection plugged into ix0 and WAN3 is not configured to be part of the default system gateway group or any gateway group, for that matter. An interesting thing to note is that this switch to WAN3 only happens to WG1 and not WG2, which suggests that this may have something to do with IPv6. As with the first situation, this can be fixed by manually restarting Wireguard via the web UI. It can be recreated at will by unplugging ixl0 again.

I initially noticed this issue after seeing atypical latency numbers for the Wireguard gateways and then confirmed that they were going over the wrong gateways by searching for the Wireguard peer addresses in the state table.

Actions #1

Updated by Allan Hsu 3 days ago

I should add that I'm not sure whether it's unplugging or replugging the SFP+ module that causes the switch over to WAN3 in the second situation, as I have only ever checked on the state of the machine after plugging the module back in.
