Bug #16821

open

Filter compiler generates spurious block rule for gateway monitor IP on DHCP dynamic gateway

Added by Mike Gschwendtner about 10 hours ago. Updated about 6 hours ago.

Status:
Incomplete
Priority:
Normal
Assignee:
-
Category:
Gateway Monitoring
Target version:
-
Start date:
Due date:
% Done:

0%

Estimated time:
Release Notes:
Default
Affected Plus Version:
26.03
Affected Architecture:
amd64

Description

I'm running pfSense Plus 26.03 with a 4-WAN multi-WAN failover setup (AT&T fiber, Starlink in bypass mode, two cellular backups). All four WANs are DHCP. Each gateway has a unique monitor IP (8.8.8.8, 208.67.222.222, 9.9.9.9, 1.0.0.1).
The Starlink gateway (STARLINK_DHCP) intermittently gets stuck showing 100% loss and down, even though the interface has a valid CGNAT IP, the link is up, and dpinger is running with the correct bind address and monitor target. The dish is completely healthy — the Starlink app shows 100% connectivity and I can pass real traffic through the interface.
After a lot of digging, I found the root cause. The filter compiler is generating this rule in /tmp/rules.debug:
block drop out log quick inet proto icmp from any to 208.67.222.222 icmp-type echoreq label "descr=gateway monitoring"
This shows up in pfctl -sr and actively blocks dpinger's ICMP echo requests before they ever leave the box. Because the rule is marked quick, pf stops evaluating at that match, so the built-in "let out anything from firewall host itself" pass rule that follows is never consulted. Manually running ping -S <starlink_ip> 208.67.222.222 also returns "Permission denied", which confirms that pf is dropping the packet and that this is not a network issue.
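To illustrate why the quick keyword matters here, the following is a minimal Python sketch of pf's rule evaluation order (last matching rule wins, unless a matching rule is marked quick). It is a toy model, not pf itself: the Rule class, the evaluate function, and the two-rule ruleset are hypothetical, and the rule order is assumed from the behavior described above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Rule:
    action: str            # "pass" or "block"
    dst: Optional[str]     # destination IP to match, or None for "any"
    quick: bool = False    # if True, a match ends evaluation immediately
    label: str = ""


def evaluate(rules: List[Rule], dst: str) -> Optional[Rule]:
    """Return the rule that decides the packet, or None if nothing matches.

    Models pf semantics: the last matching rule wins, unless a matching
    rule is marked quick, in which case evaluation stops at that rule.
    """
    decision = None
    for rule in rules:
        if rule.dst is not None and rule.dst != dst:
            continue  # rule does not apply to this destination
        decision = rule
        if rule.quick:
            break  # quick: no later rule is evaluated
    return decision


# Hypothetical two-rule ruleset mirroring the situation in this report.
ruleset = [
    Rule("block", "208.67.222.222", quick=True, label="gateway monitoring"),
    Rule("pass", None, label="let out anything from firewall host itself"),
]

print(evaluate(ruleset, "208.67.222.222").action)  # block
print(evaluate(ruleset, "8.8.8.8").action)         # pass
```

Without quick on the block rule, the later pass rule would be the last match and the echo request would go out, which is consistent with the symptom only appearing while the spurious rule is present.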
The rule survives pfctl -d && pfctl -e -f /tmp/rules.debug because it's baked into rules.debug itself.
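For anyone who wants to check whether the rule is currently present, here is a rough Python sketch that scans ruleset text for block rules carrying the "gateway monitoring" label and reports the destination address. The function name find_monitor_blocks and the regular expression are my own; the pattern assumes the rule has the same shape as the one quoted above, and the pass line in the sample is illustrative rather than copied from a real ruleset.

```python
import re

# Matches block rules labelled "descr=gateway monitoring" and captures the
# destination that follows "to". Assumes the rule shape quoted in this report.
MONITOR_BLOCK = re.compile(
    r'^block\b.*\bto\s+(?P<dst>\S+).*label\s+"descr=gateway monitoring"'
)


def find_monitor_blocks(rules_text):
    """Return destinations of block rules labelled as gateway monitoring."""
    blocked = []
    for line in rules_text.splitlines():
        match = MONITOR_BLOCK.match(line.strip())
        if match:
            blocked.append(match.group("dst"))
    return blocked


# Sample input: the block rule is the one from this report; the pass rule is
# a simplified, illustrative stand-in.
sample = '''pass out inet all label "let out anything from firewall host itself"
block drop out log quick inet proto icmp from any to 208.67.222.222 icmp-type echoreq label "descr=gateway monitoring"
'''

print(find_monitor_blocks(sample))  # ['208.67.222.222']
```

To run it against real data, pass in the contents of a copy of /tmp/rules.debug (for example, open(path).read()) and compare the result against the configured monitor IPs.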
Workaround: Edit the gateway, clear the monitor IP field, save and apply, then re-add the monitor IP and save/apply again. After that, the block rule disappears from rules.debug and the gateway comes back online immediately. It stays working until the bug triggers again.
I haven't been able to nail down the exact trigger. It seems related to DHCP interface events — Starlink in bypass mode does frequent DHCP cycling due to CGNAT IP rotation and satellite handoffs. The gateway is dynamic (shows the ~ marker). I've seen it happen multiple times over the past few weeks. The three other DHCP gateways with their own monitor IPs have not been affected, though they don't cycle DHCP nearly as often as Starlink does.
Environment:

pfSense Plus 26.03-RELEASE
Starlink Gen 3 in bypass mode, DHCP on igc4
Gateway: STARLINK_DHCP (dynamic), Monitor IP: 208.67.222.222
Use non-local gateway: enabled
Probe interval: 5000ms, loss interval: 5000ms, time period: 120000ms
Four DHCP WAN gateways in a tiered gateway group

Related: The Netgate forum thread "Outbound ping blocked" (https://forum.netgate.com/topic/198359) describes the identical symptom on CE 2.8.0 for IPv6 gateway monitoring — same spurious block drop out quick rule with the "gateway monitoring" label. Same workaround (remove and re-add monitor IP). That poster also noted it recurs after reboots.

#1

Updated by Marcos M about 6 hours ago

  • Status changed from New to Incomplete

There's likely some other root cause that results in the issue; the rule itself is added (depending on the current status and configuration) to prevent other routing issues. The discussion can continue on the forum to narrow down a root cause.
