Bug #2257
closedMultiple related bugs, apparent causes and supporting/disagnostic data (please change bug report title as needed)
0%
Description
Around 00.50 today I modified my MTU and saved config, making no other changes. Multiple strange issues occurred which suggest both an interface bug and untrapped error conditions in the connection logic. Hopefully easy enough to verify and fix.
- In some circumstances defining an MTU for a WAN link can cause other unrelated WAN config such as bindings to be silently "lost"/deleted on saving. (crossref issue #2067)
- A knock-on effect is the relevant interface immediately dies, and attempts to bring it up fail abnormally. The system log shows that interfaces_bring_up() fails due to "no variable defined". Pfsense doesn't seem to trap the failure to create an interface and just continues trying the rest of the connection sequence.
- This shows in two ways.
- PPP log shows numerous connection attempts, all failing due to "Skipping link XXX with undefined interface". The user isn't notified at this stage and the ticker doesn't report the failure to define the interface (which might suggest an interface config problem or system issue).
- Instead of stopping with an error when expected interfaces can't be created, pfsense continues with presenting pfctl with filter rules to load. Pfctl interprets a config that sets rules on a non-existant interface as a syntax issue.
- The user is explicitly - but misleadingly - notified that filtering failed to start "due to a syntax error in the rules", but the "incorrect" line numbers point to innocuous system generated rules that worked before.
Data for checking and fixing, including timestamps, follows. Config available if it's not enough.
I set an MTU at 00:50:02. The visible symptom was that on saving an MTU the gateway immediately went down and the alert ticker quickly reported pfctl had identified a syntax error in the filter rules and was unable to load them. I fixed the config at 01:18:08.
Filter reload status display (ticker showed the same):
There were error(s) loading the rules: /tmp/rules.debug:127: syntax error/tmp/rules.debug:128: syntax error pfctl: Syntax error in config file: pf rules not loaded - The line in question reads [127]: block in log quick on { pppoe0 } proto { tcp udp } from !/ to any port $Always_blocked_ports label "USER_RULE: Incoming to blocked ports" ...
I had taken a backup of /tmp/rules.debugf less than an hour earlier while everything was good, which made clear the relevant line was innocuous, it read:
block in log quick on { pppoe0 } reply-to ( pppoe0 ***<ISP's gateway IP>***) proto { tcp udp } from !***<RouterPublicIP>***/32 to any port $Always_blocked_ports
Relevant system log at time of rule change (irrelevant lines such as rrd stripped out):
00:52:34 php: : The command '/sbin/pfctl -o basic -f /tmp/rules.debug' returned exit code '1', the output was '/tmp/rules.debug:127: syntax error /tmp/rules.debug:128: syntax error pfctl: Syntax error in config file: pf rules not loaded' 00:51:24 apinger: No usable targets found, exiting 00:51:24 apinger: Starting Alarm Pinger, apinger(***) 00:51:24 check_reload_status: Reloading filter 00:51:23 apinger: Exiting on signal 15. 00:51:22 php: : Backtrace: Array 00:51:22 php: : interfaces_bring_up() was called but no variable defined. 00:51:21 check_reload_status: updating dyndns wan 00:51:21 dhcpd: Copyright 2004-2011 Internet Systems Consortium. [etc] 00:51:21 dhcpd: Internet Systems Consortium DHCP Server *** 00:51:17 check_reload_status: Rewriting resolv.conf 00:51:17 php: /interfaces.php: Backtrace: Array 00:51:17 php: /interfaces.php: interfaces_bring_up() was called but no variable defined. 00:51:17 check_reload_status: Configuring interface wan 00:50:03 check_reload_status: Syncing firewall 00:31:35 check_reload_status: Syncing firewall 00:30:20 php: /index.php: Successful webConfigurator login for user '***' from ***** 00:30:20 php: /index.php: Successful webConfigurator login for user '***' from ***** 23:15:13 php: /diag_system_activity.php: Session timed out for user '***' from: ***** 21:42:08 sshlockout[****]: **** 21:42:08 php: /diag_logs.php: Successful webConfigurator login for user '***' from ***** 21:42:08 php: /diag_logs.php: Successful webConfigurator login for user '***' from ***** 21:15:00 check_reload_status: Reloading filter 21:14:50 apinger: alarm canceled: WAN(*****) *** WANdelay *** 21:14:31 check_reload_status: Reloading filter 21:14:21 apinger: ALARM: WAN(*****) *** WANdelay *** 20:00:06 check_reload_status: Reloading filter 20:00:06 check_reload_status: Syncing firewall 20:00:05 check_reload_status: Syncing firewall 19:38:41 check_reload_status: Reloading filter 19:38:39 php: : IPSEC: One or more IPsec tunnel endpoints has changed its IP. Refreshing. 19:38:22 check_reload_status: Reloading filter 19:38:22 check_reload_status: Reloading filter 19:38:20 php: : Starting ***<other package>*** 19:38:15 check_reload_status: Syncing firewall [reloaded/resynced several times, possibly due to user browser action] 19:38:14 check_reload_status: Reloading filter 19:37:16 check_reload_status: Syncing firewall 19:37:06 check_reload_status: Reloading filter 19:37:05 check_reload_status: Syncing firewall 19:37:04 check_reload_status: Syncing firewall 19:35:58 check_reload_status: Reloading filter 19:35:58 check_reload_status: Syncing firewall 19:34:52 check_reload_status: Reloading filter 19:34:52 check_reload_status: Syncing firewall 19:33:26 php: : Restarting/Starting all packages. 19:33:08 check_reload_status: Restarting ipsec tunnels 19:31:24 dhcpd: For info, please visit https://www.isc.org/software/dhcp/ 19:31:24 dhcpd: All rights reserved. 19:31:24 dhcpd: Copyright 2004-2011 Internet Systems Consortium.
PPP log contained numerous copies of:
01:06:45 ppp: [wan_link0] Link: DOWN event 01:06:45 ppp: [wan_link0] PPPoE node for link is not initialized 01:06:45 ppp: [wan_link0] PPPoE: Skipping link "wan_link0" with undefined interface 01:06:45 ppp: [wan_link0] Link: reconnection attempt 375 01:06:41 ppp: [wan_link0] Link: reconnection attempt 375 in 4 seconds 01:06:41 ppp: [wan_link0] LCP: Down event 01:06:41 ppp: [wan_link0] Link: DOWN event 01:06:41 ppp: [wan_link0] PPPoE node for link is not initialized 01:06:41 ppp: [wan_link0] PPPoE: Skipping link "wan_link0" with undefined interface 01:06:41 ppp: [wan_link0] Link: reconnection attempt 374 01:06:37 ppp: [wan_link0] Link: reconnection attempt 374 in 4 seconds
Diff for the config change at 00:50 (some stuff blanked):
Configuration diff from *** 00:31:34 to *** 00:50:02 --- /conf/backup/config-*****.xml *** 00:50:02.000000000 +0000 +++ /conf/config.xml *** 00:50:02.000000000 +0000 @@ *** @@ <descr><![CDATA[WAN]]></descr> - <ipaddr>pppoe</ipaddr> + <ipaddr>pppoe</ipaddr> + <mtu>***</mtu> <<--------------- intended @@ *** @@ <type>pppoe</type> <if>pppoe0</if> - <ports>****</ports> <<--------------- not intended/ + <ports/> <<--------------- silent <username>****</username> <password>****</password> - <provider/> <ondemand/> + <provider/> </ppp>
Although I only changed MTU (leaving "ondemand" unchanged), the config had also made an unrequested change from <ports>****</ports> to <ports/>. When I changed this back and manually clicked "reconnect" it immediately all worked again, although the interface had an unusual "ng" label I'm not sure if that's normal or not. PPP log:
01:19:58 ppp: [wan_link0] *** 01:19:58 ppp: [wan_link0] PROTOCOMP 01:19:58 ppp: [wan_link0] LCP: SendConfigReq #1 01:19:58 ppp: [wan_link0] LCP: state change Starting --> Req-Sent 01:19:58 ppp: [wan_link0] LCP: Up event 01:19:58 ppp: [wan_link0] Link: UP event 01:19:58 ppp: [wan_link0] PPPoE: connection successful 01:19:58 ppp: PPPoE: rec'd ACNAME **** 01:19:58 ppp: [wan_link0] PPPoE: Connecting to '' 01:19:58 ppp: [wan_link0] LCP: LayerStart 01:19:58 ppp: [wan_link0] LCP: state change Initial --> Starting 01:19:58 ppp: [wan_link0] LCP: Open event 01:19:58 ppp: [wan_link0] Link: OPEN event 01:19:58 ppp: [wan] Bundle: Interface ng* created 01:19:58 ppp: web: web is not running 01:19:58 ppp: last message repeated 2 times 01:19:57 ppp: process **** terminated 01:19:57 ppp: [wan_link0] Link: Shutdown 01:19:57 ppp: [wan] Bundle: Shutdown 01:19:55 ppp: [wan] IPCP: Close event 01:19:55 ppp: [wan] IFACE: Close event 01:19:55 ppp: caught fatal signal term 01:19:55 ppp: waiting for process **** to die... 01:19:55 ppp: process **** started, version **** 01:19:55 ppp: 01:19:55 ppp: Multi-link PPP daemon for FreeBSD 01:19:54 ppp: [wan_link0] Link: reconnection attempt 698 in 4 seconds 01:19:54 ppp: [wan_link0] LCP: Down event 01:19:54 ppp: [wan_link0] Link: DOWN event 01:19:54 ppp: [wan_link0] PPPoE node for link is not initialized 01:19:54 ppp: [wan_link0] PPPoE: Skipping link "wan_link0" with undefined interface
System log showed once the interface was created, attempts to reload the rules didn't give rise to a "syntax error":
01:20:05 apinger: Starting Alarm Pinger, apinger(***) 01:20:05 check_reload_status: Reloading filter 01:20:04 php: : Removing static route for monitor *** and adding a new route through *** 01:20:04 php: : ROUTING: setting default route to *** 01:18:09 check_reload_status: Syncing firewall 00:52:34 php: : The command '/sbin/pfctl -o basic -f /tmp/rules.debug' returned exit code '1', the output was '/tmp/rules.debug:127: syntax error /tmp/rules.debug:128: syntax error pfctl: Syntax error in config file: pf rules not loaded'
Suspected underlying issues:
- The interface has a bug that in some cases changing MTU can cause interface <ports/> data to be silently deleted (almost certainly the same issue as #2067);
- Connection logic doesn't trap or test for error conditions:
- when the attempt to create/define an interface fails;
- when the kind of error returned by a PPP attempt suggests a system or config abnormality rather than usual connection issues (ie "try again 500 times" probably isn't the best response);
- when prerequisite setup/definitions for the filters don't exist in the system at the point pfctl is called to load rules which refer to them.
I'm basing this on the system log line "no variable defined" in relation to the pppoe interface it was trying to bring up, combined with a config diff that shows key data needed to bring up the interface got silently deleted.
Hopefully just needs verifying and fixing now