Project

General

Profile

Actions

Bug #2257

closed

Multiple related bugs, apparent causes and supporting/disagnostic data (please change bug report title as needed)

Added by Stilez y about 12 years ago. Updated about 8 years ago.

Status:
Closed
Priority:
Normal
Assignee:
-
Category:
-
Target version:
-
Start date:
03/04/2012
Due date:
% Done:

0%

Estimated time:
Plus Target Version:
Release Notes:
Affected Version:
2.0.1
Affected Architecture:

Description

Around 00.50 today I modified my MTU and saved config, making no other changes. Multiple strange issues occurred which suggest both an interface bug and untrapped error conditions in the connection logic. Hopefully easy enough to verify and fix.

  1. In some circumstances defining an MTU for a WAN link can cause other unrelated WAN config such as bindings to be silently "lost"/deleted on saving. (crossref issue #2067)
  2. A knock-on effect is the relevant interface immediately dies, and attempts to bring it up fail abnormally. The system log shows that interfaces_bring_up() fails due to "no variable defined". Pfsense doesn't seem to trap the failure to create an interface and just continues trying the rest of the connection sequence.
  3. This shows in two ways.
    • PPP log shows numerous connection attempts, all failing due to "Skipping link XXX with undefined interface". The user isn't notified at this stage and the ticker doesn't report the failure to define the interface (which might suggest an interface config problem or system issue).
    • Instead of stopping with an error when expected interfaces can't be created, pfsense continues with presenting pfctl with filter rules to load. Pfctl interprets a config that sets rules on a non-existant interface as a syntax issue.
  4. The user is explicitly - but misleadingly - notified that filtering failed to start "due to a syntax error in the rules", but the "incorrect" line numbers point to innocuous system generated rules that worked before.

Data for checking and fixing, including timestamps, follows. Config available if it's not enough.

I set an MTU at 00:50:02. The visible symptom was that on saving an MTU the gateway immediately went down and the alert ticker quickly reported pfctl had identified a syntax error in the filter rules and was unable to load them. I fixed the config at 01:18:08.

Filter reload status display (ticker showed the same):

There were error(s) loading the rules: 
  /tmp/rules.debug:127: syntax error/tmp/rules.debug:128: syntax error pfctl: Syntax error in config file: 
  pf rules not loaded - The line in question reads [127]: block in log quick on { pppoe0 } proto { tcp udp } 
  from !/ to any port $Always_blocked_ports label "USER_RULE: Incoming to blocked ports" ...

I had taken a backup of /tmp/rules.debugf less than an hour earlier while everything was good, which made clear the relevant line was innocuous, it read:

block  in log  quick  on {  pppoe0  } reply-to ( pppoe0 ***<ISP's gateway IP>***)  proto { tcp udp }  
  from !***<RouterPublicIP>***/32 to any port $Always_blocked_ports

Relevant system log at time of rule change (irrelevant lines such as rrd stripped out):

00:52:34 php: : The command '/sbin/pfctl -o basic -f /tmp/rules.debug' returned exit code '1', the output 
  was '/tmp/rules.debug:127: syntax error /tmp/rules.debug:128: syntax error pfctl: Syntax error in config file: 
  pf rules not loaded' 
00:51:24 apinger: No usable targets found, exiting 
00:51:24 apinger: Starting Alarm Pinger, apinger(***) 
00:51:24 check_reload_status: Reloading filter 
00:51:23 apinger: Exiting on signal 15. 
00:51:22 php: : Backtrace: Array 
00:51:22 php: : interfaces_bring_up() was called but no variable defined. 
00:51:21 check_reload_status: updating dyndns wan 
00:51:21 dhcpd: Copyright 2004-2011 Internet Systems Consortium. [etc]
00:51:21 dhcpd: Internet Systems Consortium DHCP Server ***
00:51:17 check_reload_status: Rewriting resolv.conf 
00:51:17 php: /interfaces.php: Backtrace: Array 
00:51:17 php: /interfaces.php: interfaces_bring_up() was called but no variable defined. 
00:51:17 check_reload_status: Configuring interface wan 
00:50:03 check_reload_status: Syncing firewall 
00:31:35 check_reload_status: Syncing firewall 
00:30:20 php: /index.php: Successful webConfigurator login for user '***' from ***** 
00:30:20 php: /index.php: Successful webConfigurator login for user '***' from ***** 
23:15:13 php: /diag_system_activity.php: Session timed out for user '***' from: ***** 
21:42:08 sshlockout[****]: ****
21:42:08 php: /diag_logs.php: Successful webConfigurator login for user '***' from ***** 
21:42:08 php: /diag_logs.php: Successful webConfigurator login for user '***' from ***** 
21:15:00 check_reload_status: Reloading filter 
21:14:50 apinger: alarm canceled: WAN(*****) *** WANdelay *** 
21:14:31 check_reload_status: Reloading filter 
21:14:21 apinger: ALARM: WAN(*****) *** WANdelay *** 
20:00:06 check_reload_status: Reloading filter 
20:00:06 check_reload_status: Syncing firewall 
20:00:05 check_reload_status: Syncing firewall 
19:38:41 check_reload_status: Reloading filter 
19:38:39 php: : IPSEC: One or more IPsec tunnel endpoints has changed its IP. Refreshing. 
19:38:22 check_reload_status: Reloading filter 
19:38:22 check_reload_status: Reloading filter 
19:38:20 php: : Starting ***<other package>***
19:38:15 check_reload_status: Syncing firewall 
  [reloaded/resynced several times, possibly due to user browser action]
19:38:14 check_reload_status: Reloading filter 
19:37:16 check_reload_status: Syncing firewall 
19:37:06 check_reload_status: Reloading filter 
19:37:05 check_reload_status: Syncing firewall 
19:37:04 check_reload_status: Syncing firewall 
19:35:58 check_reload_status: Reloading filter 
19:35:58 check_reload_status: Syncing firewall 
19:34:52 check_reload_status: Reloading filter 
19:34:52 check_reload_status: Syncing firewall 
19:33:26 php: : Restarting/Starting all packages. 
19:33:08 check_reload_status: Restarting ipsec tunnels 
19:31:24 dhcpd: For info, please visit https://www.isc.org/software/dhcp/ 
19:31:24 dhcpd: All rights reserved. 
19:31:24 dhcpd: Copyright 2004-2011 Internet Systems Consortium.

PPP log contained numerous copies of:

01:06:45 ppp: [wan_link0] Link: DOWN event
01:06:45 ppp: [wan_link0] PPPoE node for link is not initialized
01:06:45 ppp: [wan_link0] PPPoE: Skipping link "wan_link0" with undefined interface
01:06:45 ppp: [wan_link0] Link: reconnection attempt 375
01:06:41 ppp: [wan_link0] Link: reconnection attempt 375 in 4 seconds
01:06:41 ppp: [wan_link0] LCP: Down event
01:06:41 ppp: [wan_link0] Link: DOWN event
01:06:41 ppp: [wan_link0] PPPoE node for link is not initialized
01:06:41 ppp: [wan_link0] PPPoE: Skipping link "wan_link0" with undefined interface
01:06:41 ppp: [wan_link0] Link: reconnection attempt 374
01:06:37 ppp: [wan_link0] Link: reconnection attempt 374 in 4 seconds

Diff for the config change at 00:50 (some stuff blanked):

Configuration diff from *** 00:31:34 to *** 00:50:02

--- /conf/backup/config-*****.xml    *** 00:50:02.000000000 +0000 
+++ /conf/config.xml    *** 00:50:02.000000000 +0000 
@@ *** @@ 
            <descr><![CDATA[WAN]]></descr> 
-            <ipaddr>pppoe</ipaddr> 
+            <ipaddr>pppoe</ipaddr> 
+            <mtu>***</mtu>                    <<--------------- intended
@@ *** @@ 
            <type>pppoe</type> 
            <if>pppoe0</if>
-            <ports>****</ports>               <<--------------- not intended/
+            <ports/>                          <<---------------    silent
            <username>****</username> 
            <password>****</password> 
-            <provider/> 
            <ondemand/>
+            <provider/> 
        </ppp>

Although I only changed MTU (leaving "ondemand" unchanged), the config had also made an unrequested change from <ports>****</ports> to <ports/>. When I changed this back and manually clicked "reconnect" it immediately all worked again, although the interface had an unusual "ng" label I'm not sure if that's normal or not. PPP log:

01:19:58 ppp: [wan_link0] ***
01:19:58 ppp: [wan_link0] PROTOCOMP
01:19:58 ppp: [wan_link0] LCP: SendConfigReq #1
01:19:58 ppp: [wan_link0] LCP: state change Starting --> Req-Sent
01:19:58 ppp: [wan_link0] LCP: Up event
01:19:58 ppp: [wan_link0] Link: UP event
01:19:58 ppp: [wan_link0] PPPoE: connection successful
01:19:58 ppp: PPPoE: rec'd ACNAME ****
01:19:58 ppp: [wan_link0] PPPoE: Connecting to ''
01:19:58 ppp: [wan_link0] LCP: LayerStart
01:19:58 ppp: [wan_link0] LCP: state change Initial --> Starting
01:19:58 ppp: [wan_link0] LCP: Open event
01:19:58 ppp: [wan_link0] Link: OPEN event
01:19:58 ppp: [wan] Bundle: Interface ng* created
01:19:58 ppp: web: web is not running
01:19:58 ppp: last message repeated 2 times
01:19:57 ppp: process **** terminated
01:19:57 ppp: [wan_link0] Link: Shutdown
01:19:57 ppp: [wan] Bundle: Shutdown
01:19:55 ppp: [wan] IPCP: Close event
01:19:55 ppp: [wan] IFACE: Close event
01:19:55 ppp: caught fatal signal term
01:19:55 ppp: waiting for process **** to die...
01:19:55 ppp: process **** started, version ****
01:19:55 ppp: 
01:19:55 ppp: Multi-link PPP daemon for FreeBSD
01:19:54 ppp: [wan_link0] Link: reconnection attempt 698 in 4 seconds
01:19:54 ppp: [wan_link0] LCP: Down event
01:19:54 ppp: [wan_link0] Link: DOWN event
01:19:54 ppp: [wan_link0] PPPoE node for link is not initialized
01:19:54 ppp: [wan_link0] PPPoE: Skipping link "wan_link0" with undefined interface

System log showed once the interface was created, attempts to reload the rules didn't give rise to a "syntax error":

01:20:05 apinger: Starting Alarm Pinger, apinger(***)
01:20:05 check_reload_status: Reloading filter
01:20:04 php: : Removing static route for monitor *** and adding a new route through ***
01:20:04 php: : ROUTING: setting default route to ***
01:18:09 check_reload_status: Syncing firewall
00:52:34 php: : The command '/sbin/pfctl -o basic -f /tmp/rules.debug' returned exit code '1', the output was 
  '/tmp/rules.debug:127: syntax error /tmp/rules.debug:128: syntax error pfctl: Syntax error in config file: 
  pf rules not loaded'

Suspected underlying issues:

  1. The interface has a bug that in some cases changing MTU can cause interface <ports/> data to be silently deleted (almost certainly the same issue as #2067);
  2. Connection logic doesn't trap or test for error conditions:
    • when the attempt to create/define an interface fails;
    • when the kind of error returned by a PPP attempt suggests a system or config abnormality rather than usual connection issues (ie "try again 500 times" probably isn't the best response);
    • when prerequisite setup/definitions for the filters don't exist in the system at the point pfctl is called to load rules which refer to them.

I'm basing this on the system log line "no variable defined" in relation to the pppoe interface it was trying to bring up, combined with a config diff that shows key data needed to bring up the interface got silently deleted.

Hopefully just needs verifying and fixing now

Actions

Also available in: Atom PDF