Project

General

Profile

Actions

Bug #5039

closed

Multiple PPPoE WANs no longer connect simultaneously after upgrade to 2.2.4

Added by Michael Knowles over 8 years ago. Updated over 8 years ago.

Status:
Not a Bug
Priority:
High
Assignee:
-
Category:
Multi-WAN
Target version:
-
Start date:
08/26/2015
Due date:
% Done:

0%

Estimated time:
Plus Target Version:
Release Notes:
Affected Version:
Affected Architecture:

Description

Hi,

I previously logged bug #4992 but then retracted it as I worked further on the problem. It now turns out that I have a problem with having more than 1 WAN connection up following the upgrade to 2.2.4

As per this Reddit post: https://www.reddit.com/r/PFSENSE/comments/3idccv/gateway_down_after_213_224_upgrade/ it appears that I cannot get more than one PPPoE-based WAN connection up at any one time. I can have one or the other, but not both, and it appears to be fairly random as to which connection I get, despite both connections being perfectly fine prior to the upgrade and also perfectly fine when tested against a BT Business Hub router independently.

Pic attached showing the dashboard following the last reboot, with PPPoE-negotiated WAN IPs and the apinger showing WAN2 as being down.

Current logs are:

General log entries
Aug 26 17:07:33 php-fpm25977: /rc.newwanip: rc.newwanip: on (IP address: 192.168.191.2) (interface: []) (real interface: ovpnc2).
Aug 26 17:07:33 check_reload_status: Reloading filter
Aug 26 17:07:33 php-fpm25977: /rc.newwanip: pfSense package system has detected an IP change or dynamic WAN reconnection - -> 192.168.191.2 - Restarting packages.
Aug 26 17:07:33 php-fpm25977: /rc.newwanip: rc.newwanip: Info: starting on ovpnc4.
Aug 26 17:07:33 php-fpm25977: /rc.newwanip: rc.newwanip: on (IP address: 192.168.189.2) (interface: []) (real interface: ovpnc4).
Aug 26 17:07:33 php-fpm25977: /rc.newwanip: pfSense package system has detected an IP change or dynamic WAN reconnection - -> 192.168.189.2 - Restarting packages.
Aug 26 17:07:33 php-fpm25977: /rc.start_packages: Restarting/Starting all packages.
Aug 26 17:07:33 php-fpm25977: /rc.start_packages: Restarting/Starting all packages.
Aug 26 17:07:34 php-fpm95330: /rc.newwanip: pfSense package system has detected an IP change or dynamic WAN reconnection - 86.149.121.170 -> 86.147.247.114 - Restarting packages.
Aug 26 17:07:34 snmpd65231: disk_OS_get_disks: adding device 'da0' to device list
Aug 26 17:07:34 check_reload_status: Starting packages
Aug 26 17:07:34 snmpd65231: disk_OS_get_disks: adding device 'cd0' to device list
Aug 26 17:07:35 php-fpm95330: /rc.start_packages: Restarting/Starting all packages.
Aug 26 17:07:36 check_reload_status: Restarting ipsec tunnels
Aug 26 17:07:38 check_reload_status: updating dyndns opt1
Aug 26 17:07:40 snmpd76525: disk_OS_get_disks: adding device 'da0' to device list
Aug 26 17:07:40 snmpd76525: disk_OS_get_disks: adding device 'cd0' to device list
Aug 26 17:07:40 check_reload_status: Reloading filter
Aug 26 17:07:40 php-fpm92826: /interfaces.php: Creating rrd update script
Aug 26 17:07:41 php-fpm25977: /rc.filter_configure_sync: MONITOR: WAN2_PPPOE is down, omitting from routing group WAN1First
Aug 26 17:07:41 php-fpm25977: /rc.filter_configure_sync: MONITOR: WAN2_PPPOE is down, omitting from routing group WAN2First
Aug 26 17:07:41 php-fpm25977: /rc.filter_configure_sync: MONITOR: WAN2_PPPOE is down, omitting from routing group WANBalanced
Aug 26 17:07:50 check_reload_status: updating dyndns WAN2_PPPOE
Aug 26 17:07:50 check_reload_status: Restarting ipsec tunnels
Aug 26 17:07:50 check_reload_status: Restarting OpenVPN tunnels/interfaces
Aug 26 17:07:50 check_reload_status: Reloading filter
Aug 26 17:07:51 php-fpm24471: /rc.dyndns.update: MONITOR: WAN2_PPPOE is down, omitting from routing group WAN1First
Aug 26 17:07:51 php-fpm24471: /rc.dyndns.update: MONITOR: WAN2_PPPOE is down, omitting from routing group WAN2First
Aug 26 17:07:51 php-fpm24471: /rc.dyndns.update: MONITOR: WAN2_PPPOE is down, omitting from routing group WANBalanced
Aug 26 17:07:51 php-fpm24471: /rc.openvpn: OpenVPN: One or more OpenVPN tunnel endpoints may have changed its IP. Reloading endpoints that may use WAN2_PPPOE.
Aug 26 17:07:51 php-fpm24471: /rc.filter_configure_sync: MONITOR: WAN2_PPPOE is down, omitting from routing group WAN1First
Aug 26 17:07:51 php-fpm24471: /rc.filter_configure_sync: MONITOR: WAN2_PPPOE is down, omitting from routing group WAN2First
Aug 26 17:07:51 php-fpm24471: /rc.filter_configure_sync: MONITOR: WAN2_PPPOE is down, omitting from routing group WANBalanced
Aug 26 17:09:36 check_reload_status: Syncing firewall
Aug 26 17:09:38 php-fpm97990: /system_gateways.php: ROUTING: setting default route to 81.134.160.1
Aug 26 17:09:38 check_reload_status: Reloading filter
Aug 26 17:11:24 check_reload_status: Syncing firewall
Aug 26 17:11:26 php-fpm77949: /system_gateways.php: ROUTING: setting default route to 81.134.160.1
Aug 26 17:11:26 check_reload_status: Reloading filter
Aug 26 17:11:46 check_reload_status: updating dyndns WAN2_PPPOE
Aug 26 17:11:46 check_reload_status: Restarting ipsec tunnels
Aug 26 17:11:46 check_reload_status: Restarting OpenVPN tunnels/interfaces
Aug 26 17:11:46 check_reload_status: Reloading filter
Aug 26 17:11:47 php-fpm17481: /rc.dyndns.update: MONITOR: WAN2_PPPOE is down, omitting from routing group WAN1First
Aug 26 17:11:47 php-fpm17481: /rc.dyndns.update: MONITOR: WAN2_PPPOE is down, omitting from routing group WAN2First
Aug 26 17:11:47 php-fpm17481: /rc.dyndns.update: MONITOR: WAN2_PPPOE is down, omitting from routing group WANBalanced
Aug 26 17:11:47 php-fpm17481: /rc.openvpn: OpenVPN: One or more OpenVPN tunnel endpoints may have changed its IP. Reloading endpoints that may use WAN2_PPPOE.
Aug 26 17:11:47 php-fpm17481: /rc.filter_configure_sync: MONITOR: WAN2_PPPOE is down, omitting from routing group WAN1First
Aug 26 17:11:47 php-fpm17481: /rc.filter_configure_sync: MONITOR: WAN2_PPPOE is down, omitting from routing group WAN2First
Aug 26 17:11:47 php-fpm17481: /rc.filter_configure_sync: MONITOR: WAN2_PPPOE is down, omitting from routing group WANBalanced

Gateway log entries
Aug 25 12:42:42 apinger: SIGHUP received, reloading configuration.
Aug 25 12:50:43 apinger: SIGHUP received, reloading configuration.
Aug 25 12:50:54 apinger: ALARM: WAN_PPPOE(172.16.12.35) * down
Aug 25 18:39:59 apinger: ALARM: WAN2_PPPOE(81.148.96.1)
down
Aug 25 18:39:59 apinger: alarm canceled: WAN2_PPPOE(81.148.96.1)
down
Aug 26 01:30:37 apinger: ALARM: WAN2_PPPOE(81.148.96.1)
down
Aug 26 01:31:10 apinger: alarm canceled: WAN2_PPPOE(81.148.96.1)
down
Aug 26 01:31:20 apinger: ALARM: WAN2_PPPOE(81.148.96.1)
delay
Aug 26 01:31:22 apinger: alarm canceled: WAN2_PPPOE(81.148.96.1)
delay
Aug 26 12:46:51 apinger: ALARM: WAN2_PPPOE(81.148.96.1)
down
Aug 26 12:47:37 apinger: alarm canceled: WAN2_PPPOE(81.148.96.1)
down
Aug 26 12:47:46 apinger: ALARM: WAN2_PPPOE(81.148.96.1)
delay
Aug 26 12:47:48 apinger: alarm canceled: WAN2_PPPOE(81.148.96.1)
delay
Aug 26 12:54:12 apinger: SIGHUP received, reloading configuration.
Aug 26 12:54:12 apinger: alarm canceled (config reload): WAN_PPPOE(172.16.12.35)
down
Aug 26 12:54:23 apinger: ALARM: WAN_PPPOE(172.16.12.35)
down
Aug 26 12:54:23 apinger: SIGHUP received, reloading configuration.
Aug 26 14:19:22 apinger: SIGHUP received, reloading configuration.
Aug 26 14:22:51 apinger: SIGHUP received, reloading configuration.
Aug 26 15:31:01 apinger: ALARM: WAN2_PPPOE(81.148.96.1)
down
Aug 26 15:32:04 apinger: SIGHUP received, reloading configuration.
Aug 26 15:32:04 apinger: alarm canceled (config reload): WAN_PPPOE(172.16.12.35)
down
Aug 26 15:32:04 apinger: alarm canceled (config reload): WAN2_PPPOE(81.148.96.1)
down
Aug 26 15:32:14 apinger: ALARM: WAN2_PPPOE(172.16.12.35)
down
Aug 26 15:37:06 apinger: Starting Alarm Pinger, apinger(25837)
Aug 26 15:39:23 apinger: ALARM: WAN_PPPOE(81.134.160.1)
down
Aug 26 15:41:01 apinger: SIGHUP received, reloading configuration.
Aug 26 15:41:01 apinger: alarm canceled (config reload): WAN_PPPOE(81.134.160.1)
down
Aug 26 15:41:01 apinger: alarm canceled (config reload): WAN2_PPPOE(172.16.12.35)
down
Aug 26 15:41:11 apinger: ALARM: WAN2_PPPOE(172.16.12.35)
down
Aug 26 15:42:22 apinger: SIGHUP received, reloading configuration.
Aug 26 15:43:13 apinger: ALARM: WAN_PPPOE(81.148.96.1)
down
Aug 26 15:44:35 apinger: SIGHUP received, reloading configuration.
Aug 26 15:44:35 apinger: alarm canceled (config reload): WAN_PPPOE(81.148.96.1)
down
Aug 26 15:44:35 apinger: alarm canceled (config reload): WAN2_PPPOE(172.16.12.35)
down
Aug 26 15:44:46 apinger: ALARM: WAN2_PPPOE(172.16.12.35)
down
Aug 26 15:45:54 apinger: SIGHUP received, reloading configuration.
Aug 26 16:33:37 apinger: ALARM: WAN_PPPOE(81.148.96.1)
down
Aug 26 16:36:27 apinger: alarm canceled: WAN_PPPOE(81.148.96.1)
down
Aug 26 16:36:29 apinger: SIGHUP received, reloading configuration.
Aug 26 17:02:12 apinger: Starting Alarm Pinger, apinger(25593)
Aug 26 17:06:42 apinger: ALARM: WAN2_PPPOE(172.16.12.35)
down
Aug 26 17:07:29 apinger: SIGHUP received, reloading configuration.
Aug 26 17:07:29 apinger: alarm canceled (config reload): WAN2_PPPOE(172.16.12.35)
down
Aug 26 17:07:40 apinger: ALARM: WAN2_PPPOE(172.16.12.35)
down
Aug 26 17:07:40 apinger: SIGHUP received, reloading configuration.
Aug 26 17:09:38 apinger: SIGHUP received, reloading configuration.
Aug 26 17:09:38 apinger: alarm canceled (config reload): WAN2_PPPOE(172.16.12.35)
down
Aug 26 17:11:26 apinger: SIGHUP received, reloading configuration.
Aug 26 17:11:36 apinger: ALARM: WAN2_PPPOE(172.16.12.35)
down *

Routing daemon log entries
Aug 25 12:42:05 radvd36880: attempting to reread config file
Aug 25 12:42:05 radvd36880: no auto-selected prefix on interface em1, disabling advertisements
Aug 25 12:42:05 radvd36880: resuming normal operation
Aug 25 12:42:34 radvd36880: attempting to reread config file
Aug 25 12:42:34 radvd36880: no auto-selected prefix on interface em1, disabling advertisements
Aug 25 12:42:34 radvd36880: resuming normal operation
Aug 25 12:42:40 radvd36880: attempting to reread config file
Aug 25 12:42:40 radvd36880: no auto-selected prefix on interface em1, disabling advertisements
Aug 25 12:42:40 radvd36880: resuming normal operation
Aug 25 12:50:44 radvd36880: attempting to reread config file
Aug 25 12:50:44 radvd36880: no auto-selected prefix on interface em1, disabling advertisements
Aug 25 12:50:44 radvd36880: resuming normal operation
Aug 26 12:54:14 radvd36880: attempting to reread config file
Aug 26 12:54:14 radvd36880: no auto-selected prefix on interface em1, disabling advertisements
Aug 26 12:54:14 radvd36880: resuming normal operation
Aug 26 12:54:21 radvd36880: attempting to reread config file
Aug 26 12:54:21 radvd36880: no auto-selected prefix on interface em1, disabling advertisements
Aug 26 12:54:21 radvd36880: resuming normal operation
Aug 26 14:19:20 radvd36880: attempting to reread config file
Aug 26 14:19:20 radvd36880: no auto-selected prefix on interface em1, disabling advertisements
Aug 26 14:19:20 radvd36880: resuming normal operation
Aug 26 15:32:05 radvd36880: attempting to reread config file
Aug 26 15:32:05 radvd36880: no auto-selected prefix on interface em1, disabling advertisements
Aug 26 15:32:05 radvd36880: resuming normal operation
Aug 26 15:37:09 radvd39098: version 1.9.1 started
Aug 26 15:37:09 radvd39098: no auto-selected prefix on interface em1, disabling advertisements
Aug 26 15:41:02 radvd39582: no auto-selected prefix on interface em1, disabling advertisements
Aug 26 15:41:02 radvd39582: resuming normal operation
Aug 26 15:42:23 radvd39582: attempting to reread config file
Aug 26 15:42:23 radvd39582: no auto-selected prefix on interface em1, disabling advertisements
Aug 26 15:42:23 radvd39582: resuming normal operation
Aug 26 15:44:36 radvd39582: attempting to reread config file
Aug 26 15:44:36 radvd39582: no auto-selected prefix on interface em1, disabling advertisements
Aug 26 15:44:36 radvd39582: resuming normal operation
Aug 26 15:45:56 radvd39582: attempting to reread config file
Aug 26 15:45:56 radvd39582: no auto-selected prefix on interface em1, disabling advertisements
Aug 26 15:45:56 radvd39582: resuming normal operation
Aug 26 16:36:30 radvd39582: attempting to reread config file
Aug 26 16:36:30 radvd39582: no auto-selected prefix on interface em1, disabling advertisements
Aug 26 16:36:30 radvd39582: resuming normal operation
Aug 26 17:02:15 radvd38324: version 1.9.1 started
Aug 26 17:02:15 radvd38324: no auto-selected prefix on interface em1, disabling advertisements
Aug 26 17:06:32 radvd38436: no auto-selected prefix on interface em1, disabling advertisements
Aug 26 17:06:32 radvd38436: resuming normal operation
Aug 26 17:07:30 radvd38436: attempting to reread config file
Aug 26 17:07:30 radvd38436: no auto-selected prefix on interface em1, disabling advertisements
Aug 26 17:07:30 radvd38436: resuming normal operation
Aug 26 17:07:38 radvd38436: attempting to reread config file
Aug 26 17:07:38 radvd38436: no auto-selected prefix on interface em1, disabling advertisements
Aug 26 17:07:38 radvd38436: resuming normal operation

Resolver log entries
Aug 26 16:36:29 dnsmasq36781: using nameserver 8.8.4.4#53
Aug 26 16:37:46 dnsmasq36781: reading /etc/resolv.conf
Aug 26 16:37:46 dnsmasq36781: ignoring nameserver 127.0.0.1 - local interface
Aug 26 16:37:46 dnsmasq36781: using nameserver 194.72.0.98#53
Aug 26 16:37:46 dnsmasq36781: using nameserver 194.74.65.68#53
Aug 26 16:37:46 dnsmasq36781: using nameserver 8.8.8.8#53
Aug 26 16:37:46 dnsmasq36781: using nameserver 8.8.4.4#53
Aug 26 17:02:15 dnsmasq33978: started, version 2.73 cachesize 10000
Aug 26 17:02:15 dnsmasq33978: compile time options: IPv6 GNU-getopt no-DBus i18n IDN DHCP DHCPv6 no-Lua TFTP no-conntrack ipset auth DNSSEC loop-detect no-inotify
Aug 26 17:02:15 dnsmasq33978: reading /etc/resolv.conf
Aug 26 17:02:15 dnsmasq33978: ignoring nameserver 127.0.0.1 - local interface
Aug 26 17:02:15 dnsmasq33978: using nameserver 194.74.65.69#53
Aug 26 17:02:15 dnsmasq33978: using nameserver 194.72.0.114#53
Aug 26 17:02:15 dnsmasq33978: using nameserver 8.8.8.8#53
Aug 26 17:02:15 dnsmasq33978: using nameserver 8.8.4.4#53
Aug 26 17:02:15 dnsmasq33978: read /etc/hosts - 3 addresses
Aug 26 17:06:32 dnsmasq33978: reading /etc/resolv.conf
Aug 26 17:06:32 dnsmasq33978: ignoring nameserver 127.0.0.1 - local interface
Aug 26 17:06:32 dnsmasq33978: using nameserver 194.74.65.69#53
Aug 26 17:06:32 dnsmasq33978: using nameserver 194.72.0.114#53
Aug 26 17:06:32 dnsmasq33978: using nameserver 213.120.234.34#53
Aug 26 17:06:32 dnsmasq33978: using nameserver 194.72.9.34#53
Aug 26 17:06:32 dnsmasq33978: using nameserver 8.8.8.8#53
Aug 26 17:06:32 dnsmasq33978: using nameserver 8.8.4.4#53
Aug 26 17:07:26 dnsmasq33978: reading /etc/resolv.conf
Aug 26 17:07:26 dnsmasq33978: ignoring nameserver 127.0.0.1 - local interface
Aug 26 17:07:26 dnsmasq33978: using nameserver 194.74.65.69#53
Aug 26 17:07:26 dnsmasq33978: using nameserver 194.72.0.114#53
Aug 26 17:07:26 dnsmasq33978: using nameserver 8.8.8.8#53
Aug 26 17:07:26 dnsmasq33978: using nameserver 8.8.4.4#53
Aug 26 17:07:30 dnsmasq33978: reading /etc/resolv.conf
Aug 26 17:07:30 dnsmasq33978: ignoring nameserver 127.0.0.1 - local interface
Aug 26 17:07:30 dnsmasq33978: using nameserver 194.74.65.69#53
Aug 26 17:07:30 dnsmasq33978: using nameserver 194.72.0.114#53
Aug 26 17:07:30 dnsmasq33978: using nameserver 213.120.234.34#53
Aug 26 17:07:30 dnsmasq33978: using nameserver 194.72.9.34#53
Aug 26 17:07:30 dnsmasq33978: using nameserver 8.8.8.8#53
Aug 26 17:07:30 dnsmasq33978: using nameserver 8.8.4.4#53
Aug 26 17:07:36 dnsmasq33978: exiting on receipt of SIGTERM
Aug 26 17:07:37 dnsmasq72298: started, version 2.73 cachesize 10000
Aug 26 17:07:37 dnsmasq72298: compile time options: IPv6 GNU-getopt no-DBus i18n IDN DHCP DHCPv6 no-Lua TFTP no-conntrack ipset auth DNSSEC loop-detect no-inotify
Aug 26 17:07:37 dnsmasq72298: reading /etc/resolv.conf
Aug 26 17:07:37 dnsmasq72298: ignoring nameserver 127.0.0.1 - local interface
Aug 26 17:07:37 dnsmasq72298: using nameserver 194.74.65.69#53
Aug 26 17:07:37 dnsmasq72298: using nameserver 194.72.0.114#53
Aug 26 17:07:37 dnsmasq72298: using nameserver 213.120.234.34#53
Aug 26 17:07:37 dnsmasq72298: using nameserver 194.72.9.34#53
Aug 26 17:07:37 dnsmasq72298: using nameserver 8.8.8.8#53
Aug 26 17:07:37 dnsmasq72298: using nameserver 8.8.4.4#53
Aug 26 17:07:37 dnsmasq72298: read /etc/hosts - 3 addresses


Files

dashboard.JPG (111 KB) dashboard.JPG Michael Knowles, 08/26/2015 12:40 PM
Actions #1

Updated by Geoffrey Smith over 8 years ago

Not sure if this will help... because I use virtualPF instead of pfsense... and we have one PPPoe and one Ethernet internet connection. But I found changing the default gateway and then changing it back to the original connection made it show that was up again. It seems like some of failover doesn't work properly. Also changing the monitor ip from the default gateway to another ip helped as well.

Actions #2

Updated by Michael Knowles over 8 years ago

Sorry, I tried changing the default gateway and it doesn't work. I've also tried changing the monitor IP and that doesn't work either.

Actions #3

Updated by Chris Buechler over 8 years ago

  • Status changed from New to Feedback

what do your PPP logs show?

Actions #4

Updated by Michael Knowles over 8 years ago

Sorry for the delay - modem was borrowed from that line for a customer. Now have it back and...

Last 50 PPP log entries
Sep 9 09:50:29 ppp: [opt1_link0] PPPoE: Connecting to ''
Sep 9 09:50:31 ppp: PPPoE: rec'd ACNAME "acc-aln1.tbs"
Sep 9 09:50:38 ppp: [opt1_link0] PPPoE connection timeout after 9 seconds
Sep 9 09:50:38 ppp: [opt1_link0] Link: DOWN event
Sep 9 09:50:38 ppp: [opt1_link0] LCP: Down event
Sep 9 09:50:38 ppp: [opt1_link0] Link: reconnection attempt -15201 in 3 seconds
Sep 9 09:50:41 ppp: [opt1_link0] Link: reconnection attempt -15201
Sep 9 09:50:41 ppp: [opt1_link0] PPPoE: Connecting to ''
Sep 9 09:50:43 ppp: PPPoE: rec'd ACNAME "acc-aln1.tbs"
Sep 9 09:50:50 ppp: [opt1_link0] PPPoE connection timeout after 9 seconds
Sep 9 09:50:50 ppp: [opt1_link0] Link: DOWN event
Sep 9 09:50:50 ppp: [opt1_link0] LCP: Down event
Sep 9 09:50:50 ppp: [opt1_link0] Link: reconnection attempt -15200 in 4 seconds
Sep 9 09:50:54 ppp: [opt1_link0] Link: reconnection attempt -15200
Sep 9 09:50:54 ppp: [opt1_link0] PPPoE: Connecting to ''
Sep 9 09:50:56 ppp: PPPoE: rec'd ACNAME "acc-aln1.tbs"
Sep 9 09:51:03 ppp: [opt1_link0] PPPoE connection timeout after 9 seconds
Sep 9 09:51:03 ppp: [opt1_link0] Link: DOWN event
Sep 9 09:51:03 ppp: [opt1_link0] LCP: Down event
Sep 9 09:51:03 ppp: [opt1_link0] Link: reconnection attempt -15199 in 4 seconds
Sep 9 09:51:07 ppp: [opt1_link0] Link: reconnection attempt -15199
Sep 9 09:51:07 ppp: [opt1_link0] PPPoE: Connecting to ''
Sep 9 09:51:09 ppp: PPPoE: rec'd ACNAME "acc-aln1.tbs"
Sep 9 09:51:16 ppp: [opt1_link0] PPPoE connection timeout after 9 seconds
Sep 9 09:51:16 ppp: [opt1_link0] Link: DOWN event
Sep 9 09:51:16 ppp: [opt1_link0] LCP: Down event
Sep 9 09:51:16 ppp: [opt1_link0] Link: reconnection attempt -15198 in 3 seconds
Sep 9 09:51:19 ppp: [opt1_link0] Link: reconnection attempt -15198
Sep 9 09:51:19 ppp: [opt1_link0] PPPoE: Connecting to ''
Sep 9 09:51:21 ppp: PPPoE: rec'd ACNAME "acc-aln1.tbs"
Sep 9 09:51:28 ppp: [opt1_link0] PPPoE connection timeout after 9 seconds
Sep 9 09:51:28 ppp: [opt1_link0] Link: DOWN event
Sep 9 09:51:28 ppp: [opt1_link0] LCP: Down event
Sep 9 09:51:28 ppp: [opt1_link0] Link: reconnection attempt -15197 in 2 seconds
Sep 9 09:51:30 ppp: [opt1_link0] Link: reconnection attempt -15197
Sep 9 09:51:30 ppp: [opt1_link0] PPPoE: Connecting to ''
Sep 9 09:51:32 ppp: PPPoE: rec'd ACNAME "acc-aln1.tbs"
Sep 9 09:51:39 ppp: [opt1_link0] PPPoE connection timeout after 9 seconds
Sep 9 09:51:39 ppp: [opt1_link0] Link: DOWN event
Sep 9 09:51:39 ppp: [opt1_link0] LCP: Down event
Sep 9 09:51:39 ppp: [opt1_link0] Link: reconnection attempt -15196 in 1 seconds
Sep 9 09:51:40 ppp: [opt1_link0] Link: reconnection attempt -15196
Sep 9 09:51:40 ppp: [opt1_link0] PPPoE: Connecting to ''
Sep 9 09:51:42 ppp: PPPoE: rec'd ACNAME "acc-aln1.tbs"
Sep 9 09:51:49 ppp: [opt1_link0] PPPoE connection timeout after 9 seconds
Sep 9 09:51:49 ppp: [opt1_link0] Link: DOWN event
Sep 9 09:51:49 ppp: [opt1_link0] LCP: Down event
Sep 9 09:51:49 ppp: [opt1_link0] Link: reconnection attempt -15195 in 2 seconds
Sep 9 09:51:51 ppp: [opt1_link0] Link: reconnection attempt -15195
Sep 9 09:51:51 ppp: [opt1_link0] PPPoE: Connecting to ''

Actions #5

Updated by Michael Knowles over 8 years ago

Any further update regarding this issue?

Actions #6

Updated by Chris Buechler over 8 years ago

  • Subject changed from Upgrade to 2.2.4 has broken second WAN connection to Multiple PPPoE WANs no longer connect simultaneously after upgrade to 2.2.4

Not seeing any issues here, with PPPoE or otherwise with multi-WAN. Brought up multiple test systems with multiple PPPoE connections and they work fine. Connect fine initially, reconnect fine if you drop them from the PPPoE server side, or if you drop their NIC. Many others have upgraded in the same circumstance with no other reported issues. Not seeing anything that changed that could have any impact on that.

Packet capturing on the affected Ethernet interface ("tcpdump -i em0 -s 0 -w pppoe.pcap" where em0 is the WAN NIC in question, make it fail, ctrl-c to stop the tcpdump, download pppoe.pcap) might be telling.

Are these NICs plugged straight into the modems, or is there a switch or something in between?

Actions #7

Updated by Michael Knowles over 8 years ago

The instance is virtualised on VMware ESXi 5.5.0 build 1623387, and there's a vSwitch for each WAN connection and another one for the LAN. An Intel NIC for each WAN connection (Broadcom for the LAN). There's no physical switch aside from that on the WAN side, they're direct into the modems, which I've switched for new ones with the same result and switched around with the same result.

BT (the ISP) say they can see the modems syncing to the line but no attempt to do PPPoE auth.

I have promiscuous mode on each WAN connection vSwitch.

As regards the packet capturing on the affected ethernet interface, when you say "make it fail", would the valid test be just to start the capture and disable/re-enable the interface?

Actions #8

Updated by Michael Knowles over 8 years ago

  • File cuau0.JPG added
  • File aftercuau0.JPG added
  • File pppoe.pcap added

Hi Chris,

I thought I had an epiphany earlier because when I went to look at the interface assignments to check what I was supposed to put in your packet capture command I found the following assigned to WAN2 (note, this would never have been re-assigned by us, and i'm not even sure what cuau0 relates to as an interface?!) - anyway, I reassigned that to em2 which is what it was originally, and it authenticated...but then 30 seconds later the link died again (see the "aftercuau0" file).

Anyway, I set the packet capture going and it's attached. I don't think there's much in it that's sensitive, but if you can remove the attachments once you have them that'd be appreciated.

Cheers,
Mike.

Actions #9

Updated by Chris Buechler over 8 years ago

Thanks, that helps. I removed the attachments and committed them to a private internal repo so a few others can see if needed.

cuau0 is your COM1 serial port. There was an issue pre-2.2.5 with multiple PPPoE connections where saving certain changes on interfaces.php would make it set the wrong port association. Either that's what happened there, or at some point you inadvertently changed the interface under Interfaces>assign, PPPs tab. Might want to go ahead and upgrade to a 2.2.5 snapshot @ snapshots.pfsense.org if you want to make sure you don't hit anything there.

The packet capture seems a successful PPPoE connection, you start sending out traffic with the correct source IP, and nothing comes back in response. Only PPP things are coming in to the NIC, you're missing any IP traffic in return or any inbound IP at all for that matter. Only LCP echo requests come in from your ISP's Alcatel (your gateway) past the configuration ACK at end of the initial connection.

makes me wonder if it's something within VMware, or elsewhere on the network. The logs from earlier I don't think would match the circumstance in that packet capture.

Actions #10

Updated by Chris Buechler over 8 years ago

  • File deleted (cuau0.JPG)
Actions #11

Updated by Chris Buechler over 8 years ago

  • File deleted (aftercuau0.JPG)
Actions #12

Updated by Chris Buechler over 8 years ago

  • File deleted (pppoe.pcap)
Actions #13

Updated by Chris Buechler over 8 years ago

  • Target version deleted (2.2.5)
  • Affected Version deleted (2.2.4)

I think something other than the upgrade to 2.2.4 changed things here, and it doesn't appear to be attributable to a bug at all, but it's not clear what the cause is. The pcap doesn't match the ISP description (it's clearly authenticating fine, and they're saying no auth attempts). That makes me suspect maybe the two modems are on the same broadcast domain and that's going out the wrong modem. Or some other issue at the VMware level possibly. If you downgrade to 2.2.3, does the situation change? You can do so via System>Firmware, manual update, using the 2.2.3 full update file from https://files.pfsense.org/mirror/updates/old/

Actions #14

Updated by Michael Knowles over 8 years ago

Would you believe, about the same time you wrote that the second line suddenly burst into life at about 6am one day. I came into work and was astounded. Nothing had changed on pfsense, nobody was in the office at that time, and no changes to VMware. I think BT (my ISP) were lying to me/us.

You can of course close this now. I'll update you if anything else happens. Sorry about that - they swore blind there was nothing wrong!

Actions #15

Updated by Chris Buechler over 8 years ago

  • Status changed from Feedback to Not a Bug

thanks for the feedback Michael, glad it's resolved

Actions

Also available in: Atom PDF