Bug #742

apinger doesn't recover opt wan when connection returns.

Added by Perry Mason almost 9 years ago. Updated about 5 years ago.

Status: Resolved
Priority: Normal
Assignee: -
Category: Multi-WAN
Target version:
Start date: 07/17/2010
Due date:
% Done: 0%
Estimated time:
Affected Version: 2.0
Affected Architecture:

Description

I've seen this again (previously a sub-note in ticket 536) in a multi-WAN setup with failover groups.

If WAN fails, apinger sets it offline, then online again when it recovers.
If WAN2 fails, apinger sets it offline and keeps it offline when it should recover.
I then kill apinger and start it again, and WAN2 goes online.

apinger-log.txt (3.48 KB) Simon Ihmig, 10/05/2012 07:57 AM

History

#1 Updated by Chris Buechler almost 9 years ago

It's not that easy to replicate, none of mine do that. What kind of WANs?

#2 Updated by Perry Mason almost 9 years ago

A NATed IP (Linksys DMZ).

As I also have problems provoking it to fail, I've replaced both the DSL box and the pfSense box.

#3 Updated by Ermal Luçi almost 9 years ago

  • Status changed from New to Feedback

When this happens, can you check whether the apinger process is running at all?

Can you please try the latest snapshot? There is a workaround for this in it.

#4 Updated by Chris Buechler almost 9 years ago

  • Status changed from Feedback to Resolved

#5 Updated by Simon Ihmig over 6 years ago

This one should be reopened. Apparently it was closed only because of lack of feedback...

I experience the same issue. WAN is recovering well from a ping failure, but WAN2 (or OPT by default) does not come back. See the log file attached (searched for apinger in the system log). I always have to save the gateway configuration in the web frontend to bring the interface back up.

The apinger process is still running, to answer the last question!

Both interfaces are configured with a static IP, and are connected to a local router (ISP).

Help appreciated!

#6 Updated by Jim Pingle over 6 years ago

It's unlikely that it's the same issue still. Try a 2.1 snapshot, there has been a ton of work done in this area since 2.0.x.
Many of us run multi-wan and they both recover fine. If you still have issues, post on the forum, and if it's determined there still is an issue, a new ticket can be opened.

#7 Updated by Vlad Fedorkov over 6 years ago

Same here on 2.1-BETA0 (i386) built on Wed Oct 24 14:05:19 EDT 2012, FreeBSD 8.3-RELEASE-p4

==== LOG ====
Oct 25 10:00:45 apinger: ALARM: GW_WAN(xxx.xxx.xxx.xxx) * down *
Oct 25 10:01:08 apinger: alarm canceled: GW_WAN(xxx.xxx.xxx.xxx) * down *
Oct 25 10:01:35 apinger: Error while feeding rrdtool: Broken pipe
Oct 25 10:02:35 apinger: rrdtool respawning too fast, waiting 300s.
==== /LOG ====

WAN: em0: <Intel(R) PRO/1000 Network Connection 7.3.2>
LAN: em1: <Intel(R) PRO/1000 Legacy Network Connection 1.0.4>

#8 Updated by Mathieu Déom over 5 years ago

Same issue here with 2.1-RELEASE: with my two PPPoE WANs, when they go offline and then come back, I have to restart apinger to make them go online again.

#9 Updated by Ermal Luçi over 5 years ago

Please provide your system log from when the issue happens!
The only scenario that I can think of is when both PPPoE links have the same address.

#10 Updated by Lionel Lejeune over 5 years ago

Same issue for us on pfSense 2.1-RELEASE:
If opt3 goes down, it doesn't recover, and the "last check" time on the /status_gateways.php page no longer changes for this gateway.
I have to restart the apinger service to get the gateway up again.

Nothing interesting in the logs:
Oct 16 09:04:40 apinger: Starting Alarm Pinger, apinger(33389)
Oct 16 09:04:39 apinger: Exiting on signal 15.
Oct 16 08:57:11 apinger: SIGHUP received, reloading configuration.
Oct 16 08:39:34 apinger: ALARM: OVHADSL * down *
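
The pattern in this log (an ALARM for a gateway with no later "alarm canceled" line) can be picked out of a system log mechanically. The following is only a sketch under assumptions: it relies on the two message shapes quoted in this ticket, the function name `gateways_still_down` is hypothetical, and it expects lines oldest first, so a newest-first excerpt like the one above has to be reversed before it is passed in.

```python
import re

# Matches the two apinger message shapes quoted in this ticket, e.g.
#   "apinger: ALARM: OVHADSL * down *"
#   "apinger: alarm canceled: GW_WAN(xxx.xxx.xxx.xxx)"
# The gateway name is assumed to consist of letters, digits and underscores.
_EVENT = re.compile(r"apinger: (ALARM|alarm canceled): ([A-Za-z0-9_]+)")


def gateways_still_down(lines):
    """Return the names of gateways whose most recent event is an ALARM.

    `lines` must be in chronological order (oldest first).
    """
    is_down = {}
    for line in lines:
        match = _EVENT.search(line)
        if match:
            event, name = match.groups()
            # A later "alarm canceled" overrides an earlier ALARM.
            is_down[name] = (event == "ALARM")
    return sorted(name for name, down in is_down.items() if down)


if __name__ == "__main__":
    excerpt = [
        "Oct 16 09:04:40 apinger: Starting Alarm Pinger, apinger(33389)",
        "Oct 16 09:04:39 apinger: Exiting on signal 15.",
        "Oct 16 08:57:11 apinger: SIGHUP received, reloading configuration.",
        "Oct 16 08:39:34 apinger: ALARM: OVHADSL * down *",
    ]
    # The excerpt is newest first, so reverse it before scanning.
    print(gateways_still_down(list(reversed(excerpt))))
```

A gateway that appears in the result long after its link has returned matches the behaviour reported here; the sketch only reports it and does not restart anything.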

#11 Updated by Bipin Chandra over 5 years ago

I have a similar issue, but in some ways it's different. My ISP sometimes blocks all traffic to the internet and only gives access to its own server, which only serves a page that says to pay the bill. In this case the WAN is connected and the gateway is also online, but pings to any other IP fail, and this triggers apinger to mark the WAN as down. It never attempts to disconnect and reconnect the PPPoE connection, which would actually recover the WAN, so I have to do it manually every time this happens.

#12 Updated by Daniel Bernhardt over 5 years ago

I can confirm this problem. This bug should be reopened.

System:
2.1-RELEASE (i386)
built on Wed Sep 11 18:16:44 EDT 2013
FreeBSD 8.3-RELEASE-p11

Two WAN interfaces (both using static IPs) are in a gateway group and monitored with apinger. If the secondary WAN interface (the one not connected to the WAN port) fails, apinger correctly marks the interface as down. When the secondary WAN interface is up again, apinger does not bring the gateway back up. It doesn't even check the gateway anymore: no pings are sent. On the page "Status -> Gateways", the "last check:" value always corresponds to the time the WAN route failed.

#13 Updated by Sharif Al Motawally about 5 years ago

Chris Buechler wrote:

It's not that easy to replicate, none of mine do that. What kind of WANs?

On my system there are 3 WANs, all with static IPs. This issue only happens when a WAN is down for a long period; if it is down for just a few minutes, it restores fine. It seems there is a timeout after which apinger gives up and no longer checks the downed gateway. This shows on the gateway status page (last checked). If I restart apinger, all is well again.
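
The symptom reported throughout this ticket, a "last check" time that stops advancing for one gateway, lends itself to a simple staleness test. The sketch below is hypothetical: `is_monitor_stalled` and the 5-minute default threshold are assumptions made for illustration, not part of apinger or pfSense, and reading the actual "last check" value or restarting apinger is deliberately left out.

```python
from datetime import datetime, timedelta


def is_monitor_stalled(last_check, now, max_age=timedelta(minutes=5)):
    """Return True if the gateway has not been checked within `max_age`.

    `last_check` and `now` are datetimes; `max_age` is an assumed
    threshold and should be well above the normal probe interval.
    """
    return (now - last_check) > max_age


if __name__ == "__main__":
    # Timestamps are illustrative only.
    last_check = datetime(2013, 10, 16, 8, 39, 34)
    now = datetime(2013, 10, 16, 9, 4, 39)
    print(is_monitor_stalled(last_check, now))
```

A check like this only flags the condition; the workaround described by the reporters (restarting apinger) would still be a separate, manual or scripted step.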
