Bug #3119
apinger falls into a loop with assigned OpenVPN interface, restarting itself and triggering events indefinitely
100%
Description
When a WAN is dual-stack and the IPv6 gateway goes down, both IPv4 and IPv6 services on the WAN are restarted.
This unnecessarily disrupts IPv4 services that may not have failed.
It can also lead to a loop with an assigned OpenVPN interface:
1. apinger sees IPv6 gateway down, restarts services on wan
2. check_reload_status restarts OpenVPN
3. OpenVPN restarts, which causes an interface event for the assigned interface (e.g. ovpns1)
4. rc.newwanip restarts some things, including apinger
5. apinger sees IPv6 gateway down, restarts services on wan
6. check_reload_status restarts OpenVPN
7. [loop, loop, loop...]
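To make the feedback path in the sequence above easier to see, here is a toy model of the event chain. The event names and the handler table are hypothetical labels for the steps listed above, not actual pfSense code; the point is only that the last step feeds back into the first.

```python
from collections import deque

# Hypothetical event names; each entry models one step from the list above.
HANDLERS = {
    "apinger_start": ["gateway_down_alarm"],         # monitor starts, sees the dead gateway
    "gateway_down_alarm": ["restart_wan_services"],  # alarm restarts services on wan
    "restart_wan_services": ["openvpn_restart"],     # check_reload_status restarts OpenVPN
    "openvpn_restart": ["interface_event_ovpns1"],   # assigned interface gets an event
    "interface_event_ovpns1": ["apinger_start"],     # rc.newwanip restarts apinger
}


def simulate(first_event, max_steps=20):
    """Follow the event chain and return the trace, stopping at max_steps."""
    trace = []
    queue = deque([first_event])
    while queue and len(trace) < max_steps:
        event = queue.popleft()
        trace.append(event)
        queue.extend(HANDLERS.get(event, []))
    return trace


def has_cycle(trace):
    """A repeated event in the trace means the chain feeds back into itself."""
    return len(set(trace)) < len(trace)


trace = simulate("apinger_start", max_steps=12)
print(" -> ".join(trace))
print("loops:", has_cycle(trace))
```

In this model nothing ever changes the gateway state, so the chain only stops because of the step limit.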
Updated by Jim Pingle over 11 years ago
- Subject changed from apinger restarts IPv4 services when an IPv6 gateway goes down (and vice versa) to apinger restarts all services on an interface when a gateway goes down, not just services using the gateway in question
Changed the subject as it does not apply strictly to IPv6 vs IPv4.
Easy way to reproduce it:
1. Create an OpenVPN server or client
2. Assign the interface, enable, etc
3. Add a second WAN gateway using a bogus IP that will not respond to ping
4. Watch it fall into a loop and explode.
rc.newwanip includes protections to prevent an OpenVPN interface from triggering an OpenVPN reload, but rc.openvpn does not contain any protections. That seems the best place to look for a band-aid solution.
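As a rough illustration of the kind of protection meant here, the sketch below skips an OpenVPN reload when the triggering device is itself an OpenVPN device. It is written in Python for brevity and is not the code in rc.newwanip; the function names are hypothetical, and the prefix check assumes OpenVPN devices are named like the ovpns1 example (with ovpnc assumed for clients). Whether such a check helps depends on which interface name the script actually receives.

```python
# Assumption: OpenVPN devices are named ovpns<N> (server) or ovpnc<N> (client),
# following the ovpns1 example in the description.
OPENVPN_PREFIXES = ("ovpns", "ovpnc")


def is_openvpn_device(device):
    """Return True if the device name looks like an OpenVPN interface."""
    return device.startswith(OPENVPN_PREFIXES)


def should_reload_openvpn(trigger_device):
    """Skip the reload when the event came from an OpenVPN device itself."""
    return not is_openvpn_device(trigger_device)


# Hypothetical device names for illustration.
for device in ("em0", "ovpns1", "ovpnc2"):
    print(device, should_reload_openvpn(device))
```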
Longer term, we should at least explore the idea of apinger understanding IPv6 vs IPv4 service loss when restarting things.
Updated by Jim Pingle over 11 years ago
Though rc.openvpn is being called with the physical interface (wan, etc.), not the OpenVPN interface, so perhaps that's not simple either.
The real problem seems to be rc.newwanip on the OpenVPN interface triggering apinger to be killed and restarted.
Updated by Jim Pingle over 11 years ago
Updating the above again: the OpenVPN part is irrelevant, too. Further testing shows that just a second dead gateway will cause the looping:
apinger sees the gateway down, triggers the interface reload, which triggers apinger to restart, which sees the gateway still down, which triggers the interface reload, repeat forever until monitoring is disabled or the gateway responds.
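One general way to break a cycle like this is to fire the reload only when the gateway state changes, and to keep the last known state somewhere that survives a monitor restart. The sketch below illustrates that idea only; it is not how apinger works and not the fix applied to this ticket. The class, the gateway name WAN2GW, and the dict standing in for persisted state are all hypothetical.

```python
class GatewayMonitor:
    """Toy monitor that fires a reload only on an up/down transition."""

    def __init__(self, saved_state):
        # saved_state is a dict that outlives the monitor object, standing in
        # for state persisted across restarts (a hypothetical design choice).
        self.saved_state = saved_state

    def observe(self, gateway, is_up):
        """Return True if this observation should trigger an interface reload."""
        previous = self.saved_state.get(gateway)
        self.saved_state[gateway] = is_up
        if previous is None:
            # First sighting: only a down gateway counts as an event.
            return not is_up
        return previous != is_up


state = {}
reloads = 0
for _ in range(5):
    monitor = GatewayMonitor(state)  # each reload restarts the monitor
    if monitor.observe("WAN2GW", is_up=False):
        reloads += 1
print(reloads)
```

With the shared state, the five restarts produce a single reload. If each restart began with an empty state instead, every restart would see the down gateway as new and fire again, which is the loop described above.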
Updated by Jim Pingle over 11 years ago
Spoke too quickly on that last one. The OpenVPN interface does play into it, but it wasn't completely gone when I saw the last looping happen.
So you do need to assign and enable an OpenVPN interface, then have a down gateway on the same interface as the VPN.
Updated by Ermal Luçi over 11 years ago
- Status changed from New to Feedback
- % Done changed from 0 to 100
Applied in changeset pfsense-tools:commit:276eaf1009790760b5fe788156d8ba420f100046.
Updated by Jim Pingle over 11 years ago
- Subject changed from apinger restarts all services on an interface when a gateway goes down, not just services using the gateway in question to apinger falls into a loop with assigned OpenVPN interface, restarting itself and triggering events indefinitely
Updating the subject again to more accurately describe the core issue here. The other part is a different bug than we were chasing here, but it's not quite as critical and can wait for 2.2. I'll start a new ticket for that.
After Ermal's latest patches apinger is behaving itself aside from a minor/cosmetic quirk: It's printing negative RTTs in certain cases.
Updated by Ermal Luçi over 11 years ago
- Status changed from Feedback to Resolved