Bug #3894: OpenVPN client started multiple times when connecting to FQDN where connectivity to server is delayed - pfSense - pfSense bugtracker

Bug #3894

closed

OpenVPN client started multiple times when connecting to FQDN where connectivity to server is delayed

Added by Dmitriy K almost 12 years ago. Updated over 11 years ago.

Status: Resolved
Priority: Normal
Assignee: Chris Buechler
Category: OpenVPN
Target version: 2.2
Start date: 09/26/2014
Due date:
% Done: 100%
Estimated time:
Plus Target Version:
Release Notes:
Affected Version: All
Affected Architecture:

Description

Requirements:
1. The WAN connection type must not be Static or DHCP!

Steps to reproduce:
1. Create an OpenVPN client instance with a domain name as the "Server address" (for example: vpn.contoso.com).
2. Check the "Server host name resolution" option.
3. Save and restart the router.

If the WAN connection takes long enough to come up, the newly created OpenVPN instance becomes "detached" from the system.

Once the WAN interface comes up, the OpenVPN client daemon resolves the domain name and connects to the server. Good! But that OpenVPN instance can no longer be controlled: it cannot be stopped, started, restarted, disabled, or enabled. The OpenVPN interface stays up and working forever.

Files

openvpn.log (500 KB), Dmitriy K, 10/10/2014 05:00 AM
openvpn_client3.pid (6 Bytes), Dmitriy K, 10/10/2014 05:00 AM
system.log (27.1 KB), Dmitriy K, 10/16/2014 01:49 AM

Updated by Ermal Luçi almost 12 years ago

Priority changed from High to Normal

Normally OpenVPN instances are restarted on the interface-up event!

Can you back this claim with concrete information, such as the pid file contents and ps -axwwvv output?

Updated by Dmitriy K almost 12 years ago

Here is a video http://rghost.net/private/58388261/44e5fb12a48d08550c2bb5cd6c676bd3

The bug is 100% reproducible. My guess is that the Bind server is restarted right after OpenVPN finishes restarting, so name resolution is not available at the moment OpenVPN tries to resolve the domain name. Once Bind is up on the interface, OpenVPN resolves the name and connects to the server, but it is detached from the GUI.

Maybe I'm wrong, maybe not ...

Updated by Dmitriy K almost 12 years ago

After some research I found that the system can't connect to the socket of the "detached" OpenVPN instance.

I added some logging to openvpn_get_client_status() in openvpn.inc, and here is the output:
/index.php: openvpn_get_client_status(Array, unix:///var/etc/openvpn/client3.sock) = 61;

The file (unix:///var/etc/openvpn/client3.sock) itself exists but is not accessible.

Updated by Dmitriy K almost 12 years ago

Error code 61 means "Connection refused".
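
The same check can be reproduced outside the GUI. The following is a minimal Python sketch, not the pfSense code: `probe_unix_socket` is a hypothetical helper that tries to connect to a Unix-domain socket and reports the symbolic errno name. It takes a plain filesystem path, so the `unix://` prefix shown in the log output would have to be stripped first. On FreeBSD, which pfSense is based on, ECONNREFUSED has the numeric value 61; other systems use different numbers, which is why the sketch reports the name rather than the number.

```python
import errno
import socket


def probe_unix_socket(path: str) -> str:
    """Try to connect to a Unix-domain socket and describe the outcome."""
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        sock.connect(path)
        return "connected"
    except OSError as exc:
        # errno.errorcode maps the platform's number to its symbolic name.
        name = errno.errorcode.get(exc.errno, "UNKNOWN")
        return f"{name}: {exc.strerror}"
    finally:
        sock.close()


if __name__ == "__main__":
    # Path taken from the log line above; adjust it for the instance in question.
    print(probe_unix_socket("/var/etc/openvpn/client3.sock"))
```

A socket file that exists on disk while no process listens on it typically yields ECONNREFUSED, which would match the situation described here: the file is present, but the process that created it has exited.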

Updated by Dmitriy K almost 12 years ago

File openvpn.log added
File openvpn_client3.pid added

Here are logs from a clean start with only one OpenVPN instance enabled. Evidently the "2nd" instance is the one that ends up detached, because the very first instance launched by the system has exited.

Updated by Ermal Luçi over 11 years ago

From the logs it seems you already have a running instance, hence you cannot start a second one!
Can you post your system logs?

Updated by Dmitriy K over 11 years ago

File system.log added

Yeah, obviously I can't run the same instance twice, but a bug in the logic can. So, here is the system log.

It looks like OpenVPN is being started twice: at bootup and on newwanip. The bug is located, I suppose.

Updated by Dmitriy K over 11 years ago

Look for the "openvpn_restart" event in the system log to speed things up. I forgot to mention it in the post above.

Updated by Dmitriy K over 11 years ago

Also, in rc.newwanipv6 instances are started twice ...

#10

Updated by Ermal Luçi over 11 years ago

I am sorry, but you need to read the source more carefully!

#11

Updated by Chris Buechler over 11 years ago

Subject changed from System looses control over specifically configured ovpn client instance after reboot to OpenVPN client started multiple times when connecting to FQDN where connectivity to server is delayed
Assignee set to Chris Buechler

The specific issue here is OpenVPN client is launched multiple times when connecting to FQDN with "resolv-retry infinite", where there is a delay in the Internet coming up, or network connectivity to the VPN server and/or DNS is unavailable. I have a good test case for this, will look into it further.

#12

Updated by Michael Schefczyk over 11 years ago

On a server with two OpenVPN clients in Peer to Peer (SSL/TLS) mode, I have the same issue, even though "Infinitely resolve server" is NOT checked. The issue occurs after every reboot. It can be cured by determining the OpenVPN clients' PIDs and then killing and restarting the processes. Usually, only one of the two clients is affected. Of course, I would very much welcome it if the server could reboot to full functionality without manual intervention.

The setup is: 2.1.5-RELEASE (amd64), Intel(R) Atom(TM) CPU C2758 @ 2.40GHz 8 CPUs: 1 package(s) x 8 core(s), two WAN gateways, two OpenVPN clients in Peer to Peer (SSL/TLS) mode, Quagga OSPF package, Unbound package.
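
The first step of the manual workaround above, finding out whether the PID recorded for a client still belongs to a running process, can be sketched in Python as follows. This is an illustration under assumptions, not what pfSense does internally: `read_pid` and `is_running` are hypothetical helper names, and the pid file path in the usage lines is only a guess based on the attachment name openvpn_client3.pid.

```python
import os
from pathlib import Path
from typing import Optional


def read_pid(pid_file: str) -> Optional[int]:
    """Return the PID stored in pid_file, or None if it is missing or malformed."""
    try:
        return int(Path(pid_file).read_text().strip())
    except (OSError, ValueError):
        return None


def is_running(pid: int) -> bool:
    """Check whether a process with this PID exists, without disturbing it."""
    if pid <= 0:
        return False  # 0 and negative values address process groups, not one process
    try:
        os.kill(pid, 0)  # signal 0 only performs the existence/permission check
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


if __name__ == "__main__":
    pid = read_pid("/var/run/openvpn_client3.pid")  # assumed location
    print(pid, pid is not None and is_running(pid))
```

If the pid file names a process that is gone while an OpenVPN client for the same instance is still visible in the process list, that would be consistent with the "detached" state described in this ticket.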

#13

Updated by Chris Buechler over 11 years ago

Status changed from New to Confirmed

#14

Updated by Ermal Luçi over 11 years ago

Status changed from Confirmed to Feedback

The issue here is that resolv-retry infinite is on by default.
I pushed a fix to do only 2 retries by default, which should fix the issue at hand.
People who want the previous behaviour can just enable resolv-retry infinite.

#15

Updated by Ermal Luçi over 11 years ago

% Done changed from 0 to 100

Applied in changeset commit:d882658e826ca1c9e41c0832b3d0f433756ed903.

#16

Updated by Chris Buechler over 11 years ago

Status changed from Feedback to Resolved

Ermal's change is good, but doesn't help this circumstance. The root cause here is OpenVPN doesn't exit when sent a SIGTERM in this circumstance, and then we start it again while it's still running. Changed to send a SIGKILL if it doesn't exit after SIGTERM. Confirmed this resolves the circumstance described here.
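
The escalation described above (send SIGTERM, wait, then send SIGKILL if the process is still alive) can be sketched in Python. This is only an illustration of the pattern, not the actual pfSense change, which is in PHP; `stop_process` and the grace period of 5 seconds are assumptions made for the example.

```python
import signal
import subprocess
import sys


def stop_process(proc: subprocess.Popen, grace_seconds: float = 5.0) -> str:
    """Stop proc; return 'already-exited', 'terminated' or 'killed'."""
    if proc.poll() is not None:
        return "already-exited"
    proc.terminate()  # sends SIGTERM on POSIX systems
    try:
        proc.wait(timeout=grace_seconds)
        return "terminated"
    except subprocess.TimeoutExpired:
        proc.kill()  # sends SIGKILL, which cannot be caught or ignored
        proc.wait()  # reap the process so it is really gone before a restart
        return "killed"


if __name__ == "__main__":
    # Stand-in for a daemon that does not react to SIGTERM.
    stubborn = (
        "import signal, time;"
        "signal.signal(signal.SIGTERM, signal.SIG_IGN);"
        "print('ready', flush=True); time.sleep(60)"
    )
    child = subprocess.Popen([sys.executable, "-c", stubborn],
                             stdout=subprocess.PIPE, text=True)
    child.stdout.readline()  # wait until the handler is installed
    print(stop_process(child, grace_seconds=1.0))
    child.stdout.close()
```

The point of waiting for the process to be reaped before returning is that a new instance is only started once the old one has actually gone, which is the condition that was violated in this ticket.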

#17

Updated by Phillip Davis over 11 years ago

I have systems where the internet somewhere upstream goes away quite regularly. The actual pfSense WAN interface to the upstream device (ISP, whatever) is fine, so there is no link down/link up event for pfSense to see in that sense.
OpenVPN site-to-site clients time out after a bit, and then try to find their server end again. For this they try to resolve the FQDN of the server again. However, when the ISP issue lasts more than a few minutes, the DNS resolution fails, and with the now-default "resolv-retry 2", the OpenVPN client simply gives up and exits.
Then there is nothing in the system to try to start it again, either when the ISP's internet is better, or every so often. The clients stay down.
I have noticed this happen quite a few times recently and now realise the "resolv-retry 2" change is the reason for the new behavior. It seems odd to have a config in which the client process simply exits in a reasonably expected situation (DNS resolution has gone away for a few minutes) and is never restarted.
I can select "Infinitely resolve server" and that will put things back the way they were. But it will be a hassle for lots of users to find this out after upgrading to 2.2.
But with Chris' comment above about the SIGKILL/SIGTERM stuff: if that really resolves the underlying issue, would it be best to revert the commit of the "resolv-retry 2" stuff?
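
The difference between a bounded and an infinite retry can be modelled with a short Python sketch. It is a simplified model, not OpenVPN's implementation: `resolve_with_retry` is a hypothetical function, and it treats the limit as a time budget in seconds because the OpenVPN manual describes `--resolv-retry n` as retrying "for n seconds", with `None` standing in for `infinite`.

```python
import socket
import time
from typing import Callable, Optional


def resolve_with_retry(
    host: str,
    retry_seconds: Optional[float],
    resolve: Callable[[str], str] = socket.gethostbyname,
    pause: float = 1.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Resolve host, retrying failed lookups until retry_seconds have elapsed.

    retry_seconds=None models "infinite": keep trying until a lookup succeeds.
    """
    deadline = None if retry_seconds is None else clock() + retry_seconds
    while True:
        try:
            return resolve(host)
        except OSError:  # socket.gaierror is a subclass of OSError
            if deadline is not None and clock() >= deadline:
                raise  # budget exhausted: the caller sees the failure and may exit
            sleep(pause)
```

With a small budget, a DNS outage of a few minutes ends in an exception, which corresponds to the client exiting and staying down. With `retry_seconds=None`, the loop simply keeps waiting until resolution works again, which is the behavior that "Infinitely resolve server" restores.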

#18

Updated by Ermal Luçi over 11 years ago

You have the resolv-retry infinite option in the OpenVPN settings.
Use that to have it behave as before.

#19

Updated by Phillip Davis over 11 years ago

I understand that, and I will now go to all my site-to-site clients on 2.1.5 and turn on that setting so it carries over into 2.2.
At the moment in 2.1.5, no resolv-retry goes in the config by default. And thus the OpenVPN default is in effect:

"By default, --resolv-retry infinite is enabled."

I am thinking that there might be quite a few people who experience this after upgrading to 2.2. Or is my situation an unusual edge case? Just thought I would raise the issue so others can think and comment.

#20

Updated by Chris Buechler over 11 years ago

Since the circumstance Phil noted is pretty common, and the change that caused a problem there had no benefit on the original bug in this ticket, I changed our resolv-retry default back to OpenVPN's default of infinite. It'd break too much otherwise.

#21

Updated by Dmitriy K over 11 years ago

Does that mean that the issue remains intact? Or will SIGKILL do in my case?

#22

Updated by Chris Buechler over 11 years ago

The last update has nothing to do with your issue Dmitriy, the fix I put in a couple weeks ago is fine for that. Ermal's other change in this ticket is what broke Phil's setup and would end up breaking a lot of others, which was undone today. Everything related here is all good.
