Bug #6202

closed

OpenVPN bad state / pids do not allow clients to restart.

Added by Dan . about 9 years ago. Updated about 9 years ago.

Status:
Duplicate
Priority:
Normal
Assignee:
-
Category:
OpenVPN
Target version:
-
Start date:
04/18/2016
Due date:
% Done:

0%

Estimated time:
Plus Target Version:
Release Notes:
Affected Version:
Affected Architecture:

Description

I got into a state where the pfSense dashboard reported that it could not connect to my OpenVPN client instances and asked whether the daemons were running. According to the sockets they were running and traffic was flowing, but I could not recover from that state using the GUI, and the state persisted across a reboot. This time it happened when I used the ssh console to upgrade from 2.2.6 to 2.3, although I have had it happen under different circumstances in the past and can see it happening after an unclean shutdown. (My 2.2.6 to 2.3 reboot was initiated by pfSense, so it should have been clean.)

After this, one or more OpenVPN clients would try to start, and shortly afterwards OpenVPN would complain with "Cannot open TUN/TAP dev /dev/tunX: Device busy (errno=16)".

I was able to solve this problem by manually killing the OpenVPN client processes by PID and then restarting the clients, either manually or via a reboot.

I did some digging and I think I found the problem. In /etc/inc/openvpn.inc, function openvpn_restart($mode, $settings), if the PID file exists, the PID is read out and the corresponding process is killed. But if we are in a bad state, meaning the PID in the file is no longer valid and a client instance is somehow already up, the restart only partially succeeds: it fails to kill the non-existent PID, then tries to start a new instance of that client, which fails with the error above.

I recommend adding a check that the PID exists before attempting to kill it. If it does not, a secondary check should look for a running process matching /var/run/openvpn_client_${vpnid}.conf, extract that PID and kill it. This should fix the issue once and for all.
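To make the suggestion concrete, here is a rough sketch of the two-step check, written in shell rather than the PHP of openvpn.inc. The function names pid_is_alive and find_client_pids are made up for illustration and are not pfSense code. The config path is passed in as an argument, and the fallback assumes the daemon was started as "openvpn --config <path>".

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of pfSense.

# Succeed if a process with the given PID exists.
# (kill -0 sends no signal; it only checks that the PID can be signalled.)
pid_is_alive() {
    [ -n "$1" ] && kill -0 "$1" 2>/dev/null
}

# Print the PID(s) that should be killed before restarting a client.
# $1 = PID file, $2 = config path the daemon was started with.
find_client_pids() {
    pidfile=$1
    conf=$2
    pid=""
    [ -r "$pidfile" ] && pid=$(cat "$pidfile")
    if pid_is_alive "$pid"; then
        # The PID file is still valid: use it as-is.
        echo "$pid"
    else
        # PID file missing or stale: fall back to matching the command line.
        pgrep -f "openvpn --config $conf"
    fi
}
```

The restart path would then kill whatever find_client_pids prints before starting the new instance, so a leftover daemon cannot keep the tun device busy. Note that kill -0 only shows that some process holds the PID, not that it is OpenVPN.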

For reference, here's my forum thread about this: https://forum.pfsense.org/index.php?topic=91067.0

Actions #1

Updated by Chris Buechler about 9 years ago

  • Status changed from New to Duplicate
  • Affected Version deleted (2.3)

duplicate of #6132

Actions #2

Updated by Dan . about 9 years ago

I'm not sure this is a duplicate of #6132, because:

1. I see different errors in the VPN log (device busy), none of which appear in the other bug's log.
2. None of the running PIDs match those in the PID files; in the other bug at least one matches.
3. I can reproduce it consistently simply by rebooting the firewall; the other bug cannot be reproduced consistently.

Right after a reboot:

[2.3-RELEASE][root@firewall]/root: cat /var/run/openvpn_client*.pid
20954
15268
20763

[2.3-RELEASE][root@firewall]/root: ps aux | grep vpn
root 63313 0.4 0.5 21616 5372 - Ss 11:07PM 0:12.03 /usr/local/sbin/openvpn --config /var/etc/openvpn/client1.conf
root 52831 0.0 0.5 21616 5484 - Ss 11:07PM 0:02.66 /usr/local/sbin/openvpn --config /var/etc/openvpn/client2.conf
root 61850 0.0 0.5 21616 5380 - Ss 11:07PM 0:02.14 /usr/local/sbin/openvpn --config /var/etc/openvpn/client3.conf
root 49497 0.0 0.2 18740 2236 0 S+ 11:28PM 0:00.00 grep vpn
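The same comparison can be done without reading the two listings side by side. This is a small sketch, with report_pidfiles being a name I made up; it only tests whether each recorded PID exists at all, not whether it belongs to OpenVPN.

```shell
#!/bin/sh
# Sketch only: report, for each OpenVPN client PID file in a directory,
# whether the PID it records still belongs to a running process.
# $1 = directory holding the PID files (e.g. /var/run).
report_pidfiles() {
    for f in "$1"/openvpn_client*.pid; do
        [ -r "$f" ] || continue          # skip if the glob matched nothing
        pid=$(cat "$f")
        if [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null; then
            echo "$f: $pid running"
        else
            echo "$f: $pid stale"
        fi
    done
}
```

Running report_pidfiles /var/run as root right after a reboot would flag any PID file whose process is gone.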
