Bug #6202

closed

OpenVPN bad state / pids do not allow clients to restart.

Added by Dan . over 9 years ago. Updated over 9 years ago.

Status:
Duplicate
Priority:
Normal
Assignee:
-
Category:
OpenVPN
Target version:
-
Start date:
04/18/2016
Due date:
% Done:

0%

Estimated time:
Plus Target Version:
Release Notes:
Affected Version:
Affected Architecture:

Description

I got into a state where the pfSense dashboard reported that it could not connect to my OpenVPN client instances and asked whether the daemons were running. According to the open sockets they were indeed running, and traffic was flowing, but I could not recover from that state using the GUI. The state persisted across a reboot. This time it happened when I used the SSH console to upgrade from 2.2.6 to 2.3, although I have had it happen under different circumstances in the past, and I can see it possibly happening after an unclean shutdown. (My 2.2.6 to 2.3 reboot was initiated by pfSense, so it should have been clean.)

After this, one or more OpenVPN clients would try to start, and shortly afterwards OpenVPN would complain with "Cannot open TUN/TAP dev /dev/tunX: Device busy (errno=16)".

I was able to solve this problem by manually killing the OpenVPN client processes by PID and then restarting the clients, either manually or via a reboot.

I did some digging and I think I found the problem. In /etc/inc/openvpn.inc, the function openvpn_restart($mode, $settings) checks whether the PID file exists and, if it does, reads the PID and kills the corresponding process. If we got into a bad state, meaning the PID in the file is no longer valid and a client instance is somehow already up, the restart only partially succeeds: it fails to kill the non-existent PID but still tries to start a new instance of that client, which fails with the error above.

I recommend adding a check that the PID exists before attempting to kill it. If it does not, a secondary check should look for a running process matching /var/run/openvpn_client_${vpnid}.conf, extract its PID, and kill it. This should fix the issue once and for all.
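To make the suggested check concrete, here is a minimal sketch of the logic, written in Python for illustration rather than in the PHP of openvpn.inc. The function names, the example paths, and the use of `ps` to list command lines are all my assumptions, not pfSense code. It trusts the PID file only if that PID is alive, and otherwise falls back to finding processes started with the client's config file.

```python
import os
import subprocess


def read_pid_file(path):
    """Return the PID stored in path, or None if the file is missing or unparsable."""
    try:
        with open(path) as handle:
            return int(handle.read().strip())
    except (OSError, ValueError):
        return None


def pid_is_alive(pid):
    """Return True if a process with this PID currently exists."""
    if pid <= 0:
        return False
    try:
        os.kill(pid, 0)  # signal 0 delivers nothing; it only checks the PID
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def find_pids_by_config(config_path, ps_output=None):
    """Return PIDs whose command line has config_path as one of its arguments."""
    if ps_output is None:
        # Assumption: a ps that accepts -A, -ww and -o (FreeBSD and Linux both do).
        ps_output = subprocess.run(
            ["ps", "-A", "-ww", "-o", "pid=", "-o", "args="],
            capture_output=True, text=True, check=True,
        ).stdout
    pids = []
    for line in ps_output.splitlines():
        fields = line.split()
        # Compare whole arguments so client1.conf does not match client10.conf.
        if len(fields) >= 2 and fields[0].isdigit() and config_path in fields[1:]:
            pids.append(int(fields[0]))
    return pids


def select_pids_to_stop(pid_file, config_path, ps_output=None):
    """Prefer the PID file; if its PID is stale, fall back to a command-line match."""
    pid = read_pid_file(pid_file)
    if pid is not None and pid_is_alive(pid):
        return [pid]
    return find_pids_by_config(config_path, ps_output)
```

A restart routine built on this would terminate every PID returned by `select_pids_to_stop` before launching the new instance, so a stale PID file could no longer leave the old client holding the tun device.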

For reference, here's my forum thread about this problem: https://forum.pfsense.org/index.php?topic=91067.0
