Bug #7905

OpenVPN Authentication Against Backend Stalls All Server Traffic

Added by Chris Linstruth 10 months ago. Updated about 1 month ago.

Status: Confirmed
Priority: Normal
Assignee:
Category: OpenVPN
Target version:
Start date: 10/01/2017
Due date:
% Done: 0%
Affected Version: 2.4
Affected Architecture: All

Description

When an OpenVPN Remote Access server authenticates against a backend such as RADIUS, all traffic on that server is halted while the authentication is processed. If the backend is slow, for example RADIUS proxied to an MFA service such as Duo, the stall can be significant even under normal circumstances.
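To illustrate why one slow login affects every connected client, here is a toy model of a single-threaded server loop. It is only a sketch: the function name, the timings, and the assumption that a ping is lost whenever it arrives while the loop is busy are all illustrative and are not taken from OpenVPN's code.

```python
def pings_answered_on_time(auth_start, auth_seconds, deferred, ping_times):
    """Count pings that a single-threaded server can handle on arrival.

    Toy model (assumption): with blocking auth the loop is busy from
    auth_start until auth_start + auth_seconds; with deferred auth the
    slow work runs elsewhere and the loop is never busy.
    """
    busy_until = auth_start if deferred else auth_start + auth_seconds
    on_time = 0
    for t in ping_times:
        blocked = auth_start <= t < busy_until
        if not blocked:
            on_time += 1
    return on_time


# One ping every 5 seconds for a minute; a 30-second auth starts at t=12.
ping_times = list(range(0, 60, 5))
blocking = pings_answered_on_time(12, 30, False, ping_times)
deferred = pings_answered_on_time(12, 30, True, ping_times)
print(f"blocking auth: {blocking}/{len(ping_times)} pings on time")
print(f"deferred auth: {deferred}/{len(ping_times)} pings on time")
```

In this model the blocking case loses every ping that arrives during the 30-second window, while the deferred case loses none.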

Test procedure:

Created and tested a RADIUS Authentication backend with a 30-second authentication timeout

Created an SSL/TLS + User Auth OpenVPN Server backed by the RADIUS server

Created two different user certificates and corresponding RADIUS accounts

Successfully logged in using the first account and started a ping

Changed the RADIUS server IP address so the request would "hang" for the configured 30 seconds

Attempted to log in to the second account

While that was timing out I changed the RADIUS server back to the proper IP address to see if Viscosity (the first client) recovered automatically. It did time out and reconnect successfully.

    # 5 seconds apart
    $ ping -i 5 172.25.228.1
    PING 172.25.228.1 (172.25.228.1): 56 data bytes
    64 bytes from 172.25.228.1: icmp_seq=0 ttl=63 time=4.658 ms
    64 bytes from 172.25.228.1: icmp_seq=1 ttl=63 time=5.461 ms
    64 bytes from 172.25.228.1: icmp_seq=2 ttl=63 time=4.718 ms
    64 bytes from 172.25.228.1: icmp_seq=3 ttl=63 time=4.881 ms
    64 bytes from 172.25.228.1: icmp_seq=4 ttl=63 time=2.311 ms
    64 bytes from 172.25.228.1: icmp_seq=5 ttl=63 time=4.756 ms
    64 bytes from 172.25.228.1: icmp_seq=6 ttl=63 time=4.851 ms
    64 bytes from 172.25.228.1: icmp_seq=7 ttl=63 time=5.014 ms
    64 bytes from 172.25.228.1: icmp_seq=8 ttl=63 time=4.892 ms
    64 bytes from 172.25.228.1: icmp_seq=9 ttl=63 time=2.622 ms
    Request timeout for icmp_seq 10
    Request timeout for icmp_seq 11
    Request timeout for icmp_seq 12
    Request timeout for icmp_seq 13
    Request timeout for icmp_seq 14
    Request timeout for icmp_seq 15
    Request timeout for icmp_seq 16
    Request timeout for icmp_seq 17
    Request timeout for icmp_seq 18
    Request timeout for icmp_seq 19
    Request timeout for icmp_seq 20
    Request timeout for icmp_seq 21
    64 bytes from 172.25.228.1: icmp_seq=22 ttl=63 time=2.731 ms
    64 bytes from 172.25.228.1: icmp_seq=23 ttl=63 time=4.933 ms
    64 bytes from 172.25.228.1: icmp_seq=24 ttl=63 time=5.390 ms
    64 bytes from 172.25.228.1: icmp_seq=25 ttl=63 time=5.429 ms
    64 bytes from 172.25.228.1: icmp_seq=26 ttl=63 time=5.014 ms
    64 bytes from 172.25.228.1: icmp_seq=27 ttl=63 time=5.074 ms
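The ping trace above shows 12 consecutive timeouts at a 5-second interval, which works out to roughly 60 seconds without replies. A small sketch of that arithmetic follows. The helper name is hypothetical, and the estimate is only accurate to within one ping interval.

```python
import re


def longest_outage(ping_output, interval_s):
    """Return (lost_pings, approx_seconds) for the longest run of timeouts."""
    longest = current = 0
    for line in ping_output.splitlines():
        if re.search(r"Request timeout for icmp_seq \d+", line):
            current += 1
            longest = max(longest, current)
        elif "bytes from" in line:
            current = 0  # a reply ends the current run of timeouts
    return longest, longest * interval_s


# Rebuild the relevant part of the trace: a reply for seq 9,
# timeouts for seq 10 through 21, then a reply for seq 22.
sample = "\n".join(
    ["64 bytes from 172.25.228.1: icmp_seq=9 ttl=63 time=2.622 ms"]
    + [f"Request timeout for icmp_seq {n}" for n in range(10, 22)]
    + ["64 bytes from 172.25.228.1: icmp_seq=22 ttl=63 time=2.731 ms"]
)
lost, seconds = longest_outage(sample, interval_s=5)
print(f"{lost} pings lost, about {seconds} seconds")
```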

Same behavior on 2.3.4_1 and most-recent 2.4.0-RC. I did not test 2.4.1 since it uses the same OpenVPN package as 2.4.0.

server2.conf - Server config file - partially redacted (1.33 KB) Phil DeMonaco, 02/23/2018 04:05 PM

openvpn-auth-script-test.log - Full server process log - partially redacted (UTC) (38.1 KB) Phil DeMonaco, 02/23/2018 04:05 PM

openvpn-auth-script-test-client.log - Full client process log - partially redacted (EST) (1.39 KB) Phil DeMonaco, 02/23/2018 04:08 PM

client.ovpn - Client config file - partially redacted (242 Bytes) Phil DeMonaco, 02/23/2018 04:09 PM

History

#1 Updated by Jim Pingle 10 months ago

  • Affected Architecture set to All

Looks like it's a known issue with the nature of auth-user-pass-verify that OpenVPN does not plan to address: https://community.openvpn.net/openvpn/ticket/222

But there is a workaround:
http://engineering.freeagent.com/2017/05/22/external-authentication-scripts-in-openvpn-the-right-way/
https://github.com/fac/auth-script-openvpn

Looks like that would need to be converted into a FreeBSD port or added to the OpenVPN port and then the OpenVPN code and auth script would have to be adapted to use it instead, assuming it can do everything we need it to do.
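For illustration, the deferred approach behind that workaround can be sketched as follows: the auth script does its slow work out of band and then reports the verdict by writing "1" (accept) or "0" (reject) to the file OpenVPN names in `auth_control_file`. This is a minimal sketch, not the plugin's or pfSense's actual script. It assumes the script receives `username`, `password` and `auth_control_file` in its environment, and `check_credentials` is a hypothetical stand-in for the real RADIUS lookup.

```python
import os
import tempfile


def check_credentials(username, password):
    """Hypothetical stand-in for the slow RADIUS/MFA lookup."""
    return username == "alice" and password == "correct-horse"


def report_result(env):
    """Write '1' (accept) or '0' (reject) to the deferred-auth control file."""
    ok = check_credentials(env.get("username", ""), env.get("password", ""))
    with open(env["auth_control_file"], "w") as fh:
        fh.write("1" if ok else "0")
    return ok


def demo(username, password):
    """Run report_result against a temporary control file; return its content."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        report_result({"username": username, "password": password,
                       "auth_control_file": path})
        with open(path) as fh:
            return fh.read()
    finally:
        os.remove(path)


print(demo("alice", "correct-horse"))
print(demo("alice", "wrong"))
```

In a real deployment the environment would come from `os.environ` rather than a hand-built dictionary.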

#2 Updated by Jim Pingle 9 months ago

  • Target version changed from 2.4.1 to 2.4.2

Moving target to 2.4.2 as we need 2.4.1 sooner than anticipated.

#3 Updated by Jim Thompson 9 months ago

  • Assignee set to Jim Pingle

#4 Updated by Jim Pingle 9 months ago

  • Status changed from New to Confirmed
  • Target version changed from 2.4.2 to 2.4.3

I was finally able to confirm the problem. I'm looking at that auth_script plugin now, but it will require some significant changes to the auth script, and it doesn't compile out of the box on FreeBSD. Given how it operates and what is required, this will probably need more analysis time than we have for 2.4.2, so I'm bumping it forward for now.

#5 Updated by John Tikis 6 months ago

I can also add that when two RADIUS servers are declared as backend authenticators and the first one on the list fails (e.g. it is switched off), OpenVPN never falls back to the second RADIUS authentication server on the list.

A similar issue has been opened for LDAP backends -> https://redmine.pfsense.org/issues/3022

pfSense version: 2.4.2-Release-p1
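The fallback behavior expected here can be sketched as a loop that moves on to the next server when one times out. This is a sketch of the expected logic only, not pfSense's implementation. The `query` callable, the server names, and the use of `TimeoutError` to signal an unreachable server are assumptions made for the example.

```python
def authenticate_with_failover(servers, username, password, query):
    """Try each server in order; move on to the next one on timeout."""
    for server in servers:
        try:
            return query(server, username, password)
        except TimeoutError:
            continue  # this server is unreachable, try the next one
    return False  # every server timed out


def fake_query(server, username, password):
    """Hypothetical backend: the primary is switched off, the secondary works."""
    if server == "radius-primary":
        raise TimeoutError(server)
    return (username, password) == ("alice", "secret")


servers = ["radius-primary", "radius-secondary"]
print(authenticate_with_failover(servers, "alice", "secret", fake_query))
```

Note that a definite reject from a reachable server is returned immediately. Only a timeout triggers the next server.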

#6 Updated by Steve Beaver 6 months ago

  • Target version changed from 2.4.3 to 2.4.4

Sorry to have to kick this to ++version, but the work required cannot be squeezed into the 2.4.3 schedule.

#7 Updated by John Tikis 5 months ago

Would a solution like Keepalived on the authentication back-end servers (if they are of the same type, e.g. RADIUS) work as an interim solution?

#8 Updated by Phil DeMonaco 5 months ago

I'm actively working on implementing this in my pfSense environment. Currently I have a fork of the workaround linked above that compiles successfully both on FreeBSD 10.3 and FreeBSD 11.1:

https://github.com/pdemonaco/auth-script-openvpn/tree/freebsd-make-support

Note that the Makefile is using gmake features.

FreeBSD isn't my primary development platform, so I'm still tinkering a bit trying to get a pfSense build environment configured. Once I have that working, I plan to at least do a backend implementation of this on my appliance.

When this is solved in the mainline implementation, is there a plan to incorporate GUI changes to the VPN server page?

Additionally, is there a good way for me to contribute these changes directly to the project?

#9 Updated by Phil DeMonaco 5 months ago

I'm really close to having this working on the 2.4.2-RELEASE code base. However, I'm running into an issue, and I'm hoping someone a bit more familiar with the integration of pfSense and OpenVPN might be able to help.

In brief, I've modified three small parts of the existing pfSense environment:
  1. Added a new asynchronous script designed to be called via an OpenVPN plugin: https://github.com/pdemonaco/pfsense/blob/master/src/usr/local/sbin/ovpn_auth_verify_async
  2. Updated the config file generation so that it uses the plugin directive instead of auth-user-pass-verify: https://github.com/pdemonaco/pfsense/blob/master/src/etc/inc/openvpn.inc
  3. Modified the auth-script plugin to allow arguments to be passed to the target script: https://github.com/pdemonaco/auth-script-openvpn/tree/pass-script-arguments

Note that I made the same changes to the 2.4.2-RELEASE version of the openvpn.inc file to actually execute my tests as it differs from the current master.
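The config generation change amounts to emitting a different directive for the same script. The sketch below shows the idea in Python for brevity. It is not the code in openvpn.inc (which is PHP), the plugin path is a placeholder, and the extra arguments the real configuration passes to the script are left out.

```python
def auth_directive(script, use_plugin,
                   plugin_path="/hypothetical/path/auth_script.so"):
    """Return the OpenVPN server config line that wires in the auth script."""
    if use_plugin:
        # Deferred: the plugin launches the script and OpenVPN keeps
        # forwarding traffic while the script runs.
        return f"plugin {plugin_path} {script}"
    # Blocking: OpenVPN waits for the script to exit before doing anything else.
    return f'auth-user-pass-verify "{script}" via-env'


script = "/usr/local/sbin/ovpn_auth_verify_async"
print(auth_directive(script, use_plugin=False))
print(auth_directive(script, use_plugin=True))
```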

While the call to the authentication module occurs correctly and returns the proper result something goes wrong after the plugin notifies the core process of the result. The log excerpt below begins at the point where the authentication process completes with a successful auth.

Feb 23 21:11:28 d0-nfw2533 openvpn[29675]: event_wait : Interrupted system call (code=4)
Feb 23 21:11:29 d0-nfw2533 openvpn[29675]: <client-ip>:43518 UDPv4 READ [56] from [AF_INET]<client-ip>:43518: P_CONTROL_V1 kid=0 [ ] pid=5 DATA len=42
Feb 23 21:11:29 d0-nfw2533 openvpn[29675]: <client-ip>:43518 PUSH: Received control message: 'PUSH_REQUEST'
Feb 23 21:11:29 d0-nfw2533 openvpn[29675]: <client-user>/<client-ip>:43518 MULTI_sva: pool returned IPv4=172.29.128.2, IPv6=(Not enabled)
Feb 23 21:11:29 d0-nfw2533 openvpn[29675]: <client-user>/<client-ip>:43518 WARNING: Failed running command (--client-connect): external program fork failed
Feb 23 21:11:29 d0-nfw2533 openvpn[29675]: <client-user>/<client-ip>:43518 UDPv4 WRITE [22] to [AF_INET]<client-ip>:43518: P_ACK_V1 kid=0 [ 5 ]
Feb 23 21:11:35 d0-nfw2533 openvpn[29675]: <client-user>/<client-ip>:43518 UDPv4 READ [56] from [AF_INET]<client-ip>:43518: P_CONTROL_V1 kid=0 [ ] pid=6 DATA len=42
Feb 23 21:11:35 d0-nfw2533 openvpn[29675]: <client-user>/<client-ip>:43518 PUSH: Received control message: 'PUSH_REQUEST'
Feb 23 21:11:35 d0-nfw2533 openvpn[29675]: <client-user>/<client-ip>:43518 Delayed exit in 5 seconds
Feb 23 21:11:35 d0-nfw2533 openvpn[29675]: <client-user>/<client-ip>:43518 SENT CONTROL [<client-user>]: 'AUTH_FAILED' (status=1)
Feb 23 21:11:35 d0-nfw2533 openvpn[29675]: <client-user>/<client-ip>:43518 UDPv4 WRITE [22] to [AF_INET]<client-ip>:43518: P_ACK_V1 kid=0 [ 6 ]
Feb 23 21:11:35 d0-nfw2533 openvpn[29675]: <client-user>/<client-ip>:43518 UDPv4 WRITE [55] to [AF_INET]<client-ip>:43518: P_CONTROL_V1 kid=0 [ ] pid=7 DATA len=41
Feb 23 21:11:37 d0-nfw2533 openvpn[29675]: <client-user>/<client-ip>:43518 UDPv4 WRITE [55] to [AF_INET]<client-ip>:43518: P_CONTROL_V1 kid=0 [ ] pid=7 DATA len=41
Feb 23 21:11:40 d0-nfw2533 openvpn[29675]: <client-user>/<client-ip>:43518 SIGTERM[soft,delayed-exit] received, client-instance exiting

I believe the "event_wait" message may be the normal indication that the auth request has completed, as it is immediately followed by an attempt to allocate an IP to the client. The part that confuses me is the fork failure message on the client-connect command. From what I can tell, the script referenced in the configuration file is not actually being called.

Any suggestions would be appreciated - thanks!

#10 Updated by Phil DeMonaco 5 months ago

Note that the event_wait signal, the MULTI_sva, and the WARNING do not appear if the auth request fails.

#11 Updated by Phil DeMonaco 5 months ago

I've corrected the issue. The problem was that the plugin was stealing the original signal handler built into OpenVPN. I've reworked it to daemonize a second-level child, and it's now functioning flawlessly.
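The "second-level child" pattern is the classic double fork. A minimal sketch of it follows, written in Python rather than the plugin's C and not taken from the plugin's source. The parent reaps the short-lived first child synchronously, so it never has to install or replace a SIGCHLD handler, while the grandchild does the slow work fully detached. The function names are illustrative, and the code assumes a POSIX system where `os.fork` is available.

```python
import os
import tempfile
import time


def run_detached(job):
    """Run job() in a grandchild process; return once the first child exits."""
    pid = os.fork()
    if pid > 0:
        # Parent: wait for the first child directly, so no signal
        # handler is needed to avoid leaving a zombie behind.
        os.waitpid(pid, 0)
        return
    try:
        os.setsid()          # first child: detach from the parent's session
        if os.fork() == 0:
            job()            # grandchild: orphaned, does the slow work
    finally:
        os._exit(0)          # neither child ever returns into the caller


def demo():
    """Have the grandchild write to a temp file, then wait for the result."""
    fd, path = tempfile.mkstemp()
    os.close(fd)

    def job():
        with open(path, "w") as fh:
            fh.write("done")

    run_detached(job)
    content = ""
    deadline = time.time() + 5
    while time.time() < deadline and content != "done":
        with open(path) as fh:
            content = fh.read()
        time.sleep(0.05)
    os.remove(path)
    return content


if __name__ == "__main__":
    print(demo())
```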

#13 Updated by Phil DeMonaco about 1 month ago

I've added another pull request which includes the new plugin port as a dependency to the main pfSense port.

https://github.com/pfsense/FreeBSD-ports/pull/525

It's been over a month since the plugin was accepted into FreeBSD upstream and I've seen no action on the requests. What can I do to expedite this process?
