Bug #12163

closed

WAN interface throughput degradation after sending high volume through OpenVPN site-to-site tunnel

Added by Tom Hebert over 2 years ago. Updated over 2 years ago.

Status: Duplicate
Priority: Normal
Assignee: -
Category: Interfaces
Target version: -
Start date: 07/24/2021
Due date:
% Done: 0%
Estimated time:
Plus Target Version:
Release Notes: Default
Affected Version: 2.5.2
Affected Architecture: All

Description

We have a Netgate 5100 onsite and three remote sites. Two of those sites use Netgate 5100s and the third is running pfSense on an Azure VM, which we use for our poor man's hybrid cloud. My notes here are taken from the Azure-to-home connection, but the results are repeatable from all three remote sites.

Here are the steps I follow to reproduce the problem:

  • Restart both routers, wait for OpenVPN site-to-site to connect. In this case onsite is the OVPN server and the remote router is the client.
  • Start iperf server onsite with defaults
  • Log in to the remote router
  • Run the iperf client, from the remote site, to the local site's WAN Address (Public IP). Take all defaults.
  • Check results: 500 Mbps, which is essentially full speed.
  • Run the iperf client again. This time use the in-tunnel address.
  • 24 Mbps. This is very bad. One might say it's an OpenVPN problem, but that's not what I am reporting.
  • Run the iperf client one more time. This time go back to the WAN Address (Public IP), outside the tunnel.
  • Check results: 77 Mbps outside the tunnel, 84.6% SLOWER than 500 Mbps.
  • Reboot the remote server, leave the onsite server as-is.
  • Rerun the iperf client using the WAN Address (Public IP) and it is back to full speed, 500 Mbps.
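The sequence above can be sketched as a small Python helper. This is only an illustration, not part of the original report: the two addresses are placeholders, the function names are hypothetical, and the only iperf option used is the standard client flag `-c`. The throughput figures are the ones reported in the steps, and the script reproduces the 84.6% figure from them.

```python
import subprocess

# Placeholder addresses (assumptions, not taken from the report).
WAN_ADDR = "203.0.113.10"   # onsite router's public IP (documentation range)
TUNNEL_ADDR = "10.0.8.1"    # onsite router's in-tunnel address (hypothetical)


def iperf_client_cmd(target):
    """Build an iperf client command that takes all defaults."""
    return ["iperf", "-c", target]


def run_iperf(target):
    """Run the client and return its raw output.

    Requires iperf to be installed and an iperf server listening at target.
    """
    result = subprocess.run(
        iperf_client_cmd(target), capture_output=True, text=True, check=True
    )
    return result.stdout


def percent_slower(baseline_mbps, measured_mbps):
    """Percentage by which measured throughput falls below the baseline."""
    return (baseline_mbps - measured_mbps) / baseline_mbps * 100.0


# Throughput figures as reported in the steps above, in Mbps.
REPORTED = [
    ("WAN, after fresh restart", 500),
    ("in-tunnel", 24),
    ("WAN, after tunnel test", 77),
    ("WAN, after remote reboot", 500),
]


def summarize(runs):
    """Compare every run against the first one, which serves as the baseline."""
    baseline = runs[0][1]
    return [
        (label, mbps, round(percent_slower(baseline, mbps), 1))
        for label, mbps in runs
    ]


if __name__ == "__main__":
    for label, mbps, drop in summarize(REPORTED):
        print(f"{label:26s} {mbps:4d} Mbps  {drop:5.1f}% below baseline")
```

To repeat the measurements rather than just recompute the percentages, `run_iperf(WAN_ADDR)` and `run_iperf(TUNNEL_ADDR)` would be called in the same order as the steps, with the placeholder addresses replaced by real ones.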

Additional notes:

  • Restarting the OpenVPN client on the remote router has no effect. Once the WAN interface is broken it remains broken
  • All routers are up-to-date on maintenance
  • The on-premise local server is never altered, nor is it rebooted. The errant behavior is at the remote site.
  • Changing values at System==>Advanced==>Network/ Network Interfaces has no effect on either side
  • Rebooting the on-premise server has no effect on the broken remote router
  • The tunnel cannot be used for FTP, SMB or backups
  • The tunnel works for low-volume activities such as RDP, SSH terminal sessions, etc.
  • This has been recreated from two remote sites, the Azure/pfSense and the Netgate 5100 pfSense+
  • I posted this as a general question because I don't think it's an OpenVPN bug, per se. It's most likely a bug in the adaptation of OpenVPN by pfSense.

The bottom line:

High volumes inside the site-to-site OpenVPN tunnel corrupt the WAN interface in some way on the client side. It could be an OpenVPN issue or bug. However, it feels to me like OpenVPN is the victim, and pfSense, or its adaptation of OpenVPN, is triggering the problem. I also see many poorly answered or unanswered questions regarding OpenVPN performance in the forums and think this may be the root cause behind many of those observations.


Related issues

Is duplicate of Bug #11778: OpenVPN uses 100% CPU after experiencing packet loss (New, 04/03/2021)

Actions #1

Updated by Jim Pingle over 2 years ago

  • Status changed from New to Duplicate

Almost certainly a duplicate of #11778

Actions #2

Updated by Jim Pingle over 2 years ago

  • Is duplicate of Bug #11778: OpenVPN uses 100% CPU after experiencing packet loss added
Actions #3

Updated by Tom Hebert over 2 years ago

Jim Pingle wrote in #note-1:

Almost certainly a duplicate of #11778

I doubt it. In my case CPU usage never exceeded a few percent, and most of that was the GUI. I am not as experienced as you are, so maybe it's just related.

If/when you think it's resolved, I'd be happy to apply a patch and retest. In the meantime we are spinning up WireGuard gateways on Linux VMs as a workaround. It is far from optimal, and firewalling it is a pain, but we have to move the data.
