Bug #14693

open

Filter reload with NAT reflection rules is extremely slow

Added by Kevin Bentlage almost 2 years ago. Updated 14 days ago.

Status:
New
Priority:
Normal
Assignee:
-
Category:
NAT Reflection
Target version:
-
Start date:
Due date:
% Done:

0%

Estimated time:
Plus Target Version:
Release Notes:
Default
Affected Version:
2.7.0
Affected Architecture:
amd64

Description

We're running a pfSense cluster with the following numbers of rules:

- 60x Outbound NAT rule
- 120x NAT rule (port forward)
- 80x 1:1 NAT rule
- 850x Firewall rule

When reloading the filter (or applying changes to rules/NAT), the full reload takes 10 minutes to finish!

When I check the logs on the "Filter Reload" page, the "NAT Reflection" rules take 5 seconds each! The firewall rules are quite fast: they all load together in about 5 seconds!

Initializing
Creating aliases
Creating gateway group item...
Generating Limiter rules
Generating NAT rules
Creating 1:1 rules...
Creating reflection NAT rule for VIP for xxxxxx... -> takes about 5 seconds
Creating reflection NAT rule for VIP for xxxxxx... -> takes about 5 seconds
Creating reflection NAT rule for VIP for xxxxxx... -> takes about 5 seconds
Creating reflection NAT rule for VIP for xxxxxx... -> takes about 5 seconds
Creating reflection NAT rule for VIP for xxxxxx... -> takes about 5 seconds
(and so on, for all reflection rules)

So the problem (delay) seems to be present only in the "NAT reflection" rules.

Each server runs on bare metal with the following specs:

- 2x Intel(R) Xeon(R) CPU E5-2650 v2 @ 2.60GHz
- 64GB RAM
- SSD storage

How can we speed up our reloads?

#1

Updated by Vincent Caron 14 days ago

This problem has been bugging me a lot too. I have lots of interfaces (250 VLANs) and about 200 NAT rules; reloading took 25 minutes. I managed to get it down to 17 seconds.

Looking at the filter reload log, I was obviously hit by the "Creating reflection NAT rule" slowness; the toll was about 12 seconds per item in my case!

I chased the problem down to 2 places, and reflection rule generation went from 12 s to 25 ms.

The analysis is the following:
  • get_interface_ip() is invoked 2x per NAT reflection rule, and each invocation takes ~6.2s in my case
  • in get_interface_ip(), most of the (wall-clock) time is spent in the path get_failover_interface($if) -> return_gateway_groups_array(true)
  • in return_gateway_groups_array() I found that the time was spent like this (see the timing sketch after this list):
    $gateways_status = return_gateways_status(true);  # 3.8s
    ...
    $gateways_arr = return_gateways_array();  # 0.8s
    ...
    $gw4 = lookup_gateway_or_group_by_name($config['gateways']['defaultgw4'], $gways);  # 0.8s
    $gw6 = lookup_gateway_or_group_by_name($config['gateways']['defaultgw6'], $gways);  # 0.8s
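For reference, a minimal sketch of how such per-call timings can be collected; time_call() is a hypothetical throwaway helper of mine (not pfSense code), run from a script that already has gwlb.inc loaded:

    /* Hypothetical timing helper, not part of pfSense -- assumes gwlb.inc
       is already loaded (e.g. when run through the pfSense PHP shell). */
    function time_call($label, callable $fn) {
        $t0 = microtime(true);
        $result = $fn();
        printf("%s: %.3f s\n", $label, microtime(true) - $t0);
        return $result;
    }

    /* Reproduces the breakdown quoted above. */
    $status = time_call("return_gateways_status(true)", function () {
        return return_gateways_status(true);
    });
    $gways = time_call("return_gateways_array()", function () {
        return return_gateways_array();
    });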

For the return_gateways_status() path: it is invoked 2x per reflection rule and does a ping test on the gateways via the dpinger service. AFAIK pfSense should make up its mind once about whether a gateway is working or not before building the filter rules, so I used a simple memoization pattern:

--- /etc/inc/gwlb.inc.orig    2021-02-08 13:44:12.000000000 +0100
+++ /etc/inc/gwlb.inc    2025-05-20 17:51:02.586402000 +0200
@@ -437,8 +437,13 @@

 /* return the status of the dpinger targets as an array */
 function return_gateways_status($byname = false, $gways = false) {
-    global $config, $g;
+    global $config, $g, $gateways_status_memoized;

+    $memo_key = ($byname ? "byname" : "").":".($gways ? "gways" : "");
+    if (isset($gateways_status_memoized[$memo_key])) {
+        return ($gateways_status_memoized[$memo_key]);
+    }
+
     $dpinger_gws = running_dpinger_processes();
     $status = array();

@@ -530,6 +535,7 @@

         $status[$target]['monitor_disable'] = true;
     }
+    $gateways_status_memoized[$memo_key] = $status;
     return($status);
 }
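For what it's worth, the same memoization idea can be written without a new global by using a static local; a minimal sketch (cached_gateways_status() is a hypothetical wrapper, not part of the patch):

    /* Hypothetical wrapper around return_gateways_status() using a static
       cache instead of a global; the cache lives only for the duration of
       the current PHP process, i.e. one filter reload. */
    function cached_gateways_status($byname = false, $gways = false) {
        static $cache = array();

        $key = ($byname ? "byname" : "") . ":" . ($gways ? "gways" : "");
        if (!array_key_exists($key, $cache)) {
            $cache[$key] = return_gateways_status($byname, $gways);
        }
        return $cache[$key];
    }

Either way the cached status only lives for one PHP process, so it cannot go stale across separate filter reloads.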

Then it turns out that lookup_gateway_or_group_by_name() calls return_gateways_array(), so we're left with only 3 calls to return_gateways_array() to optimize. Each follows the chain of 2x calls to route_get_default() -> route_get() -> route_table():

function route_table() {
        $_gb = exec("/usr/bin/netstat --libxo json -nWr", $rawdata, $rc);

On my system this command takes 0.4 s, and it's invoked 6 times per NAT reflection rule (6 x 0.4 s = 2.4 s), so there is our 2.4 s budget to optimize. Same remark as above: I don't think it makes sense for the OS routing table to change while we're reloading the filter, so I memoized it too:

--- /etc/inc/util.inc.orig    2021-02-08 13:44:12.000000000 +0100
+++ /etc/inc/util.inc    2025-05-20 17:37:17.941698000 +0200
@@ -2623,6 +2623,9 @@

 /* Return system's route table */
 function route_table() {
+    global $route_table_memoized;
+    if (isset($route_table_memoized)) return $route_table_memoized;
+
     $_gb = exec("/usr/bin/netstat --libxo json -nWr", $rawdata, $rc);

     if ($rc != 0) {
@@ -2645,6 +2648,7 @@
     }
     unset($netstatarr);

+    $route_table_memoized = $result;
     return $result;
 }
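If a long-running process were ever to reload the filter more than once, the two caches above would have to be cleared between reloads. A hypothetical reset helper (not part of either patch) could look like this:

    /* Hypothetical helper to drop both memoized values before a fresh
       filter reload; the global names match the two patches above.
       Going through $GLOBALS actually removes the globals, whereas
       unset() on a "global"-imported variable would only drop the
       local alias. */
    function clear_filter_reload_caches() {
        unset($GLOBALS['gateways_status_memoized']);
        unset($GLOBALS['route_table_memoized']);
    }

Calling it once at the start of the filter build (e.g. near the top of filter_configure_sync()) would make every reload start from a fresh routing table and gateway status.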
