Bug #7396

Stopping and then starting again the load balancer clears out system tables (Bogons, sshlockout, aliases...)

Added by Julien Petit 4 months ago. Updated 4 months ago.

Status: Resolved
Priority: High
Assignee:
Category: Load Balancer
Target version:
Start date: 03/15/2017
Due date:
% Done: 100%
Affected version: 2.3.3
Affected Architecture: All

Description

Hi there :)

This is reproducible on a brand-new 64-bit pfSense 2.3.3 or 2.3.3_1 install with the following simple load balancer configuration.
Note that this doesn't happen on reload or when the system starts. You have to stop the service from the webConfigurator and then start it again.
I noticed this because the system was locking me out of pfSense at every load balancer start (we use an alias table to open remote access only to given hosts).

<load_balancer>
    <virtual_server>
        <name>httpLB</name>
        <descr></descr>
        <poolname>httpWebPool</poolname>
        <port>80</port>
        <ipaddr>192.168.10.254</ipaddr>
        <mode>redirect_mode</mode>
        <relay_protocol>tcp</relay_protocol>
    </virtual_server>
    <lbpool>
        <name>httpWebPool</name>
        <mode>loadbalance</mode>
        <descr></descr>
        <port>80</port>
        <retry></retry>
        <servers>192.168.50.1</servers>
        <servers>192.168.50.2</servers>
        <serversdisabled></serversdisabled>
        <monitor>HTTP</monitor>
    </lbpool>
</load_balancer>

Associated revisions

Revision 803ca43a
Added by Jim Pingle 4 months ago

Perform a filter reload after starting relayd so it does not leave the firewall without pf tables. Fixes #7396

Revision a8014f46
Added by Jim Pingle 4 months ago

Perform a filter reload after starting relayd so it does not leave the firewall without pf tables. Fixes #7396

Revision 9433cda2
Added by Jim Pingle 4 months ago

Perform a filter reload after starting relayd so it does not leave the firewall without pf tables. Fixes #7396

Revision 31b1f1e1
Added by Jim Pingle 4 months ago

Don't process empty anchors as it could lead to flushing more than intended when cleaning up after relayd. Fixes #7396

Revision 3480105f
Added by Jim Pingle 4 months ago

Don't process empty anchors as it could lead to flushing more than intended when cleaning up after relayd. Fixes #7396

Revision 0d40b2cb
Added by Jim Pingle 4 months ago

Don't process empty anchors as it could lead to flushing more than intended when cleaning up after relayd. Fixes #7396
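
The guard described by the second change (31b1f1e1 and its sibling revisions) can be illustrated with a minimal sketch. This is not the pfSense code, which is PHP and is not shown in this ticket. The function name build_anchor_flush_commands and the anchor name relayd/httpLB are hypothetical, and the choice of "-F all" as the flush modifier is an assumption. The only point is that an empty anchor name is skipped, so no flush is issued without a specific anchor to scope it.

```python
from typing import Iterable, List


def build_anchor_flush_commands(anchors: Iterable[str]) -> List[List[str]]:
    """Return one pfctl flush command per non-empty anchor name.

    Hypothetical helper for illustration only; not taken from pfSense.
    """
    commands = []
    for anchor in anchors:
        name = anchor.strip()
        if not name:
            # Guard: without an anchor name the flush would not be scoped
            # to a relayd anchor and could clear more than intended.
            continue
        # Assumed modifier: "-F all" flushes everything inside the named anchor.
        commands.append(["pfctl", "-a", name, "-F", "all"])
    return commands


if __name__ == "__main__":
    # The empty and whitespace-only names are dropped; one command is printed.
    for cmd in build_anchor_flush_commands(["relayd/httpLB", "", "  "]):
        print(" ".join(cmd))
```

The sketch only builds the command lists and does not run pfctl, so it can be inspected safely on any machine.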

History

#1 Updated by Jim Pingle 4 months ago

  • Status changed from New to Confirmed
  • Target version set to 2.3.4
  • Affected Architecture changed from amd64 to All

Also affects 2.4.x.

That is not the usual way to operate relayd, however. Normally you would not need to stop/start it. Or you could stop it there, but start it again by edit/save/apply on one of the load balancer tabs, which does not negatively impact pf tables.

#2 Updated by Jim Pingle 4 months ago

  • Assignee set to Jim Pingle

To me, I have a fix pushed.

#3 Updated by Jim Pingle 4 months ago

  • Status changed from Confirmed to Feedback
  • % Done changed from 0 to 100

#4 Updated by Julien Petit 4 months ago

Jim Pingle wrote:

That is not the usual way to operate relayd, however. Normally you would not need to stop/start it.

In our case, we set up the load balancer before our web servers are completely configured. If we do not stop the load balancer, monitoring generates errors on our web servers. It's not a big problem once you know that you shouldn't press the start button. But it's rather confusing to have a start button and not be able to use it without breaking the firewall configuration.

Or you could stop it there, but start it again by edit/save/apply on one of the load balancer tabs, which does not negatively impact pf tables.

From the user's perspective (without technical knowledge of relayd or pfSense), it's difficult to guess that the edit/save/apply buttons would have a different impact than the start button.

To me, I have a fix pushed.

I've tried to apply your fix in /etc/inc/service-utils.inc via the "Edit File" tab, but it seems it's still not working on 2.3.3-RELEASE-p1. Is there something else to patch?

Thanks anyway :)

#5 Updated by Jim Pingle 4 months ago

Nothing else should be required but the changes made in the patch.

I can reproduce the problem without that fix applied, and with the fix applied I can't reproduce the problem.

If you still have a problem, consider using the HAProxy package instead of relayd as it is a much more full-featured proxy solution that is not so closely tied up with pf that it has these sorts of issues.

#6 Updated by Julien Petit 4 months ago

Jim Pingle wrote:

Nothing else should be required but the changes made in the patch.

I can reproduce the problem without that fix applied, and with the fix applied I can't reproduce the problem.

OK, it seems to work, but it appears to depend on a cron job because the tables are not restored straight away.
Chrome gives the impression that all is well, probably because its existing state still allows it to stay connected to pfSense, but Firefox breaks straight away until the tables are restored. I discovered this by checking the table contents in the console after each action with pfctl -t Trusted -T show. After relayd is started, the table content disappears, but after some time it gets restored. Can you confirm this?

If you still have a problem, consider using the HAProxy package instead of relayd as it is a much more full-featured proxy solution that is not so closely tied up with pf that it has these sorts of issues.

I know HAProxy is a very good choice too, but I like the simplicity of relayd and the fact that it is embedded in pfSense :)

#7 Updated by Jim Pingle 4 months ago

I couldn't reproduce that but it gave me another idea of where to look for problems. I'll have another fix pushed here in a few moments, give that one a try. It should work even without the first fix, but I'd feel safer with both around.

#8 Updated by Julien Petit 4 months ago

Note that with your patch, tables are not deleted like before. Only our alias table "Trusted" is emptied. Without your patch, even system tables (sshlockout...) were deleted (not only emptied). That might be why you can't reproduce.
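
The difference between a table being deleted and a table being emptied can be made explicit by comparing two snapshots of the pf tables. The sketch below is a rough illustration: the function name compare_table_snapshots is hypothetical, and the sample addresses are made up to show both cases side by side. On a real system the snapshots could be gathered by hand with pfctl -sT (list tables) and pfctl -t <name> -T show (list entries).

```python
from typing import Dict, List, Set, Tuple


def compare_table_snapshots(
    before: Dict[str, Set[str]], after: Dict[str, Set[str]]
) -> Tuple[List[str], List[str]]:
    """Return (deleted, emptied) table names, each sorted.

    deleted: tables present before but missing afterwards.
    emptied: tables that still exist but lost all of their entries.
    """
    deleted = sorted(name for name in before if name not in after)
    emptied = sorted(
        name
        for name, entries in before.items()
        if entries and name in after and not after[name]
    )
    return deleted, emptied


if __name__ == "__main__":
    # Made-up sample data: one table vanishes, one is emptied, one is intact.
    before = {
        "Trusted": {"192.0.2.10"},
        "sshlockout": set(),
        "bogons": {"198.51.100.0/24"},
    }
    after = {
        "Trusted": set(),
        "bogons": {"198.51.100.0/24"},
    }
    deleted, emptied = compare_table_snapshots(before, after)
    print("deleted:", deleted)
    print("emptied:", emptied)
```

A table that was already empty before the restart is not reported as emptied, which keeps the report limited to entries that were actually lost.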

#9 Updated by Jim Pingle 4 months ago

OK, try the later change here on the ticket now ( 31b1f1e1 )

#10 Updated by Julien Petit 4 months ago

Jim Pingle wrote:

OK, try the later change here on the ticket now ( 31b1f1e1 )

This is all good now! Thanks :)

#11 Updated by Jim Pingle 4 months ago

  • Status changed from Feedback to Resolved

Great, thanks for testing!
