Bug #15612: Captive Portal with big number of passthrough MAC addresses is causing webgui gateway timeouts, Error 50x, and HA-sync XMLRPC Error - pfSense - pfSense bugtracker

Bug #15612

closed

Captive Portal with big number of passthrough MAC addresses is causing webgui gateway timeouts, Error 50x, and HA-sync XMLRPC Error

Added by Thomas Hohm almost 2 years ago. Updated 2 months ago.

Status: Closed
Priority: High
Assignee:
Category: Captive Portal
Target version:
Start date:
Due date:
% Done:
Estimated time:
Plus Target Version:
Release Notes: Default
Affected Version:
Affected Architecture:

Description

Forum discussion:
https://forum.netgate.com/topic/188936/captive-portal-with-big-number-of-passththrough-mac-addresses-is-causing-webgui-gateway-timeouts-error-50x-and-ha-sync-xmlrpc-error-broken-or-quantity-limitations/8

History

Updated by Thomas Hohm almost 2 years ago

Sorry, I submitted this by accident without details. Here they are:

The problematic behaviours:

1. Editing firewall rules: saving edited firewall rules takes a long time to complete, and we often get an nginx gateway timeout during saving.
2. Editing a captive portal zone: when we edit the zone with the high number of passthrough MAC addresses, saving takes a very long time and causes a 50x error. The crash reporter does not show any error (see output below); the syslog shows an "upstream timed out" message (see below).
3. HA sync fails with the XMLRPC default socket timeout (see below).

In some cases the web UI becomes accessible again after a few minutes; in other cases I have to use the SSH CLI menu to restart php-fpm to make the web UI accessible again.

Crash Reporter:

Crash report begins.  Anonymous machine information:

amd64
15.0-CURRENT
FreeBSD 15.0-CURRENT #0 plus-RELENG_24_03-n256311-e71f834dd81: Fri Apr 19 00:28:14 UTC 2024     root@freebsd:/var/jenkins/workspace/pfSense-Plus-snapshots-24_03-main/obj/amd64/Y4MAEJ2R/var/jenkins/workspace/pfSense-Plus-snapshots-24_03-main/sources/FreeBS

Crash report details:

No PHP errors found.

No FreeBSD crash data found.

XMLRPC alert:

A communications error occurred while attempting to call XMLRPC method restore_config_section: Request timed out due to default_socket_timeout php.ini setting @ 2024-06-26 11:47:28

Syslog entry:

2024/06/28 08:07:14 [error] 18824#101717: *3816 upstream timed out (60: Operation timed out) while reading response header from upstream, client: 10.10.100.11, server: , request: "POST /services_captiveportal.php?zone=mconweb_premium HTTP/2.0", upstream: "fastcgi://unix:/var/run/php-fpm.socket", host: "10.10.100.64:8080", referrer: "https://10.10.100.64:8080/services_captiveportal.php?zone=mconweb_premium"
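
When checking how often this happens and which zone is affected, it can help to pull the relevant fields out of such nginx error lines. The following is a minimal sketch, not part of pfSense: the regular expression is derived only from the single log line above and may need adjusting for other nginx log formats, and the function name `parse_upstream_timeout` is hypothetical.

```python
import re
from urllib.parse import urlsplit, parse_qs

# Pattern modelled on the one "upstream timed out" line quoted in this report.
# Assumption: other nginx error lines of this kind follow the same field order.
UPSTREAM_TIMEOUT = re.compile(
    r'^(?P<ts>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) \[error\] .*?'
    r'upstream timed out .*?'
    r'client: (?P<client>[^,]+), .*?'
    r'request: "(?P<method>[A-Z]+) (?P<path>\S+) [^"]*", '
    r'upstream: "(?P<upstream>[^"]+)"'
)


def parse_upstream_timeout(line):
    """Return timestamp, client, request and zone for a timeout line, else None."""
    match = UPSTREAM_TIMEOUT.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    # The captive portal zone is carried in the query string of the request path.
    query = parse_qs(urlsplit(fields["path"]).query)
    fields["zone"] = query.get("zone", [None])[0]
    return fields


sample = (
    '2024/06/28 08:07:14 [error] 18824#101717: *3816 upstream timed out '
    '(60: Operation timed out) while reading response header from upstream, '
    'client: 10.10.100.11, server: , request: "POST '
    '/services_captiveportal.php?zone=mconweb_premium HTTP/2.0", upstream: '
    '"fastcgi://unix:/var/run/php-fpm.socket", host: "10.10.100.64:8080", '
    'referrer: "https://10.10.100.64:8080/services_captiveportal.php?zone=mconweb_premium"'
)

result = parse_upstream_timeout(sample)
print(result["ts"], result["method"], result["zone"], result["upstream"])
```

Lines that do not match the pattern return `None`, so the function can be applied to a whole log file and the non-matching lines skipped.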

- the behaviour is the same in version 23.05 (tested) and in at least one version prior to it (as far as I can remember)
- we are using an HA cluster of 2x Netgate 1537 with 32 GB RAM and a 500 GB SSD each
- we have 600+ MAC addresses in the captive portal zone for automatic passthrough. The problems do not occur below 100 addresses.
- we have 2 captive portal zones in total, one with 600+ MAC addresses, the other with 0 MAC addresses
- we are not using captive portal vouchers (we are using RADIUS authentication with a RADIUS server on a separate non-pfSense system)
- captive portal zones are included in the HA XMLRPC sync settings
- usually we have 1000+ users logged in to the captive portal
- as soon as we delete the captive portal zone, all problems are gone
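
Since the problem appears to depend on the number of passthrough MAC entries per zone, counting them in an exported configuration can help compare systems. The sketch below is illustrative only. It assumes that each zone is a child element of `<captiveportal>` and that each passthrough MAC entry is a `<passthrumac>` element inside the zone; this layout and the sample zone names are assumptions and should be checked against a real export before use.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed configuration export used only for illustration.
# Assumption: zones are children of <captiveportal>, entries are <passthrumac>.
SAMPLE = """<pfsense>
  <captiveportal>
    <zone_a>
      <passthrumac><mac>02:00:00:00:00:01</mac></passthrumac>
      <passthrumac><mac>02:00:00:00:00:02</mac></passthrumac>
    </zone_a>
    <zone_b/>
  </captiveportal>
</pfsense>"""


def count_passthrough_macs(xml_text):
    """Return a dict mapping each zone name to its number of passthrough MAC entries."""
    root = ET.fromstring(xml_text)
    portal = root.find("captiveportal")
    if portal is None:
        return {}
    return {zone.tag: len(zone.findall("passthrumac")) for zone in portal}


counts = count_passthrough_macs(SAMPLE)
print(counts)
```

For a real export, the file contents would be read and passed to `count_passthrough_macs` in place of `SAMPLE`.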

Updated by Thomas Hohm almost 2 years ago

Addition:
- even excluding captive portal from the XMLRPC HA sync does not fix the problem.
- I can also export the captive portal settings to XML and import them into a freshly installed system. Even during the import, the web UI responds with error 50x or an nginx gateway timeout (I have seen both, possibly due to different behaviour between 24.03 and 23.05).

Updated by Karl Ruskowski almost 2 years ago

We have been seeing a similar problem.

Main XMLRPC Error:

A communications error occurred while attempting to call XMLRPC method captive_portal_sync: Unable to connect to tls://172.16.1.252:4444. Error: Operation timed out @ 2024-08-22 07:54:18

Syslog:

Aug  2 11:33:44 pfSense01 php-fpm[45974]: /rc.carpmaster: A communications error occurred while attempting to call XMLRPC method captive_portal_sync: Unable to connect to tls://172.16.1.252:4444. Error: Operation timed out

2x Netgate hardware, version 23.09.1-RELEASE on both.
Any change in the configuration results in many of these error messages.
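
An "Unable to connect ... Operation timed out" error like the one above means the sync request never reached the peer's web interface at all, which is a different failure from a request that connects and then times out while waiting for a response. A quick TCP-level check from the primary can tell the two apart. This is a minimal sketch under stated assumptions: the helper name `peer_reachable` is hypothetical, and the check only confirms that the port accepts TCP connections; it does not verify TLS, credentials, or the XMLRPC endpoint itself.

```python
import socket


def peer_reachable(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections and unreachable networks.
        return False


# Example call using the address and port from the error message above:
#   peer_reachable("172.16.1.252", 4444)
```

If this returns `False`, the sync address, port, and protocol configured on both nodes are worth comparing before looking at the captive portal configuration.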

Updated by Karl Ruskowski over 1 year ago

I was able to solve our problem. On closer inspection, our firewalls were not syncing at all. I set the same options on both firewalls under System -> advanced settings -> Webconfigurator, and the sync began working again.

Updated by Danilo Zrenjanin over 1 year ago

Priority changed from Normal to High

I successfully replicated the observed behavior. Both High Availability (HA) nodes were operating on the 24.03 release. Initially, there were two zones with a total of 345 MAC address pass-through entries. The XML-RPC was failing, as indicated by the following logs:

Nov 21 15:12:35 php-fpm 4777 /rc.filter_synchronize: Retrying XMLRPC Request due to error: A communications error occurred while attempting to call XMLRPC method host_firmware_version: Request timed out due to default_socket_timeout php.ini setting.

Upon removing the second zone, which contained 88 entries, the XML-RPC functioned without issues. It is noteworthy that the firewall had no additional packages installed and was configured with only two interfaces during the testing phase.
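
To reproduce with a comparable number of entries, a list of unique test MAC addresses is needed. The following sketch generates one; it is not the procedure used in the test above, which the report does not describe. The function name `generate_test_macs` is hypothetical, and the addresses use a first octet of `02`, which marks them as locally administered unicast addresses so they should not collide with vendor-assigned hardware addresses.

```python
def generate_test_macs(count, first_octet=0x02):
    """Return `count` unique MAC address strings for test purposes.

    first_octet=0x02 sets the locally administered bit and leaves the
    multicast bit clear, so the results are valid unicast test addresses.
    """
    macs = []
    for i in range(count):
        # Spread the counter across the last three octets (up to 2**24 entries).
        octets = [first_octet, 0, 0, (i >> 16) & 0xFF, (i >> 8) & 0xFF, i & 0xFF]
        macs.append(":".join(f"{octet:02x}" for octet in octets))
    return macs


macs = generate_test_macs(345)
print(len(macs), macs[0], macs[-1])
```

How the generated addresses are then entered into a zone (GUI, import, or otherwise) is left open here.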

Updated by Marcos M over 1 year ago

Project changed from pfSense Plus to pfSense
Category changed from Captive Portal to Captive Portal
Affected Plus Version deleted (24.03)

Updated by Timo C over 1 year ago

Subject: Ongoing Issues with pfSense+ Following Update
Hello,
We are still encountering the same issues exclusively with pfSense+. Has there been any progress or changes on this matter? The project was migrated from pfSense Plus to pfSense.
Recently, we updated from 24.03 to 24.11-RELEASE (amd64), built on Fri Nov 22, 05:34:00 CET 2024. However, the update continues to cause significant disruptions to the GUI, with erratic behavior persisting.
Additionally, we've observed that one Phase 2 IKEv2 tunnel is no longer syncing properly via HA, which is particularly concerning.
Could you let us know if a fix is in the works or if there's a timeline for a resolution?
Looking forward to your response.
Kind regards,
Timo

Updated by Danilo Zrenjanin 4 months ago

Tested against:

25.11.1-RELEASE (amd64)
built on Tue Jan 27 20:33:00 UTC 2026
FreeBSD 16.0-CURRENT

I was unable to reproduce the reported behavior on this release. Testing was performed with 400 MAC address pass-through entries configured. XML-RPC reported no errors when applying changes on the primary system, and the configuration replicated successfully to the secondary without any issues.

Updated by Marcos M 2 months ago

Status changed from New to Closed
