Bug #8317
closed
Captive Portal Sync Errors
100%
Description
We have a CP setup running at a customer's site that uses a central pfSense VM as the CP voucher sync target, for central management and sync of voucher rolls, plus 3 physical hardware machines. Those 3 machines run the captive portal for guest laptops and WiFi devices. Vouchers work as they should, but after expiration or manual disconnect of a voucher, 2 UI alerts always pop up:
Examples:
CP Log:
Jan 22 11:05:10 logportalauth 8654 Zone: cpzone - DISCONNECT: <USER ID>, , <Client IP>
Sys Log:
Jan 22 11:05:10 php-fpm 8654 /status_captiveportal.php: New alert found: Exception calling XMLRPC method exec_php # String could not be parsed as XML
Jan 22 11:05:10 php-fpm 8654 /status_captiveportal.php: Exception calling XMLRPC method exec_php # String could not be parsed as XML
Jan 22 11:05:10 php-fpm 8654 /status_captiveportal.php: Beginning XMLRPC sync data to http://<MGMT VM IP>:8080/xmlrpc.php.
Jan 22 11:05:10 php-fpm 8654 /status_captiveportal.php: New alert found: Exception calling XMLRPC method exec_php # String could not be parsed as XML
Jan 22 11:05:10 php-fpm 8654 /status_captiveportal.php: Exception calling XMLRPC method exec_php # String could not be parsed as XML
Jan 22 11:05:10 php-fpm 8654 /status_captiveportal.php: Beginning XMLRPC sync data to http://<MGMT VM IP>:8080/xmlrpc.php.
That's from a manual disconnect via the UI. Normal timeouts and housekeeping of the CP also trigger those alerts:
CP Log:
Jan 22 11:07:15 logportalauth 71126 Zone: cpzone - TIMEOUT: ThAQ8T7BQD4, , <Client IP>
Jan 22 11:06:15 logportalauth 40786 Zone: cpzone - TIMEOUT: 6eLAMJTNqBr, , <Client IP>
SysLog:
Jan 22 11:07:15 php-cgi rc.prunecaptiveportal: New alert found: Exception calling XMLRPC method exec_php # String could not be parsed as XML
Jan 22 11:07:15 php-cgi rc.prunecaptiveportal: Exception calling XMLRPC method exec_php # String could not be parsed as XML
Jan 22 11:07:15 php-cgi rc.prunecaptiveportal: Beginning XMLRPC sync data to http://<MGMT VM IP>:8080/xmlrpc.php.
Jan 22 11:07:15 php-cgi rc.prunecaptiveportal: New alert found: Exception calling XMLRPC method exec_php # String could not be parsed as XML
Jan 22 11:07:15 php-cgi rc.prunecaptiveportal: Exception calling XMLRPC method exec_php # String could not be parsed as XML
Jan 22 11:07:15 php-cgi rc.prunecaptiveportal: Beginning XMLRPC sync data to http://<MGMT VM IP>:8080/xmlrpc.php.
Jan 22 11:06:15 php-cgi rc.prunecaptiveportal: New alert found: Exception calling XMLRPC method exec_php # String could not be parsed as XML
Jan 22 11:06:15 php-cgi rc.prunecaptiveportal: Exception calling XMLRPC method exec_php # String could not be parsed as XML
Jan 22 11:06:15 php-cgi rc.prunecaptiveportal: Beginning XMLRPC sync data to http://<MGMT VM IP>:8080/xmlrpc.php.
Jan 22 11:06:15 php-cgi rc.prunecaptiveportal: New alert found: Exception calling XMLRPC method exec_php # String could not be parsed as XML
Jan 22 11:06:15 php-cgi rc.prunecaptiveportal: Exception calling XMLRPC method exec_php # String could not be parsed as XML
Jan 22 11:06:15 php-cgi rc.prunecaptiveportal: Beginning XMLRPC sync data to http://<MGMT VM IP>:8080/xmlrpc.php.
No further information is visible beyond the syslog entries from the PHP processes throwing those XMLRPC errors (2 of them) for every timeout or disconnect on the CP. As those CP instances are quite busy, the customer setup shows thousands of alerts in the UI on those CP hosts. The mgmt VM shows no errors that we are aware of.
Any hint as to why that only happens on those two occasions and what the triggers may be?
Greets
Updated by Jim Pingle over 6 years ago
- Status changed from New to Confirmed
- Target version set to 2.4.3
Updated by Jim Pingle over 6 years ago
- Assignee changed from Jim Pingle to Renato Botelho
Using vouchers with a "central" host is not an approved or supported use of the voucher sync system, but the problem does happen on an HA setup so it's still a legitimate bug.
This can be reproduced by having an HA cluster setup with voucher sync, then manually fail over to the secondary and have a client use a voucher to login. Disconnecting the voucher user from status_captiveportal.php will show the error.
The notifications are a result of the calls to pfSense_ipfw_table() and pfSense_ipfw_pipe() in captiveportal_disconnect() on the XMLRPC server (not client).
Because the XMLRPC server (in most cases, the primary) does not have the entries in its IPFW tables and the referenced DUMMYNET pipes do not exist, those function calls produce errors which are apparently sent back to the XMLRPC client as raw text and not XML.
Even using @ before the functions does not suppress the errors; they are still present in the output:
Failed setsockopt
Failed setsockopt
rule 1: setsockopt(IP_DUMMYNET_DEL)
rule 1: setsockopt(IP_DUMMYNET_DEL)
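The failure mode described above can be illustrated outside pfSense. The following is a minimal Python sketch, not the pfSense code: it assumes the raw error text lands in the response body ahead of the XML payload (the exact position is a guess), and the helper name parses_as_xmlrpc is hypothetical. It only shows why a client-side XML parser would reject such a body.

```python
import xmlrpc.client
from xml.parsers.expat import ExpatError

# A well-formed XML-RPC response, as the client expects it.
clean = xmlrpc.client.dumps((True,), methodresponse=True)

# Assumption: the raw error text is emitted before the XML payload.
# The strings are taken from the output quoted in this ticket.
stray = "Failed setsockopt" + "rule 1: setsockopt(IP_DUMMYNET_DEL)"
polluted = stray + clean


def parses_as_xmlrpc(body: str) -> bool:
    """Return True if body can be parsed as an XML-RPC response."""
    try:
        xmlrpc.client.loads(body)
        return True
    except ExpatError:
        # The XML parser rejects text outside the root element.
        return False


print(parses_as_xmlrpc(clean))     # True
print(parses_as_xmlrpc(polluted))  # False
```

Any non-XML text in the body is enough to make the whole response unparseable, which would be consistent with the "String could not be parsed as XML" alert appearing on the client even though the sync call itself reached the server.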
Updated by Jens Groh over 6 years ago
Jim Pingle wrote:
Using vouchers with a "central" host is not an approved or supported use of the voucher sync system, but the problem does happen on an HA setup so it's still a legitimate bug.
This can be reproduced by having an HA cluster setup with voucher sync, then manually fail over to the secondary and have a client use a voucher to login. Disconnecting the voucher user from status_captiveportal.php will show the error.
Hi Jim,
Not to argue with the supported "use", but if the CP has a separate sync available for voucher rolls, isn't it normal that it will get used that way and that some sort of slave(s) will get synced from it? As the "default" sync for the portal doesn't seem to include the voucher area (see my other ticket about the voucher sync https://redmine.pfsense.org/issues/8280), it seems only logical to me.
I didn't think about the CARP setup until I read your reply but that matches another problem with another customer installation we have out there that could be explained that way, so thanks for testing!
As I was discussing CP features with a customer rolling out a bigger multi-location installation that also wants to sync the rolls with the central HQ, even if it's not approved/supported use, I see that "feature" or usage as a much requested one in the wild. So perhaps it can become an official feature or use case in the future :)
Thanks and Greets from Germany,
Jens
Updated by Jim Pingle over 6 years ago
That's not a discussion relevant to the bug, but in brief: The ONLY supported use of XMLRPC sync in any area is for HA. It may work for other use cases by luck, but not by design.
Updated by Renato Botelho over 6 years ago
- Status changed from Confirmed to Feedback
- % Done changed from 0 to 100
Applied in changeset 356f29a03f7a7c770cbd8492c20347f615e3fdd7.
Updated by Jim Pingle over 6 years ago
- Status changed from Feedback to Resolved
Works on current snaps.