Regression #13418
Closed
Captive Portal does not keep track of client data usage
Added by Dale Harron over 2 years ago. Updated almost 2 years ago.
100%
Description
- pfSense+ 22.05
- Configure Captive Portal on VLAN interface
- Use FreeRADIUS auth backend
- Check Reauthenticate Users, Session timeout, Traffic quota, Per-user bandwidth restrictions
- Enable Send RADIUS accounting packets, select Stop/Start (FreeRADIUS) or Interim
- used-octets remain at 0 bytes or at the pre-upgrade value in /var/log/radacct/datacounter/forever, and 0 bytes are incremented on the quota in the authentication log every minute
- The FreeRADIUS log updates "connect time" correctly, but all data values are 0.
- Working on 22.01 with patch 12834
- Broken on 22.05
- Broken on 2.7
Updated by Gertjan KROEB over 2 years ago
I've posted the same (?) conclusion in the forum: "FreeRadius and quotas, doesn't work since 22.05"
Updated by Dale Harron about 2 years ago
I would also like to point out that this issue is not solely related to FreeRADIUS accounting packets; it also affects all VLAN-based Captive Portals that have traffic quotas. Under Services > Captive Portal, click on the VLAN portal:
Traffic quota (Megabytes)
100 (for example)
Clients will be disconnected after exceeding this amount of traffic, inclusive of both downloads and uploads. They may log in again immediately, though. Leave this field blank for no traffic quota.
This quota also stays at zero and is not enforced. Bandwidth and time-related constraints are still working, though.
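The quota rule quoted above (disconnect once downloads plus uploads exceed the configured number of megabytes) can be sketched as follows. This is an illustration only, not pfSense's code: the function name is hypothetical, and treating one megabyte as 1024 * 1024 bytes is an assumption.

```python
from typing import Optional

BYTES_PER_MEGABYTE = 1024 * 1024  # assumption: binary megabytes


def quota_exceeded(input_bytes: int, output_bytes: int,
                   quota_megabytes: Optional[int]) -> bool:
    """Return True when combined traffic exceeds the quota.

    A quota of None models a blank field, meaning no traffic quota.
    """
    if quota_megabytes is None:
        return False
    return input_bytes + output_bytes > quota_megabytes * BYTES_PER_MEGABYTE


# With both counters stuck at zero, as reported here, the check never fires.
print(quota_exceeded(0, 0, 100))                        # False
print(quota_exceeded(60 * 1024**2, 50 * 1024**2, 100))  # True
```

If the byte counters never move, no quota value can trigger a disconnect, which matches the symptom described.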
Updated by Marcos M about 2 years ago
Please test the attached patch with the System Patches package on pfSense+ 22.05.
Updated by Marcos M about 2 years ago
- Subject changed from FreeRadius Accounting zeros data or fails to collect data in 22.05, worked in 22.02 to Captive Portal does not keep track of client data usage
- Description updated (diff)
- Status changed from New to Confirmed
- Assignee set to Marcos M
- Priority changed from Urgent to Normal
- Target version set to 2.7.0
- Plus Target Version set to 22.11
Updating issue for clarification.
Updated by Marcos M about 2 years ago
- File 13418.patch added
- Status changed from Confirmed to Pull Request Review
Tested patch on 22.05 and reported issues are resolved.
https://gitlab.netgate.com/pfSense/pfSense/-/merge_requests/850
Updated by Marcos M about 2 years ago
- Status changed from Feedback to Pull Request Review
Updated by Marcos M about 2 years ago
- Status changed from Feedback to Pull Request Review
Additional fix:
https://gitlab.netgate.com/pfSense/pfSense/-/merge_requests/863
https://gitlab.netgate.com/pfSense/FreeBSD-ports/-/merge_requests/263
This should address the issue noted here: https://forum.netgate.com/post/1059435
Edit: Merged
Updated by Marcos M about 2 years ago
- Status changed from Pull Request Review to Feedback
Updated by Marcos M about 2 years ago
- % Done changed from 0 to 100
Applied in changeset af044b67492c936eda0ef009fe713a29ec4deefb.
Updated by Jim Pingle about 2 years ago
- Plus Target Version changed from 22.11 to 23.01
Updated by Chris Linstruth almost 2 years ago
Counters still zero
print_r(pfSense_pf_cp_get_eth_rule_counters("cpzoneid_2_auth/172.25.235.130_32"));
Array
(
    [0] => Array
        (
            [0] => 1
            [1] => 0
            [2] => 0
        )
    [1] => Array
        (
            [0] => 2
            [1] => 0
            [2] => 0
        )
)

pfctl -vvse -a cpzoneid_2_auth/172.25.235.130_32
@0 ether pass in quick proto 0x0800 from a8:20:66:2b:f0:d2 l3 from 172.25.235.130 to any tag cpzoneid_2_auth dnpipe 2002
  [ Evaluations: 19232 Packets: 4298 Bytes: 477032 ]
  [ Last Active Time: Sun Dec 11 21:53:56 2022 ]
@1 ether pass out quick proto 0x0800 to a8:20:66:2b:f0:d2 l3 from any to 172.25.235.130 tag cpzoneid_2_auth dnpipe 2003
  [ Evaluations: 14914 Packets: 13515 Bytes: 18008092 ]
  [ Last Active Time: Sun Dec 11 21:53:56 2022 ]
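To compare the two views, the byte counts can be pulled out of the pfctl output and summed. A minimal sketch, assuming the bracketed Evaluations/Packets/Bytes layout shown above; the helper name is made up for illustration.

```python
import re

# Matches the counter block printed under each rule by `pfctl -vvse`,
# in the layout shown in this comment.
COUNTER_RE = re.compile(
    r"\[\s*Evaluations:\s*(\d+)\s+Packets:\s*(\d+)\s+Bytes:\s*(\d+)\s*\]"
)


def total_rule_bytes(pfctl_output: str) -> int:
    """Sum the Bytes field over every rule found in the text."""
    return sum(int(m.group(3)) for m in COUNTER_RE.finditer(pfctl_output))


sample = """
@0 ether pass in quick proto 0x0800 from a8:20:66:2b:f0:d2 l3 from 172.25.235.130 to any tag cpzoneid_2_auth dnpipe 2002
  [ Evaluations: 19232 Packets: 4298 Bytes: 477032 ]
@1 ether pass out quick proto 0x0800 to a8:20:66:2b:f0:d2 l3 from any to 172.25.235.130 tag cpzoneid_2_auth dnpipe 2003
  [ Evaluations: 14914 Packets: 13515 Bytes: 18008092 ]
"""

print(total_rule_bytes(sample))  # 18485124
```

The rules themselves have counted roughly 18 MB while the PHP call returns zeros, so the loss happens between pf and the PHP module rather than in pf.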
Updated by Marcos M almost 2 years ago
- Status changed from Feedback to In Progress
Thank you for testing - there looks to be a type casting issue in php-pfSense-module.
https://gitlab.netgate.com/pfSense/FreeBSD-ports/-/merge_requests/302
Updated by Marcos M almost 2 years ago
- Status changed from In Progress to Pull Request Review
Updated by Reid Linnemann almost 2 years ago
- Assignee changed from Marcos M to Reid Linnemann
Updated by Reid Linnemann almost 2 years ago
- Status changed from Pull Request Review to Feedback
Applied in changeset c1bc55a9f37e5977110a3bb1f170321738fdf3d2.
Updated by Reid Linnemann almost 2 years ago
PF_IN/PF_OUT direction was mismatched with the array index into the counters that we sampled. This should be fixed in the next build.
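The class of bug described here can be illustrated with a small sketch. It is hypothetical and is not the code from php-pfSense-module: the row layout and function names are invented for illustration. It assumes pf's direction constants PF_IN = 1 and PF_OUT = 2, which match the direction values in the counter output earlier in this issue.

```python
PF_IN, PF_OUT = 1, 2  # pf direction constants (PF_INOUT is 0)

# Hypothetical sampled counters, one row per rule: (direction, packets, bytes).
rows = [
    (PF_IN, 4298, 477032),
    (PF_OUT, 13515, 18008092),
]


def bytes_by_position(rows, direction):
    """Buggy: uses the direction constant as a list index."""
    # PF_IN (1) lands on the outbound row and PF_OUT (2) is out of range.
    return rows[direction][2] if direction < len(rows) else 0


def bytes_by_direction(rows, direction):
    """Correct: match on the direction stored in each row."""
    return sum(b for d, _pkts, b in rows if d == direction)


print(bytes_by_position(rows, PF_IN), bytes_by_position(rows, PF_OUT))    # 18008092 0
print(bytes_by_direction(rows, PF_IN), bytes_by_direction(rows, PF_OUT))  # 477032 18008092
```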
Updated by Dale Harron almost 2 years ago
Some success: data is now being passed to FreeRADIUS, but with the following issues (steady-state stream of 33 MB/minute, single login per user, Dec 30 0600 beta release):
1. If Stop/Start (FreeRADIUS) is set for accounting in the captive portal, with or without "Reauthenticate connected users every minute", the data value increments at 4-9x the true rate, with the multiplier appearing to depend on time: it starts out at 1x, then 2x, then 4x, and at around 10 minutes it has logged 9x the actual data throughput. It is as if the interim accounting interval were still present and the value were being cumulated rather than incremented. The temporary used-octets-user-mac file is not used in this mode; all data accumulates in the single used-octets-user file in /var/log/radacct/datacounter/forever (or day, etc., as applicable).
2. If Interim is selected for accounting in the captive portal, with or without "Reauthenticate connected users every minute", the data value accumulated in the temporary, per-connected-user used-octets-user-mac file very closely matches the actual data used (FreeRADIUS interim intervals of 61 and the default 600 seconds tested). As always, the used-octets-user file is not updated until after the user logs out. With multiple logins per user, this increases overall accuracy as well, but only once everyone has logged out. I have not tested whether, in this release, the captive portal with FreeRADIUS authentication correctly logs all users out when the data limit is reached, but in the past it did NOT. That is why Stop/Start (FreeRADIUS) is preferred over Interim in our case.
3. "Reauthenticate connected users every minute" affected the system log contents but did not affect the above results.
4. I have not checked the captive portal data limit based on local user database authentication yet. 22.05 was not consistently enforcing it.
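The "cumulated rather than incremented" suspicion in item 1 can be illustrated with a short simulation. This is a sketch of the hypothesized failure mode only, not a confirmed diagnosis: it assumes a steady 33 MB per minute and a receiver that adds every reported value to its stored total. The multipliers it produces are not the same as the 4-9x observed above.

```python
RATE_MB_PER_MIN = 33  # steady-state rate from the test above


def stored_total(minutes: int, report_cumulative: bool) -> int:
    """Total the receiver ends up with after `minutes` one-minute updates."""
    stored = 0
    session_total = 0
    for _ in range(minutes):
        session_total += RATE_MB_PER_MIN
        # The receiver always adds what it is sent. Sending the session
        # total instead of the per-minute increment double counts.
        stored += session_total if report_cumulative else RATE_MB_PER_MIN
    return stored


for minutes in (1, 3, 10):
    actual = stored_total(minutes, report_cumulative=False)
    inflated = stored_total(minutes, report_cumulative=True)
    print(minutes, actual, inflated, inflated / actual)
# 1 33 33 1.0
# 3 99 198 2.0
# 10 330 1815 5.5
```

In this model the overcount grows with session length, which is the qualitative pattern reported: accurate at first, then increasingly inflated.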
I am not sure whether you intended Stop/Start (FreeRADIUS) to accurately log data quotas (as with multiple logins per user), but it certainly makes implementation easier if it does. pfSense correctly logs out simultaneous sessions on a user account when the single used-octets-user file maintained under Stop/Start (FreeRADIUS) contains the total of all users' data consumption on an ongoing basis, to the nearest minute. I have been using that fact to insert an over-quota value into used-octets-user when we estimated that the data quota had been exceeded (using darkstat data) while Redmine #13418 remains unresolved. The over-reporting of data consumption described above will thus log them out prematurely.
pfSense does not appear to consolidate the used-octets-user-mac files (multiple, or even just one) into the single used-octets-user file when determining data consumption; it checks only the single used-octets-user file for data quota compliance. So unless the users associated with the used-octets-user-mac files log out, pfSense will not enforce a data quota under the captive portal Interim setting: the data use accumulates in the user-specific used-octets-user-mac file until logout, when its contents are summed with the current contents of the single used-octets-user file that pfSense checks as part of its FreeRADIUS accounting support. This is not new, but in the absence of an accurate Stop/Start (FreeRADIUS) data log, a data quota can't be enforced in this setting until the cumulative data usage of logged-out users exceeds the quota. Users who are not logged out are not included in the evaluation.
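The gap described above can be sketched as follows. This is an illustration under assumptions, not pfSense or FreeRADIUS code: the variables stand in for the used-octets-user file and the per-device used-octets-user-mac files, and the function names and figures are hypothetical.

```python
def over_quota_logged_out_only(user_total: int, quota: int) -> bool:
    """Check only the consolidated per-user total (logged-out sessions)."""
    return user_total >= quota


def over_quota_including_active(user_total: int,
                                active_sessions: dict,
                                quota: int) -> bool:
    """Also count usage still held in per-device session files."""
    return user_total + sum(active_sessions.values()) >= quota


quota = 1_000_000_000       # example quota in bytes
user_total = 200_000_000    # usage from sessions that already logged out
active = {                  # usage of devices still logged in, keyed by MAC
    "aa:bb:cc:00:00:01": 500_000_000,
    "aa:bb:cc:00:00:02": 400_000_000,
}

print(over_quota_logged_out_only(user_total, quota))           # False
print(over_quota_including_active(user_total, active, quota))  # True
```

With the first check, the two active devices can keep consuming data indefinitely; the quota only takes effect once one of them logs out and its usage is merged.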
Updated by Dale Harron almost 2 years ago
More extended testing demonstrates a NEW issue (see point 2 above for the configuration as tested): premature captive portal logout of a FreeRADIUS-authenticated user on a user account where simultaneous logins are permitted (likely also true for single-login FreeRADIUS-authenticated users, but not tested).
With a test case of 2 simultaneous device logins to a FreeRADIUS user account, and monitoring the data logged in the device-specific used-octets-user-mac files, data is logged/incremented roughly every minute with reasonably accurate quantities in each of the 2 device-specific interim FreeRADIUS files. The used-octets-user file, which summarizes the usage against that user over time (i.e. the sum of all devices/users that have already been logged out), accurately reflects prior usage. At approximately 30 minutes, after around 1.5 GB of data throughput, the Captive Portal logs the device out, merges its session data usage from the device-specific used-octets-user-mac file into the used-octets-user file, deletes the used-octets-user-mac file, and enters "QUOTA EXCEEDED: USER" into the System Logs/Authentication/Captive Portal Auth log. Normally that device and all other device-specific sessions associated with the user account would be logged out, and the user (forever) would not be able to log in again, instead receiving a "Quota Exceeded" error message. In this case, though, the device can log back in and start a new session. All other active device/user sessions continue without interruption. In this test, the second device was likewise logged out a minute or so later after similar conditions were met.
This appears to indicate an issue with the Captive Portal incorrectly implementing the FreeRADIUS quota, possibly because of the symptoms in item 1 above, where the data calculation becomes exponentially more inaccurate over time. Depending upon throughput, a 100 GB data quota is reached after approximately 1.5 GB of actual data throughput, typically within 10 minutes to 2 hours in my testing to date.
With the Captive Portal set to Interim and no reauthentication every minute by the captive portal, FreeRADIUS does appear to accurately reflect the data usage to date of all "logged out devices" per user account. FreeRADIUS also retains interim data files for devices that have not yet logged out, and the Captive Portal shows the device as logged in until a logout is forced by the Captive Portal inactivity setting.
Updated by Marcos M almost 2 years ago
- Status changed from Feedback to Resolved
The original issue is now resolved; traffic is recorded correctly:
print_r(pfSense_pf_cp_get_eth_rule_counters("cpzoneid_2_auth/10.0.1.102_32"));
Array
(
    [0] => Array
        (
            [direction] => 1
            [evaluations] => 728145
            [input_pkts] => 81866
            [input_bytes] => 36867259
        )
    [1] => Array
        (
            [direction] => 2
            [evaluations] => 645870
            [output_pkts] => 314081
            [output_bytes] => 457773782
        )
)

Regarding the feedback above; if I understand correctly, the issues are as follows:
- Using "Stop/Start (FreeRADIUS)" for "Send accounting updates" leads to the data usage count becoming increasingly inaccurate, showing a higher usage count than expected.
- When using "Interim" for "Send accounting updates" along with "Multiple" for "Concurrent user logins", not all login sessions for that user are logged out.
- User-specific traffic quotas (the "Amount of Download and Upload Traffic" option within the FreeRADIUS user configuration) correctly lead to the user being logged out after the quota is reached, but the user is able to log in again instead of being blocked.
The second one is expected. By enabling "Reauthenticate Users", all sessions will be re-authenticated every minute and hence logged out and rejected on subsequent attempts to log in.
I was not able to replicate the third one - the authentication attempt is correctly rejected and the user cannot log in. This is with "Interim" selected, and "Reauthenticate Users" unchecked:
Jan 1 17:42:25 radiusd 24492 (9) Rejected in post-auth: [User01/<via Auth-Type = mschap>] (from client localhost port 2000 cli 14:91:82:xx:xx:xx)
Jan 1 17:42:25 radiusd 24492 (9) Login incorrect (Failed retrieving values required to evaluate condition): [User01/<via Auth-Type = mschap>] (from client localhost port 2000 cli 14:91:82:xx:xx:xx)
If steps can be provided to reproduce these issues, they need their own bug report.
Updated by Marcos M almost 2 years ago
- Related to Regression #13823: RADIUS attribute pfSense-Max-Total-Octets is not parsed correctly added
Updated by Dale Harron almost 2 years ago
The solution applied for Stop/Start (FreeRADIUS), which sends only the incremental data use in each stop/start packet to FreeRADIUS, works well because it effectively bypasses the concern under #13823: the FreeRADIUS-supplied value for max-octets-user becomes unimportant.
If you applied this logic to all incremental communication with FreeRADIUS under reauthenticate every minute and Interim, they too would not be affected by the 32-bit unsigned integer concern in #13823, and you could remove the UNUSABLE 4095 GB maximum now enforced in the FreeRADIUS GUI front end. FreeRADIUS will accumulate the incremental amounts, as designed, into user-specific used-octets-user-uniqueID files and add them to the used-octets-user file that FreeRADIUS uses to determine whether a false (logout) should be sent back to the pfSense Captive Portal to log the user out when the limit is reached.
If it is a problem for the pfSense installation that multiple users are logged in and FreeRADIUS does not account for their data usage until they formally log out or time out, that is when either reauthenticate every minute or Stop/Start (FreeRADIUS) is applicable; either will cause data to accumulate into the used-octets-user file so that FreeRADIUS is tracking data usage for all users. This is by design and need not be compensated for beyond these two options.
If you need a new redmine, please raise one based on this observation but in my opinion, this is an issue related to this original request and should be done under it.
The frequency at which all interim used-octets-user-uniqueID files are consolidated into the used-octets-user file should be manageable, as once a minute is too frequent and a substantial overhead for some systems with many users. I suggest you expose the frequency of both reauthenticate every minute and Stop/Start (FreeRADIUS) in the GUI so that load can be compensated for by increasing the delay when deemed necessary. I do see why you might consider this a new Redmine issue, but I suspect it is simple to implement, and it is desperately needed right now in the 23.01 release.
Updated by Dale Harron almost 2 years ago
By way of clarification, the used-octets-user or used-octets-user-uniqueID files are currently correctly updated with incremental data once a minute (though this should use the FreeRADIUS incremental update value; 600 seconds by default). The problem related to #13823's 32-bit unsigned integer limitation in representing the max-octets-user value is that pfSense's support for FreeRADIUS does not let FreeRADIUS manage the summing of the incremental consumption values into the used-octets-user file and then compare it to max-octets-user on the FreeRADIUS side. Instead, pfSense maintains its own total and compares it to the max-octets-user value sent in the packet; yes, the value that is wrong when above 4 GB due to the 32-bit unsigned integer limitation. pfSense should simply not check that value. It is not its job to manage this data quota; that is the job of FreeRADIUS. Thus, FreeRADIUS is not limited by the 4 GB 32-bit issue in #13823; pfSense is, and only because it is not minding its own business and is instead arbitrarily disconnecting the user before receiving a request to do so from FreeRADIUS in the interim update packet.
I will cross-post this observation to #13823.
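The 32-bit limitation mentioned in this comment can be shown with a little arithmetic. A minimal sketch: the quota value is an example and the function names are hypothetical. The Gigawords split shown is the mechanism RADIUS accounting defines (RFC 2869, Acct-Input-Gigawords and Acct-Output-Gigawords) for counters that exceed 32 bits; whether it applies to the quota attribute discussed here is not something this sketch asserts.

```python
UINT32_MODULUS = 2 ** 32  # a 32-bit unsigned field holds 0 .. 4294967295


def truncate_to_uint32(value: int) -> int:
    """What survives when a larger value is stored in a 32-bit field."""
    return value % UINT32_MODULUS


def split_gigawords(value: int) -> tuple:
    """Split a large counter into (gigawords, octets)."""
    return divmod(value, UINT32_MODULUS)


def join_gigawords(gigawords: int, octets: int) -> int:
    """Rebuild the full counter from the two 32-bit parts."""
    return gigawords * UINT32_MODULUS + octets


quota = 5 * 1024 ** 3  # a 5 GiB quota in bytes
print(truncate_to_uint32(quota))                # 1073741824 (only 1 GiB)
print(split_gigawords(quota))                   # (1, 1073741824)
print(join_gigawords(*split_gigawords(quota)))  # 5368709120
```

A side that compares usage against the truncated value would act on a 1 GiB limit where 5 GiB was intended, which is the kind of early disconnect described above.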