Bug #14061
closed
PHP error if a non-privileged shell user attempts an operation which needs to write ``config.cache``
100%
Description
Running 23.01 on a 7100. Noticed these PHP errors many hours after they occurred, so unfortunately have no idea what might have triggered them.
Version 23.01 was installed two weeks ago (upgraded from 22.01). No other PHP errors or crashes since the upgrade.
Crash report begins. Anonymous machine information:
amd64
14.0-CURRENT
FreeBSD 14.0-CURRENT #0 plus-RELENG_23_01-n256037-6e914874a5e: Fri Feb 10 20:30:29 UTC 2023 root@freebsd:/var/jenkins/workspace/pfSense-Plus-snapshots-23_01-main/obj/amd64/VDZvZksF/var/jenkins/workspace/pfSense-Plus-snapshots-23_01-main/sources/FreeBS
Crash report details:
PHP Errors:
[01-Mar-2023 14:45:19 America/Los_Angeles] PHP Fatal error: Uncaught TypeError: fwrite(): Argument #1 ($stream) must be of type resource, bool given in /etc/inc/config.lib.inc:172
Stack trace:
#0 /etc/inc/config.lib.inc(172): fwrite(false, 'a:43:{s:7:"vers...')
#1 /etc/inc/config.lib.inc(147): generate_config_cache(Array)
#2 /etc/inc/config.inc(141): parse_config()
#3 /etc/inc/gwlb.inc(25): require_once('/etc/inc/config...')
#4 /etc/inc/functions.inc(35): require_once('/etc/inc/gwlb.i...')
#5 /etc/inc/notices.inc(26): require_once('/etc/inc/functi...')
#6 /usr/local/bin/notify_monitor.php(24): include_once('/etc/inc/notice...')
#7 {main}
thrown in /etc/inc/config.lib.inc on line 172
[01-Mar-2023 14:45:19 America/Los_Angeles] PHP Fatal error: Uncaught ValueError: Path cannot be empty in /etc/inc/notices.inc:101
Stack trace:
#0 /etc/inc/notices.inc(101): fopen('', 'w')
#1 /etc/inc/config.lib.inc(1162): file_notice('phperror', 'PHP ERROR: Type...', 'PHP errors')
#2 [internal function]: pfSense_clear_globals()
#3 {main}
thrown in /etc/inc/notices.inc on line 101
No FreeBSD crash data found.
Updated by Jim Pingle over 1 year ago
- Status changed from New to Not a Bug
That isn't a bug in PHP code, it's failing to write files to /tmp for some reason. We've seen a few reports of this but haven't seen any common thread as to why it might be happening, other than a potential issue with the disk being full or not responding in some way.
However, this site is not for support or diagnostic discussion to track down what might be happening on your particular setup.
For assistance in solving problems, please post on the Netgate Forum.
See Reporting Issues with pfSense Software for more information.
Updated by Marcos M over 1 year ago
- Subject changed from PHP Errors to Improve PHP error handling when failing to write to disk
- Status changed from Not a Bug to New
- Affected Architecture All added
- Affected Architecture deleted (7100)
It would be preferable to add better error handling for these kinds of PHP errors, and ideally show a more useful alert and attempt to log that to the system log if possible. I'm reopening this to fit the request given that it's closely related.
Updated by Jim Pingle over 1 year ago
- Status changed from New to Not a Bug
That really isn't viable. We'd have to potentially catch any/every PHP error or rewrite every call that might even possibly hit the disk and fail. Also we can't log things (at least locally) if the disk isn't responding. Even if we catch the errors, it's just a guess as to why they failed, not definite.
The functions above just happen to be two that are commonly hit, but they are far from the only ones.
Updated by Jim Pingle over 1 year ago
- Subject changed from Improve PHP error handling when failing to write to disk to Write failure of ``config.cache`` for what may be a non-hardware cause
- Category changed from Unknown to PHP Interpreter
- Status changed from Not a Bug to New
- Assignee set to Jim Pingle
- Target version set to 2.7.0
- Plus Target Version set to 23.05
Reopening this for some more investigation. There appear to be several people hitting this, but not consistently and not in a way we can reproduce in lab conditions. Enough that it may not actually be a hardware issue, but it's still possible it is a filesystem problem. If it was a drive problem I'd expect some more variation in where it fails but too many people are hitting this code path exactly.
The first error is from a failure to open /tmp/config.cache for writing. The fopen call at source:src/etc/inc/config.lib.inc#L171 is failing (returns false). Because it failed to write the config.cache, parsing the configuration fails, which triggers a notification which also fails.
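To illustrate the first failure, here is a minimal sketch of a guarded cache write. The helper name `write_cache_safely` is hypothetical and this is not the pfSense implementation; it only shows that checking the result of `fopen` avoids handing `false` to `fwrite`, which is what raises the `TypeError` on PHP 8. The use of `serialize` is an assumption based on the serialized string visible in the stack trace.

```php
<?php
// Hypothetical helper (not the pfSense code): write a serialized config
// cache only if the file could actually be opened.
function write_cache_safely(string $path, array $config): bool
{
    // fopen() returns false and raises a warning when the path cannot be
    // opened for writing; @ suppresses that warning in this sketch.
    $fd = @fopen($path, 'w');
    if ($fd === false) {
        // Skip the cache rather than pass false to fwrite(), which
        // throws a TypeError on PHP 8.
        return false;
    }
    $written = fwrite($fd, serialize($config));
    fclose($fd);
    return $written !== false;
}
```

A caller could treat a `false` return as "cache unavailable" and continue with the freshly parsed configuration instead of aborting.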
The notification fails (second error, source:src/etc/inc/notices.inc#L101) because $notice_path is empty, but it's not clear how that might be happening since it's defined earlier in the same file (source:src/etc/inc/notices.inc#L29).
The first error is from a file in /tmp, the second from a file in /var. Both paths do use g_get(), but that doesn't seem to be related. If it were, I would expect the notices path to be /notices, not empty.
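The two errors in the crash report come from two different failure modes of `fopen` on PHP 8, which the following sketch separates. The function name `try_open_for_write` is hypothetical and only serves the illustration: an empty path throws a `ValueError`, while a path that cannot be opened returns `false`.

```php
<?php
// Illustrative only: distinguish the two ways fopen() can fail on PHP 8.
function try_open_for_write(string $path): string
{
    try {
        // A missing or unwritable location makes fopen() return false
        // (the accompanying warning is suppressed with @).
        $fd = @fopen($path, 'w');
    } catch (ValueError $e) {
        // An empty path does not return false; it throws instead.
        return 'invalid path: ' . $e->getMessage();
    }
    if ($fd === false) {
        return 'open failed';
    }
    fclose($fd);
    return 'ok';
}
```

Under this reading, the empty $notice_path produces the second error regardless of file permissions, which matches it being a separate symptom from the failed cache write.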
If it isn't a drive or FS issue, I wonder if this is related to #14031 as well then. No PHP error there but it also seems to fail to modify the notices queue on the drive somehow.
Updated by Jim Pingle over 1 year ago
- Related to Bug #14031: Identical SMTP notifications repeat in an infinite loop under certain conditions added
Updated by Jim Pingle over 1 year ago
The only potential cause I can see is that both this and #14031 are initially triggered by source:usr/local/bin/notify_monitor.php -- that file is launched via mwexec_bg() in source:src/etc/inc/notices.inc#L307. I seem to recall in the past we had issues launching separate PHP scripts this way, which may explain some of the issues here. That said, usually those issues were easier to reproduce.
I think the next best step here is to come up with some alternative way to launch the process that runs the message queue. That could be a separate PHP thread, or something like cron or minicron checking and running the queue periodically instead of the current method. That would also allow the messages to be collected into larger batches instead of the current way, where they will only be batched if they happen within a very short window.
One option is a small script run from a minicron instance every 20 seconds that checks whether the queue is empty and, if it is not, launches notify_monitor.php. That would be much lighter than running a PHP script directly that frequently.
Updated by Jim Pingle over 1 year ago
The more I look at this, the more certain I am that it's the same root cause as #14031: if an unprivileged user such as nut tries to send an e-mail notification, it would follow this code path but not have sufficient privileges to write the files mentioned here.
I couldn't induce the error on demand, but when I cleared the cache and ran a test notification from nut, it created config.cache as nut. So if the cache was outdated but the file already existed and was not writable by the unprivileged user, the rewrite would fail as observed here.
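The ownership scenario above can be checked with a small diagnostic. This is a sketch under stated assumptions, not the safety checks from the actual changeset: `cache_refresh_status` is a hypothetical name, and it only reports whether the current process could create or rewrite the file at a given path.

```php
<?php
// Hypothetical diagnostic (not the applied fix): report whether the current
// process could refresh a cache file at $path.
function cache_refresh_status(string $path): string
{
    // Drop any cached stat results so the check reflects the current state.
    clearstatcache(true, $path);
    if (!file_exists($path)) {
        // Creating the file only requires a writable parent directory.
        return is_writable(dirname($path)) ? 'can-create' : 'cannot-create';
    }
    // An existing file that this user may not write leads to the fopen()
    // failure described above.
    return is_writable($path) ? 'can-rewrite' : 'not-writable';
}
```

Run as the unprivileged user, a `not-writable` result for /tmp/config.cache would be consistent with the stale-cache explanation; run as root it will normally report `can-rewrite`.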
I added some safety belt checks to hopefully prevent this from happening.
You can install the System Patches package and then create an entry for c5faa351c1ef6d4555478a7f50b3a16ece7e0b2a to apply the fix.
Updated by Jim Pingle over 1 year ago
- Status changed from New to Feedback
- % Done changed from 0 to 100
Applied in changeset c5faa351c1ef6d4555478a7f50b3a16ece7e0b2a.
Updated by Jim Pingle over 1 year ago
- Related to Bug #14277: Fatal error while restarting Unbound through SSH added
Updated by Jim Pingle over 1 year ago
- Subject changed from Write failure of ``config.cache`` for what may be a non-hardware cause to PHP error if a non-privileged shell user attempts an operation which needs to write ``config.cache``
Updating subject for release notes.