Bug #4803
Closed
config.xml is empty if power loss or panic happens shortly after config write
100%
Description
When running version 2.2.3 nanobsd with the filesystem kept permanently read-write (because conf_mount_ro() takes 3+ minutes on a CF card), config.xml can get corrupted if power is lost after saving configuration changes.
The situation can be reproduced on an ESXi VM.
1. Convert the nanobsd image to a vmdk image and create a VM using the image. No installation.
2. From the factory default configuration, enable permanent read-write. Reboot using the pfSense command.
3. Make changes such as adding a firewall rule or creating a captive portal.
4. Click save. After iostat shows that disk activity has finished, power reset using the VM command.
5. The reboot will show:
- /conf/config.xml:1: parser error: Document is empty.
- config.xml gets restored from backup, but with the last changes lost.
- warnings about missing timezone settings.
- warnings about wrong function parameters in config.lib.inc
Version 2.2.3 64-bit nanobsd is affected, while version 2.2.2 is not affected by either the file corruption or the slow conf_mount_ro().
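For illustration only, the boot-time behaviour in step 5 (an empty config.xml is detected and replaced from a backup) can be sketched as below. This is not pfSense code: the function names `config_is_usable` and `restore_if_broken` and the paths passed to them are hypothetical, and the real logic lives in pfSense's PHP config library.

```python
import os
import shutil
import xml.etree.ElementTree as ET


def config_is_usable(path):
    """Return True if the file exists, is non-empty, and parses as XML."""
    try:
        if os.path.getsize(path) == 0:
            return False  # the "Document is empty" case from the report
        ET.parse(path)
        return True
    except (OSError, ET.ParseError):
        return False


def restore_if_broken(config_path, backup_path):
    """Copy the backup over the config when the config is unusable.

    Returns True if a restore happened, False if the config was fine.
    Hypothetical helper; any changes made after the backup are lost.
    """
    if config_is_usable(config_path):
        return False
    shutil.copyfile(backup_path, config_path)
    return True
```

A restore of this kind explains why the last changes are missing after the reboot: the backup predates them.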
Updated by Kill Bill over 9 years ago
dem co wrote:
3 minutes+ waiting time when running conf_mount_ro() on CF card).
That's due to removal of this patch - https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=176169 (Bug #2401). And yeah, this is a complete performance disaster on CF-based systems. Might as well reopen #2401.
Updated by Jim Pingle over 9 years ago
- Subject changed from Unsafe power off conrrupt config.xml to config.xml is empty if power loss or panic happens shortly after config write
- Status changed from New to Confirmed
- Target version set to 2.2.4
This does not appear to be specific to NanoBSD or even sync on the filesystem.
I can replicate this by causing a panic just after a config save on a full install with sync on the filesystem. If a minute or two has passed before the power loss or crash, the config is fine, so it does appear to be somewhat related to filesystem data not being fully flushed to disk at the time.
The timestamp on the 0-byte config.xml is from before the reboot.
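A common way to avoid a 0-byte file after a crash is to write to a temporary file, fsync it, rename it over the target, and then fsync the containing directory. The sketch below shows that general pattern in Python, assuming a POSIX system. It is not the fix that went into pfSense, and `write_file_durably` is a hypothetical name.

```python
import os
import tempfile


def write_file_durably(path, data):
    """Write bytes to path so that a crash leaves the old or the new content.

    Sketch of the temp-file + fsync + rename pattern; assumes POSIX semantics.
    Note that mkstemp creates the file with mode 0600.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()             # push Python's buffer to the kernel
            os.fsync(tmp.fileno())  # ask the kernel to push it to disk
        os.replace(tmp_path, path)  # atomic rename within one filesystem
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    # fsync the directory so the rename itself is recorded on disk
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

Without the fsync calls, the new file name can reach the disk before the file's data does, which would match the empty config.xml with a pre-reboot timestamp described above.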
Updated by Chris Buechler over 9 years ago
- Assignee set to Renato Botelho
- Affected Version changed from 2.2.3 to All
Updated by Jim Thompson over 9 years ago
This needs similar work (and a PHP extension, because fsync() isn't possible via PHP) to what fixed the corruption of /etc/master.passwd and /etc/group.
Updated by Kill Bill over 9 years ago
Jim Thompson wrote:
This needs similar work (and a PHP extension, because fsync() isn't possible via PHP) to what fixed the corruption of /etc/master.passwd and /etc/group.
Well, fsync()/fdatasync() is definitely possible with devel/pecl-eio (http://docs.php.net/manual/en/function.eio-fsync.php, http://docs.php.net/manual/en/function.eio-fdatasync.php)
Updated by Renato Botelho over 9 years ago
- Status changed from Confirmed to Feedback
- % Done changed from 0 to 100
Please try the next round of snapshots. A pfSense_fsync function was implemented and is being used to make the config.xml save operation safer.
Updated by Chris Buechler over 9 years ago
This looks to be fixed. Up to 15 cycles with no issues in a circumstance that would fail at least 50% of the time before. Leaving the power cycle test rig running in a loop overnight.
Updated by Chris Buechler over 9 years ago
The config.xml portion was fine with Renato's change, but it missed other parts of /cf/conf/. Jim T's earlier change gets the entire directory. It's been through over 500 power cycles with no issues. Just need to verify again with the latest snapshot build, as the tested system was patched. That's running now.
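As a rough idea of what syncing an entire directory rather than a single file can involve, the sketch below fsyncs every regular file under a root and then each directory. It is an assumption-laden illustration, not the actual change referred to above: `fsync_tree` is a hypothetical name, and it relies on fsync() being permitted on read-only descriptors, which holds on FreeBSD and Linux but is not guaranteed everywhere.

```python
import os


def fsync_tree(root):
    """fsync each regular file under root, then each directory.

    Returns the number of regular files synced. Symlinks are skipped.
    """
    synced = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            if os.path.islink(full) or not os.path.isfile(full):
                continue
            fd = os.open(full, os.O_RDONLY)
            try:
                os.fsync(fd)  # flush this file's data and metadata
            finally:
                os.close(fd)
            synced += 1
        # flush the directory entry list as well
        dir_fd = os.open(dirpath, os.O_RDONLY)
        try:
            os.fsync(dir_fd)
        finally:
            os.close(dir_fd)
    return synced
```

Called on a directory such as /cf/conf after a save, this would cover backup files and other state alongside config.xml, at the cost of more disk I/O per save.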
Updated by Chris Buechler over 9 years ago
- Status changed from Feedback to Resolved
I'm confident in this. Snapshots including all relevant changes have been through the config_write loop torture test, which drops power in the middle of writing the config repeatedly in a loop, upwards of 600 times with no ill effects.