Bug #15108
closed
``pfctl`` is unable to retrieve state creator list in certain circumstances
100%
Description
In certain cases, pfctl -sc is unable to obtain the list of state creators and instead fails with an error message such as:
System Log / Kernel Message Buffer:
Dec 4 18:49:08 kernel Warning: too many creators!
CLI:
: pfctl -sc
pfctl: Failed to retrieve creators
The output of truss -o creatordebug.txt -f pfctl -sc from an affected system is attached.
The specific circumstances under which this fails have not yet been determined, and thus far we have been unable to trigger the error in lab conditions.
See also: https://forum.netgate.com/topic/184561/no-state-creator-host-ids-visible
Updated by Jim Pingle about 1 year ago
- File creatordebug.txt added
Updated by Kristof Provost about 1 year ago
I think I see how the 'No space left on device' error can happen if we have many creator ids.
It's already fixed, because current versions of that code use netlink to communicate, so that specific bug is already gone.
That's not the root cause of this issue though, because we ought to have 1 or 2 (distinct) creator ids and no more, and the 'Warning: too many creators!' kernel message means we have 16.
A full state table output (i.e. pfctl -ss -vvv) might give us clues about why there are so many different creator ids.
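To make that check less manual, a saved state dump could be summarized per creator id. The following is a minimal sketch, not part of pfctl: it assumes the verbose output contains a "creatorid: <hex>" field for each state, and the function name and script layout are hypothetical.

```python
import re
import sys
from collections import Counter

# Assumption: each state in the verbose dump carries a "creatorid: <hex>" field.
CREATOR_RE = re.compile(r"creatorid:\s*([0-9a-fA-F]+)")


def count_creator_ids(state_dump: str) -> Counter:
    """Return the number of states seen per creator id (lowercased hex)."""
    return Counter(m.group(1).lower() for m in CREATOR_RE.finditer(state_dump))


if __name__ == "__main__" and len(sys.argv) > 1:
    # Read a previously saved dump from the file given on the command line.
    with open(sys.argv[1], encoding="utf-8", errors="replace") as fh:
        counts = count_creator_ids(fh.read())
    print(f"{len(counts)} distinct creator ids")
    for creator_id, n_states in counts.most_common():
        print(f"{creator_id}\t{n_states}")
```

For instance, the dump could be saved with pfctl -ss -vvv > states.txt and the script run against that file. More than one or two distinct ids in the result would match the symptom described above.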
Updated by Kristof Provost 11 months ago
- Status changed from New to Feedback
Quick summary from the forum discussion: the reporter has upgraded both (pfsync) hosts to the same version, and the problem appears to have gone away.
Previously they ran different versions, with different pfsync protocol versions. That ought to work, and I failed to reproduce this behaviour in such a setup, but as it did go away for the reporter that's the most probable cause.
Given that the actionable issue (the 'no space left on device' error) is already fixed, I think we're done here.
Updated by Jim Pingle 11 months ago
- Status changed from Feedback to Resolved
- % Done changed from 0 to 100
Given that we can't reproduce it, there isn't a good way to verify the fix, so we can close this out for now. If we get any additional reports, we can update or reopen it as needed.