Bug #16275
Status: open
Removing Limiters can leave unconnected queues behind
Description
As noted in https://forum.netgate.com/topic/197882/25-03-b-20250610-1659-re-enabling-limiters-leads-to-syslog-kernel-messages-update_fs it appears to be possible to remove a Limiter in the GUI while its corresponding queue is left dangling in the system.
I am not quite sure how I ended up with this configuration, but there has been a lot of adding and removing of limiters lately. I started the day with six limiters defined. Here is that configuration:
[25.03-BETA][admin@pfsense.local.lan]/root: /sbin/dnctl sched list
00001: 600.000 Mbit/s 0 ms burst 0
q65537 50 sl. 0 flows (1 buckets) sched 1 weight 0 lmax 0 pri 0 droptail
sched 1 type FQ_CODEL flags 0x0 0 buckets 0 active
FQ_CODEL target 5ms interval 100ms quantum 1514 limit 10240 flows 1024 ECN
Children flowsets: 1
00002: 600.000 Mbit/s 0 ms burst 0
q65538 50 sl. 0 flows (1 buckets) sched 2 weight 0 lmax 0 pri 0 droptail
sched 2 type FQ_CODEL flags 0x0 0 buckets 0 active
FQ_CODEL target 5ms interval 100ms quantum 1514 limit 10240 flows 1024 ECN
Children flowsets: 2
00005: 400.000 Mbit/s 0 ms burst 0
q65541 50 sl. 0 flows (1 buckets) sched 5 weight 0 lmax 0 pri 0 droptail
sched 5 type FQ_CODEL flags 0x0 0 buckets 0 active
FQ_CODEL target 5ms interval 100ms quantum 1514 limit 10240 flows 1024 ECN
Children flowsets: 3
00006: 100.000 Mbit/s 0 ms burst 0
q65542 50 sl. 0 flows (1 buckets) sched 6 weight 0 lmax 0 pri 0 droptail
sched 6 type FQ_CODEL flags 0x0 0 buckets 0 active
FQ_CODEL target 5ms interval 100ms quantum 1514 limit 10240 flows 1024 ECN
Children flowsets: 4
00010: 600.000 Mbit/s 0 ms burst 0
q65546 50 sl. 0 flows (1 buckets) sched 10 weight 0 lmax 0 pri 0 droptail
sched 10 type FQ_CODEL flags 0x0 0 buckets 0 active
FQ_CODEL target 5ms interval 100ms quantum 1514 limit 10240 flows 1024 ECN
Children flowsets: 5 10
00011: 600.000 Mbit/s 0 ms burst 0
q65547 50 sl. 0 flows (1 buckets) sched 11 weight 0 lmax 0 pri 0 droptail
sched 11 type FQ_CODEL flags 0x0 0 buckets 0 active
FQ_CODEL target 5ms interval 100ms quantum 1514 limit 10240 flows 1024 ECN
Children flowsets: 6 11
[25.03-BETA][admin@pfsense.local.lan]/root: /sbin/dnctl queue list
q00001 50 sl. 0 flows (1 buckets) sched 1 weight 0 lmax 0 pri 0 droptail
q00002 50 sl. 0 flows (1 buckets) sched 2 weight 0 lmax 0 pri 0 droptail
q00003 50 sl. 0 flows (1 buckets) sched 5 weight 0 lmax 0 pri 0 droptail
q00004 50 sl. 0 flows (1 buckets) sched 6 weight 0 lmax 0 pri 0 droptail
q00005 50 sl. 0 flows (1 buckets) sched 10 weight 0 lmax 0 pri 0 droptail
q00006 50 sl. 0 flows (1 buckets) sched 11 weight 0 lmax 0 pri 0 droptail
q00009 50 sl. 0 flows (1 buckets) sched 9 weight 0 lmax 0 pri 0 droptail
q00010 50 sl. 0 flows (1 buckets) sched 10 weight 0 lmax 0 pri 0 droptail
q00011 50 sl. 0 flows (1 buckets) sched 11 weight 0 lmax 0 pri 0 droptail
[25.03-BETA][admin@pfsense.local.lan]/root: /sbin/dnctl pipe list
00001: 600.000 Mbit/s 0 ms burst 0
q131073 2000 sl. 0 flows (1 buckets) sched 65537 weight 0 lmax 0 pri 0 droptail
sched 65537 type FIFO flags 0x0 0 buckets 0 active
00002: 600.000 Mbit/s 0 ms burst 0
q131074 2000 sl. 0 flows (1 buckets) sched 65538 weight 0 lmax 0 pri 0 droptail
sched 65538 type FIFO flags 0x0 0 buckets 0 active
00005: 400.000 Mbit/s 0 ms burst 0
q131077 2000 sl. 0 flows (1 buckets) sched 65541 weight 0 lmax 0 pri 0 droptail
sched 65541 type FIFO flags 0x0 0 buckets 0 active
00006: 100.000 Mbit/s 0 ms burst 0
q131078 2000 sl. 0 flows (1 buckets) sched 65542 weight 0 lmax 0 pri 0 droptail
sched 65542 type FIFO flags 0x0 0 buckets 0 active
00010: 600.000 Mbit/s 0 ms burst 0
q131082 2000 sl. 0 flows (1 buckets) sched 65546 weight 0 lmax 0 pri 0 droptail
sched 65546 type FIFO flags 0x0 0 buckets 0 active
00011: 600.000 Mbit/s 0 ms burst 0
q131083 2000 sl. 0 flows (1 buckets) sched 65547 weight 0 lmax 0 pri 0 droptail
sched 65547 type FIFO flags 0x0 0 buckets 0 active
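The unlinked queues can be spotted mechanically by cross-referencing the two listings above: a queue whose ID appears in `dnctl queue list` but in no scheduler's "Children flowsets" line has no parent. The following is only a sketch, assuming the output format shown above; the sample data is the captured output, and in practice you would pipe the live /sbin/dnctl output instead.

```shell
# Sample "Children flowsets" lines from `/sbin/dnctl sched list` above.
sched_list='Children flowsets: 1
Children flowsets: 2
Children flowsets: 3
Children flowsets: 4
Children flowsets: 5 10
Children flowsets: 6 11'
# Sample output from `/sbin/dnctl queue list` above.
queue_list='q00001 50 sl. 0 flows (1 buckets) sched 1 weight 0 lmax 0 pri 0 droptail
q00002 50 sl. 0 flows (1 buckets) sched 2 weight 0 lmax 0 pri 0 droptail
q00003 50 sl. 0 flows (1 buckets) sched 5 weight 0 lmax 0 pri 0 droptail
q00004 50 sl. 0 flows (1 buckets) sched 6 weight 0 lmax 0 pri 0 droptail
q00005 50 sl. 0 flows (1 buckets) sched 10 weight 0 lmax 0 pri 0 droptail
q00006 50 sl. 0 flows (1 buckets) sched 11 weight 0 lmax 0 pri 0 droptail
q00009 50 sl. 0 flows (1 buckets) sched 9 weight 0 lmax 0 pri 0 droptail
q00010 50 sl. 0 flows (1 buckets) sched 10 weight 0 lmax 0 pri 0 droptail
q00011 50 sl. 0 flows (1 buckets) sched 11 weight 0 lmax 0 pri 0 droptail'

# Flowset IDs claimed by some scheduler.
linked=$(printf '%s\n' "$sched_list" | awk '/Children flowsets:/ { for (i = 3; i <= NF; i++) print $i }')
# Queue IDs actually present (strip the leading "q" and zero padding).
present=$(printf '%s\n' "$queue_list" | awk '/^q[0-9]/ { sub(/^q0*/, "", $1); print $1 }')

# Any present queue that no scheduler links is dangling.
dangling=""
for q in $present; do
    printf '%s\n' "$linked" | grep -qx "$q" || dangling="$dangling $q"
done
echo "dangling queues:$dangling"
```

Run against the listings above this reports queue 9, consistent with the guess further down that one queue was already orphaned before the limiters were deleted.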
After deleting all limiters in the GUI I was left with three unconnected queues:
[25.03-BETA][admin@pfsense.local.lan]/root: /sbin/dnctl pipe list
[25.03-BETA][admin@pfsense.local.lan]/root: /sbin/dnctl sched list
[25.03-BETA][admin@pfsense.local.lan]/root: /sbin/dnctl queue list
q00009 50 sl. 0 flows (1 buckets) sched 9 weight 0 lmax 0 pri 0 droptail
q00010 50 sl. 0 flows (1 buckets) sched 10 weight 0 lmax 0 pri 0 droptail
q00011 50 sl. 0 flows (1 buckets) sched 11 weight 0 lmax 0 pri 0 droptail
The syslog shows the kernel not happy about those:
2025-06-18 20:38:42.806662+02:00 kernel - do_config invalid delete type 3
2025-06-18 20:38:42.003208+02:00 kernel - update_fs fs 9 for sch 9 not 1 still unlinked
2025-06-18 20:38:42.003165+02:00 kernel - update_fs fs 11 for sch 11 not 1 still unlinked
2025-06-18 20:38:42.003126+02:00 kernel - update_fs fs 10 for sch 10 not 1 still unlinked
2025-06-18 20:38:42.003083+02:00 kernel - update_fs fs 9 for sch 9 not 65537 still unlinked
2025-06-18 20:38:42.003027+02:00 kernel - update_fs fs 11 for sch 11 not 65537 still unlinked
2025-06-18 20:38:42.002890+02:00 kernel - update_fs fs 10 for sch 10 not 65537 still unlinked
2025-06-18 20:38:41.176033+02:00 kernel - update_fs fs 9 for sch 9 not 1 still unlinked
2025-06-18 20:38:41.175991+02:00 kernel - update_fs fs 11 for sch 11 not 1 still unlinked
2025-06-18 20:38:41.175938+02:00 kernel - update_fs fs 10 for sch 10 not 1 still unlinked
2025-06-18 20:38:41.175892+02:00 kernel - update_fs fs 9 for sch 9 not 65537 still unlinked
2025-06-18 20:38:41.175815+02:00 kernel - update_fs fs 11 for sch 11 not 65537 still unlinked
2025-06-18 20:38:41.175673+02:00 kernel - update_fs fs 10 for sch 10 not 65537 still unlinked
2025-06-18 20:38:40.291607+02:00 check_reload_status 627 Reloading filter
So, I went ahead and manually deleted the queues:
[25.03-BETA][admin@pfsense.local.lan]/root: /sbin/dnctl queue list
q00009 50 sl. 0 flows (1 buckets) sched 9 weight 0 lmax 0 pri 0 droptail
q00010 50 sl. 0 flows (1 buckets) sched 10 weight 0 lmax 0 pri 0 droptail
q00011 50 sl. 0 flows (1 buckets) sched 11 weight 0 lmax 0 pri 0 droptail
[25.03-BETA][admin@pfsense.local.lan]/root: /sbin/dnctl queue delete 9
[25.03-BETA][admin@pfsense.local.lan]/root: /sbin/dnctl queue delete 10
[25.03-BETA][admin@pfsense.local.lan]/root: /sbin/dnctl queue delete 11
[25.03-BETA][admin@pfsense.local.lan]/root: /sbin/dnctl queue list
[25.03-BETA][admin@pfsense.local.lan]/root:
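The three manual deletes above can also be expressed as a loop. A sketch, written as a dry run by default (the commands are printed, not executed); setting DNCTL=/sbin/dnctl would run them for real on the firewall:

```shell
# Dry run by default: DNCTL expands to "echo /sbin/dnctl", so each
# delete command is printed. Set DNCTL=/sbin/dnctl to actually execute.
DNCTL="${DNCTL:-echo /sbin/dnctl}"
for id in 9 10 11; do
    $DNCTL queue delete "$id"
done
```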
Then I manually recreated the six limiters in the GUI, resulting in these:
[25.03-BETA][admin@pfsense.local.lan]/root: /sbin/dnctl sched list
00001: 600.000 Mbit/s 0 ms burst 0
q65537 50 sl. 0 flows (1 buckets) sched 1 weight 0 lmax 0 pri 0 droptail
sched 1 type FQ_CODEL flags 0x0 0 buckets 0 active
FQ_CODEL target 5ms interval 100ms quantum 1514 limit 10240 flows 1024 ECN
Children flowsets: 1
00002: 600.000 Mbit/s 0 ms burst 0
q65538 50 sl. 0 flows (1 buckets) sched 2 weight 0 lmax 0 pri 0 droptail
sched 2 type FQ_CODEL flags 0x0 0 buckets 0 active
FQ_CODEL target 5ms interval 100ms quantum 1514 limit 10240 flows 1024 ECN
Children flowsets: 2
00003: 400.000 Mbit/s 0 ms burst 0
q65539 50 sl. 0 flows (1 buckets) sched 3 weight 0 lmax 0 pri 0 droptail
sched 3 type FQ_CODEL flags 0x0 0 buckets 0 active
FQ_CODEL target 5ms interval 100ms quantum 1514 limit 10240 flows 1024 ECN
Children flowsets: 3
00004: 100.000 Mbit/s 0 ms burst 0
q65540 50 sl. 0 flows (1 buckets) sched 4 weight 0 lmax 0 pri 0 droptail
sched 4 type FQ_CODEL flags 0x0 0 buckets 0 active
FQ_CODEL target 5ms interval 100ms quantum 1514 limit 10240 flows 1024 ECN
Children flowsets: 4
00005: 600.000 Mbit/s 0 ms burst 0
q65541 50 sl. 0 flows (1 buckets) sched 5 weight 0 lmax 0 pri 0 droptail
sched 5 type FQ_CODEL flags 0x0 0 buckets 0 active
FQ_CODEL target 5ms interval 100ms quantum 1514 limit 10240 flows 1024 ECN
Children flowsets: 5
00006: 600.000 Mbit/s 0 ms burst 0
q65542 50 sl. 0 flows (1 buckets) sched 6 weight 0 lmax 0 pri 0 droptail
sched 6 type FQ_CODEL flags 0x0 0 buckets 0 active
FQ_CODEL target 5ms interval 100ms quantum 1514 limit 10240 flows 1024 ECN
Children flowsets: 6
[25.03-BETA][admin@pfsense.local.lan]/root: /sbin/dnctl queue list
q00001 2000 sl. 0 flows (1 buckets) sched 1 weight 0 lmax 0 pri 0 droptail
q00002 2000 sl. 0 flows (1 buckets) sched 2 weight 0 lmax 0 pri 0 droptail
q00003 2000 sl. 0 flows (1 buckets) sched 3 weight 0 lmax 0 pri 0 droptail
q00004 2000 sl. 0 flows (1 buckets) sched 4 weight 0 lmax 0 pri 0 droptail
q00005 2000 sl. 0 flows (1 buckets) sched 5 weight 0 lmax 0 pri 0 droptail
q00006 2000 sl. 0 flows (1 buckets) sched 6 weight 0 lmax 0 pri 0 droptail
[25.03-BETA][admin@pfsense.local.lan]/root: /sbin/dnctl pipe list
00001: 600.000 Mbit/s 0 ms burst 0
q131073 2000 sl. 0 flows (1 buckets) sched 65537 weight 0 lmax 0 pri 0 droptail
sched 65537 type FIFO flags 0x0 0 buckets 0 active
00002: 600.000 Mbit/s 0 ms burst 0
q131074 2000 sl. 0 flows (1 buckets) sched 65538 weight 0 lmax 0 pri 0 droptail
sched 65538 type FIFO flags 0x0 0 buckets 0 active
00003: 400.000 Mbit/s 0 ms burst 0
q131075 2000 sl. 0 flows (1 buckets) sched 65539 weight 0 lmax 0 pri 0 droptail
sched 65539 type FIFO flags 0x0 0 buckets 0 active
00004: 100.000 Mbit/s 0 ms burst 0
q131076 2000 sl. 0 flows (1 buckets) sched 65540 weight 0 lmax 0 pri 0 droptail
sched 65540 type FIFO flags 0x0 0 buckets 0 active
00005: 600.000 Mbit/s 0 ms burst 0
q131077 2000 sl. 0 flows (1 buckets) sched 65541 weight 0 lmax 0 pri 0 droptail
sched 65541 type FIFO flags 0x0 0 buckets 0 active
00006: 600.000 Mbit/s 0 ms burst 0
q131078 2000 sl. 0 flows (1 buckets) sched 65542 weight 0 lmax 0 pri 0 droptail
sched 65542 type FIFO flags 0x0 0 buckets 0 active
As you can see, the new limiters do not exactly match the original ones. There are differences in the number of children flowsets on the last two: the originals had two children each, whereas the new ones only have one (as they should, I guess). This goes some way to explain where two of the unconnected queues came from. The third unconnected queue is likely from an earlier limiter that was deleted.
So, if there is a bug here, it is that a limiter can be deleted without its configured queue being deleted with it. If the queue needs to be deleted first, then the removal of the limiter should be rejected while it still has a queue configured.
I don't know if this is a regression from 24.11. I have run with limiters before but never ended up with the syslog messages from the kernel alerting me that something was going on. But that could simply be down to me not doing much adding/removing of limiters in that configuration.
Updated by Jim Pingle 2 months ago
- Affected Plus Version changed from 25.03 to 25.07
Updated by Patrik Stahlman about 2 months ago
I'm not sure if there's been any work on this for the RC (25.07.r.20250709.2036) but I noticed something else today.
If I disable a Limiter, the queue/scheduler and pipe are still active in the system; it takes a reboot to remove them (or running /sbin/dnctl to delete them). I upgraded to the RC with a configuration of 8 limiters, of which 6 were disabled.
These are the steps I took discovering / reproducing the issue:
- disabled the remaining 2 limiters
- Diagnostics / Limiters shows the 2 limiters are still active
- reboot
- Diagnostics / Limiters shows no limiters active after reboot
- re-enabled 2 limiters
- Diagnostics / Limiters shows the 2 limiters active
- disabled the 2 limiters
- Diagnostics / Limiters shows the 2 limiters are still active
Updated by Jim Pingle about 2 months ago
Technically speaking, you're supposed to reset the state table after any change to limiters/shaper queues otherwise existing connections will still reference old things since they can't be updated like that. Changes to limiters and queues can only take effect for new connections.
After changing and applying the limiters, if you reset the states under Diagnostics > States on the Reset States tab, does this behavior still persist?
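The state reset and the follow-up check can also be done from the shell. A sketch of that verification sequence (assumption: run as root on the firewall; written as a dry run here via RUN so the snippet is safe to execute anywhere):

```shell
# Dry run by default: RUN expands to "echo", so the commands are printed.
# Unset RUN (RUN="") on the firewall to actually execute them.
RUN="${RUN:-echo}"
$RUN pfctl -F states           # flush the state table (GUI: Diagnostics > States > Reset States)
$RUN /sbin/dnctl pipe list     # then check whether the disabled limiters' objects are gone
$RUN /sbin/dnctl sched list
$RUN /sbin/dnctl queue list
```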
Updated by Patrik Stahlman about 2 months ago
True, I might not have done that for this test as I didn't consider any connection being involved in the manual deactivation of limiters. I'll do the test again.
As a side note, I also tested LAN limiters in conjunction with the buffer-bloat recipe which showed issues in the betas. There I did a state reset between each test, but unfortunately it looks like the combination of limiters on both LAN and WAN and the floating rule still has some way to go to receive a gold star.
Updated by Patrik Stahlman about 2 months ago
I re-ran the test with this sequence:
- disabled all four limiters (LAN/WAN)
- reset the firewall state table
- Diagnostics / Limiter Info showed four limiters still active
- reboot
- Diagnostics / Limiter Info showed no limiters active
The result is the same, disabling a limiter requires a restart to take effect.
Updated by Patrik Stahlman about 2 months ago
My comment #2 might stem from a misunderstanding of how the limiters are implemented. I have done some more testing and realised that disabling the limiters leaves them configured, but they are not participating in the "traffic flow". So, seeing them in Diagnostics / Limiter Info is fine as long as they are not doing any limiting while disabled. Sorry, my bad; hopefully no time/effort has been wasted on this misunderstanding.
Updated by Jordan G about 1 month ago
- Status changed from New to Incomplete
Thanks for the update; we'll mark this incomplete in case you encounter anything further.