Bug #14507
CPU hog with 23.05 (closed)
Description
I've started to observe a CPU hog on one CPU core of an APU2 box running pfSense 23.05.
dtrace showed:
kernel`pmap_copy 33
kernel`amd64_syscall 33
kernel`vm_radix_insert 35
kernel`vm_map_pmap_enter 37
kernel`vm_radix_lookup_unlocked 38
kernel`memcpy_std 38
kernel`vm_object_deallocate 39
kernel`pmap_enter_quick_locked 41
kernel`em_update_stats_counters 43
kernel`copyout_nosmap_std 43
kernel`ck_epoch_poll_deferred 44
kernel`sbuf_put_bytes 46
kernel`vm_page_pqbatch_submit 48
kernel`pmap_remove_pte 51
kernel`pmap_pvh_remove 53
kernel`vm_pqbatch_process_page 54
kernel`cpu_search_highest 56
kernel`get_pv_entry 57
kernel`pmap_try_insert_pv_entry 59
kernel`vm_map_lookup_entry 65
kernel`epoch_call_task 92
kernel`pmap_enter 101
kernel`vm_fault 110
kernel`pagecopy 110
kernel`0xffffffff81 133
kernel`lock_delay 145
kernel`pmap_remove_pages 203
kernel`_thread_lock 415
kernel`pagezero_std 490
kernel`assert_rw 532
kernel`acpi_cpu_c1 600
kernel`callout_lock 641
kernel`kern_yield 1206
kernel`_callout_stop_safe 2010
kernel`spinlock_enter 2032
kernel`tcp_timer_stop 2703
0x0 5927
kernel`spinlock_exit 40964
kernel`cpu_idle 61943
kernel`sched_idletd 76722
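For anyone comparing their own profile against the aggregation above, here is a minimal Python sketch that parses "symbol count" pairs and ranks the non-idle symbols by their share of samples. It is not part of the original report. The helper names (`parse_counts`, `busy_ranking`) and the choice of which symbols count as idle are assumptions made for illustration; `0x0` is kept as a non-idle entry because the output does not say what it represents.

```python
# Excerpt of the aggregation above (its eight highest counts), copied as-is.
SAMPLE = """
kernel`kern_yield 1206
kernel`_callout_stop_safe 2010
kernel`spinlock_enter 2032
kernel`tcp_timer_stop 2703
0x0 5927
kernel`spinlock_exit 40964
kernel`cpu_idle 61943
kernel`sched_idletd 76722
"""

# Assumption: samples in these symbols are treated as idle time.
IDLE_SYMBOLS = {"kernel`cpu_idle", "kernel`sched_idletd", "kernel`acpi_cpu_c1"}


def parse_counts(text):
    """Return {symbol: count} from whitespace-separated 'symbol count' pairs.

    Works on one pair per line as well as on pairs flattened onto one line.
    """
    tokens = text.split()
    if len(tokens) % 2 != 0:
        raise ValueError("expected an even number of tokens (symbol, count)")
    return {sym: int(count) for sym, count in zip(tokens[::2], tokens[1::2])}


def busy_ranking(counts, idle=IDLE_SYMBOLS):
    """Return [(symbol, count, share_of_non_idle_samples)], highest count first."""
    busy = {sym: n for sym, n in counts.items() if sym not in idle}
    total = sum(busy.values())
    if total == 0:
        return []
    ranked = sorted(busy.items(), key=lambda item: item[1], reverse=True)
    return [(sym, n, n / total) for sym, n in ranked]


if __name__ == "__main__":
    for sym, n, share in busy_ranking(parse_counts(SAMPLE)):
        print(f"{sym:32s} {n:8d} {share:6.1%}")
```

Because the sample only contains the eight largest entries, the shares it prints are relative to that excerpt, not to the full profile.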
The symptom is that the kernel thread "kernel{if_io_tqg_1}" consumes 100% of one CPU core.
Updated by Kris Phillips over 1 year ago
I'm unable to reproduce this on 23.05 on an amd64 system.
kernel{if_io_tqg_1} would be interface processing from the iflib library, if I'm not mistaken. Is your NIC handling a lot of throughput when this happens?
Updated by Juraj Lutter over 1 year ago
Kris Phillips wrote in #note-1:
I'm unable to reproduce this on 23.05 on an amd64 system.
kernel{if_io_tqg_1} would be interface processing from the iflib library, if I'm not mistaken. Is your NIC handling a lot of throughput when this happens?
It happens every 2-3 days with low-to-moderate traffic. I have two of these boxes at home, both running 23.05. On the first one, the problem appears every 1.5-2 days. I used the second one to rule out a hardware problem; on that box, the problem appears less often (every 4-5 days).
Can a DEBUG kernel be built for pfSense? If so, I could put it to work, attach a serial console, and drop into ddb to get a backtrace.
I've already looked into the iflib sources, but I'm not a kernel person.
Updated by Marcos M over 1 year ago
If there is a bug, it's more likely to be upstream. FWIW a debug kernel is available in the pfSense repo:
pkg search debug
pfSense-kernel-debug-pfSense-23.05.1.r.20230621.1927 pfSense kernel-debug (pfSense)
Updated by Jim Pingle over 1 year ago
- Status changed from New to Not a Bug
Given that the thread in question is from iflib, this seems more like busy hardware or an upstream driver issue than a bug in pfSense. Please keep discussion about it on the forum until a more definite diagnosis can be reached. If there is something actionable, this can always be updated with more information and reopened later.