Bug #16019
closed
Kea can unintentionally attempt to spawn multiple processes and fail
Description
When doing things like restarting the Kea service or switching between ISC and Kea, there is a possibility of Kea attempting to launch multiple copies of itself.
I haven't been able to recreate the exact situation where this happens, but we've had 4-5 different customers experience the same issue. Once the problem occurs, the following log items are present repeatedly:
kea-dhcp458922: WARN [kea-dhcp4.dhcpsrv.0xf488ce12000] DHCPSRV_OPEN_SOCKET_FAIL failed to open socket: Failed to open socket on interface [Interface Here], reason: failed to bind fallback socket to address [Interface Address Here], port 67, reason: Address already in use - is another DHCP server running?
Once this happens, you have to manually kill the duplicate processes and then restart the Kea service for the problem to go away.
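For anyone scanning logs for this condition, here is a minimal Python sketch that pulls the interface, address, and port out of the warning quoted above. The pattern is based only on the message text shown in this issue, so other Kea versions may word it differently. The function name `failed_binds` and the sample interface and address are hypothetical, chosen for illustration.

```python
import re

# Pattern derived from the DHCPSRV_OPEN_SOCKET_FAIL line quoted in this issue.
# Assumption: the wording is stable; adjust it if your Kea version differs.
SOCKET_FAIL = re.compile(
    r"DHCPSRV_OPEN_SOCKET_FAIL failed to open socket: "
    r"Failed to open socket on interface (?P<iface>[^,]+), "
    r"reason: failed to bind fallback socket to address (?P<addr>[^,]+), "
    r"port (?P<port>\d+)"
)


def failed_binds(log_lines):
    """Return (interface, address, port) for every bind failure found."""
    hits = []
    for line in log_lines:
        match = SOCKET_FAIL.search(line)
        if match:
            hits.append((match["iface"], match["addr"], int(match["port"])))
    return hits


# Illustrative line only: igc1 and 192.0.2.1 are placeholder values.
sample_line = (
    "WARN DHCPSRV_OPEN_SOCKET_FAIL failed to open socket: "
    "Failed to open socket on interface igc1, reason: failed to bind "
    "fallback socket to address 192.0.2.1, port 67, reason: "
    "Address already in use - is another DHCP server running?"
)

for iface, addr, port in failed_binds([sample_line]):
    print(f"bind failed on {iface} ({addr}:{port})")
```

Repeated hits for the same interface suggest that another process is still holding port 67 on it.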
Updated by Sevi A 2 months ago
Thanks for the info! I just ran into the same issue and I think it's also mentioned in the forums here: https://forum.netgate.com/topic/195583/kea-dhcp-reservations-not-being-honored
I'm running 24.11 on a Netgate 4100 and noticed the issue when some devices wouldn't get a new IP through DHCP after updating static mappings. I think the issue appeared for me after I switched to KEA from ISC (already a while back so I can't recall if I restarted the device in the meantime, but most likely not) and recently added two new VLANs with corresponding interfaces and DHCP servers.
At least that's what the output of `sockstat -l | grep :67` suggests:
root kea-dhcp4 15901 16 udp4 192.168.104.1:67 *:*
root kea-dhcp4 15901 18 udp4 192.168.60.1:67 *:*
root kea-dhcp4 33016 33 udp4 192.168.66.1:67 *:*
< removed redundant lines for 10 more networks with PID 33016 >
The networks with the server listening on 192.168.60.1/24 and 192.168.104.1/24 are the newly created interfaces and run in a different process. The issue appeared with a host on one of the "old" networks (192.168.66.1/24): it got an IP from the DHCP pool, but never updated after I assigned a static mapping.
Similar to the forum post above, stopping the DHCP server through the web GUI only stopped the new process 15901, but the old process 33016 kept running and had to be killed manually to solve the issue.
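The check above can be sketched in Python: group the port 67 listeners from `sockstat -l` by PID and flag the case where more than one kea-dhcp4 process holds sockets. This assumes the column order shown in the output above (user, command, PID, fd, protocol, local address, foreign address). The function name `dhcp_listeners_by_pid` is hypothetical.

```python
from collections import defaultdict


def dhcp_listeners_by_pid(sockstat_output, command="kea-dhcp4", port=67):
    """Map each PID of `command` to the local addresses it has bound on `port`."""
    listeners = defaultdict(list)
    for line in sockstat_output.splitlines():
        fields = line.split()
        # Assumed columns: USER COMMAND PID FD PROTO LOCAL FOREIGN
        if len(fields) < 7 or fields[1] != command or not fields[2].isdigit():
            continue
        addr, _, local_port = fields[5].rpartition(":")
        if local_port == str(port):
            listeners[int(fields[2])].append(addr)
    return dict(listeners)


# Lines copied from the sockstat output quoted above.
sample = """\
root kea-dhcp4 15901 16 udp4 192.168.104.1:67 *:*
root kea-dhcp4 15901 18 udp4 192.168.60.1:67 *:*
root kea-dhcp4 33016 33 udp4 192.168.66.1:67 *:*
"""

by_pid = dhcp_listeners_by_pid(sample)
if len(by_pid) > 1:
    print("multiple kea-dhcp4 processes hold port 67:", sorted(by_pid))
```

On a healthy system every listener should belong to a single PID. Two or more PIDs match the stuck state described in this report.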
I would love to help, but I have no clue how to reproduce the situation or how to diagnose this further.
Updated by Christian McDonald 2 months ago
- Status changed from New to Feedback
- Assignee set to Christian McDonald
- Target version set to 2.8.0
- Plus Target Version changed from 24.11 to 25.07
A mitigation has been added for this.
Updated by Christian McDonald 2 months ago
- Plus Target Version changed from 25.07 to 25.03
Updated by Jim Pingle about 2 months ago
- Subject changed from Kea-dhcp4 can sometimes get "stuck" with multiple processes running and cause binding issues to Kea can unintentionally attempt to spawn multiple processes and fail
Updated by Jason Montleon about 1 month ago
I ran into this on a 6100. I recently assigned an interface to add a new network and enabled the DHCP server on the interface. Today I was trying to update a client's settings and they were not taking effect.
Looking in the log I saw the 'DHCPSRV_OPEN_SOCKET_FAIL' error message going back to around the time I configured the new interface, though at this point it is hard for me to say if it coincides exactly.
I stopped the service in the web UI and found there were still two processes running. After killing them and restarting the service it seems to be working correctly again. I wasn't smart enough to look at sockstat output before killing them, but if I had to guess one was listening on each interface similar to what Sevi A saw.
It seems like it could be a similar situation to his, "...recently added two new VLANs with corresponding interfaces and DHCP servers..."
Updated by John Doe 30 days ago
Hello all,
I just saw the same behavior on my Netgate 4200 without experimenting with pfSense or changing anything in the setup. I have been using Kea from the very beginning, and today, for no apparent reason, my PC was dropped from DHCP and lost its IP.
After I restarted my PC I could get an IP from DHCP, but I found the following in the system logs: "DHCPSRV_OPEN_SOCKET_FAIL failed to open socket: Failed to open socket on interface igc1, reason: failed to bind fallback socket to address xxx.xxx.xxx.xxx, port 67, reason: Address already in use - is another DHCP server running?"
regards
Tom
Updated by Kris Phillips 10 days ago
Tried switching between kea and isc, stopping and starting services in unusual ways, etc. I'm no longer able to reproduce kea spawning multiple processes that it shouldn't.
At this point, I think we can mark this as Resolved, but it would be good to have one more person try to reproduce this on 25.03-BETA.
Updated by Christian McDonald 1 day ago
- Status changed from Feedback to Resolved
Marking resolved.
Thanks