Bug #11934

closed

IPSEC stops working on 2.5.1 running on Watchguard XTM 5

Added by Paul Kennedy almost 3 years ago. Updated almost 3 years ago.

Status: Not a Bug
Priority: Normal
Assignee: -
Category: IPsec
Target version: -
Start date: 05/18/2021
Due date:
% Done: 0%
Estimated time:
Plus Target Version:
Release Notes: Default
Affected Version:
Affected Architecture:

Description

I have four sites that were all running pfSense 2.4.5p1, connected to one another over IPSEC without any major issues.

The internal networks are /24s: 172.16.0.x, 172.16.1.x, 172.16.2.x and 172.16.3.x.

With the release of 2.5.0 I ran the upgrade on 172.16.0.x (which is ideally a test-lab location), and it kinda screwed up (I know, I should have clean installed…). That site was using a Lanner box with an older Atom processor that is pretty much end-of-life, so I got some Watchguard Firebox XTM 5s with C2D processors and 4 GB RAM. They were my short-term upgrade path for greater use of IDS, as the Atom ran too high on utilization when doing a lot…

I built the XTM 5, restored a configuration, and after a lot of tweaking got it running with all packages and IPSEC tunnels. No biggie, it just took longer and was a little more complex than I had hoped.

Herein lies the issue… After running for a while, IPSEC at that location just appears to stop: the VPN goes offline, and clicking connect from there or from one of the other sites doesn’t resolve anything. Clicking stop in the GUI doesn’t stop it, and restart also seems to do nothing. I am unable to run ‘swanctl --list-conns’ or ‘swanctl --load-all --file /var/etc/ipsec/swanctl.conf --debug 1’, as neither responds with anything.
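One way to tell a hung daemon apart from a merely slow one is to run the same command with a hard timeout. The following is only a minimal sketch, assuming Python 3 is available on the firewall; the helper name `responds_within` and the timeout values are hypothetical, and only the `swanctl --list-conns` command comes from the report above.

```python
import subprocess


def responds_within(command, timeout_s=10.0):
    """Return True if `command` exits within `timeout_s` seconds.

    Returns False if the process has to be killed because it never
    returned, which is the symptom described in this report.
    Hypothetical helper, not part of pfSense or strongSwan.
    """
    try:
        subprocess.run(
            command,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,  # the child is killed when this expires
        )
    except subprocess.TimeoutExpired:
        return False
    return True


# Example call on the affected firewall (assumes swanctl is on PATH):
#   responds_within(["swanctl", "--list-conns"], timeout_s=10.0)
```

A `False` result only says that the command did not return in time. It does not say why, so it is a way to timestamp the failure rather than diagnose it.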

If I reboot, all is good for a while until the same happens again.

I also built new, re-entering all info manually, and the same occurs, usually after a day or so (and the VPNs are not under heavy load most of the time, as they are used primarily for phones at remote sites).

Another firewall (different hardware) is exhibiting the same issue. Both are running 2.5.1, and both were built clean and reconfigured manually to remove any doubt of upgrade issues. Both are built on Watchguard XTM5 hardware.

Selecting Stop for IPSEC on the services page never stops it. Rebooting the firewall brings it back to normal, and it works for a day or so before stopping again.

The log shows the following, and then nothing for days until rebooted...

May 7 00:16:57 charon 59608 12[ENC] <con100000|63> generating INFORMATIONAL response 716 [ ]
May 7 00:16:57 charon 59608 12[NET] <con100000|63> sending packet: from XXX.XXX.XXX.XXX500 to XXX.XXX.XXX.XX500 (57 bytes)
May 7 00:17:00 newsyslog 25803 logfile turned over due to size>500K
May 7 00:17:00 newsyslog 25803 logfile turned over due to size>500K
May 7 00:17:06 charon 59608 15[NET] <con300000|66> received packet: from XXX.XXX.XXX.XX500 to XXX.XXX.XX.XX500 (57 bytes)
May 7 00:17:06 charon 59608 15[ENC] <con300000|66> parsed INFORMATIONAL request 344 [ ]
May 7 00:17:06 charon 59608 15[ENC] <con300000|66> generating INFORMATIONAL response 344 [ ]
May 7 00:17:06 charon 59608 15[NET] <con300000|66> sending packet: from XXX.XXX.XX.XXX500 to XXX.XXX.XXX.XX500 (57 bytes)
May 7 00:28:45 charon 59608 03[KNL] creating rekey job for CHILD_SA ESP/0xc4427143/XXX.XXX.XXX.XXX
May 7 00:29:32 charon 59608 03[KNL] creating rekey job for CHILD_SA ESP/0xc3cd1301/XXX.XXX.XXX.XXX
May 7 00:35:33 charon 59608 03[KNL] creating rekey job for CHILD_SA ESP/0xc2535822/XXX.XXX.XXX.XXX
May 7 00:37:14 charon 59608 03[KNL] creating rekey job for CHILD_SA ESP/0xc6823624/XXX.XXX.XXX.XXX
May 7 00:38:50 charon 59608 03[KNL] creating delete job for CHILD_SA ESP/0xc4427143/XXX.XXX.XXX.XXX
May 7 00:38:50 charon 59608 03[KNL] creating delete job for CHILD_SA ESP/0xc3cd1301/XXX.XXX.XXX.XXX
May 7 00:46:02 charon 59608 03[KNL] creating delete job for CHILD_SA ESP/0xc2535822/XXX.XXX.XXX.XXX
May 7 00:46:02 charon 59608 03[KNL] creating delete job for CHILD_SA ESP/0xc6823624/XXX.XXX.XXX.XXX
May 7 00:51:12 charon 59608 03[KNL] creating rekey job for CHILD_SA ESP/0xc12d5134/XXX.XXX.XXX.XXX
May 7 00:54:35 charon 59608 03[KNL] creating rekey job for CHILD_SA ESP/0xc2f81b76/XXX.XXX.XXX.XXX
May 7 01:02:12 charon 59608 03[KNL] creating delete job for CHILD_SA ESP/0xc12d5134/XXX.XXX.XXX.XXX
May 7 01:02:12 charon 59608 03[KNL] creating delete job for CHILD_SA ESP/0xc2f81b76/XXX.XXX.XXX.XXX
May 11 21:56:19 charon 59608 03[KNL] interface pppoe0 activated
May 11 21:56:19 charon 59608 03[KNL] XXX.XXX.XXX.XXX disappeared from pppoe0
May 11 21:56:19 charon 59608 03[KNL] interface pppoe0 deactivated
May 11 21:56:34 charon 59608 03[KNL] XXX.XXX.XXX.XXX appeared on pppoe0
May 12 12:41:43 charon 59608 00[DMN] SIGTERM received, shutting down

Even after selecting stop multiple times, nothing is added to or changes in the log.
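To pin down when the daemon went quiet in a log like the one above, it can help to look for the largest gap between consecutive charon entries. This is a minimal sketch, not a pfSense tool: the function names are hypothetical, and because syslog lines carry no year, 2021 is assumed from the start date of this report.

```python
from datetime import datetime


def parse_timestamp(line, year=2021):
    """Parse the leading 'Mon D HH:MM:SS' of a syslog line.

    The year is not logged, so it is passed in as an assumption.
    """
    month, day, clock = line.split()[:3]
    return datetime.strptime(f"{year} {month} {day} {clock}", "%Y %b %d %H:%M:%S")


def largest_gap(lines):
    """Return (gap, before, after) for the longest silence between
    consecutive charon entries, or None if there are fewer than two."""
    stamps = [parse_timestamp(line) for line in lines if " charon " in line]
    gaps = [(later - earlier, earlier, later)
            for earlier, later in zip(stamps, stamps[1:])]
    return max(gaps) if gaps else None
```

Fed the excerpt above, this should point at the silence between the last delete job on May 7 and the pppoe0 events on May 11.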

Actions #1

Updated by Jim Pingle almost 3 years ago

  • Status changed from New to Not a Bug

This site is not for support or diagnostic discussion.

For assistance in solving problems, please post on the Netgate Forum or the pfSense Subreddit.

See Reporting Issues with pfSense Software for more information.

Actions #2

Updated by Paul Kennedy almost 3 years ago

Sorry Jim, but I thought that this was a bug related to 2.5.1 running on specific hardware...

It works fine on 2.4.5p1, and there are quite a few people running XTM5s.

Anyway, just thought it might be relevant.

Best regards
Paul.

Actions #3

Updated by Jim Pingle almost 3 years ago

We don't claim to officially support that hardware, so if it's hardware specific, there is nothing Netgate/pfSense can do for it. You'd have to reproduce it on FreeBSD and report it upstream. That said, there is not enough information to draw any kind of conclusion. It belongs on the forum at least until a more definitive diagnosis can be reached.

Actions #4

Updated by Paul Kennedy almost 3 years ago

Apologies, it’s on the forum under IPSEC; someone else running the same hardware recorded the same info, and there have been no other responses.

I thought it was obviously reproducible and could be identified.

As I have never reported what I thought to be a bug before, I didn’t know the correct procedure or that I would be doing something inappropriate. I was willing to provide info to help the community, but the first response was not a request for further info.

Many thanks
Paul.

Actions #5

Updated by Denis Grilli almost 3 years ago

I cannot tell if it is the same issue, but with 2.5.1 I am experiencing a similar problem with VPN, and not on Watchguard hardware, so if it is the same, it is definitely not a hardware issue.

My VPNs work normally, but on some occasions I have found them all down and, as reported here, no connection is possible, and stopping and restarting the ipsec service doesn't work.

What I found is that the strongswan configuration gets lost and no longer reflects the configuration in the UI. Editing any of the VPNs and saving, even without changing anything, seems to trigger the re-creation of the configuration in strongswan, and the VPNs (all of them) start working again.

Can the author of this post try editing the VPN config and test whether my workaround works as well? That would confirm whether the problem is the same or not.

Actions #6

Updated by Paul Kennedy almost 3 years ago

I tried altering, saving, and then applying, but there is still no IPSEC status and I am still unable to stop or start the service...
