Bug #15449
Closed
IPsec VTI static routes may not be added after the system boots
100%
Description
I have a pair of 4200s which were running 23.09.1
Both have an old gateway in a disabled state (see Disabled gateway.png)
There is an IPsec VTI between the 4200s & thus each has a static route to the remote network (Static route.png)
After upgrading to 24.03 I deleted the unused gateway, but upon reboot the static route is not loaded, thus traffic to the remote site doesn't flow. Initially I thought this was due to the state policy change but this is not the case; traffic flows properly as long as the static routes are in place on each end.
This is happening on both systems and is easy for me to replicate: roll back to the config with the disabled gateway, reboot (part of the restore), delete the gateway, and then reboot again. If I leave the disabled gateway in place, the static route is loaded properly upon boot.
The attached screenshots should be self explanatory & the attached system.log covers the boot where the static route was not applied. I have not identified any messages that accompany the problem of the missing route.
--Larry
Files
Updated by Jim Pingle 7 months ago
- Project changed from pfSense Plus to pfSense
- Category changed from Configuration Backend to Routing
- Target version set to 2.8.0
- Affected Plus Version deleted (24.03)
- Plus Target Version set to 24.07
Updated by Larry Fahnoe 7 months ago
Additional information.
The gateway that is disabled was originally used with a fiber provider's ONT/router which was operating as a router, thus the WAN int and WANGW gateway had static RFC-1918 addressing. When I reconfigured the ONT/router to operate as a bridge, I switched the WAN int from Static IPv4 to DHCP, which caused pfSense to create a new gateway WAN_DHCP which is what is currently being used. Thinking I might need to go back to using the router I disabled rather than deleted the WANGW.
Now, even though the WANGW is disabled, I'm noting that there is a dpinger trying to monitor an IP that is no longer reachable, so I go to edit WANGW to disable gateway monitoring. System won't allow that because the static IP is no longer within the interface's network, so I blanked out the IP and disabled gateway monitoring.
Upon rebooting, the static route is now gone once again.
Updated by Azamat Khakimyanov 7 months ago
I've tested on 23.09.1
- I added a disabled WAN gateway that is not in the same subnet as the real WAN subnet
- then I added an IPsec VTI tunnel, assigned it, and enabled it
- then I added a static route via this IPsec VTI
Finally, I upgraded from 23.09.1 to the latest 24.03 and was not able to reproduce the issue of the static route not being loaded after deleting the disabled WAN gateway.
Please delete the wrong gateway, delete this routed IPsec tunnel, and recreate it from scratch.
If you still have an issue with static routes via the IPsec VTI, please open a ticket on our support platform: https://portal.netgate.com
Updated by Larry Fahnoe 7 months ago
So on one of the 4200s running 24.03 I have done the following:
1. Deleted static route to 192.168.5.0/24
2. Deleted ALEX_MPLS IPsec VTI interface assignment
3. Deleted IPsec P1 and P2
4. Rebooted
1. Add IPsec P1
2. Add IPsec P2
3. Apply changes
4. Add interface OPT3 ipsec1 (IPsec VTI: Mpls), enable, rename to ALEX_MPLS
5. Apply changes
6. Add static route to 192.168.5.0/24 via ALEX_MPLS
7. Apply changes
8. Routes are present as expected and can ssh to remote pfSense via IPsec
9. Reboot
ALEX_MPLS interface is up and both sides are pingable
Static route to 192.168.5.0/24 does not show up which is the problem I'm attempting to document
# netstat -in | grep ipsec1
ipsec1  1400  <Link#9>          ipsec1                            387  0  0  117  0  0
ipsec1     -  fe80::%ipsec1/64  fe80::92ec:77ff:fe8e:66e4%ipsec1    0  -  -    2  -  -
ipsec1     -  192.168.8.0/30    192.168.8.2                       114  -  -  112  -  -
#
# netstat -rn4
Routing tables

Internet:
Destination        Gateway            Flags     Netif Expire
default            47.7.216.1         UGS        igc3
47.7.216.0/21      link#4             U          igc3
47.7.222.194       link#6             UHS         lo0
70.59.119.49       47.7.216.1         UGHS       igc3
127.0.0.1          link#6             UH          lo0
192.168.3.0/24     link#3             U          igc2
192.168.3.1        link#6             UHS         lo0
192.168.8.1        link#9             UH       ipsec1
192.168.8.2        link#6             UHS         lo0
192.168.10.1       link#6             UH          lo0
#
# ping -qc 2 192.168.8.1
PING 192.168.8.1 (192.168.8.1): 56 data bytes

--- 192.168.8.1 ping statistics ---
2 packets transmitted, 2 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 24.621/24.764/24.907/0.143 ms
#
# ping -qc 2 192.168.8.2
PING 192.168.8.2 (192.168.8.2): 56 data bytes

--- 192.168.8.2 ping statistics ---
2 packets transmitted, 2 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 0.255/0.289/0.322/0.033 ms
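For anyone who wants to script the check rather than eyeball the routing table, here is a minimal sketch in Python. It assumes the text of `netstat -rn4` has already been captured into a string; the function name `has_route` and the shortened `SAMPLE` table are illustrative only and not part of pfSense.

```python
import ipaddress

def has_route(netstat_output: str, destination: str) -> bool:
    """Return True if `destination` appears in the Destination column."""
    want = ipaddress.ip_network(destination)
    for line in netstat_output.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # blank lines and section titles
        try:
            # Host routes such as 192.168.8.1 parse as /32 networks.
            dest = ipaddress.ip_network(fields[0], strict=False)
        except ValueError:
            continue  # column header, "default", and similar entries
        if dest == want:
            return True
    return False

# Shortened copy of the table above (assumption: same column layout).
SAMPLE = """\
Destination        Gateway            Flags     Netif Expire
default            47.7.216.1         UGS        igc3
192.168.3.0/24     link#3             U          igc2
192.168.8.1        link#9             UH       ipsec1
"""

if __name__ == "__main__":
    # The static route from this report is absent, so this prints False.
    print(has_route(SAMPLE, "192.168.5.0/24"))
```

Running the same check once after boot on each firewall would show whether the route to 192.168.5.0/24 was installed without reading the full table.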
This configuration was working with 23.09.1. I will attempt to open a ticket per your request, however I only have TAC-lite.
--Larry
Updated by Lev Prokofev 7 months ago
Ticket for reference: #2703470963; the SOs and steps are included.
Updated by Lev Prokofev 7 months ago
I finally replicated the issue by restoring the config from the status output file. The root cause is still unknown, but I was able to find a way to fix the behavior:
1) Move the static route to another gateway or delete it
2) Delete the IPsec gateway
3) Re-create the IPsec gateway
4) Move/set the static route to the IPsec gateway
Updated by Steve Wheeler 6 months ago
- Status changed from Incomplete to Confirmed
Updated by Azamat Khakimyanov 6 months ago
I used the customer's status output file to create the same config in my lab (as Lev did), but I still wasn't able to reproduce this issue.
According to status_output-24.03-after-upgrade.tgz (ticket #2703470963):
customer deleted disabled/wrong gateway
May 8 18:07:32 pfs-a php-fpm408: /system_gateways.php: Configuration Change: fahnoe@192.168.3.65 (Local Database): Gateways: removed gateway 0
and then customer rebooted pfSense at 18:35
May 8 18:35:08 pfs-a php-cgi482: rc.bootup: Default gateway setting Via Spectrum as default.
May 8 18:35:08 pfs-a php-cgi482: rc.bootup: Gateway, NONE AVAILABLE
May 8 18:35:08 pfs-a php-cgi482: rc.bootup: route_add_or_change: Invalid gateway and/or network interface ipsec1
and one more time at 18:56
May 8 18:56:58 pfs-a php-fpm543: /rc.newwanip: rc.newwanip: on (IP address: 47.7.222.194) (interface: WAN[wan]) (real interface: igc3).
May 8 18:56:58 pfs-a php-cgi630: rc.bootup: dpinger: status socket /var/run/dpinger_WAN_DHCP~47.7.222.194~47.7.216.1.sock not found
May 8 18:56:58 pfs-a php-cgi630: rc.bootup: Default gateway setting Via Spectrum as default.
May 8 18:56:58 pfs-a php-cgi630: rc.bootup: Gateway, NONE AVAILABLE
May 8 18:56:58 pfs-a php-cgi630: rc.bootup: route_add_or_change: Invalid gateway and/or network interface ipsec1
So to me it looks like removing the disabled/wrong WAN gateway, which shouldn't have existed at all, partly broke pfSense functionality.
I would check whether an additional test should be added that prevents wrong gateways from being added at boot, or a more precise check before gateways are deleted.
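The failure message quoted above can be pulled out of a saved system log with a short script. This is only a sketch: it assumes the log lines have the same layout as the excerpts in this comment, and the function name `failed_route_interfaces` is made up for illustration.

```python
import re

# Message text exactly as it appears in the log excerpts above.
FAILURE = "route_add_or_change: Invalid gateway and/or network interface"

# Captures the leading timestamp and the interface name after the message.
PATTERN = re.compile(
    r"^(\w{3}\s+\d+\s+[\d:]+)\s.*" + re.escape(FAILURE) + r"\s+(\S+)"
)

def failed_route_interfaces(log_text):
    """Return a list of (timestamp, interface) for each failed route add."""
    hits = []
    for line in log_text.splitlines():
        match = PATTERN.search(line)
        if match:
            hits.append((match.group(1), match.group(2)))
    return hits

LOG = """\
May 8 18:35:08 pfs-a php-cgi482: rc.bootup: Gateway, NONE AVAILABLE
May 8 18:35:08 pfs-a php-cgi482: rc.bootup: route_add_or_change: Invalid gateway and/or network interface ipsec1
May 8 18:56:58 pfs-a php-cgi630: rc.bootup: route_add_or_change: Invalid gateway and/or network interface ipsec1
"""

if __name__ == "__main__":
    for timestamp, interface in failed_route_interfaces(LOG):
        print(timestamp, interface)
```

On the excerpt above this reports one failure per boot, both naming ipsec1, which matches the two reboots at 18:35 and 18:56.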
Updated by Larry Fahnoe 6 months ago
Another customer is experiencing related issues, see https://forum.netgate.com/topic/188214/vti-gateways-in-24-03 beginning with the posts from May 17 2024
At this point I do not believe this is caused by deleting the disabled gateway. As mentioned in the support ticket dialog, I rolled back to 23.09.1, deleted the disabled gateway WANGW, observed that after rebooting, the static route was loaded and traffic was properly flowing over the IPsec connection. I then upgraded to 24.03 and observed that the static route was NOT being loaded and therefore traffic was not properly flowing over the IPsec connection.
--Larry
Updated by Jim Pingle 6 months ago
- Plus Target Version changed from 24.07 to 24.08
Updated by Marcos M 6 months ago
- File 15449.txt added
- Subject changed from Delete disabled gateway prevents static routes being loaded on boot to IPsec static routes may not added after the system boots
- Status changed from Confirmed to Ready To Test
- Assignee set to Marcos M
- Affected Architecture All added
- Affected Architecture deleted (amd64)
The inconsistency of the issue seems to stem from the Gateway Monitoring setting. When unchecked (default), the routes will be added as part of the process that happens when a gateway comes online. When checked (monitoring is disabled), no action is taken and hence the routes are not added.
The root of the issue is that when a tunnel is set up, the VTI may not yet be in the interface cache (e.g. after rebooting) and hence adding the route fails (since it's based off the cache). I've attached a patch that should resolve the issue.
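To illustrate the stale-cache failure described above, here is a toy model in Python. It is not the pfSense code and not the attached patch: `InterfaceCache`, `add_route`, and the `refresh_on_miss` option are hypothetical names, and refreshing the cache on a miss is just one plausible way such a problem could be avoided.

```python
class InterfaceCache:
    """Toy stand-in for a cached list of interface names (hypothetical)."""

    def __init__(self, live_interfaces):
        self._live = live_interfaces          # callable returning current names
        self._cached = set(live_interfaces())  # snapshot taken once, up front

    def contains(self, name, refresh_on_miss=False):
        if name in self._cached:
            return True
        if refresh_on_miss:
            # Re-read the live list before concluding the interface is missing.
            self._cached = set(self._live())
            return name in self._cached
        return False


def add_route(routes, cache, network, interface, refresh_on_miss=False):
    """Add a route only if the cache knows the interface; return success."""
    if not cache.contains(interface, refresh_on_miss):
        return False  # analogous to the "Invalid gateway and/or network interface" log line
    routes[network] = interface
    return True


if __name__ == "__main__":
    live = ["igc2", "igc3"]
    cache = InterfaceCache(lambda: live)  # snapshot taken before the VTI exists
    live.append("ipsec1")                 # tunnel interface appears afterwards
    routes = {}
    print(add_route(routes, cache, "192.168.5.0/24", "ipsec1"))
    print(add_route(routes, cache, "192.168.5.0/24", "ipsec1", refresh_on_miss=True))
```

The first call fails because the snapshot predates the ipsec1 interface; the second succeeds once the cache is re-read, which is the general shape of the behavior described in this comment.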
Updated by Marcos M 6 months ago
- Status changed from Ready To Test to Feedback
- % Done changed from 0 to 100
Applied in changeset 487d7d5e322993703716439422e3d032e40b61b4.
Updated by Lev Prokofev 6 months ago
The patch is working, confirmed in ticket #2703470963 and on my test device. The issue can be marked as resolved.
Updated by Jim Pingle about 1 month ago
- Plus Target Version changed from 24.08 to 24.11