Feature #12169
closed
IPsec keep alive option to initiate phase 2 without using ICMP
Added by Jim Pingle over 3 years ago.
Updated about 3 years ago.
Plus Target Version:
22.01
Description
Currently the IPsec GUI allows users to enter an IP address to ping a remote host as a means to connect a P2 and keep it active. This works OK for tunnel mode since the ping will match a trap policy and initiate the tunnel but is not viable for VTI as VTI doesn't support trap policies.
We should change this into an option where the user can choose to initiate the P2 periodically if it is down, on the same schedule as the ping runs, for a similar net effect.
Potential problems/random thoughts/notes when implementing this:
- The "automatically ping host" option is the only entry in the P2 "Advanced Configuration" options, so rename that to something more relevant like "Keep Alive"
- This should be a separate option from ping host
- A user may still want to send a ping even if it doesn't initiate, if the far side requires traffic to keep it alive. This may not be necessary these days, as DPD does the job that ping used to do, but there may still be third parties which disconnect idle tunnels based on traffic.
- If this is tunnel mode it's redundant to do both, so either disallow that or warn against it. Doing both for VTI is fine.
- This should be a separate script from the current ping_hosts.sh script
- Likely will need to be PHP, otherwise it will take a lot more work to write code to fetch and parse things out from the IPsec status and config
- Code should collect a list of all P2s which want to be checked, and then when the time comes, loop through them and see if they are connected.
- If there is an active child SA matching the P2, nothing should be done
- If no matching child SA is found, then initiate the P2
- As a part of other ongoing work, the code to fetch the status when checking is already being moved to a function, which can be leveraged for this when the time comes.
- Also there will be a new ipsec_initiate() function which likewise may be leveraged here.
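
The check loop described in the notes above can be sketched roughly as follows. This is only an illustration in Python, not the actual implementation (the notes say it will likely be PHP). The Phase2 class, the keepalive field, the P2 names, and the initiate callback are all hypothetical stand-ins; the callback plays the role of the planned ipsec_initiate() function, and the set of active child SAs stands in for whatever the status function returns.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Phase2:
    """Hypothetical stand-in for a configured P2 entry."""
    name: str
    keepalive: bool  # the proposed keep alive option for this P2


def p2s_to_initiate(configured, active_child_sas):
    """Return names of P2s that opted in and have no matching child SA."""
    return [
        p2.name
        for p2 in configured
        if p2.keepalive and p2.name not in active_child_sas
    ]


def run_keepalive(configured, active_child_sas, initiate):
    """One scheduled pass: initiate every opted-in P2 that is down."""
    pending = p2s_to_initiate(configured, active_child_sas)
    for name in pending:
        initiate(name)  # placeholder for the planned ipsec_initiate()
    return pending
```

A P2 with an active child SA is left alone, and a P2 without the option set is never touched, which matches the two rules in the notes.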
- Description updated (diff)
Also note this should solve what some users see where after some time of a peer being down, a VTI tunnel won't automatically reconnect without manual intervention.
- Status changed from New to Feedback
- % Done changed from 0 to 100
- Subject changed from Initiate IPsec P2 without ping to IPsec keep alive option to initiate P2 without ping
Currently after a gateway comes back up, check_reload_status will run "Restarting ipsec tunnels". This is not triggering a VTI P2 to initiate even with Child SA Close Action set to "Restart/Reconnect".
My guess is that check_reload_status is only reloading the configuration rather than restarting the tunnel, and given that Child SA Close Action (aka dpd_action) would not come into play after the IKE timeout/retransmit period has passed, the P2 VTI never comes back up.
Would this behavior be resolved by this feature?
- Status changed from Feedback to In Progress
Almost certainly, since this just checks if a P2 with the option checked is enabled and disconnected. If so, it triggers an initiate action for it.
It wouldn't have any relation to tunnel types, events, etc. It just checks every 5 minutes if it's up.
Though now that I think about it, this should probably also check the CARP status so it doesn't initiate tunnels on secondary nodes in BACKUP status.
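
As a rough illustration of that extra condition, here is a minimal Python sketch (not the actual code). The function name and the convention that carp_status is None for tunnels not bound to a CARP VIP are assumptions made for this example.

```python
def should_initiate(p2_connected, carp_status):
    """Decide whether a keep alive pass should initiate a P2.

    carp_status is None when the tunnel is not bound to a CARP VIP
    (an assumption of this sketch); otherwise it is a state string
    such as "MASTER" or "BACKUP".
    """
    if p2_connected:
        return False  # already up, nothing to do
    if carp_status is not None and carp_status != "MASTER":
        return False  # secondary node, leave the tunnel alone
    return True
```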
- Status changed from In Progress to Feedback
- Subject changed from IPsec keep alive option to initiate P2 without ping to IPsec keep alive option to initiate phase 2 without ping
Updating subject for release notes.
- Subject changed from IPsec keep alive option to initiate phase 2 without ping to IPsec keep alive option to initiate phase 2 without using ICMP
Updating subject for release notes.
- Status changed from Feedback to Resolved
Tested on 22.01.a.20211010.0500. Still works well.
- Status changed from Resolved to New
I did some further testing on this.
(substr($status[$ikeid]['p1']['interface'], 0, 4) == "_vip") returns a false negative when the interface is a gateway group, due to ['interface'] at this point being defined as the gateway group name. This leads to the secondary incorrectly initiating a connection.
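
To illustrate why the prefix test misses this case, here is a small Python sketch. It is not the fix itself: the gateway_groups mapping, the group name, and the VIP name are hypothetical, and resolving the group before testing is only one possible approach.

```python
def is_vip_interface(interface):
    """Python equivalent of substr($interface, 0, 4) == "_vip"."""
    return interface[:4] == "_vip"


def is_vip_after_resolving(interface, gateway_groups):
    """Resolve a gateway group name to its interface before testing.

    gateway_groups is a hypothetical mapping of group name to the
    interface that the group currently resolves to.
    """
    resolved = gateway_groups.get(interface, interface)
    return is_vip_interface(resolved)
```

With a group named "WAN_Failover" that resolves to a VIP, the plain prefix test sees only the group name and reports that no VIP is involved, so the CARP check is skipped.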
It would also be nice to let the user adjust the keepalive check time, as once #12184 is implemented, the keepalive time could be lowered.
- Status changed from New to Resolved
Those should be added as a separate bug report and feature request. For most cases this is working fine.
- Plus Target Version changed from 21.09 to 22.01