Bug #9331
closed
Parallel Rekey fails for multiple Child SAs
100%
Description
We are running an IKEv1 VPN connection to a WatchGuard firewall cluster. It has 10 tunnel definitions. Whenever the WatchGuard cluster fails over, several tunnels stop working. Our analysis shows that the failover triggers all phase 2 rekeys at the same time. The pfSense strongSwan daemon fails during this process with the following messages:
Jan 11 18:07:24 firewall charon: 11[NET] <con1000|6> received packet: from xxx.xxx.xxx.xxx[500] to yyy.yyy.yyy.yyy[500] (68 bytes)
Jan 11 18:07:24 firewall charon: 11[ENC] <con1000|6> invalid HASH_V1 payload length, decryption failed?
Jan 11 18:07:24 firewall charon: 11[ENC] <con1000|6> could not decrypt payloads
This issue is also discussed in the forum in this post: https://forum.netgate.com/topic/139536/phase-2-invalid-hash_v1-payload-length-error/2
It all boils down to strongSwan allowing only 3 rekeys at the same time, a limit set by the max_ikev1_exchanges parameter. As we cannot control the behaviour of a foreign firewall, pfSense should allow all rekeys to complete, independent of the number of tunnel interfaces. Sophos has already implemented a workaround for this; see https://community.sophos.com/kb/en-us/128175
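To illustrate why a limit of 3 hurts when every tunnel rekeys at once, here is a toy model in Python. It assumes the daemon keeps per-exchange state in a bounded table and drops the oldest entry when the table is full; that eviction policy, the function name and the placeholder state are assumptions for illustration and are not taken from strongSwan's code.

```python
from collections import OrderedDict


def surviving_exchanges(message_ids, max_exchanges):
    """Toy model: return the message IDs whose state is still tracked
    after all exchanges were started back to back.

    Assumption: the oldest entry is evicted once the table is full.
    """
    tracked = OrderedDict()
    for mid in message_ids:
        tracked[mid] = "per-exchange state"  # placeholder, not real IV data
        if len(tracked) > max_exchanges:
            tracked.popitem(last=False)  # drop the oldest tracked exchange
    return list(tracked)


if __name__ == "__main__":
    rekeys = list(range(1, 11))  # 10 tunnels rekeyed at the same time
    for limit in (3, 22):
        kept = surviving_exchanges(rekeys, limit)
        print(f"limit={limit}: tracked={len(kept)}, dropped={len(rekeys) - len(kept)}")
```

In this model, a limit of 3 leaves only the last three exchanges tracked, while a limit of 22 keeps all ten, which matches the observation that several tunnels fail after a failover.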
I would expect a two-stage solution for this problem in pfSense.
1. Simple: Add the parameter to the charon section in /etc/inc/vpn.inc and increase the maximum number of outstanding rekeys.
...
charon {
max_ikev1_exchanges = 22
...
Although 22 is only an arbitrary value, it might make sense because 12 tunnels (as in our case) should not be very common.
2. Sustained: Add a warning in the web interface if the number of tunnels for an IKEv1 VPN reaches 20. This leaves a safety margin of 2 slots.
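The check behind the proposed warning could look roughly like the following sketch. It is written in Python only for illustration (the pfSense web interface itself is PHP), and the function name, parameter names and message text are hypothetical; the defaults of 22 exchanges and 2 safety slots are the values suggested above.

```python
def ikev1_tunnel_warning(tunnel_count, max_exchanges=22, safety_slots=2):
    """Return a warning string if the tunnel count gets too close to the
    configured max_ikev1_exchanges value, otherwise None.

    Hypothetical helper; names and message text are illustrative only.
    """
    threshold = max_exchanges - safety_slots  # 20 with the suggested values
    if tunnel_count >= threshold:
        return (
            f"{tunnel_count} IKEv1 tunnels configured; parallel rekeys may "
            f"fail because max_ikev1_exchanges is {max_exchanges}."
        )
    return None


if __name__ == "__main__":
    for count in (12, 19, 20):
        print(count, ikev1_tunnel_warning(count))
```

With the suggested values, 12 or 19 tunnels produce no warning, while 20 tunnels do.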