Regression #11524
Closed
Using SHA1 or SHA256 with AES-NI may fail if AES-NI attempts to accelerate hashing
Description
Based on at least one report, it appears AES-NI on Plus 21.02/2.5.0 has an issue with SHA-256 and some clients, notably Android and Apple clients.
https://forum.netgate.com/topic/161268/ipsec-tunnels-using-sha256-may-not-connect
If the tunnel is switched to a different hash or if AES-NI is disabled, the problems do not occur. There is no problem when using other accelerators such as QAT, only AES-NI appears to be affected.
Per Mark J, the AES-NI driver in Plus 21.02/2.5.0 now supports accelerating SHA, so it's possible that the SHA-256 implementation in AES-NI differs from the one in the OS.
Historically there were differences with SHA-256 on FreeBSD which could lead to similar problems. FreeBSD standardized on the RFC 4868 implementation about 10 years ago (ref: http://lists.freebsd.org/pipermail/svn-src-head/2011-February/025040.html )
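For context, RFC 4868 specifies that HMAC-SHA-256 used for IPsec integrity is truncated to 128 bits (HMAC-SHA-256-128). The Python sketch below shows why two peers that truncate the same HMAC differently can never agree on an integrity check value. The 96-bit variant, the key, and the payload are assumptions chosen for illustration only; nothing in this ticket establishes what the AES-NI driver actually computes differently.

```python
import hashlib
import hmac


def icv(key: bytes, data: bytes, truncate_bits: int) -> bytes:
    """Return an HMAC-SHA-256 integrity check value truncated to truncate_bits."""
    full = hmac.new(key, data, hashlib.sha256).digest()  # 32 bytes
    return full[: truncate_bits // 8]


# Hypothetical key and payload, for illustration only.
key = b"\x0b" * 32
packet = b"example ESP payload"

rfc4868_icv = icv(key, packet, 128)  # truncation length per RFC 4868
legacy_icv = icv(key, packet, 96)    # assumed non-standard truncation

# Same key and data, but the peers would reject each other's packets
# because the check values differ in length.
print(len(rfc4868_icv), len(legacy_icv))
print(hmac.compare_digest(rfc4868_icv, legacy_icv))
```

A mismatch of this kind would make every packet fail authentication, which resembles the symptoms reported here, though the actual cause may be different.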
Files
Updated by Jim Pingle over 3 years ago
Specifically, the hardware from the thread above is a Netgate 5100 running pfSense Plus, but this likely affects both Plus and CE. Needs more data, however.
Updated by Jim Pingle over 3 years ago
Another potential report at https://forum.netgate.com/topic/161354/ipsec-packet-loss-routing-issue-with-21-02-release but for the Netgate 7100. Still waiting on more data/confirmation that moving off AES-NI helps there.
Updated by Kris Phillips over 3 years ago
This also affects Site to Site VPN tunnels. Please reference internal ticket 76224 for another example of this bug causing issues.
Updated by Kris Phillips over 3 years ago
Interesting point to mention related to IPsec: if you lower the subnet size to something like a /30, this issue takes longer to appear. If you increase the subnet size on a tunnel to something bigger like a /17 and then restart the IPsec service, packets will pass for about 2-3 seconds and then stop. With a /30 it can take up to a few minutes before traffic stops passing.
Updated by Chris Linstruth over 3 years ago
To add to the above: it looks like TAC had a case with Plus 21.02 on an XG-7100 on one side and Azure VPN on the other. Disabling AES-NI stopped it from failing after "some traffic."
Updated by Jim Pingle over 3 years ago
- Target version changed from CE-Next to 2.5.1
Updated by Jim Pingle over 3 years ago
There have been multiple additional confirmations of this from customers and forum users, and in each case thus far, switching to QAT or switching the hash has stabilized the IPsec behavior.
Updated by Michael Spears over 3 years ago
Assisted a customer with this today on a 5100
Updated by Yury Zaytsev over 3 years ago
We hit this after upgrading from 2.4.5 to 2.5.0 on our two SG-5100 units. It was terribly difficult to figure out, but thanks to Netgate for pointing us in the right direction!
Updated by Renato Botelho over 3 years ago
- Target version changed from 2.5.1 to CE-Next
Not enough time for 2.5.1
Updated by Jim Pingle over 3 years ago
- Subject changed from "Using SHA256 with AES-NI may fail for some clients" to "Using SHA1 or SHA256 with AES-NI may fail if AES-NI attempts to accelerate hashing"
Updating subject.
Note that this problem only affects CPUs which report the ability to accelerate SHA1 and SHA256.
When AES-NI is active the System Information widget on the Dashboard indicates whether or not acceleration for the affected hashes is supported. For example:
Unsupported:
Hardware crypto AES-CBC,AES-CCM,AES-GCM,AES-ICM,AES-XTS
Supported:
Hardware crypto AES-CBC,AES-CCM,AES-GCM,AES-ICM,AES-XTS,SHA1,SHA256
In the latter case, to avoid problems with SHA1 or SHA256 the cryptographic support option should be changed to QAT for those on pfSense Plus. On pfSense CE, change to an AEAD cipher such as AES-GCM which does not utilize hashes or switch to a different hash (e.g. SHA-512).
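The check and workaround described above can be sketched in Python as follows. This is a minimal illustration, not part of pfSense: the function names and the `edition` argument are hypothetical, and the input is assumed to be the "Hardware crypto" line copied from the Dashboard widget.

```python
# Hashes whose AES-NI acceleration is implicated in this issue.
AFFECTED_HASHES = {"SHA1", "SHA256"}


def accelerated_algorithms(widget_line: str) -> set:
    """Parse a 'Hardware crypto ...' line into a set of algorithm names."""
    text = widget_line.strip()
    prefix = "Hardware crypto"
    if text.startswith(prefix):
        text = text[len(prefix):]
    return {name.strip().upper() for name in text.split(",") if name.strip()}


def is_affected(widget_line: str) -> bool:
    """True if the CPU reports SHA1 or SHA256 acceleration via AES-NI."""
    return bool(AFFECTED_HASHES & accelerated_algorithms(widget_line))


def suggested_workaround(widget_line: str, edition: str) -> str:
    """Map the widget line and edition ('plus' or 'ce') to the advice above."""
    if not is_affected(widget_line):
        return "no change needed"
    if edition == "plus":
        return "switch cryptographic hardware to QAT"
    return "use an AEAD cipher such as AES-GCM, or a different hash such as SHA-512"


unsupported = "Hardware crypto AES-CBC,AES-CCM,AES-GCM,AES-ICM,AES-XTS"
supported = "Hardware crypto AES-CBC,AES-CCM,AES-GCM,AES-ICM,AES-XTS,SHA1,SHA256"

print(is_affected(unsupported))
print(suggested_workaround(supported, "ce"))
```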
Updated by Jan de Groot over 3 years ago
- File disable-sha.patch added
This hit me after migrating a pfSense CE firewall for a customer. The Atom C3000 series CPU in the new firewall has SHA1/SHA256 offload; the old CPU didn't have any offloading at all but was faster than the Atom. The customer has Windows VPN clients, and Windows can't do AES-GCM, only 3DES or AES-CBC.
I applied a hotfix by disabling SHA support in the AES-NI module. This requires a kernel recompile, but after that the /boot/kernel/aesni.ko module can be replaced.
Attached is the quickfix patch.
Updated by Jim Pingle over 3 years ago
After inspecting the code, disabling the SHA functionality in AES-NI is the best course of action.
Updated by Luiz Souza over 3 years ago
- Status changed from New to Feedback
Regression fixed in 2.6 devel.
Updated by Renato Botelho over 3 years ago
Another fix [1] was imported from FreeBSD and will be present on tomorrow's snapshots
[1] https://cgit.freebsd.org/src/commit/?id=62e32cf9140e6c13663dcd69ec3b3c7ca4579782
Updated by Jim Pingle over 3 years ago
- Target version changed from CE-Next to 2.6.0
Updated by Jim Pingle over 3 years ago
- Target version changed from 2.6.0 to 2.5.2
Updated by Marcos M over 3 years ago
Tested with SHA256 on IPsec P1 and SHA1 on P2 on 21.05-RC built on Wed May 26 18:11:31 EDT 2021
with AES-NI selected in system settings. Traffic passed correctly.
Updated by Jim Pingle over 3 years ago
- Status changed from Closed to Feedback
Due to changes in the freebsd-src branch used to build 2.5.2 snapshots, this needs to be re-tested on a build dated after this comment.