Regression #11524

Using SHA1 or SHA256 with AES-NI may fail if AES-NI attempts to accelerate hashing

Added by Jim Pingle about 2 months ago. Updated 4 days ago.

Status: New
Priority: Very High
Assignee:
Category: Hardware / Drivers
Target version:
Start date: 02/24/2021
Due date:
% Done: 0%
Estimated time:
Affected Version: 2.5.0
Affected Architecture: All
Release Notes: Default

Description

Based on at least one report, it appears AES-NI on Plus 21.02/2.5.0 has an issue with SHA-256 and some clients, notably Android and Apple clients.

https://forum.netgate.com/topic/161268/ipsec-tunnels-using-sha256-may-not-connect

If the tunnel is switched to a different hash or if AES-NI is disabled, the problems do not occur. There is no problem when using other accelerators such as QAT; only AES-NI appears to be affected.

Per Mark J, the AES-NI driver in Plus 21.02/2.5.0 now supports accelerating SHA, so it's possible that the implementation of SHA-256 in AES-NI differs from the one in the OS.

Historically there were differences with SHA-256 on FreeBSD which could lead to similar problems. It was standardized on the RFC 4868 implementation about 10 years ago (ref: http://lists.freebsd.org/pipermail/svn-src-head/2011-February/025040.html )
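To illustrate the kind of mismatch meant here, the following is a minimal sketch, not a diagnosis of the AES-NI driver. RFC 4868 specifies HMAC-SHA-256-128, that is, the HMAC-SHA-256 output truncated to 128 bits for the IPsec integrity check value (ICV). The 96-bit length below is an assumed example of a non-conforming truncation, and the key and payload are made up. Two peers that truncate differently compute ICVs that never match, so authenticated packets are dropped.

```python
import hashlib
import hmac


def icv(key: bytes, data: bytes, trunc_bits: int) -> bytes:
    """Return HMAC-SHA-256 of data, truncated to the leftmost trunc_bits bits."""
    full = hmac.new(key, data, hashlib.sha256).digest()
    return full[: trunc_bits // 8]


# Hypothetical key and payload, for illustration only.
key = bytes(range(32))
packet = b"example ESP payload"

rfc4868_icv = icv(key, packet, 128)  # HMAC-SHA-256-128 per RFC 4868
short_icv = icv(key, packet, 96)     # assumed non-conforming truncation

print(len(rfc4868_icv), len(short_icv))
print(hmac.compare_digest(rfc4868_icv, short_icv))
```

The shorter value is a prefix of the longer one, but a receiver comparing full ICVs still rejects the packet.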

History

#1 Updated by Jim Pingle about 2 months ago

Specifically, the hardware from the thread above is a Netgate 5100 running pfSense Plus, but this likely affects both Plus and CE. Needs more data, however.

#2 Updated by Jim Pingle about 2 months ago

Another potential report at https://forum.netgate.com/topic/161354/ipsec-packet-loss-routing-issue-with-21-02-release but for the Netgate 7100. Still waiting on more data/confirmation that moving off AES-NI helps there.

#3 Updated by Kris Phillips about 2 months ago

This also affects site-to-site VPN tunnels. Please reference internal ticket 76224 for another example of this bug causing issues.

#4 Updated by Kris Phillips about 2 months ago

Interesting point to mention related to IPsec: If you lower the subnet size to something like a /30, this issue takes longer to rear its head. If you increase the subnet size on a tunnel to something bigger like a /17 and then restart the IPsec service, packets will pass for about 2-3 seconds and then die. With a /30 it can take upwards of a few minutes before traffic stops passing.

#5 Updated by Chris Linstruth about 2 months ago

To add to the above: looks like TAC had one that was Plus 21.02 on an XG-7100 on one side and Azure VPN on the other. Disabling AES-NI stopped it from failing after "some traffic."

#6 Updated by Jim Pingle about 1 month ago

  • Target version changed from CE-Next to 2.5.1

#7 Updated by Jim Pingle about 1 month ago

There have been multiple additional confirmations of this from customers and forum users, and in each case thus far, switching to QAT or switching the hash has stabilized the IPsec behavior.

#8 Updated by Michael Spears about 1 month ago

Assisted a customer with this today on a 5100.

#9 Updated by Yury Zaytsev 22 days ago

We've hit this after upgrading from 2.4.5 to 2.5.0 on our two SG-5100s. It was terribly difficult to figure out, but thanks to Netgate for pointing us in the right direction!

#10 Updated by Renato Botelho 14 days ago

  • Target version changed from 2.5.1 to CE-Next

Not enough time for 2.5.1

#11 Updated by Jim Pingle 4 days ago

  • Subject changed from Using SHA256 with AES-NI may fail for some clients to Using SHA1 or SHA256 with AES-NI may fail if AES-NI attempts to accelerate hashing

Updating subject.

Note that this problem only affects CPUs which report the ability to accelerate SHA1 and SHA256.

When AES-NI is active the System Information widget on the Dashboard indicates whether or not acceleration for the affected hashes is supported. For example:

Unsupported:

Hardware crypto AES-CBC,AES-CCM,AES-GCM,AES-ICM,AES-XTS

Supported:

Hardware crypto AES-CBC,AES-CCM,AES-GCM,AES-ICM,AES-XTS,SHA1,SHA256

In the latter case, to avoid problems with SHA1 or SHA256, those on pfSense Plus should change the cryptographic support option to QAT. On pfSense CE, either change to an AEAD cipher such as AES-GCM, which does not use a separate hash, or switch to a different hash (e.g. SHA-512).
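The check described above can be sketched in a few lines. This is a minimal sketch which assumes the "Hardware crypto" line has been copied from the Dashboard widget as plain text; the function names are hypothetical and not part of pfSense.

```python
# Hashes this issue reports as problematic when accelerated by AES-NI.
AFFECTED_HASHES = {"SHA1", "SHA256"}


def accelerated_algorithms(widget_line: str) -> set:
    """Parse a 'Hardware crypto ...' line into a set of algorithm names."""
    prefix = "Hardware crypto"
    text = widget_line.strip()
    if text.startswith(prefix):
        text = text[len(prefix):]
    return {name.strip().upper() for name in text.split(",") if name.strip()}


def affected_hashes(widget_line: str) -> set:
    """Return the affected hashes the CPU reports it can accelerate."""
    return accelerated_algorithms(widget_line) & AFFECTED_HASHES


# The two example lines from this issue.
unsupported = "Hardware crypto AES-CBC,AES-CCM,AES-GCM,AES-ICM,AES-XTS"
supported = "Hardware crypto AES-CBC,AES-CCM,AES-GCM,AES-ICM,AES-XTS,SHA1,SHA256"

for line in (unsupported, supported):
    hits = affected_hashes(line)
    if hits:
        print("Affected, accelerated hashes:", ", ".join(sorted(hits)))
    else:
        print("Not affected")
```

A non-empty result means the workaround above applies to that system.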
