Feature #12863

open

dynamically tune sha512crypt rounds

Added by Royce Williams 4 months ago. Updated 3 months ago.

Status: New
Priority: Very Low
Assignee:
Category: Authentication
Target version:
Start date:
Due date:
% Done: 0%
Estimated time:
Plus Target Version:
Release Notes: Default

Description

As touched on in #12800 and #12855, sha512crypt's default number of rounds (5000) can be cracked relatively quickly by modern standards. But "fixing" this with a static, arbitrary number of rounds could adversely impact login speed and user experience, depending on platform.

I propose a middle-ground solution: tune the number of rounds based on platform capability to a target runtime. Multiple UX studies have cited 500ms (half a second) as an upper bound for user login delay tolerance.

This reference code finds a rounds value that takes close to 500ms, using a simple approach: it performs a test hash, and then scales the rounds count by the ratio of the target time to the measured time. It then hashes the password with that number of rounds. It abstracts both the sha512crypt hashing and the dynamic rounds tuning into their own functions. It also improves salt entropy in passing, to match bcrypt and scrypt's 128 bits and to match the salts in the various sha512crypt mkpasswd implementations.

The code is overly commented, to explain the reasoning behind various design choices, such as those informed by attack techniques well known in the password-cracking community.

Sample results for a few platforms at 500ms runtimes (I am actively soliciting additional data points):

* AMD Geode LX800 500 MHz (alix2):                rounds=11851
* AMD GX-412TC SOC (apu2):                        rounds=157921
* Intel(R) Celeron(R) CPU N3150 @ 1.60GHz:        rounds=209662
* Pentium(R) Dual-Core CPU E5:                    rounds=568985
* 11th Gen Intel(R) Core(TM) i7-11700K @ 3.60GHz: rounds=1741092

Note especially these higher values. A modern CPU can run 1.7 million rounds of sha512crypt in half a second. By contrast, a medium-sized pentest cracking rig (equivalent of 6 GTX 1080s) can do a little over 2 billion rounds in half a second against a single hash (scaling downward across multiple salted hashes).

So while not even a strong hash can protect a single very weak password for long, strengthening these hashes can do a much better job of protecting midrange and stronger ones.
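
To put those figures side by side, the arithmetic can be sketched as below. This is an illustration only: the attacker throughput is the rough "2 billion rounds in half a second" figure quoted above, the function name is made up for this example, and real cracking speeds depend on the hardware and on how many salted hashes are attacked at once.

```php
<?php
// Illustrative arithmetic only. The attacker throughput is the approximate
// figure quoted above (about 2 billion rounds per 0.5 s against one hash).
const ATTACKER_ROUNDS_PER_SECOND = 2000000000 / 0.5;

// Hypothetical helper: guesses per second against one hash at a given cost.
function guesses_per_second(int $rounds): float {
    return ATTACKER_ROUNDS_PER_SECOND / $rounds;
}

$examples = [
    'default (5000 rounds)' => 5000,
    'apu2 at 500ms'         => 157921,
    'i7-11700K at 500ms'    => 1741092,
];

foreach ($examples as $label => $rounds) {
    printf("%-24s %12.0f guesses/sec\n", $label, guesses_per_second($rounds));
}
```

Under these assumptions, the default 5000 rounds allows about 800,000 guesses per second against a single hash, while the i7-11700K figure allows roughly 2,300.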


Files

auth.patch (352 Bytes): sha512 use 800k rounds (Phil Wardt, 03/19/2022 12:52 PM)
Actions #1

Updated by Royce Williams 4 months ago

and to match the sha512crypt

*match the salts in the various sha512crypt mkpasswd implementations.

Actions #2

Updated by Jim Pingle 4 months ago

  • Assignee set to Jim Pingle
  • Priority changed from Normal to Very Low
  • Target version set to Future

Dynamic tuning sounds like more trouble than it's worth, IMO. We'd have to test and cache the value or test each time, maybe periodically re-test (at boot? Some other time?). Depending on the load during the test the resulting value could vary as well.

Allowing the user to manually set the work factor (bcrypt) or rounds (sha512) may be something we could consider doing, but I don't think trying to set up and maintain an automation system for this would be worth the technical debt it would incur right now.

Maybe in the future that might change.

Actions #3

Updated by Royce Williams 4 months ago

Jim Pingle wrote in #note-2:

> Dynamic tuning sounds like more trouble than it's worth, IMO. We'd have to test and cache the value or test each time, maybe periodically re-test (at boot? Some other time?). Depending on the load during the test the resulting value could vary as well.

Understood. I was in the same boat until I hit upon the solution that I linked to. It's actually much simpler than I thought. The value doesn't have to be stored anywhere, and the rounds value is dynamically calculated every time a password is changed. And the load variability is taken into account - in fact, that variability, and dynamically tuning the value every time a password is changed, is actually a feature in this context!
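
The reason nothing needs to be stored separately is that a sha512crypt hash string carries its own rounds value, and crypt() reads it back when verifying. A minimal sketch of that, assuming PHP's built-in crypt() with SHA-512 support; the function name, fixed salt, and sample password are made up for this illustration:

```php
<?php
// Hypothetical helper: read the rounds value embedded in a sha512crypt hash.
// When no "rounds=" field is present, sha512crypt uses its default of 5000.
function sha512crypt_rounds_from_hash(string $hash): int {
    if (preg_match('/^\$6\$rounds=(\d+)\$/', $hash, $m)) {
        return (int) $m[1];
    }
    return 5000;
}

// Two hashes of the same password with different per-hash rounds values.
// The fixed salt is for demonstration only; real salts must be random.
$hash_a = crypt('example-password', '$6$rounds=60000$demosaltdemosalt$');
$hash_b = crypt('example-password', '$6$rounds=75000$demosaltdemosalt$');

print sha512crypt_rounds_from_hash($hash_a) . "\n";
print sha512crypt_rounds_from_hash($hash_b) . "\n";

// Verification passes the stored hash as the "salt" argument, so crypt()
// picks up the rounds and salt from the hash itself.
var_dump(hash_equals($hash_a, crypt('example-password', $hash_a)));
var_dump(hash_equals($hash_b, crypt('example-password', $hash_b)));
```

Each stored hash verifies with its own rounds value, so two users whose passwords were set under different load simply end up with different costs.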

> Allowing the user to manually set the work factor (bcrypt) or rounds (sha512) may be something we could consider doing

Rather than setting the work factor directly, setting how long the work should take might be more intuitive for the user, and might indeed be a useful future configuration item. In the code I linked to, I make the argument that 0.5 seconds is a threshold that multiple UX studies have shown users can tolerate. But I don't think it's necessary to wait on that in order to get the much more cracking-resistant hashes that it would produce.

> but I don't think trying to set up and maintain an automation system for this would be worth the technical debt it would incur right now.

The dynamic calculation of the rounds value is less than 20 lines of actual code (including variables and verbose feedback for the demo only), and the algorithm is very easy to understand:

    function get_target_sha512crypt_rounds($target_seconds) {

        // Purpose: Tune sha512crypt rounds to a target runtime.
        // Note that we do *not* set a rounds value once globally, nor do we
        // normalize or round up or down here, by design. This is because
        // having a variable number of rounds is a security feature, to resist
        // correlation attacks (JtR's single mode or hashcat -a 9 mode).
        // Some variability in runtime also provides rough protection
        // against sha512crypt's "guess how long the password is" flaw
        // (see https://pthree.org/2018/05/23/do-not-use-sha256crypt-sha512crypt-theyre-dangerous/)

        // Set a test password to use for the tuning.
        // sha512crypt runtime roughly increases with password length, so we
        // pick a test password that is longer than an average simple password,
        // but shorter than a passphrase.

        $test_password = 'pfsense89ABCDEF';

        // To minimize testing time, pick a relatively small value relative to
        // modern performance for common platforms, but large enough to offset
        // some variability in runtime. Very old systems may take significantly
        // longer, so the initial rounds_candidate value may need to be adjusted.

        $rounds_candidate = 100000;
        $time_elapsed_secs = 0;
        $accuracy_margin_seconds = .1;

        // Set a minimum number of rounds.
        // If the platform is slow, attack can happen on a faster system,
        // so this value should be as high as can be tolerated across the
        // expected fleet of systems we can reasonably expect to support.
        // PHP's current minimum is 1000, so this value should never be less.
        // For attack resistance, it should be far more than 1000 or even 5000.

        $minimum_rounds = 50000;

        // Adjust rounds until hash time is roughly close to the target time.
        // Since we use the results to calculate the next run, and we don't
        // care if it's rough, this loop should only run a couple of times.
        //
        // Reference round counts (dmidecode -s processor-version):
        // (Examples wanted - especially old and new pfSense/Netgate appliances)
        // 
        // - AMD Geode LX800 500 MHz (alix2):                rounds=11851
        // - AMD GX-412TC SOC (apu2):                        rounds=157921
        // - Intel(R) Celeron(R) CPU N3150 @ 1.60GHz:        rounds=209662
        // - Pentium(R) Dual-Core CPU E5:                    rounds=568985
        // - 11th Gen Intel(R) Core(TM) i7-11700K @ 3.60GHz: rounds=1741092
        //
        // By contrast, a medium-sized pentest cracking rig (equivalent of 6 GTX 
        // 1080s) can do a little over 2 *billion* rounds in half a second against
        // a single hash (scaling downward against multiple salted hashes). So the
        // goal is to counter such attack speeds by as much as can be tolerated.

        while (abs($target_seconds - $time_elapsed_secs) > $accuracy_margin_seconds) {

            // Time the hash.

            $start = microtime(true);
            $test_hash = hash_sha512crypt($test_password, $rounds_candidate);
            $time_elapsed_secs = microtime(true) - $start;
            print "Elapsed time: " . $time_elapsed_secs . ", rounds: $rounds_candidate\n";

            // Adjust the number of rounds based on the runtime.

            $perf_ratio = $target_seconds / $time_elapsed_secs;
            $rounds_candidate = intval($rounds_candidate * $perf_ratio);

        }
        print "Final autotuned rounds: $rounds_candidate\n";

        // If rounds are below minimum, warn the user and use the minimum instead.
        // Should only happen on very old hardware.

        if ($rounds_candidate < $minimum_rounds) {
            fwrite(STDERR, "Warning: detected rounds $rounds_candidate is less than minimum of $minimum_rounds - using minimum\n");
            $rounds_candidate = $minimum_rounds;
        }

        return $rounds_candidate;

    }
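
The function above calls hash_sha512crypt(), which is not shown in this thread. To run the snippet on its own, a stand-in along these lines could be used. This is a sketch under assumptions, not the original helper: the salt generation in particular is a guess at a reasonable approach, and it relies only on PHP's built-in crypt() and random_int().

```php
<?php
// Hypothetical stand-in for the hash_sha512crypt() helper called by the
// tuning function; the original helper is not shown in this thread.
function hash_sha512crypt(string $password, int $rounds): string {
    // sha512crypt accepts rounds in the range 1000..999999999.
    $rounds = max(1000, min($rounds, 999999999));

    // Assumption: build a random 16-character salt from the traditional
    // crypt alphabet, using a cryptographically secure source.
    $alphabet = './0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz';
    $salt = '';
    for ($i = 0; $i < 16; $i++) {
        $salt .= $alphabet[random_int(0, 63)];
    }

    // "$6$" selects SHA-512 crypt; "rounds=N$" sets the iteration count.
    return crypt($password, '$6$rounds=' . $rounds . '$' . $salt . '$');
}

$hash = hash_sha512crypt('correct horse', 20000);
print $hash . "\n";

// Verification re-reads the rounds and salt from the stored hash itself.
var_dump(hash_equals($hash, crypt('correct horse', $hash)));
```

With this in place, a password change would call get_target_sha512crypt_rounds(0.5) and pass the result as the second argument.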

Actions #4

Updated by Phil Wardt 3 months ago

Jim Pingle wrote in #note-2:

> Dynamic tuning sounds like more trouble than it's worth, IMO. We'd have to test and cache the value or test each time, maybe periodically re-test (at boot? Some other time?). Depending on the load during the test the resulting value could vary as well.
>
> Allowing the user to manually set the work factor (bcrypt) or rounds (sha512) may be something we could consider doing, but I don't think trying to set up and maintain an automation system for this would be worth the technical debt it would incur right now.
>
> Maybe in the future that might change.

So, in the meantime, why not just implement a simpler workaround with 500k or even 800k rounds?
I pushed sample code to the Redmine issue I opened (https://redmine.pfsense.org/issues/12962), not knowing this was already being discussed here.

Github sample:
https://github.com/pfsense/pfsense/pull/4563

I added a working patch in the next post.

The rounds count can be lowered to 500k and it will probably be fine on most enterprise and existing embedded systems, while still offering far better security than the default 5k rounds.

Actions #5

Updated by Phil Wardt 3 months ago

Here's a patch that can be applied by copying its contents.
Tested with auth on my current system.
Rounds could maybe be decreased to 500k for very low-end Netgate appliances after testing.
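
One rough way to do that testing is to time a single hash at each candidate rounds value on the appliance in question. This is only a sketch: the function name and fixed salt are made up here, and it is not the contents of auth.patch.

```php
<?php
// Hypothetical helper: seconds taken to compute one sha512crypt hash.
function time_sha512crypt(int $rounds, string $password = 'pfsense89ABCDEF'): float {
    // A fixed salt is fine for a timing test; never reuse it for real hashes.
    $setting = '$6$rounds=' . $rounds . '$timingtestsalt00$';
    $start = microtime(true);
    crypt($password, $setting);
    return microtime(true) - $start;
}

// Candidate static values discussed in this thread.
foreach ([500000, 800000] as $rounds) {
    printf("rounds=%d: %.3f s\n", $rounds, time_sha512crypt($rounds));
}
```

Running it a few times, ideally while the system is under typical load, gives a feel for how much login delay a given static value would add.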
