Recommended settings for Argon2

Argon2 support has been added, but I don’t see anything in the Bitwarden knowledge base to explain the different settings for Argon2: in particular the user settings for KDF parallelism and KDF memory (MB).

Will Bitwarden be updating the knowledge base? And can anyone point to a resource explaining how to configure the above settings?

2 Likes

Until BW update their docs…
KeePass has some good info (Security - KeePass); look at the dictionary-attacks section about memory/iterations. I suspect some of it will be useful…

I hope that Bitwarden also provides a “test” tool
e.g. in KeePass you can adjust the settings and press Test to see the resulting delay before even saving the database.

I'm now using Argon2 (my KeePass DB takes 8 seconds to open/save), which is an acceptable delay for me as it's only opened once a day, and rarely saved to…
But in the event of brute forcing, that would be quite a delay for an attacker…
[screenshot: KeePass database settings dialog]
Example from KeePass's database settings; notice how low the iterations are (certainly not in the hundreds of thousands!)
If BW offers no testing tool, be sure to start small and test opening on your slowest device, increasing incrementally, as high values will cause a delay (see the benchmark sketch below).

Now, Bitwarden may be completely different, but there is some useful reading for you at the KeePass link above for now.
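
Until Bitwarden provides a built-in test tool, you can get a rough feel for how the three Argon2 settings affect unlock time with a small standalone benchmark. Here is a minimal sketch using the third-party argon2-cffi Python package; the timings will only approximate what the actual Bitwarden clients do (the browser clients run Argon2 in WebAssembly, which is generally slower), and the password, salt, and candidate settings are purely illustrative.

```python
import time
from argon2.low_level import hash_secret_raw, Type  # pip install argon2-cffi

def time_argon2id(time_cost, memory_mib, parallelism):
    """Return how long one Argon2id derivation takes with the given settings."""
    start = time.perf_counter()
    hash_secret_raw(
        secret=b"test password",        # illustrative only
        salt=b"sixteen byte salt",      # illustrative only
        time_cost=time_cost,            # iterations
        memory_cost=memory_mib * 1024,  # argon2-cffi takes memory in KiB
        parallelism=parallelism,        # lanes
        hash_len=32,
        type=Type.ID,
    )
    return time.perf_counter() - start

# Try a few candidate settings: (iterations, memory in MiB, parallelism).
for t, m, p in [(3, 64, 4), (4, 128, 4), (10, 256, 4)]:
    print(f"t={t}  m={m} MiB  p={p}:  {time_argon2id(t, m, p):.2f} s")
```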

1 Like

Thanks all, the Help Center has been updated :+1:

4 Likes

Maybe @Quexten can help me here a bit: If I increase parallelism, does that make it faster for me to log in (and thus faster for an attacker too), or is it the other way round?

1 Like

My understanding is that the current implementation of Argon2id is not actually able to use the parallelism feature, so the question, for now, is moot.

Nonetheless, I would also be interested in any learning resources that explain the relative impact of the three cost parameters on attacker vs. defender.

1 Like

@grb, @tomtom, at the moment, having parallelism > 1 makes it faster on mobile and CLI, but the same on desktop and web.
I don't believe it helps an attacker to have parallelism > 1 at the default memory settings, but I'm also not an expert on Argon2 cracking. I have written up a longer explanation of why here.

The defaults for parallelism and the rest are from RFC 9106.

4 Likes

On my LUKS2 installations you can increase the “time cost” of cracking/solving (thereby greatly adding adversarial resistance) by increasing the iteration time during keyslot generation. As an FYI, all my vaults have a time cost > 45.

I haven’t examined the procedure here since we are told to wait until BW updates all the clients/extensions. When that happens I will do my testing!

1 Like

I tested Argon2 and it is MUCH slower than PBKDF2. At maximum settings (KDF iterations: 10, KDF memory: 1024 MB, KDF parallelism: 16), logging in on a powerful computer (Ryzen 7 3800X, 32 GB RAM) takes 30 seconds. In addition, every time you log in, the browser throws an error that the tab is not responding; you have to press “wait”, and only then does it work. I had PBKDF2 set to the maximum value (2,000,000) and it ran fast. Here, setting the options to the max slows down login a lot. I haven't tested on slower devices yet, but I will definitely reduce these values.
I'm writing this to warn against setting too-large values.

With KDF iterations: 5, KDF memory: 128 MB, KDF parallelism: 4, it's bearable; login takes less than 3 seconds.
For comparison, KDF iterations: 4, KDF memory: 256 MB, KDF parallelism: 4 takes about 5 seconds.

With the default settings (KDF iterations: 3, KDF memory: 64 MB, KDF parallelism: 4), login takes about 1.5 seconds on my computer.

In my opinion, it is not worth increasing the default values, because logging in on slower devices can take a very long time.

Sorry for my English; I'm using Google Translate.

3 Likes

There are going to be some mad-as-heck posts from users who select Argon2, drop in random very high settings (“I can’t remember what I set” or “I swear I used defaults” or “I’m an IT security expert!” or “I have used KeePass for 35 years - yes, 35! - and this has never happened!”), and then have difficulty logging in. Bitwarden may want to post some specific guidance on that settings page so users are advised of best practices in advance.

3 Likes

Just saw this article. It appears pretty straightforward:

1 Like

Can someone provide a brief, executive-overview explanation of (1) what Argon2 is, and (2) why it is an improvement?

Some guidance would be helpful. From just observing the chatter on this subject, I get the impression that Argon2 is superior, so that would tend to make me want to adopt it.


Please support my feature request below, which would allow the interactive cryptography page to be used as a testing tool for KDF settings:

5 Likes

When you log in to Bitwarden, your master password is converted into a “key” (a very large random-looking number, the value of which is unique for each master password, but which cannot be reverse-engineered to figure out what your password was). Brute-force password cracking involves repeatedly guessing a password, then performing the same key derivation calculation that Bitwarden uses during the login process, and finally evaluating whether the derived key is correct.
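
As a toy illustration of that guess-and-check loop, here is a sketch in Python. It is not how a real cracker (or Bitwarden) is implemented: the salt, iteration count, and the captured value being compared against are all hypothetical placeholders, and a real attack checks the derived key against the stolen vault data rather than doing a simple hash comparison.

```python
import hashlib

SALT = b"victim@example.com"  # hypothetical salt, for illustration only
ITERATIONS = 600_000          # illustrative PBKDF2 work factor
CAPTURED = "..."              # placeholder for whatever the attacker stole

def derive_key(guess: str) -> bytes:
    # The same derivation the legitimate client performs at login.
    return hashlib.pbkdf2_hmac("sha256", guess.encode(), SALT, ITERATIONS, 32)

# The attacker repeats the derivation for every guess in a wordlist.
for guess in ["123456", "password", "hunter2"]:
    if hashlib.sha256(derive_key(guess)).hexdigest() == CAPTURED:
        print("cracked:", guess)
        break
```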

Until now, the “key derivation function” (KDF) used by Bitwarden (and also by any password crackers hoping to break into a Bitwarden vault) has been an algorithm known as PBKDF2-HMAC-SHA256 (or PBKDF2, for short). Now, a different KDF option is available, called Argon2id.

Below is a summary of my layman’s understanding of Argon2id:

The main benefit of Argon2id over PBKDF2 is that the former needs a lot more time to calculate the master key from the master password. Being slower may not seem like something positive (because it now takes longer to complete the login process), but it provides protection against brute-force password cracking (because the attacker now must spend much more time to derive the key that is needed to evaluate each password guess). The delay hurts the attacker far more (who may have to test astronomical numbers of password guesses before finding the correct password, and has to run the KDF algorithm for every guess) than it hurts you, since you can adjust the time delay to a length acceptable to you.

The PBKDF2 algorithm can (in principle) be made slower by requiring that the calculation be repeated (by specifying a large number of KDF “iterations”). The slowness of Argon2id can also be adjusted by increasing the number of iterations, but Argon2id provides additional adjustments that make it more difficult for an attacker to crack your master password. For example, you can specify how much computer memory the KDF algorithm requires. This creates a problem for password crackers, because the cost of the required memory makes it much more expensive to build a computer cluster that can guess passwords at a high rate using a divide-and-conquer approach.
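
To make the difference in tunable costs concrete, here is a minimal sketch of how the two KDFs are typically invoked, using Python's standard hashlib for PBKDF2 and the third-party argon2-cffi package for Argon2id. This is not Bitwarden's actual client code, and the password, salt, and parameter values are illustrative only.

```python
import hashlib
from argon2.low_level import hash_secret_raw, Type  # pip install argon2-cffi

password = b"correct horse battery staple"  # illustrative master password
salt = b"user@example.com"                  # illustrative salt value

# PBKDF2-HMAC-SHA256: the only tunable cost is the iteration count.
pbkdf2_key = hashlib.pbkdf2_hmac("sha256", password, salt, 600_000, 32)

# Argon2id: cost is tunable in three dimensions at once:
# time (iterations), memory, and parallelism (lanes).
argon2_key = hash_secret_raw(
    secret=password,
    salt=salt,
    time_cost=3,            # KDF iterations
    memory_cost=64 * 1024,  # KDF memory, in KiB (64 MiB here)
    parallelism=4,          # KDF parallelism
    hash_len=32,
    type=Type.ID,
)
```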

4 Likes

Yes, you can increase the time cost (iterations) here too. But beware: Bitwarden caps it at 10 iteration rounds, because in QA testing it was unlimited, which led to a tester having a 30-minute unlock time (1k+ iterations at 1 GiB of memory).

Anyway, always increase memory first, as recommended in the Argon2 paper, and iterations only afterwards. (Goes for LUKS too.)
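
As a rough illustration of that tuning order (a sketch only, not an official procedure; the target time, memory ceiling, and starting values below are assumptions you would adjust for your own devices):

```python
import time
from argon2.low_level import hash_secret_raw, Type  # pip install argon2-cffi

def unlock_time(t, m_mib, p):
    # Time a single Argon2id derivation with illustrative inputs.
    start = time.perf_counter()
    hash_secret_raw(b"test password", b"sixteen byte salt", t, m_mib * 1024, p, 32, Type.ID)
    return time.perf_counter() - start

TARGET = 1.0       # seconds of unlock delay you are willing to accept (assumed)
MEM_CEILING = 256  # MiB; keep below what your weakest device can spare (assumed)

memory, iterations, parallelism = 64, 3, 4  # Bitwarden-style defaults

# Memory first: double it while staying under both the ceiling and the target time.
while memory * 2 <= MEM_CEILING and unlock_time(iterations, memory * 2, parallelism) < TARGET:
    memory *= 2

# Iterations only afterwards: add rounds until the target time is reached.
while unlock_time(iterations + 1, memory, parallelism) < TARGET:
    iterations += 1

print(f"memory={memory} MiB, iterations={iterations}, parallelism={parallelism}")
```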

2 Likes

Yes, exactly. PBKDF2's only limiting factor is time. You can easily run many instances in parallel on a GPU, and the only way to make it harder for an attacker is to make it slower.

Argon2 has a memory cost as well (and parallelism, but I'll omit that here). This means an attacker cannot run many Argon2 instances in parallel on a GPU/CPU, since they are limited by the memory of the hardware. At the same time, this does not increase the user's unlock time.

I.e., with a comparable unlock time for the user on both, Argon2 is still slower for an attacker to crack.

3 Likes

I have two follow-up questions/comments to help improve my understanding (and I did see your note about not being an expert on Argon2 cracking, so feel free not to respond):

  1. I understand conceptually that the above description may explain how the memory cost could theoretically slow down an attacker, but I'm having trouble translating it into a practical example. For example, even using the 120 MiB upper bound for iOS devices, how is this going to present any kind of limit to parallelization using GPUs, when a single RTX 4090 already has 24 GB of memory?

  2. I think your last sentence in the excerpt quoted above implies that increasing memory cost does not increase unlock time for the defender. My understanding is that this is only the case if you also increase the number of lanes (parallelism). Certainly in my own tests using antelle’s test page, the hashing speed decreases dramatically when I increase the memory requirement. Am I understanding this correctly?

Incredible explanation✅ I actually think that I grasp the concept and understand why that is an improvement and also why you experts have been asking for this. I thank everyone for helping to make me and others better users🙏

2 Likes

Yes, in that example the attacker could run 192 copies in parallel (give or take a few). That is significantly fewer than with PBKDF2, where the attacker can use all of the GPU's cores (16,384).
There are of course other factors at play which might do some limiting (memory bandwidth, for example); I don't know how these play into it.
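
For anyone who wants to reproduce that back-of-envelope figure, the arithmetic is just GPU memory divided by per-guess memory. The 24 GiB of VRAM and roughly 128 MiB per Argon2 instance below are assumptions, not measurements:

```python
GPU_MEMORY_MIB = 24 * 1024   # assumed: a 24 GiB card such as an RTX 4090
PER_GUESS_MIB = 128          # assumed Argon2 memory setting per password guess

parallel_guesses = GPU_MEMORY_MIB // PER_GUESS_MIB
print(parallel_guesses)      # 192, versus ~16,384 cores usable for PBKDF2
```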

Increasing the memory cost does increase the unlock time, but you can decrease the iterations so that, in total, the unlock time for the vault stays (roughly) the same. However, cracking time on most hardware will be slower, since fewer instances can be run in parallel.

Similarly, increasing parallelism allows you to increase the iterations while keeping the total unlock time for the vault the same (excluding the WebAssembly implementation for now :frowning: ). The total amount of work being done is now higher, but the unlock time stays the same since more of your CPU's cores are in use. For an attacker who wants to run many instances in parallel, this is a problem, since he (most likely) didn't have free CPU cores left over anyway (again, it depends on the hardware).

@Quexten Thank you for clearing that up; I hadn't realized that GPUs have so many cores, but in hindsight this makes perfect sense.

And I think my previous understanding of the interactions between memory and parallelism was backwards: for the defender, if I increase the number of lanes from 1 to L, it actually increases the amount of memory used by a factor of L (but also decreases the unlock time by a factor of L).

I have read variously that the maximum value of parallelism should be set either to the number of cores in the CPU (e.g., L=4 for a quad-core), or to double that value (e.g., L=8 for a quad-core CPU). Can you explain this?

1 Like