Performance of Argon2id Algorithm in Bitwarden as Compared to KeePass

Hello,

I recently discovered a significant difference in the performance of Argon2id in Bitwarden as compared to KeePass for the same key settings. I’m guessing this might have something to do with browser constraints, but I thought I’d ask the community.

For example, I thought it would be a good idea to make the key settings in Bitwarden match the settings I use in KeePass. KeePass has a feature that will select settings based on a benchmark of a 1 second delay.
Using this feature on one of my systems, I get the following:
Iterations = 56
Memory = 64MB
Parallelism = 2

But when I set these same values for Bitwarden, I thought I had locked myself out at first. Instead of 1 second, it took significantly longer to log in to my vault. I’m talking about ~10 minutes or so as compared to 1 second in KeePass.
I had to dial back the settings in Bitwarden obviously.

For contrast, my Bitwarden settings that are “tolerable” are:
Iterations = 3
Memory = 64MB
Parallelism = 4

Any comments on this? Tips for improving this with higher settings, hopefully closer to KeePass?

First of all, please specify which version of Bitwarden you are using, and what device/operating system?

Bitwarden’s Argon2id implementation should be speeding up (by a factor of 2 or so, I believe) with version 2023.7.1.

Furthermore, the Parallelism factor has no effect except on Android/iOS devices.

Additional insight can most likely be provided by @Quexten. Most likely, any performance differences are caused by the restriction to use the WebAssembly library.

image

In the Microsoft Edge browser.

My device is a Windows 11 PC. The exact specs vary as I use different systems at different times.

Recently, my SIMD PR got merged, which made things almost 2x as fast as before on SIMD-enabled systems, as grb mentioned. Compiling with a newer compiler would have made things a further 10% faster, but that was out of scope for the PR. Parallelism is sequential in the web/desktop version atm. It’s technically fixable, but only with severe effort, as multithreading in wasm is a bit of a hack atm. In total, the perf difference (at parallelism 1) should be about 20-40% (from what I recall), excluding the time taken to prepare the argon2 runtime. At parallelism 2 the wasm is 2x slower, and at parallelism 4 it is 4x slower (due to being run sequentially).
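
As a back-of-the-envelope illustration of those numbers (a simplification, not a measurement): if one assumes that the 20-40% single-lane overhead and the sequential-parallelism penalty simply multiply, the expected slowdown relative to a native multithreaded build can be sketched like this. The function name and the multiplicative model are assumptions made purely for illustration.

```typescript
// Rough model only. Assumes a native build runs all lanes concurrently,
// while the WASM build runs them one after another, and that the
// single-lane overhead multiplies with that penalty.
function estimateWasmSlowdown(parallelism: number, singleLaneOverhead: number): number {
  if (!Number.isInteger(parallelism) || parallelism < 1) {
    throw new RangeError("parallelism must be a positive integer");
  }
  if (singleLaneOverhead < 0) {
    throw new RangeError("singleLaneOverhead must not be negative");
  }
  // Sequential lanes cost `parallelism` times as long; the overhead
  // (e.g. 0.2 for 20%) is applied on top of that.
  return parallelism * (1 + singleLaneOverhead);
}

// Parallelism 2 with 20% and 40% overhead: roughly 2.4x and 2.8x.
const lowEstimate = estimateWasmSlowdown(2, 0.2);
const highEstimate = estimateWasmSlowdown(2, 0.4);
```

Under these assumptions, parallelism 2 lands at a factor of roughly 2.4 to 2.8, nowhere near the orders of magnitude reported in the original post.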

I look forward to the speed-up, though I will use it to increase the strength so that the overall decryption time remains the same.

Parallelism is presumably a good thing on phone contraptions, so it is a trade-off between phones and computers.

Is the WASM implementation expected to be orders of magnitude slower than the implementation used by KeePass, as observed by OP?

No. Maybe by a factor of 2-3 for parallelism 2. I can’t measure atm to see what’s going on, but will get back to this as soon as I find some time.

Alright, sounds like the performance issue is being acknowledged and further triage is needed.

Looking forward to any improvements that can be made.

How did you even test 56 iterations in Bitwarden? The maximum the web interface allows you to pick is 10 iterations. https://github.com/bitwarden/clients/blob/49549cc150d11cf9fd1daa11432718ee5a027a38/apps/web/src/app/settings/change-kdf/change-kdf.component.html#L74

Regardless, when circumventing the limits I get ~4 second kdf time for Iterations = 56 etc. in Firefox with SIMD, without SIMD it’s around ~9 seconds. Working parallelism would speed it up to around half that respectively.

With my experimental Argon2-browser build that supports parallelism, the same settings yield ~2 second kdf time.

One bug I did discover is that the rather complex loader of the argon2-browser library has a code path that leads to the Desktop build not using the SIMD implementation, even though SIMD would be available. I’ll fix that at some point.

Does that mean that the factor of ~2× speed-up will not be seen in the Desktop app (until you fix this bug)?

I am not limited to 10 iterations. I just changed it again.

But interestingly, I’m not seeing the same extremely poor performance now. Has something changed? The version info looks the same, but it’s a very different experience today with these settings.

UPDATE:

Perhaps the size of the vault or some other aspect of one vault over another is a factor.

My comment above is true for a demo/lab vault I use. It has almost nothing in it.

But I just changed the Argon2id iterations to 56 again on another “production” vault I use, and the extremely poor performance persists there.

I’m timing the length of time to unlock it now, already over 2 minutes…

UPDATE: It took just over 3 minutes using 56 iterations. Not the ~10 minutes I originally reported and I admittedly didn’t time it before. But I will say it seemed much longer than what I’m seeing today.
When I did this before, I even walked away from my computer, and when I returned, it was still trying to decrypt the vault.

I have confirmed that you can set the iterations to a value higher than the specified max=10 if one types the number into the field instead of using the up-arrow to increase the value (using the up-arrow, the value cannot be incremented past 10 iterations).

Interestingly, if entering a value smaller than the specified min=2 (i.e., 1 iteration), then there is an error message (“Argon2 iteration minimum is 2”) when clicking Change KDF after entering the Master Password. Such an error is not shown when surpassing the max.

Edited to Add: Seems like some code of the form

      else if (kdfConfig.iterations > 10) {
        throw new Error("Argon2 iteration maximum is 10.");
      }

needs to be added at the end of Line 432 in crypto.service.ts. Beyond that, shouldn’t these input conditioning checks use the min and max variables that are defined in change-kdf.component.html — or even better, as constants defined in kdf-type.enum.ts — instead of hardcoding the numerical values of these limits?
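
To make the suggestion about shared constants concrete, here is a minimal sketch. It is not the actual Bitwarden code: the constant names, the `KdfConfigSketch` interface, and the `validateArgon2Iterations` function are all hypothetical, and the limits 2 and 10 are simply the min and max values mentioned above.

```typescript
// Hypothetical constants; in the real code base they could live in
// kdf-type.enum.ts so that the UI and the service share one definition.
const ARGON2_ITERATIONS_MIN = 2;
const ARGON2_ITERATIONS_MAX = 10;

// Hypothetical stand-in for the real KDF configuration type.
interface KdfConfigSketch {
  iterations: number;
}

// Rejects values outside the allowed range in both directions, so the
// upper bound is enforced the same way the lower bound already is.
function validateArgon2Iterations(kdfConfig: KdfConfigSketch): void {
  if (kdfConfig.iterations < ARGON2_ITERATIONS_MIN) {
    throw new Error(`Argon2 iteration minimum is ${ARGON2_ITERATIONS_MIN}.`);
  }
  if (kdfConfig.iterations > ARGON2_ITERATIONS_MAX) {
    throw new Error(`Argon2 iteration maximum is ${ARGON2_ITERATIONS_MAX}.`);
  }
}
```

With a check of this shape, typing 56 into the field would produce an error on submit instead of being silently accepted, and changing a limit would only require touching the constant.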

Yes. Specifically, the code path was meant for Node.js (server) environments but also triggers in Electron (desktop), even in the window process.

Nothing changed. But the fact that you can pick more than 10 iterations is an interesting bug.

Decryption time is different from KDF time and does not depend on the KDF settings (iterations, memory, PBKDF2/Argon2), even if from the user’s perspective it’s one step.
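
For anyone who wants to see the two phases separately, here is a minimal timing sketch. The `deriveKey` and `decryptVault` callbacks are placeholders for whatever KDF call and vault decryption are being measured; they are assumptions for illustration, not Bitwarden functions.

```typescript
interface UnlockTiming {
  kdfMs: number;     // time spent deriving the key
  decryptMs: number; // time spent decrypting vault items
  totalMs: number;   // what the user experiences as "unlock time"
}

// Times the two phases of an unlock separately. Both callbacks are
// hypothetical stand-ins supplied by the caller.
async function timeUnlock(
  deriveKey: () => Promise<unknown>,
  decryptVault: () => Promise<unknown>,
): Promise<UnlockTiming> {
  const start = Date.now();
  await deriveKey();
  const afterKdf = Date.now();
  await decryptVault();
  const end = Date.now();
  return {
    kdfMs: afterKdf - start,
    decryptMs: end - afterKdf,
    totalMs: end - start,
  };
}
```

If `decryptMs` stays roughly constant across KDF settings while `kdfMs` changes, the KDF is the bottleneck; if `decryptMs` dominates on a large vault, tuning the KDF will not help much.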

See my edit in the comment above.

Edit: If you plan to do a PR to fix this, please also note that the Cancel button doesn’t work on the Change KDF confirmation screen (below the input field for the Master Password). You might want to look at that, as well.

Okay. From this user’s perspective, it takes too long for this one step when KDF iterations is set to 56. :slight_smile:

Question: If it’s a bug that I can set 56 iterations, is 56 iterations actually being applied or is it capped at 10 despite what the UI reports?

Please (temporarily) set your KDF to 100000 iterations of PBKDF2-HMAC-SHA256, then time the unlock delay on your large production vault. Then try it again with Argon2id, using the minimum settings for memory (16 MiB) and iterations (2 iterations). How do these unlock times compare to what you had measured for 56 Argon2id iterations?

Test results for what you asked for, and a bit more.


KDF algorithm: PBKDF2 SHA-256
KDF iterations: 100000
Unlock time: ~1 second

KDF algorithm: PBKDF2 SHA-256
KDF iterations: 600000
Unlock time: ~2 seconds

KDF algorithm: PBKDF2 SHA-256
KDF iterations: 1000000
Unlock time: ~2 seconds

KDF algorithm: Argon2id
KDF iterations: 2 (Default = 3)
KDF Memory: 16MB (Default = 64MB)
Parallelism: 4 (Default = 4)
Unlock time: ~4 seconds

KDF algorithm: Argon2id
KDF iterations: 3 (Default = 3)
KDF Memory: 64MB (Default = 64MB)
Parallelism: 4 (Default = 4)
Unlock time: ~12 seconds