Has Bitwarden lost all in-house expertise on entropy/password strength?

…or are expert opinions being overridden by Marketing?

 

Apologies for the (intentionally) inflammatory post title, but I am seriously concerned. Decisions are being made that have the net effect of making Bitwarden appear (to an outside observer) as if their developers don’t completely understand password security — a topic that should be their bread-and-butter.

At the start of the year, we got wind of plans to implement a poorly thought-out “password strength indicator”, although this proposed design has (thankfully) not yet seen the light of day.

But now, Desktop and Web Vault version 2024.11.0 (as well as version 2024.11.999 of the redesigned browser extension) all incorporate an arbitrary lower limit for the number of words in generated passphrases. PR 11675 raises the lower limit from 3 words to 6 words, with no rationale or explanation (other than the stated objective: “Increase entropy of generated passphrases”).

The range of allowed passphrase lengths is now 6–20 words, corresponding to an entropy range 78–258 bits. At the same time, the password length restrictions are 5–128 characters, which corresponds to an entropy range 5–785 bits. It makes absolutely no sense why the allowed entropy range for passwords should be over 4 times larger than the allowed entropy range for passphrases.
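These figures can be checked with a short calculation. The following is a minimal sketch, assuming the passphrase generator draws words uniformly from the 7,776-word EFF long list and the password generator draws characters uniformly from a pool of 70 (26 lowercase, 26 uppercase, 10 digits, 8 special characters). Those pool sizes are assumptions, and the function name is illustrative only. The lower password bound is not reproduced, since it depends on which character-set options are enabled.

```python
import math

# Assumed pool sizes (illustrative; not taken from Bitwarden's code):
EFF_LONG_WORDLIST_SIZE = 7776  # words in the EFF long wordlist
FULL_CHARSET_SIZE = 70         # 26 lower + 26 upper + 10 digits + 8 specials


def entropy_bits(pool_size: int, length: int) -> float:
    """Entropy, in bits, of `length` independent uniform draws from a pool."""
    return length * math.log2(pool_size)


passphrase_min = entropy_bits(EFF_LONG_WORDLIST_SIZE, 6)    # 6 words
passphrase_max = entropy_bits(EFF_LONG_WORDLIST_SIZE, 20)   # 20 words
password_max = entropy_bits(FULL_CHARSET_SIZE, 128)         # 128 characters

print(f"passphrase: {passphrase_min:.1f} to {passphrase_max:.1f} bits")
print(f"password (max): {password_max:.1f} bits")
```

Under these assumptions the results come out at roughly 77.5, 258.5, and 784.5 bits, which round to the figures quoted above.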

In addition, there are very legitimate scenarios under which a passphrase shorter than 6 words (or a password shorter than 5 characters) provides proper security for the application at hand. For example, the Bitwarden vault master password itself does not need to have more than 50 bits of entropy to provide sufficient protection against plausible brute-force attack scenarios (and PINs used for unlocking should have even lower entropy). Another example is PINs used for user verification on Yubikeys: as illustrated here, even a 13-bit PIN provides reasonable protection against a brute-force attack, and anything higher than 43 bits is overkill.

The limits chosen are also not grounded in any published standards. For example, NIST mandates a minimum password length of 8 characters (corresponding to an entropy as low as 24 bits depending on character set), a lower limit that exceeds Bitwarden’s 5-character limit for passwords.

For the reasons above, it is difficult to not draw the conclusion that irrational decisions are being made at Bitwarden, with no input from experts on password security. I hope that I am wrong, or that Bitwarden will at least listen to constructive input from experts in the user community.

My recommendation would be to not impose any minimum length restrictions in the generators (or as a second best option, limit passwords to 4 characters minimum, and passphrases to 2 words minimum), but instead set the default lengths to 12 characters and 6 words, respectively.

Any comment from @jensen-af @djsmith85 @BrandonTreston @Micah_Edelblut ?

I can’t add much to that, @grb. Your proposal of more flexibility (my word) but reasonable default settings seems very reasonable to me. If security is the argument here, I would suggest adding a warning message (explanatory text) saying that you go below the default settings at your own risk, so to speak. That way you can still use shorter settings if necessary (= make your own choice), and at the same time it furthers security awareness.

This is not an official statement of Bitwarden. I am speaking only for myself.

I appreciate your concern and effort in keeping me honest. I’ll let a community manager speak to Bitwarden’s perspective. In the meantime, I’ll offer mine.

As an engineer, I try to deal with facts. So, let’s talk about how security engineering works at a high level, independently of Bitwarden’s processes.

Quite a bit of security engineering is performed by whitehats reporting security issues in return for bounties. There’s a period of time between when an issue is reported and when someone solves it where the reported information is embargoed. Essentially, the researcher is required to refrain from publishing information until the issue is resolved or a fixed period of time has elapsed. After that, they can disclose the vulnerability.

Concurrently with that, the organization responsible for the software has development ongoing for many objectives. It needs to choose which reports it responds to, when to respond, and balance the effort of a “perfect” solution with the ongoing needs of both the business and its customers.

There’s a careful balance to strike across urgency, available resources, priority of competing work, contractual obligations, overhead, technical debt, and so forth. What you’re observing is not a lack of rationality, but the result of needing to balance all of these concerns while continuing to enhance the product.

Neither you nor I have a complete picture of that. I’ve been head-down trying to build an extensible credential generator toolkit that we can evolve to meet the growing needs of our users. It includes features I’m excited to expand, and we’re just now closing in on recognizing the “MVP” of that vision.

I think it’s a great thing that you, and others, are speaking up. I’ve been working on the generator for about a year now, and I want to make it the best that it can be. Feedback is the best way I know to get there, and I truly appreciate everyone who dedicates their time to exploring what’s possible. I can’t promise I’ll implement any specific idea, but I can say that we have the flexibility and know-how to move forward with confidence.

I support setting the defaults to 12 characters and 4 words, but I do not like having the lower bound of the passphrase generator be 6 words. This is a painful kludge to address the stated issue in the PR of the wordlist being smaller than wanted. There are still too many apps and websites that have unreasonably low character limits.

@jensen-af Thank you for your prompt response, and for being open-minded.

However, you speak in riddles (perhaps by necessity), so I still don’t really understand the reason why a 6-word limit was implemented at this time. You seem to imply that the lower word limit (3 words) has been identified as a contributing factor in some type of CVE that has not yet been disclosed.

The point of my comment about irrationality is that even if there truly was a security vulnerability caused by allowing passphrases with entropy lower than 78 bits (6 words from the Long EFF list), then the very same vulnerability must also exist for passwords (random character strings) that have fewer than 78 bits of entropy! However, Bitwarden’s password generator allows for passwords with as few as 5 characters. Even if using all available character sets, a 5-character random password will have less than 29 bits of entropy; if using a restricted character set (e.g., only special characters, or only numbers while enabling “avoid ambiguous characters”), then the total entropy of a 5-character random password is only 15 bits!

It does not seem rational to allow the generator to produce credentials with as few as 15–29 bits of entropy if the “Password” option is chosen, while (arbitrarily, it seems) imposing a 78-bit minimum entropy limit if the “Passphrase” option is chosen.

If there is a concern with entropy, then it should be applied evenly for both passwords and passphrases. The simplest rational implementation would be to make the minimum number of password characters equal to double the minimum number of passphrase words (or vice versa); this works because log(7776)/log(70) ≈ 2 (under the assumption that passwords are generated using all available character sets, with no restrictions).
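As a rough check of that rule of thumb, here is a minimal sketch that converts a passphrase word count into the shortest password length with at least the same entropy. It assumes uniform random selection from a 7,776-word list and a 70-character pool (the figures used in the ratio above). The constant and function names are hypothetical.

```python
import math

WORDLIST_SIZE = 7776  # EFF long list
CHARSET_SIZE = 70     # all character sets enabled (figure from the post)

# Characters needed per word for equal entropy: log(7776) / log(70)
CHARS_PER_WORD = math.log(WORDLIST_SIZE) / math.log(CHARSET_SIZE)


def equivalent_password_length(words: int) -> int:
    """Smallest password length whose entropy is at least that of `words` words."""
    return math.ceil(words * CHARS_PER_WORD)


for words in (2, 3, 6):
    print(words, "words ->", equivalent_password_length(words), "characters")
```

The ratio comes out near 2.11, so "double the word count" is a slight underestimate: rounding up, 6 words correspond to 13 characters rather than 12.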

Beyond that, there is the question of whether a minimum entropy limit should exist at all, and if so, what that limit should be. I have tried to make the case (in my original post) that there are common and legitimate use-cases in which a 78-bit entropy minimum is much too high. If one adheres to NIST guidelines, then the minimum entropy limit should be somewhere in the range 24–52 bits. Personally, I believe that 13 bits of entropy is a suitable minimum that can still provide sufficient security for some use-cases (e.g., Yubikey UV PINs).

I think we are missing the obvious.

We routinely gripe that password strength testers are incapable of calculating the strength of a password because they don’t know how many possible passwords there are. That is not a problem when a generator is involved. With a generator, the character (or word) set is known and therefore it is easy to calculate the entropy with mathematical precision.

Seems to me the solution is to display the bits of entropy based on the current settings, perhaps alongside a subjective opinion (“strong”, “weak”), although I would look for some authoritative source to define the thresholds (e.g. declaring 65 bits to be “strong”).

As a bonus, we could then instruct users to use the generator as a learning tool, figuring out for themselves if including digits or adding length is more effective.

Perhaps the generator might warn about (or prohibit) saving passwords with fewer than N bits of entropy, but let’s start by making the generator a learning tool.
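To illustrate how little is needed for such a readout, here is a minimal sketch that computes entropy from generator settings. The character sets, option names, and functions are assumptions for illustration, not Bitwarden's actual implementation. It also ignores options such as minimum counts of digits or special characters, which would lower the result slightly.

```python
import math
import string

# Hypothetical character sets; a real generator's sets may differ.
CHARSETS = {
    "lowercase": string.ascii_lowercase,
    "uppercase": string.ascii_uppercase,
    "digits": string.digits,
    "special": "!@#$%^&*",
}


def password_entropy_bits(length: int, enabled: set[str]) -> float:
    """Entropy of a password of `length` uniform draws from the enabled sets."""
    pool_size = sum(len(CHARSETS[name]) for name in enabled)
    if pool_size == 0 or length <= 0:
        return 0.0
    return length * math.log2(pool_size)


def passphrase_entropy_bits(words: int, wordlist_size: int = 7776) -> float:
    """Entropy of `words` uniform draws from a wordlist of the given size."""
    return words * math.log2(wordlist_size) if words > 0 else 0.0


# Compare "add digits" against "add length" for a lowercase-only password.
print(f"12 lowercase:          {password_entropy_bits(12, {'lowercase'}):.1f} bits")
print(f"12 lowercase + digits: {password_entropy_bits(12, {'lowercase', 'digits'}):.1f} bits")
print(f"14 lowercase:          {password_entropy_bits(14, {'lowercase'}):.1f} bits")
print(f"3-word passphrase:     {passphrase_entropy_bits(3):.1f} bits")
```

Printing the raw number leaves the "strong" or "weak" judgment to the user, and lets them compare the effect of longer passwords against larger character sets directly.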

The problem with this is that there can be no “authoritative” source for such judgments, because what is “strong” vs. “weak” depends on a number of factors which will differ widely from one use-case to another (as explained here and here). What is “strong” for one set of credentials will always be “weak” for another set of credentials (and vice versa) — it’s a fool’s errand (or as @bit had put it in the entropy meter thread, it’s “security theatre”).

But if we are required to choose between two evils (arbitrary lower bounds in the generator vs. arbitrary warnings about “weak” passwords), I suppose I would prefer the arbitrary warnings (especially if they can be disabled with a “don’t warn me again” option).

However, you speak in riddles (perhaps by necessity), so I still don’t really understand the reason why a 6-word limit was implemented at this time.

I’m not trying to be cryptic, I just really like elephants.

Real talk, I’m not talking about the why, because that’s a complex question and I don’t have all of the facts.

I just CLOC’d main in the bitwarden/clients repository. At the moment, there are around 5,000 files containing ~400,000 lines of code and markup (omitting JSON and node_modules folders). That doesn’t include long-lived branches, the mobile code, the server, or the SDK. My world, the generator code, is maybe 20,000 lines of that?

Meanwhile, any given line of code in any repository could be involved in a vulnerability report, and Bitwarden manages all of it. Features, contributions, bugs, engineering initiatives, design languages… all of this passes through a lot of hands, sometimes including mine, and these kinds of decisions can literally circle the world, depending upon who’s involved.

I’m aware that my role is implementing and explaining the technical consequences of our decisions. That’s a pretty narrow perspective, right?

:elephant:

I appreciate the love for elephants, but I still feel like I’m having a conversation with the Oracle of Delphi! :laughing:

I do have some sense for both the complexity of the Bitwarden code base and the constraints on developer time/effort. You seem like a reasonable (and rational) person, so the only way I can understand the impetus for PR 11675 is that you may have been told/asked to make this change (i.e., the decision was not yours), and simply focused on executing that task correctly and without negatively impacting the rest of the code.

And if this is the case, then my diatribe above should not have been directed at you as the developer, but at whoever made the decision to set inconsistent entropy bounds for passphrases and passwords. If you know who they are, perhaps you can gently suggest to them that they read this thread?

The constraints are on the effort of the company as a whole. The solution space for accommodating the passphrase generator is vast. Just in this thread alone we have 4 folks commenting on what their ideal solution would be. That doesn’t include all of the discussion in prior threads, it doesn’t include discussions on GitHub, and it doesn’t include Reddit.

Assume for a moment that someone is reading all of this and deciding how to move forward. Entropy estimates, warnings, turning the hard limit into a soft limit, disabling limits entirely, embedding educational content: all of this takes time just to sort through. After compiling all of this, you then need to gather information about the effort each option requires. Create mockups, possibly run usability experiments, coordinate translations, decide how this affects downstream systems like the generators embedded in vault items, figure out how to prioritize it against all of your other commitments, and so on. All of that before programming begins.

Before I worked here, I was a consultant, and the #1 thing I learned working for big-name companies is that creating products is always a complex endeavor. The Parable of the Blind Men and the Elephant is one of my favorite analogies. It reminds me that I’m gonna take the handful of facts I know and fill in the gaps with a plausible story, but that’s all it is. I can’t trust myself to know it all.

And so, to close, I’ll say directly what I’ve been speaking around: You started with very few facts, extrapolated that to “Has Bitwarden lost all in-house expertise…”, and then wrote a call-out post. I have a flair for the dramatic myself, but you could have started with a DM instead of a pitchfork. :slight_smile:

The issue is that we still don’t have facts. There was evidently some impetus to make a change (instead of leaving the old limit alone), but the reason for that change is apparently a closely guarded secret.

And I really don’t want to seem like I’m trying to attack you in particular (as I am appreciative of the fact that you’ve taken time out of your busy schedule to participate in this discussion), but at the same time, I have seen no evidence from Bitwarden that contradicts the premise of my argument.

Consider also the option of moving backwards, to restore the status quo ante elephantum.

Think of it more as an intervention, out of love. The impetus for my post was the concern that Bitwarden’s reputation as a security-focused business would suffer by decision-making that is inscrutable and defies common sense. I hope sincerely that the answer to the rhetorical question in my post title is “no”, but it would be reassuring to see some direct evidence of that fact.

Yet, NIST rises to the challenge:

“printing ASCII characters” is about 6.6 bits per character. So, they require ~50 bits of entropy and recommend ~100. That right there is enough to create a defensible “weak”, “good”, “great” scale.

But, the bigger point is that Bitwarden needs to simultaneously speak to multiple audiences. An objective entropy measurement is great for the expert, but a subjective weak/good/great measurement resonates better with those not skilled in the trade, even if it is demonstrably less accurate.

I agree with @grb :

A Yubikey will lock itself after 8 attempts with a wrong PIN. For this case, a randomly generated 3-word passphrase (39 bits) is very strong (too strong, I would argue). But for a password manager’s master password it might be weak.
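To put rough numbers on this, here is a minimal sketch, assuming the credential is chosen uniformly at random and the attacker gets only the 8 attempts before lockout. The function name is illustrative.

```python
def guess_success_probability(entropy_bits: float, attempts: int) -> float:
    """Chance of guessing a uniformly random secret within `attempts` distinct tries."""
    return min(1.0, attempts / 2 ** entropy_bits)


LOCKOUT_ATTEMPTS = 8  # lockout threshold mentioned above

for bits in (13, 39):
    p = guess_success_probability(bits, LOCKOUT_ATTEMPTS)
    print(f"{bits} bits: {p:.2e}")
```

Under these assumptions, a 13-bit PIN is guessed before lockout about once in a thousand cases, and a 39-bit passphrase about once in 69 billion. The same 39 bits look very different when an attacker can make unlimited offline guesses, which is the master-password scenario.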

@jensen-af As another user, I would like to respond to what you wrote here. I must say, personally, I like your “oracle-style” - and also like the story of the elephant (but that would lead too far away here)… :wink: And I think I get (in general) what you want to say here, or rather what you are able to say here. And I appreciate that you responded here at all.

But :face_with_raised_eyebrow: :

Perhaps that wouldn’t have been the case if Bitwarden had communicated that change openly/transparently and explained it in the first place, instead of “just changing it” (and creating “user-friction” and “frustration” by that)…

… if there are no explanations, Bitwarden leaves room for speculation :man_shrugging: And that is more than the “problem” of one individual (@grb) but affects others and how Bitwarden is viewed “in general”, so…

… so, the advantage of a forum is, you don’t get a direct message by every user… who would have thought, you wanted it the other way round? :sweat_smile:

PS: You may not intend what I am about to write, but your posts as a whole sound to me like “everything is so complicated… - and even we don’t know everything… so, users, please don’t think too much about it and accept everything as we do it… it makes sense in the end… though not even we as individuals see the whole picture”. - And I don’t know if that “message (or approach)” makes it better or worse (at least for me)… :thinking:

If I may speculate further, here is my current theory/fan-fiction:

A Bitwarden customer used Bitwarden’s passphrase generator to make a 3-word passphrase to protect some internet-facing asset (cryptocurrency?), and subsequently became a victim of a brute-force attack. Now, somebody (a trade publication or security blogger) is about to publish news of this event (possibly a hatchet piece, as we’ve seen in the past), but has notified Bitwarden of the “vulnerability” (which was really a user error), for the purpose of allowing Bitwarden an opportunity to mitigate the issue and provide a response prior to publication. To prevent bad PR, Bitwarden has decided to take some action, so that their response/reaction can be published along with the details of the exploit. The simplest change that directly addresses the identified “vulnerability” while having the least impact on the code base as a whole (and requires the least amount of time for development and QA testing) was to simply up the passphrase lower word limit from 3 to 6. So that is what was done and the reason why.


…am I getting warmer?

Here is where we differ. I view strength more like grades one receives in school. Just as we don’t (shouldn’t) give someone who “tries hard” an inflated grade, I don’t feel it a scalable approach to declare 39 bits “strong” for a Yubikey yet “weak” for Gmail.

In the exceptional cases (Yubikey, Windows Hello, Bitwarden), I believe we would be better off by explaining the mitigating factors that make it OK for these to have a weak password.

I feel we are straying from the important part of my initial post, that it would be helpful for the generator to offer a measurement of strength. Is that something with which there is disagreement?

This thread has already served its main purpose (the individuals who needed to see it appear to have seen it), and unfortunately, it doesn’t seem that Bitwarden is ready yet to provide an explanation for PR 11675. For these reasons, and because the topic title is perhaps unnecessarily antagonistic in tone, I am making this thread unlisted to remove it from the forum index (it will still be accessible by direct link); I will leave the thread open a little while longer in case anybody has constructive follow-up comments, but will set a timer to auto-close the thread 1 week after the last response.

Discussion of the generator passphrase/password length limits can continue in the new Feature Request topic:

 

Likewise, discussion of entropy strength indicators can continue in the relevant Feature Request thread:

 

If anyone wants me to move their above comments into one of the linked feature request threads (so that the comments will remain publicly visible after I unlist this thread), please let me know.

I concur that the topic title is unnecessarily antagonistic. It and some of the posts do not appear to be aligned with the first community guideline.

Perhaps it would make more sense to tone down the topic title, given that this thread is being suggested elsewhere as the place where this discussion belongs.

In the GitHub PR thread, the discussion is now being steered in the direction of my feature request, which makes more sense.

However, the above side-discussion about entropy strength indicators is off-topic for both the current thread and the new feature request. The best place for that argument to continue would be in the old feature request on entropy meters.