Notes 10000 character max error with less than 10k char

djkdjkdjk · February 18, 2021, 10:13pm

Importing from LastPass, encountered the 10k character max limit. Found the offending records, removed them, successfully imported the rest.

Here comes the fun part: I then split the offending records into <= 9000 char, attempted to create new records for them individually, but I’m still getting the 10k error when pasting some of them in. I can plainly see each has well under 10k char (wc -m ), yet Bitwarden tells me it’s over 10k. Am I missing something?

Thanks

eliykat · February 18, 2021, 10:54pm

The character limit applies to your data after it’s been encrypted. Our encryption algorithm increases the length of your data, so 9,000 characters unencrypted will exceed the 10,000 limit after encryption. It seems you can use 7,500 characters as an approximate limit for unencrypted text - see the recent discussion here: The field Notes exceeds the maximum encrypted value length of 10000 characters (when it is false) · Issue #1618 · bitwarden/browser · GitHub

djkdjkdjk · February 25, 2021, 4:46pm

Ah, great, makes sense, thanks for clarifying!

TheBestPessimist · September 16, 2022, 11:02am

Is there the option to increase this character limit?

Robert_Ellis · June 3, 2023, 7:19pm

This for sure is the biggest issue with bitwarden. Limited notes.

DoctorB · June 3, 2023, 11:32pm

Does BitWarden compress the data before encrypting?

eliykat · June 4, 2023, 10:01pm

No, we don’t compress encrypted strings.

DoctorB · June 4, 2023, 11:15pm

Out of interest I created a secure note by repeating the characters 1234567890 until I got the error about about “the field note exceeds the maximum encrypted value length of 10,000 characters”

The maximum BW note size is 7,439 characters (using the method I described)

I then encrypted the same message with GPG/PGP (which I know compresses before encrypting) and the result is just 227 ascii characters

That is 227 versus 10,000, both AES256. Maybe BW should compress before encrypting to help alleviate the problem of the restriction of the note size

password is password LOL
-----BEGIN PGP MESSAGE-----

jA0ECQMCawdQ9PkaBTv/0o0BQke9ab/HBWuFBD/nuZ66pqYqNQgOfPjiKhbwXiMo
lU6KeG0wLIEGfDtL5TbHZ4Z0mzx6b+zLDYEqngL9xvtHOISlhP/SYfNx2ritrWn7
uQBYODSLUGf3HJaOnpZHZZtK9L5WnlIQqsdrVQDaHVYxuc0A6ujLUY/sAbYITR4t
nEQz9QG9egPAsDNkqYA=
=Pt8T
-----END PGP MESSAGE-----

grb · June 5, 2023, 12:59am

This is not a bad idea, but your example is misleading:

Because of the way compression algorithms work, repeated patterns compress much more efficiently than other data (for example, I can create a “compressed” representation of the string 1234567890 repeated 750 times, simply by encoding this pattern as 750X1234567890, thus reducing the data size from 7500 bytes to 14 bytes).

Below is a more realistic experiment, using an online text compression tool. In each case, the input text consisted of 7500 characters:

Text Source	Compressed Size	Compression
Repeated “`1234567890`”	68 bytes	99%
Lorem Ipsum pseudo-Latin	1896 bytes	75%
Moby Dick Chapter 1	4824 bytes	36%
Random ASCII characters	7724 bytes	–3%

The Moby Dick example suggests that for English text, it may be possible to store a Secure Note that is up to 12k in length. However, for storing encryption keys and other random data, you would be better off not using any compression (since the compression algorithm actually expanded the data size).

Christop · September 29, 2023, 12:49pm

The character limitation of Bitwarden Login not field is still 10000 characters ? (I have not seen something about the limitation in the FAQ)
It is possible to use bit of markdown or it is strictly plain text ?

grb · September 29, 2023, 2:10pm

@Christop Yes, the character limit for Notes fields is still 10,000 characters after encryption. Markdown is not supported.

rokejulianlockhart · January 11, 2024, 8:52pm

@eliykat, why not?

eliykat · January 12, 2024, 3:30am

I’m not sure. My best guess is that encrypted strings are usually already quite small (the size of a single field, e.g. a username or password) so compression wouldn’t yield much benefit in exchange for slowing down decryption (which is already quite intensive for large vaults).

I might be wrong on that though, or there might be better ways to do it - this is just how we do it today.

grb · January 12, 2024, 4:37am

Another reason not to compress before encrypting is that if attempting to compress a string that is random (e.g., a password, or a note containing an encryption key or recovery code), then compression algorithms will typically return an output that is larger in size than the input. For the same reason, it would not be a good idea to compress ciphers after encrypting.

Quexten · January 7, 2025, 2:11am

Just using a better encoding than base64 could reduce the overhead slightly, (base85, others), as long as care is taken that the special characters introduced don’t cause issues.

However, compressing plaintext data before encrypting is a dangerous game. If an attacker somehow has control over some of the plaintext saved (Imagine a feature where a Passkey auto-updates your displayName or something) CRIME style attacks can allow a chosen-plaintext attacker to recover decrypted data (or even encryption keys), by repeatedly choosing a plaintext part, and observing the compressed & encrypted lengths.

With individual field strings this is not as much of a problem, but moving to encrypting larger objects at once (f.e entire ciphers) this becomes a risk.

rokejulianlockhart · January 7, 2025, 10:45am

@Quexten, that’s incredible, but even after reading the Wikipedia article, remains difficult to believe in the form that it’s presented. Can you elaborate how compression renders the content easier to estimate? I ask because length should vary fairly randomly, and I cannot see how it would not be nullified by a decent encryption algorithm.

I would presume that using a library makes this quite trivial. Nobody’s going to re-engineer their own > base64 implementation.

DenBesten · January 7, 2025, 1:35pm

To the extent that the attacker knows something about the plaintext (begins “-----BEGIN PGP MESSAGE-----”, or only uses 64/256 characters), a brute force attack can more quickly rule out candidates. This is one of the techniques used by the Bombe to break the Enigma machine.

DoctorB · January 7, 2025, 1:53pm

I think compressing before encrypting is generally considered a secure practice.
One reason is that it enhances security.
Compression algorithms often remove patterns and redundancies in data.
Encrypting the already compressed data further obscures the original information, making it more difficult for attackers to analyze.

Compressing plaintext data before encrypting is the approach used by GPG/PGP and successfully used by Ed Snowden to keep his communication secure from the NSA.
Snowden worked at the NSA so he knew their capabilities and he considered GPG secure. I think this approves compressing before encrypting on security grounds.

PS This topic is about storing larger data in a note field but this will only work if the plaintext data is compressible and that is often not the case.

So we are back to just increasing the data limit (no compression).
It sounds like an easy thing to do so if BW wouldn’t mind just increasing the limit then that would be great, thanks.

rokejulianlockhart · January 7, 2025, 2:07pm

@DenBesten, I’m familiar with how Enigma was broken. However, that’s irrelevant to compression, for the German communications were uncompressed.

Additionally, I believe that the PGP example you provided can be nullified by communicating the content in a structured manner with randomly ordered key values, instead of mere text/plain with PGP-key preface and postface syntax.

No, it doesn’t. Snowden is just a security researcher with asylum in the Russian Federation. He’s not particularly competent in comparison to those who design encryption algorithms. Irrespective, that’s conjecture – an actual evaluation, like the one aforecited in Wikipedia, is more actionable.

Quexten · January 7, 2025, 6:57pm

Encryption does not vary length randomly, at least for AES or chacha. For chacha20poly1305, the length is mapped 1:1 (it is a stream cipher, that is xor’d with the plaintext and a MAC is calculated on the ciphertext afterwards). For AES in CBC mode, the block size is 16, so your padding can be up to 15 bytes (with pkcs5). But even with padding, you can increase the data to a tipping point, and then leak plaintext size via the amount of blocks. So, for instance, at the moment, from the encrypted format of a vault, you can tell whether a password is 14 characters (1 block of ciphertext), or 20 characters (2 blocks). Rather, you know if the password falls into the interval [0,16], or (16,32] and so on. Base64 then additionally adds 33% overhead on it (plus IV plus MAC for the current bitwarden encstring type 2).

I would presume that using a library makes this quite trivial. Nobody’s going to re-engineer their own > base64 implementation.

Yes standard base85 implementations exist.

@DoctorB

I think compressing before encrypting is generally considered a secure practice.

I do not agree with this, and would like to request a source. CRIME/BREACH style attacks prove otherwise. Additionally, TLS 1.3 specifically removed compression because it was insecure. From the RFC:

Other cryptographic improvements were made, including changing the
RSA padding to use the RSA Probabilistic Signature Scheme
(RSASSA-PSS), and the removal of compression, the Digital
Signature Algorithm (DSA), and custom Ephemeral Diffie-Hellman
(DHE) groups.

Snowden worked at the NSA so he knew their capabilities and he considered GPG secure. I think this approves compressing before encrypting on security grounds.

PGP/GPG were OK for the time, but are not up to modern standards anymore:
https://www.latacora.com/blog/2019/07/16/the-pgp-problem/
https://blog.cryptographyengineering.com/2014/08/13/whats-matter-with-pgp/
https://soatok.blog/2024/11/15/what-to-use-instead-of-pgp/

I think this approves compressing before encrypting on security grounds.

I do not agree with this conclusion. Also, for GPG/PGP for encrypting data manually, this might be fine, because an attacker does not have a chosen plaintext channel. For Bitwarden specifically, I can imagine a few features that could be added that accidentally introduce a chosen plaintext channel.

With respect to how the attack works: The main idea is that the chosen plaintext in the compressed message is just brute-forced. If it matches a larger prefix of the secret, it will compress better. Therefore, if the attacker can reliably force the client to encrypt a chosen plaintext + a unknown target secret + other data (irrelevant for the attack), then they can adapt the plaintext, and search which string leads to the best compressed encrypted result and iteratively reconstruct the secret. If data is just encrypted per-field (as it is now) then this is probably not problematic, but again, moving to larger encrypted object this becomes more relevant.

I agree the solution is increasing the data size.

If anyone wants further reading: https://ethz.ch/content/dam/ethz/special-interest/infk/inst-infsec/appliedcrypto/education/theses/threema.pdf section 4.3.2 is a more recent example on the backup function of the E2E encrypted chat application threema that abuses this compression ratio oracle.