Service and server store personally identifiable information (PII) unnecessarily

Storing users’ and customers’ personally identifiable information is a liability for Bitwarden and a risk for those users and customers.

Breaches happen. Subpoena-happy courts can disrupt your business operations. The best way to protect yourself is to avoid storing information that you don’t absolutely need to store. The EFF Guide for service providers has a good list of practices that Bitwarden should check itself against and adopt.

That user in Burma or Russia shouldn’t fear that the government can kick in the door and take the computer and learn where his brother has been for the last year. The FISA court in the USA shouldn’t be able to compel Bitwarden Inc to turn over logs about user Sally or about IP address . The best way to be safe is to be ignorant of it because it doesn’t exist here.

For example, the Event table in the database schema stores IpAddress. Why? For accounting and debugging, it can be useful to store some information, but it’s unlikely that you need to store the exact value. Being able to track an event back to a physical device out on earth is not really in your domain of concern. It’s too much power. Avoid that. IP addresses could be hashed and bucketed to some number that is unique-ish, without giving away actual identity.
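To make the hash-and-bucket idea concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not how the Bitwarden server works: the function name `ip_bucket`, the `secret` parameter, and the bucket count are all hypothetical. It uses a keyed hash (HMAC) rather than a bare hash, since the IPv4 space is small enough to enumerate; that choice is my addition and not something the proposal above requires.

```python
import hashlib
import hmac
import ipaddress

NUM_BUCKETS = 10_000  # hypothetical bucket count, chosen only for illustration


def ip_bucket(ip: str, secret: bytes, num_buckets: int = NUM_BUCKETS) -> int:
    """Map an IP address to a coarse bucket number using a keyed hash.

    `secret` is a hypothetical server-side key; without it, the mapping
    from address to bucket cannot be recomputed by an outsider.
    """
    # Parse and pack the address so equivalent spellings hash identically.
    packed = ipaddress.ip_address(ip).packed
    digest = hmac.new(secret, packed, hashlib.sha256).digest()
    # Reduce the first 8 bytes of the digest to a bucket index.
    return int.from_bytes(digest[:8], "big") % num_buckets
```

Storing only the bucket number keeps events from the same address grouped together for debugging, while many different addresses share each bucket, so a stored value no longer points at one device.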

I think this should be a meta-bug for all the various things to change. There will be some I don’t know about, but those include:

  • identifiers like IP addresses should be hashed and assigned to some smaller number of buckets. Like, if you have 100,000 customers, put them into 10,000 buckets. A single NAT at a university already hides dozens or hundreds of users behind one IP address! Embrace that it was never actually unique for you and protect the other 99.999% of users.
  • plugin descriptions on Chrome Store etc. sound like you’re 0wnzoring all customers’ data. Reduce what you collect outside of the Vault and be clear about the delineation of what’s “collected” as post-crypto. Sending the encrypted vault to a server shouldn’t trigger such a gloomy, definition-hair-splitting policy document that is inaccurate if you understand crypto at all. The policy document should be clean and much smaller at the resolution of this bug report.
  • expiring personal data to uselessness at some horizon. It would be a good idea to store a daily key for decrypting that day’s events, and ensure that keys are created daily and expunged daily once they pass some horizon. So, at day 17, you delete the key that can decrypt the PII parts of day 3’s logs because you keep a 14-day horizon. The object logs can still be kept around for analytics or debugging, but they are no longer a risk to humans.
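The daily-key idea in the last bullet can be sketched as follows. This covers only the bookkeeping (create one key per day, expunge keys at the horizon); the actual encryption of the PII fields is left out, and the class name `DailyKeyStore`, its methods, and the in-memory dictionary are all hypothetical stand-ins. A real deployment would presumably keep keys somewhere they can be deleted reliably.

```python
import datetime as dt
import secrets

HORIZON_DAYS = 14  # the 14-day horizon from the example above


class DailyKeyStore:
    """Hypothetical in-memory store holding one random key per calendar day."""

    def __init__(self, horizon_days=HORIZON_DAYS):
        self.horizon = dt.timedelta(days=horizon_days)
        self._keys = {}  # maps datetime.date -> 32 random bytes

    def todays_key(self, today):
        """Return the key for `today`, creating it on first use."""
        if today not in self._keys:
            self._keys[today] = secrets.token_bytes(32)
        return self._keys[today]

    def lookup(self, day):
        """Return the key for `day`, or None if it was expunged or never made."""
        return self._keys.get(day)

    def expunge(self, today):
        """Delete keys that have reached the horizon; return the days removed."""
        expired = sorted(d for d in self._keys if today - d >= self.horizon)
        for d in expired:
            del self._keys[d]
        return expired
```

If `todays_key` and `expunge` are both called once per day, then on day 17 the only key removed is day 3’s, matching the example in the bullet. Once a key is gone, the PII fields encrypted under it are unreadable, while the rest of each log record stays usable.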

I don’t want to remove all PII from use in the server. I just want Bitwarden Inc and server not to be such a target-rich environment for people who want to know about its users. It’s probably okay to still transport the IP address through the app layers and emit an address in email and on-screen, but I don’t want that to be stored for very long. It would be best if taking all information out of a running server gave away very little information, so that using a browser plugin or running a server or trusting Bitwarden Inc’s server becomes an unqualified good idea.

Thanks, @chadmiller! Always good to keep an eye on what information is collected. Feedback is definitely appreciated.

One small point of clarification - the Events are only logged for Teams and Enterprise Organizations, so standard/premium/family user events are not collected.

For those wondering about the types of data we store in the event logs, the docs are available here: Event Logs | Bitwarden Help & Support

Any other specific feedback is welcomed, too :+1: