Duplicate removal tool/report (including merge)

Hi all, are there any updates on this feature?
I have used BW for many years and have never seen this feature implemented.

1 Like

Hi @jwmcgettigan ,
does this script delete all duplicates automatically without letting you check them?
Is there a way to check the duplicates before deleting them?
If not, does this script leave you a report of what’s been removed and what’s not?

Enable Bitwarden to automatically check for duplicates when importing passwords, logins, data, etc. from other sources.

For each duplicate found, automatically prompt for a choice: delete and replace the current login for that account, keep the original, merge the two together, or delete both.

Also provide an important setting allowing you to choose one of these options as a default setting when importing.
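
A minimal sketch of how such an import-time check might behave, assuming two logins count as duplicates when they share a URI and a username. `ImportPolicy`, `duplicate_key` and `import_logins` are hypothetical names for illustration only and are not part of Bitwarden; the flat `uri`/`username` fields are likewise a simplification.

```python
from enum import Enum


class ImportPolicy(Enum):
    """Hypothetical default choices matching the options listed above."""
    REPLACE = "replace"          # delete the current login, keep the imported one
    KEEP_ORIGINAL = "keep"       # ignore the imported login
    MERGE = "merge"              # fill gaps in the current login from the imported one
    DELETE_BOTH = "delete_both"  # drop both entries


def duplicate_key(login):
    # Assumption: same URI plus same username means "same account".
    return (login.get("uri", "").lower(), login.get("username", ""))


def import_logins(vault, incoming, policy):
    """Return a new list of logins after importing `incoming` into `vault`."""
    # Note: duplicates already inside `vault` collapse to the last one here.
    merged = {duplicate_key(item): dict(item) for item in vault}
    for new in incoming:
        key = duplicate_key(new)
        if key not in merged:
            merged[key] = dict(new)
        elif policy is ImportPolicy.REPLACE:
            merged[key] = dict(new)
        elif policy is ImportPolicy.MERGE:
            current = merged[key]
            # Keep existing values; only fill fields that are missing or empty.
            for field, value in new.items():
                if not current.get(field):
                    current[field] = value
        elif policy is ImportPolicy.DELETE_BOTH:
            del merged[key]
        # KEEP_ORIGINAL: nothing to do, the current login stays as it is.
    return list(merged.values())
```

An interactive version would ask for the policy per duplicate instead of taking one default for the whole import.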


1 Like

Made an account here specifically to :arrow_up: vote on this! Please, it would make the onboarding process much less painful, which is important for less tech-savvy users (elderly parents, anyone?).

4 Likes

Damn, when will this be implemented… Just do it, it will really please a lot of users

3 Likes

Here is one way to remove duplicates.

  1. Export passwords from Bitwarden
  2. Paste text from passwords file into this website
    https://dedupelist.com/
  3. Click Submit button and copy generated text back to the passwords file, and save it.
  4. Purge Bitwarden database.
  5. Re-import passwords text file.

I checked the website’s code: it doesn’t submit your passwords online and does the de-dupe locally. You can put your phone in airplane mode if you don’t trust it.

2 Likes

Upvote!!!

Please, as a first step, simply move duplicates to a separate group or vault.
Merging the data would be magic and is probably too complicated.

So please do a first step.

My suggestion:
Add a button to move duplicates to another storage location.
Match by URL and username, or by password if there is no username.
Treat the oldest entry as the duplicate, since a newer entry will normally be more up to date.

This will help most people here, and it will save us from pasting a secure password database into an online tool, as mentioned above.
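
The suggestion above could be sketched offline roughly like this. It assumes the layout of an unencrypted Bitwarden JSON export (`login.uris[].uri`, `login.username`, `login.password`, `revisionDate`); `match_key` and `split_duplicates` are hypothetical helper names, and only the first URI of each item is compared.

```python
from collections import defaultdict


def match_key(item):
    """URL plus username, or URL plus password if there is no username."""
    login = item.get("login") or {}
    uris = login.get("uris") or []
    url = (uris[0].get("uri") or "") if uris else ""
    who = login.get("username") or login.get("password") or ""
    if not url or not who:
        return None  # not enough data to call anything a duplicate
    return (url.strip().lower(), who)


def split_duplicates(items):
    """Return (keep, duplicates); the newest entry of each group is kept."""
    keep, duplicates = [], []
    groups = defaultdict(list)
    for item in items:
        key = match_key(item)
        if key is None:
            keep.append(item)  # notes, cards, logins without a URL, etc.
        else:
            groups[key].append(item)

    for group in groups.values():
        # Assumption: revisionDate is an ISO 8601 string, which sorts
        # chronologically when all values share the same format.
        group.sort(key=lambda it: it.get("revisionDate") or "", reverse=True)
        keep.append(group[0])
        duplicates.extend(group[1:])
    return keep, duplicates
```

The `duplicates` list can be written to a separate file for review instead of being deleted outright.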

1 Like

As I said I reviewed the code of the online tool. It does not make any copies of the text or post it online, everything is done locally in the browser and never saved anywhere. I also said you can turn on airplane mode (or turn off WiFi) to be sure. Just trying to help.

You can also do the same with tools like Notepad++ which is a local application, but requires users to download and install it.

1 Like

Good god man how is this still not a thing. People drop your product immediately when they discover they can’t delete more than one item at a time. People come to you with saved logins from 6 browsers. How in the flaming hell is there no way to delete these duplicates. It boggles the mind.

7 Likes

Adding my voice here. I have been a paid Bitwarden user for a few years and wanted to clean up my browser-stored passwords. I am surprised to see there is no de-dupe function in Bitwarden, which should be simple to implement and would save users hassle.

I am disappointed to find this feature is still missing. Simply being able to select multiple entries and merge them would already be welcome.

Seeing posts here with suggestions to use websites to paste all your passwords in is even more disappointing…

3 Likes

Since you can export them to a CSV file, wouldn’t it be possible, and most importantly safe, to run an offline Python script to delete the duplicates and then re-import?
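
A minimal sketch of that idea, using only the standard library so nothing leaves the machine. It treats two rows as duplicates only when every column is identical, so it makes no assumption about the export's column names; `dedupe_csv` is a hypothetical helper name.

```python
import csv


def dedupe_csv(src_path, dst_path):
    """Copy src_path to dst_path, dropping rows identical in every column.

    Returns the number of rows that were dropped.
    """
    seen = set()
    removed = 0
    with open(src_path, newline="", encoding="utf-8") as src, \
         open(dst_path, "w", newline="", encoding="utf-8") as dst:
        writer = csv.writer(dst)
        for row in csv.reader(src):
            key = tuple(row)
            if key in seen:
                removed += 1
                continue
            seen.add(key)
            writer.writerow(row)  # the header row is unique, so it is kept
    return removed
```

Rows that differ in any column, such as a note or a folder, are kept, so this only catches exact copies.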

I just wrote this Python script for this purpose. It removes duplicates only if they differ solely by their internal Bitwarden id, so if you have two mostly identical logins but one has a text note, it will keep both. Use at your own risk.

Here’s what I did:

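  1. Export the vault and save it as ‘after-merge.json’
  2. Run the script
  3. Purge the vault
  4. Import the ‘deduped.json’ file

import json

# Load the unencrypted JSON export.
with open('after-merge.json', encoding='utf-8') as f:
    d = json.load(f)

# Key each item by its contents with the id blanked out, so items that
# differ only by id collapse into one entry (the last one seen wins).
dd = {}
for item in d['items']:
    dd[repr({**item, 'id': 0})] = item

# Write the de-duplicated vault for re-import.
d['items'] = list(dd.values())
with open('deduped.json', 'w', encoding='utf-8') as f:
    json.dump(d, f, indent=2)

2 Likes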

this. thank you. this is what I am talking about. perfect. exported to after-merge, opened spyder. ran your code. imported deduped after hesitatingly deleting all… and it worked. no duplicates.

Thanks for the feedback! I scrolled up after posting and saw this post which is more thorough than my code. So I took inspiration from their code and updated mine to take the same fields into account (not just “id” but also “folderId”, etc):

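import json
import sys

# Arguments: path of the exported JSON file, then path of the output file.
with open(sys.argv[1], encoding='utf-8') as f:
    d = json.load(f)

# Fields ignored when comparing items; they are blanked before building the key.
remove = {'id': 0, 'folderId': 0, 'revisionDate': 0, 'creationDate': 0, 'deletedDate': 0}

dd = {}
for item in d['items']:
    # Later items overwrite earlier ones that have the same key.
    dd[repr({**item, **remove})] = item

d['items'] = list(dd.values())
with open(sys.argv[2], 'w', encoding='utf-8') as f:
    json.dump(d, f, indent=2)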
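
For anyone who wants to review what would be dropped before purging the vault, here is a hedged sketch of a report step. It follows the same last-one-wins idea and ignores the same fields as the script above, but it compares items with `json.dumps(..., sort_keys=True)` rather than `repr`, so key order does not matter. `find_removed` and `summarize` are hypothetical helper names.

```python
import json

# Fields assumed to vary between otherwise identical copies
# (the same list the script above blanks out).
IGNORED = ("id", "folderId", "revisionDate", "creationDate", "deletedDate")


def find_removed(items):
    """Return the items a last-one-wins dedupe would drop; `items` is unchanged."""
    last_index = {}
    for index, item in enumerate(items):
        stripped = {k: v for k, v in item.items() if k not in IGNORED}
        last_index[json.dumps(stripped, sort_keys=True)] = index
    kept = set(last_index.values())
    return [item for index, item in enumerate(items) if index not in kept]


def summarize(removed):
    """One line per dropped item: its name and, if present, its username."""
    lines = []
    for item in removed:
        username = (item.get("login") or {}).get("username") or "-"
        lines.append(f"{item.get('name', '?')} ({username})")
    return lines
```

Calling `summarize(find_removed(d['items']))` on the loaded export gives a list that can be printed or saved next to the deduped file.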

Alex, thanks for this! What a great solution for the community.

An update - there is room for improvement for users to clean up their vault items so we understand the feedback. The team is working through other priorities at the moment. Another solution to find duplicates in your vault is to use the reused passwords report in addition to what Alex provided.
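
As an offline approximation of that idea (not how Bitwarden's own report is implemented), a short sketch can group the logins in an unencrypted JSON export by password. It assumes each login stores its password under `login.password`; `reused_passwords` is a hypothetical helper name.

```python
from collections import defaultdict


def reused_passwords(items):
    """Map each password used by more than one login to the item names using it."""
    by_password = defaultdict(list)
    for item in items:
        password = (item.get("login") or {}).get("password")
        if password:
            by_password[password].append(item.get("name", "?"))
    return {pw: names for pw, names in by_password.items() if len(names) > 1}
```

A shared password only hints at a duplicate; two different accounts can reuse one password, so the groups still need a manual look.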

1 Like

A de-duplication clean-up utility would be useful, but so would statistics on when each credential was last used, so we can get rid of logins for services that no longer exist, or where I’m no longer a customer and haven’t logged in for the last X years.

I completely understand the frustration of dealing with duplicate entries in your Bitwarden vault. It can be time-consuming and cumbersome to manually clean up your password manager. Fortunately, there are some excellent community-driven efforts to address this issue, including my own humble contribution.

One such contribution is my Bitwarden-Vault-Cleaner, which you can find on GitHub here. This utility goes beyond just removing duplicates. It offers several powerful features to optimize your Bitwarden vault:

  1. Duplicate Entry Removal: Automatically detects and removes duplicate entries based on URI, username, and password.
  2. URL Validation: Checks the validity of URLs within your Bitwarden entries by pinging the base site. Invalid URLs are automatically removed from the entry’s URIs list.
  3. Optimized Export: Exports a cleaned and optimized Bitwarden export JSON file with duplicate entries and invalid URLs removed.
  4. Real-time Output: Provides progress updates in real-time, displaying which items are being processed and the reason for deletion (if any).
  5. Customizable Configuration: Easily configure the utility by adjusting settings such as input file names, output file names, and URL timeout duration.
  6. Detailed Deletion Report: Generates a separate JSON file containing information about deleted items, including the reason for deletion, for your reference.
  7. Skip “https://” or “http://” URIs: Items with a single “https://” or “http://” URI are retained during processing and are not deleted.

I want to acknowledge the valuable contributions of the community in addressing this issue.

1 Like

Wanted to provide an update here regarding the script I shared previously as I can’t edit my previous post. I just did a fairly large overhaul which has expanded the capabilities of the script.

If you are interested in the originally shared script, as it is smaller and simpler, you can find it here.

If you are interested in the updated version, the link from my previous post will bring you there but I’ll share the link again here.

My last comment there, as of posting this, explains many of the improvements from the original script. I’ll share those details here as well for posterity:

Changelog

  • The script now also deletes duplicate folders where the previous version only deleted duplicate items.
  • The script output is now much fancier with colors and loading bars but does now require installing the colorama and tqdm packages.
  • The script now automatically syncs your Bitwarden vault to ensure the data is up-to-date. This can be disabled with the --no-sync flag.
  • Session setup can now be taken care of by the script itself. If the user wants to opt-out of it they can use the --no-auth flag.
  • Added the --dry-run flag for if you want to run the script without actually deleting anything.
  • Added the --ignore-history flag for if you want to ignore password history when determining duplicates.
  • Added the --oldest flag for if you want to keep the older duplicates instead of the newer ones.
  • Added the --empty-folders flag for if you want empty folders to be included when determining duplicate folders.
  • The script now has a VERSION associated with it that can be checked with the --version flag.
  • Added a summary to the end of a successful run of the script.
  • All functions are well documented and the code should be very readable. Unfortunately the script is far larger as a result.

Example Usage (Preview Image)

How does your script handle organizations?
I have a case where I am splitting and merging organizations (and importing from other password managers for them) and need to move/merge only the differences from one to another.