We released version 12 last week and it includes a very cool new feature: de-duplication. De-duplication filters out duplicate email messages during the conversion process, and it’s really easy to use. All you have to do is turn on de-duplication in Emailchemy’s preferences.
That’s really all there is to it, but even though it’s simple, it’s still quite powerful. Here are a few more details on its features:
Safe de-duplication
Because we know just how important every last email message is, we built Emailchemy’s de-dupe feature (saying “de-duplication” every time takes too long) to be very, very sure that a message is a duplicate before filtering it out. It’s not as simple as checking whether two messages have the same subject, sender and date. But because we also understand that even the best algorithm will always have edge cases, we added an option to save all the duplicate messages to a separate folder. That way, you can verify the duplicates before deleting them.
You can also see all the messages that were filtered as duplicates in Emailchemy’s conversion log.
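To make the "safe" idea concrete, here is a minimal Python sketch of a conservative duplicate check. It is not Emailchemy’s actual algorithm, which this post does not spell out; the function names fingerprint and split_duplicates are made up for illustration. The sketch assumes that two messages count as duplicates only when several headers and the full body content match, and it sets duplicates aside instead of throwing them away.

```python
import hashlib
from email import message_from_string
from email.message import Message


def fingerprint(msg: Message) -> str:
    """Hash several headers plus every body part (hypothetical helper).

    Messages that only share subject, sender and date but differ in
    content get different fingerprints.
    """
    h = hashlib.sha256()
    for name in ("Message-ID", "From", "To", "Date", "Subject"):
        # Collapse whitespace so folded headers compare equal.
        value = " ".join(str(msg.get(name, "")).split())
        h.update(name.encode() + b":" + value.encode("utf-8", "replace") + b"\n")
    for part in msg.walk():
        if not part.is_multipart():
            h.update(part.get_payload(decode=True) or b"")
    return h.hexdigest()


def split_duplicates(messages):
    """Return (unique, duplicates); duplicates are kept for review."""
    seen = set()
    unique, duplicates = [], []
    for msg in messages:
        key = fingerprint(msg)
        if key in seen:
            duplicates.append(msg)  # the "separate folder" in this sketch
        else:
            seen.add(key)
            unique.append(msg)
    return unique, duplicates


raw = (
    "From: a@example.com\nTo: b@example.com\n"
    "Date: Mon, 1 Jan 2024 10:00:00 +0000\nSubject: Hello\n\nFirst body\n"
)
# Same subject, sender and date, but a different body: not a duplicate.
lookalike = raw.replace("First body", "Second body")
msgs = [message_from_string(text) for text in (raw, raw, lookalike)]
unique, duplicates = split_duplicates(msgs)
print(len(unique), len(duplicates))  # 2 1
```

The lookalike message survives because its body differs, which is the kind of case a subject/sender/date comparison would get wrong.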
De-duplication stats
When de-duplication is enabled, Emailchemy’s progress window will show you how many duplicate messages it is filtering out.
De-duplicator memory
Emailchemy will even de-dupe across files from different email applications. For as long as the application is running, it keeps track of the messages it has seen in a de-duplicator cache. That is, Emailchemy will check for duplicates of any message it has seen since you launched it.
This is useful if you are trying to convert and condense messages from more than one email application. For example, let’s say you once used Eudora at home and Outlook at work to check a personal email account. Messages from that account would be in files from both Eudora and Outlook, so converting both would leave you with duplicate copies. Emailchemy’s de-dupe feature helps solve that problem.
Emailchemy’s de-duplicator cache will clear itself every time you close Emailchemy, but you can also clear it manually by selecting “Clear De-duplicator Cache” in the Tools menu.
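If it helps to picture the cache, here is a rough Python sketch of a session-long de-duplicator memory. The DedupeCache class, the convert function and the string keys are all hypothetical stand-ins (a real key would be some fingerprint of the message); this only illustrates the behavior described above, not how Emailchemy is implemented.

```python
from typing import List, Set


class DedupeCache:
    """Remembers message keys for as long as the program is running."""

    def __init__(self) -> None:
        self._seen: Set[str] = set()

    def is_duplicate(self, key: str) -> bool:
        """Return True if the key was seen before; otherwise remember it."""
        if key in self._seen:
            return True
        self._seen.add(key)
        return False

    def clear(self) -> None:
        """Forget everything, like clearing the cache by hand."""
        self._seen.clear()


def convert(keys: List[str], cache: DedupeCache) -> List[str]:
    """Keep only the messages the shared cache has not seen yet."""
    return [key for key in keys if not cache.is_duplicate(key)]


cache = DedupeCache()              # one cache shared by every conversion
eudora = ["msg-1", "msg-2"]        # stand-ins for message fingerprints
outlook = ["msg-2", "msg-3"]       # "msg-2" exists in both applications

from_eudora = convert(eudora, cache)     # ['msg-1', 'msg-2']
from_outlook = convert(outlook, cache)   # ['msg-3']

cache.clear()
after_clear = convert(outlook, cache)    # ['msg-2', 'msg-3']
```

Because the cache is shared between the two conversions, the copy of "msg-2" in the Outlook files is caught even though the first copy came from Eudora. After clearing, nothing is remembered, so the same message passes through again.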
De-dupe without converting
You can remove duplicates from archives of mbox and EML files without doing any real conversion if you set the output format to be the same as the source format. So, by “converting” from mbox to mbox, you would actually just be removing the duplicates.
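For the curious, the mbox-to-mbox idea can be sketched in a few lines with Python’s standard mailbox module. This is an illustration only and does not use Emailchemy; the function name dedupe_mbox is made up, and the sketch assumes the strictest possible rule, treating two messages as duplicates only when they are byte-for-byte identical.

```python
import hashlib
import mailbox


def dedupe_mbox(src_path: str, dst_path: str) -> tuple:
    """Copy src mbox to dst mbox, skipping byte-identical messages.

    Returns (kept, removed). Hypothetical helper for illustration.
    """
    src = mailbox.mbox(src_path, create=False)  # fail if the source is missing
    dst = mailbox.mbox(dst_path)                # created if it does not exist
    seen = set()
    kept = removed = 0
    try:
        for msg in src:
            # Hash the whole message (headers and body, not the mbox
            # "From " separator line).
            key = hashlib.sha256(msg.as_bytes()).hexdigest()
            if key in seen:
                removed += 1
                continue
            seen.add(key)
            dst.add(msg)
            kept += 1
        dst.flush()
    finally:
        src.close()
        dst.close()
    return kept, removed
```

Calling dedupe_mbox("inbox.mbox", "inbox-deduped.mbox") (both file names are placeholders) writes a new file and leaves the original untouched, so you can compare the two before deleting anything.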