Outlook PST files have a very complicated internal database structure that is read from and written to probably thousands of times every day, and they tend to be quite large. These characteristics, combined with the fact that Outlook and Windows do crash every now and then, make the Outlook PST file somewhat susceptible to data corruption.
It can be random, and you may not even notice it right away, but even if a single bit gets flipped from a “0” to a “1” in a PST file, you could lose messages or the ability to even open the PST file in Outlook.
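To make the single-bit point concrete, here is a minimal sketch in Python. It does not read a real PST file. It uses an in-memory block of zero bytes as a stand-in, and the helper names `sha256_hex` and `flip_bit` are made up for this illustration. It shows that flipping one bit leaves the size unchanged but changes a checksum, so recording a checksum of a known-good copy is one way to detect silent changes later.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def flip_bit(data: bytes, byte_index: int, bit_index: int) -> bytes:
    """Return a copy of the data with exactly one bit inverted."""
    mutated = bytearray(data)
    mutated[byte_index] ^= 1 << bit_index
    return bytes(mutated)


# Stand-in for file contents (an assumption for illustration): 1024 zero bytes.
original = bytes(1024)

# Flip the lowest bit of byte 512 from 0 to 1.
damaged = flip_bit(original, 512, 0)

# Same length, different checksum.
print(len(original) == len(damaged))                # prints True
print(sha256_hex(original) == sha256_hex(damaged))  # prints False
```

A checksum only tells you that something changed, not what changed or whether the file is still usable.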
How do you know if a PST file is corrupt? Some signs are obvious: Outlook reports an error when you try to open the file, or Outlook crashes when you try to open a particular message or folder in the PST data. Other signs are more subtle: you may discover that messages or attachments have disappeared, that search no longer returns any results, or that you can’t move messages in or out of the PST file. You may not notice these cases in normal day-to-day use, but you just might when you want to export your email from Outlook PST files with a utility like Emailchemy.
Emailchemy reads the PST file directly from the disk. It does not use the Microsoft connectors or APIs, but it reads the data from the PST’s database in much the same way that Outlook would. When Emailchemy finds corrupted entries in the PST data, it first tells you about it by logging a warning to the console (or to the stdout.log or stderr.log file) and then tries to recover. Most of the time, if an expected data object like an attachment or other message body part is missing, Emailchemy will easily recover and you will get the message in the output with all the data that Emailchemy was able to find. However, because data corruption is random in nature, it can sometimes cause Emailchemy to halt or stop the conversion prematurely, but this only happens with files that even Outlook would have trouble opening. With Emailchemy, you would at least get some of the messages out.
Still, not getting all your messages out is simply unacceptable. So, what can you do if you have a corrupt PST file? (Remember, if Emailchemy is not completely converting your PST file, then it is likely corrupt.)
Fortunately, Microsoft is aware of how easily the PST file can become corrupted, so it provides the necessary tools for repairing it. The Inbox Repair Tool from Microsoft, aka “scanpst.exe”, can identify PST file corruption and most of the time repair it to the point that Outlook can again open the file, and thus so that Emailchemy can read the PST file too.
Here’s an excerpt from the Microsoft support article on how to use scanpst.exe:
ScanPST mostly validates and corrects errors in the internal data structures of a .pst file. The .pst file is a database file. Therefore, structures, such as BTrees and reference counts, are checked and repaired as necessary. These low-level objects have no knowledge of the upper-level structures, such as messages, calendar items, and so on, that are built upon them. If ScanPST determines a specific block of the structure or table is unreadable or corrupted, ScanPST removes it. If that block was part of a specific item in Outlook, the item will be removed when it is validated. User may not expect this behavior. However, the removal of the item is appropriate given the circumstances. Also, this specific type of situation is probably very rare, and it will always be entered in the ScanPST log file.
Scanpst.exe will also create a backup of the original PST file before attempting repair. Keep this backup in case the repair removes messages or attachments (items) from the database, which is rare.
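If you would like an extra copy of your own in addition to the backup that scanpst.exe creates, the following Python sketch shows one way to make it before running any repair. It assumes Outlook is closed so the file is not in use. The function name `backup_pst`, the `.bak` suffix, and the backup folder are hypothetical choices for this example and have nothing to do with how scanpst.exe names or stores its own backup.

```python
import filecmp
import shutil
from pathlib import Path


def backup_pst(pst_path: str, backup_dir: str) -> Path:
    """Copy a PST file into backup_dir and verify the copy byte for byte.

    The ".bak" suffix is an arbitrary choice for this sketch.
    """
    source = Path(pst_path)
    target_dir = Path(backup_dir)
    target_dir.mkdir(parents=True, exist_ok=True)

    target = target_dir / (source.name + ".bak")
    if target.exists():
        # Refuse to overwrite an earlier backup.
        raise FileExistsError(f"backup already exists: {target}")

    # copy2 copies the contents and preserves timestamps where possible.
    shutil.copy2(source, target)

    # shallow=False forces a comparison of the actual file contents.
    if not filecmp.cmp(source, target, shallow=False):
        raise OSError(f"backup does not match original: {target}")

    return target
```

You would call it as `backup_pst(r"C:\path\to\your.pst", r"D:\pst-backups")`, with both paths replaced by your own, and only run the repair once the call returns without raising an error.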
To sum up, in most cases you will not need to run scanpst.exe on your PST file before converting it with Emailchemy, but if you think Emailchemy isn’t converting your PST file correctly, try running scanpst.exe on it first. If you see warnings written to the console logs during the conversion of your PST file, definitely try running scanpst.exe on the PST file and then try the conversion again.