
Exposed! What Lawyers Need to Know About Metadata

In the early 2000s many lawyers were horrified to find that their documents could reveal more than they meant to share. Beyond comments and tracked changes that the author forgot to remove, Microsoft Office and PDF documents have file properties that reveal everything from author to editing time to file location. There are many ethics opinions that address metadata, including NC 2009 FEO 1. Fortunately, it is easy to find and remove these digital footprints from your documents in just a few steps.

What Is Metadata?

There is data lurking in your data. Some people call it “invisible ink”. The technology world refers to it as “metadata”. Either way, the reference is to information in an electronic document that is not always visible. Metadata, or “information about information”, does serve a purpose. Metadata helps users save and retrieve documents more readily by capturing information such as author, editor, “date created” and “date revised” in the hidden part of the document. However, other information is also captured, such as additions, deletions, revisions, versions, and comments, that an attorney may not want to share with others. In the same way that a misdirected email can expose client confidential information to others, metadata may inadvertently provide a peek behind the curtain.

Why Is Metadata a Big Deal?

The potential for exposure of metadata in electronic documents has been widely known since 1998. Mass movement to Microsoft Word by attorneys, the advent of efiling, ediscovery, and many bar association ethics opinions thrust the issues with metadata into the light.

There were also many examples of metadata exposure that made headlines. For example:

  • October 2000: The Wall Street Journal reports that a candidate running for the U.S. Senate began receiving anonymous emails containing messages written in MS Word criticizing and attacking the candidate. A savvy aide looked at the document properties and discovered they were authored by the chief-of-staff of the opposing party.
  • February 2003: A dossier on Iraq’s security and intelligence organizations, cited by Colin Powell and published by 10 Downing Street, is discovered to have been plagiarized from a U.S. researcher on Iraq. Since the dossier was published on their website in MS Word format, researchers also discovered the four people in the British government who edited the document. They were subsequently called to Parliament for a hearing.
  • March 2004: SCO Group, seller of UNIX and Linux, sent out a warning letter to 1,500 of the world’s largest companies threatening legal liability for using Linux if they failed to obtain a license from the Utah-based company. After SCO filed suit against Daimler-Chrysler, the metadata in an MS Word document revealed that SCO’s attorneys had originally identified Bank of America as the defendant.
  • December 2005: The New England Journal of Medicine discovers that Merck deleted information on Vioxx study data.
  • December 2005: The Justice Department reveals social security numbers.
  • March 2006: Google inadvertently reveals financial projections and information about projects in the works.

One of the reasons there was so much metadata in Microsoft Office documents is that, because of slow computer processors, changes to documents were overwritten instead of removed. As processors got faster and Microsoft became aware of the issues, the company made strides to reduce the amount of metadata that could be found in a document. Microsoft Word added the Inspect Document feature in Office 2010, and it is included in all later versions.

Ethics Opinions

Metadata resides in every type of electronic document or file created in a law office, especially files created using the Microsoft Office suite. Corel WordPerfect users are not off the hook: these files contain similar metadata. Therefore, when you email a document as an attachment, the receiving party may be able to see your edits or whether the document is original to that client or a form created for someone else. The disclosure of the metadata could be a breach of confidentiality, not to mention highly embarrassing. Numerous state bar associations and the ABA have ethics opinions regarding metadata. The ethics opinions deal primarily with the sending attorney’s responsibility to remove metadata, whether the recipient can “mine” for metadata, and whether she must notify the sender of inadvertently disclosed metadata.

In 2009 the North Carolina State Bar issued 2009 Formal Ethics Opinion 1, adopted January 15, 2010, on the Review and Use of Metadata. In summary the “Opinion rules that a lawyer must use reasonable care to prevent the disclosure of confidential client information hidden in metadata when transmitting an electronic communication and a lawyer who receives an electronic communication from another party or another party’s lawyer must refrain from searching for and using confidential information found in the metadata embedded in the document.”

In addition to refraining from intentionally mining a document for metadata, North Carolina lawyers are also advised: “The Ethics Committee recognizes that it is possible for a lawyer to unintentionally find confidential information upon viewing the contents of an electronic communication. If this occurs, the lawyer must notify the sender and may not subsequently use the information revealed without the consent of the other lawyer or party.”

Metadata – What Can Be Viewed

While tracked changes in a document that are not removed by accepting or rejecting them are not technically metadata, they can be some of the more damaging pieces of information still in a document. Similarly, Comments in an electronic document can reveal more than intended. You can track changes in Microsoft Word documents and effectively create tracked changes by marking up a PDF with software like Adobe Acrobat and Kofax Power PDF. Users can add comments to Word and PDF documents, as well as Excel and PowerPoint files.
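
For readers who want to see where this information lives, the following is a minimal sketch, not a vetted scrubbing tool. It assumes a modern .docx file, which is a ZIP archive of XML parts, and counts tracked insertions, tracked deletions, and comments. The function names and the sample document are hypothetical, and the sketch ignores other revision types such as moved text and formatting changes.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by the XML parts inside a .docx file.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def count_revisions_and_comments(docx_file):
    """Count tracked insertions, tracked deletions, and comments in a .docx."""
    counts = {"insertions": 0, "deletions": 0, "comments": 0}
    with zipfile.ZipFile(docx_file) as archive:
        names = set(archive.namelist())
        if "word/document.xml" in names:
            body = ET.fromstring(archive.read("word/document.xml"))
            counts["insertions"] = len(list(body.iter(f"{{{W_NS}}}ins")))
            counts["deletions"] = len(list(body.iter(f"{{{W_NS}}}del")))
        if "word/comments.xml" in names:
            comments = ET.fromstring(archive.read("word/comments.xml"))
            counts["comments"] = len(list(comments.iter(f"{{{W_NS}}}comment")))
    return counts


def make_sample_docx():
    """Build a tiny in-memory stand-in for a .docx (hypothetical test data)."""
    document = (
        f'<w:document xmlns:w="{W_NS}"><w:body><w:p>'
        '<w:ins w:id="1" w:author="A"><w:r><w:t>new</w:t></w:r></w:ins>'
        '<w:del w:id="2" w:author="A"><w:r><w:delText>old</w:delText></w:r></w:del>'
        "</w:p></w:body></w:document>"
    )
    comments = (
        f'<w:comments xmlns:w="{W_NS}">'
        '<w:comment w:id="0" w:author="B"><w:p><w:r><w:t>note</w:t></w:r></w:p></w:comment>'
        "</w:comments>"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("word/document.xml", document)
        archive.writestr("word/comments.xml", comments)
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    print(count_revisions_and_comments(make_sample_docx()))
```

Any count above zero suggests the document deserves a closer look before it leaves the office. To check a real file, pass its path instead of the sample, for example count_revisions_and_comments("draft.docx"), where the file name is only a placeholder.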

The file properties of electronic documents can reveal a lot of information. While much of this information may not seem damaging on its face, what if a client disputed a bill based on the editing time in the document properties versus what was listed on the invoice? Or the location of the document revealed it was in a folder that divulged an impending potential merger? The following are examples of information in the file properties of most electronic documents. In Microsoft Word go to File – Info and click on Properties then choose Advanced Properties.

  • Author name/ initials
  • Last modified by
  • Company/organization name
  • Subject, file type, location
  • Date created/modified/last accessed
  • Number of revisions/versions
  • Previous document authors
  • Total editing time
  • Template information
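
The same properties can be read without opening Word. The following is a minimal sketch, assuming a .docx file whose properties are stored in the docProps/core.xml and docProps/app.xml parts of the ZIP archive. The function names and the sample values (author, template, firm name) are hypothetical and only illustrate the idea.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Parts of a .docx archive that hold the file properties.
PROPERTY_PARTS = ("docProps/core.xml", "docProps/app.xml")


def read_file_properties(docx_file):
    """Return the non-empty file properties of a .docx as a dictionary."""
    props = {}
    with zipfile.ZipFile(docx_file) as archive:
        names = set(archive.namelist())
        for part in PROPERTY_PARTS:
            if part not in names:
                continue
            root = ET.fromstring(archive.read(part))
            for child in root:
                # Drop the "{namespace}" prefix so keys read like "creator".
                tag = child.tag.split("}", 1)[-1]
                if child.text and child.text.strip():
                    props[tag] = child.text.strip()
    return props


def make_sample_docx():
    """Build a tiny in-memory stand-in for a .docx (hypothetical test data)."""
    core = (
        '<cp:coreProperties '
        'xmlns:cp="http://schemas.openxmlformats.org/package/2006/metadata/core-properties" '
        'xmlns:dc="http://purl.org/dc/elements/1.1/">'
        "<dc:creator>Associate A</dc:creator>"
        "<cp:lastModifiedBy>Partner B</cp:lastModifiedBy>"
        "<cp:revision>7</cp:revision>"
        "</cp:coreProperties>"
    )
    app = (
        '<Properties '
        'xmlns="http://schemas.openxmlformats.org/officeDocument/2006/extended-properties">'
        "<Template>OtherClient.dotx</Template>"
        "<TotalTime>60</TotalTime>"
        "<Company>Example Law Firm</Company>"
        "</Properties>"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("docProps/core.xml", core)
        archive.writestr("docProps/app.xml", app)
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    for name, value in read_file_properties(make_sample_docx()).items():
        print(f"{name}: {value}")
```

In this sample the TotalTime value is the total editing time in minutes and Template names the template the document was built from, two of the items a client or opposing counsel could read just as easily.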

There are other types of metadata in an electronic document. In Excel you may have hidden columns. They are not gone and are easy to reveal. If you copy a chart from Excel into a Word document and link it so that the data is updated in the document if the spreadsheet is changed, a recipient of that document can right click and choose “edit data in Excel” and view the entire workbook and all the worksheets in it. Macros may expose information, hyperlinks may send someone to poorly protected internal documents, and footers can reveal file paths. A speaker’s notes in a PowerPoint may provide information that was not intended for viewing.
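
To illustrate how easily hidden spreadsheet content can be found, here is a minimal sketch, assuming an .xlsx file, which is also a ZIP archive of XML parts. It lists worksheets marked hidden and column ranges marked hidden. The function names and sample workbook are hypothetical, and the sketch does not look at hidden rows, macros, or links.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# SpreadsheetML namespace used by the XML parts inside an .xlsx file.
S_NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"


def find_hidden_parts(xlsx_file):
    """List hidden worksheets and hidden column ranges in an .xlsx file."""
    findings = {"hidden_sheets": [], "hidden_columns": {}}
    with zipfile.ZipFile(xlsx_file) as archive:
        names = archive.namelist()
        if "xl/workbook.xml" in names:
            workbook = ET.fromstring(archive.read("xl/workbook.xml"))
            for sheet in workbook.iter(f"{{{S_NS}}}sheet"):
                if sheet.get("state") in ("hidden", "veryHidden"):
                    findings["hidden_sheets"].append(sheet.get("name"))
        for name in sorted(names):
            if name.startswith("xl/worksheets/") and name.endswith(".xml"):
                sheet_xml = ET.fromstring(archive.read(name))
                # Each <col> element covers the column numbers min..max.
                ranges = [
                    (int(col.get("min")), int(col.get("max")))
                    for col in sheet_xml.iter(f"{{{S_NS}}}col")
                    if col.get("hidden") in ("1", "true")
                ]
                if ranges:
                    findings["hidden_columns"][name] = ranges
    return findings


def make_sample_xlsx():
    """Build a tiny in-memory stand-in for an .xlsx (hypothetical test data)."""
    workbook = (
        f'<workbook xmlns="{S_NS}"><sheets>'
        '<sheet name="Summary" sheetId="1"/>'
        '<sheet name="Fees" sheetId="2" state="hidden"/>'
        "</sheets></workbook>"
    )
    sheet1 = (
        f'<worksheet xmlns="{S_NS}"><cols>'
        '<col min="2" max="3" hidden="1"/>'
        '<col min="5" max="5"/>'
        "</cols><sheetData/></worksheet>"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("xl/workbook.xml", workbook)
        archive.writestr("xl/worksheets/sheet1.xml", sheet1)
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    print(find_hidden_parts(make_sample_xlsx()))
```

In the sample, the "Fees" worksheet and columns 2 through 3 (B and C) of the first worksheet are reported, even though neither would be visible when the workbook is opened normally.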

How Do You Get Rid of It?

Since Office 2010, Microsoft has included an Inspect Document feature that handily removes many types of metadata, including tracked changes, comments, file properties, hidden text and more. Be aware that if your document has tracked changes, the Inspect Document feature will accept all the changes, so make sure that is what you want to do. PowerPoint has a feature called Inspect Presentation and Excel has Inspect Workbook. These can be found by going to File – Info – Check for Issues and following the prompts. In MS Word you can also add Inspect Document to your Quick Access toolbar.

Many law firms have chosen to send document attachments as PDFs to minimize the metadata in Microsoft files. However, PDF documents have their own metadata. PDFs can be marked up and commented on, and while the file properties are not as extensive, they do still exist. Adobe Acrobat has a tool in the Protection features called “Remove Hidden Information”.

In addition to the tools built into the software you use to create the documents, there is purpose-built third-party software to remove metadata. These tools can help remove metadata in Office files and PDFs, as well as other file types like graphics. They integrate with email and document management systems to help ensure that metadata is scrubbed automatically, instead of relying on the end user to remember to take that step.

When you are looking at third-party tools, consider what types of files are analyzed, whether the user can intervene, integration options with email and document management systems, and costs.

Keep in mind that if you attach or share a document from a smartphone you may bypass third-party scrubbing tools and there is no way to inspect a document in the Microsoft Office smartphone app. You may have to give up some convenience for enhanced security.

Efiling and Metadata

It is best practice to remove metadata for documents that are efiled. One thing to be especially aware of is rules regarding redaction of sensitive information in efiled documents. Redaction needs to be applied properly, so that the data underneath the redaction cannot be revealed.

Electronic Discovery and Metadata

Metadata plays a critical role in electronic discovery. Metadata is a source of potential evidence in electronic files, including video, audio, images, PDFs, emails, and other electronic files. One court described a printed email as “dismembered”. If you are requesting or producing electronic documents as part of discovery, realize that scrubbing metadata may be the equivalent of shredding the file. You will need to expressly consider and understand what you are requesting and whether that should include metadata. If you are producing files, determine whether they need to be “native” files that include the metadata. Work with your litigation team and your ediscovery experts to understand the implications of metadata. They can also help you extract and mine metadata, as it is appropriate in the ediscovery setting when agreed upon in a Rule 26 conference.

Conclusion

Not all metadata are damaging. However, it is your ethical responsibility to be able to discern if the information you are sharing reveals confidential or privileged information. You may just inadvertently share information that would be difficult to explain, like why the matter document has an editing time of 60 minutes but you are charging two hours, or that the client is being billed at a partner’s rate even though the document author is an associate. The firm should create a policy to address how metadata will be removed as a matter of course and provide the tools necessary to do so.