PII in PDF Metadata...Yes, It Can Happen When You Aren't Looking!

Much is written about the many different ways in which sensitive data is leaked...and yes, there certainly are MANY ways!

While doing some online research today, I noticed once more the incredibly large amount of personally identifiable information (PII) within the PDFs I found during my searches.

The PII was not visibly printed within the PDF document itself. However, when I searched for some specific terms, some of which included names, the search results showed associated PII.

For example, when I searched for "John Doe" (name obviously changed) within a PDF I found, the search results showed his home address, phone number, and email address for each instance of John Doe within the PDF. This information was not visible when I was simply viewing the document, but it was visible within the search results because it was in the metadata.

I've seen this happen several times before...I'm sure there is a treasure-trove of PII floating around the Internet within PDFs.

Most people think converting a Word, PPT, or other document into a PDF removes all the metadata.

Au contraire, mon frere! (deja vu George Carlin? :) )

I certainly am not a PDF expert, but there are a few ways I know that metadata can get into PDFs.

1) If you attach a Word, PPT, Excel, or other document to a PDF file in its native format, the metadata will follow. Yes, you can attach files to a PDF document through Acrobat.

2) If you have the tracked changes visible when you convert a file to PDF, the changes carry over into the PDF. Yes, you would be able to see the changes clearly within the PDF, but I know *MANY* people who convert files to PDFs and never look over the resulting PDF document to make sure everything looks okay; they cheerfully send it on to others or post it online without realizing the PII is in the associated metadata.

3) If your print configuration in Word, or other trackable applications, is set to print 'tracked changes' along with the document, then the resulting PDF will include the tracked changes.

4) If you imported a file, or data items such as entries from your email address book, into your native file and then deleted the info you did not want within the viewable file, the deleted portions may still remain in the metadata and become part of the resulting PDF.

And I know there are likely numerous other ways that metadata hides within PDFs.
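As a rough first pass, you can check whether a PDF even contains the structures where attachments and metadata usually live. The sketch below is only that, a sketch: it looks for a few standard PDF names (`/EmbeddedFiles`, `/Metadata`, `/Info`, `/Author`) in the raw bytes of a file. The function names and the marker list are my own illustrative choices, not part of any tool, and newer PDFs that pack their objects into compressed object streams can hide these markers from a plain byte search.

```python
# Illustrative marker list (an assumption, not exhaustive): standard PDF
# names that commonly indicate attachments or metadata.
MARKERS = {
    b"/EmbeddedFiles": "embedded file attachments",
    b"/Metadata": "XMP metadata stream",
    b"/Info": "document information dictionary",
    b"/Author": "author field",
}


def find_metadata_markers(data: bytes) -> list[str]:
    """Return a description for each marker found in the raw PDF bytes."""
    return [label for token, label in MARKERS.items() if token in data]


def check_pdf(path: str) -> list[str]:
    """Read a PDF from disk and report which markers it contains."""
    with open(path, "rb") as handle:
        return find_metadata_markers(handle.read())
```

Calling `check_pdf("report.pdf")` (a hypothetical file name) returns a list of the markers that were found. An empty list does not prove the file is clean; it only means nothing showed up in the uncompressed parts.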

I re-emphasize, I'm far from being a PDF guru, and in fact know comparatively little beyond what I need to know to do my work, but I know enough to know that metadata can easily creep into your PDF documents unbeknownst to the folks doing the PDF conversions.

Those of you out there who ARE PDF gurus...please enlighten me and share the other ways in which metadata can sneak into PDF files! I love learning something new every day, and this would be a great day to learn more about PDF security. :)

Does your company post PDFs on your Internet sites? Do you know if they include any PII or other sensitive data? It would be a good exercise to go look at some of them.
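If you have a folder full of posted PDFs, one way to start that exercise is a small script that looks for email addresses, whether or not they are visible on the page. The following is a minimal sketch under several assumptions: it only handles streams compressed with the common Flate (zlib) filter, it only finds addresses stored as plain ASCII text, and the names `find_emails` and `scan_folder` are hypothetical. It will miss text stored in other encodings, so treat a clean result as "nothing obvious found," not as proof.

```python
import re
import zlib
from pathlib import Path

# Simple email pattern; good enough for a first look, not a full validator.
EMAIL_RE = re.compile(rb"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# Raw contents of each PDF stream object, up to its "endstream" keyword.
STREAM_RE = re.compile(rb"stream\r?\n(.*?)endstream", re.DOTALL)


def find_emails(data: bytes) -> set[str]:
    """Collect email-like strings from raw bytes and from inflated streams."""
    chunks = [data]
    for match in STREAM_RE.finditer(data):
        try:
            # decompressobj tolerates the trailing newline before "endstream".
            chunks.append(zlib.decompressobj().decompress(match.group(1)))
        except zlib.error:
            continue  # not Flate-compressed, or uses another filter
    found = set()
    for chunk in chunks:
        for hit in EMAIL_RE.findall(chunk):
            found.add(hit.decode("ascii").lower())
    return found


def scan_folder(folder: str) -> dict[str, set[str]]:
    """Map each PDF file name in a folder to the addresses found in it."""
    report = {}
    for path in sorted(Path(folder).glob("*.pdf")):
        emails = find_emails(path.read_bytes())
        if emails:
            report[path.name] = emails
    return report
```

Running `scan_folder("public_docs")` (a hypothetical folder name) gives you a short list of files worth opening by hand. The same approach could be extended with patterns for phone numbers or ID numbers, at the cost of more false positives.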

A site that seems to have pretty good information about PDFs and may be useful for you is Planet PDF.

Rebecca Herold's Bio:

Rebecca Herold, CISSP, CIPP, CISM, CISA, FLMI, has been providing information security, privacy and regulatory assistance and services to organizations from a wide range of industries for the past two decades. Rebecca was instrumental in building the information security and privacy program while at Principal Financial Group, which was awarded the CSI Information Security Program of the Year Award in 1998. IT Security ranked Rebecca as one of the top 59 IT security influencers, and Computerworld put Rebecca on their list of the world's best privacy experts and on their list of the best privacy consulting firms in both 2007 and 2008. Rebecca has been CPO for two consulting organizations, and has had her own information privacy, security and compliance business since 2004. Rebecca has written chapters for several books and dozens of articles, has been writing a monthly privacy column for the CSI Alert newsletter since the beginning of 2001, and is working on her 13th book. Some of her other books include The Privacy Papers, Managing an Information Security and Privacy Awareness and Training Program, The Definitive Guide to Security Inside the Perimeter (Realtime Publishers), The Shortcut Guide to Improving IT Service Support through ITIL (Realtime Publishers), and The Practical Guide to HIPAA Privacy and Security Compliance. In addition, Rebecca is the leader of The Realtime IT Compliance Community where she posts to her IT Compliance weblog. You can contact Rebecca at: rebecca_herold@realtimepublishers.net.