Allow PDF Content to be Copyable

Have you ever downloaded a PDF, tried to highlight a paragraph to copy it, and found that the file wouldn’t let you? That probably got you thinking.

What to do?

Well, you have options. But what are they?

  1. Type out the whole paragraph by hand (hard on your time and on your wrists)
  2. Upload the PDF to some shady website and let it remove the copy protection for you (do you trust the website? Your internet connection may be unreliable, and the website may not be around tomorrow)
  3. Use OCR (Optical Character Recognition) on the PDF (way too complicated)

There’s actually a fourth option, which I’ll tell you about right now.

Ghostscript

Ooh! Spooky. What’s Ghostscript? It’s a suite of software for working with PostScript and PDF files. I assume you care about using it, so here’s an example of how to do that:

Let’s say you have a PDF called copy_protected.pdf that has copy protection, and your mission is to strip that protection so that you can finally copy its content. The resulting file, whose content can be copied, will be named stripped.pdf. With Ghostscript installed, run this command in a terminal:

gs -q -dNOPAUSE -dBATCH -sDEVICE=pdfwrite -sOutputFile=stripped.pdf copy_protected.pdf

You should now see two files:

  • copy_protected.pdf – original file with copy protection
  • stripped.pdf – duplicate of original file, except with content that is copyable
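
If you have a whole folder of protected files, the same command can be wrapped in a small shell script. Treat this as a rough sketch, not an official Ghostscript feature: the function names (stripped_name, strip_pdf, strip_all) and the _stripped.pdf naming scheme are my own assumptions, and the only Ghostscript flags used are the ones from the command above.

```shell
#!/bin/sh
# Hypothetical helper: derive the output name for an input PDF,
# e.g. "report.pdf" -> "report_stripped.pdf" (naming scheme is an assumption).
stripped_name() {
    printf '%s_stripped.pdf\n' "${1%.pdf}"
}

# Run the same Ghostscript command as in the article on a single file.
strip_pdf() {
    src=$1
    dst=$(stripped_name "$src")
    gs -q -dNOPAUSE -dBATCH -sDEVICE=pdfwrite -sOutputFile="$dst" "$src"
}

# Process every PDF in the current directory, skipping earlier outputs.
strip_all() {
    # Bail out early if the gs executable is not on the PATH.
    command -v gs >/dev/null 2>&1 || {
        echo "Ghostscript (gs) not found" >&2
        return 1
    }
    for f in *.pdf; do
        [ -e "$f" ] || continue                       # no PDFs: glob stays literal
        case $f in *_stripped.pdf) continue ;; esac   # do not re-process outputs
        strip_pdf "$f" && echo "wrote $(stripped_name "$f")"
    done
}
```

After pasting these functions into your shell (or sourcing them from a file), running strip_all inside a folder of PDFs should leave a _stripped.pdf copy next to each original, with the originals untouched.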

Where to learn more?

If you’d like to discover more about what Ghostscript can do for you when it comes to manipulating PDFs, visit https://www.ghostscript.com.

To get an example of a PDF file with copy protection, I’d encourage you to head over to http://www.prograbooks.com/2017/06/download-beginning-ios-10-programming.html, download Beginning iOS 10 Programming with Swift, and then give my example a try.

 
