Working with PDFs in Python: Reading and Splitting Pages

Today, the Portable Document Format (PDF) is among the most commonly used data formats. Adobe developed the format in the early 1990s and published the first version of the PDF specification in 1993. The idea behind PDF is that a transmitted document looks exactly the same to both parties in the exchange: the creator, author, or sender on one side, and the receiver on the other.

This article is the first in a short series on helpful Python libraries for working with PDFs. Part One focuses on manipulating existing PDFs: you will learn how to read and extract content (both text and images), rotate single pages, and split a document into its individual pages. Part Two covers adding a watermark based on overlays. Part Three focuses exclusively on writing and creating PDFs, and also covers deleting single pages and recombining pages into a new document.