How can I extract text from a PDF with Python?

January 13, 2022

I’m looking to extract some text from a PDF. I’m using this code:

import PyPDF2
Doc = open('document.pdf','rb') 
pdfreader = PyPDF2.PdfFileReader(Doc)
pageObj = pdfreader.getPage(0)
pageObj.extractText()

With this code, the result of pageObj.extractText() is ''. I don’t know why this happens, because the PDF I open does contain text. The document has just 1 page.

Does anyone know what is happening, or whether there is another way to get information from a PDF?

>Solution :

You can try pdfplumber.

The code below prints the text of the first page; instead of printing it, you can also write it to a text file.

import pdfplumber
with pdfplumber.open(r'D:\document.pdf') as pdf:
    first_page = pdf.pages[0]
    print(first_page.extract_text())
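
To write the text to a file instead of printing it, something along these lines should work. This is only a sketch: the helper names pages_to_text and pdf_to_text_file and the output path are made up for illustration, and it assumes pdfplumber is installed. Depending on the pdfplumber version, extract_text() can return None for a page without a text layer, so the sketch falls back to an empty string.

```python
from pathlib import Path


def pages_to_text(pages):
    """Join the text of each page, separating pages with a blank line."""
    # extract_text() may return None for a page with no extractable text,
    # so fall back to an empty string for that page.
    return "\n\n".join((page.extract_text() or "") for page in pages)


def pdf_to_text_file(pdf_path, out_path):
    """Extract the text of every page in pdf_path and write it to out_path."""
    import pdfplumber  # third-party package: pip install pdfplumber

    with pdfplumber.open(pdf_path) as pdf:
        text = pages_to_text(pdf.pages)
    Path(out_path).write_text(text, encoding="utf-8")
    return text


# Example call (both paths are placeholders):
# pdf_to_text_file(r"D:\document.pdf", r"D:\document.txt")
```

If the output file is still empty, the PDF most likely has no text layer (for example a scanned image), and no text extractor will find anything in it.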

by MR
