So I’m currently working on a project where I use pyautogui and pytesseract to take a screenshot of the time in a video game emulator I’m using, and then to try and read the image and determine what time I got. Here’s what the image looks like when I use pyautogui to get the screenshot of the region I want:
Simply using pytesseract.image_to_string()
worked with images of plain text when I tested it out to make sure it was installed properly, but when I use the in-game timer picture it doesn't output anything. Does this have to do with the quality of the image, some limitation of pytesseract, or something else?
>Solution :
You need to preprocess the image before performing OCR with Pytesseract. Here's a simple approach using OpenCV and Pytesseract. The idea is to obtain a processed image where the text to extract is black on a white background. To do this, we convert to grayscale, apply a slight Gaussian blur, then Otsu's threshold to obtain a binary image. We then perform text extraction using the --psm 6
configuration option to assume a single uniform block of text. Take a look here for more options.
Input image
Otsu’s threshold to get a binary image
Result from Pytesseract OCR
0’ 12”92
Code
import cv2
import pytesseract

# Point pytesseract at the Tesseract executable (default Windows install path)
pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"

# Load image, convert to grayscale, apply Gaussian blur, then Otsu's threshold
image = cv2.imread('1.png')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
blur = cv2.GaussianBlur(gray, (3,3), 0)
thresh = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]

# Perform text extraction on the binary image
data = pytesseract.image_to_string(thresh, lang='eng', config='--psm 6')
print(data)

cv2.imshow('thresh', thresh)
cv2.waitKey()