Friday, 26 October 2018

Bad Tesseract recognition results on screenshots

I am experimenting with PyTesseract to recognize text captured from the windows of other programs. The results are surprisingly bad. I thought this would be a simple problem, given that recognition of scanned documents was already pretty good more than 20 years ago.

For example, for a screenshot of a few lines of Python code I am getting:

"win32¢gui.EnumWindows(enum_cb, toplist)

winInfos = [(hwnd, title) for hwnd, title in winlist if
print("™sd process(es) found" % Len(winInfos))

wininfo = winInfos[@]

hwnd = wininfo[@]

# w2 = win32gui.Findwindow(None, “"Chrome")

for i in range(10):


eel eee"

It is even worse without zooming, or with different background and text colors.

I don't need a perfect solution (this is a rather experimental project), but I need at least something adequate. I am not very restricted in how to implement or fix this: the only hard constraint is Windows, and Python is highly desirable. I know Python reasonably well and am an experienced programmer in general, but I am a newbie in text recognition.

Tesseract was the first library I tried; I read that it is one of the best. I already know that it likes big fonts (although for screenshots, where identical symbols always look identical, I thought a height of 8 pixels would be enough). I also see that zooming and making all background and text colors the same helps, but not enough. I am going to recognize the contents of tables with different text colors and maybe different backgrounds, so it would be desirable not to stumble over such things.
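For illustration, the kind of zooming and color normalization described above could look roughly like the sketch below. It assumes Pillow and pytesseract are installed; the function names, the scale factor of 3, the brightness cutoff of 128, and the `--psm 6` page segmentation mode are illustrative choices, not the settings actually used for the output shown earlier.

```python
from PIL import Image, ImageOps, ImageStat


def prepare_for_ocr(img, scale=3, threshold=None):
    """Return an upscaled, dark-on-light grayscale copy of a screenshot.

    `scale` and `threshold` are illustrative defaults, not tuned values.
    """
    # Drop color so that colored text and backgrounds become brightness only.
    gray = ImageOps.grayscale(img)
    # Assumption: the background covers most of the area, so a dark mean
    # brightness means light text on a dark background. Invert that case.
    if ImageStat.Stat(gray).mean[0] < 128:
        gray = ImageOps.invert(gray)
    # Stretch the contrast so the background is near white and text near black.
    gray = ImageOps.autocontrast(gray)
    # Enlarge the image ("zooming") with a smooth resampling filter.
    big = gray.resize((gray.width * scale, gray.height * scale), Image.LANCZOS)
    # Optionally binarize to pure black and white.
    if threshold is not None:
        big = big.point(lambda p: 255 if p > threshold else 0)
    return big


def ocr_screenshot(path, scale=3):
    """Run Tesseract on a preprocessed screenshot (hypothetical helper)."""
    # Imported here so the preprocessing can be tried without Tesseract.
    import pytesseract

    with Image.open(path) as img:
        prepared = prepare_for_ocr(img, scale=scale)
    # --psm 6 asks Tesseract to treat the image as one uniform block of text.
    return pytesseract.image_to_string(prepared, lang="eng", config="--psm 6")
```

A table with cells of different colors would probably need this applied per cell or per region rather than to the whole screenshot, since a single global inversion decision cannot be right for both light and dark cells.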

The possible solutions I see:

  • to increase the font size in the source program to get "true" higher resolution (not sure this will help enough),
  • to train Tesseract on my fonts (a quick search revealed this is very tedious: instructions with 20 steps, or ones relying on Python scripts I don't have),
  • to try other libraries.

What could you recommend?

Thanks



