I am experimenting with PyTesseract recognition of text captured from other programs. The results are surprisingly bad. I thought this was a simple problem, given that recognition of scanned documents was already pretty good more than 20 years ago.
"win32¢gui.EnumWindows(enum_cb, toplist)
winInfos = [(hwnd, title) for hwnd, title in winlist if
print("™sd process(es) found" % Len(winInfos))
wininfo = winInfos[@]
hwnd = wininfo[@]
# w2 = win32gui.Findwindow(None, “"Chrome")
for i in range(10):
eel eee"
It is even worse without zooming, or with varying background and text colors.
I don't need a perfect solution (this is rather an experimental project), but I need at least something adequate. I am not much limited in how to implement/fix this: the only hard constraint is Windows, and Python is also very desirable. I know Python more or less and am experienced in programming in general, but I am a newbie in text recognition.
Tesseract was the first library I tried; I read it is one of the best. I already know it likes big fonts (although for screenshots, where identical symbols always look identical, I thought 8 pixels of height would be enough). And I see that zooming and making all backgrounds and text colors uniform helps, but not enough. I am going to recognize the contents of tables with varying text colors and maybe backgrounds, so it would be desirable not to stumble over such things.
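Since zooming and flattening colors already helps, that preprocessing can be scripted with Pillow before handing the image to Tesseract. A minimal sketch; the scale factor, the threshold, and the top-row inversion heuristic are all assumptions to tune for the actual screenshots:

```python
from PIL import Image, ImageOps

def preprocess_for_ocr(img, scale=4, threshold=160):
    """Upscale and binarize a screenshot region before OCR.

    Tesseract tends to do better with larger glyphs on a plain white
    background; scale=4 and threshold=160 are guesses, not tuned values.
    """
    gray = ImageOps.grayscale(img)
    big = gray.resize((gray.width * scale, gray.height * scale), Image.LANCZOS)
    # Force every pixel to pure black or pure white.
    bw = big.point(lambda p: 255 if p > threshold else 0)
    # Crude heuristic: if the top row is mostly dark, assume light text
    # on a dark background and invert to get dark-on-light for Tesseract.
    if sum(bw.crop((0, 0, bw.width, 1)).getdata()) < 128 * bw.width:
        bw = ImageOps.invert(bw)
    return bw
```

The result can then be passed to `pytesseract.image_to_string()` instead of the raw capture.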
The ways of solving this that I see:
- increase the fonts in the source program to get a "true" higher resolution (not sure this will help enough),
- train Tesseract on my fonts (a quick search revealed this is very tedious: instructions with 20 steps, or Python scripts I don't have),
- try other libraries.
What could you recommend?
Thanks
from Bad tesseract screenshots recognition results
