Tuesday, 7 March 2023

Google Cloud Vision OCR - Language hints seem to be ignored

I am using the Python client library for Google Cloud Vision to OCR passports. I now want to support Ukrainian, but the system has trouble recognizing the handwritten parts.

I tried to improve the results by passing language hints to the API, but they seem to be ignored entirely.

No matter what I set there, neither the recognized text nor the language the system detects changes at all. So to me, it looks like the hint is simply being ignored.

For example, numbers are reported with language "en", and they are still reported as "en" after I change the language hint to "de". Even a random string passed as a language hint does not produce an error.

from google.cloud import vision

# [...]
client = vision.ImageAnnotatorClient(client_options={"api_endpoint": endpoint})
# Pass the language hint through the image context of the request.
response = client.document_text_detection(
    image=vision_img, image_context={"language_hints": ["uk"]}
)
# [...]

Are language hints simply not supported by the Python client?
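
One way to make the comparison between runs concrete is to tally the language codes attached to the result, once with a hint and once without. The following is only a minimal sketch: the helper name `tally_detected_languages` is hypothetical and not part of the Vision client, and it assumes the object passed in looks like `response.full_text_annotation`, i.e. pages containing blocks, paragraphs and words, each with an optional `property.detected_languages` list whose entries carry a `language_code`. The stand-in objects at the bottom are made up so the snippet runs without credentials.

```python
from collections import Counter
from types import SimpleNamespace


def tally_detected_languages(annotation):
    """Count language codes reported at page level and at word level.

    Hypothetical helper: `annotation` is assumed to be shaped like
    response.full_text_annotation (pages -> blocks -> paragraphs -> words).
    """
    counts = Counter()

    def add(node, level):
        # `property` or `detected_languages` may be missing or empty.
        prop = getattr(node, "property", None)
        for lang in getattr(prop, "detected_languages", None) or []:
            counts[(level, lang.language_code)] += 1

    for page in annotation.pages:
        add(page, "page")
        for block in page.blocks:
            for paragraph in block.paragraphs:
                for word in paragraph.words:
                    add(word, "word")
    return counts


def _langs(*codes):
    # Made-up stand-in for a text property carrying detected languages.
    return SimpleNamespace(
        detected_languages=[SimpleNamespace(language_code=c) for c in codes]
    )


# Made-up stand-in data, only to show the expected shape.
fake_word = SimpleNamespace(property=_langs("en"))
fake_page = SimpleNamespace(
    property=_langs("uk"),
    blocks=[SimpleNamespace(paragraphs=[SimpleNamespace(words=[fake_word])])],
)
fake_annotation = SimpleNamespace(pages=[fake_page])
print(tally_detected_languages(fake_annotation))
```

With a real response, calling the helper on `response.full_text_annotation` for a run with `{"language_hints": ["uk"]}` and for a run without any hint, then comparing the two counters, would show whether the hint changes anything in the output.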


