Hemant Vishwakarma: Keras Captcha OCR - How to pass single jpeg image to loaded (trained) model and receive prediction in string?

Wednesday, 25 August 2021

For the past several hours I have been searching the internet for how to pass a single JPEG image into my pre-trained model (saved and loaded) and receive the prediction as a string.

I am using the Captcha OCR example from this source: https://keras.io/examples/vision/captcha_ocr/

The two approaches below got me the farthest (I think), but neither of them works:

APPROACH 1:

model = load_model('trained_models/my_trained_model.h5', custom_objects={'CTCLayer': CTCLayer})

img_path = '/test/my_image.jpeg'

img = image.load_img(img_path, target_size=(200, 50))

img_array = image.img_to_array(img)
img_batch = np.expand_dims(img_array, axis=0)

img_preprocessed = preprocess_input(img_batch)

prediction = model.predict(img_preprocessed)

With this approach I didn't convert the image to greyscale, but before that could cause any trouble I get this error:

ValueError: Layer ocr_model_v1 expects 2 input(s), but it received 1 input tensors. Inputs received: [<tf.Tensor 'IteratorGetNext:0' shape=(None, 200, 50, 3) dtype=float32>]

APPROACH 2: This approach is pretty much copied from the data preprocessing in the OCR example:

img = tf.io.read_file(img_path)
img = tf.io.decode_jpeg(img, channels=1)
img = tf.image.convert_image_dtype(img, tf.float32)
img = tf.image.resize(img, [200, 50])
img_preprocessed = tf.transpose(img, perm=[1, 0, 2])

prediction = model.predict(img_preprocessed)

And it gives me pretty much the same error:

ValueError: Layer ocr_model_v1 expects 2 input(s), but it received 1 input tensors. Inputs received: [<tf.Tensor 'IteratorGetNext:0' shape=(None, 200, 1) dtype=float32>]

But this time the image itself also looks malformed: the reported input shape is (None, 200, 1) instead of a batch of 200x50 images.
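The odd shape in that message can probably be traced to two things: tf.image.resize takes [height, width], and the tensor passed to predict() has no batch axis, so its first axis is consumed as the batch. Below is a minimal sketch of the shape arithmetic, using NumPy arrays as stand-ins for the tensors and assuming the 200x50 (width x height) input size of the Keras example:

```python
import numpy as np

IMG_WIDTH, IMG_HEIGHT = 200, 50  # assumed, as in the Keras captcha example

# Approach 2 as written: resizing to [200, 50] gives (height=200, width=50, 1).
resized_wrong = np.zeros((200, 50, 1), dtype=np.float32)
transposed_wrong = np.transpose(resized_wrong, (1, 0, 2))  # (50, 200, 1)
# predict() treats axis 0 as the batch, so each "sample" is only (200, 1).
per_sample_wrong = transposed_wrong.shape[1:]

# Resizing to [height, width] = [50, 200] and adding a batch axis instead:
resized = np.zeros((IMG_HEIGHT, IMG_WIDTH, 1), dtype=np.float32)
transposed = np.transpose(resized, (1, 0, 2))  # (200, 50, 1), width first
batch = np.expand_dims(transposed, axis=0)     # (1, 200, 50, 1)

print(per_sample_wrong)  # (200, 1)
print(batch.shape)       # (1, 200, 50, 1)
```

With real tensors the same two changes would be tf.image.resize(img, [50, 200]) and tf.expand_dims(img, axis=0) after the transpose.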

I think this error is caused by this line in the OCR example:

# Define the model
model = keras.models.Model(
    inputs=[input_img, labels], outputs=output, name="ocr_model_v1"
)

The model expects two inputs, because during training we passed a dict with the image and its label (the image name, which is the answer to the captcha). But now I want the model to actually predict on an image, so I have no answer/label to pass.

After several hours I got this far, but now I have run out of ideas.
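One way around the two-input requirement, following the inference part of the same Keras example, seems to be to build a second model that goes from the image input to the softmax layer, and then decode its output. The sketch below rests on several assumptions: the layer names "image" and "dense2" are the ones from that example, the input size is 200x50, and vocabulary is the list returned by char_to_num.get_vocabulary() from training, with the CTC blank as the last output class. The helper names greedy_ctc_decode and predict_text are made up for illustration, and the plain-Python greedy decoder stands in for keras.backend.ctc_decode with greedy=True.

```python
def greedy_ctc_decode(probs, vocabulary):
    """Greedy CTC decoding for one image.

    probs: sequence of shape (time_steps, num_classes) with softmax outputs.
    vocabulary: vocabulary[i] is the character for class i; the last class
        (index num_classes - 1) is assumed to be the CTC blank.
    """
    chars = []
    previous = None
    for row in probs:
        row = list(row)
        index = row.index(max(row))  # argmax over the classes
        blank = len(row) - 1
        # Collapse repeated classes, then drop blanks.
        if index != previous and index != blank:
            chars.append(vocabulary[index])
        previous = index
    return "".join(chars)


def predict_text(model, img_path, vocabulary, img_width=200, img_height=50):
    """Run a loaded training model on one JPEG and return the decoded string."""
    # Imported here so the decoder above can be used without TensorFlow.
    import tensorflow as tf
    from tensorflow import keras

    # Inference model: image in, softmax out, so no labels are required.
    # "image" and "dense2" are assumed layer names (from the Keras example).
    prediction_model = keras.models.Model(
        model.get_layer(name="image").input,
        model.get_layer(name="dense2").output,
    )

    # Same preprocessing as in training, plus a batch axis.
    img = tf.io.read_file(img_path)
    img = tf.io.decode_jpeg(img, channels=1)
    img = tf.image.convert_image_dtype(img, tf.float32)
    img = tf.image.resize(img, [img_height, img_width])  # [height, width]
    img = tf.transpose(img, perm=[1, 0, 2])              # width becomes the time axis
    img = tf.expand_dims(img, axis=0)                    # (1, width, height, 1)

    probs = prediction_model.predict(img)[0]  # (time_steps, num_classes)
    return greedy_ctc_decode(probs, vocabulary)
```

Under those assumptions, a call would look like predict_text(model, '/test/my_image.jpeg', char_to_num.get_vocabulary()), where model is the one loaded with custom_objects={'CTCLayer': CTCLayer}.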

Could someone please point me in the right direction?
