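I am working on an image duplication library that uses ML to predict image similarity. In this process, the root mean square (RMS) error is used to measure the similarity between two images (I'm not going into how). Strictly speaking, the function below returns the mean squared error, i.e. the value before the square root is taken. The function that does it looks like this: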
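import numpy as np

# Function that calculates the mean squared error (MSE) between two image matrices
def _mse(imageA, imageB):
    # Cast to float first so that unsigned integer pixel values do not wrap around
    err = np.sum((imageA.astype("float") - imageB.astype("float")) ** 2)
    # Divide by the number of pixels (height * width)
    err /= float(imageA.shape[0] * imageA.shape[1])
    return err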
My model worked fine when I tested it on folders containing 5K images, but it took far too long. So I decided to refactor my code and store all tensors in a database. Why?
If I store the tensors of all images in a database and then query it with the tensor of each incoming image, I should get the result quickly. Looping over all images again and again and computing the RMS of one image against every other produces many combinations, which takes time.
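To make the cost of the loop concrete, here is a minimal sketch of comparing one image against all stored images in a single NumPy call instead of calling _mse once per pair. It assumes every image is a 2D grayscale array of the same height and width; the name mse_against_all and the toy data are hypothetical and only for illustration.

```python
import numpy as np

def mse_against_all(query, stack):
    """MSE between one image and every image in a stack.

    query: array of shape (H, W)
    stack: array of shape (N, H, W), assumed to hold N same-sized images
    Returns an array of N error values, one per stored image.
    """
    # Broadcasting subtracts the query from each of the N stored images
    diff = stack.astype("float") - query.astype("float")
    # Sum squared differences per image, then divide by the pixel count
    return np.sum(diff ** 2, axis=(1, 2)) / float(query.shape[0] * query.shape[1])

# Toy data: three 2x2 "images" filled with 0, 1 and 3
stack = np.stack([np.zeros((2, 2)), np.ones((2, 2)), np.full((2, 2), 3.0)])
query = np.ones((2, 2))

errors = mse_against_all(query, stack)
closest = int(np.argmin(errors))  # index of the most similar stored image
```

Taking np.sqrt(errors) would give the RMS values; the ranking of the images stays the same either way, because the square root is monotonic.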
Solution
If I store all tensors (which are lists or arrays) in a database like Postgres, then I can query them by RMS and get all matching images at once, rather than looping over them to find duplicates.
I need your help to figure out whether there is any way to query Postgres for the images with the closest RMS.
Something like this:
SELECT ID_PARTNER, ID_ACCOUNT
, SQRT(Avg( POWER(Act_F_1 - Pred_F_1 , 2) ) ) as feature_1_rmse
FROM ...
GROUP BY ID_PARTNER, ID_ACCOUNT
Similar question: Get RMSE score while fetching data from the Table directly. Write a query for that
from Root Mean Square in Postgresql