Sunday, 6 March 2022

Root Mean Square in Postgresql

I am working on an image deduplication library that uses ML to predict image similarity. In this process, Root Mean Square (RMS) error is used to measure the similarity between two images (I'm not going into how). The function that does it looks like this.

import numpy as np

# Calculates the mean squared error (MSE) between two image matrices of the
# same shape. The RMS error is the square root of this value.
def _mse(imageA, imageB):
    err = np.sum((imageA.astype("float") - imageB.astype("float")) ** 2)
    err /= float(imageA.shape[0] * imageA.shape[1])
    return err

My model worked fine when I tested it on folders containing 5K images, but it took far too long. So I decided to refactor my code and store all the tensors in a database. Why?

If I store the tensors of all images in a database and then query it with the tensor of each incoming image, I will get a result quickly. Looping over all the images again and again, and comparing each image's RMS against every other image, produces a very large number of combinations, which takes time.
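To make the cost of the loop concrete, here is a minimal sketch of comparing one incoming image against all stored images in a single NumPy operation. The names `rmse_against_all` and `closest` are hypothetical, and the sketch assumes 2-D (grayscale) images of identical shape stacked into one N x H x W array.

```python
import numpy as np

def rmse_against_all(query, stored):
    """RMS error between `query` (H x W) and each image in `stored` (N x H x W)."""
    # Broadcasting subtracts the query from every stored image at once.
    diff = stored.astype("float") - query.astype("float")
    mse = np.mean(diff ** 2, axis=(1, 2))  # per-image mean squared error
    return np.sqrt(mse)

def closest(query, stored, k=5):
    """Indices and scores of the k stored images with the smallest RMS error."""
    scores = rmse_against_all(query, stored)
    order = np.argsort(scores)[:k]
    return order, scores[order]

# Number of unordered pairs when every image is compared with every other one.
n = 5000
pairs = n * (n - 1) // 2
```

For 5,000 images, `pairs` comes to 12,497,500 comparisons, which is why the all-against-all loop is slow.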

Solution

If I store all the tensors, which are lists or arrays, in a database like Postgres, then I can query them by RMS directly, rather than fetching all the images at once, looping over them, and checking each one for duplicates.

I need your help to figure out whether there is a way to query Postgres for the images with the closest RMS.

Something like this:

SELECT ID_PARTNER, ID_ACCOUNT
  , SQRT(Avg( POWER(Act_F_1 - Pred_F_1 , 2) ) ) as feature_1_rmse
FROM ...
GROUP BY ID_PARTNER, ID_ACCOUNT
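For the tensor case, one possible shape for such a query is sketched below. It is not a tested solution. It assumes a hypothetical table `images(id, tensor float8[])` in which each tensor is stored flattened and all tensors have the same length, and a DB-API connection such as one returned by `psycopg2.connect()`. The function name `closest_by_rmse` is made up for illustration. The query relies on the multi-argument form of `unnest` in the FROM clause to pair up elements of the stored and incoming arrays.

```python
import numpy as np

# Hypothetical schema (an assumption, not from the original post):
#   CREATE TABLE images (id serial PRIMARY KEY, tensor float8[]);
RMSE_QUERY = """
SELECT i.id,
       sqrt(avg(power(t.stored - t.query, 2))) AS rmse
FROM images AS i,
     unnest(i.tensor, %s::float8[]) AS t(stored, query)
GROUP BY i.id
ORDER BY rmse
LIMIT %s
"""

def closest_by_rmse(conn, query_tensor, limit=5):
    """Return (id, rmse) rows for the stored tensors closest to query_tensor.

    `conn` is assumed to be a DB-API connection, e.g. from psycopg2.connect().
    """
    # Flatten to a plain list of floats so the driver can send it as an array.
    flat = [float(x) for x in np.asarray(query_tensor).ravel()]
    cur = conn.cursor()
    try:
        cur.execute(RMSE_QUERY, (flat, limit))
        return cur.fetchall()
    finally:
        cur.close()
```

This still scans every row for each incoming image, so it moves the loop into the database rather than removing it. Whether that is fast enough would have to be measured.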

Similar question: Get RMSE score while fetching data from the Table directly. Write a query for that



