Thursday, 28 March 2019

How to compare each row of one dataframe against all rows of another dataframe and calculate a distance measure?

I have two customer dataframes and would like to match their rows using a Jaccard distance matrix or any other suitable method.

df1

    Name     country  cost
0    raj  Kazakhstan    23
1    sam      Russia   243
2  kanan     Belarus     2
3    Nan         Nan     0

df2

   Name     country   DOB
0   rak  Kazakhstan   12-12-1903
1   sim      russia   03-04-1994
2   raj     Belarus   21-09-2003
3  kane     Belarus   23-12-1999

Output:

If the string comparison score for a pair of rows is greater than 0.6, I would like to combine both rows in a new dataframe.

df3

    Name     country  Name     country  cost         DOB
0    raj  Kazakhstan   rak  Kazakhstan    23  12-12-1903
1    sam      Russia   sim      russia   243  03-04-1994
2  kanan     Belarus  kane     Belarus     2  23-12-1999

I have tried calculating the distance for one row against one other row, but I don't know how to compare each row of one dataframe against every row of the other dataframe.
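
One possible way to set this up, as a minimal sketch rather than a definitive answer: cross join the two dataframes so that every row of df1 is paired with every row of df2, score each pair, and filter on the 0.6 threshold. The scoring here is an assumption: it uses Jaccard similarity on the lower-cased character sets of Name and country combined, which is crude and could be swapped for any other string measure. The helper name `jaccard`, the `_1`/`_2` column suffixes, and the extra step of keeping only the best match per df1 row are also illustrative choices, not part of the original question. `how="cross"` needs pandas 1.2 or later.

```python
import pandas as pd

# Sample data copied from the question.
df1 = pd.DataFrame({
    "Name": ["raj", "sam", "kanan", "Nan"],
    "country": ["Kazakhstan", "Russia", "Belarus", "Nan"],
    "cost": [23, 243, 2, 0],
})
df2 = pd.DataFrame({
    "Name": ["rak", "sim", "raj", "kane"],
    "country": ["Kazakhstan", "russia", "Belarus", "Belarus"],
    "DOB": ["12-12-1903", "03-04-1994", "21-09-2003", "23-12-1999"],
})


def jaccard(a, b):
    """Jaccard similarity of the lower-cased character sets of two strings."""
    sa, sb = set(str(a).lower()), set(str(b).lower())
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


# Pair every row of df1 with every row of df2 (4 x 4 = 16 candidate pairs).
# reset_index() keeps the original row numbers as index_1 / index_2.
pairs = df1.reset_index().merge(
    df2.reset_index(), how="cross", suffixes=("_1", "_2")
)

# Score each pair on Name and country combined (an assumed scoring rule).
pairs["score"] = [
    jaccard(f"{n1}{c1}", f"{n2}{c2}")
    for n1, c1, n2, c2 in zip(
        pairs["Name_1"], pairs["country_1"], pairs["Name_2"], pairs["country_2"]
    )
]

# Keep pairs above the threshold, then only the best match per df1 row.
matches = pairs[pairs["score"] > 0.6]
best = matches.loc[matches.groupby("index_1")["score"].idxmax()]

df3 = best[
    ["Name_1", "country_1", "Name_2", "country_2", "cost", "DOB"]
].reset_index(drop=True)
print(df3)
```

On the sample data the threshold alone also lets the kanan/Belarus and raj/Belarus pair through, which is why the sketch keeps only the highest-scoring df2 row for each df1 row. The cross join grows as len(df1) * len(df2), so for large tables some form of blocking (for example, comparing only rows with the same country) would likely be needed first.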



