I have two different customer dataframes, and I would like to match their rows using a Jaccard distance matrix or any other string-similarity method.
df1
   Name   country     cost
0  raj    Kazakhstan  23
1  sam    Russia      243
2  kanan  Belarus     2
3  Nan    Nan         0
df2
   Name  country     DOB
0  rak   Kazakhstan  12-12-1903
1  sim   russia      03-04-1994
2  raj   Belarus     21-09-2003
3  kane  Belarus     23-12-1999
Output:
If the string comparison value is greater than 0.6, I would like to combine both rows into a new dataframe.
Df3
   Name   country     Name  country     cost  DOB
0  raj    Kazakhstan  rak   Kazakhstan  23    12-12-1903
1  sam    Russia      sim   russia      243   03-04-1994
2  kanan  Belarus     kane  Belarus     2     23-12-1999
I tried calculating the distance for one row against another, but I don't know how to compare each row of one dataframe against all the rows of the other.
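For illustration, here is a minimal sketch of one possible approach, not a definitive answer. It makes several assumptions that are not in the question: the "Nan" row holds missing values, the comparison key is the lower-cased "Name country" string, similarity is the Jaccard index over the sets of characters in the two keys, and each df1 row is paired only with its single best-scoring df2 row. The helper names `jaccard` and `match_key` and the `_1`/`_2` column suffixes are made up for the example.

```python
import pandas as pd

df1 = pd.DataFrame({
    "Name": ["raj", "sam", "kanan", None],
    "country": ["Kazakhstan", "Russia", "Belarus", None],
    "cost": [23, 243, 2, 0],
})
df2 = pd.DataFrame({
    "Name": ["rak", "sim", "raj", "kane"],
    "country": ["Kazakhstan", "russia", "Belarus", "Belarus"],
    "DOB": ["12-12-1903", "03-04-1994", "21-09-2003", "23-12-1999"],
})


def jaccard(a: str, b: str) -> float:
    """Jaccard similarity between the character sets of two strings."""
    sa, sb = set(a), set(b)
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def match_key(df: pd.DataFrame) -> pd.Series:
    # Assumption: rows are compared on the lower-cased "Name country" string.
    return (df["Name"] + " " + df["country"]).str.lower()


# Rows with a missing name or country cannot be compared, so drop them.
left = df1.dropna(subset=["Name", "country"])
right = df2.dropna(subset=["Name", "country"])
keys1, keys2 = match_key(left), match_key(right)

# Similarity matrix: one row per df1 row, one column per df2 row.
sim = pd.DataFrame(
    [[jaccard(a, b) for b in keys2] for a in keys1],
    index=keys1.index,
    columns=keys2.index,
)

best = sim.idxmax(axis=1)  # index label of the best-scoring df2 row
score = sim.max(axis=1)    # similarity of that best match
keep = score > 0.6         # threshold taken from the question

matched_left = left.loc[keep]
matched_right = right.loc[best[keep]]

# Place the matched rows side by side; suffixes avoid duplicate column names.
df3 = pd.concat(
    [
        matched_left[["Name", "country"]].add_suffix("_1").reset_index(drop=True),
        matched_right[["Name", "country"]].add_suffix("_2").reset_index(drop=True),
        matched_left[["cost"]].reset_index(drop=True),
        matched_right[["DOB"]].reset_index(drop=True),
    ],
    axis=1,
)
print(df3)
```

Character-set Jaccard is a crude measure: "kanan Belarus" against "raj Belarus" still scores about 0.73, which is why the sketch keeps only the best match per row instead of every pair above 0.6. Comparing sets of character n-grams, or the name and country columns separately, would likely be stricter.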
from How to compare each row from one dataframe against all the rows from other dataframe and calculate distance measure?