Sunday, 12 September 2021

Merge two pandas DataFrame based on partial match

Two DataFrames have city names that are not formatted the same way. I'd like to do a Left-outer join and pull geo field for all partial string matches between the field City in both DataFrames.

import pandas as pd

df1 = pd.DataFrame({
                    'City': ['San Francisco, CA','Oakland, CA'], 
                    'Val': [1,2]
                  })

df2 = pd.DataFrame({
                    'City': ['San Francisco-Oakland, CA','Salinas, CA'], 
                    'Geo': ['geo1','geo2']
                  })

Expected DataFrame upon join:

 City                   Val   Geo

 San Francisco, CA      1     geo1
 Oakland, CA            2     geo1


from Merge two pandas DataFrame based on partial match

No comments:

Post a Comment