I used merge to create a dataframe between 2 of my query's. Now in the dataframe I want to put a filter on a column but I cannot get it to work. I am trying to add filter component date is null or blank.
##reading data from sql
package = pd.read_sql(sql,con)
component = pd.read_sql(sqla,con)
##doing the left join
test2 = pd.merge(package,component, on = ['identifier','date'], how='left')
##shrinking the dataframe
test3 = test2[['identifier_x']].copy()
i've tried doing the below but cannot get it to work. Getting date_y not defined. I also tried with just date, as when I merge my dataframes the like columns get labelled x and y.
test2 = pd.merge(package,component, on = ['identifier','date'], how='left'), component.query(date_y == '')
Now trying:
test2 = pd.merge(package,component, on = ['identifier','date'], how='left')
test2.query('date_y == \'\'')
And
test2 = pd.merge(package,component, on = ['identifier','date'], how='left')
test2[test2.date_y == '']
Also tried:
test2 = pd.merge(package,component, on = ['identifier','date'], how='left')
test2 = test2.date_y == ''
Sample data:
+------------+------------+------------+------------+
| date_x | identifier | date_y | identifier |
+------------+------------+------------+------------+
| 13/03/2019 | 3582191409 | 13/03/2019 | 3582191410 |
| 13/03/2019 | 3582191289 | 13/03/2019 | 3582191290 |
| 13/03/2019 | 3582190137 | 13/03/2019 | 3582190138 |
| 13/03/2019 | 3582185931 | 13/03/2019 | 3582185930 |
| 13/03/2019 | 3582184503 | | 3582184502 |
| 13/03/2019 | 3582195631 | | 3582195632 |
| 13/03/2019 | 3582191374 | | 3582191373 |
| 13/03/2019 | 3582185917 | | 3582185916 |
| 13/03/2019 | 3582185863 | | 3582185862 |
+------------+------------+------------+------------+
Im trying to filter date y = blanks
from Pandas Merge and filter
No comments:
Post a Comment