Neither of these questions addresses the issue: Question 1 and Question 2. I also could not find the answer in the pandas documentation.
Hello, I am trying to find the underlying cause for this error:
ValueError: You are trying to merge on object and int64 columns.
I know I can work around this problem using the pandas concat or merge function, but I am trying to understand the cause of the error. The question is: why do I get this ValueError?
Here's the output of head(5) and info() on both dataframes that are used.
print(the_big_df.head(5))
Output:
account apt apt_p balance date day flag month reps reqid year
0 AA0420 0 0.0 -578.30 2019-03-01 1 1 3 10 82f2d761 2019
1 AA0420 0 0.1 -578.30 2019-03-02 2 1 3 10 82f2d761 2019
2 AA0420 0 0.1 -578.30 2019-03-03 3 1 3 10 82f2d761 2019
3 AA0421 0 0.1 -607.30 2019-03-04 4 1 3 10 82f2d761 2019
4 AA0421 0 0.1 -610.21 2019-03-05 5 1 3 10 82f2d761 2019
print(the_big_df.info())
Output:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 36054 entries, 0 to 36053
Data columns (total 11 columns):
account 36054 non-null object
apt 36054 non-null int64
apt_p 36054 non-null float64
balance 36054 non-null float64
date 36054 non-null datetime64[ns]
day 36054 non-null int64
flag 36054 non-null int64
month 36054 non-null int64
reps 36054 non-null int32
reqid 36054 non-null object
year 36054 non-null int64
dtypes: datetime64[ns](1), float64(2), int32(1), int64(5), object(2)
memory usage: 3.2+ MB
Here's the dataframe I'm passing to join(); print(df_to_join.head(5)):
reqid id
0 54580f39 13301
1 3ba905c0 77114
2 5f2d80da 13302
3 a1478e98 77115
4 9b09854b 78598
print(df_to_join.info())
Output:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 14332 entries, 0 to 14331
Data columns (total 2 columns):
reqid 14332 non-null object
dni 14332 non-null object
The very next line after the four prints above is:
the_max_df = the_big_df.join(df_to_join,on='reqid')
And the output is, as stated above:
ValueError: You are trying to merge on object and int64 columns. If you wish to proceed you should use pd.concat
Why does this happen, when the info() output above clearly shows that column reqid is an object in both dataframes? Thanks.
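A likely explanation, sketched under assumptions: DataFrame.join(other, on='reqid') matches the caller's reqid column against the index of other, not against a column named reqid in other. Since df_to_join has a default RangeIndex (int64), pandas ends up comparing object keys with int64 keys. The two tiny frames below are hypothetical stand-ins for the real data, and the set_index and merge calls are two possible ways to join on the column instead.

```python
import pandas as pd

# Hypothetical miniature versions of the two frames from the question.
the_big_df = pd.DataFrame({
    "account": ["AA0420", "AA0421"],
    "reqid": ["82f2d761", "54580f39"],
})
df_to_join = pd.DataFrame({
    "reqid": ["54580f39", "3ba905c0"],
    "id": [13301, 77114],
})

# join(on="reqid") looks up the_big_df["reqid"] in df_to_join's index,
# which here is a default RangeIndex of integers, so the call fails.
try:
    the_big_df.join(df_to_join, on="reqid")
    join_failed = False
except ValueError as exc:
    join_failed = True
    print(exc)

# Option 1: make reqid the index of the right-hand frame, then join.
fixed_join = the_big_df.join(df_to_join.set_index("reqid"), on="reqid")

# Option 2: merge on the shared column; how="left" mirrors join's default.
fixed_merge = the_big_df.merge(df_to_join, on="reqid", how="left")

print(fixed_join)
print(fixed_merge)
```

Both options keep every row of the_big_df and fill id with NaN where no matching reqid exists in df_to_join.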
from Trouble with df.join(): ValueError: You are trying to merge on object and int64 columns