Tuesday, 5 November 2019

Prevent coercion of pandas data frames while indexing and inserting rows

I'm working with individual rows of pandas data frames, but I keep running into coercion issues when indexing and inserting rows. Pandas seems to coerce mixed int/float data to all-float types, and I can't find any obvious way to control this behaviour.

For example, here is a simple data frame with a as int and b as float:

import pandas as pd
pd.__version__  # '0.25.2'

df = pd.DataFrame({'a': [1], 'b': [2.2]})
print(df)
#    a    b
# 0  1  2.2
print(df.dtypes)
# a      int64
# b    float64
# dtype: object

Here is a coercion issue while indexing one row:

print(df.loc[0])
# a    1.0
# b    2.2
# Name: 0, dtype: float64
print(dict(df.loc[0]))
# {'a': 1.0, 'b': 2.2}
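
For the indexing case, a possible workaround is to avoid building a single-dtype Series from the row. The sketch below is not an official pandas switch for disabling coercion; it only shows two approaches that appear to keep the integer intact. The names row, row_dict and row_obj are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'a': [1], 'b': [2.2]})

# Option 1: select the row as a one-row DataFrame (note the list in .loc)
# and read it with itertuples(), which takes each value from its own column
# rather than from a single upcast Series.
row = next(df.loc[[0]].itertuples(index=False))
row_dict = row._asdict()
print(row_dict)

# Option 2: cast the frame to object first, so the row Series has
# dtype object and has no reason to upcast the integer to float64.
row_obj = df.astype(object).loc[0]
print(row_obj)
```

Option 1 leaves the original frame untouched, while Option 2 copies the whole frame, which may matter for large data.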

And here is a coercion issue while inserting one row:

df.loc[1] = {'a': 5, 'b': 4.4}
print(df)
#      a    b
# 0  1.0  2.2
# 1  5.0  4.4
print(df.dtypes)
# a    float64
# b    float64
# dtype: object
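
For the insertion case, one approach that seems to sidestep the coercion is to build the new row as its own one-row data frame and concatenate it, so that each column is combined with a column of the same dtype. This is a minimal sketch, assuming the new row has the same columns as the existing frame; new_row and df2 are illustrative names.

```python
import pandas as pd

df = pd.DataFrame({'a': [1], 'b': [2.2]})

# Build the new row as a one-row DataFrame so each column gets its own dtype
# (int64 for 'a', float64 for 'b'), then concatenate it column by column.
new_row = pd.DataFrame([{'a': 5, 'b': 4.4}], index=[1])
df2 = pd.concat([df, new_row])

print(df2)
print(df2.dtypes)
```

Unlike assignment through df.loc[1], this returns a new frame instead of modifying df in place, so it is likely to be slow if rows are added one at a time in a loop.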

In both instances, I want the a column to remain as an integer type, rather than being coerced to a float type.


