Thursday, 12 August 2021

Pandas read_excel function ignoring dtype

I'm trying to read an Excel file with pd.read_excel(). The file has two columns, Date and Time, and I want to read both as str rather than as the dtype Excel assigns.

Example of the excel file

I've tried to specify the dtype or the converters arguments to no avail.

import pandas as pd

df = pd.read_excel('xls_test.xlsx',
                   dtype={'Date':str,'Time':str})
df.dtypes
Date    object
Time    object
dtype: object
df.head()
                  Date      Time
0  2020-03-08 00:00:00  10:00:00
1  2020-03-09 00:00:00  11:00:00
2  2020-03-10 00:00:00  12:00:00
3  2020-03-11 00:00:00  13:00:00
4  2020-03-12 00:00:00  14:00:00

As you can see, the Date column does not come through as the plain date string I expected: every value carries a 00:00:00 time component.

The same thing happens when using converters:

df = pd.read_excel('xls_test.xlsx',
                   converters={'Date':str,'Time':str})
df.dtypes
Date    object
Time    object
dtype: object
df.head()
                  Date      Time
0  2020-03-08 00:00:00  10:00:00
1  2020-03-09 00:00:00  11:00:00
2  2020-03-10 00:00:00  12:00:00
3  2020-03-11 00:00:00  13:00:00
4  2020-03-12 00:00:00  14:00:00
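
One possible explanation for the output above, sketched under an assumption: if the Date cells are stored in the workbook as real Excel dates, the reader engine probably hands each cell over as a datetime object, and applying str to a datetime produces the full timestamp. The date_to_text helper below is hypothetical and only illustrates the idea; it is not part of pandas.

```python
import datetime

# Assumption: a date cell arrives as a datetime object, not as display text.
cell_value = datetime.datetime(2020, 3, 8)

# Plain str() on a datetime includes the midnight time component.
as_plain_str = str(cell_value)


def date_to_text(value):
    """Hypothetical converter: format date-like values as YYYY-MM-DD,
    and fall back to str() for anything else."""
    if isinstance(value, (datetime.datetime, datetime.date)):
        return value.strftime("%Y-%m-%d")
    return str(value)


as_date_text = date_to_text(cell_value)

print(as_plain_str)  # 2020-03-08 00:00:00
print(as_date_text)  # 2020-03-08
```

If that assumption holds, passing such a function instead of str, as in converters={'Date': date_to_text}, might give the shorter text. I have not confirmed this against the actual file.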

I have also tried other engines, but the result is always the same.

The dtype argument seems to work as expected when reading a CSV, though.
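
For comparison, here is a minimal sketch of the CSV case. The inline data is made up to mirror the Date and Time columns of the Excel file, and io.StringIO stands in for a real file on disk.

```python
import io

import pandas as pd

# Made-up CSV content mirroring the two columns of the Excel file.
csv_text = "Date,Time\n2020-03-08,10:00:00\n2020-03-09,11:00:00\n"

# With a CSV, every field is text to begin with, so dtype=str keeps it as written.
df_csv = pd.read_csv(io.StringIO(csv_text),
                     dtype={"Date": str, "Time": str})

date_values = df_csv["Date"].tolist()
time_values = df_csv["Time"].tolist()
print(date_values)
print(time_values)
```

Here the Date values stay exactly as they appear in the file, with no time component appended.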

What am I doing wrong here?

Edit: I forgot to mention that I'm using the latest version of pandas, 1.2.2, but I had the same problem before updating from 1.1.2.


