Wednesday, 8 September 2021

How to get an exact representation of floats during `DataFrame.to_json`?

I observed the following behavior with DataFrame.to_json:

>>> import pandas as pd
>>> df = pd.DataFrame([[float(f'1.12345e-{i}') for i in range(8, 20)]])
>>> df
             0             1             2             3             4             5             6             7             8             9             10            11
0  1.123450e-08  1.123450e-09  1.123450e-10  1.123450e-11  1.123450e-12  1.123450e-13  1.123450e-14  1.123450e-15  1.123450e-16  1.123450e-17  1.123450e-18  1.123450e-19
>>> print(df.to_json(indent=2, orient='index'))
{
  "0":{
    "0":0.0000000112,
    "1":0.0000000011,
    "2":0.0000000001,
    "3":0.0,
    "4":0.0,
    "5":0.0,
    "6":0.0,
    "7":0.0,
    "8":1.12345e-16,
    "9":1.12345e-17,
    "10":1.12345e-18,
    "11":1.12345e-19
  }
}

So values down to about 1e-15 seem to be rounded to 10 decimal places (in agreement with the default value of 10 for double_precision), but values of 1e-16 and smaller are written in scientific notation with all their digits intact. Why is this the case, and how can I turn off decimal rounding for the larger values too (i.e. have them written in scientific notation instead)?


>>> pd.__version__
'1.3.1'

For reference, the standard library's json module doesn't do this:

>>> import json
>>> print(json.dumps([float(f'1.12345e-{i}') for i in range(8, 20)], indent=2))
[
  1.12345e-08,
  1.12345e-09,
  1.12345e-10,
  1.12345e-11,
  1.12345e-12,
  1.12345e-13,
  1.12345e-14,
  1.12345e-15,
  1.12345e-16,
  1.12345e-17,
  1.12345e-18,
  1.12345e-19
]
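
Given that difference, one workaround I could imagine (a sketch only, not a confirmed fix) is to skip to_json and pass the nested dict from df.to_dict(orient='index') to the standard library's json module. To keep the snippet runnable without pandas, the `records` dict below is a hand-built stand-in for that to_dict result; the names `records`, `text` and `restored` are made up for illustration.

```python
import json

# Hypothetical stand-in for df.to_dict(orient='index'): one row (index 0)
# whose columns 0..11 hold the same twelve values as in the example above.
# With pandas available, the assumption is that the to_dict result could be
# passed to json.dumps in exactly the same way.
records = {0: {col: float(f'1.12345e-{exp}')
               for col, exp in enumerate(range(8, 20))}}

# json.dumps writes floats using repr(), which round-trips, so no decimal
# rounding is applied. Integer keys are converted to strings.
text = json.dumps(records, indent=2)
print(text)

# Round-trip check: every value parses back to the identical float.
restored = json.loads(text)
assert all(restored['0'][str(col)] == value
           for col, value in records[0].items())
```

This trades to_json's pandas-specific handling (dates, NaN, the other orient layouts) for exact float output, so it may only suit frames that contain plain numbers.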


