Sunday 25 July 2021

How to append successive calls to a single numpy file without losing precision?

After applying some procedure, I am getting millions of numpy arrays (in the case below, procedure converts e to a numpy array):

for e in l:
    procedure(e)

How can I correctly save the result of each iteration into a single numpy file, so that I can read and load it later?

So far I tried two options, with np.savez:

for i, e in enumerate(l):
    np.savez(f'/Users/user/array.npz',i=e)

And with pandas:

(1) For saving into a single file:

for e in l:
    # Transpose so that each array is appended as a single CSV row
    arr = pd.DataFrame(procedure(e)).T
    arr.to_csv('/Users/user/Downloads/arr.csv', mode='a', index=False, header=False)

(2) For reading:

arr = np.genfromtxt("/Users/user/Downloads/arr.csv", delimiter=',', dtype='float32')

So far, the only solution that works is the one with pandas. However, I guess I am losing precision in the numpy matrices, because instead of having values like this (with the e notation):

-6.82821393e-01 -2.65419781e-01

I am getting values like this:

-0.6828214 , -0.26541978

So it seems the numpy matrices are not being saved correctly.
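
A note on the two printouts above: the shorter values are what a float32 array typically looks like when printed, so the difference may come from reading the file back with dtype='float32' rather than from the CSV itself. The minimal sketch below assumes the original arrays are float64 (the question does not say) and measures how much a float32 cast changes the two sample values.

```python
import numpy as np

# Values copied from the question; float64 is an assumption about the
# original dtype, not something the question states.
original = np.array([-6.82821393e-01, -2.65419781e-01], dtype=np.float64)

# Casting to float32 keeps only about 7 significant decimal digits.
as_float32 = original.astype(np.float32)

# Largest absolute difference introduced by the cast.
max_error = np.max(np.abs(original - as_float32.astype(np.float64)))

print(original.dtype, as_float32.dtype)
print(as_float32)
print(max_error)
```

If that assumption holds, reading the file back with dtype='float64' would avoid this particular loss.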

What is the most efficient and correct way to dump each numpy matrix into a single file after each iteration of the for loop?
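
One possible approach, shown as a sketch rather than a definitive answer: np.save accepts an open file object, so arrays can be written one after another to a file opened in binary append mode and read back with repeated np.load calls. In the code below, procedure is a hypothetical stand-in for the real one, the path is a temporary file, and read_arrays is a helper name introduced only for illustration.

```python
import os
import tempfile

import numpy as np


def procedure(e):
    """Hypothetical stand-in for the question's procedure(e)."""
    return np.array([e, e / 3.0], dtype=np.float64)


def read_arrays(path):
    """Yield the stored arrays in the order they were written."""
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        # Each np.load call consumes exactly one .npy record.
        while f.tell() < size:
            yield np.load(f)


# Temporary location used only for this example.
path = os.path.join(tempfile.mkdtemp(), "arrays.npy")
inputs = [1.0, 2.0, -6.82821393e-01]

# Keep one handle open and append a binary .npy record per iteration.
with open(path, "ab") as f:
    for e in inputs:
        np.save(f, procedure(e))

restored = list(read_arrays(path))
print(len(restored), restored[-1].dtype)
```

Because each record is stored in binary .npy form, dtype and shape travel with the data and no text formatting is involved, so the values read back are bit-for-bit the ones that were written.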



