Wednesday, 26 January 2022

Conversion to FEATHER file creates huge file

I am trying to turn an .rds file into a .feather file for reading with Pandas in Python.

library(feather)

# Read the .rds file (assumes the working directory already contains it)
data = readRDS("file.rds")
# Select the entry for a single year
data_year = data[["1986"]]

# Try 1: write the object as-is
write_feather(
  data_year,
  "data_year.feather"
)

# Try 2: coerce to a dense matrix, then to a data frame, before writing
write_feather(
  as.data.frame(as.matrix(data_year)),
  "data_year.feather"
)

Try 1 fails with Error: 'x' must be a data frame. Try 2 does write a .feather file, but that file is 4.5 GB for a single year, whereas the original .rds file is only 0.055 GB for several years.

How can I convert the data to .feather, either as one file per year or as a single file covering all years, while keeping the file size reasonable?

data looks like this:

[screenshot of data]

data_year looks like this:

[screenshot of data_year]

Update:

I am open to any suggestions for making the data available for use in NumPy/Pandas whilst maintaining a modest file size!
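
One possible direction, sketched under a strong assumption: the symptoms (Try 1 rejecting the object, and as.matrix() producing a file far larger than the .rds) would be consistent with data_year being a sparse matrix from the Matrix package, which as.matrix() expands into a dense matrix with every zero stored explicitly. The post does not confirm the class, so class(data_year) should be checked first. The matrix m below is a hypothetical stand-in, not the real data, and the output file name is made up for illustration. The sketch writes only the non-zero entries as (i, j, x) triplets:

```r
library(Matrix)   # sparse matrix classes (ASSUMED to be the class of data_year)
library(feather)

# Hypothetical stand-in for data_year: a 4 x 5 sparse matrix with 3 non-zeros
m <- sparseMatrix(i = c(1, 3, 4), j = c(2, 2, 5), x = c(1.5, 2, 7),
                  dims = c(4, 5))

# summary() on a general numeric sparse matrix lists its non-zero entries as
# row index i, column index j, and value x; as.data.frame() gives a plain
# data frame that write_feather() accepts
trip <- as.data.frame(summary(m))

# Write the triplets to a temporary location (file name is illustrative)
out <- file.path(tempdir(), "data_year_triplets.feather")
write_feather(trip, out)

# The matrix dimensions are not stored in the triplets, so keep them
dims <- dim(m)
```

If this assumption holds, the file size grows with the number of non-zero entries rather than with rows times columns. On the Python side the triplets could then be read with pandas and, if a matrix is needed, rebuilt with something like scipy.sparse.coo_matrix; note that i and j are 1-based in R, so 1 would have to be subtracted from each for Python. Symmetric or pattern-only sparse matrices are summarised differently and would need extra handling.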