Friday, 13 May 2022

MemoryError: Unable to allocate 1.88 GiB for an array with shape (2549150, 99) and data type object

I have a problem. I want to flatten a list of nested dicts with pd.json_normalize(...), but I get a MemoryError. Is there a way to work around this error? It works on a subset, pd.json_normalize(my_data[:2000000], sep="_"), but not on the complete data (2549150 records).

I already looked at the related questions "MemoryError: Unable to allocate MiB for an array with shape and data type, when using anymodel.fit() in sklearn" and "Unable to allocate array with shape and data type".

import pandas as pd

my_data = [
{'_id': 'orders/213123',
 'contactEditor': {'name': 'Max Power',
  'phone': '1234567',
  'email': 'max@power.com'},
 'contactSoldToParty': {'name': 'Max Not',
  'phone': '123456789',
  'email': 'maxnot@power.com'},
 'isCompleteDelivery': False,
 'metaData': {'dataOriginSystem': 'Goods',
  'dataOriginWasCreatedTime': '10:12:12',},
 'orderDate': '2021-02-22',
 'orderDateBuyer': '2021-02-22',
},
{'_id': 'orders/12323',
 'contactEditor': {'name': 'Max Power2',
  'phone': '1234567',
  'email': 'max@power.com'},
 'contactSoldToParty': {'name': 'Max Not',
  'phone': '123456789',
  'email': 'maxnot@power.com'},
 'isCompleteDelivery': False,
 'metaData': {'dataOriginSystem': 'Goods',
  'dataOriginWasCreatedTime': '10:12:12',},
 'orderDate': '2021-02-22',
 'orderDateBuyer': '2021-02-22',
 },
]

df = pd.json_normalize(my_data, sep="_")
[OUT]
---------------------------------------------------------------------------
MemoryError                               Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_11136/3519902863.py in <module>
----> 1 df= pd.json_normalize(my_data, sep='_')
MemoryError: Unable to allocate 1.88 GiB for an array with shape (2549150, 99) and data type object
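
The 1.88 GiB in the message can be reproduced from the reported shape. This is only a rough sketch of the arithmetic, assuming a 64-bit Python build where each cell of an object array holds an 8-byte pointer; the strings those pointers refer to take additional memory on top of that.

```python
# Shape taken from the error message above.
rows, cols = 2549150, 99

# Assumption: 64-bit platform, so each object-dtype cell is an 8-byte pointer.
pointer_bytes = 8

total_bytes = rows * cols * pointer_bytes
total_gib = total_bytes / 2**30

print(f"{total_bytes} bytes = {total_gib:.2f} GiB")  # 2018926800 bytes = 1.88 GiB
```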

What I want

id             contactEditor_name contactEditor_phone contactEditor_email ...
orders/213123  Max Power          ...                 ...                 ...
orders/12323   Max Power2         ...                 ...                 ...

len(my_data) is 2549150.
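
Since the call succeeds on the first 2000000 records, one thing that might be worth trying is normalizing the list in slices and concatenating the results. The sketch below is an untested idea and not a confirmed fix: the helper name normalize_in_chunks and the chunk size are hypothetical, and the final pd.concat still has to hold the whole table in memory, so it may fail in the same way.

```python
import pandas as pd


def normalize_in_chunks(records, chunk_size=500_000, sep="_"):
    """Flatten `records` slice by slice, then concatenate the pieces.

    Hypothetical helper: chunk_size is an arbitrary starting value
    and would need tuning for the machine at hand.
    """
    frames = []
    for start in range(0, len(records), chunk_size):
        chunk = records[start:start + chunk_size]
        frames.append(pd.json_normalize(chunk, sep=sep))
    # ignore_index gives one continuous 0..n-1 index across all chunks.
    return pd.concat(frames, ignore_index=True)


# Tiny stand-in for my_data, just to show the call.
sample = [
    {"_id": "orders/1", "contactEditor": {"name": "Max Power"}},
    {"_id": "orders/2", "contactEditor": {"name": "Max Power2"}},
    {"_id": "orders/3", "contactEditor": {"name": "Max Not"}},
]
df = normalize_in_chunks(sample, chunk_size=2)
print(df.columns.tolist())
```

If the concatenation itself runs out of memory, writing each chunk to disk instead of keeping it in the frames list would be the next thing to try.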


