Sunday, 3 June 2018

Numpy array error setting an array element with a sequence

i tryed to use multilist to hold scraped data from html

but after 50.000 list append i got memory error

So i decided to change lists to numpy array

SapList= []
ListAll  =  np.array([])

def eachshop(): #filling each list for each shop data
    global ListAll
    SapList.append(RowNum)
    SapList.extend([sap]) # here can be from one to 10 values in one list["sap1","sap2","sap3",...,"sap10"]
    SapList.extend([[strLink,ProdName],ProdCode,ProdH,NewPrice, OldPrice,[FileName+'#Komp!A1',KompPrice],[FileName+'#Sav!A1','Sav']])
    SapList.extend([ss]) # here can be from null to 80 sublist with 3 values [["id1", "link", "address"],["id80", "link", "address"]]


    ListAll = np.append(np.array(SapList))

So then i do print(ListAll) i got exception C:\Python36\scrap.py, LINE 307 "ListAll = np.append(np.array(SapList))"): setting an array element with a sequence

now for speed up i using pool.map

def makePool(cP, func, iters):
    try:

        pool = ThreadPool(cP)
        #perebiraem Url
        pool.map_async(func,enumerate(iters, start=2)).get(99999)
        pool.close()
        pool.join()
    except:
        print('Pool Error')
        raise
    finally:
        pool.terminate()

So how to reduce memory usage and speedup I\O operation using Numpy?



from Numpy array error setting an array element with a sequence

No comments:

Post a Comment