Wednesday, 27 January 2021

Memory leakage issue in python list

The identities list contains an big array of approximately 57000 images. Now, I am creating a negative list with the help of itertools.product(). This store the whole list in memory which is very costly and my system hanged after 4 minutes.

How can i optimize the below code and avoid saving in memory?`

for i in range(0, len(idendities) - 1):
    for j in range(i + 1, len(idendities)):
        cross_product = itertools.product(samples_list[i], samples_list[j])
        cross_product = list(cross_product)

        for cross_sample in cross_product:
            negative = []
            negative.append(cross_sample[0])
            negative.append(cross_sample[1])
            negatives.append(negative)
            print(len(negatives))

negatives = pd.DataFrame(negatives, columns=["file_x", "file_y"])
negatives["decision"] = "No"

negatives = negatives.sample(positives.shape[0])

The memory 9.30 is going to be higher and higher and on one point the system has been completely hanged.

I also implemented the below answer and modified code according to his answer.

for i in range(0, len(idendities) - 1):
    for j in range(i + 1, len(idendities)):
        for cross_sample in itertools.product(samples_list[i], samples_list[j]):
            negative = [cross_sample[0], cross_sample[1]]
            negatives.append(negative)
            print(len(negatives))

negatives = pd.DataFrame(negatives, columns=["file_x", "file_y"])
negatives["decision"] = "No"

Third version of code

This CSV file is too big even if you open a file then it gives an alert that your program can not load all files. Regarding the process, it ten minutes, and then again system going to be hanged completely.

for i in range(0, len(idendities) - 1):
    for j in range(i + 1, len(idendities)):
        for cross_sample in itertools.product(samples_list[i], samples_list[j]):
            with open('/home/khawar/deepface/tests/results.csv', 'a+') as csvfile:
                writer = csv.writer(csvfile)
                writer.writerow([cross_sample[0], cross_sample[1]])
            negative = [cross_sample[0], cross_sample[1]]
            negatives.append(negative)

negatives = pd.DataFrame(negatives, columns=["file_x", "file_y"])
negatives["decision"] = "No"

negatives = negatives.sample(positives.shape[0])

Memory screenshot.

enter image description here



from Memory leakage issue in python list

No comments:

Post a Comment