Split list by sum of sublist items

I have a list of sublists with file names and sizes. I need to split that list into sublists such that each resulting sublist has a total file size of less than 500 000 000 bytes. I have tried several approaches but could not get any of them to work.
My last attempt is this:

import functools
import operator

data = [["c:\example_path", 480000],["c:\example_path2", 500000], ...]

list_final = []

sum = 0
list_items_subset = []

for index, item in enumerate(data):

   sum += item[1]

   if sum < 500000000:

      list_items_subset.append(item[0])

   else:
      list_final.append(list_items_subset)

      sum = 0
      
      list_items_subset = []
      list_items_subset.append(item[0])
      sum += item[1]

print("len data init: ", len(data))
print("len items final: ", len(functools.reduce(operator.iconcat, list_final, [])))

list_final should store all the sublists of files, each with a cumulative size of less than 500 000 000 bytes. With the code above, sublists are created and appended, but some items end up not being included in any of them.

Thanks for any suggestions!

Solution:

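Is this what you want to get? In your loop a sublist is only stored when the next one is started, so the last sublist is never added to list_final. Appending it once after the loop fixes the missing items.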

import functools
import operator

data = [[r"c:\example_path", 480000], [r"c:\example_path2", 500000]] * 10000

list_final = []

total_size = 0
list_items_subset = []

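for name, size in data:
    total_size += size
    if total_size < 500000000:
        # Still under the limit: keep filling the current sublist.
        list_items_subset.append(name)
    else:
        # Limit reached: store the current sublist and start a new one with this file.
        list_final.append(list_items_subset)
        list_items_subset = [name]
        total_size = size

# Store the sublist that was still being filled when the loop ended.
list_final.append(list_items_subset)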
print("len data init: ", len(data))
print(len(list_final))
print("len items final: ", len(functools.reduce(operator.iconcat, list_final, [])))
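
If you need this in more than one place, the same idea can be wrapped in a generator. This is only a sketch: the name split_by_size and the limit parameter are made up for illustration and are not part of the code above. It also assumes that a single file at or above the limit should simply end up alone in its own sublist, which may not be what you want.

```python
def split_by_size(items, limit=500_000_000):
    """Yield lists of names whose sizes add up to less than `limit`.

    `items` is an iterable of [name, size] pairs. A single item whose size
    is at or above `limit` is yielded alone (an assumption of this sketch).
    """
    batch = []
    batch_size = 0
    for name, size in items:
        # Close the current batch if adding this file would reach the limit.
        if batch and batch_size + size >= limit:
            yield batch
            batch, batch_size = [], 0
        batch.append(name)
        batch_size += size
    # Emit whatever is left over after the last file.
    if batch:
        yield batch


# Small example with an artificially low limit so the result is easy to check.
example = [["a", 300], ["b", 300], ["c", 100], ["d", 600]]
print(list(split_by_size(example, limit=500)))
# → [['a'], ['b', 'c'], ['d']]
```

Because the leftover batch is only yielded when it is non-empty, an empty input produces no sublists at all instead of a single empty one.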