Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

How to complete missing rows based on numeric columns?

My input is a dataframe :

import pandas as pd

df = pd.DataFrame({'ID': ['ID001', 'ID001', 'ID001', 'ID002'],
 'N1': [1, 1, 2, 1],
 'N2': [1, 2, 1, 3],
 'SUB_GROUP': [2, 2, 2, 1],
 'CHUNK': [2, 2, 2, 4]})

I’m trying to add some rows (blue ones in the image below) based on the values of the columns SUB_GROUP and CHUNK. For example for ID002, the dataframe has only one row N1=1 and N2=3 and since this id must have one subgroup with 4 units each, then we add 1/1, 1/2 and 1/4.

enter image description here

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

I was able to generate almost the new rows to add but can’t find a way to merge them in my df :

for id in df['ID'].unique():
    new_rows = []
    df1 = df.loc[df['ID'] == id]
    len_sub_group = df1['SUB_GROUP'].iat[0]
    len_chunk = df1['CHUNK'].iat[0]
    for list1 in [list(n) for n in df1[['N1', 'N2']].values]:
        list1 = list([tuple(list1)])
        new_rows = [(len_sub_group, i)
            for i in range(1, len_chunk + 1)
            if (len_sub_group, i) not in list1]

Do you guys have an idea to do that ?

>Solution :

You could use a custom groupby.apply with itertools.product and merge:

from itertools import product

cols = ['ID', 'SUB_GROUP', 'CHUNK']

def add_missing(g):
    p =product(range(1, g['SUB_GROUP'].iat[0]+1),
               range(1, g['CHUNK'].iat[0]+1),
              )
    return g.merge(
        pd.DataFrame(p,
                     columns=['N1', 'N2']),
        how='right'
    )

out = (df.groupby(cols)[list(df)]
         .apply(add_missing).drop(columns=cols)
         .reset_index(cols)[list(df)]
      )

Output:

      ID  N1  N2  SUB_GROUP  CHUNK
0  ID001   1   1          2      2
1  ID001   1   2          2      2
2  ID001   2   1          2      2
3  ID001   2   2          2      2
0  ID002   1   1          1      4
1  ID002   1   2          1      4
2  ID002   1   3          1      4
3  ID002   1   4          1      4
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading