Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Collapsing rows based on condition in pandas

I have a dataframe like so:

Input:

Restaurant     IP_Id  IP_Time                 S_Code
McDonalds      NaN    2021-03-28 03:01:05     4
McDonalds      NaN    2021-03-28 03:01:05     5
McDonalds      NaN    2021-03-28 03:01:05     6
McDonalds      101    2021-03-28 03:01:05     4
McDonalds      101    2021-03-28 03:01:05     5
Wendys         101    2021-03-28 03:01:05     4
Wendys         101    2021-03-28 03:01:05     4

I want to transform the dataframe to the following. For every instance where the IP_Id is the same for a specific restaurant at a specific time, I want to collapse the row and include a comma separated value for the S_Code.

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

Output:

Restaurant     IP_Id  IP_Time                 S_Code
McDonalds      NaN    2021-03-28 03:01:05     4
McDonalds      NaN    2021-03-28 03:01:05     5
McDonalds      NaN    2021-03-28 03:01:05     6
McDonalds      101    2021-03-28 03:01:05     4, 5
Wendys         101    2021-03-28 03:01:05     4

I’ve tried grouping by Restaurant, IP_Id, and IP_Time, but it moves the records with Nan as the IP_Id

>Solution :

It’s actually quite simple. You’ll need a special condition to separate the NaNs into their own groups, which you can generate with a combination of shift and cumsum, and then call agg('unique'):

g = df.groupby([df['IP_Id'].ne(df['IP_Id'].shift(1)).cumsum(), df['Restaurant'], df['IP_Time']])
df = g['S_Code'].agg('unique').str.join(', ').reset_index().assign(IP_Id=g['IP_Id'].first().reset_index(drop=True))

Output:

>>> df
   IP_Id Restaurant             IP_Time S_Code
0    NaN  McDonalds 2021-03-28 03:01:05      4
1    NaN  McDonalds 2021-03-28 03:01:05      5
2    NaN  McDonalds 2021-03-28 03:01:05      6
3  101.0  McDonalds 2021-03-28 03:01:05   4, 5
4  101.0     Wendys 2021-03-28 03:01:05      4
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading