Ungroup pandas dataframe after bfill

I’m trying to write a function that backfills the columns of a dataframe that meet a condition. The backfill should only be done within groups. However, I am having a hard time getting the groupby object to ungroup. I have tried reset_index, as in the example below, but that raises an AttributeError.

Accessing the original df through result.obj doesn’t give the updated values, because the groupby bfill has no inplace option.

def upfill(df:DataFrameGroupBy)->DataFrameGroupBy:
    for column in df.obj.columns:
        if column.startswith("x"):
            df[column].bfill(axis="rows", inplace=True)
    return df 

Assigning the dataframe column in the function doesn’t work either, because a groupby object doesn’t support item assignment.

def upfill(df:DataFrameGroupBy)->DataFrameGroupBy:
    for column in df.obj.columns:
        if column.startswith("x"):
            df[column] = df[column].bfill()
    return df 

The test I’m trying to get to pass:


def test_upfill():
    df = DataFrame({
        "id":[1,2,3,4,5],
        "group":[1,2,2,3,3],
        "x_value": [4,4,None,None,5],
    })
    grouped_df = df.groupby("group")
    result = upfill(grouped_df)
    result.reset_index()
    assert result["x_value"].equals(Series([4,4,None,5,5]))


Solution:

Use the ‘transform’ method on the grouped DataFrame. Its result keeps the same index as the original dataframe, so no ungrouping is needed:

import pandas as pd

def test_upfill():
    df = pd.DataFrame({
        "id":[1,2,3,4,5],
        "group":[1,2,2,3,3],
        "x_value": [4,4,None,None,5],
    })
    result = df.groupby("group").transform(lambda x: x.bfill())
    assert result["x_value"].equals(pd.Series([4,4,None,5,5]))

test_upfill()

See the pandas documentation for more information about the transform method on GroupBy objects.
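The question's upfill only touches columns whose names start with "x", while the transform call above is applied to every non-grouping column. A minimal sketch of one way to keep that condition follows. It assumes the function takes the plain dataframe rather than a groupby object, and the group_col and prefix parameters are hypothetical names added for illustration:

```python
import pandas as pd


def upfill(df: pd.DataFrame, group_col: str = "group", prefix: str = "x") -> pd.DataFrame:
    """Backfill, within each group, the columns whose name starts with `prefix`.

    `group_col` and `prefix` are illustrative parameters, not part of the
    original question. Column names are assumed to be strings.
    """
    result = df.copy()  # leave the caller's dataframe untouched
    target_cols = [
        col for col in result.columns
        if col != group_col and col.startswith(prefix)
    ]
    if target_cols:
        # The groupby bfill returns a frame aligned to the original index,
        # so it can be assigned straight back without any "ungrouping".
        result[target_cols] = result.groupby(group_col)[target_cols].bfill()
    return result


df = pd.DataFrame({
    "id": [1, 2, 3, 4, 5],
    "group": [1, 2, 2, 3, 3],
    "x_value": [4, 4, None, None, 5],
})
result = upfill(df)
```

Unlike the bare transform call, this version returns the full dataframe, including the "id" and "group" columns, with only the matching columns backfilled.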
