Imagine I have the following data (a slice as an example):
ID     Salary Component 1  Value1  Salary Component 2  Value2
10000  Basic Salary        22000
10000                              Housing Allowance   13200
How can I combine the rows for each ID so that there is only one row per ID, filling each blank column with the value from another row of the same ID where that column is filled? The result would be:
ID     Salary Component 1  Value1  Salary Component 2  Value2
10000  Basic Salary        22000   Housing Allowance   13200
Thank you for the help!
>Solution :
If you need the first non-missing value per group, you can use the following (GroupBy.first skips missing values, so the blanks are converted to NaN first):
df = df.replace('', np.nan).groupby('ID', as_index=False).first()
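As a minimal sketch of the line above on the sample data from the question: the frame below is rebuilt by hand, and it assumes that blanks in the text columns are empty strings while blanks in the numeric columns are already NaN (the real data may differ).

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question. Assumption: blank text cells are
# empty strings and blank numeric cells are NaN.
df = pd.DataFrame({
    "ID": [10000, 10000],
    "Salary Component 1": ["Basic Salary", ""],
    "Value1": [22000, np.nan],
    "Salary Component 2": ["", "Housing Allowance"],
    "Value2": [np.nan, 13200],
})

# Turn empty strings into NaN, then keep the first non-missing value
# of every column within each ID.
out = df.replace("", np.nan).groupby("ID", as_index=False).first()
print(out)
```

Each ID collapses to a single row, and every column takes its value from whichever row had it filled.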
If a group can have multiple values per column and you need to aggregate them, e.g. by type (mean for numeric columns, comma-joined unique values otherwise):
f = lambda x: x.mean() if np.issubdtype(x.dtype, np.number) else ','.join(x.dropna().unique())
df = df.replace('', np.nan).groupby('ID', as_index=False).agg(f)
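A small sketch of the type-based aggregation, under the same assumption about blanks. The helper name `combine` and the extra "Bonus" row are made up for illustration. Missing values are dropped before joining, because `','.join` fails on NaN.

```python
import numpy as np
import pandas as pd

def combine(col):
    """Hypothetical helper: mean for numeric columns, joined unique text otherwise."""
    if pd.api.types.is_numeric_dtype(col):
        return col.mean()  # mean() ignores NaN by default
    # Drop NaN first, since str.join raises TypeError on non-string items.
    return ",".join(col.dropna().astype(str).unique())

# Illustrative data: one ID with two filled values in the first component.
df = pd.DataFrame({
    "ID": [10000, 10000, 10000],
    "Salary Component 1": ["Basic Salary", "Bonus", ""],
    "Value1": [22000, 24000, np.nan],
    "Salary Component 2": ["", "", "Housing Allowance"],
    "Value2": [np.nan, np.nan, 13200],
})

out = df.replace("", np.nan).groupby("ID", as_index=False).agg(combine)
print(out)
```

Whether a mean is the right way to merge several numeric values depends on the data; sum or max can be swapped in the same place.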