I have data like the following. Each row has a colour and two associated numbers:
| Color | num1 | num2 |
|---|---|---|
| red | 1 | 2 |
| red | 1 | na |
| blue | 2 | na |
| blue | 2 | 3 |
| yellow | 1 | 4 |
| yellow | 1 | na |
I want to use forward fill on the num2 column, but only forward fill within the same colors.
For example, only fill the num2 of the first blue row with the previous row’s num2, if that previous row was also blue.
Expected result:
| Color | num1 | num2 |
|---|---|---|
| red | 1 | 2 |
| red | 1 | 2 |
| blue | 2 | na |
| blue | 2 | 3 |
| yellow | 1 | 4 |
| yellow | 1 | 4 |
I have tried the following code:
for color in df['color'].unique():
    df[df['color'] == color]['num2'] = df[df['color'] == color]['num2'].fillna(method='ffill')
I have also tried passing inplace=True, and that does not work either.
>Solution :
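import numpy as np
df = df.replace('na', np.nan)  # turn the 'na' placeholders into real missing values
df['num2'] = df.groupby('Color')['num2'].ffill()  # forward fill within each Color group only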
Output:
>>> df
    Color  num1 num2
0     red     1    2
1     red     1    2
2    blue     2  NaN
3    blue     2    3
4  yellow     1    4
5  yellow     1    4
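For anyone who wants to reproduce this, here is a minimal self-contained sketch. It rebuilds the sample table by hand, so the literal 'na' strings and the column dtypes are assumptions about the original data. It also uses pd.to_numeric(..., errors='coerce') instead of replace, purely to get a numeric column. The names by_color, run_id, by_run and fixed_loop are illustrative only. The loop in the question most likely fails because df[mask]['num2'] = ... is chained indexing: the assignment lands on a temporary copy and df itself is never modified. The last part shows one way to repair that loop with .loc, although the groupby version is shorter.

```python
import pandas as pd

# Sample data rebuilt from the question; 'na' is assumed to be a literal string.
df = pd.DataFrame({
    "Color": ["red", "red", "blue", "blue", "yellow", "yellow"],
    "num1": [1, 1, 2, 2, 1, 1],
    "num2": [2, "na", "na", 3, 4, "na"],
})

# Make num2 numeric; anything that cannot be parsed (here 'na') becomes NaN.
df["num2"] = pd.to_numeric(df["num2"], errors="coerce")

# Forward fill within each Color group, so a value never crosses into another colour.
by_color = df.groupby("Color")["num2"].ffill()

# Variant, assuming a colour may appear in several separate blocks:
# label each run of consecutive equal colours and fill within runs only.
run_id = (df["Color"] != df["Color"].shift()).cumsum()
by_run = df.groupby(run_id)["num2"].ffill()

# The loop from the question, repaired: write through .loc on a real Series
# instead of assigning to the temporary copy produced by df[mask]['num2'].
fixed_loop = df["num2"].copy()
for color in df["Color"].unique():
    mask = df["Color"] == color
    fixed_loop.loc[mask] = df.loc[mask, "num2"].ffill()

df["num2"] = by_color
print(df)
```

On this sample all three approaches give the same column, because every colour sits in one contiguous block. They would differ only if, say, another red row appeared after the blue rows: groupby('Color') would carry the earlier red value forward, while the run-based variant would not.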