Suppose I have a DataFrame like the one below:
import pandas as pd
import numpy as np
d = {'Column 1': [10, 12, 13, 43, np.nan],
     'Column2': [np.nan, 7, np.nan, 49, 8]}
df = pd.DataFrame(d)
I would like to create a third column that takes its values from Column2, falling back to Column 1 wherever Column2 is NaN.
I have found several topics and solutions where the condition depends on values in a single column, but none where the result has to draw on more than one column.
>Solution :
You could use mask:
df['Column3'] = df['Column2'].mask(df['Column2'].isna(), df['Column 1'])
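For comparison, the same fallback can probably be written with fillna or combine_first, which some readers find easier to scan than mask. The sketch below rebuilds the frame from the question and computes the column three ways; the variable names via_mask, via_fillna and via_combine are only for illustration and are not part of the original answer.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Column 1': [10, 12, 13, 43, np.nan],
    'Column2': [np.nan, 7, np.nan, 49, 8],
})

# mask: where Column2 is NaN, substitute the value from Column 1
via_mask = df['Column2'].mask(df['Column2'].isna(), df['Column 1'])

# fillna: fill the NaNs in Column2 with the index-aligned values of Column 1
via_fillna = df['Column2'].fillna(df['Column 1'])

# combine_first: keep Column2, fall back to Column 1 where Column2 is NaN
via_combine = df['Column2'].combine_first(df['Column 1'])

df['Column3'] = via_fillna
```

All three align on the index, so they should agree as long as both columns come from the same frame.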
A more generic version (uses any number of columns) would be to take the last valid value per row:
df['Column3'] = df.ffill(axis=1).iloc[:, -1]
output:
   Column 1  Column2  Column3
0      10.0      NaN     10.0
1      12.0      7.0      7.0
2      13.0      NaN     13.0
3      43.0     49.0     49.0
4       NaN      8.0      8.0
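Note that the ffill one-liner takes the rightmost non-NaN value across every column in the frame, so the result depends on column order and on which columns happen to be present. A minimal sketch of a helper that makes the priority explicit is shown below; coalesce_columns is a hypothetical name chosen for this example, not a pandas function.

```python
import numpy as np
import pandas as pd


def coalesce_columns(frame, columns):
    """Return, per row, the first non-NaN value among `columns` (priority order)."""
    # Back-filling along the columns pulls each row's first valid value
    # into the leftmost selected column, which is then returned.
    return frame[columns].bfill(axis=1).iloc[:, 0]


df = pd.DataFrame({
    'Column 1': [10, 12, 13, 43, np.nan],
    'Column2': [np.nan, 7, np.nan, 49, 8],
})

# Column2 has priority; Column 1 is the fallback
df['Column3'] = coalesce_columns(df, ['Column2', 'Column 1'])
print(df)
```

Listing the columns explicitly also keeps an existing Column3 (or any unrelated column) from leaking into the result if the line is run a second time.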

