I have the following DataFrame:
import numpy as np
import pandas as pd
df = pd.DataFrame({'Original': [92,93,94,95,100,101,102],
'Sub_90': [99,98,99,100,102,101,np.nan],
'Sub_80': [99,98,99,100,102,np.nan,np.nan],
'Gen_90': [99,98,99,100,102,101,101],
'Gen_80': [99,98,99,100,102,101,100]})
I would like to create the following dictionary:
{
'Gen_90': 'Original',
'Sub_90': 'Gen_90',
'Gen_80': 'Original',
'Sub_80': 'Gen_80',
}
using regex (because in my original data I also have Gen_70, Gen_60, ..., Gen_10 and Sub_70, Sub_60, ..., Sub_10).
So I would like to pair each Sub column with the Gen column that has the same _number, and pair every Gen column with Original.
How could I do that?
>Solution :
You can do:
gen_cols = df.filter(like='Gen_').columns
sub_cols = df.filter(like='Sub_').columns
d = dict(zip(sorted(sub_cols), sorted(gen_cols)))
d.update({g : 'Original' for g in gen_cols})
print(d)
{'Sub_80': 'Gen_80',
'Sub_90': 'Gen_90',
'Gen_90': 'Original',
'Gen_80': 'Original'}
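
Note that zipping the two sorted lists only lines up correctly when every Sub_N column has a Gen_N column with the same number. Since the question asks for regex, here is a minimal sketch of a regex-based alternative that matches on the number explicitly, so a Sub column without a matching Gen column is skipped instead of being paired with the wrong one. The helper name build_pairs and the pattern are illustrative assumptions, not part of the answer above; in practice you would pass df.columns instead of the hard-coded list.

```python
import re

# Hypothetical helper (not from the original answer): build the mapping
# by parsing each column name with a regex.
PATTERN = re.compile(r'^(Gen|Sub)_(\d+)$')

def build_pairs(columns, base='Original'):
    columns = list(columns)
    known = set(columns)
    pairs = {}
    for col in columns:
        m = PATTERN.match(col)
        if m is None:
            continue  # e.g. 'Original' itself does not match the pattern
        prefix, number = m.groups()
        if prefix == 'Gen':
            # Every Gen_N column maps to the base column.
            pairs[col] = base
        elif f'Gen_{number}' in known:
            # A Sub_N column maps to Gen_N only if that column exists.
            pairs[col] = f'Gen_{number}'
    return pairs

# Column names from the question; with the DataFrame, use build_pairs(df.columns).
columns = ['Original', 'Sub_90', 'Sub_80', 'Gen_90', 'Gen_80']
d = build_pairs(columns)
print(d)
```

The resulting dictionary contains the same four pairs as the one above; only the key order may differ, because it follows the order of the columns.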