lets say i have this table
df = pd.DataFrame({
'name':["a","b","c"],
'from':[1,1,2],
'to':[3,4,3]
})
what i am trying is to have this
df_result = pd.DataFrame({
'name':["a","a","a","b","b","b","b","c","c"],
'key':[1,2,3,1,2,3,4,2,3]
})
any ideas? i could simply use the melt function but then how do i fill the missing values?
>Solution :
You can create helper column key by range and then DataFrame.explode:
df = (df.assign(key = df.apply(lambda x: range(x['from'], x['to'] + 1), 1))
.explode('key')[['name','key']]
.reset_index(drop=True))
print (df)
name key
0 a 1
1 a 2
2 a 3
3 b 1
4 b 2
5 b 3
6 b 4
7 c 2
8 c 3
Or if need better performance use Index.repeat with DataFrame.loc and then use GroupBy.cumcount with add column from for column key:
df = df.assign(key = df['to'].add(1).sub(df['from']))
df = df.loc[df.index.repeat(df['key'])]
df['key'] = df.groupby(level=0).cumcount().add(df['from'])
df = df[['name','key']].reset_index(drop=True)
print (df)
name key
0 a 1
1 a 2
2 a 3
3 b 1
4 b 2
5 b 3
6 b 4
7 c 2
8 c 3