I am trying to convert the following column into new rows:
| Id | Prices |
|---|---|
| 001 | ["March:59", "April:64", "May:62"] |
| 002 | ["Jan:55", ETC] |
to
| id | date | price |
|---|---|---|
| 001 | March | 59 |
| 001 | April | 64 |
| 001 | May | 62 |
| 002 | Jan | 55 |
The date:price pairs aren’t stored in a traditional dictionary format like the following solution:
Convert dictionary keys to rows and show all the values in single column using Pandas
I managed to get the key:value pairs into individual rows like:
| Id | Prices |
|---|---|
| 001 | March:59 |
| 001 | April:64 |
And could split these into two columns using string manipulation but this feels inefficient instead of actually using the key:value pairs. Can anyone help please?
>Solution :
If you have valid lists, explode and split:
df = pd.DataFrame({'Id': ['001', '002'],
'Prices': [["March:59", "April:64", "May:62"], ["Jan:55"]]})
out = df.explode('Prices')
out[['date', 'price']] = out.pop('Prices').str.split(':', expand=True)
If you have strings, str.extractall with a regex and join:
df = pd.DataFrame({'Id': ['001', '002'],
'Prices': ['["March:59", "April:64", "May:62"]', '["Jan:55"]']})
out = (df.drop(columns='Prices')
.join(df['Prices'].str.extractall(r'(?P<date>[^":]+):(?P<price>[^":]+)')
.droplevel('match'))
)
Output:
Id date price
0 001 March 59
0 001 April 64
0 001 May 62
1 002 Jan 55
regex demo for the second approach.