Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Python: How to explode two columns and set prefix

I have a DataFrame as follows:

time         asks                                                                    bids
2022-01-01   [{'price':'0.99', 'size':'32213'}, {'price':'0.98', 'size':'23483'}]    [{'price':'1.0', 'size':'23142'}, {'price':'0.99', 'size':'6436'}]
2022-01-02   [{'price':'0.99', 'size':'33434'}, {'price':'0.98', 'size':'33333'}]    [{'price':'1.0', 'size':'343'}, {'price':'0.99', 'size':'2342'}]

...
2022-01-21   [{'price':'0.98', 'size':'32333'}, {'price':'0.98', 'size':'23663'}]    [{'price':'1.0', 'size':'23412'}]

I want to explode the asks and bids columns and set prefixes to get the following:

time        asks_price  asks_size   bids_price   bids_size
2022-01-01        0.99      32213          1.0       23142
2022-01-01        0.98      23483         0.99        6436
2022-01-02        0.99      33434          1.0         343
2022-01-02        0.98      33333         0.99        2342
...
2022-01-21        0.98      32333          1.0       23412
2022-01-21        0.98      23663          NaN         NaN

Notice how the last row has NaN values under the bids_price and bids_size columns since there are no corresponding values.

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

How can this be achieved using pandas?

EDIT: Here is a snippet of the data:

{'time': {0: '2022-05-07T00:00:00.000000000Z',
  1: '2022-05-07T01:00:00.000000000Z',
  2: '2022-05-07T02:00:00.000000000Z',
  3: '2022-05-07T03:00:00.000000000Z',
  4: '2022-05-07T04:00:00.000000000Z'},
 'asks': {0: [{'price': '0.9999', 'size': '4220492'},
   {'price': '1', 'size': '2556759'},
   {'price': '1.0001', 'size': '941039'},
   {'price': '1.0002', 'size': '458602'},
   {'price': '1.0003', 'size': '257955'},

>Solution :

Try to use .explode():

df = df.explode(['asks','bids'])

df['asks_price'] = [dict(i)['price'] for i in df['asks']]
df['asks_bids'] = [dict(i)['size'] for i in df['asks']]
df['bids_price'] = [dict(i)['price'] for i in df['bids']]
df['bids_size'] = [dict(i)['size'] for i in df['bids']]
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading