Manipulate pandas stock level 2 dataframe

September 16, 2022

Manipulate this DataFrame to become single row on multiples columns based on increasing price, any help to do that with pandas or numpy or list dict whatever and needs to be fast its real time snapshot i was trying to do with function and df.apply(lambad

Follow the full dataframe

pd.DataFrame({'price': {0: 1.0012, 1: 1.00115, 2: 1.0011, 3: 1.00105, 4: 1.001, 5: 1.00095, 6: 1.0009, 7: 1.00085, 8: 1.0008, 9: 1.00075, 10: 1.0007, 11: 1.00065, 12: 1.0006, 13: 1.00055, 14: 1.0005, 15: 1.00045, 16: 1.0004, 17: 1.00035, 18: 1.0003, 19: 1.00025},'volume': {0: 45.0, 1: 38.0, 2: 50.0, 3: 42.0, 4: 51.0, 5: 55.0, 6: 28.0, 7: 28.0, 8: 16.0, 9: 2.0, 10: 13.0, 11: 24.0, 12: 41.0, 13: 18.0, 14: 32.0, 15: 38.0, 16: 36.0, 17: 29.0, 18: 78.0, 19: 29.0}, 'type': {0: 'bid', 1: 'bid', 2: 'bid', 3: 'bid', 4: 'bid', 5: 'bid', 6: 'bid', 7: 'bid', 8: 'bid', 9: 'bid', 10: 'ask', 11: 'ask', 12: 'ask', 13: 'ask', 14: 'ask', 15: 'ask', 16: 'ask', 17: 'ask', 18: 'ask', 19: 'ask'}})

for example, the beginning is on index 9 and 10 I need to make single row on multi columns

get the index based on increasing price, 9 columns will become price_ask_1 and index 10 price_bid_1 and volume_ask_1 and volume_bid_1 so on

['price_bid_1', 'price_ask_1', 'volume_bid_1', 'volume_ask_1','price_bid_2', 'price_ask_2', 'volume_bid_2', 'volume_ask_2']

until the last row in this case is 20

>Solution :

For one row DataFrame with flatten MultiIndex use GroupBy.cumcount with DataFrame.stack – ouput is Series, then Series.to_frame with transpose, change ordering of columns and last flatten in map:

df = df.iloc[9:]
print (df)
      price  volume type
9   1.00075     2.0  bid
10  1.00070    13.0  ask
11  1.00065    24.0  ask
12  1.00060    41.0  ask
13  1.00055    18.0  ask
14  1.00050    32.0  ask
15  1.00045    38.0  ask
16  1.00040    36.0  ask
17  1.00035    29.0  ask
18  1.00030    78.0  ask
19  1.00025    29.0  ask

df1 = (df.set_index([df.groupby('type').cumcount().add(1), 'type'])
         .stack()
         .to_frame()
         .T
         .sort_index(level=[0, 2, 1], ascending=[True, True, False],axis=1))

df1.columns = df1.columns.map(lambda x: f'{x[2]}_{x[1]}_{x[0]}')
print(df1)
   price_bid_1  price_ask_1  volume_bid_1  volume_ask_1  price_ask_2  \
0      1.00075       1.0007           2.0          13.0      1.00065   

   volume_ask_2  price_ask_3  volume_ask_3  price_ask_4  volume_ask_4  ...  \
0          24.0       1.0006          41.0      1.00055          18.0  ...   

   price_ask_6  volume_ask_6  price_ask_7  volume_ask_7  price_ask_8  \
0      1.00045          38.0       1.0004          36.0      1.00035   

   volume_ask_8  price_ask_9  volume_ask_9  price_ask_10  volume_ask_10  
0          29.0       1.0003          78.0       1.00025           29.0  

[1 rows x 22 columns]