Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Decoding a column and create new dataframe with the new data

I have a dataframe with the following structure:

Timestamp         Payload                                                                                                     
2022-03-16 18:00  3c313034393536353534303b31332e32333534313b303b302e303637313b302e303138313b332e33353b36322e31353338323e
2022-03-16 18:10  3c313034393536363134303b31332e32313134383b303b302e303637323b302e3031383b342e3632353b37332e34353133313e
2022-03-16 18:20  3c313034393536363734303b31332e31393630313b303b302e303637323b302e3031383b342e303132353b36322e34343238353e     
2022-03-16 18:30  3c313034393537353734303b31332e31303137343b303b302e3036373b302e303137393b342e383132353b37352e343931393e     
2022-03-16 18:40  3c313034393538323934303b31332e30353934353b303b302e303636373b302e303137393b332e313337353b37352e37313830393e  

What I would like is to create a new dataframe where those two fields appear, and also the decoding of the payload column. The difficulty I have is that the decoding of that set of characters generates a set of data. For example if we decode the following line,

3c313034393536353534303b31332e32333534313b303b302e303637313b302e303138313b332e33353b36322e31353338323e

we obtain,

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

<1049565540;13.23541;0;0.0671;0.0181;3.35;62.15382>

Where the characters <, > and ; are separators. So my goal would be to obtain something similar to the following dataframe,

2022-03-16 18:00  1049565540  13.23541  0  0.0671  0.0181  3.3500  62.15382  3c313034393536353534303b31332e32333534313b303b302e303637313b302e303138313b332e33353b36322e31353338323e
2022-03-16 18:10  1049566140  13.21148  0  0.0672  0.0180  4.6250  73.45131  3c313034393536363134303b31332e32313134383b303b302e303637323b302e3031383b342e3632353b37332e34353133313e
2022-03-16 18:20  1049566740  13.19601  0  0.0672  0.0180  4.0125  62.44285  3c313034393536363734303b31332e31393630313b303b302e303637323b302e3031383b342e303132353b36322e34343238353e     
2022-03-16 18:30  1049575740  13.10174  0  0.0670  0.0179  4.8125  75.49190  3c313034393537353734303b31332e31303137343b303b302e3036373b302e303137393b342e383132353b37352e343931393e     
2022-03-16 18:40  1049582940  13.05945  0  0.0667  0.0179  3.1375  75.71809  3c313034393538323934303b31332e30353934353b303b302e303636373b302e303137393b332e313337353b37352e37313830393e  

Actually I’m doing the following code, it’s ok but I’m sure I’m not taking advantage of the potential of the Pandas library.

def myFunc(pay):
    decode = bytearray.fromhex(pay[2:-2]).decode()
    return decode.split(";")
  
output2['datos'] = output2['payload'].apply(myFunc)
payloads = output2['datos'].values
df = pd.DataFrame(payloads.tolist(), columns=None)

Can it be done in a more efficient way?

Thank you very much!

>Solution :

If want working by special functions is possible use .apply like your solution or list comprehension:

L = [bytearray.fromhex(pay[2:-2]).decode().split(";") for pay in output2.pop('Payload')]
df = pd.DataFrame(L, index=output2.index)

Last use join:

df = output2.join(df)
print (df)
          Timestamp           0         1  2       3       4       5         6
0  2022-03-16 18:10  1049566140  13.21148  0  0.0672   0.018   4.625  73.45131
1  2022-03-16 18:20  1049566740  13.19601  0  0.0672   0.018  4.0125  62.44285
2  2022-03-16 18:30  1049575740  13.10174  0   0.067  0.0179  4.8125   75.4919
3  2022-03-16 18:40  1049582940  13.05945  0  0.0667  0.0179  3.1375  75.71809

If need also original column:

L = [bytearray.fromhex(pay[2:-2]).decode().split(";") for pay in output2['Payload']]
df = pd.DataFrame(L, index=output2.index)
df = output2.join(df)
print (df)
          Timestamp                                            Payload  \
0  2022-03-16 18:10  3c313034393536363134303b31332e32313134383b303b...   
1  2022-03-16 18:20  3c313034393536363734303b31332e31393630313b303b...   
2  2022-03-16 18:30  3c313034393537353734303b31332e31303137343b303b...   
3  2022-03-16 18:40  3c313034393538323934303b31332e30353934353b303b...   

            0         1  2       3       4       5         6  
0  1049566140  13.21148  0  0.0672   0.018   4.625  73.45131  
1  1049566740  13.19601  0  0.0672   0.018  4.0125  62.44285  
2  1049575740  13.10174  0   0.067  0.0179  4.8125   75.4919  
3  1049582940  13.05945  0  0.0667  0.0179  3.1375  75.71809  
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading