Applying conditional statements on lists stored in a DataFrame cell

I would like to create a column that holds the result of applying boolean logic to a list stored in another column.

import pandas as pd
import numpy as np
d = {'202201': [7180516.0, 4868058.0], '202202': [433433740.0, 452632806.0], '202203': [5444119.0, 10000000.0]}
df = pd.DataFrame(data=d)

# Store each row's values as a list
df['seq'] = df.agg(list, axis=1)
# Or store them as a NumPy array instead:
# df['seq'] = df.agg(np.array, axis=1)
df

The desired output is a new column, df['seqToFs'], holding a list of True/False values that marks which values in df['seq'] are greater than 8000000.

import pandas as pd
import numpy as np
d = {'202201': [7180516.0, 4868058.0], '202202': [433433740.0, 452632806.0], '202203': [5444119.0, 10000000.0], 
     'seq':[[7180516.0,433433740.0,5444119.0],[4868058.0,452632806.0,10000000.0]], 'seqToFs':[[False,True,False],[False,True,True]]}
df = pd.DataFrame(data=d)
df

Is it better to make df['seq'] a list or an np.array for performance?

My end goal is to analyze the sequential order of values that meet a condition. Is there a better way to do this kind of analysis than storing lists in a DataFrame?

Here is an example of the framework I was trying to apply to each row (not my code):

original_prices = [1.25, -9.45, 10.22, 3.78, -5.92, 1.16]
prices = [True if i > 0 else False for i in original_prices]
prices

Here the original_prices list would be replaced by each row's list in df['seq'], and prices would become the new column df['seqToFs']. I am getting errors because of the list format.

Help would be much appreciated.

>Solution :

You can use the normal > operator and then use agg or apply to get the desired output:

(df > 8000000).apply(list, axis=1)

0    [False, True, False]
1     [False, True, True]
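
On the question of whether lists are the best container for analyzing sequences: the comparison already produces a boolean DataFrame, and keeping that mask as a frame avoids object-dtype cells, which force pandas to fall back to Python-level work. The following is only a sketch under assumptions: the variable `mask` and the helper `longest_true_run` are hypothetical names, and "longest run of consecutive values above the threshold" is a guess at the kind of sequential analysis intended.

```python
import pandas as pd

df = pd.DataFrame({
    '202201': [7180516.0, 4868058.0],
    '202202': [433433740.0, 452632806.0],
    '202203': [5444119.0, 10000000.0],
})

# Boolean DataFrame with the same shape as df; no lists involved.
mask = df > 8000000

# Count how many values per row exceed the threshold.
hits_per_row = mask.sum(axis=1)


def longest_true_run(flags):
    """Hypothetical helper: length of the longest run of consecutive True values."""
    best = current = 0
    for flag in flags:
        current = current + 1 if flag else 0
        best = max(best, current)
    return best


# Apply the helper row by row (columns are assumed to be in time order).
longest_run = mask.apply(longest_true_run, axis=1)

print(hits_per_row.tolist())  # [1, 2]
print(longest_run.tolist())   # [1, 2]
```

The mask can still be turned into a column of lists at the end with `.apply(list, axis=1)` if that format is needed.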

Example with the question's data:

df = pd.DataFrame({'202201': [7180516.0, 4868058.0], '202202': [433433740.0, 452632806.0], '202203': [5444119.0, 10000000.0]})
df['seqToFs'] = (df > 8000000).apply(list, axis=1)
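
If the frame already holds the `seq` column from the question, that list column gets in the way of a whole-frame comparison, since a list cannot be compared to a number. A minimal sketch of two ways around it, assuming the three month columns are the only numeric data (the names `month_cols`, `threshold`, and `seqToFs_from_seq` are illustrative and not from the original post):

```python
import pandas as pd

df = pd.DataFrame({
    '202201': [7180516.0, 4868058.0],
    '202202': [433433740.0, 452632806.0],
    '202203': [5444119.0, 10000000.0],
})
month_cols = ['202201', '202202', '202203']  # assumed numeric columns
threshold = 8000000

# Recreate the question's list column from the month columns.
df['seq'] = df[month_cols].apply(list, axis=1)

# Option 1: compare only the numeric columns, then collect each row into a list.
df['seqToFs'] = (df[month_cols] > threshold).apply(list, axis=1)

# Option 2: apply the question's list comprehension to each list in 'seq'.
df['seqToFs_from_seq'] = df['seq'].apply(lambda row: [v > threshold for v in row])

print(df['seqToFs'].tolist())
# [[False, True, False], [False, True, True]]
```

Both options give the same lists here. Option 1 does the comparison in one vectorized step, while Option 2 loops over every list in Python.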