How to insert a list into a pandas DataFrame

I have the following dataframe:

id  rule1 rule2 rule3
1   True  True  False
2   True  True  True
3   False False False
4   False True  False
5   True  False True
..

and a dictionary:

{'rule1': 'Rule one', 'rule2': 'Rule two', 'rule3': 'Rule three'}

I want to add a column, list_of_rules, that holds for each row the list of rule names from the dictionary whose columns are True in that row:

id  rule1 rule2 rule3  list_of_rules
1   True  True  False  ['Rule one', 'Rule two']
2   True  True  True   ['Rule one', 'Rule two', 'Rule three']
3   False False False  ['']
4   False True  False  ['Rule two']
5   True  False True   ['Rule one', 'Rule three']
..

So far, I have the following solution:

df.loc[df['rule1'] == True, 'rule1'] = 'Rule one'
df.loc[df['rule2'] == True, 'rule2'] = 'Rule two'
df.loc[df['rule3'] == True, 'rule3'] = 'Rule three'

df.loc[df['rule1'] == False, 'rule1'] = ''
df.loc[df['rule2'] == False, 'rule2'] = ''
df.loc[df['rule3'] == False, 'rule3'] = ''

df['list_of_rules'] = df[['rule1', 'rule2', 'rule3']].apply("-".join, axis=1).str.strip('-').str.split('-')

df

which gives the following output:

id  rule1 rule2 rule3  list_of_rules
1   True  True  False  ['Rule one', 'Rule two']
2   True  True  True   ['Rule one', 'Rule two', 'Rule three']
3   False False False  ['']
4   False True  False  ['Rule two']
5   True  False True   ['Rule one', , 'Rule three']
..

Is there a way to fix the fifth row so that there are no double commas? I would also like to use the dictionary above directly.

Thank you in advance

Solution:

Try this, using a little trick with pandas.DataFrame.dot:

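import pandas as pd

# Rebuild the sample dataframe from the question.
data_dict = {'id': {0: 1, 1: 2, 2: 3, 3: 4, 4: 5},
             'rule1': {0: True, 1: True, 2: False, 3: False, 4: True},
             'rule2': {0: True, 1: True, 2: False, 3: True, 4: False},
             'rule3': {0: False, 1: True, 2: False, 3: False, 4: True}}
df = pd.DataFrame(data_dict)
# Move id into the index so that only the boolean rule columns remain.
df = df.set_index('id')
d = {'rule1': 'Rule one', 'rule2': 'Rule two', 'rule3': 'Rule three'}
# Rename the columns to their labels from the dictionary.
dfr = df.rename(columns=d)
# True * 'label-' keeps the label and False * 'label-' gives '', so the dot
# product concatenates the labels of the True columns in each row.
df['list_of_rules'] = dfr.dot(dfr.columns+'-').str.strip('-').str.split('-')
df.reset_index()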

Output:

   id  rule1  rule2  rule3                     list_of_rules
0   1   True   True  False              [Rule one, Rule two]
1   2   True   True   True  [Rule one, Rule two, Rule three]
2   3  False  False  False                                []
3   4  False   True  False                        [Rule two]
4   5   True  False   True            [Rule one, Rule three]
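
If the rule labels could ever contain a '-' themselves, the join-and-split step above would cut them apart. Also, as far as I can tell, the all-False row holds [''] (one empty string, which pandas prints as []) rather than a truly empty list, because splitting an empty string yields one empty string. As a rough alternative sketch, not the dot trick itself, a plain list comprehension can read the dictionary directly and needs no delimiter. The name rule_cols is only an illustrative helper, and the snippet rebuilds the sample data so that it runs on its own:

```python
import pandas as pd

# Same sample data as in the question, rebuilt so this snippet is self-contained.
df = pd.DataFrame({
    'id': [1, 2, 3, 4, 5],
    'rule1': [True, True, False, False, True],
    'rule2': [True, True, False, True, False],
    'rule3': [False, True, False, False, True],
})
d = {'rule1': 'Rule one', 'rule2': 'Rule two', 'rule3': 'Rule three'}

# rule_cols is a hypothetical helper name: the rule columns, in dictionary order.
rule_cols = list(d)

# For each row, keep the label of every rule column whose value is True.
df['list_of_rules'] = [
    [d[col] for col, flag in zip(rule_cols, row) if flag]
    for row in df[rule_cols].to_numpy()
]
print(df)
```

With this version the row where every rule is False gets an empty list, and rows such as the fifth one contain only the labels that are actually True.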