Check columns for groups of strings, replace with 1 if they exist 0 if they do not – python, pandas, logical operators

by MR

January 31, 2022

I’m trying to search for a set of strings in a column of a pandas DataFrame and replace each cell with 1 if it contains one of the strings and 0 if it does not.

Per the example below, this works fine on the first pass:

import numpy as np
import pandas as pd

df = pd.DataFrame({'ID':[1,2,3,4], 'Event':['1 Day', '2 Days','3 Days','4 Days']})
df['Event'] = np.where(df['Event'].str.contains('3 Days|4 Days'),1,df['Event'])

df

but when I try and apply the opposite logic and replace the instances where the strings do not exist:

df = pd.DataFrame({'ID':[1,2,3,4], 'Event':['1 Day', '2 Days','3 Days','4 Days']})
df['Event'] = np.where(df['Event'].str.contains('3 Days|4 Days'),1,df['Event'])
df['Event'] = np.where(~df['Event'].str.contains('3 Days|4 Days'),0,df['Event'])  

df

I get this error – TypeError: bad operand type for unary ~: 'float'

I tried using logical operators so the actions would occur simultaneously:

df = pd.DataFrame({'ID':[1,2,3,4], 'Event':['1 Day', '2 Days','3 Days','4 Days']})
df['Event'] = np.where(df['Event'].str.contains('3 Days|4 Days'),1,df['Event']) & np.where(~df['Event'].str.contains('3 Days|4 Days'),0,df['Event'])  

df

but received this error… TypeError: unsupported operand type(s) for &: 'str' and 'int'

What I’m ultimately trying to achieve is a df in which every cell that contains one of the strings is replaced with 1 and every other cell is replaced with 0, so I can analyze it.

>Solution :

After this line:

df['Event'] = np.where(df['Event'].str.contains('3 Days|4 Days'),1, df['Event'])

df['Event'] holds a mix of strings and the integer 1, which is not a string, so the second time you check (inside np.where):

df['Event'].str.contains('3 Days|4 Days')

it returns:

0    False
1    False
2      NaN
3      NaN
Name: Event, dtype: object

NaN is a float, and the unary ~ operator is not defined for floats, so applying ~ to this result raises the TypeError.
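The failure can be reproduced in a small sketch. It also shows that passing na=False to str.contains (a documented pandas parameter) makes the mask boolean so that ~ works, but the two-pass approach still does not give the desired result: the cells already set to 1 no longer contain the search strings, so the second pass would overwrite them with 0. The variable names mask, safe_mask and second_pass are illustrative only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'ID': [1, 2, 3, 4],
                   'Event': ['1 Day', '2 Days', '3 Days', '4 Days']})

# First pass from the question: the column now mixes strings and the integer 1.
df['Event'] = np.where(df['Event'].str.contains('3 Days|4 Days'), 1, df['Event'])

# Non-string cells produce NaN, so this mask is not a clean boolean Series.
mask = df['Event'].str.contains('3 Days|4 Days')
print(mask.isna().tolist())

# na=False maps those NaN results to False, so ~ no longer raises.
safe_mask = df['Event'].str.contains('3 Days|4 Days', na=False)

# The error is gone, but every row now counts as "not matching",
# so the 1s written by the first pass would be replaced with 0 as well.
second_pass = np.where(~safe_mask, 0, df['Event'])
print(list(second_pass))
```

This is why a single np.where call on the original strings is the simpler fix.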

To get the desired outcome, simply use np.where once where you select 1 if True, 0 otherwise:

df['Event'] = np.where(df['Event'].str.contains('3 Days|4 Days'), 1, 0)

Output:

   ID  Event
0   1      0
1   2      0
2   3      1
3   4      1
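As an alternative sketch (not part of the original answer), the boolean mask can be cast to integers directly with astype(int), which skips np.where. The targets list is a hypothetical stand-in for the "set of strings" in the question; re.escape is used on the assumption that the strings should be matched literally rather than as regular expressions.

```python
import re
import pandas as pd

df = pd.DataFrame({'ID': [1, 2, 3, 4],
                   'Event': ['1 Day', '2 Days', '3 Days', '4 Days']})

# Hypothetical list of strings to look for.
targets = ['3 Days', '4 Days']

# Join the escaped strings into one alternation pattern.
pattern = '|'.join(re.escape(t) for t in targets)

# True/False becomes 1/0; na=False treats missing values as "no match".
df['Event'] = df['Event'].str.contains(pattern, na=False).astype(int)
print(df)
```

Both versions run a single check against the original strings, so the mixed-type column that caused the error never arises.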
