List the differences between two dataframe columns ignoring case

October 20, 2023

I am trying to compare a column from two DataFrames and return the values that differ, ignoring case. Here is what I have so far:

import pandas as pd

if __name__ == "__main__":
    data1={'Name':['Karan','Rohit','Sahil','Aryan']}
    data2={'Name':['karan','Rohit','Sahil']}

    df1=pd.DataFrame(data1)
    df2=pd.DataFrame(data2)

    print(list(set(df1['Name']).difference(df2['Name'])))

This code prints ['Karan', 'Aryan'] (in arbitrary order, since sets are unordered). How do I modify it to ignore case, so that karan and Karan are recognized as a match and only Aryan is returned?

I don’t want to use the following, because it returns aryan and I want to keep the original capitalization of the value. In my real data the values are not first names, so I can't simply capitalize the first letter again after taking the difference.

    print(list(set(df1['Name'].str.lower()).difference(df2['Name'].str.lower())))

Solution:

To perform a case-insensitive comparison, use str.casefold. Note that this returns the casefolded values (aryan here), not the original capitalization:

print(list(set(df1['Name'].str.casefold()).difference(df2['Name'].str.casefold())))
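
A minimal sketch of how casefold differs from lower, using made-up sample strings that are not from the question. The German ß is the usual example: lower leaves it unchanged, while casefold maps it to ss, so the two spellings compare equal.

```python
import pandas as pd

# Hypothetical sample values, chosen only to show the difference.
s = pd.Series(['Straße', 'STRASSE', 'Aryan'])

lowered = s.str.lower().tolist()
folded = s.str.casefold().tolist()

print(lowered)  # ['straße', 'strasse', 'aryan']
print(folded)   # ['strasse', 'strasse', 'aryan']
```

For plain ASCII data such as the names in the question, both methods give the same result.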

If you want to keep the original case, use boolean indexing with isin:

df1.loc[~df1['Name'].str.casefold().isin(df2['Name'].str.casefold()), 'Name'].unique()

Output:

array(['Aryan'], dtype=object)
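
If a plain list is preferred over a NumPy array, the same idea can be wrapped in a small helper. This is only a sketch: the function name caseless_difference is hypothetical, and it assumes both columns contain strings only.

```python
import pandas as pd

def caseless_difference(left: pd.Series, right: pd.Series) -> list:
    """Return values of `left` with no caseless match in `right`, original case kept."""
    # Build casefolded keys for matching; the original values are left untouched.
    left_key = left.str.casefold()
    right_key = right.str.casefold()
    # Keep rows of `left` whose key does not appear among the keys of `right`.
    mask = ~left_key.isin(right_key)
    # drop_duplicates removes exact repeats and preserves first-seen order.
    return left[mask].drop_duplicates().tolist()

df1 = pd.DataFrame({'Name': ['Karan', 'Rohit', 'Sahil', 'Aryan']})
df2 = pd.DataFrame({'Name': ['karan', 'Rohit', 'Sahil']})

result = caseless_difference(df1['Name'], df2['Name'])
print(result)  # ['Aryan']
```

Unlike the set-based version, this keeps the order in which values appear in df1. Values in df1 that differ from each other only by case (for example Aryan and ARYAN) are both returned, since only exact duplicates are dropped.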

pandas

by MR
