Home Pandas dataframe check if left part of a string matches another entry in a column

Questions

Pandas dataframe check if left part of a string matches another entry in a column

January 5, 2023

I have the following csv file:

Column1;Column2;Column3;Column4
A;B;6;6200
B;D;5;5000
E;F;6;5100
F;F;6;6200

Now I want to check, if column4 always begins with a ‘6‘, if column3 has an entry 6. In case, where this does not match I want to print a message. Same holds for the case vice versa: In case column4 does not begin with a ‘6‘, but column3 has an entry 6. Both columns are string.

I tried:

if ((df[df["Column3"] == "6"] and df['Column4'].str[0:1] <> "6") or (df[df["Column3"] <> "6"] and df['Column4'].str[0:1] == "6")):
    print("Error")

But doesn’t work. What did I miss?

>Solution :

You can use two boolean masks and check their identity:

out = df[df['Column3'].eq('6') == df['Column4'].str.startswith('6')]

NB. assuming you have strings. If not, add `.astype(str) right after slicing.

out = df[df['Column3'].astype(str).eq('6') == df['Column4'].astype(str).str.startswith('6')]

Output:

0       A       B        6     6200
1       B       D        5     5000
3       F       F        6     6200

printing a message:

m = df['Column3'].eq('6') != df['Column4'].str.startswith('6')
if m.any():
    print(f'Error: row(s) {",".join(m[m].index.astype(str))} are invalid')

Output:

Error: row(s) 2 are invalid

dataframe

byMR

Published January 05, 2023

Add a comment

How to limit by duplicates in a column in mysql?

byMR

January 5, 2023

Questions

Boxplot showes incorrect picture

byMR

January 5, 2023

Questions

Python – Matplotlib table doesn't show of the rows that are expected

byMR

January 5, 2023

Questions

where does the t0000z come from?

byMR

January 5, 2023

Questions

Why are the margins not working properly?

byMR

January 5, 2023

Questions

flatten dataframe columns to multi-indexed

byMR

January 5, 2023

Pandas dataframe check if left part of a string matches another entry in a column

MEDevel.com: Open-source for Healthcare and Education

>Solution :

printing a message:

Like this:

Leave a ReplyCancel reply

Read more

How to limit by duplicates in a column in mysql?

Boxplot showes incorrect picture

Python – Matplotlib table doesn't show of the rows that are expected

where does the t0000z come from?

Why are the margins not working properly?

flatten dataframe columns to multi-indexed

Keep Up to Date with the Most Important News

Pandas dataframe check if left part of a string matches another entry in a column

MEDevel.com: Open-source for Healthcare and Education

>Solution :

printing a message:

Share this:

Like this:

Leave a ReplyCancel reply

Keep Up to Date with the Most Important News

Read more

How to limit by duplicates in a column in mysql?

Boxplot showes incorrect picture

Python – Matplotlib table doesn't show of the rows that are expected

where does the t0000z come from?

Why are the margins not working properly?

flatten dataframe columns to multi-indexed

Discover more from Dev solutions