Comparing columns and list elements in a pd.DataFrame

I have a pd.DataFrame that looks something like this:

import pandas as pd

data = {"col_x": ["1234", "5678", "9876", "1111"],
        "col_y": ["1234", "2222", "3333", "1111"],
        "col_grp": [pd.NA, ["5678", "9999"], ["9876", "5555", "1222"], pd.NA]}

df = pd.DataFrame(data)

I want to add another column, valid, that is True when col_x equals col_y or col_x is in col_grp.

I tried this:

def check_validity(row):
    if row["col_x"] == row["col_y"]:
        return True
    if pd.notnull(row["col_grp"]):
        if isinstance(row["col_grp"], list):
            return row["col_x"] in row["col_grp"]
        else:
            return row["col_x"] == row["col_grp"]
    return False

df["valid"] = df.apply(lambda row: check_validity(row), axis=1)

But I get

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

I know that lists probably should not be stored in a pd.DataFrame like this, so I apologize in advance.

Can anybody help me?

>Solution :

Use a list comprehension instead of apply; it will be more efficient:

df['valid'] = [x == y or (isinstance(g, list) and x in g)
               for x, y, g in zip(df['col_x'], df['col_y'], df['col_grp'])]
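The isinstance check is what avoids the error from the question. A minimal sketch of the likely cause, assuming the failure comes from the pd.notnull call: given a list, pd.notnull works element-wise and returns an array, and using that array as an if condition raises the ValueError. The names group, mask, error_message and is_list are illustrative only.

```python
import pandas as pd

group = ["5678", "9999"]

# pd.notnull works element-wise on list input and returns a NumPy array.
mask = pd.notnull(group)

# Using that array as an `if` condition is what raises the ValueError.
try:
    if mask:
        pass
    error_message = None
except ValueError as exc:
    error_message = str(exc)

# isinstance returns a plain bool for a list and for pd.NA alike,
# so it is safe to use directly in a condition.
is_list = [isinstance(g, list) for g in (group, pd.NA)]
```

Checking isinstance(g, list) first means pd.notnull never sees a list, so no array ends up in a boolean context.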

If you must use apply:

def check_validity(row):
    x, y, g = row[['col_x', 'col_y', 'col_grp']]
    return x == y or (isinstance(g, list) and x in g)

df['valid'] = df.apply(check_validity, axis=1)

Output (with some extra rows):

  col_x col_y             col_grp  valid
0  1234  1234                <NA>   True
1  5678  2222        [5678, 9999]   True
2  9876  3333  [9876, 5555, 1222]   True
3  1111  1111                <NA>   True
4  1234  2222                <NA>  False
5  1234  2222              [2222]  False
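To reproduce this table end to end, here is a self-contained sketch. It assumes the two extra rows (index 4 and 5) were simply appended to the question's data with the values shown in the output; the answer does not say how they were added.

```python
import pandas as pd

# Question data plus the two extra rows visible in the output above
# (rows 4 and 5 are reconstructed from that table, not from the question).
data = {
    "col_x": ["1234", "5678", "9876", "1111", "1234", "1234"],
    "col_y": ["1234", "2222", "3333", "1111", "2222", "2222"],
    "col_grp": [pd.NA, ["5678", "9999"], ["9876", "5555", "1222"],
                pd.NA, pd.NA, ["2222"]],
}
df = pd.DataFrame(data)

# A row is valid if col_x equals col_y, or col_grp is a list containing col_x.
df["valid"] = [
    x == y or (isinstance(g, list) and x in g)
    for x, y, g in zip(df["col_x"], df["col_y"], df["col_grp"])
]

print(df)
```

Row 5 comes out False because the group list holds the col_y value ("2222"), not the col_x value ("1234").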