How to reference a custom class in a pandas query

As part of a pandas query, I want to instantiate a custom class and then use the resulting object in the query expression. The example below uses a trivial wrapper class that only stores its argument.

import pandas as pd

class foo:
    def __init__(self, var):
        self.v = var

# This is meant to turn every row in col1 into a foo object, read its v attribute,
# and compare it to 1. It is silly in this case, but it is a minimal example.
q = '@foo(col1).v > 1'

df = pd.DataFrame({'col1':[1,2]})
df.query(q)

When I run this I get an error that the resolver could not find ‘foo’. Specifically:

KeyError                                  Traceback (most recent call last)
File ~/kits/miniconda3/envs/dev/lib/python3.10/site-packages/pandas/core/computation/scope.py:198, in Scope.resolve(self, key, is_local)
    197 if self.has_resolvers:
--> 198     return self.resolvers[key]
    200 # if we're here that means that we have no locals and we also have
    201 # no resolvers

File ~/kits/miniconda3/envs/dev/lib/python3.10/collections/__init__.py:982, in ChainMap.__getitem__(self, key)
    981         pass
--> 982 return self.__missing__(key)

File ~/kits/miniconda3/envs/dev/lib/python3.10/collections/__init__.py:974, in ChainMap.__missing__(self, key)
    973 def __missing__(self, key):
--> 974     raise KeyError(key)

KeyError: 'foo'

I also tried passing foo to the query function through the local_dict and global_dict arguments, but both gave me the same error.

I expected the query to instantiate the class inline and then evaluate the boolean comparison.

>Solution :

Another, similar option: define a function get_foo(var) that returns a foo instance, and call that function from the query instead of the class:

import pandas as pd


class foo:
    def __init__(self, var):
        self.v = var


def get_foo(var):
    return foo(var)

q = "@get_foo(col1).v > 1"
df = pd.DataFrame({"col1": [1, 2]})

x = df.query(q)
print(x)

Prints:

   col1
1     2
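
Note that in the query above, col1 is passed to get_foo as a whole Series, so a single foo object wraps the entire column rather than one object per row. If one object per value is really what is wanted, a boolean mask built outside of query is one way to do it. The following is only a sketch under that assumption; the names mask and result are illustrative and not part of the original answer.

```python
import pandas as pd


class foo:
    def __init__(self, var):
        self.v = var


df = pd.DataFrame({"col1": [1, 2]})

# Build one foo object per value in col1, read its v attribute,
# and compare it to 1. This produces a boolean Series.
mask = df["col1"].map(lambda value: foo(value).v > 1)

# Select the rows where the comparison holds.
result = df[mask]
print(result)
```

For this small frame, it should select the same row as the query-based version. It avoids the query name resolver entirely, at the cost of a Python-level call per row.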