As part of a query, I want to instantiate a custom class and then use the resulting object in that query. In the example below I use a trivial identity class that does nothing.
import pandas as pd
class foo:
    def __init__(self, var):
        self.v = var
# This is meant to turn every row in col1 into a foo object, read its v attribute, and compare it to 1.
# I know this is silly in this case, but it is a minimal working example.
q = '@foo(col1).v > 1'
df = pd.DataFrame({'col1':[1,2]})
df.query(q)
When I run this I get an error that the resolver could not find ‘foo’. Specifically:
KeyError                                  Traceback (most recent call last)
File ~/kits/miniconda3/envs/dev/lib/python3.10/site-packages/pandas/core/computation/scope.py:198, in Scope.resolve(self, key, is_local)
    197 if self.has_resolvers:
--> 198     return self.resolvers[key]
    200 # if we're here that means that we have no locals and we also have
    201 # no resolvers

File ~/kits/miniconda3/envs/dev/lib/python3.10/collections/__init__.py:982, in ChainMap.__getitem__(self, key)
    981         pass
--> 982 return self.__missing__(key)

File ~/kits/miniconda3/envs/dev/lib/python3.10/collections/__init__.py:974, in ChainMap.__missing__(self, key)
    973 def __missing__(self, key):
--> 974     raise KeyError(key)

KeyError: 'foo'
I also tried passing the class to the query function through the local_dict or global_dict arguments, but both gave me the same error.
I expected the query to instantiate the class inline and then evaluate the boolean expression.
>Solution :
Another, similar option is to create a function get_foo(var) that wraps the constructor, and call that from the query:
import pandas as pd
class foo:
    def __init__(self, var):
        self.v = var

def get_foo(var):
    return foo(var)
q = "@get_foo(col1).v > 1"
df = pd.DataFrame({"col1": [1, 2]})
x = df.query(q)
print(x)
Prints:
   col1
1     2
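One thing worth checking is what the helper actually receives. As far as I can tell, query hands the whole column to the callable rather than calling it once per row, so .v ends up holding the column and the comparison is vectorised. Below is a minimal sketch to observe that yourself. The names Foo and calls, and the logging inside get_foo, are my own additions for illustration and are not part of the original code. The last part shows a plain boolean mask as an alternative if you really do want one object per row.

```python
import pandas as pd


class Foo:
    """Trivial wrapper, same idea as the foo class in the question."""

    def __init__(self, var):
        self.v = var


calls = []  # hypothetical log, only here to observe what query passes in


def get_foo(var):
    # Record the type of the argument each time query invokes the helper.
    calls.append(type(var).__name__)
    return Foo(var)


df = pd.DataFrame({"col1": [1, 2]})

# Helper referenced with @, as in the answer above.
result = df.query("@get_foo(col1).v > 1")
print(calls)   # shows how often, and with what type, the helper was called
print(result)

# If one object per row is really needed, build them explicitly and
# filter with an ordinary boolean mask instead of query.
mask = df["col1"].map(lambda x: Foo(x).v) > 1
per_row = df[mask]
print(per_row)
```

Both approaches keep only the row where col1 is 2. The mask version does not go through the query parser at all, so it avoids the name-resolution issue entirely, at the cost of a Python-level loop over the rows.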