DataFrame.query raises "name is not defined" even though the variable is defined before the call

January 24, 2023

def fl_base(df,file_name):
    columns = ['historic_odds_1','historic_odds_2','historic_odds_3','historic_odds_4','odds']
    mean_cols = df[columns].mean(axis=1)
    df = df.query(
        f"\
            (mean_cols > 0) and \
                ((@df['minute_traded']/mean_cols)*100 >= 1000) and \
                    (@df['minute_traded'] >= 1000)\
                        "
    ).reset_index(drop=True)
    return df

name 'mean_cols' is not defined

If mean_cols is created before calling df.query, why does the error say it is not defined?

>Solution :

The query documentation states:

You can refer to variables in the environment by prefixing them with an ‘@’ character like @a + b.

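Here, mean_cols is a local Python variable, not a column of df. Unprefixed names in a query string are looked up as columns of the DataFrame, so mean_cols has to be written as @mean_cols.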

df = df.query(
    "(@mean_cols > 0) and "
    "((minute_traded / @mean_cols) * 100 >= 1000) and "
    "(minute_traded >= 1000)"
).reset_index(drop=True)

This assumes minute_traded is a column of df, so it can be referenced by its bare name. mean_cols, as stated earlier, is a local variable and needs the @ prefix.
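For a quick check, here is a minimal, self-contained sketch of the corrected function. The sample values are made up purely for illustration, and the unused file_name parameter from the question is dropped here for brevity; the column names are taken from the question.

```python
import pandas as pd


def fl_base(df):
    columns = ['historic_odds_1', 'historic_odds_2', 'historic_odds_3',
               'historic_odds_4', 'odds']
    # Row-wise mean of the odds columns: a local Series, not a column of df
    mean_cols = df[columns].mean(axis=1)
    # Bare names (minute_traded) resolve to columns of df;
    # @-prefixed names (@mean_cols) resolve to local Python variables
    return df.query(
        "(@mean_cols > 0) and "
        "((minute_traded / @mean_cols) * 100 >= 1000) and "
        "(minute_traded >= 1000)"
    ).reset_index(drop=True)


# Hypothetical sample data, only to exercise the filter
sample = pd.DataFrame({
    'historic_odds_1': [2.0, 50.0, 200.0],
    'historic_odds_2': [2.0, 50.0, 200.0],
    'historic_odds_3': [2.0, 50.0, 200.0],
    'historic_odds_4': [2.0, 50.0, 200.0],
    'odds': [2.0, 50.0, 200.0],
    'minute_traded': [1500, 400, 1500],
})

result = fl_base(sample)
print(result)
```

With this data only the first row passes all three conditions: the second row fails minute_traded >= 1000, and the third fails the ratio test (1500 / 200 * 100 = 750).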