Passing a cached pandas DataFrame to another cached function in Python gives "unhashable type: DataFrame" error


I have three functions, for example:

from cachetools import cached, TTLCache
import pandas as pd


cache = TTLCache(maxsize=10, ttl=1000)


@cached(cache)
def function1():
    df = pd.DataFrame({'one': range(5), 'two': range(5, 10)})  # just a little data, doesn't matter what
    return df


@cached(cache)
def function2(df):
    var1 = df['one']
    var2 = df['two']
    return var1, var2


def function3():
    df = function1()
    var1, var2 = function2(df)  # pass df to function2 for some work

    print('this is var1[0]: ' + str(var1[0]))
    print('this is var2[0]: ' + str(var2[0]))


function3()

I want df, var1, and var2 to be cached. Inside function3, df should only be rebuilt if it is not already cached, and the same should hold for var1 and var2, which depend on df. Is there a way to do this? The code works when I remove @cached(cache) from function2.

This is the error I get:
TypeError: 'DataFrame' objects are mutable, thus they cannot be hashed

Solution:

Try the cacheout library instead; it worked for me:

import pandas as pd
from cacheout import Cache
cache = Cache()


@cache.memoize()
def function1():
    df = pd.DataFrame({'one': range(5), 'two': range(5, 10)})
    return df


@cache.memoize()
def function2(df):
    var1 = df['one']
    var2 = df['two']
    return var1, var2


def function3():
    df = function1()
    var1, var2 = function2(df)

    print('this is var1[0]: ' + str(var1[0]))
    print('this is var2[0]: ' + str(var2[0]))


function3()

Output:

this is var1[0]: 0
this is var2[0]: 5
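
If you would rather stay with cachetools, the underlying issue is that @cached builds its cache key from the function arguments and hashes it, and a DataFrame refuses to be hashed. One possible workaround is to pass your own key function through the key parameter of cached. Below is a minimal sketch, not a definitive implementation: dataframe_key, df_cache and columns_cache are names made up for this example, and the key is derived from the frame's contents via pd.util.hash_pandas_object, on the assumption that hashing the whole frame on every call is acceptable for your data sizes.

```python
import hashlib

import pandas as pd
from cachetools import TTLCache, cached


def dataframe_key(df):
    """Hypothetical key function: derive a hashable key from a DataFrame's contents."""
    # hash_pandas_object returns one uint64 hash per row (the index is included).
    row_hashes = pd.util.hash_pandas_object(df, index=True).to_numpy()
    digest = hashlib.sha256(row_hashes.tobytes()).hexdigest()
    # Column labels are added explicitly so that frames differing only in
    # their labels do not end up with the same key.
    return (tuple(df.columns), digest)


# Assumption for this sketch: one cache per function, so keys produced by
# different functions cannot collide.
df_cache = TTLCache(maxsize=10, ttl=1000)
columns_cache = TTLCache(maxsize=10, ttl=1000)


@cached(df_cache)
def function1():
    return pd.DataFrame({'one': range(5), 'two': range(5, 10)})


@cached(columns_cache, key=dataframe_key)
def function2(df):
    return df['one'], df['two']


def function3():
    df = function1()
    var1, var2 = function2(df)
    print('this is var1[0]: ' + str(var1[0]))
    print('this is var2[0]: ' + str(var2[0]))


function3()
```

Two things to keep in mind with this approach: the key function walks the whole DataFrame on every call, so it only pays off when function2 is more expensive than the hashing, and the cached functions hand back the same objects each time, so mutating a returned DataFrame or Series in place also changes what later callers get from the cache.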
