How to wrap a number of pandas operations into a function?

I have a tedious task that I want to partly automate, since many of the steps are always the same. I clean and rearrange some data, and the final output is an Excel file.

I have put the steps into a function, but I get the error "No objects to concatenate". Any ideas why? When I run the lines one by one outside the function, everything works fine. I just want to pass a city name such as London to clean_data_output_excel(cityname) and have it do everything.

def clean_data_output_excel(cityname):
    path = r'C:\Users\testuser\Downloads\Cities\cityname'
    all_files = glob.glob(os.path.join(path, "*.xlsx"))
    cityname = pd.concat((pd.read_excel(f) for f in all_files))
    cityname = cityname.drop(['UpTo'], axis=1)
    cityname['Time'] = pd.to_datetime(cityname['Time'], dayfirst=True)
    cityname = cityname.rename(columns = {'Time': 'Date'})
    cityname = cityname.set_index('Date')
    cityname = cityname.sort_values(by='Date')
    cityname.to_excel('Output/cityname_dailydata.xlsx')

Solution:

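pd.concat raises this error when it receives nothing to concatenate. Here all_files is empty because the path ends in the literal text cityname rather than the value of the argument, so glob.glob matches no files: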

>>> pd.concat([])
...
ValueError: No objects to concatenate
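
A minimal sketch of the difference between the two path strings, assuming the function is called with "London" (the names literal_path and formatted_path are illustrative only and do not appear in the question):

```python
cityname = "London"

# Plain raw string: the word "cityname" is just text, not the variable.
literal_path = r'C:\Users\testuser\Downloads\Cities\cityname'

# Raw f-string: {cityname} is replaced by the variable's value.
formatted_path = fr'C:\Users\testuser\Downloads\Cities\{cityname}'

print(literal_path)
print(formatted_path)
```

Only the second string points at the London folder, so only it can match the Excel files stored there.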

Use f-strings so that the value of cityname is substituted into the input path and into the output file name:

import glob
import os

import pandas as pd

def clean_data_output_excel(cityname):
    # Raw f-string: {cityname} is replaced by the argument's value
    path = fr'C:\Users\testuser\Downloads\Cities\{cityname}'
    all_files = glob.glob(os.path.join(path, "*.xlsx"))
    # Don't reuse cityname as the dataframe variable
    df = pd.concat((pd.read_excel(f) for f in all_files))
    df = df.drop(['UpTo'], axis=1)
    df['Time'] = pd.to_datetime(df['Time'], dayfirst=True)
    df = df.rename(columns={'Time': 'Date'})
    df = df.set_index('Date')
    # 'Date' is now the index, so sort by the index
    df = df.sort_index()
    # Use an f-string here too; the Output folder must already exist
    df.to_excel(f'Output/{cityname}_dailydata.xlsx')
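
If a city folder is misspelled or empty, pd.concat will still fail with the same generic message. One way to get a clearer error is to check the file list before reading anything. The sketch below is only an illustration: find_city_files and its base_dir parameter are hypothetical names, not part of the original answer.

```python
import glob
import os


def find_city_files(base_dir, cityname):
    """Return the sorted .xlsx paths under base_dir/cityname.

    Hypothetical helper, not part of the original answer.
    """
    path = os.path.join(base_dir, cityname)
    all_files = sorted(glob.glob(os.path.join(path, "*.xlsx")))
    if not all_files:
        # Fail early and name the folder, instead of pd.concat's generic error.
        raise FileNotFoundError(f"No .xlsx files found in {path!r}")
    return all_files
```

The returned list could then be passed to pd.concat in place of all_files, so a wrong city name is reported with the folder that was searched.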
