Advertisements
With the following code, I can see the following table output
query = f''' SELECT timestamp, base FROM pricing WHERE {timestamp_range} LIMIT {max_records}'''
df: DataFrame = read_sql_query(text(query), db)
print(df.head())
print(df.columns)
df.set_index('timestamp', inplace=True) # Error here
# Output
# timestamp base
# 0 2023-02-17 10:25:54.099542 21
# 1 2023-02-17 10:27:54.060627 21
# 2 2023-02-17 10:29:53.581384 22
# 3 2023-02-17 10:31:54.110646 20
# 4 2023-02-17 10:33:53.827830 20
# Index(['timestamp', 'base'], dtype='object')
So it looks like I do have the timestamp
column, but when using set_index()
I get a KeyError: 'timestamp'
. Why is this? Using df.columns[0]
didn’t help either.
For reference, full stack trace log
[2023-02-17 13:50:32,388] ERROR in app: Exception on /api/data/query [GET]
Traceback (most recent call last):
File "/home/ubuntu/.local/lib/python3.10/site-packages/pandas/core/indexes/base.py", line 3802, in get_loc
return self._engine.get_loc(casted_key)
File "pandas/_libs/index.pyx", line 138, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/index.pyx", line 165, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/hashtable_class_helper.pxi", line 5745, in pandas._libs.hashtable.PyObjectHashTable.get_item
File "pandas/_libs/hashtable_class_helper.pxi", line 5753, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'timestamp'
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/ubuntu/.local/lib/python3.10/site-packages/flask/app.py", line 2525, in wsgi_app
response = self.full_dispatch_request()
File "/home/ubuntu/.local/lib/python3.10/site-packages/flask/app.py", line 1822, in full_dispatch_request
rv = self.handle_user_exception(e)
File "/home/ubuntu/.local/lib/python3.10/site-packages/flask/app.py", line 1820, in full_dispatch_request
rv = self.dispatch_request()
File "/home/ubuntu/.local/lib/python3.10/site-packages/flask/app.py", line 1796, in dispatch_request
return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args)
File "/home/ubuntu/Code/grozd/main.py", line 18, in query
result = getQuery(request.args)
File "/home/ubuntu/Code/grozd/data.py", line 27, in getQuery
df['timestamp'] = df['timestamp'].astype(int).floordiv(1000000).astype(int)
File "/home/ubuntu/.local/lib/python3.10/site-packages/pandas/core/frame.py", line 3807, in __getitem__
indexer = self.columns.get_loc(key)
File "/home/ubuntu/.local/lib/python3.10/site-packages/pandas/core/indexes/base.py", line 3804, in get_loc
raise KeyError(key) from err
KeyError: 'timestamp'
>Solution :
From the traceback it looks like the error does not actually occur in the line you highlighted, but later on when you do this:
df['timestamp'] = df['timestamp'].astype(int).floordiv(1000000).astype(int)
The reason this fails is that once you’ve assigned the timestamp
column to the index, it is no longer found among the columns. To fix this you can either perform your transformation first and then set the index, or transform the index directly:
df.index = df.index.astype(int).floordiv(1000000).astype(int)