Grouping OHLC data by day using groupby

I’m reading minute-level data from a SQLite database into a dataframe whose index is a datetime index:

                             open    high     low   close  volume  trade_count        vwap ticker
index                                                                                            
2022-09-13 04:26:00+00:00  163.50  163.50  163.50  163.50   298.0         12.0  163.503255   AAPL
2022-09-13 04:45:00+00:00  163.50  163.50  163.50  163.50   727.0          1.0  163.500000   AAPL
2022-09-13 05:16:00+00:00  163.43  163.43  163.43  163.43   202.0          4.0  163.430000   AAPL
2022-09-13 05:44:00+00:00  163.50  163.50  163.50  163.50   121.0          2.0  163.499587   AAPL
2022-09-13 05:45:00+00:00  163.46  163.46  163.46  163.46   200.0          2.0  163.460000   AAPL
...                           ...     ...     ...     ...     ...          ...         ...    ...
2022-09-14 19:57:00+00:00   99.73   99.73   99.69   99.69  1273.0         18.0   99.693425   ZROZ
2022-09-14 19:58:00+00:00   99.69   99.69   99.66   99.69  1114.0         11.0   99.686965   ZROZ
2022-09-14 19:59:00+00:00   99.69   99.82   99.69   99.76  9764.0         76.0   99.736332   ZROZ
2022-09-14 20:00:00+00:00   99.76   99.76   99.76   99.76  2168.0          1.0   99.760000   ZROZ
2022-09-14 20:33:00+00:00   99.96   99.96   99.96   99.96   150.0          4.0   99.968667   ZROZ

[317028 rows x 8 columns]

I want to split this massive dataframe into pieces, grouped by ticker and by day. When I try the following:

table = df.groupby(pd.Grouper(key='index', freq='1D'))

I get the error:

    raise KeyError(f"The grouper name {key} is not found")
KeyError: 'The grouper name index is not found'

When I change the key to:

table = df.groupby(pd.Grouper(key=df.index, freq='1D'))

I get the error:

    if getattr(self._gpr_index, "name", None) == key and isinstance(
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

How can I group by ticker and by day?

Solution:

The key parameter of pd.Grouper expects a column name, but here the datetimes are in the index, not in a column. Omit key so that the index is used:

table = df.groupby(pd.Grouper(freq='1D'))

Or use the level parameter, which refers to the index by its name:

table = df.groupby(pd.Grouper(level='index', freq='1D'))

Or convert the index to a column (overcomplicated, in my opinion):

table = df.reset_index().groupby(pd.Grouper(key='index', freq='1D'))
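
The options above group by day only, while the question also asks for grouping by ticker. One way to get both, as a minimal sketch, is to pass a list containing the ticker column name and the pd.Grouper. The sample rows below are made up for illustration (they only mimic the layout shown in the question), and the names pieces and daily are hypothetical:

```python
import pandas as pd

# Hypothetical sample data standing in for the minute bars in the question.
idx = pd.to_datetime(
    [
        "2022-09-13 04:26:00+00:00",
        "2022-09-13 04:45:00+00:00",
        "2022-09-14 05:16:00+00:00",
        "2022-09-13 19:57:00+00:00",
        "2022-09-14 19:58:00+00:00",
        "2022-09-14 19:59:00+00:00",
    ]
)
df = pd.DataFrame(
    {
        "open": [163.50, 163.45, 163.43, 99.73, 99.69, 99.69],
        "high": [163.50, 163.60, 163.43, 99.73, 99.69, 99.82],
        "low": [163.50, 163.40, 163.43, 99.69, 99.66, 99.69],
        "close": [163.50, 163.55, 163.43, 99.69, 99.69, 99.76],
        "volume": [298.0, 727.0, 202.0, 1273.0, 1114.0, 9764.0],
        "ticker": ["AAPL", "AAPL", "AAPL", "ZROZ", "ZROZ", "ZROZ"],
    },
    index=idx,
)
df.index.name = "index"

# Group by the ticker column and by calendar day of the datetime index.
grouped = df.groupby(["ticker", pd.Grouper(freq="1D")])

# Split the dataframe into one smaller dataframe per (ticker, day) pair.
pieces = {key: part for key, part in grouped if not part.empty}

# Optionally collapse each (ticker, day) group into a single daily OHLC bar.
daily = grouped.agg(
    open=("open", "first"),
    high=("high", "max"),
    low=("low", "min"),
    close=("close", "last"),
    volume=("volume", "sum"),
)
print(daily)
```

Each key in pieces is a (ticker, day) tuple, so a single piece can be looked up with something like pieces[("AAPL", pd.Timestamp("2022-09-13", tz="UTC"))]. Because the index is in UTC, the days are split at midnight UTC; if days should follow the exchange's local time, converting the index with tz_convert before grouping is one option.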