Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

How do I convert this pandas dataframe into records?

I am trying to use the yfinance package and I am not good at pandas. I’m just trying to store the data in a DB (without using to_sql, which fails).

The colums would be ticker | date | open | high | low | close | volume

But what pandas provides is not simple to navigate or flatten.

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

Does anyone know how I could make this such that I can iterate over a simple array of dict, or equally simple?

import yfinance as yf
prices = yf.download("AAPL MSFT", start="2022-04-25", end="2022-05-07")
set_trace()

(Pdb) prices
             Adj Close                   Close                    High                     Low                    Open                 Volume
                  AAPL        MSFT        AAPL        MSFT        AAPL        MSFT        AAPL        MSFT        AAPL        MSFT       AAPL      MSFT
Date
2022-04-25  162.641037  280.720001  162.880005  280.720001  163.169998  281.109985  158.460007  270.769989  161.119995  273.290009   96046400  35678900
2022-04-26  156.569962  270.220001  156.800003  270.220001  162.339996  278.359985  156.720001  270.000000  162.250000  277.500000   95623200  46518400
2022-04-27  156.340302  283.220001  156.570007  283.220001  159.789993  290.970001  155.380005  279.160004  155.910004  282.100006   88063200  63477700
2022-04-28  163.399918  289.630005  163.639999  289.630005  164.520004  290.980011  158.929993  281.459991  159.250000  285.190002  130216800  33646600
2022-04-29  157.418701  277.519989  157.649994  277.519989  166.199997  289.880005  157.250000  276.500000  161.839996  288.609985  131587100  37025000
2022-05-02  157.728256  284.470001  157.960007  284.470001  158.229996  284.940002  153.270004  276.220001  156.710007  277.709991  123055300  35151100
2022-05-03  159.246017  281.779999  159.479996  281.779999  160.710007  284.130005  156.320007  280.149994  158.149994  283.959991   88966500  25978600
2022-05-04  165.776428  289.980011  166.020004  289.980011  166.479996  290.880005  159.259995  276.730011  159.669998  282.589996  108256500  33599300
2022-05-05  156.540009  277.350006  156.770004  277.350006  164.080002  286.350006  154.949997  274.339996  163.850006  285.540009  130525300  43260400
2022-05-06  157.279999  274.730011  157.279999  274.730011  159.440002  279.250000  154.179993  271.269989  156.009995  274.809998  116055700  37748300
(Pdb) prices.index
DatetimeIndex(['2022-04-25', '2022-04-26', '2022-04-27', '2022-04-28',
               '2022-04-29', '2022-05-02', '2022-05-03', '2022-05-04',
               '2022-05-05', '2022-05-06'],
              dtype='datetime64[ns]', name='Date', freq=None)
(Pdb) prices.keys()
MultiIndex([('Adj Close', 'AAPL'),
            ('Adj Close', 'MSFT'),
            (    'Close', 'AAPL'),
            (    'Close', 'MSFT'),
            (     'High', 'AAPL'),
            (     'High', 'MSFT'),
            (      'Low', 'AAPL'),
            (      'Low', 'MSFT'),
            (     'Open', 'AAPL'),
            (     'Open', 'MSFT'),
            (   'Volume', 'AAPL'),
            (   'Volume', 'MSFT')],
           )
(Pdb)

>Solution :

You can stack prices to get the output in the desired shape and select the relevant columns::

out = prices.stack().rename_axis(['Date','Ticker']).reset_index()[['Ticker','Date','Open','High','Low','Close','Volume']]

Output:

   Ticker       Date        Open        High         Low       Close     Volume
0    AAPL 2022-04-25  161.119995  163.169998  158.460007  162.880005   96046400
1    MSFT 2022-04-25  273.290009  281.109985  270.769989  280.720001   35678900
2    AAPL 2022-04-26  162.250000  162.339996  156.720001  156.800003   95623200
3    MSFT 2022-04-26  277.500000  278.359985  270.000000  270.220001   46518400
4    AAPL 2022-04-27  155.910004  159.789993  155.380005  156.570007   88063200
5    MSFT 2022-04-27  282.100006  290.970001  279.160004  283.220001   63477700
6    AAPL 2022-04-28  159.250000  164.520004  158.929993  163.639999  130216800
7    MSFT 2022-04-28  285.190002  290.980011  281.459991  289.630005   33646600
8    AAPL 2022-04-29  161.839996  166.199997  157.250000  157.649994  131587100
9    MSFT 2022-04-29  288.609985  289.880005  276.500000  277.519989   37025000
10   AAPL 2022-05-02  156.710007  158.229996  153.270004  157.960007  123055300
11   MSFT 2022-05-02  277.709991  284.940002  276.220001  284.470001   35151100
12   AAPL 2022-05-03  158.149994  160.710007  156.320007  159.479996   88966500
13   MSFT 2022-05-03  283.959991  284.130005  280.149994  281.779999   25978600
14   AAPL 2022-05-04  159.669998  166.479996  159.259995  166.020004  108256500
15   MSFT 2022-05-04  282.589996  290.880005  276.730011  289.980011   33599300
16   AAPL 2022-05-05  163.850006  164.080002  154.949997  156.770004  130525300
17   MSFT 2022-05-05  285.540009  286.350006  274.339996  277.350006   43260400
18   AAPL 2022-05-06  156.009995  159.440002  154.179993  157.279999  116055700
19   MSFT 2022-05-06  274.809998  279.250000  271.269989  274.730011   37748300

You can also call to_records on the stacked DataFrame for a numpy rec array:

out = prices.stack().rename_axis(['Date','Ticker']).reset_index()[['Ticker','Date','Open','High','Low','Close','Volume']].to_records(index=False).tolist()

[('AAPL',
  1650844800000000000,
  161.1199951171875,
  163.1699981689453,
  158.4600067138672,
  162.8800048828125,
  96046400),
 ('MSFT',
  1650844800000000000,
  273.2900085449219,
  281.1099853515625,
  270.7699890136719,
  280.7200012207031,
  35678900),
...
 ('MSFT',
  1651795200000000000,
  274.80999755859375,
  279.25,
  271.2699890136719,
  274.7300109863281,
  37748300)]
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading