Fast conversion of Pandas DataFrame to key->row representation

I need a key -> row index for my Pandas DataFrame, where the key is the DataFrame's id column and the value is the row data.

The data is sparse – I only need to access data for a few keys, but I do not know ahead of time which keys I need to access.

I am currently doing this using iterrows as:

pair_map = {}
# iterrows() yields (index label, row as pd.Series) pairs
for pair_id, data in df.iterrows():
    pair_map[pair_id] = data

However, for a very large number of rows (~100k-1M), this becomes slow. Is there a faster way to create a sparse key-row index for Pandas, so that access to an arbitrary row is fast? Even better if the index is sparse and the data is pulled out of Pandas on demand (though I do not think this is possible).

>Solution :

try this:


I don’t know whether you can transpose a DataFrame into 1M columns, and if you are looking for a dict whose values are of type pd.Series, this is not the solution.
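For illustration, here is a minimal sketch of building a key -> row map without iterrows. It assumes the ids are unique and live in a column named id; the column names and sample data are hypothetical, not taken from the question. It uses to_dict(orient="index"), which should give the same {key: {column: value}} shape as transposing and calling to_dict(), and it also shows an on-demand lookup with .loc for the case where pd.Series values are wanted.

```python
import pandas as pd

# Hypothetical sample data; the "id" column name is an assumption.
df = pd.DataFrame(
    {"id": [10, 20, 30], "price": [1.5, 2.5, 3.5], "volume": [100, 200, 300]}
)

# Make the id the index so rows can be looked up by key.
indexed = df.set_index("id")

# Option 1: build the whole map in one call.
# Each value is a plain dict of {column: value}, not a pd.Series.
pair_map = indexed.to_dict(orient="index")

# Option 2: skip the map and pull rows on demand.
# With unique ids, each lookup returns a single row as a pd.Series.
row = indexed.loc[20]

print(pair_map[20])
print(row["price"])
```

If only a few keys are ever accessed, Option 2 avoids materializing anything per row up front, which may be closer to the on-demand behaviour asked about.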
