Fastest operation to combine 2 dataframes with 1D array in each cell to 1 dataframe with 2D array in each cell

I need to create a dataframe with in each cell a 2D array. This 2D array is a combination of one array from dataframe A and one array from dataframe B.

I tried applying a function on dataframe A in which I tried to combine those array combination with numpy. However I’m not sure that this is the fastest/most elegant way. Since I need to combine 2 dataframes of about 200GB this way, I really need the fastest/most elegant solution.

Any suggestions of how to speed this up or to make this a more elegant solution?

Dataframe A:

           col1              col2
0  [1, 2, 3, 4]   [9, 10, 11, 12]
1  [5, 6, 7, 8]  [13, 14, 15, 16]

Dataframe B:

               col1              col2
0  [17, 18, 19, 20]  [25, 26, 27, 28]
1  [21, 22, 23, 24]  [29, 30, 31, 32]

Desired result:

                               col1                                  col2
0  [[1, 2, 3, 4], [17, 18, 19, 20]]   [[9, 10, 11, 12], [25, 26, 27, 28]]
1  [[5, 6, 7, 8], [21, 22, 23, 24]]  [[13, 14, 15, 16], [29, 30, 31, 32]]

Code I used:

import numpy as np
import pandas as pd

def function(x):
  a[x.name] = pd.Series(np.stack([x, b[x.name]], axis=1).tolist(), name=x.name)

data_a = {'col1': [[1, 2, 3, 4], [5, 6, 7, 8]], 'col2': [[9, 10, 11, 12], [13, 14, 15, 16]]}
a = pd.DataFrame(data=data_a)

data_b = {'col1': [[17, 18, 19, 20], [21, 22, 23, 24]], 'col2': [[25, 26, 27, 28], [29, 30, 31, 32]]}
b = pd.DataFrame(data=data_b)

print(a)
print()
print(b)

a.apply(function)

print(a)

>Solution :

Honestly, using arrays/nested lists in a DataFrame is really not why pandas has been designed. Use numpy arrays.

Nevertheless, assuming that the DataFrames are already aligned, you can try:

out = pd.DataFrame(np.dstack([a, b]).tolist(), columns=a.columns)

Output:

                               col1                                  col2
0  [[1, 2, 3, 4], [17, 18, 19, 20]]   [[9, 10, 11, 12], [25, 26, 27, 28]]
1  [[5, 6, 7, 8], [21, 22, 23, 24]]  [[13, 14, 15, 16], [29, 30, 31, 32]]

Leave a Reply