When setting a cell with a single-element 1-D NumPy array, the value that is actually stored in the DataFrame is a 0-dimensional array. This doesn’t happen when setting multi-element 1-D arrays:
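import numpy as np
import pandas as pd

df = pd.DataFrame({"col1": [np.array([1, 2, 3]), np.array([4, 5, 6]), np.array([45])]})
print(df)
for i, arr in df.col1.items():
    new_arr = arr + 1
    print(f"{new_arr.ndim=}")  # ndim of the array before it is stored
    df.at[i, 'col1'] = new_arr
    print(f"{df.at[i, 'col1'].ndim=}")  # ndim of the value read back from the cell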
Is this a pandas bug?
I’m using pandas 2.2.2 with Python 3.10.14.
PS: I found this because I’m working with 1-D NumPy arrays containing one or more strings, and using explode produced some unexpected results.
>Solution :
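This is a limitation of pandas: it is not designed to work with nested structures.
What about assigning into the original array in place? Note that this only works when the new values have the same shape as the existing array, and they are cast to its dtype.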
for i, arr in df.col1.items():
    new_arr = arr + 1
    df.at[i, 'col1'][:] = new_arr
print(df)
Output:
        col1
0  [2, 3, 4]
1  [5, 6, 7]
2       [46]
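If the new arrays may differ in shape or dtype from the old ones, in-place assignment is not an option. One possible alternative, sketched below, is to build the whole column as a NumPy object array and assign it in one step, which avoids setting individual cells altogether. The name new_col is only illustrative, and this is a sketch rather than an officially recommended pandas pattern.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"col1": [np.array([1, 2, 3]), np.array([4, 5, 6]), np.array([45])]})

# Pre-allocate an object array so that each slot holds a whole ndarray as-is.
new_col = np.empty(len(df), dtype=object)
for pos, arr in enumerate(df["col1"]):
    new_col[pos] = arr + 1

# Replace the column in one step instead of setting cells one by one.
df["col1"] = new_col

# Inspect the dimensionality of every stored array.
print([a.ndim for a in df["col1"]])
```

Because each array is placed into an object slot unchanged, the single-element array should keep its one dimension here.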