I want to replace DataFrame.append with pd.concat.
I get the following error: ValueError: Must pass 2-d input. shape=(1, 2, 2).
I have already looked at Error "'DataFrame' object has no attribute 'append'" and tried what it suggests, but I got the error above. How do I fix this?
I want to add an empty row after each Market's group of rows.
DataFrame:
  Market  Values
0      A       1
1      B       2
2      A       3
3      C       4
4      B       5
import pandas as pd
data = {
    'Market': ['A', 'B', 'A', 'C', 'B'],
    'Values': [1, 2, 3, 4, 5]
}
df_sorted = pd.DataFrame(data)
print(df_sorted)
markets = ['A', 'B', 'C']
appended_df = pd.DataFrame()
# Loop through markets and append rows
for market in markets:
    market_rows = df_sorted[df_sorted['Market'] == market]
    #appended_df = appended_df.append(market_rows, ignore_index=True) # <--- old
    df_sorted = pd.concat([df_sorted, pd.DataFrame([market_rows])], ignore_index=True)
    #appended_df = appended_df.append(pd.Series(), ignore_index=True) # <--- old
    df_sorted = pd.concat([df_sorted, pd.DataFrame([pd.Series()])], ignore_index=True)
print(appended_df)
[OUT] ValueError: Must pass 2-d input. shape=(1, 2, 2)
What I want:
  Market  Values
0      A     1.0
1      A     3.0
2    NaN     NaN
3      B     2.0
4      B     5.0
5    NaN     NaN
6      C     4.0
Solution:
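The error comes from pd.DataFrame([market_rows]): market_rows is already a DataFrame, so wrapping it in a list gives the constructor 3-D input (1 item x 2 rows x 2 columns). Beyond that, don't concat inside a loop. Use groupby to split the groups efficiently (rather than slicing repeatedly), collect the pieces in a list, and concat once at the end. A nested loop in the comprehension adds an empty separator frame after each group, and the [:-1] slice discards the last separator before concatenating: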
df_sorted = pd.DataFrame(data)
out = pd.concat([x for k, g in df_sorted.groupby('Market', sort=False)
                 for x in [g, pd.DataFrame(index=[0])]][:-1],
                ignore_index=True
               )
Output:
  Market  Values
0      A     1.0
1      A     3.0
2    NaN     NaN
3      B     2.0
4      B     5.0
5    NaN     NaN
6      C     4.0
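
If an explicit loop reads more naturally, the same idea (collect pieces in a list, concat once) can be sketched as below. This is only one possible variant, not the only way to do it: instead of concatenating a separate empty frame, it assumes it is acceptable to give each group a trailing all-NaN row via reindex, and then drops the final separator with iloc. The names pieces and padded are illustrative.

```python
import pandas as pd

data = {
    'Market': ['A', 'B', 'A', 'C', 'B'],
    'Values': [1, 2, 3, 4, 5],
}
df_sorted = pd.DataFrame(data)

pieces = []  # illustrative name: one padded frame per market
for market, group in df_sorted.groupby('Market', sort=False):
    # Renumber the group's rows 0..n-1, then reindex to 0..n so that
    # one extra all-NaN row appears at the end of the group.
    padded = group.reset_index(drop=True).reindex(range(len(group) + 1))
    pieces.append(padded)  # group is already a DataFrame; no wrapping needed

# Concatenate once, then drop the separator that follows the last group.
out = pd.concat(pieces, ignore_index=True).iloc[:-1]
print(out)
```

Because the NaN row is created inside each group before concatenation, the Values column becomes float, which matches the desired output shown in the question.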