I have a pandas dataframe as follows
fruit_name color
apple red
banana yellow
orange orange
Using this dataframe, I want to create a new dataframe as follows
name fruit
Ben ['apple', 'banana', 'orange']
Alex ['apple', 'banana', 'orange']
Jude ['apple', 'banana', 'orange']
May ['apple', 'banana', 'orange']
...
For this, I tried the following
new_df = pd.DataFrame(columns=['name', 'fruit']
new_df['name'] = students['name']
new_df['fruit'] = fruits['fruits_name'].to_list()
I get the following error message
ValueError: Length of values (3) does not match length of index (10)
I want my final result as follows
name fruit
Ben apple
Ben banana
Ben orange
Alex apple
Alex banana
Alex orange
...
I thought I would use the explode function once I have a column of lists. But I am stuck getting the dataframe with the column of lists.
>Solution :
What you are looking for is actually called a cross merge, available from Pandas 1.2.0+:
new_df = students[['name']].merge(fruits[['fruit_name']], how='cross')
Output:
name fruit_name
0 Ben apple
1 Ben banana
2 Ben orange
3 Alex apple
4 Alex banana
5 Alex orange
6 Jude apple
7 Jude banana
8 Jude orange
9 May apple
10 May banana
11 May orange
Note if you’re stuck with earlier version, use the same key to merge:
new_df = (students[['name']].assign(key=1)
.merge(fruits[['fruit_name']].assign(key=1), on='key')
.drop('key', axis=1)
)