Here’s the pandas dataframe that I’m using to learn how to do this:
import pandas as pd
test_list = pd.DataFrame()
test_list["Item"] = ["A", "B", "C", "D", "E", "F", "G", "H", "I", "J", "K"]
test_list["Number"] = ["1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11"]
test_list["Combined Numbers"]= ""
Based on that dataframe above, I intend to combine up to 3 numbers, separated by commas.
Following that, I intend to repeat this combined value I now have, for each of the test_list["Item"] and test_list["Number"] involved.
I’ve been scratching my head figuring it out so far. So far I’ve seen examples of groupby() function for situations like combining information based on a given criteria, like a duplicate value from a column. I’m learning to explore if I don’t have anything to refer to, how can I work this out instead?
Here’s my intended goal:
| Item | Number | Combined Numbers |
|---|---|---|
| A | 1 | 1, 2, 3 |
| B | 2 | 1, 2, 3 |
| C | 3 | 1, 2, 3 |
| D | 4 | 4, 5, 6 |
| E | 5 | 4, 5, 6 |
| F | 6 | 4, 5, 6 |
| G | 7 | 7, 8, 9 |
| H | 8 | 7, 8, 9 |
| I | 9 | 7, 8, 9 |
| J | 10 | 10, 11 |
| K | 11 | 10, 11 |
Thank you
>Solution :
In you case as you don’t have this column with the groups, you can generate a range of the length of your dataframe (with np.arange) and do the floor division by 3 (//3). Use groupby.transform to keep the original shape of your data and do the join operation on the column.
test_list["Combined Numbers"] = (
test_list.groupby(np.arange(len(test_list))//3)
['Number'].transform(', '.join)
)
print(test_list)
# Item Number Combined Numbers
# 0 A 1 1, 2, 3
# 1 B 2 1, 2, 3
# 2 C 3 1, 2, 3
# 3 D 4 4, 5, 6
# 4 E 5 4, 5, 6
# 5 F 6 4, 5, 6
# 6 G 7 7, 8, 9
# 7 H 8 7, 8, 9
# 8 I 9 7, 8, 9
# 9 J 10 10, 11
# 10 K 11 10, 11