Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Using pandas or numpy – within a group, how can I add data from every row to every row in the group?

I have a dict like below, it represents one horse race. There are many races in the dataset grouped by raceId:

data_orig = {
    'meetingId': [178515] * 6,
    'raceId': [879507] * 6,
    'horseId': [90001, 90002, 90003, 90004, 90005, 90006],
    'position': [1, 2, 3, 4, 5, 6],
    'weight': [51, 52, 53, 54, 55, 56],
}

I want to add every row’s horse specific data to every row. The result should look like this:

data_new = {
    'meetingId': [178515] * 6,
    'raceId': [879507] * 6,
    'horseId_a':[90001, 90002, 90003, 90004, 90005, 90006],
    'position_a':[1, 2, 3, 4, 5, 6],
    'weight_a':[51, 52, 53, 54, 55, 56],
    'horseId_b':[90002, 90003, 90004, 90005, 90006, 90001],
    'position_b':[2, 3, 4, 5, 6, 1],
    'weight_b':[52, 53, 54, 55, 56, 51],
    'horseId_c':[90003, 90004, 90005, 90006, 90001, 90002],
    'position_c':[3, 4, 5, 6, 1, 2],
    'weight_c':[53, 54, 55, 56, 51, 52],
    'horseId_d':[90004, 90005, 90006, 90001, 90002, 90003],
    'position_d':[4, 5, 6, 1, 2, 3],
    'weight_d':[54, 55, 56, 51, 52, 53],
    'horseId_e':[90005, 90006, 90001, 90002, 90003, 90004],
    'position_e':[5, 6, 1, 2, 3, 4],
    'weight_e':[55, 56, 51, 52, 53, 54,],
    'horseId_f':[90006, 90001, 90002, 90003, 90004, 90005],
    'position_f':[6, 1, 2, 3, 4, 5],
    'weight_f':[56, 51, 52, 53, 54, 55],
}

I have tried below, which transposes the dataframe.

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

data_orig_df = pd.DataFrame(data_orig)
new_df = pd.DataFrame()
for index, row_i in data_orig_df.iterrows():
    horseId = row_i['horseId']
    row_new = row_i.copy()
    for index, row_j in race_df.iterrows():
        if row_j['horseId']:
            continue
        row_new = pd.merge(row_new, row_j[getHorseSpecificCols()], suffixes=('', row_j['position']))
    new_df = pd.concat([new_df, row_new], axis=1)

Thanks for your help.

>Solution :

You can use to easily roll/index the values:

def roll(g):
    a = g.to_numpy()
    x = np.arange(len(a))
    return pd.DataFrame(a[((x[:,None] + x)%len(a)).ravel()].reshape(len(a), -1),
                        index=g.index,
                        columns=[f'{c}_{i+1}' for i in x for c in g.columns])
    
cols = ['meetingId', 'raceId']

out = (data_orig_df.groupby(cols)
       .apply(lambda g: roll(g.drop(columns=cols)))
       .reset_index(cols)
       )

Output:

   meetingId  raceId  horseId_1  position_1  weight_1  horseId_2  position_2  weight_2  horseId_3  position_3  weight_3  horseId_4  position_4  weight_4  horseId_5  position_5  weight_5  horseId_6  position_6  weight_6
0     178515  879507      90001           1        51      90002           2        52      90003           3        53      90004           4        54      90005           5        55      90006           6        56
1     178515  879507      90002           2        52      90003           3        53      90004           4        54      90005           5        55      90006           6        56      90001           1        51
2     178515  879507      90003           3        53      90004           4        54      90005           5        55      90006           6        56      90001           1        51      90002           2        52
3     178515  879507      90004           4        54      90005           5        55      90006           6        56      90001           1        51      90002           2        52      90003           3        53
4     178515  879507      90005           5        55      90006           6        56      90001           1        51      90002           2        52      90003           3        53      90004           4        54
5     178515  879507      90006           6        56      90001           1        51      90002           2        52      90003           3        53      90004           4        54      90005           5        55
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading