I was trying to work with dataframe that looks like
| home | away | home_score | away_score |
|---|---|---|---|
| Tampa Bay | Colorado | 3 | 1 |
| San Jose | Colombus | 1 | 3 |
| New England | San Jose | 1 | 5 |
| Colorado | Tampa Bay | 2 | 0 |
| New England | KC Wizards | 2 | 1 |
My goal is to compare ‘home_score’ with ‘away_score’ and choose the string from ‘home’ or ‘away’ to store that value in to separate column based on which score was lower.
For example, for the first row, as away_score is 1 I should be able to add "Colorado" to a separate column.
Desired outcome:
| home | away | home_score | away_score | lost_team |
|---|---|---|---|---|
| Tampa Bay | Colorado | 3 | 1 | Colorado |
I tried to search for the task but I was not successful in finding methods. I would really appreciate the help!
>Solution :
You can use np.where
df['lost_team'] = np.where(df['home_score'] < df['away_score'], df['home'], df['away'])
print(df)
# Output
home away home_score away_score lost_team
0 Tampa Bay Colorado 3 1 Colorado
1 San Jose Colombus 1 3 San Jose
2 New England San Jose 1 5 New England
3 Colorado Tampa Bay 2 0 Tampa Bay
4 New England KC Wizards 2 1 KC Wizards
If a draw is possible, use np.select:
conds = [df['home_score'] < df['away_score'],
df['home_score'] > df['away_score']]
choices = [df['home'], df['away']]
draw = df[['home', 'away']].agg(list, axis=1)
df['lost_team'] = np.select(condlist=conds, choicelist=choices, default=draw).explode()
df = df.explode('lost_team')
print(df)
# Output
home away home_score away_score lost_team
0 Tampa Bay Colorado 3 1 Colorado
1 San Jose Colombus 1 3 San Jose
2 New England San Jose 1 5 New England
3 Colorado Tampa Bay 2 0 Tampa Bay
4 New England KC Wizards 2 1 KC Wizards
5 Team A Team B 0 0 Team A # Row 1
5 Team A Team B 0 0 Team B # Row 2