Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

How to swap multiple values within a row based on a conditional in pandas

Description of my problem

I pulled some data from an api that is unfortunately formatted. In particular, there are four columns I need to utilize in some way to solve my issue. They are label_1, odds_1, label_2, and line_2. The odds_ columns can consist of either Over or Under. Ideally, I’d want one of the odds_ columns to consist of exclusively Over while the other consists of exclusively Under. But alas, that is not how the data was formatted, so I am tasked with doing this myself. Ideally, label_ would be all Over while label_2 would be all under.

Here is a visual of my problem:

My data

odds_1  label_1 line_1  odds_2  label_2 line_2                      
-165    Under   3.5      130     Over    3.5
-137    Under   2.5      108     Over    2.5
-104    Over    10.5    -122    Under   10.5
-117    Over    26.5    -109    Under   26.5
 100    Over    2.5     -125    Under    2.5
-117    Over    14.5    -109    Under   14.5

My desired output

odds_1  label_1 line_1  odds_2  label_2 line_2                      
 130    Over    3.5     -165    Under    3.5
 108    Over    2.5     -137    Under    2.5
-104    Over    10.5    -122    Under   10.5
-117    Over    26.5    -109    Under   26.5
 100    Over    2.5     -125    Under    2.5
-117    Over    14.5    -109    Under   14.5

What I have tried

  • I tried using pandas.DataFrame.at, but that poses multiple problems. 1. I don’t believe there is a way to vectorize it, and 2. It doesn’t address the necessary swap of the odds_ values.

    MEDevel.com: Open-source for Healthcare and Education

    Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

    Visit Medevel

  • I have also tried numpy.where. This fixes the vectorization issue, but does not address the swap of the odds_ columns (at least the way I’m currently doing it). I haven’t discovered a way to nest np.where to make this work, but maybe I’m missing something

df = np.where(df.label_1 == 'Under', 'Over', df.label_1)

odds_1  label_1 line_1  odds_2  label_2 line_2                      
-165    Over    3.5      130    Under    3.5
-137    Over    2.5      108    Under    2.5
-104    Over    10.5    -122    Under   10.5
-117    Over    26.5    -109    Under   26.5
 100    Over    2.5     -125    Under    2.5
-117    Over    14.5    -109    Under   14.5
  • I have also tried pandas.DataFrame.iterrows but can’t quite get that to work either. Still not sure how to address the swap of the odds_ columns here.
for idx, row in df.iterrows():
  row.label_1 = 'Over' where some conditional?

Hope I gave enough info. Any help would be greatly appreciated. Thanks!

>Solution :

You can swap values like a, b = b, a using boolean mask to select right rows (the 2 first):

m = df['label_1'] == 'Under'
cols1 = ['odds_1', 'label_1', 'line_1']
cols2 = ['odds_2', 'label_2', 'line_2']
df.loc[m, cols1], df.loc[m, cols2] = df.loc[m, cols2].values, df.loc[m, cols1].values

Output:

>>> df
   odds_1 label_1  line_1  odds_2 label_2  line_2
0     130    Over     3.5    -165   Under     3.5
1     108    Over     2.5    -137   Under     2.5
2    -104    Over    10.5    -122   Under    10.5
3    -117    Over    26.5    -109   Under    26.5
4     100    Over     2.5    -125   Under     2.5
5    -117    Over    14.5    -109   Under    14.5
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading