Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Python How to locate a value in a different column based off two other columns

I have a data set that has 4 columns in it. One with random numbers, one with 1s and 0s, another that breaks it into segments and one that sees if the value is different than the one before it.

Ex Data Set:

Column1   Column2  Column3  Column4
  10         1       1        1
  10         1       1        0
  14         1       1        1
  14         1       1        0
  16         0       0        1
  17         1       2        1
  17         1       2        0
  19         1       2        1
  20         0       0        1
  24         0       0        1
  25         1       3        1
  25         1       3        0
  25         1       3        0
  29         1       3        1
  30         1       3        1
  30         1       3        0

I am trying to get the first and last value of column1 inside of each segment based off of the criteria of column 4. So inside of the segment, the first-time column 4 has a 1, I want that value returned and the last time column 4 has a 1, I want that value returned as well.

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

What I want returned:

Column3   Column45(first)  Column6 (last)
  1          10              14
  2          17              19
  3          25              30

What I have tried (does not work):

firstlast= (df.assign(fl=df['Column1'].abs()).groupby([['Column3' , 'Column4']]).size()['Column1'].agg([('min' , 'min'), ('max', 'max')]).add_prefix('Column1')) 

print (firstlast)

Still trying to get use to Python so I am open to new ways of trying to achieve this!

>Solution :

Try this:

df_filtered = df[df['Column4'] == 1]
firstlast = df_filtered.groupby('Column3')['Column1'].agg([('Column5', 'first'), ('Column6', 'last')])
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading