Home Python Pandas – How to Keep All Observations By Selector Variable When Data are in Long Form

Questions

Python Pandas – How to Keep All Observations By Selector Variable When Data are in Long Form

August 8, 2023

I am working in Python 3.11.1. I have data like these stored in a Pandas Dataframe:

ID  Position    Select
1      A          0
2      B          1
2      C          0
3      B          0
3      C          0
4      A          1
5      A          0

Where some ID’s are recorded in multiple rows, but others only appear in a single row. I need to subset this dataset by keeping all single ID rows coded 1 for Select AND keeping ALL multiple ID rows if ANY one of those multiple rows is coded 1 for Select for the same ID. The resulting dataset should look like:

ID  Position    Select
2      B          1
2      C          0
4      A          1

What is the best way to do this?

Ultimately, I then need to covert from long to wide form. Therefore, the final result should be:

ID  Position1   Position2  Select
2      B            C        1
4      A                     1

Thanks in advance.

>Solution :

Try:

u = df.groupby("ID").filter(lambda x: (x.Select == 1).any())  # or just `lambda x: x.Select.any()` if there are only 0/1 values
print(u)

Prints:

   ID Position  Select
1   2        B       1
2   2        C       0
5   4        A       1

To long form:

u["col"] = u.groupby("ID").cumcount() + 1
s = u.groupby("ID")["Select"].any().astype(int)

u = u[["ID", "col", "Position"]].pivot(index="ID", columns="col")
u.columns = [f"{c[0]}{c[1]}" for c in u.columns]
print(pd.concat([u, s], axis=1).fillna("").reset_index())

Prints:

   ID Position1 Position2  Select
0   2         B         C       1
1   4         A                 1

pandas

byMR

Published August 08, 2023

Add a comment

How to access input reference from another file using Unform?

byMR

August 8, 2023

Questions

Trying to understand –x vs x– in C++

byMR

August 8, 2023

Questions

Terraform not recognizing import block when creating SQL Server database

byMR

August 8, 2023

Questions

Strange Mockito verify fail

byMR

August 8, 2023

Questions

Python ruamel.yaml library adds new lines where not expected

byMR

August 8, 2023

Questions

How I find a row inside pandas DataFrame with row data?

byMR

August 8, 2023

Python Pandas – How to Keep All Observations By Selector Variable When Data are in Long Form

MEDevel.com: Open-source for Healthcare and Education

>Solution :

Like this:

Leave a ReplyCancel reply

Read more

How to access input reference from another file using Unform?

Trying to understand –x vs x– in C++

Terraform not recognizing import block when creating SQL Server database

Strange Mockito verify fail

Python ruamel.yaml library adds new lines where not expected

How I find a row inside pandas DataFrame with row data?

Keep Up to Date with the Most Important News

Python Pandas – How to Keep All Observations By Selector Variable When Data are in Long Form

MEDevel.com: Open-source for Healthcare and Education

>Solution :

Share this:

Like this:

Leave a ReplyCancel reply

Keep Up to Date with the Most Important News

Read more

How to access input reference from another file using Unform?

Trying to understand –x vs x– in C++

Terraform not recognizing import block when creating SQL Server database

Strange Mockito verify fail

Python ruamel.yaml library adds new lines where not expected

How I find a row inside pandas DataFrame with row data?

Discover more from Dev solutions