Is there a pandas function to group adjacent rows whose values fall within a given range?

I have a dataframe dfA like this:

chromosome  basepair            
chrA        500      
chrA        1000      
chrA        7000      
chrA        20000      
chrA        23000     
chrA        24000    
chrA        35000         
chrB        13000      
chrB        14000     
chrB        14500 

For each chromosome A position in dfA, I would like to scan the basepair column of the adjacent chromosome A rows and identify groups of rows separated by at most 5000 basepairs (i.e. a separation of 1-5000). I then want to repeat this for chromosome B and write a new dataframe dfB listing all the groups identified.

The output for dfB should be:

chromosome  basepair    Group ID            
chrA        500         1     
chrA        1000        1      
chrA        20000       2      
chrA        23000       2     
chrA        24000       2
chrA        23000       3     
chrA        24000       3      
chrB        13000       4
chrB        14000       4 
chrB        14500       4

>Solution :

Assuming you want to start a new group whenever the difference from the previous row (within the same chromosome) is greater than 5000, or whenever the value goes backwards:

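# A row starts a new group when its within-chromosome step is not in [0, 5000].
# The first row of each chromosome has a NaN diff, so it also starts a group.
df['Group ID'] = (~df.groupby('chromosome')['basepair']
                     .diff().between(0, 5000)
                  ).cumsum()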

Output:

  chromosome  basepair  Group ID
0       chrA       500         1
1       chrA      1000         1
2       chrA     20000         2
3       chrA     23000         2
4       chrA     24000         2
5       chrA     23000         3
6       chrA     24000         3
7       chrB     13000         4
8       chrB     14000         4
9       chrB     14500         4
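
To make the one-liner above easier to follow, here is a minimal, self-contained sketch that splits it into its three steps and rebuilds the frame shown in the Output table. Note that this frame, with the repeated 23000/24000 rows, is not the same as dfA in the question. The second half applies the same rule to the question's dfA and drops groups with a single row (7000 and 35000). That filtering step is only an assumption about what the asker wants, and it does not reproduce the overlapping group 3 from the desired output. The names `step`, `continues`, `raw_id` and `keep` are illustrative and not part of the original answer.

```python
import pandas as pd

# Frame taken from the Output table above (not the original dfA).
df = pd.DataFrame({
    "chromosome": ["chrA"] * 7 + ["chrB"] * 3,
    "basepair": [500, 1000, 20000, 23000, 24000, 23000, 24000,
                 13000, 14000, 14500],
})

# Step 1: difference from the previous row within the same chromosome.
# The first row of each chromosome has no predecessor, so its diff is NaN.
step = df.groupby("chromosome")["basepair"].diff()

# Step 2: True where the row continues the current group (0 <= step <= 5000).
# NaN compares as False, so the first row of each chromosome starts a group.
continues = step.between(0, 5000)

# Step 3: every group start adds 1 to a running counter, giving the group IDs.
df["Group ID"] = (~continues).cumsum()
print(df)

# Assumed variant: apply the same rule to the question's dfA, then keep only
# groups with at least two rows and renumber them consecutively.
dfA = pd.DataFrame({
    "chromosome": ["chrA"] * 7 + ["chrB"] * 3,
    "basepair": [500, 1000, 7000, 20000, 23000, 24000, 35000,
                 13000, 14000, 14500],
})
raw_id = (~dfA.groupby("chromosome")["basepair"]
              .diff().between(0, 5000)).cumsum()

# Size of each row's group; single-row groups are filtered out.
keep = raw_id.map(raw_id.value_counts()) > 1
dfB = dfA[keep].copy()

# factorize assigns 0, 1, 2, ... in order of first appearance.
dfB["Group ID"] = pd.factorize(raw_id[keep])[0] + 1
print(dfB)
```

Both versions depend on the rows already being in the order you want to scan them. If dfA is not sorted, sorting by chromosome and basepair first would remove the "goes backwards" case entirely.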