Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

How to compare two consecutive rows grouped by another column?

Using pandas, I have created a dataframe that contains the info about weed severity by field. Every field can contain various types of weed. Records are sorted by [field, weed, date] in order to observe the evolution of weed. Ideally, the severity of every weed should be supressed as the time goes by. In order to detect critical items, I would like to compare two consecutive weed severities, grouped by field and weed type and annotate the supression status as the difference between the two observed severities.

Below is the example data

Field   Date    Weed    Severity
1   2021-06-08  Cirsium arvense 3
1   2021-07-06  Cirsium arvense 3
1   2021-07-27  Cirsium arvense 2
1   2021-08-19  Cirsium arvense 1
1   2021-07-26  Lactuca Serriola    3
1   2021-06-08  Myagrum perfoliatum 2
1   2021-07-06  Sonchus oleraceus   3
1   2021-08-19  Sorghum halepense   2
1   2021-07-26  Xanthium strumarium 2
1   2021-07-27  Xanthium strumarium 2
1   2021-08-19  Xanthium strumarium 1
1   2020-07-08  Euphorbia cyparissias   1
1   2020-07-08  Matricaria chamomilla   1
1   2020-07-08  Polygonum Lapathifolium 1
1   2020-07-08  Sorghum halepense   5
1   2020-07-08  Xanthium strumarium 1
2   2021-09-14  Ambrosia artemisifolia  2
2   2021-05-29  Chenopodium album   1
2   2021-05-29  Cirsium arvense 1
2   2021-09-14  Echinochola crus gali   4
2   2021-05-29  Phragmites Australis    1
2   2021-05-29  Polygonum spp   1
2   2021-05-29  Setaria viridis 2
2   2021-05-29  Sinapis arvensis    1
2   2021-07-06  Sorghum halepense   4
2   2021-09-14  Sorghum halepense   3
2   2021-05-29  Xanthium strumarium 1
2   2021-09-14  Xanthium strumarium 3

On field 1, Cirsium arvense is successfully supressed as the severity decreases, thus the success rate should be (for rows 1 -4) [0, 0, 1, 1].

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

On the other hand, for example, Xanthium strumarium on field 2 (last two rows) is not successfully supressed and thus the values should be [0, -2].

In case of unconditionally comparing consecutive rows, I usually shift the column values and subtract the shifted column from the original. In case of observing the column, depending on other values, I wonder, what is a good way to observe the differences. Say rows 4 and 5 can’t be simply compared as the severity values of different weeds shouldn’t be compared.

>Solution :

Subtract from right side by shifted values by Series.rsub with DataFrameGroupBy.shift:

df['rate'] = (df['Severity'].rsub(df.groupby(['Field','Weed'])['Severity'].shift())
                            .fillna(0, downcast='infer'))

print (df)
    Field       Date                     Weed  Severity  rate
0       1 2021-06-08          Cirsium arvense         3     0
1       1 2021-07-06          Cirsium arvense         3     0
2       1 2021-07-27          Cirsium arvense         2     1
3       1 2021-08-19          Cirsium arvense         1     1
4       1 2021-07-26         Lactuca Serriola         3     0
5       1 2021-06-08      Myagrum perfoliatum         2     0
6       1 2021-07-06        Sonchus oleraceus         3     0
7       1 2021-08-19        Sorghum halepense         2     0
8       1 2021-07-26      Xanthium strumarium         2     0
9       1 2021-07-27      Xanthium strumarium         2     0
10      1 2021-08-19      Xanthium strumarium         1     1
11      1 2020-07-08    Euphorbia cyparissias         1     0
12      1 2020-07-08    Matricaria chamomilla         1     0
13      1 2020-07-08  Polygonum Lapathifolium         1     0
14      1 2020-07-08        Sorghum halepense         5     0
15      1 2020-07-08      Xanthium strumarium         1     0
16      2 2021-09-14   Ambrosia artemisifolia         2     0
17      2 2021-05-29        Chenopodium album         1     0
18      2 2021-05-29          Cirsium arvense         1     0
19      2 2021-09-14    Echinochola crus gali         4     0
20      2 2021-05-29     Phragmites Australis         1     0
21      2 2021-05-29            Polygonum spp         1     0
22      2 2021-05-29          Setaria viridis         2     0
23      2 2021-05-29         Sinapis arvensis         1     0
24      2 2021-07-06        Sorghum halepense         4     0
25      2 2021-09-14        Sorghum halepense         3     1
26      2 2021-05-29      Xanthium strumarium         1     0
27      2 2021-09-14      Xanthium strumarium         3    -2
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading