Validate consecutive numbers in column, based on category in another column

October 9, 2023

I am working with a DataFrame with the following structure:

import pandas as pd

data = {'alpha_3_id': ['LIA', 'LIA', 'LIA', 'LIA', 'MIL', 'MIL', 'DEA', 'DEA', 'DEA', 'DEA'],
        'id': [1, 2, 3, 4, 1, 2, 1, 2, 3, 4]
       }
df = pd.DataFrame(data)

  alpha_3_id  id
0        LIA   1
1        LIA   2
2        LIA   3
3        LIA   4
4        MIL   1
5        MIL   2
6        DEA   1
7        DEA   2
8        DEA   3
9        DEA   4

I need to validate that, for each "alpha_3_id", the numbers in column "id" are consecutive.

I tried the code below, but it only checks whether all values in a single list are consecutive. I need to run the check separately for each category (LIA, MIL, DEA, etc.).

all(j == i + 1 for i, j in zip(list_of_values, list_of_values[1:]))

Solution:

To test for consecutive values, use a custom lambda function with Series.diff, omit the first value (it is always NaN, since it has no predecessor), and test whether all remaining differences equal 1 with Series.all:

out = df.groupby('alpha_3_id')['id'].apply(lambda x: x.diff().iloc[1:].eq(1).all())
print(out)
alpha_3_id
DEA    True
LIA    True
MIL    True
Name: id, dtype: bool

To reduce the per-group results to a single boolean for the whole DataFrame, chain another all:

out = df.groupby('alpha_3_id')['id'].apply(lambda x: x.diff().iloc[1:].eq(1).all()).all()
print(out)
True
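The sample data passes the check, so it may help to see a failing case. The following is a minimal sketch, assuming a hypothetical DataFrame `df_bad` (not from the question) in which the group MIL skips from 1 to 3; the helper name `is_consecutive` is also only illustrative. It lists the categories whose "id" values are not consecutive:

```python
import pandas as pd

# Hypothetical data for illustration: group 'MIL' jumps from 1 to 3
df_bad = pd.DataFrame({
    'alpha_3_id': ['LIA', 'LIA', 'LIA', 'MIL', 'MIL', 'DEA'],
    'id': [1, 2, 3, 1, 3, 1],
})

def is_consecutive(s):
    # diff() gives NaN for the first element, so skip it,
    # then require every remaining step to be exactly 1
    return s.diff().iloc[1:].eq(1).all()

# One boolean per category
per_group = df_bad.groupby('alpha_3_id')['id'].apply(is_consecutive).astype(bool)

# Names of the categories that fail the check
failing = per_group[~per_group].index.tolist()
print(per_group)
print(failing)
```

Note that a group with a single row (DEA here) passes, because there are no differences left to test after the first value is dropped.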

Another option is to use SeriesGroupBy.diff, drop the first row of each group with a mask built from Series.duplicated, compare the remaining differences with 1, and finally call Series.all:

out = df.groupby('alpha_3_id')['id'].diff()[df['alpha_3_id'].duplicated()].eq(1).all()
print(out)
True
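The same mask can also point at the offending rows instead of returning a single boolean. This is a sketch under assumptions: `df_bad` is hypothetical data (not from the question) with a gap in the MIL group, and `step`, `not_first` and `bad_rows` are illustrative names.

```python
import pandas as pd

# Hypothetical data for illustration: group 'MIL' jumps from 1 to 3
df_bad = pd.DataFrame({
    'alpha_3_id': ['LIA', 'LIA', 'LIA', 'MIL', 'MIL', 'DEA'],
    'id': [1, 2, 3, 1, 3, 1],
})

# Difference to the previous 'id' within each category (NaN on each first row)
step = df_bad.groupby('alpha_3_id')['id'].diff()

# True for every row except the first occurrence of each category
not_first = df_bad['alpha_3_id'].duplicated()

# Rows that are not the first of their group and whose step is not 1
bad_rows = df_bad[not_first & step.ne(1)]
print(bad_rows)
```

An empty `bad_rows` corresponds to the one-liner above returning True.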

pandas

by MR

Published October 09, 2023
