Home Extract all entries from a pandas df where the values are the same across all years

Questions

Extract all entries from a pandas df where the values are the same across all years

February 23, 2023

I have a dataframe that looks like this (with many more other countries, this is a sample):

df_dict = {'country': ['Japan','Japan','Japan','Japan','Japan','Japan','Japan', 'Greece','Greece','Greece','Greece','Greece','Greece','Greece'],
           'year': [2016, 2017,2018,2019,2020,2021,2022,2016, 2017,2018,2019,2020,2021,2022],
           'value': [320, 416, 172, 652, 390, 570, 803, 100, 100, 100, 100, 100, 100,100]}

df = pd.DataFrame(df_dict)

I want to extract all the entries where the value is the same across all years. Sometimes it could be 100, sometimes it could be another value, but the example here is with 100.

I’m not really sure how to go about this

The output should look like this.

df_dict2 = {'country': ['Greece','Greece','Greece','Greece','Greece','Greece','Greece'],
           'year': [2016, 2017,2018,2019,2020,2021,2022],
           'value': [100, 100, 100, 100, 100, 100,100]}

df2 = pd.DataFrame(df_dict2)

>Solution :

If you want to know the countries that have the same value across all years use groupby.nunique:

s = df.groupby('country')['value'].nunique()

out = list(s[s.eq(1)].index)

Output: ['Greece']

If you also want the value, go for a groupby.agg with boolean indexing through loc:

(df.groupby('country')['value'].agg(['nunique', 'first'])
   .loc[lambda d: d.pop('nunique').eq(1), 'first']
)

Output:

country
Greece    100
Name: first, dtype: int64

edit: filtering the original DataFrame:

s = df.groupby('country')['value'].nunique()
df[df['country'].isin(s[s.eq(1)].index)]

Or directly:

df[df.groupby('country')['value'].transform('nunique').eq(1)]

Output:

   country  year  value
7   Greece  2016    100
8   Greece  2017    100
9   Greece  2018    100
10  Greece  2019    100
11  Greece  2020    100
12  Greece  2021    100
13  Greece  2022    100