Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

How to create new column in pandas based on result of groupby without needing to use join

In R, I can use a group_by and later create a column based on the result of that group_by without needing to join with another dataframe because the "group_by" function can work with not only summarise, but also mutate.
Example in creating the max_date by each name:

library(tidyverse)
library(lubridate)

df = tibble(
  date = seq(ymd('2020-01-01'), ymd('2020-03-01'), '1 month') %>% rep(3),
  name = c(rep('Romulo', 3), rep('Daniel', 3), rep('Fernando', 3))
)

df %>% 
  group_by(name) %>% 
  mutate(max_date = max(date))

enter image description here

In pandas, as I know, we would have to use "agg" for date, create a new dataframe and than join with the "df" referred before:

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

import pandas as pd

df = pd.DataFrame({
   'date': ['2020-01-01', '2020-02-01', '2020-03-01'] * 3,
   'name': [
       'Romulo', 'Romulo', 'Romulo',
       'Daniel', 'Daniel', 'Daniel',
       'Fernando', 'Fernando', 'Fernando'
       ]
})

max_date_df = df.groupby(
    'name', as_index=False
).agg({'date': 'max'}).rename({'date': 'max_date'}, axis=1)

df.merge(max_date_df, on='name', how='left')

enter image description here

Is there an easier way to do that in pandas (without the need to join dataframes)?

>Solution :

You can use transform():

df["max_date"] = df.groupby("name")['date'].transform('max')

Output:

         date      name    max_date
0  2020-01-01    Romulo  2020-03-01
1  2020-02-01    Romulo  2020-03-01
2  2020-03-01    Romulo  2020-03-01
3  2020-01-01    Daniel  2020-03-01
4  2020-02-01    Daniel  2020-03-01
5  2020-03-01    Daniel  2020-03-01
6  2020-01-01  Fernando  2020-03-01
7  2020-02-01  Fernando  2020-03-01
8  2020-03-01  Fernando  2020-03-01
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading