Fill date into columns

I have a table:

user_id  event_type  date
001      install     01-01-2021
001      login       02-01-2021
001      login       04-01-2021

I need to create a new column "install" that holds, for every row of a given user_id, the date from the "date" column of that user's row where event_type is 'install'.

The resulting table should look like this:

user_id  event_type  date        install
001      install     01-01-2021  01-01-2021
001      login       02-01-2021  01-01-2021
001      login       04-01-2021  01-01-2021

>Solution :

You can use groupby.first to get the first install date per user_id, then Series.map to assign that date to every row of the same user:

# get the first install date per user (if a user has several install rows,
# convert the dates to datetime and take the min instead)
d = df[df['event_type'].eq('install')].groupby(df['user_id'])['date'].first()

# map the values per user_id
df['install'] = df['user_id'].map(d)

output:

   user_id event_type        date     install
0        1    install  01-01-2021  01-01-2021
1        1      login  02-01-2021  01-01-2021
2        1      login  04-01-2021  01-01-2021
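
For anyone who wants to reproduce this, here is a minimal self-contained sketch of the two steps above. It assumes user_id is stored as a string (so the leading zeros from the question survive) and that the dates are plain strings. The names installs and first_install are illustrative only.

```python
import pandas as pd

# Sample data from the question; user_id kept as strings (an assumption)
df = pd.DataFrame({
    "user_id": ["001", "001", "001"],
    "event_type": ["install", "login", "login"],
    "date": ["01-01-2021", "02-01-2021", "04-01-2021"],
})

# Keep only the install rows, then take the first date per user
installs = df.loc[df["event_type"].eq("install")]
first_install = installs.groupby("user_id")["date"].first()

# Look up each row's user_id in that per-user Series
df["install"] = df["user_id"].map(first_install)
print(df)
```

Users with no install row would get NaN in the new column, since map finds no matching key for them.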

As a one-liner:

df['install'] = df['user_id'].map(df[df['event_type'].eq('install')]
                                    .groupby(df['user_id'])['date'].first())
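
If a user can have more than one install row, the earliest date is probably what you want. A rough sketch of that variant follows, assuming the dates are DD-MM-YYYY strings as in the question. The extra rows and the names events and earliest are made up for illustration. The dates are parsed first because the minimum of DD-MM-YYYY strings is alphabetical, not chronological.

```python
import pandas as pd

# Hypothetical data: user 001 installed twice, user 002 never installed
events = pd.DataFrame({
    "user_id": ["001", "001", "001", "002"],
    "event_type": ["install", "install", "login", "login"],
    "date": ["05-02-2021", "01-03-2021", "02-03-2021", "03-01-2021"],
})

# Parse the strings so that min() compares real dates
events["date"] = pd.to_datetime(events["date"], format="%d-%m-%Y")

# Earliest install per user, then mapped back onto every row
earliest = (
    events.loc[events["event_type"].eq("install")]
    .groupby("user_id")["date"]
    .min()
)
events["install"] = events["user_id"].map(earliest)
print(events)
```

Here user 001 gets 5 February 2021 on all rows, while user 002, who has no install event, is left with a missing value (NaT).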