Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

How to Divide Dataset based on repeated Column Values in Pandas

I have a dataset like this:

df = pd.DataFrame({'customer_key': [13453, 16345, 14643, 15346, 13453],
                   'data_purchased': ['08-07-2021','06-07-2021','05-09-2020','02-09-2021','01-04-2020'],
                   'price_value': [56, 45, 29, 22, 43]})

Glimpse of the dataset:

customer_key  data_purchased   price_value

13453          08-07-2021         56
16345          06-07-2021         45
14643          05-09-2020         29
15346          02-09-2021         22
13453          01-04-2020         43

I want to divide this dataset into 2 different datasets. First dataset will contain only those rows where customer_keys values are repeated (ex- customer_key = 13453 in above dataset). and the second dataset contains only those rows where there is no repeated values of customer_key. How to do that in pandas ??

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

>Solution :

Here’s one approach:

repeat_customer_mask = df.groupby("customer_key")["customer_key"].transform("size") > 1

Then you can split your DataFrame like so:

In [6]: repeat_customers = df[repeat_customer_mask]

In [7]: repeat_customers
Out[7]:
   customer_key date_purchased  price_value
0         13453     08-07-2021           56
4         13453     01-04-2020           43

In [8]: first_time_customers = df[~repeat_customer_mask]

In [9]: first_time_customers
Out[9]:
   customer_key date_purchased  price_value
1         16345     06-07-2021           45
2         14643     05-09-2020           29
3         15346     02-09-2021           22

From there, you’d probably want to .reset_index().

Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading