Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Check if a value in a df is in a row after

First of all, my English is not my native language, sorry for my mistakes.

I am looking to automate some tasks done via Excel with Python.
I have a dataframe ordered by date/time, and I want to check if a customer has contacted me again after already having a response.

So I have a dataframe like this:

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel


|     Date   |     Tel      |
| ---------- | ------------ |
| 01-01-2023 | +33000000001 |
| 01-01-2023 | +33000000002 |
| 01-01-2023 | +33000000003 |
| 02-01-2023 | +33000000002 |
| 02-01-2023 | +33000000004 |

I’d like to add a column TRUE/FALSE if my client has contacted me later :

|     Date     |     Tel      |  Re-contact  | 
| ------------ | ------------ | ------------ |
| 01-01-2023   | +33000000001 |     FALSE    |
| 01-01-2023   | +33000000002 |     TRUE     |
| 01-01-2023   | +33000000003 |     FALSE    |
| 02-01-2023   | +33000000002 |     FALSE    |
| 02-01-2023   | +33000000004 |     FALSE    |

In Excel, I do this action as follows:

COUNTIFS(A2:A$5;A1)>0

And I would get my TRUE/FALSE if the phone number exists further in my list.

I looked at the documentation to see if a value existed in a list, but I couldn’t find a way to see if it existed further down. Also, I’m looking for a quick way to calculate it, as I have 100,000 rows in my dataframe.

# I've tried this so far: 

length = len(df.index) - 1
i = 1

for i in range(i, length):
    print(i)
    for x in df['number']:
    if x in df['number'][[i+1, length]]:
        df['Re-contact'] = 'TRUE'
    else:
        df['Re-contact'] = 'FALSE'
    i += 1

It feels very wrong to me, and my code takes too much time. I’m looking for a more efficient way to perform what I’m trying to do.

>Solution :

Use pandas.DataFrame.duplicated over Tel column to find repeated calls:

df['Re-contact'] = df.Tel.duplicated(keep='last')

         Date          Tel  Re-contact
0  01-01-2023  33000000001       False
1  01-01-2023  33000000002        True
2  01-01-2023  33000000003       False
3  02-01-2023  33000000002       False
4  02-01-2023  33000000004       False
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading