Email Validation Using Regular Expressions in a Pandas DataFrame

I would like to do a simple email validation for a list import of email addresses into a database. I just want to make sure that there is content before the @ sign, an @ sign, content after the @ sign, and 2+ characters after the ‘.’. Here is a sample df:

import pandas as pd
import re

errors = {}

data = {'First Name': ['Sally', 'Bob', 'Sue', 'Tom', 'Will'],
        'Last Name': ['William', '', 'Wright', 'Smith', 'Thomas'],
        'Email Address': ['sally@gmail.co.uk', 'bobby123@gmail.com', 'suewright_123@yahoo.gov', 'tom.smith23@students.wacs.fl.us', '']}
df = pd.DataFrame(data)

This is the expression I was using to check for valid emails:

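# '-' sits last in [._-] so it is a literal rather than a range; the stray '|' is dropped from [A-Za-z]
regex = re.compile(r'([A-Za-z0-9]+[._-])*[A-Za-z0-9]+@[A-Za-z0-9-]+(\.[A-Za-z]{2,})+')
def isValid(email):
    # Returns None for a valid address and "Invalid email" otherwise
    if regex.fullmatch(email):
        return None
    else:
        return "Invalid email"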

This regex is working fine but I am not sure how to easily loop through my entire df email address column. I have tried:

for col in df['Email Address'].columns:
   for i in df['Email Address'].index:
      if df.loc[i,col] = 'Invalid email'
           errors={'row':i, 'column':col, 'message': 'this is not a valid email address'

I want to write the invalid emails to a dictionary named errors. With the above code I get an invalid syntax error.

Solution:

According to your description, I’d probably do

df["Email Address"].str.match(r"^.+@.+\..{2,}$")

str.match returns a boolean Series that is True for each row where the regex matches the string and False where it does not.
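As a quick check, here is a minimal sketch that runs the same pattern over the email column from the question. The names emails and is_valid are only illustrative and are not part of the original code.

```python
import pandas as pd

# Email column from the question's sample data
emails = pd.Series(
    ['sally@gmail.co.uk', 'bobby123@gmail.com', 'suewright_123@yahoo.gov',
     'tom.smith23@students.wacs.fl.us', ''],
    name='Email Address',
)

# One boolean per row: True where the pattern matches the whole string
is_valid = emails.str.match(r"^.+@.+\..{2,}$")
print(is_valid.tolist())  # [True, True, True, True, False]
```

Only the empty string in the last row fails, because there is nothing before or after an @ sign.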

The regex is

  • the start of the string ^
  • content before the @ sign .+
  • an @ sign @
  • content after the @ sign .+
  • a dot \.
  • 2+ characters after the ‘.’ .{2,}
  • and the end of the string $
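To get from the boolean result to the errors dictionary the question asks for, one option is to invert the mask and record each failing row. This is only a sketch: the shape of errors (row index mapped to a small dictionary with column, value and message) is an assumption, since the question does not pin it down, and na=False is passed so that any missing values would count as invalid.

```python
import pandas as pd

data = {'First Name': ['Sally', 'Bob', 'Sue', 'Tom', 'Will'],
        'Last Name': ['William', '', 'Wright', 'Smith', 'Thomas'],
        'Email Address': ['sally@gmail.co.uk', 'bobby123@gmail.com',
                          'suewright_123@yahoo.gov',
                          'tom.smith23@students.wacs.fl.us', '']}
df = pd.DataFrame(data)

col = 'Email Address'

# na=False makes missing values (NaN/None) count as non-matching
valid = df[col].str.match(r"^.+@.+\..{2,}$", na=False)

# Assumed structure: one entry per invalid row, keyed by the row index
errors = {
    i: {'column': col,
        'value': df.loc[i, col],
        'message': 'this is not a valid email address'}
    for i in df[~valid].index
}
print(errors)
```

With the sample data, only row 4 (the empty address) ends up in errors. No explicit loop over the column is needed, because str.match already checks every row.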