Extract text with special characters using regex python

I have a sequence of emails of the form firstname.lastname@gmail.com.

I would like to get firstname, lastname and domain using regex.

I managed to get the domain like this:

domain = re.search('@.+', email).group()

but I'm having trouble with the first name and last name.

Could you please explain how to do it?

>Solution :

To access the matched substrings, wrap the parts of the pattern you want in parentheses; each pair forms a capturing group. The regular expression below has three capturing groups, matching the first name, the last name, and the domain, respectively.

import re

# email is assumed to hold one address, e.g. 'firstname.lastname@gmail.com'
m = re.match(r'(.*)\.(.*)@(.*)', email)
assert m is not None
firstname = m.group(1)
lastname = m.group(2)
domain = m.group(3)
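
Since the question mentions a sequence of emails, the same idea can be applied to each address in turn. The following is a minimal sketch, not the exact pattern from the answer: it assumes a stricter regex with named groups, in which the name parts may not contain a dot or an @, and it returns None instead of asserting when an address has a different shape. The helper name split_email and the sample addresses are hypothetical.

```python
import re

# Stricter variant (an assumption, not the answer's exact regex):
# the name parts may not contain '.' or '@', and the whole string must match.
EMAIL_RE = re.compile(r'(?P<first>[^.@]+)\.(?P<last>[^.@]+)@(?P<domain>[^@]+)')

def split_email(email):
    """Return (firstname, lastname, domain), or None if the address has another shape."""
    m = EMAIL_RE.fullmatch(email)
    if m is None:
        return None
    return m.group('first'), m.group('last'), m.group('domain')

# Hypothetical sample data
emails = ['jane.doe@gmail.com', 'no-dot@gmail.com']
for email in emails:
    print(email, '->', split_email(email))
```

Returning None lets the caller decide how to handle addresses that do not follow the firstname.lastname form, rather than stopping at the first one.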

Two more notes:

  1. The dot that separates the first name from the last name must be escaped with a backslash; an unescaped dot matches any character.
  2. Prefixing the pattern with r makes it a raw string, so the backslash does not have to be doubled.
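
Both notes can be checked with a short sketch. The address used here is a hypothetical one with no dot in the local part, chosen to show what an unescaped dot does.

```python
import re

# Both literals denote the same regex; the raw string avoids doubling the backslash.
raw_pattern = r'(.*)\.(.*)@(.*)'
escaped_pattern = '(.*)\\.(.*)@(.*)'
same = raw_pattern == escaped_pattern

# Hypothetical address without a dot before the '@'.
address = 'janedoe@gmail.com'

# Unescaped dot: '.' matches any character, so the pattern still matches
# and silently consumes one letter of the name.
unescaped = re.match(r'(.*).(.*)@(.*)', address)

# Escaped dot: a literal '.' is required before the '@', so there is no match.
escaped = re.match(raw_pattern, address)

print(same)
print(unescaped.groups())
print(escaped)
```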