How do I split a full name into first name and last name in pandas when the first name consists of two or more words? Is there an easy way to do it?

I concatenated the two columns firstname and lastname into a single column. Now, how do I split it back into two columns when the first name consists of two, three, or more words? Is there an easy way to do it? I tried the str.split() method, but it fails with an error saying "columns should be the same length."

>Solution :

We can use str.extract here:

df[["firstname", "lastname"]] = df["fullname"].str.extract(r'^(\w+(?: \w+)*) (\w+)$')

The regex pattern above assigns as many words as possible, counting from the start of the name, to the first name, leaving only the final word for the last name. Here is a step-by-step explanation of the regex:

  • ^ from the start of the name
    • ( open first capture group \1
      • \w+ match the first word
      • (?: \w+)* then match a space and another word, together zero or more times
    • ) close first capture group
    • match a single space
    • (\w+) match and capture the last word as the last name in \2
  • $ end of the name
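
To see the pattern in action, here is a minimal sketch on a small made-up DataFrame. The sample names are invented for illustration; only the column names fullname, firstname, and lastname come from the question. It assumes that words are separated by single spaces.

```python
import pandas as pd

# Hypothetical sample data; the real DataFrame is not shown in the question.
df = pd.DataFrame(
    {"fullname": ["Juan Dela Cruz", "Mary Ann Smith", "John Doe", "Cher"]}
)

# Same pattern as in the answer: everything up to the last space goes to
# group 1 (first name), the final word goes to group 2 (last name).
pattern = r'^(\w+(?: \w+)*) (\w+)$'

# str.extract returns one column per capture group, so the result always
# has exactly two columns, regardless of how many words a name contains.
df[["firstname", "lastname"]] = df["fullname"].str.extract(pattern)

print(df)
```

Rows that do not match the pattern get NaN in both new columns. That includes single-word names such as "Cher" above, and also names containing characters that \w does not match, such as hyphens or apostrophes, so the pattern may need widening for real data.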