Iterating over 2 columns and comparing similarities in Python

I have a pandas DataFrame that looks like this:

Row      Account_Name_HGI           company_name_Ignite
1        00150042 plc               WAGON PLC
2        01 telecom, ltd.           01 TELECOM LTD
3        0404 investments limited   0404 Investments Ltd

What I am trying to do is iterate through the Account_Name_HGI and company_name_Ignite columns, compare the two strings in each row, and get a similarity score for that row. I already have the code that computes the score:

from difflib import SequenceMatcher

def similar(a, b):
    return SequenceMatcher(None, a, b).ratio()

That returns the similarity score I want, but I am stuck on the logic for a for loop that iterates over the two columns and returns the score for each row. Any help will be appreciated.


Solution:

Use a list comprehension that zips the two columns together, so each pair of values from the same row is compared:

from difflib import SequenceMatcher

df['ratio'] = [SequenceMatcher(None, a, b).ratio()
               for a, b
               in zip(df['Account_Name_HGI'], df['company_name_Ignite'])]

print(df)
   Row          Account_Name_HGI   company_name_Ignite     ratio
0    1              00150042 plc             WAGON PLC  0.095238
1    2          01 telecom, ltd.        01 TELECOM LTD  0.266667
2    3  0404 investments limited  0404 Investments Ltd  0.818182
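
The scores for rows 1 and 2 are low partly because SequenceMatcher is case-sensitive, so "plc" and "PLC" share no characters. If case differences should not count against the match, one option is to lowercase both strings before comparing. The sketch below is only an illustration under that assumption: the helper name similar_ignore_case and the two plain lists standing in for the DataFrame columns are hypothetical, and the values are copied from the question.

```python
from difflib import SequenceMatcher

# Stand-ins for df['Account_Name_HGI'] and df['company_name_Ignite'],
# copied from the sample rows in the question.
account_names = ["00150042 plc", "01 telecom, ltd.", "0404 investments limited"]
company_names = ["WAGON PLC", "01 TELECOM LTD", "0404 Investments Ltd"]

def similar_ignore_case(a, b):
    # Hypothetical helper: lowercase both inputs so that
    # "PLC" and "plc" are treated as identical before scoring.
    return SequenceMatcher(None, str(a).lower(), str(b).lower()).ratio()

# Same zip-based pattern as the answer, applied row by row.
ratios = [similar_ignore_case(a, b)
          for a, b in zip(account_names, company_names)]

for a, b, r in zip(account_names, company_names, ratios):
    print(f"{a!r} vs {b!r}: {r:.6f}")
```

With a real DataFrame, the same comprehension can be assigned to a new column by zipping df['Account_Name_HGI'] and df['company_name_Ignite'] instead of the two lists. On the sample data this raises row 2 from about 0.27 to about 0.93, while row 1 stays low because the names really are different.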