Home How to count letter based similarity on pandas dataframe

Questions

How to count letter based similarity on pandas dataframe

April 8, 2022

Here’s my first dataframe df1

Id   Text
1    dFn
2    fiqe
3    raUw

Here’s my second dataframe df2

Id   Text
1    yuw
2    dnag

Similarity Matrix, columns is Id from df1, rows is Id from df2

       1      2      3
1      0      0   0.66  
2    0.5      0   0.25

Note:

0 value in (1,1), (2,1) and (3,2) because no letter similar

0.25 value in (3,1) is because of only 1 letter from raUw avaliable in 4 letter `dnag’ (1/4 equals 0.25)

0.5 is counted because of 2 of 4 letter similar

0.66 is counted because of 2 of 3 words similar

>Solution :

IIUC, one option is to use set.intersection in a nested list comprehension:

out = pd.DataFrame([[len(set(x.lower()) & set(y.lower())) / len(x) for y in df1['Text'].tolist()] for x in df2['Text'].tolist()])

Output:

     0    1         2
0  0.0  0.0  0.666667
1  0.5  0.0  0.250000

similarity

byMR

Published April 08, 2022

Add a comment

Program is skipping inner loop

byMR

April 8, 2022

Questions

Assigning Unique Colors To Multiple Lines on a Graph

byMR

April 8, 2022

Questions

Java inheritance questions

byMR

April 8, 2022

Questions

Reading html content with without escaping?

byMR

April 8, 2022

Questions

How do you set an attribute value that is a JSON literal within a DOM

byMR

April 8, 2022

Questions

Deserialize JSON Date to C# Object with DateTime Returning 01/01/0001

byMR

April 8, 2022

How to count letter based similarity on pandas dataframe

MEDevel.com: Open-source for Healthcare and Education

>Solution :

Like this:

Leave a ReplyCancel reply

Read more

Program is skipping inner loop

Assigning Unique Colors To Multiple Lines on a Graph

Java inheritance questions

Reading html content with without escaping?

How do you set an attribute value that is a JSON literal within a DOM

Deserialize JSON Date to C# Object with DateTime Returning 01/01/0001

Keep Up to Date with the Most Important News

How to count letter based similarity on pandas dataframe

MEDevel.com: Open-source for Healthcare and Education

>Solution :

Share this:

Like this:

Leave a ReplyCancel reply

Keep Up to Date with the Most Important News

Read more

Program is skipping inner loop

Assigning Unique Colors To Multiple Lines on a Graph

Java inheritance questions

Reading html content with without escaping?

How do you set an attribute value that is a JSON literal within a DOM

Deserialize JSON Date to C# Object with DateTime Returning 01/01/0001

Discover more from Dev solutions