Counting how many values in one column also appear in the same column of another df

I’ve been having some difficulty counting how many values from a column in one df also exist in the same column of another df. Here’s what I’m working with:

import pandas as pd

data1 = {
    "Col 1": ['a','b','c'],
    "Col 2": [1,2,3]
}

data2 = {
    "Col 1": ['f','a','b'],
    "Col 2": [4,5,6]
}

df1 = pd.DataFrame(data1)
df2 = pd.DataFrame(data2)

df1
df2

Here is my code for counting how many values in Col 1 of df1 also appear in Col 1 of df2 (Col 2 is only there to show that I want to iterate without using iterrows):

count = 0 #Setting the variable count
for row in df1['Col 1']: #Iterating through each row.
    if row in df2['Col 1']:
        count += 1 #Increasing count by 1 every time there's a repeated value.
print("Count:", count) 

When I run this, my count comes back as 0 when it should be 2, since df1['Col 1'] and df2['Col 1'] share 'a' and 'b'.

I’m sure this is a minor error, but I’d appreciate a nudge in the right direction. Thanks!

Solution:

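When you check whether a pandas column contains a certain value with the in operator, you need to add .values. Without it, in tests against the index labels of the Series rather than its contents. df2['Col 1'] is a Series that carries both the index (0, 1, 2) and the column's values, so 'a' in df2['Col 1'] is False. You can see the index next to the values when you print it: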

print(df2['Col 1'])

[Image: printed output of df2['Col 1'], showing the index next to the column's values]

Changing this to print(df2['Col 1'].values) yields an array holding just the contents of the column: ['f' 'a' 'b']. Your if statement can then find your strings in that array.
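To see the difference directly, here is a small sketch. The Series s is just a stand-in for df2['Col 1'], and the variable names are made up for illustration:

```python
import pandas as pd

# Stand-in for df2['Col 1']; it gets the default index 0, 1, 2
s = pd.Series(['f', 'a', 'b'])

label_hit = 0 in s            # True: 0 is an index label
value_miss = 'a' in s         # False: 'a' is a value, not an index label
value_hit = 'a' in s.values   # True: .values exposes the column contents

print(label_hit, value_miss, value_hit)
```

The middle check is the one that trips up the original loop.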
Therefore, updating your code to:

count = 0 #Setting the variable count
for row in df1['Col 1']: #Iterating through each row.
    if row in df2['Col 1'].values:
        count += 1 #Increasing count by 1 every time there's a repeated value.
print("Count:", count) 

This prints Count: 2, your expected answer. More info on finding values in columns can be found here.
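Since the question mentions wanting to avoid row-by-row iteration, one possible alternative (not part of the fix above, just a sketch using the same sample data) is Series.isin, which builds a boolean mask in one call. Summing the mask gives the count:

```python
import pandas as pd

df1 = pd.DataFrame({"Col 1": ['a', 'b', 'c'], "Col 2": [1, 2, 3]})
df2 = pd.DataFrame({"Col 1": ['f', 'a', 'b'], "Col 2": [4, 5, 6]})

# True for each df1 value that also occurs somewhere in df2['Col 1']
mask = df1['Col 1'].isin(df2['Col 1'])

# True counts as 1 when summed, so this is the number of shared values
count = int(mask.sum())
print("Count:", count)
```

Note that this counts rows of df1 whose value appears in df2, so a value repeated in df1 would be counted once per repetition, which matches what the loop does.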
