
Python: Adding two pandas dataframes together – Specific Columns

My question is an expansion of the one answered here:

Adding two pandas dataframes

Assume the same dataframes, but with a new column containing strings:


import pandas as pd

df1 = pd.DataFrame([('Dog',1,2),('Cat',3,4),('Rabbit',5,6)], columns=['Animal','a','b'])

df2 = pd.DataFrame([('Dog',100,200),('Cat',300,400),('Rabbit',500,600)], columns=['Animal','a','b'])

Using the solution from that answer would create this:

df_add = df1.add(df2, fill_value=0)

Out: 

       Animal        a    b
    0  DogDog       101  202
    1  CatCat       303  404
    2  RabbitRabbit 505  606

A potential solution would be to set the Animal column as the index, run the .add function, and then reset the index again. But is there a simpler way that just adjusts the formula df_add = df1.add(df2, fill_value=0) so that the following result is given:

Out: 

       Animal     a    b
    0  Dog       101  202
    1  Cat       303  404
    2  Rabbit    505  606

I tried df_add.iloc[:,1:] = df1.iloc[:,1:].add(df2.iloc[:,1:], fill_value=0) and it did not work.

>Solution :

Your question is not fully clear. Pandas performs addition after aligning both axes (the row index and the columns), so if you want to ensure that Dog is added to Dog irrespective of its position, setting the index is the way to go:

key = ['Animal']

out = df1.set_index(key).add(df2.set_index(key), fill_value=0).reset_index()

Output:

   Animal    a    b
0     Dog  101  202
1     Cat  303  404
2  Rabbit  505  606

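This ensures that rows are matched by animal rather than by position, and that an animal present in only one DataFrame is kept (its missing counterpart is treated as 0 because of fill_value=0). For example: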

df1 = pd.DataFrame([('Cat',3,4),('Dog',1,2),('Parrot',8,9)], columns=['Animal','a','b'])
df2 = pd.DataFrame([('Dog',100,200),('Cat',300,400),('Rabbit',500,600)], columns=['Animal','a','b'])

key = ['Animal']
df_add = df1.set_index(key).add(df2.set_index(key), fill_value=0).reset_index()

Output:

   Animal      a      b
0     Cat  303.0  404.0
1     Dog  101.0  202.0
2  Parrot    8.0    9.0
3  Rabbit  500.0  600.0
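
The values in the output above come back as floats, most likely because aligning two different indexes introduces missing values before fill_value is applied. If the columns are known to hold whole numbers, a minimal sketch like the following casts them back to integers. The name df_add_int is introduced here only for illustration, and the cast assumes no missing values remain after the addition.

```python
import pandas as pd

df1 = pd.DataFrame([('Cat', 3, 4), ('Dog', 1, 2), ('Parrot', 8, 9)],
                   columns=['Animal', 'a', 'b'])
df2 = pd.DataFrame([('Dog', 100, 200), ('Cat', 300, 400), ('Rabbit', 500, 600)],
                   columns=['Animal', 'a', 'b'])

key = ['Animal']

# Align on the Animal labels, treating an animal missing from one side as 0.
summed = df1.set_index(key).add(df2.set_index(key), fill_value=0)

# Assumption: every value is a whole number, so casting to int loses nothing.
df_add_int = summed.astype(int).reset_index()
print(df_add_int)
```

If some cells could still be missing after the addition, the cast would raise an error, so it is safer to keep the floats in that case.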

Now, if your DataFrames are already aligned, i.e. the animals are in the same order and the DataFrame indices are identical, you could use a quick trick to ignore the Animal column: set that of df2 to an empty string, so that adding it leaves each name in df1 unchanged:

df_add = df1.add(df2.assign(Animal=''))

Output:

   Animal    a    b
0     Dog  101  202
1     Cat  303  404
2  Rabbit  505  606

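This is, however, risky if you are not fully sure that the animals and the DataFrame indices are identical: rows would then be added by position, and the totals could silently end up under the wrong animal.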
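
One way to reduce that risk is to check the alignment before taking the shortcut. The sketch below is only an illustration: add_if_aligned is a hypothetical helper, not part of pandas, and it assumes both DataFrames have the same columns and that the key column contains unique values. It uses the empty-string trick only when the indices and the Animal columns match exactly, and otherwise falls back to aligning on the key.

```python
import pandas as pd

def add_if_aligned(left, right, key='Animal'):
    """Hypothetical helper: use the shortcut only when it is safe."""
    same_index = left.index.equals(right.index)
    same_keys = left[key].equals(right[key])
    if same_index and same_keys:
        # Rows already line up: blank the key on one side so it is unchanged.
        return left.add(right.assign(**{key: ''}))
    # Otherwise align explicitly on the key column.
    return left.set_index(key).add(right.set_index(key), fill_value=0).reset_index()

df1 = pd.DataFrame([('Dog', 1, 2), ('Cat', 3, 4), ('Rabbit', 5, 6)],
                   columns=['Animal', 'a', 'b'])
df2 = pd.DataFrame([('Dog', 100, 200), ('Cat', 300, 400), ('Rabbit', 500, 600)],
                   columns=['Animal', 'a', 'b'])

aligned = add_if_aligned(df1, df2)

# Same data as df2, but with the rows in reverse order and a fresh index.
df2_shuffled = df2.iloc[::-1].reset_index(drop=True)
realigned = add_if_aligned(df1, df2_shuffled)

print(aligned)
print(realigned)
```

In both calls each animal should end up with the same totals; only the row order of the result may differ when the fallback path is taken.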
