Home how to add a dataframe with another dataframe and add common columns values based on a column?

Questions

how to add a dataframe with another dataframe and add common columns values based on a column?

May 27, 2022

my first data frame-

    import pandas as pd
    
    df1 = pd.DataFrame({'CONTRACT':['Tom', 'nick', 'krish', 'jack'],
            'buy_qty':[20, 40, 50, 60],
            'buy_val':[100,120,140,160],
            'factor':[15,20,25,30],
            })
    
        CONTRACT    buy_qty buy_val factor
    0   Tom          20      100    96
    1   nick         40      110    93
    2   krish        50      140    91
    3   jack         60      160    83
      

second data frame- 

  

df2 = pd.DataFrame({'CONTRACT': ['Tom', 'nick', 'amit', 'joy'],
            'buy_qty': [5, 6, 15, 30],
            'buy_val': [15, 20, 35, 16],
            'factor': [67, 89, 76, 69]}

df2:
        CONTRACT    buy_qty buy_val factor
    0   Tom           5     15       67
    1   nick          6     20       89
    2   amit          15    35       76
    3   joy           30    16       69

I want dataframe Like this (all CONTRACT values of df2 and uncommon CONTRACT values of df1, by adding common CONTRACTS’s buy_qty and buy_val and factor will be from df2(latest data frame) …)-

        CONTRACT    buy_qty  buy_val  factor
    0   Tom         25*        115*      67
    1   nick        46*        130*      89
    2   krish       50         140       91
    4   jack        60         160       83
    2   amit        15         35        76
    3   joy         30         16        69
* denotes added values

I tried like this-

final = pd.concat([df2, df1]).add(df1['buy_qty','buy_val'],df2['buy_qty','buy_val']).drop_duplicates(subset=["CONTRACT"]).reset_index(drop=True)

but it’s not working fine…….

Can anyone please suggest a better way….
Thanks..

>Solution :

Unlike your previous question, here you have to groupby because you want to aggregate some values on your columns:

buy_qty: sum
buy_val: sum
factor: first*

* first because you want to keep the values from df2 and this is the first dataframe on pd.concat.

out = (pd.concat([df2, df1]).groupby('CONTRACT', as_index=False)
         .agg({'buy_qty': 'sum', 'buy_val': 'sum', 'factor': 'first'})
print(out)

# Output
  CONTRACT  buy_qty  buy_val  factor
0      Tom       25      115      67
1     amit       15       35      76
2     jack       60      160      83
3      joy       30       16      69
4    krish       50      140      91
5     nick       46      130      89