Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

how to add a dataframe with another dataframe and add common columns values based on a column?

my first data frame-

    import pandas as pd
    
    df1 = pd.DataFrame({'CONTRACT':['Tom', 'nick', 'krish', 'jack'],
            'buy_qty':[20, 40, 50, 60],
            'buy_val':[100,120,140,160],
            'factor':[15,20,25,30],
            })
    
        CONTRACT    buy_qty buy_val factor
    0   Tom          20      100    96
    1   nick         40      110    93
    2   krish        50      140    91
    3   jack         60      160    83
      

second data frame- 

  

df2 = pd.DataFrame({'CONTRACT': ['Tom', 'nick', 'amit', 'joy'],
            'buy_qty': [5, 6, 15, 30],
            'buy_val': [15, 20, 35, 16],
            'factor': [67, 89, 76, 69]}

df2:
        CONTRACT    buy_qty buy_val factor
    0   Tom           5     15       67
    1   nick          6     20       89
    2   amit          15    35       76
    3   joy           30    16       69

I want dataframe Like this (all CONTRACT values of df2 and uncommon CONTRACT values of df1, by adding common CONTRACTS’s buy_qty and buy_val and factor will be from df2(latest data frame) …)-

        CONTRACT    buy_qty  buy_val  factor
    0   Tom         25*        115*      67
    1   nick        46*        130*      89
    2   krish       50         140       91
    4   jack        60         160       83
    2   amit        15         35        76
    3   joy         30         16        69
* denotes added values

I tried like this-

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

final = pd.concat([df2, df1]).add(df1['buy_qty','buy_val'],df2['buy_qty','buy_val']).drop_duplicates(subset=["CONTRACT"]).reset_index(drop=True)

but it’s not working fine…….

Can anyone please suggest a better way….
Thanks..

>Solution :

Unlike your previous question, here you have to groupby because you want to aggregate some values on your columns:

  • buy_qty: sum
  • buy_val: sum
  • factor: first*

* first because you want to keep the values from df2 and this is the first dataframe on pd.concat.

out = (pd.concat([df2, df1]).groupby('CONTRACT', as_index=False)
         .agg({'buy_qty': 'sum', 'buy_val': 'sum', 'factor': 'first'})
print(out)

# Output
  CONTRACT  buy_qty  buy_val  factor
0      Tom       25      115      67
1     amit       15       35      76
2     jack       60      160      83
3      joy       30       16      69
4    krish       50      140      91
5     nick       46      130      89
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading