I have two dataframe such as :
Tab1
COL1 COL2 COL3
Sp1 False A
Sp2 False B
Sp3 True X
Sp4 True X
Sp5 False A
Sp6 False C
Tab2
COL1 COL2_bis COL3_bis
Sp1 True X
Sp4 True F
Sp6 True X
And I would like to correct the Tab1 COL2 and COL3 columns compared to Tab2 COL2_bis and COL3_bis columns for corresponding COL1 values between the two tabs.
I should then get:
Tab1
COL1 COL2 COL3
Sp1 True X
Sp2 False B
Sp3 True X
Sp4 True F
Sp5 False A
Sp6 True X
Does somene have an idea please ?
Here are the table in dict format if it cna helps:
tab1:
{'COL1': {0: 'Sp1', 1: 'Sp2', 2: 'Sp3', 3: 'Sp4', 4: 'Sp5', 5: 'Sp6'}, 'COL2': {0: False, 1: False, 2: True, 3: True, 4: False, 5: False}, 'COL3 ': {0: 'A', 1: 'B ', 2: 'X ', 3: 'X ', 4: 'A ', 5: 'C '}}
tab2:
{'COL1': {0: 'Sp1', 1: 'Sp4', 2: 'Sp6'}, 'COL2_bis': {0: True, 1: True, 2: True}, 'COL3_bis ': {0: 'X', 1: 'F', 2: 'X'}}
>Solution :
Use DataFrame.update by COL1 with set to indices and remove _bis substrings from columns names:
df1 = df1.set_index('COL1')
df2 = df2.set_index('COL1').rename(columns=lambda x: x.replace('_bis', ''))
df1.update(df2)
print (df1)
COL2 COL3
COL1
Sp1 True X
Sp2 False B
Sp3 True X
Sp4 True F
Sp5 False A
Sp6 True X
df = df1.reset_index()