Dictionary update method for DataFrames

Dictionaries have an update method that adds or overwrites items using those of another dictionary. Here is an example:

d1 = {"asd": 0, "lol": 1}
d2 = {"lol": 2, "foo": 3}
d1.update(d2)
assert d1 == {'asd': 0, 'lol': 2, 'foo': 3}

I am trying to obtain a similar result with DataFrames. This is what I have:

>>> df1 = pd.DataFrame(0, index=range(3), columns=["a"])
>>> df1
    a
0   0
1   0
2   0
>>> df2 = pd.DataFrame(1, index=range(2, 4), columns=["a"])
>>> df2
    a
2   1
3   1

I have tried with update, which gives the following result:

>>> df1.update(df2)
>>> df1
    a
0   0.0
1   0.0
2   1.0

The conversion from int to float is mildly infuriating, but the main problem is that rows present only in df2 are not added. The expected result is:

    a
0   0
1   0
2   1
3   1

Solution:

The reason is that DataFrame.update currently implements only a left join, as its documentation states:

join : {‘left’}, default ‘left’

Only left join is implemented, keeping the index and columns of the original object.
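
As a rough illustration of what the left join means in practice, here is a minimal sketch that reuses the frames from the question. It only shows that update overwrites labels already present in the caller, so label 3 from df2 never appears. The exact dtype of the result may depend on your pandas version.

```python
import pandas as pd

df1 = pd.DataFrame(0, index=range(3), columns=["a"])
df2 = pd.DataFrame(1, index=range(2, 4), columns=["a"])

# update modifies df1 in place and returns None
df1.update(df2)

# The index of df1 is unchanged: label 3 exists only in df2 and is ignored,
# while label 2 exists in both and is overwritten with the value from df2.
print(list(df1.index))
print(df1["a"].tolist())
```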

DataFrame.combine_first, on the other hand, is documented as:

Update null elements with value in the same location in other.

Combine two DataFrame objects by filling null values in one DataFrame with non-null values from other DataFrame. The row and column indexes of the resulting DataFrame will be the union of the two. The resulting dataframe contains the ‘first’ dataframe values and overrides the second one values where both first.loc[index, col] and second.loc[index, col] are not missing values, upon calling first.combine_first(second).

df2.combine_first(df1)
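
Applied to the frames from the question, this could look like the sketch below. The astype(int) call is my own addition, not part of the original answer: it is only there in case your pandas version returns floats after taking the union of the indexes, and it is safe here because the result contains no missing values.

```python
import pandas as pd

df1 = pd.DataFrame(0, index=range(3), columns=["a"])
df2 = pd.DataFrame(1, index=range(2, 4), columns=["a"])

# df2 is the caller, so its values take priority;
# df1 only fills in the labels that df2 does not have.
result = df2.combine_first(df1)

# Optional (assumption: no NaN left in the result): cast back to int
# in case the index union introduced floats.
result = result.astype(int)
print(result)
```

Note one difference from dict.update: missing values in df2 do not overwrite existing values in df1, because combine_first only takes non-null values from the caller.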
