Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Merging two dataframes and updating count and date

I have this dataframe:

         number  total_complaint_count first_complaint_on last_complaint_on  
0    0000000000                     77         2021-10-29        2021-12-05   
1   00000000000                      1         2021-11-12        2021-11-12   
2  000000000000                      1         2021-11-07        2021-11-07   
3   00020056234                      1         2021-11-23        2021-11-23   
4    0002266648                      1         2021-11-02        2021-11-02   
5    0012684202                      1         2021-11-15        2021-11-15   
6    0033774983                      1         2021-11-12        2021-11-12   
7   00420056234                      1         2021-11-23        2021-11-23   
8   00620056234                      1         2021-11-23        2021-11-23   
9    0101276541                      1         2021-11-16        2021-11-16   

And this dataframe:

         number  total_complaint_count first_complaint_on last_complaint_on  
0    0000000000                      3         2021-10-29        2021-15-05   
1   00000000000                      1         2021-11-12        2021-16-12   
2  000000000000                      1         2021-11-07        2021-16-07   

How can I merge them by the number so that the total_complaint_count getts added to the one in the first dataframe and last_complaint_on date is updated with the later date?

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

Like this:

         number  total_complaint_count first_complaint_on last_complaint_on  
0    0000000000                     80         2021-10-29        2021-15-05   
1   00000000000                      2         2021-11-12        2021-16-12   
2  000000000000                      2         2021-11-07        2021-16-07   
3   00020056234                      1         2021-11-23        2021-11-23   
4    0002266648                      1         2021-11-02        2021-11-02   
5    0012684202                      1         2021-11-15        2021-11-15   
6    0033774983                      1         2021-11-12        2021-11-12   
7   00420056234                      1         2021-11-23        2021-11-23   
8   00620056234                      1         2021-11-23        2021-11-23   
9    0101276541                      1         2021-11-16        2021-11-16   

I tried this approach:

    if df2["number"].isin(df1, "number"):
        df.loc(df2["number"].isin(df1, "number"))

But don’t know how to proceed further.

If the number is not present in the first dataframe, it’s simply added to the rows.

Dict of the first table for reproduction:

{'number': {0: '0000000000', 1: '00000000000', 2: '000000000000', 3: '00020056234', 4: '0002266648', 5: '0012684202', 6: '0033774983', 7: '00420056234', 8: '00620056234', 9: '0101276541'}, 'total_complaint_count': {0: 77, 1: 1, 2: 1, 3: 1, 4: 1, 5: 1, 6: 1, 7: 1, 8: 1, 9: 1}, 'first_complaint_on': {0: '2021-10-29', 1: '2021-11-12', 2: '2021-11-07', 3: '2021-11-23', 4: '2021-11-02', 5: '2021-11-15', 6: '2021-11-12', 7: '2021-11-23', 8: '2021-11-23', 9: '2021-11-16'}, 'last_complaint_on': {0: '2021-12-05', 1: '2021-11-12', 2: '2021-11-07', 3: '2021-11-23', 4: '2021-11-02', 5: '2021-11-15', 6: '2021-11-12', 7: '2021-11-23', 8: '2021-11-23', 9: '2021-11-16'}, 'subject_id': {0: [UUID('5321d243-d4a2-49e0-aa56-d269e613122e'), UUID('f4a91830-d1cd-423d-a1f6-175153a7b040'), UUID('4e9732b2-9b9a-4eaa-9bd6-5e72209b3274'), UUID('cc7027e7-ceb6-42aa-a588-c6b0dfa5bd5a'), UUID('8e340596-f1cf-4251-a61e-7b349b9979db'), UUID('6fb85e65-aa40-42fb-8f0a-2341ada3cc46'), UUID('3c858da6-5360-4fdd-8b5b-d246b1273b6e'), UUID('6618fa6a-0b82-4083-816a-e2e517a7af86'), UUID('e447569a-e070-4977-a6fd-c8c40046b1d6'), UUID('a70c2f71-94b3-4fd8-80b2-36b6e6229582'), UUID('d7a0214e-79ef-4997-8001-1b46f02af85f'), UUID('16687c59-c828-49a1-8a09-37466eab63aa'), UUID('87813a24-bdf3-4283-bb0b-236f152cd990'), UUID('ba6ca856-66b8-4d33-920d-a9e62c934866'), UUID('6eedd713-738b-4dce-8fa6-27667858a5c6'), UUID('92aae302-91b1-46c6-876c-5e580ffe249c'), UUID('c1949065-7f14-4b71-b2da-8c8e8328152f'), UUID('0c7d6553-0e19-41a5-b8c2-a1f69d27ee60'), UUID('f6f822a7-7e8c-423b-b7df-ffb3b73d6384'), UUID('ad058b09-c90c-4914-9fbd-f03f14135b20'), UUID('90a9d8df-fa66-4ea1-bc5b-e74f883e8474'), UUID('7b1d34aa-92b4-4bde-8939-213ab071910c'), UUID('51f313b0-05ee-4eb9-8e75-f9b7388a3f67'), UUID('d5a33682-4e34-40ee-8b57-2269a5e443f2'), UUID('f9fa8a64-c6e2-4aab-94c2-6558543d13f9'), UUID('9812c9cf-5c8d-4898-9253-00c9bb6cf1f4'), UUID('a252c20b-87db-44b7-994f-7dbea59f5ed5'), UUID('2d6972a9-dbc7-41b3-9063-0beecaf3fca4'), UUID('04e95d9d-8fba-4493-a503-921cf4709be0'), UUID('7f180dff-614c-422f-beb9-8c32148dceea'), UUID('61601ddf-7c74-4b58-8595-1aecdf897798'), UUID('eb3a1766-0f07-48ae-8857-a70988ee33cc'), UUID('e71d0b17-4783-4368-901c-f775dfb21959'), UUID('12b10576-253a-459a-a6a4-c9dfd07a76f9'), UUID('55343f62-f89e-4a5b-b7a4-ee3366c94ea0'), UUID('c28f9866-c5e5-4034-8237-abdd7cde2cda'), UUID('071314e7-e63b-4d0c-9c92-20bec9d5d1b5'), UUID('70932cb5-81e2-4ea2-ad2e-1b6a4f54312d'), UUID('c97fec02-5a1d-43e8-b715-6242eea64f5e'), UUID('9d1d89ba-741e-4797-b7c4-01aeb05a9eb0'), UUID('e32346bc-92a4-485b-8979-1400a68310c0'), UUID('2df2a49e-8a2c-4261-b7ea-624e8da43788'), UUID('b73d54dd-0ee5-4c9b-bf24-fcf3da0617f3'), UUID('25f57a95-b36d-49ab-b8b9-42a2ef915a80'), UUID('6f81c96f-8009-4192-8f00-d89c93848f50'), UUID('5b70e8ff-b80c-4ecc-ac63-d3d8a31c6244'), UUID('d1a4d512-ba1b-4361-9aa5-722a67a51f4d'), UUID('cf316e9c-1a7a-4337-b11e-4ba09cdde9a4'), UUID('7d1fb284-284d-4f8c-9e32-6bc5514b5bf3'), UUID('a041ba69-9571-480b-a68c-8c90d56d7666'), UUID('b1209765-f3ee-4823-a1a8-87999a8bd671'), UUID('21ac988d-0d84-49eb-81cf-db88277bd1a7'), UUID('dde67191-3e87-4d1b-a275-00e646ed9b13'), UUID('825f987d-e064-4343-a82d-b94b3e10cf58'), UUID('c076f72f-9835-4b3a-8e70-2659b9e07bc5'), UUID('f79b42aa-7818-4298-b9b4-8a120cdc11e3'), UUID('c430b06b-ddcd-4a6d-b691-1c6c8db272a6'), UUID('567e3013-a81c-4777-b101-280d55d8e54d'), UUID('93c0643b-777f-4a15-94a2-45cc8d9ec62f'), UUID('a807435f-96de-49d4-b1d8-bcac7a174d80'), UUID('565ea841-4f7b-43a4-a11d-863f260b081c'), UUID('b0f079a9-bfae-489a-a789-27216bafcda1'), UUID('afcbcda9-0439-482f-bac2-a61681998227'), UUID('ac62c8a3-5dfe-4fa6-a727-1ef05d227a91'), UUID('65a59510-37b5-4b80-b3fb-f487e925ebff'), UUID('7b4c7b6d-c7ef-492a-89c4-d9a8a2868428'), UUID('b9d27f11-bfc3-468a-86ba-7fb1bbf9d338'), UUID('53355721-a518-45dc-8717-e97cee6b6f6c'), UUID('d28c672a-8a6d-42b5-9849-19b45752ab39'), UUID('8056cd85-1044-4ec5-a8bf-02a88772b404'), UUID('63c511d0-1edb-4f84-bf53-eb75e5ceb9b8'), UUID('f48d2ac3-ff7a-4635-8e3c-c65623322b54'), UUID('b4e5213e-7dcb-4cff-8e30-34a84652c537'), UUID('3335d582-a068-421e-9085-a6cfbbaa7e5c'), UUID('5f440360-07dc-4ef1-ab63-bc452b07e9a8'), UUID('b0e4999c-f7a1-4293-93a9-c5323703f0a4'), UUID('9be20a6b-b185-412a-adad-387bc3f5ab48')], 1: [UUID('fc6421cf-b014-429c-a50e-1fcc25138844')], 2: [UUID('e97eb783-0821-4c97-a3a1-412d9976ace4')], 3: [UUID('220fdb90-61b3-479c-ba3a-3bdb74d3e71a')], 4: [UUID('92d0a2b4-669b-42dc-9d87-06bfb53e6ca6')], 5: [UUID('0c1b8c0c-8eaf-4918-ad87-78f2fa1bbc46')], 6: [UUID('fd3b92b2-80da-4fdc-acd5-9e171358673b')], 7: [UUID('dd0d1b79-6d21-40b3-bea4-77a162687978')], 8: [UUID('46b1ce56-f42a-4105-a1ac-ec50fc867be9')], 9: [UUID('10ebe98c-d0f6-45ca-aa6a-6a9a3bb260b6')]}}

Second table:

{'number': {787: '0000000000', 4391: '0000000000', 694: '0000000000', 1106: '0000000000', 4682: '0000000000', 11223: '0000000000', 773: '0000000000', 1555: '0000000000', 1377: '0000000000', 287: '0000000000'}, 'subject': {787: 'Other', 4391: 'Calls pretending to be government, businesses, or family and friends', 694: 'Warranties  & protection plans', 1106: 'Other', 4682: 'Dropped call or no message', 11223: 'No Subject Provided', 773: 'Warranties  & protection plans', 1555: 'Other', 1377: 'Medical  & prescriptions', 287: 'Other'}, 'subject_id': {787: UUID('5321d243-d4a2-49e0-aa56-d269e613122e'), 4391: UUID('f4a91830-d1cd-423d-a1f6-175153a7b040'), 694: UUID('4e9732b2-9b9a-4eaa-9bd6-5e72209b3274'), 1106: UUID('cc7027e7-ceb6-42aa-a588-c6b0dfa5bd5a'), 4682: UUID('8e340596-f1cf-4251-a61e-7b349b9979db'), 11223: UUID('6fb85e65-aa40-42fb-8f0a-2341ada3cc46'), 773: UUID('3c858da6-5360-4fdd-8b5b-d246b1273b6e'), 1555: UUID('6618fa6a-0b82-4083-816a-e2e517a7af86'), 1377: UUID('e447569a-e070-4977-a6fd-c8c40046b1d6'), 287: UUID('a70c2f71-94b3-4fd8-80b2-36b6e6229582')}}

>Solution :

Use concat with aggregate sum and GroupBy.last with GroupBy.agg:

df = (pd.concat([df1, df2])
        .groupby('number', sort=False, as_index=False)
        .agg({'total_complaint_count':'sum',
              'first_complaint_on':'last',
              'last_complaint_on':'last'}) )
print (df)
         number  total_complaint_count first_complaint_on last_complaint_on
0    0000000000                     80         2021-10-29        2021-15-05
1   00000000000                      2         2021-11-12        2021-16-12
2  000000000000                      2         2021-11-07        2021-16-07
3   00020056234                      1         2021-11-23        2021-11-23
4    0002266648                      1         2021-11-02        2021-11-02
5    0012684202                      1         2021-11-15        2021-11-15
6    0033774983                      1         2021-11-12        2021-11-12
7   00420056234                      1         2021-11-23        2021-11-23
8   00620056234                      1         2021-11-23        2021-11-23
9    0101276541                      1         2021-11-16        2021-11-16
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading