Join two panda dataframe with duplicate value

I have two tables

masterblok:

BLOCKID PLANTINGDATE  PLANTED_HA
   A001    01-JAN-08       13.86
   A002    01-JAN-08       13.24
   A002    31-MAR-18        1.99
   A003    01-JAN-08       14.76
   A003    31-MAR-18        2.48

pest_perperiod: (note that there is FIELDCODE other than A002)

FIELDCODE      PERIOD
     A002  2019-01-01
     A002  2019-02-01
     A002  2019-03-01
     A002  2019-04-01
     A002  2019-05-01

I want to join two dataframe so that for each data in pest_perperiod will have one or many corresponding PLANTINGDATE (like cross join in SQL) so I can calculate retention rate since active month for each BLOCKID and PLANTINGDATE

i tried using: (vice versa)

pest_perperiod.join(masterblok.set_index('BLOCKID'), on='FIELDCODE')

returned error because duplicated value still exist, how to do this?

>Solution :

I think you just want merge

pest_perperiod.merge(masterblock, left_on='BLOCKID', right_on='FIELDCODE')

output:

  BLOCKID PLANTINGDATE  PLANTED_HA FIELDCODE      PERIOD
0    A002    01-JAN-08       13.24      A002  2019-01-01
1    A002    01-JAN-08       13.24      A002  2019-02-01
2    A002    01-JAN-08       13.24      A002  2019-03-01
3    A002    01-JAN-08       13.24      A002  2019-04-01
4    A002    01-JAN-08       13.24      A002  2019-05-01
5    A002    31-MAR-18        1.99      A002  2019-01-01
6    A002    31-MAR-18        1.99      A002  2019-02-01
7    A002    31-MAR-18        1.99      A002  2019-03-01
8    A002    31-MAR-18        1.99      A002  2019-04-01
9    A002    31-MAR-18        1.99      A002  2019-05-01

Leave a Reply