I have the following two dataframes:
prod_id land_ids
0 1 [1,2]
1 2 [1]
2 3 [2,3,4]
3 4 [1,3]
4 5 [3,4]
land_id land_desc
0 1 germany
1 2 austria
2 3 switzerland
3 4 italy
Bascially, I want all numbers in column land_ids to individually join the other df.
The result should look something like this:
prod_id land_ids list_land
0 1 [1,2] germany austria
1 2 [1] germany
2 3 [2,3,4] austria switzerland italy
3 4 [1,3] germany switzerland
4 5 [3,4] switzerland italy
Preferrably, the column list_land is one string where the lands are concatenated. But I would also be fine with getting a list as a result.
Any idea on how to do this?
Here is my code for creating the df:
data_prod = {'prod_id': [1,2,3,4,5], 'land_ids': [[1,2],[1],[2,3,4],[1,3],[3,4]]}
prod_df = pd.DataFrame(data_prod)
data_land = {'land_id': [1,2,3,4], 'land_desc': ['germany', 'austria', 'switzerland', 'italy']}
land_df = pd.DataFrame(data_land)
>Solution :
you can use the apply
method:
prod_df['list_land'] = prod_df['land_ids'].apply(lambda x: [land_df.loc[land_df['land_id'] == y]['land_ids'].values[0] for y in x])
In this case, the list_land
column is a list. You can use the following code if you want it to be a string.
prod_df['list_land'] = prod_df['land_ids'].apply(lambda x: ' '.joind([land_df.loc[land_df['land_id'] == y]['land_ids'].values[0] for y in x]))