Home Update pandas cell in one dataframe from looked up value in second dataframe

Questions

Update pandas cell in one dataframe from looked up value in second dataframe

October 24, 2022

Update pandas cell in one Dataframe from looked up value in second Dataframe

I have a case where I need to update a cell in one Dataframe, ‘Stock‘, which holds records of stock on-hand, looking up its value in a second Dataframe, ‘Items‘, which is the table of all items. This is a simplified example of the Dataframes with relevant fields.

Stock

    item_no qty
0   9H.111  101
1   9H.222  230
2   MODEL_B 136
3   9H.444  344
4   MODEL_E 505
5   9H.666  332

Items

    item_no model_no
0   9H.111  MODEL_A
1   9H.222  MODEL_B
2   9H.333  MODEL_B
3   9H.444  MODEL_C
4   9H.555  MODEL_D
5   9H.666  MODEL_E
6   9H.777  MODEL_D
7   9H.888  MODEL_F

The challenge

I have previously done this in PostgreSQL but would like to see if I can do all the processing in Pandas (I plan to link to the PostgreSQL table of items). If we look at the Stock table the item_no column should only have item numbers, see Items Dataframe (table), but sometimes the users put in the model_no instead of the item number. So in the Stock dataframe, row 2 incorrectly has the value MODEL_B.

What’s needed

What is needed to be done is to:

get the value MODEL_B from the item_no column in the Stock dataframe
find that in the model_no column of the Items dataframe
then get the value from the item_no field of the Items dataframe
use that value to replace the (incorrect) model number value in the item_no column of the Stock dataframe

It gets a little more challenging… a model may have more than one part number:

1   9H.222  MODEL_B
2   9H.333  MODEL_B

In this case the ‘highest’ part number, in this case 9H.333, needs to be used. In SQL I use the MAX() operator.

I would like to perform this using ‘set’ operations in pandas (not looping), similar to running a query in SQL. So this would mean (?) joining the two dataframes on the fields stock.item_no <-> items.model_no (?) – I’m not sure how to go about it hence the question marks.

Generate Dataframes

This code will generate the dataframes discussed above.

stock = pd.DataFrame({
    'item_no': ['9H.111', '9H.222', 'MODEL_B', '9H.444', 'MODEL_E', '9H.666'],
    'qty': [101, 230, 136, 344, 505, 332],
})
items = pd.DataFrame({
    'item_no': ['9H.111', '9H.222', '9H.333', '9H.444', '9H.555', '9H.666', '9H.777', '9H.888'],
    'model_no': ['MODEL_A', 'MODEL_B', 'MODEL_B', 'MODEL_C', 'MODEL_D', 'MODEL_E', 'MODEL_D', 'MODEL_F']
})

display(stock)
display(items)

>Solution :

You can use:

# keep max item_no per model_no
# convert to mapping Series
mapper = (items.sort_values(by='item_no')
               .drop_duplicates(subset='model_no', keep='last')
               .set_index('model_no')['item_no']
         )

# identify rows with model_no
m = stock['item_no'].isin(mapper.index)

# replace the values in place
stock.loc[m, 'item_no'] = stock.loc[m, 'item_no'].map(mapper)

updated stock:

  item_no  qty
0  9H.111  101
1  9H.222  230
2  9H.333  136
3  9H.444  344
4  9H.666  505
5  9H.666  332

dataframe

byMR

Published October 24, 2022

Add a comment

std::accumulate won't' peform sum of elements with std::plus functor

byMR

October 24, 2022

Questions

Getting OutOfMemoryError from Nifi encryptconfig command

byMR

October 24, 2022

Questions

overload the overloadedFunc() function

byMR

October 24, 2022

Questions

OpenTelemetry & correlation support for Azure Functions

byMR

October 24, 2022

Questions

C: realloc(): invalid next size for an int array

byMR

October 24, 2022

Questions

Is there an alternative to foreach in Js?

byMR

October 24, 2022

Update pandas cell in one dataframe from looked up value in second dataframe

Update pandas cell in one Dataframe from looked up value in second Dataframe

Stock

Items

The challenge

What’s needed

MEDevel.com: Open-source for Healthcare and Education

Generate Dataframes

>Solution :

Like this:

Leave a ReplyCancel reply

Read more

std::accumulate won't' peform sum of elements with std::plus functor

Getting OutOfMemoryError from Nifi encryptconfig command

overload the overloadedFunc() function

OpenTelemetry & correlation support for Azure Functions

C: realloc(): invalid next size for an int array

Is there an alternative to foreach in Js?

Keep Up to Date with the Most Important News

Update pandas cell in one dataframe from looked up value in second dataframe

Update pandas cell in one Dataframe from looked up value in second Dataframe

Stock

Items

The challenge

What’s needed

MEDevel.com: Open-source for Healthcare and Education

Generate Dataframes

>Solution :

Share this:

Like this:

Leave a ReplyCancel reply

Keep Up to Date with the Most Important News

Read more

std::accumulate won't' peform sum of elements with std::plus functor

Getting OutOfMemoryError from Nifi encryptconfig command

overload the overloadedFunc() function

OpenTelemetry & correlation support for Azure Functions

C: realloc(): invalid next size for an int array

Is there an alternative to foreach in Js?

Discover more from Dev solutions