Check if two columns have matching values when the values are not at the same index positions (Python, Pandas)

I have a data frame of Super Store sales. The workbook has 2 sheets:

  • The first is named "Orders"
  • The second is named "Returns"

Both sheets have a matching column called "Order ID", but the "Returns" sheet has fewer rows because it only lists returned purchases. I want to add a new column that checks whether each Order ID in the "Orders" sheet also appears in the "Returns" sheet. If it does, the value should be "Returned"; if it does not, "Not returned".

This is the df_order data frame:
[image: orders]
This is df_return:
[image: returns]

This is how I thought it should be checked, but it is definitely not correct because every row says "Not returned", even though I've checked manually and seen that some orders match. Please help me out.
import numpy as np
import pandas as pd

excel_path = r'C:\Users\Korisnik\Desktop\PythonFiles\Omega\SuperStoreUS.xlsx'
# sheet_name=None loads every sheet into a dict keyed by sheet name
df = pd.read_excel(excel_path, sheet_name=None)

# 1.
df_order = df.get('Orders')
df_returns = df.get('Returns')
df_users = df.get('Users')

# The attempt that labels every row "Not returned"
# (df_n is not defined in the snippet as posted)
df_n.reset_index(drop=True)
df_returns.reset_index(drop=True)
df_n['Status'] = np.where( df_n['Order ID'].equals(df_returns['Order ID'])  and df_returns["Status"] == "Returned", "Returned", "Not returned")

# Sample data
df_order = {'City': ['Prior Lake', 'Chicago', 'NY', 'Prior Lake', 'Round Rock'],
            'Order ID': [86838, 90154, 15000, 10000, 12447]}
df_return = {'Order ID': [90154, 86838],
             'Returned': ['Returned', 'Returned']}

# Create DataFrame from dict
df_orders = pd.DataFrame.from_dict(df_order)
df_returns = pd.DataFrame.from_dict(df_return)

>Solution:
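
Before the fix, it may help to see why the attempt in the question labels every row "Not returned". The sketch below is an illustration on made-up frames (the names `orders`, `returns`, `whole_match`, `cond` and `label` are assumptions, not from the question). `Series.equals` compares the two Series as a whole and returns a single boolean, so `np.where` never gets a per-row condition:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for the two sheets
orders = pd.DataFrame({"Order ID": [86838, 90154, 15000, 10000, 12447]})
returns = pd.DataFrame({"Order ID": [90154, 86838],
                        "Status": ["Returned", "Returned"]})

# One bool for the whole comparison: the Series differ in length and order
whole_match = orders["Order ID"].equals(returns["Order ID"])
print(whole_match)

# `and` short-circuits on a falsy left operand, so the right side is ignored
cond = whole_match and (returns["Status"] == "Returned")

# A scalar condition yields a single value, not one value per row
label = np.where(cond, "Returned", "Not returned").item()
print(label)
```

Because the condition is one scalar rather than a boolean per order, the same label ends up on every row, which matches what the question describes.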

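You can use pandas.DataFrame.merge with pandas.Series.fillna. A left merge on "Order ID" keeps every row of Orders and brings in the "Status" column from Returns wherever the IDs match, regardless of row position. Orders with no match get NaN, which fillna then replaces. First read both sheets: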

df_order = pd.read_excel("SuperStoreUS.xlsx", sheet_name="Orders")
df_return = pd.read_excel("SuperStoreUS.xlsx", sheet_name="Returns")

Then use either:

# --- To create a new dataframe
out = df_order.merge(df_return, on="Order ID", how="left")
out["Status"] = out["Status"].fillna("Not Returned")

Or:

# --- To update df_order
df_order = df_order.merge(df_return, on="Order ID", how="left")
df_order["Status"] = df_order["Status"].fillna("Not Returned")
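
As an alternative to the merge above (not part of the original answer), a membership test with `Series.isin` gives one boolean per order, which is what `np.where` needs. This is a minimal sketch on the sample data from the question, assuming every Order ID listed in the Returns sheet counts as returned; `is_returned` is a name introduced here for illustration:

```python
import numpy as np
import pandas as pd

# Sample data from the question
df_orders = pd.DataFrame({
    "City": ["Prior Lake", "Chicago", "NY", "Prior Lake", "Round Rock"],
    "Order ID": [86838, 90154, 15000, 10000, 12447],
})
df_returns = pd.DataFrame({
    "Order ID": [90154, 86838],
    "Returned": ["Returned", "Returned"],
})

# True where the order's ID appears anywhere in the Returns sheet,
# independent of row position
is_returned = df_orders["Order ID"].isin(df_returns["Order ID"])

# One label per row, chosen by the boolean mask
df_orders["Status"] = np.where(is_returned, "Returned", "Not returned")
print(df_orders)
```

One design difference: a left merge repeats an order row when its Order ID occurs more than once in Returns, whereas `isin` always leaves the number of order rows unchanged.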