Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Find columns negative value in another column – dataframe

I have this code:

test = {"number": ['1555','1666','1777', '1888', '1999'],
        "order_amount": ['100.00','200.00','-200.00', '300.00', '-150.00'],
        "number_of_refund": ['','','1666', '', '1888']
    }

df = pd.DataFrame(test)

Which returns the following dataframe:

  number order_amount number_of_refund
0   1555       100.00                 
1   1666       200.00                 
2   1777      -200.00             1666
3   1888       300.00                 
4   1999      -150.00             1888    

I want to remove order and order refund entries if:

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

  • "number_of_refund" matches a number from "number" column (there might not be a number of order in the dataframe if order was made last month and refund during the current month)
  • amount of "number_of_refund" (which was matched to "number") has a negative amount of "number" amount (in this case number 1666 has 200, and refund of 1666 has -200 so both rows should be removed)

So the result in this case should be:

number order_amount number_of_refund
0   1555       100.00                 
3   1888       300.00                 
4   1999      -150.00           1888                            

How do I check if amount of one column’s value is in another column but with opposite amount (negative)?

>Solution :

IIUC, you can use a boolean indexing approach:

# ensure numeric values
df['order_amount'] = pd.to_numeric(df['order_amount'], errors='coerce')

# is the row a refund?
m1 = df['number_of_refund'].ne('')
# get mapping of refunds
s = df[m1].set_index('number_of_refund')['order_amount']

# get reimbursements and find which ones will equal the original value
reimb = df['number'].map(s)
m2 = reimb.eq(-df['order_amount'])
m3 = df['number_of_refund'].isin(df.loc[m2, 'number'])

# keep rows that do not match any m2 or m3 mask
df = df[~(m2|m3)]

output:

  number  order_amount number_of_refund
0   1555         100.0                 
3   1888         300.0                 
4   1999        -150.0             1888
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading