Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Python script to highlight data mismatch from predefined list

I have a predefined list of Service and CI in Excel:

Service CI
Non-Financial Risk Management South Africa Aravo
Non-Financial Risk Management South Africa Business Resilience
Non-Financial Risk Management South Africa Change Risk Management
First Line Control Attestation South Africa Control First
Group Audit Assurance South Africa DigiAud
Group Governance Advisory and Support South Africa Diligent Boardbooks

I get an extract of data from our call logging system which also has the Service and CI columns.
I need to highlight if the Service and CI in the call extract does not match the predefined list.

My code so far works for one Service in the predefined list, I need to figure out how to add the rest of the listed Services in my predefined list. IF I run it as is it works for the Service named Non-Financial Risk Management South Africa but highlights all the other Services in RED.

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

from pathlib import Path
import pandas as pd
import numpy as np
import os

extract = Path.cwd() / "/extract.xlsx"
df_extract = pd.read_excel(extract)


m = (df_extract['Service'] == 'Non-Financial Risk Management South Africa') & (df_extract['CI'].isin(['Aravo', 'Business Resilience', 'Change Risk Management']))


(df_extract.style.apply(lambda x: np.where(m, '', 'background-color: red'))
   .to_excel('/output.xlsx', index=False))

I tried adding a 2nd boolean mask but I cant figure out how to integrate with the np.where:

n = (df_extract['Service'] == 'First Line Control Attestation South Africa') & (df_extract['CI'].isin(['Control First']))

>Solution :

I think you need chain both masks by | for bitwise OR:

(df_extract.style.apply(lambda x: np.where(m | n, '', 'background-color: red'))
           .to_excel('/output.xlsx', index=False))

If there is more masks is possible create dictionary and pass to mask for more readable code:

#add more key:values if necessary
d = {'Non-Financial Risk Management South Africa':['Aravo', 'Business Resilience',
                                                   'Change Risk Management'],
     'First Line Control Attestation South Africa':['Control First']}

mask = np.logical_or.reduce([df_extract['Service'].eq(k) &  df_extract['CI'].isin(v)  
                             for k, v in d.items()])

(df_extract.style.apply(lambda x: np.where(mask, '', 'background-color: red'))
           .to_excel('/output.xlsx', index=False))
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading