
Unit Test: Testing if a dataframe contains specific columns

I’m writing unit tests for some functions. In one of the tests, I would like to check whether the new columns were created, that is, whether certain column names are present in the output dataframe df_output returned by one of the functions. I have a list, List_Match, containing the names of the expected new columns. How can I do that as a unit test?

A simplified example of my data: 
import pandas as pd

d = {'ID_EMPLOYEE': [12, 35, 56, 46], 'Number': [0, 1, 2, 30], 'Location_EMPLOYEE': ["US", "US", "Austria", "France"], 'Salary': [100, 200, 100, 160]}
df_output = pd.DataFrame(d)

List_Match = ["Location_EMPLOYEE", "ID_EMPLOYEE"]

Solution:

Try this,

assert all([col in df_output.columns for col in List_Match])

Alternative Solution without loop:

assert set(List_Match).issubset(df_output.columns)

Explanation:

  1. Check whether each expected column name is in df_output.columns.
  2. Pass the results to all() to verify that every column is present.
  3. Use assert so that the test fails if any column is missing.
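
The steps above could be wrapped in a test function along the following lines. This is only a sketch: build_output is a hypothetical stand-in for the function under test (here it just returns the sample data from the question), and missing_columns is a helper name made up for illustration. Collecting the missing names first means a failing test reports which columns are absent.

```python
import pandas as pd


def build_output():
    # Hypothetical stand-in for the function under test;
    # it simply returns the sample data from the question.
    d = {
        'ID_EMPLOYEE': [12, 35, 56, 46],
        'Number': [0, 1, 2, 30],
        'Location_EMPLOYEE': ["US", "US", "Austria", "France"],
        'Salary': [100, 200, 100, 160],
    }
    return pd.DataFrame(d)


def missing_columns(df, expected):
    # Illustrative helper: expected names that are absent from df, in order.
    return [col for col in expected if col not in df.columns]


def test_expected_columns_present():
    df_output = build_output()
    List_Match = ["Location_EMPLOYEE", "ID_EMPLOYEE"]
    missing = missing_columns(df_output, List_Match)
    # The message lists the absent columns when the assertion fails.
    assert missing == [], f"Missing columns: {missing}"
```

If the file is saved with a name such as test_columns.py (an assumed name), running pytest in that directory should pick up the test_ function automatically.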