Two Dataframes have same values and dtypes but are still not equal under df1.equals(df2)

I am testing new functionality for a project I'm working on, and I can't get the test to pass. The code itself behaves as intended, and as far as I can tell the actual and expected outputs have identical values and dtypes. What is going on here?

Sorry for the amount of code, but this is as minimal as I think I can make it. It should run as-is in a Jupyter notebook cell or a .py file.

import pandas as pd
import numpy as np
from datetime import date as dtm_date

def days_since_dec_30_1899(input_date:pd._libs.tslibs.timestamps.Timestamp) -> str:
    date_list = [int(i) for i in str(input_date).split(' ')[0].split('-')]
    return str((dtm_date(date_list[0], date_list[1], date_list[2]) - 
                dtm_date(1899, 12, 30)).days)

def create_combo_npi_and_date_col(data:pd.core.frame.DataFrame,
                                  date_name:str = 'Date'
) -> pd.core.frame.DataFrame:
    temp = data.copy()
    days_since_1899 = temp[date_name].apply(lambda date: str(days_since_dec_30_1899(date)))
    
    # deals with float->str issue of having .0 at the end
    if str(temp['NPI'].dtype) == 'float64':
        temp['NPI'] = temp['NPI'].astype('Int64')
    
    temp['Combo (NPI & Date)'] = temp['NPI'].astype('str') + days_since_1899
    temp['Combo (NPI & Date)'] = temp['Combo (NPI & Date)']\
                                 .apply(lambda x : x if '<NA>' not in x else np.nan)
    temp['NPI'] = temp['NPI'].astype('str')
    
    return temp


import unittest
from datetime import date

class Test_Methods(unittest.TestCase):
    '''A test class for the methods in Main.py'''
    def test_create_combo_npi_and_date_col(self):
        '''Tests the create_combo_npi_and_date_col method.'''
        # Test input.
        input = pd.DataFrame([[pd._libs.tslibs.timestamps.Timestamp('2022-08-11 00:00:00'), '1234567890'],
                                   [pd._libs.tslibs.timestamps.Timestamp('2022-07-14 00:00:00'), '0987654321']],
                                  columns=['Date', 'NPI'])

        # Test ground truth output.
        output = \
        pd.DataFrame([[pd._libs.tslibs.timestamps.Timestamp('2022-08-11 00:00:00'), '1234567890', '123456789044784'],
                      [pd._libs.tslibs.timestamps.Timestamp('2022-07-14 00:00:00'), '0987654321', '098765432144756']],
                      columns=['Date', 'NPI', 'Combo (NPI & Date)'])
        
        print(type(str(output.NPI.dtype)))
        print()
        print(output)
        print()
        print(create_combo_npi_and_date_col(input))
        print()
        print(output.compare(create_combo_npi_and_date_col(input)))
        print()
        print(output.dtypes)
        print(create_combo_npi_and_date_col(input).dtypes)
        self.assertTrue(output.equals(create_combo_npi_and_date_col(input)), 'test 1 has failed') # test 1

        # tests if floats act properly

        input['NPI'] = input['NPI'].astype('float64')

        self.assertTrue(output.equals(create_combo_npi_and_date_col(input)), 'test 2 has failed') # test 2


unittest.main(argv=[''], verbosity=2, exit=False)

Some of the docstring content is private, so I am sorry there are no docs. Can someone tell me why my tests fail?

Edit: added the import statements.

Solution:

Use the assert_frame_equal function from pandas.testing instead of df1.equals(df2).

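Where equals only returns True or False, assert_frame_equal raises an AssertionError that tells you exactly what is different. Sometimes the difference lies in attributes that are not obvious from the printed output, such as a column's dtype or the index type.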
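To see the kind of feedback this gives, here is a minimal sketch. The frames `left` and `right` and the column name `value` are made up for illustration and are not taken from the question; they hold numerically equal values with different dtypes.

```python
import pandas as pd
from pandas.testing import assert_frame_equal

# Hypothetical frames: numerically equal values, different dtypes.
left = pd.DataFrame({"value": [1, 2]})        # int64 column
right = pd.DataFrame({"value": [1.0, 2.0]})   # float64 column

# equals() only reports that something differs, not what.
frames_equal = left.equals(right)
print(frames_equal)

# assert_frame_equal raises an AssertionError whose message names the difference.
try:
    assert_frame_equal(left, right)
    message = ""
except AssertionError as err:
    message = str(err)
print(message)
```

The printed message should point at the dtype of the `value` column, which is the sort of detail that `equals` hides.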

from pandas.testing import assert_frame_equal

# ... all of your other code before the assert ... #

    assert_frame_equal(output,create_combo_npi_and_date_col(input))
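One difference can be confirmed directly from the code in the question: test 2 casts the NPI strings to float64, and the function then converts them to Int64 and back to str. That round trip cannot preserve the leading zero of '0987654321'. The sketch below reproduces only that conversion chain; the `zfill(10)` repair is an assumption that every identifier is exactly 10 digits wide, as in the test data, and is not part of the original code.

```python
import pandas as pd

# Same NPI strings as in the question's test input.
npi = pd.Series(["1234567890", "0987654321"], name="NPI")

# Conversion chain from test 2 plus the function: str -> float64 -> Int64 -> str.
round_trip = npi.astype("float64").astype("Int64").astype("str")
print(round_trip.tolist())  # ['1234567890', '987654321']

# Assumed repair: pad back to a fixed width of 10 digits (hypothetical width).
padded = round_trip.str.zfill(10)
print(padded.tolist())  # ['1234567890', '0987654321']
```

If the identifiers must keep leading zeros, it may be simpler to keep the NPI column as strings from the start and never pass it through a numeric dtype.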