I have an xlsx file and the same data in CSV format. I read them with pandas.read_excel and pandas.read_csv respectively and store the results in two DataFrames. If I print the stored values, I see no difference, but when I sort and then compare, the two results differ. Can someone help?
df = fields_df.sort_values(['register',1], ascending=[False, False])
df1 = fields_df1.sort_values(['register',1], ascending=[False, False])
fields_df and fields_df1 are the two DataFrames read from the xlsx and CSV files respectively.
Before sorting, fields_df and fields_df1 compare as exactly the same. After sorting, however, I see differences in the output.
>Solution :
You can investigate the differences between the two DataFrames with:
out = fields_df.compare(fields_df1)
print (out)
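Then inspect the rows reported in the out DataFrame, e.g. check whether the columns have different types, whether string columns contain trailing spaces, or whether some columns were parsed as datetimes only in the Excel DataFrame. Comparing the dtypes is a quick first check: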
print (fields_df.dtypes)
print (fields_df1.dtypes)
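If the dtypes do turn out to differ, one possible way forward is to normalize both frames before sorting. The following is only a minimal sketch under stated assumptions: the sample data is made up (integers from the Excel read, strings with a trailing space from the CSV read), and normalize is a hypothetical helper, not part of pandas.

```python
import pandas as pd
from pandas.api.types import is_object_dtype, is_string_dtype

# Hypothetical stand-ins for the two frames: the Excel read yields integers,
# while the CSV read yields strings, one with a stray trailing space.
fields_df = pd.DataFrame({"register": ["r0", "r0", "r1"], 1: [2, 10, 9]})
fields_df1 = pd.DataFrame({"register": ["r0", "r0", "r1"], 1: ["2", "10 ", "9"]})

def normalize(frame):
    """Hypothetical helper: strip string columns and convert numeric-looking ones."""
    out = frame.copy()
    for col in out.columns:
        if is_object_dtype(out[col]) or is_string_dtype(out[col]):
            stripped = out[col].str.strip()
            converted = pd.to_numeric(stripped, errors="coerce")
            # Keep the numeric version only if every value converted cleanly.
            out[col] = converted if converted.notna().all() else stripped
    return out

sort_args = dict(by=["register", 1], ascending=[False, False])

# Sort the normalized frames and reset the index so the row labels line up.
df = normalize(fields_df).sort_values(**sort_args).reset_index(drop=True)
df1 = normalize(fields_df1).sort_values(**sort_args).reset_index(drop=True)

# An empty result means no differences remain after normalization.
print(df.compare(df1))
```

Without the normalization step, the string column sorts lexicographically ("2" comes after "10 "), so the two sorted frames end up in a different row order even though the printed values look alike. The reset_index(drop=True) call is there because DataFrame.compare expects both frames to carry identical row and column labels.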