Check if my whole-year dataset contains all months and days

I have one year of data and I would like to check that it contains observations for every day of every month, basically to validate that everything has been collected.
The dataset contains day, month and year columns. My idea was to plot these and see whether all days of each month are there. I have tried the following:

fig, ax = plt.subplots()
ax.plot(earth2019['month'], earth2019['day'])

plt.show()

but the chart doesn’t really confirm what I wanted to know.

My question is: how can I validate that my data contains all the observations? It should have at least one observation for each day of each month, so I want to know whether any days are missing from the dataset.
Is there a way to check this using Python code?

Solution:

Without a sample of the data it is hard to be specific, but you can try:

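import pandas as pd

# Every calendar day of 2019 (both endpoints inclusive)
ref19 = pd.date_range('2019-01-01', '2019-12-31', freq='D')
# Build one datetime per row from the year/month/day columns
dti19 = pd.to_datetime(earth2019.assign(year=2019)[['year', 'month', 'day']])

out = ref19.difference(dti19)  # dates in 2019 with no observation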

Sample output:

>>> out
DatetimeIndex(['2019-02-20', '2019-04-02', '2019-04-13', '2019-04-26',
               '2019-05-08', '2019-07-19', '2019-09-21', '2019-10-09',
               '2019-10-11', '2019-12-22'],
              dtype='datetime64[ns]', freq=None)
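
To try the idea without the real data, here is a minimal, self-contained sketch. It assumes the dataset has integer columns named year, month and day, as described in the question. The helper name find_missing_dates and the synthetic frame demo are made up for illustration, and demo is only a stand-in for earth2019. The last step counts missing days per month, which may be a more readable summary than a plot.

```python
import pandas as pd

def find_missing_dates(df, year):
    """Return the calendar days of `year` that have no row in `df`.

    Hypothetical helper; assumes `df` has integer columns
    named 'year', 'month' and 'day'.
    """
    expected = pd.date_range(f"{year}-01-01", f"{year}-12-31", freq="D")
    observed = pd.to_datetime(df[["year", "month", "day"]])
    return expected.difference(pd.DatetimeIndex(observed))

# Synthetic stand-in for earth2019: every day of 2019 except two dates.
all_days = pd.date_range("2019-01-01", "2019-12-31", freq="D")
kept = all_days.drop(pd.to_datetime(["2019-02-20", "2019-04-02"]))
demo = pd.DataFrame({"year": kept.year, "month": kept.month, "day": kept.day})

missing = find_missing_dates(demo, 2019)
print(missing)

# Number of missing days per month (months without gaps do not appear).
missing_per_month = missing.to_series().dt.month.value_counts().sort_index()
print(missing_per_month)
```

An empty result from find_missing_dates means every day of the year has at least one observation. Note that pd.to_datetime raises an error if a row holds an impossible date such as 30 February, so that kind of bad data would surface here as well.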