Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Remove part of a string from pd.to_datetime() unconverted values

I tried to convert a column of dates to datetime using pd.to_datetime(df, format='%Y-%m-%d_%H-%M-%S') but I received the error ValueError: unconverted data remains: .1

I ran:

data.loc[pd.to_datetime(data.date, format='%Y-%m-%d_%H-%M-%S', errors='coerce').isnull(), 'date']

to identify the problem. 119/1037808 dates in the date column have an extra ".1" at the end of them. Other than the ".1", the dates are fine. How can I remove the ".1" from the end of those dates only and then convert the column values to datetime?

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

Here is an example dataframe that recreates the issue:

import pandas as pd

data = pd.DataFrame({"date" : ["2022-01-15_08-11-00.1","2022-01-15_08-11-30","2022-01-15_08-12-00.1", "2022-01-15_08-12-30"], 
                     "value" : [1,2,3,4]})

I have tried:

data.date = data.date.replace(".1", "")

and

data = data.replace(".1", "")

but these did not remove the ".1". The final result should look like this:

data = pd.DataFrame({"date" : ["2022-01-15_08-11-00","2022-01-15_08-11-30","2022-01-15_08-12-00", "2022-01-15_08-12-30"], 
                         "value" : [1,2,3,4]})

>Solution :

You can use pandas.Series.replace to get rid of the extra dot/number :

data["date"]= pd.to_datetime(data["date"].replace(r"\.\d+", "",
                                                  regex=True),
                             format="%Y-%m-%d_%H-%M-%S")

# Output :

print(data)
print(data.dtypes)

                 date  value
0 2022-01-15 08:11:00      1
1 2022-01-15 08:11:30      2
2 2022-01-15 08:12:00      3
3 2022-01-15 08:12:30      4
date     datetime64[ns]
value             int64
dtype: object

If you don’t want a datetime format, use just data["date"].replace(r"\.\d+", "", regex=True)

Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading