I’m sorry if this is a simple question, but I have a CSV file with times in the following format: hh:mm:ss
An extract of the file looks like this:
column_name
00:00:00
01:00:00
02:00:00
03:00:00
...
23:00:00
00:00:00
I have the following regex to match all of those times:
[0-9]{2}[:][0-9]{2}[:][0-9]{2}
My question is: how do I get rid of the last colon and the seconds (:ss), essentially changing the format from hh:mm:ss to hh:mm, in a Python script?
I managed to change all the - characters to / by using this line of code:
df['column_name'] = df['column_name'].str.replace('-', '/')
I tried using this line:
df['column_name'] = [re.sub(r'[0-9]{2}[:][0-9]{2}[:][0-9]{2}', r'[0-9]{2}[:][0-9]{2}', str(x)) for x in df['column_name']]
But this changed all the times to the literal text [0-9]{2}[:][0-9]{2}
I also tried just using slicing such as [:-3], but I could not get it to work:
df['column_name'] = [re.sub(r'[0-9]{2}[:][0-9]{2}[:][0-9]{2}', [:-3], str(x)) for x in df['column_name']]
Any help would be much appreciated. Thank you.
>Solution :
You can slice each string with the .str accessor:
df['column_name'] = df['column_name'].str[:-3]
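As a quick check, here is a minimal sketch of that slice on a small frame. The sample values are hypothetical stand-ins for the CSV column, and it assumes every value ends with exactly three characters of the form :ss.

```python
import pandas as pd

# Hypothetical sample data standing in for the CSV column from the question.
df = pd.DataFrame({"column_name": ["00:00:00", "01:00:00", "23:00:00"]})

# .str[:-3] slices each string element-wise, dropping the trailing ":ss".
df["column_name"] = df["column_name"].str[:-3]

print(df["column_name"].tolist())  # ['00:00', '01:00', '23:00']
```

Because the slice is positional, a value that does not end in :ss would still lose its last three characters, so this fits best when the column format is known to be uniform.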
Or split on the last colon and keep the part before it:
df['column_name'] = df['column_name'].str.rsplit(':', n=1).str[0]
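As for why the re.sub attempt in the question produced the pattern text: the second argument of re.sub is a replacement string, not a pattern, so it is inserted as written. If a regex is still preferred, one possible sketch is to capture the hh:mm part in a group and refer back to it with \1. The names pattern, drop_seconds, and times below are hypothetical, and the values are assumed to look like hh:mm:ss.

```python
import re

# Group 1 captures "hh:mm"; the trailing ":ss" is matched but not captured.
pattern = re.compile(r"([0-9]{2}:[0-9]{2}):[0-9]{2}")

def drop_seconds(value):
    # \1 in the replacement string refers back to the first capture group.
    return pattern.sub(r"\1", str(value))

# Hypothetical sample values in the question's hh:mm:ss format.
times = ["00:00:00", "01:00:00", "23:00:00"]
trimmed = [drop_seconds(t) for t in times]

print(trimmed)  # ['00:00', '01:00', '23:00']
```

With a DataFrame, the same function could be applied with df['column_name'].map(drop_seconds), although the plain slice or rsplit above is simpler when the format is fixed.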