Converting a column of strings with different formats that contains year from 2 digits (YY) to 4 digits (YYYY) in python pandas

I have a DataFrame with the following column. The rows contain strings in different formats.

col                  |
----------------------
GRA/B
TPP
BBMY
...
SOCBBA 0 MAX
CMBD 0 MAX
EPR 5.75 MAX
...
PMUST 5.57643 02/15/34
LEO 0 12/30/2099
RGB 3.125 09/15/14
RGB 3.375 04/15/20

I want to convert all the dates to a format that shows the full four-digit year.

Is there a way to do this with a regex so that the column looks like this?

col                  |
----------------------
GRA/B
TPP
BBMY
...
SOCBBA 0 MAX
CMBD 0 MAX
EPR 5.75 MAX
...
PMUST 5.57643 02/15/2034
LEO 0 12/30/2099
RGB 3.125 09/15/2014
RGB 3.375 04/15/2020

Right now the only thing I can think of is running

df['col'] = df['col'].str.replace('/14', '/2014')

for each year, but there are many years, and it would also replace any matching days and months.
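
To make the problem concrete, here is a minimal sketch using a made-up value in which the day is also 14 (this row is hypothetical and does not appear in the data above):

```python
import pandas as pd

# Hypothetical row where the day and the two-digit year are both "14".
s = pd.Series(["RGB 3.125 09/14/14"])

# A literal (non-regex) replacement rewrites every "/14", not just the year.
naive = s.str.replace("/14", "/2014", regex=False)
print(naive[0])  # RGB 3.125 09/2014/2014
```

The day gets expanded along with the year, which is why the replacement needs to be tied to the position of the year in the string.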

How can I achieve this properly? Should I be using a regex?

>Solution :

What about replacing only when the string ends with a slash followed by exactly 2 digits?

In [9]: df["col"] = df["col"].str.replace(r"/(\d{2})$", r"/20\1", regex=True)

In [10]: df
Out[10]:
                         col
0                      GRA/B
1                        TPP
2                       BBMY
3                        ...
4               SOCBBA 0 MAX
5                 CMBD 0 MAX
6               EPR 5.75 MAX
7                        ...
8   PMUST 5.57643 02/15/2034
9           LEO 0 12/30/2099
10      RGB 3.125 09/15/2014
11      RGB 3.375 04/15/2020

regex:

  • /: a literal forward slash
  • (\d{2}): capture 2 digits
  • $: end of string

replacement:

  • /20: a literal forward slash and 20 (this assumes every 2-digit year belongs to the 2000s)
  • \1: first capturing group in the regex, i.e., the last 2 digits there
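
For reference, the same replacement as a self-contained sketch on a few of the rows from the question. The second half is an optional variant, not part of the answer above: if some 2-digit years could belong to the 1900s, a callable can be passed as the replacement. The `PIVOT` cutoff, the `expand_year` helper, and the `01/01/85` row are assumptions made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({"col": [
    "GRA/B",
    "EPR 5.75 MAX",
    "PMUST 5.57643 02/15/34",
    "LEO 0 12/30/2099",
    "RGB 3.125 09/15/14",
]})

# Slash followed by exactly two digits at the end of the string.
# "12/30/2099" is left alone because four digits follow its last slash.
fixed = df["col"].str.replace(r"/(\d{2})$", r"/20\1", regex=True)
print(fixed.tolist())

# Optional variant (assumption): choose the century with a cutoff.
PIVOT = 50  # hypothetical: 00-49 become 20xx, 50-99 become 19xx

def expand_year(match):
    """Return the matched '/YY' suffix rewritten as '/YYYY'."""
    yy = match.group(1)
    century = "20" if int(yy) < PIVOT else "19"
    return f"/{century}{yy}"

sample = pd.Series(["X 1 01/01/85", "RGB 3.125 09/15/14"])  # first row is made up
with_pivot = sample.str.replace(r"/(\d{2})$", expand_year, regex=True)
print(with_pivot.tolist())
```

Note that `$` anchors the match to the end of the string, so a value with trailing spaces after the date would not be changed; stripping the column first with `.str.strip()` is one way to handle that if it occurs in the data.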