I have this pandas data frame which has 65 columns Full month:
ID Name Date 2022-12-1-IN 2022-12-1-OUT 2022-12-2-IN 2022-12-2-OUT ... 2022-12-31-IN 2022-12-31-OUT
n_cols = df.shape[1]
# Create a list of the new column names
new_col_names = []
for i in range(3, n_cols):
if (i - 3) % 2 == 0:
new_col_names.append('2022-12-0' + str(((i - 4) // 2) + 1) + '-IN')
else:
new_col_names.append('2022-12-0' + str(((i - 4) // 2) + 1) + '-OUT')
df.columns = new_col_names
My expected results:
ID Name Date 2022-12-01-IN 2022-12-01-OUT 2022-12-02-IN 2022-12-02-OUT ... 2022-12-31-IN 2022-12-31-OUT
lif new_len != old_len:
---> 70 raise ValueError(
71 f"Length mismatch: Expected axis has {old_len} elements, new "
72 f"values have {new_len} elements"
73 )
ValueError: Length mismatch: Expected axis has 65 elements, new values have 62 elements
What I’m doing wrong?
Or any better solution to replace all my dates with this format yyyy-mm-dd-IN and yyyy-mm-dd-OUT.
Ex: 2022-12-01-IN, 2022-12-01-OUT,… 2022-12-02-IN …
I have a range of dates from 2022-1-1 to 2022-12-1
>Solution :
you can use regex to pad single digits found between two - with a zero:
import re
df.columns = [re.sub(r'(?<=-)(\d)(?=-)', r'0\1', x) for x in df.columns]
or directly:
df.columns = df.columns.str.replace(r'(?<=-)(\d)(?=-)', r'0\1', regex=True)