Advertisements
I tried to split the date column into 3 separate columns
df[['date1', 'date2', 'date3']] = df['Date'].str.split('/')
Here’s the error I’m getting
ValueError: Columns must be same length as key
It works fine when I just type in df[‘Date’].str.split(‘/’) and I was able to confirm that each list has only 3 elements.
I think it might have to do with some of the dates not having the full year, so the length of the column varies but I’m not sure why this would matter here.
df['Date'].value_counts()
22/05/2022 10
26/12/05 10
26/12/08 10
11/05/14 10
12/05/2019 10
..
28/02/05 1
14/09/2015 1
27/09/2015 1
28/09/2015 1
17/08/2015 1
df['Date'].str.len().value_counts()
8.0 4850
10.0 2280
Name: Date, dtype: int64
>Solution :
By default Series.str.split
will return a Series (i.e. a single "column") with lists that contain the split elements).
df['Date'].str.split('/')
0 [22, 05, 2022]
1 [26, 12, 05]
2 [26, 12, 08]
3 [11, 5, 2014]
4 [12, 5, 2019]
5 [28, 02, 05]
6 [14, 09, 2015]
7 [27, 09, 2015]
8 [28, 09, 2015]
9 [17, 08, 2015]
Name: Date, dtype: object
So, the error you are getting is the result of trying to assign this single "column" to three new columns. To fix this, you need to set the expand
parameter to True
. This will expand the result into three different columns, which we can then assign as intended:
df[['date1', 'date2', 'date3']] = df['Date'].str.split('/', expand=True)
df
Date Count date1 date2 date3
0 22/05/2022 10 22 05 2022
1 26/12/05 10 26 12 05
2 26/12/08 10 26 12 08
3 11/5/2014 10 11 5 2014
4 12/5/2019 10 12 5 2019
5 28/02/05 1 28 02 05
6 14/09/2015 1 14 09 2015
7 27/09/2015 1 27 09 2015
8 28/09/2015 1 28 09 2015
9 17/08/2015 1 17 08 2015
Data used
import pandas as pd
data = {'Date': {0: '22/05/2022', 1: '26/12/05', 2: '26/12/08', 3: '11/5/2014',
4: '12/5/2019', 5: '28/02/05', 6: '14/09/2015', 7: '27/09/2015',
8: '28/09/2015', 9: '17/08/2015'},
'Count': {0: 10, 1: 10, 2: 10, 3: 10, 4: 10, 5: 1, 6: 1, 7: 1, 8: 1, 9: 1}}
df = pd.DataFrame(data)