Get the words in a string that come before a fixed pattern in pandas

March 6, 2022

I have a pandas dataframe in which one of the columns holds strings. From that column I only want the words that come before a date (the date is also part of the string).
The problem is that I don’t know how many words there are in front of the date.

The strings in the column look like the following:

word1 word2 word3 02/08/2022 XXX XXX XXX
word1 04/09/2019 XXX XXX XXX
word1 word2 word3 word4 10/12/2021 XXX XXX XXX
word1 word2 30/11/2022 XXX XXX XXX

So I want only:

word1 word2 word3
word1
word1 word2 word3 word4
word1 word2

Each ‘XXX’ stands for a word; I do not know in advance how many of them there are.

Can someone help me with this problem?

>Solution :

We can use Series.str.split with a regex pattern that matches the date, and then keep the first piece of each split:

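import pandas as pd

s = pd.Series(["word1 word2 word3 02/08/2022 XXX XXX XXX", "word1 04/09/2019 XXX XXX XXX"])

# Split on the dd/mm/yyyy date and keep the part before it (raw string, so \d is not read as a string escape)
s.str.split(r"\d{2}/\d{2}/\d{4}").str[0]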

0    word1 word2 word3 
1                word1 
dtype: object
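
Note that the split result above keeps the space that preceded the date. As a possible variation, here is a minimal sketch that uses Series.str.extract instead, applied to all four example rows from the question. The column names "text" and "prefix" are made up for illustration, and the pattern assumes every date is written as two digits, two digits and four digits separated by slashes. Rows without such a date would come back as NaN.

```python
import pandas as pd

# Hypothetical dataframe holding the four example strings from the question.
df = pd.DataFrame({
    "text": [
        "word1 word2 word3 02/08/2022 XXX XXX XXX",
        "word1 04/09/2019 XXX XXX XXX",
        "word1 word2 word3 word4 10/12/2021 XXX XXX XXX",
        "word1 word2 30/11/2022 XXX XXX XXX",
    ]
})

# Lazily capture everything from the start of the string up to the first
# dd/mm/yyyy date; \s* keeps the whitespace before the date out of the group.
pattern = r"^(.*?)\s*\d{2}/\d{2}/\d{4}"

# expand=False returns a Series because the pattern has a single capture group.
df["prefix"] = df["text"].str.extract(pattern, expand=False)

print(df["prefix"].tolist())
```

If you prefer to stay with the split approach, chaining .str.strip() onto the result should remove the trailing space as well.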