I have a large Pandas DataFrame with >100 columns and I would like to select all columns where the substring einkst_l appears in the column name.
In addition, I want to select the two columns name and year.
So far, I could only create two new data frames:
e = 'einkst_l'
df_1 = df.filter(like = e, axis=1).reset_index(drop=True)
df_2 = df.filter(items = ['name', 'year'], axis=1).reset_index(drop=True)
I would like to select all the columns in one shot, but unfortunately ‘like’ and ‘items’ cannot be combined in one statement.
How can I select name + year + all columns containing the specified substring all at once?
>Solution :
This is fuzzier, but you could use a regex match on the column names:
df[df.columns[df.columns.str.contains('einkst_l|name|year')]]
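You could also anchor name and year with ^ and $ (for example ^name$|^year$) so that they match only those exact column names, not columns such as surname or year_end.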
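To make the difference concrete, here is a minimal sketch comparing the unanchored and anchored patterns. The sample frame and its column names (surname, year_end, and so on) are made up for illustration, and it uses df.filter(regex=...), which applies re.search to each column label, in place of the str.contains form above. The last lines show a regex-free alternative that builds the column list by hand.

```python
import re
import pandas as pd

# Hypothetical sample frame; the column names are invented for illustration.
df = pd.DataFrame({
    "name": ["a", "b"],
    "year": [2020, 2021],
    "einkst_l_1": [1, 2],
    "x_einkst_l": [3, 4],
    "surname": ["c", "d"],
    "year_end": [5, 6],
})

e = "einkst_l"

# Unanchored pattern: 'name' and 'year' match as substrings,
# so 'surname' and 'year_end' are selected as well.
fuzzy = df.filter(regex=f"{re.escape(e)}|name|year")

# Anchored pattern: 'name' and 'year' must match the whole label,
# while the substring e may still appear anywhere.
exact = df.filter(regex=f"{re.escape(e)}|^name$|^year$")

# Regex-free alternative: build the column list explicitly.
cols = ["name", "year"] + [c for c in df.columns if e in c]
explicit = df[cols]

print(list(fuzzy.columns))
print(list(exact.columns))
print(list(explicit.columns))
```

re.escape is only a precaution in case the substring ever contains regex metacharacters. Note that filter keeps the columns in their original order, whereas the explicit list puts name and year first regardless of where they sit in the frame.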