How can I select all columns of a DataFrame whose names partially match strings in a list?


Select all columns in df whose names partially match any of the strings in mylist. MRE:

import pandas as pd

# sample dataframe
df = pd.DataFrame({'foo': [1, 2, 3], 'bar': [4, 5, 6], 'ber': [7, 8, 9]})

# sample list of strings
mylist = ['oo', 'ba']

# desired output
df_out = pd.DataFrame({'foo': [1, 2, 3], 'bar': [4, 5, 6]})

Solution:

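You can use df.filter with its regex parameter. Join the strings with '|' so the pattern matches any one of them; filter then keeps every column whose name contains a match anywhere in it (it applies re.search to each label), which gives the partial matching you want.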

import pandas as pd

# sample dataframe
df = pd.DataFrame({'foo': [1, 2, 3], 'bar': [4, 5, 6], 'ber': [7, 8, 9]})

# sample list of strings
mylist = ['oo', 'ba']

# join the list to a single string
matches = '|'.join(mylist)

# use regex to filter the columns based on the string
df_out = df.filter(regex=matches)
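
One caveat with the approach above: the joined string is treated as a regular expression, so a list entry containing characters such as '.' or '(' would be interpreted as regex syntax. The following is a minimal sketch, assuming the entries are meant as literal substrings; the names pattern, df_escaped, cols and df_substr are only illustrative. It escapes each entry with re.escape and also shows a plain substring test that avoids regex altogether.

```python
import re
import pandas as pd

# same sample data as in the question
df = pd.DataFrame({'foo': [1, 2, 3], 'bar': [4, 5, 6], 'ber': [7, 8, 9]})
mylist = ['oo', 'ba']

# escape each entry so regex metacharacters are matched literally
pattern = '|'.join(re.escape(s) for s in mylist)
df_escaped = df.filter(regex=pattern)

# alternative without regex: keep columns containing any entry as a substring
cols = [c for c in df.columns if any(s in c for s in mylist)]
df_substr = df[cols]

print(list(df_escaped.columns))  # ['foo', 'bar']
print(list(df_substr.columns))   # ['foo', 'bar']
```

Note that an empty mylist behaves differently in the two variants: the joined pattern becomes an empty string, which matches every column name, while the substring test selects no columns.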
