Home How to select column range based on partial column names in Pandas Python?

Questions

How to select column range based on partial column names in Pandas Python?

November 6, 2024

I have pandas dataframe and I am trying to select multiple columns (column range starting from Test to Bio Ref). Selection has to start from column Name Test to any column name whose Name starts with Bio. Below is the sample dataframe.

In reality it can contain:

any number of columns before Test column,
any number of columns between Test & Bio Ref like 2,3,4,5 etc.
any number of columns after Bio Ref.
Bio Ref column can contain suffix in it but Bio Ref will be there as start of column name always.

df_chunk = pd.DataFrame({'Waste':[None,None],
              'Test':['something', 'something'],
              '2':[None,None],
              '3':[None,None],
              'Bio Ref':['2-50','15-100'],
              'None':[None,None]})

df_chunk

    Waste   Test           2      3  Bio Ref    None
0   None    something   None    None    2-50    None
1   None    something   None    None    15-100  None

I have tried below codes that work:

df_chunk.columns.str.startswith('Bio')

df_chunk[df_chunk.columns[pd.Series(df_chunk.columns).str.startswith('Bio')==1]]

Issue: But when I try to use them for multiple column Selection then it doesn’t work:

df_chunk.loc[:,'Test':df_chunk.columns.str.startswith('Bio')]

>Solution :

You can creates masks for boolean indexing:

m1 = np.maximum.accumulate(df_chunk.columns=='Test')
# array([False,  True,  True,  True,  True,  True])
m2 = np.maximum.accumulate(df_chunk.columns.str.startswith('Bio')[::-1])[::-1]
# array([ True,  True,  True,  True,  True, False])

# m1 & m2
# array([False,  True,  True,  True,  True, False])

out = df_chunk.loc[:, (m1&m2)]

Or identify the correct names to build a slice:

start = 'Test'
end = next(iter(df_chunk.columns[df_chunk.columns.str.startswith('Bio')]), None)

out = df_chunk.loc[:, slice(start, end)]

Output:

        Test     2     3 Bio Ref
0  something  None  None    2-50
1  something  None  None  15-100

dataframe

byMR

Published November 06, 2024

Add a comment

algorithm for undoing Bresenham lines

byMR

November 6, 2024

Questions

Convert strange Datetime format

byMR

November 6, 2024

Questions

Why is my Angular signal not populating with data from localStorage?

byMR

November 6, 2024

Questions

Is it possible to generate a summary table of another table?

byMR

November 7, 2024

Questions

How to make a ruler to measure "cm"?

byMR

November 7, 2024

Questions

Why is this concept true?

byMR

November 7, 2024

How to select column range based on partial column names in Pandas Python?

MEDevel.com: Open-source for Healthcare and Education

>Solution :

Like this:

Leave a ReplyCancel reply

Read more

algorithm for undoing Bresenham lines

Convert strange Datetime format

Why is my Angular signal not populating with data from localStorage?

Is it possible to generate a summary table of another table?

How to make a ruler to measure "cm"?

Why is this concept true?

Keep Up to Date with the Most Important News

How to select column range based on partial column names in Pandas Python?

MEDevel.com: Open-source for Healthcare and Education

>Solution :

Share this:

Like this:

Leave a ReplyCancel reply

Keep Up to Date with the Most Important News

Read more

algorithm for undoing Bresenham lines

Convert strange Datetime format

Why is my Angular signal not populating with data from localStorage?

Is it possible to generate a summary table of another table?

How to make a ruler to measure "cm"?

Why is this concept true?

Discover more from Dev solutions