Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Columns selection on specific text

I want to extract specific columns that contain specific names. Below you can see my data

import numpy as np
import pandas as pd


data = {
         'Names': ['Store (007) Total amount of Sales ',
                   'Store perc (65) Total amount of sales ',
                   'Mall store, aid (005) Total amount of sales',
                   'Increase in the value of sales / Additional seling (22) Total amount of sales',
                   'Dividends (0233) Amount of income tax',
                   'Other income (098) Total amount of Sales',
                   'Other income (0245) Amount of Income Tax',
                    ],
         'Sales':[10,10,9,7,5,5,5],
         
         
        }

df = pd.DataFrame(data, columns = ['Names',
                                   'Sales',
                             
                                   ])
df

This data have some specific columns that I need to be selected in the separate data frame. Keywords for this selection are words Total amount of Sales or Total amount of sales . These words are placed after the second brackets ). Also please take into account that text is no trimmed so empty spaces are possible.

enter image description here

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

So can anybody help me how to solve this ?

>Solution :

Use Series.str.contains without test cases with case=False in boolean indexing:

df1 = df[df['Names'].str.contains('Total amount of Sales', case=False)]
print (df1)
                                               Names  Sales
0                 Store (007) Total amount of Sales      10
1             Store perc (65) Total amount of sales      10
2        Mall store, aid (005) Total amount of sales      9
3  Increase in the value of sales / Additional se...      7
5           Other income (098) Total amount of Sales      5

Or if need test sales or Sales use:

df2 = df[df['Names'].str.contains('Total amount of [Ss]ales')]
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading