Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Python Pandas Dataframe By Column and Row Value

I have a pandas dataframe with multiple column values that are the same. I’d like to select the series in the column where I pass it the column name, and then they have a unique identifier in one of the index/rows. My data frame looks like this:

                       TestPart     TestPart    OtherPart
attribute                                                                                  
Location                 Plant       Factory      Plant
Lead Time                25            56          30
Value1                   5             10          15
Value2                   A             B           C

There are many more parts, but these 2 serve as a good example. I’d like to select all the values in the column TestPart and Plant.

I’m able to isolate these to columns specifically with df.loc[:, "TestPart"], but I’d like to further isolate down by somehow selecting the column based on the Location value.

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

Is this possible? I’ve read through a lot of post trying iloc and other tricks, but can’t seem to come up with the right recipe for this one yet.

EDIT:

Trying to add some more info for clarity. I have the dataframe shown above.

I would like to select the column with the Column = "TesPart" and the Location = "Plant".

The output I need is the entire series in that column like this:

attribute     
Location                         Plant
Lead Time                          25
Value1                             5
Value2                             A

>Solution :

Here need compare row Location selected by DataFrame.loc and columns names, for filter columns use again DataFrame.loc with : for all rows and columns by condition:

df = df.loc[:, df.loc['Location'].eq('Plant') & (df.columns == 'TestPart')]
print (df)
          TestPart
Location     Plant
Lead Time       25
Value1           5
Value2           A

If create MultiIndex by original columns and first row of data solution is simplier – select by tuple:

df.columns = [df.columns, df.iloc[0]]
df1 = df.iloc[1:].reset_index(drop=True)
print (df1)
         TestPart         OtherPart
Location    Plant Factory     Plant
0              25      56        30
1               5      10        15
2               A       B         C

s = df1[('TestPart','Plant')]
print (s)
0    25
1     5
2     A
Name: (TestPart, Plant), dtype: object
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading