I have a very large pandas.Dataframe and want to create a new Dataframe by selecting all columns where one row has a specific value.
A B C D E
Region Nord Süd West Nord Nord
value 2.3 1.2 4.2 0.5 1.3
value2 20 400 30 123 200
Now i want to create a new DataFrame with all columns where the row "Region" has the value "Nord".
How can it be done? The result should look like this:
A D E
Region Nord Nord Nord
value 2.3 0.5 1.3
value2 20 123 200
Thanks in advance
>Solution :
Use first DataFrame.loc for select all rows (:) by mask compred selected row Region by another loc:
df = df.loc[:, df.loc['Region'] == 'Nord']
print (df)
A D E
Region Nord Nord Nord
value 2.3 0.5 1.3
value2 20 123 200
Better is crated MultiIndex by first row with original columns, then is possible select by DataFrame.xs:
df.columns = [df.columns, df.iloc[0]]
df = df.iloc[1:].rename_axis((None, None), axis=1)
print (df)
A B C D E
Nord Süd West Nord Nord
value 2.3 1.2 4.2 0.5 1.3
value2 20 400 30 123 200
print (df.xs('Nord', axis=1, level=1))
A D E
value 2.3 0.5 1.3
value2 20 123 200
print (df.xs('Nord', axis=1, level=1, drop_level=False))
A D E
Nord Nord Nord
value 2.3 0.5 1.3
value2 20 123 200