Pythonic way to locate index and column of polars dataframe based on absolute sorted values

Advertisements

I have a dataframe containing multiple columns out of which 1st column is regarded as index. The remaining columns containing values that I want to sort by absolute key and thereafter create new dataframe indicating original dataframe location in terms of index and column of the sorted values. Highly appreciate if somebody can shade some light on how to do it in more pythonic way.

import polars as pl
import numpy as np

df = pl.DataFrame({
    "name": ["a", "b", "c", "d", "e", "f"],
    "val1": [1.2, -2.3, 3, -3.3, 2.2, -1.3],
    "val2": [5, 2, 2, -4, -3, -6]})

vals = df[df.columns[1:]].to_numpy()
sorted_vals = sorted(tuple(vals.reshape(-1,)), key=abs)[::-1]

data = []
for sv in sorted_vals:
    i, c = int(np.where(vals==sv)[0][0]), int(np.where(vals==sv)[1][0])
    data.append([sv, df[i,'name'], df.columns[1+c]])

new_df = pl.DataFrame(data=data, orient='row', schema=['val', 'name', 'col'])

print(new_df)

# shape: (12, 3)
# ┌──────┬──────┬──────┐
# │ val  ┆ name ┆ col  │
# │ ---  ┆ ---  ┆ ---  │
# │ f64  ┆ str  ┆ str  │
# ╞══════╪══════╪══════╡
# │ -6.0 ┆ f    ┆ val2 │
# │ 5.0  ┆ a    ┆ val2 │
# │ -4.0 ┆ d    ┆ val2 │
# │ -3.3 ┆ d    ┆ val1 │
# │ …    ┆ …    ┆ …    │
# │ 2.0  ┆ b    ┆ val2 │
# │ 2.0  ┆ b    ┆ val2 │
# │ -1.3 ┆ f    ┆ val1 │
# │ 1.2  ┆ a    ┆ val1 │
# └──────┴──────┴──────┘

>Solution :

You should melt and sort (with an abs expression for the absolute value):

out = (df.melt('name', variable_name='col', value_name='val')
         .sort(pl.col('val').abs(), descending=True)
      )

Output:

shape: (12, 3)
┌──────┬──────┬──────┐
│ name ┆ col  ┆ val  │
│ ---  ┆ ---  ┆ ---  │
│ str  ┆ str  ┆ f64  │
╞══════╪══════╪══════╡
│ f    ┆ val2 ┆ -6.0 │
│ a    ┆ val2 ┆ 5.0  │
│ d    ┆ val2 ┆ -4.0 │
│ d    ┆ val1 ┆ -3.3 │
│ …    ┆ …    ┆ …    │
│ b    ┆ val2 ┆ 2.0  │
│ c    ┆ val2 ┆ 2.0  │
│ f    ┆ val1 ┆ -1.3 │
│ a    ┆ val1 ┆ 1.2  │
└──────┴──────┴──────┘

Leave a ReplyCancel reply