My solution for finding the unique mode within each group is up and running. But as you can see, I join the aggregated result back onto the original DataFrame to get the desired output. Is there a way to implement the same solution without using a join?
import polars as pl
# Sample data (replace with your actual data)
data = pl.DataFrame({
    "group": ["A", "A", "B", "B", "C", "C", "C", "C"],
    "value": [1, 1, 2, 2, 3, 3, 4, 4],
})
# Working solution: aggregate per group, then join back
# Group by the "group" column
grouped_data = data.group_by("group")
# Find the mode (most frequent value) for each group;
# fall back to -999 when the group has no unique mode
mode_values = grouped_data.agg(
    pl.when(pl.col("value").mode().len() == 1)
    .then(pl.col("value").mode().first())
    .otherwise(-999)
)
print(mode_values)
# Join the per-group result back onto the original rows
data = data.join(mode_values, on="group", how="left")
print(data)
shape: (3, 2)
┌───────┬───────┐
│ group ┆ value │
│ --- ┆ --- │
│ str ┆ i64 │
╞═══════╪═══════╡
│ C ┆ -999 │
│ A ┆ 1 │
│ B ┆ 2 │
└───────┴───────┘
shape: (8, 3)
┌───────┬───────┬─────────────┐
│ group ┆ value ┆ value_right │
│ --- ┆ --- ┆ --- │
│ str ┆ i64 ┆ i64 │
╞═══════╪═══════╪═════════════╡
│ A ┆ 1 ┆ 1 │
│ A ┆ 1 ┆ 1 │
│ B ┆ 2 ┆ 2 │
│ B ┆ 2 ┆ 2 │
│ C ┆ 3 ┆ -999 │
│ C ┆ 3 ┆ -999 │
│ C ┆ 4 ┆ -999 │
│ C ┆ 4 ┆ -999 │
└───────┴───────┴─────────────┘
This question is a continuation of "find mode in multiple group and return null if no unique mode".
Let me know if this can be answered under the old question, and I will delete this one.
>Solution :
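You could use over, a window expression that evaluates the expression per group and broadcasts the result to every row of that group, so no join is needed: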
data.with_columns(pl.when(pl.col('value').mode().len()==1)
.then(pl.col('value').mode().first())
.otherwise(-999)
.over('group')
.alias('value_right')
)
Output:
shape: (8, 3)
┌───────┬───────┬─────────────┐
│ group ┆ value ┆ value_right │
│ --- ┆ --- ┆ --- │
│ str ┆ i64 ┆ i64 │
╞═══════╪═══════╪═════════════╡
│ A ┆ 1 ┆ 1 │
│ A ┆ 1 ┆ 1 │
│ B ┆ 2 ┆ 2 │
│ B ┆ 2 ┆ 2 │
│ C ┆ 3 ┆ -999 │
│ C ┆ 3 ┆ -999 │
│ C ┆ 4 ┆ -999 │
│ C ┆ 4 ┆ -999 │
└───────┴───────┴─────────────┘
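If you want to sanity-check the column independently of Polars, here is a minimal plain-Python sketch of what the windowed expression computes: count values per group, keep the mode only when exactly one value has the highest count, and broadcast that result to every row of the group. The function name unique_mode_per_group and its sentinel parameter are made up for this illustration and are not part of Polars.

```python
from collections import Counter, defaultdict


def unique_mode_per_group(groups, values, sentinel=-999):
    """Return, for each row, its group's mode if it is unique, else `sentinel`.

    Hypothetical helper for illustration only; it mirrors the logic of the
    when/then/otherwise expression evaluated over each group.
    """
    # Count how often each value occurs within each group.
    counts = defaultdict(Counter)
    for g, v in zip(groups, values):
        counts[g][v] += 1

    # Decide one result per group: the mode if unique, otherwise the sentinel.
    group_result = {}
    for g, counter in counts.items():
        top = max(counter.values())
        modes = [v for v, c in counter.items() if c == top]
        group_result[g] = modes[0] if len(modes) == 1 else sentinel

    # Broadcast the per-group result back to every original row.
    return [group_result[g] for g in groups]


groups = ["A", "A", "B", "B", "C", "C", "C", "C"]
values = [1, 1, 2, 2, 3, 3, 4, 4]
value_right = unique_mode_per_group(groups, values)
print(value_right)
```

For the sample data this gives 1 for group A, 2 for group B, and -999 for group C (3 and 4 tie), which matches the value_right column shown above.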