Find mode using polars within group

My solution for finding the unique mode within each group is up and running. But as you can see, I am joining DataFrames to get the desired result. Is there a way to implement the same solution without using a join?

import polars as pl

# Sample data (replace with your actual data)
data = pl.DataFrame({
  "group": ["A", "A", "B", "B", "C", "C", "C","C"],
  "value": [1, 1, 2, 2, 3, 3, 4,4]
})

# Group by the "group" column
############# working solution ##################
grouped_data = data.group_by("group")

# For each group, keep the mode (most frequent value) only if it is unique;
# otherwise fall back to the sentinel -999
mode_values = grouped_data.agg(
    pl.when(pl.col("value").mode().len() == 1)
    .then(pl.col("value").mode().first())
    .otherwise(-999)
)
print(mode_values)
############# working solution ##################

# Join the per-group result back onto the original rows
data = data.join(mode_values, on="group", how="left")
print(data)

C:\Dev\Python3.11\python.exe

C:\Python_Projects\Python_extra_code\code_for_stackoverflow.py 
shape: (3, 2)
┌───────┬───────┐
│ group ┆ value │
│ ---   ┆ ---   │
│ str   ┆ i64   │
╞═══════╪═══════╡
│ C     ┆ -999  │
│ A     ┆ 1     │
│ B     ┆ 2     │
└───────┴───────┘
shape: (8, 3)
┌───────┬───────┬─────────────┐
│ group ┆ value ┆ value_right │
│ ---   ┆ ---   ┆ ---         │
│ str   ┆ i64   ┆ i64         │
╞═══════╪═══════╪═════════════╡
│ A     ┆ 1     ┆ 1           │
│ A     ┆ 1     ┆ 1           │
│ B     ┆ 2     ┆ 2           │
│ B     ┆ 2     ┆ 2           │
│ C     ┆ 3     ┆ -999        │
│ C     ┆ 3     ┆ -999        │
│ C     ┆ 4     ┆ -999        │
│ C     ┆ 4     ┆ -999        │
└───────┴───────┴─────────────┘

Process finished with exit code 0

This question is a continuation of "find mode in multiple group and return null if no unique mode".
Let me know if it can be answered under the old question, and I will delete this one.

>Solution :

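You could use over, which evaluates the same expression per group as a window function and broadcasts the result to every row, so no join is needed: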

data.with_columns(pl.when(pl.col('value').mode().len()==1)
                    .then(pl.col('value').mode().first())
                    .otherwise(-999)
                    .over('group')
                    .alias('value_right')
                  )

Output:

shape: (8, 3)
┌───────┬───────┬─────────────┐
│ group ┆ value ┆ value_right │
│ ---   ┆ ---   ┆ ---         │
│ str   ┆ i64   ┆ i64         │
╞═══════╪═══════╪═════════════╡
│ A     ┆ 1     ┆ 1           │
│ A     ┆ 1     ┆ 1           │
│ B     ┆ 2     ┆ 2           │
│ B     ┆ 2     ┆ 2           │
│ C     ┆ 3     ┆ -999        │
│ C     ┆ 3     ┆ -999        │
│ C     ┆ 4     ┆ -999        │
│ C     ┆ 4     ┆ -999        │
└───────┴───────┴─────────────┘