How to use the Polars library in Python to find consecutive 1's?

Here is some Polars code along with test data:

import polars as pl

data = {'test': [0, 1, 1, 0, 1, 1, 1, 0, 0, 1, 0, 1, 1, 1, 1, 0, 1, 0, 0, 1, 1, 0, 0, 0, 1, 1, 1, 1, 0, 1, 0, 0, 1]}
df = pl.DataFrame(data)

I want to achieve the following result:

[0, 1, 2, 0, 1, 2, 3, 0, 0, 1, 0, 1, 2, 3, 4, 0, 1, 0, 0, 1, 2, 0, 0, 0, 1, 2, 3, 4, 0, 1, 0, 0, 1]

That is, the original 0 values stay unchanged, consecutive 1's are counted up (1, 2, 3, ...), and the count resets to 0 whenever a 0 is encountered.

The end result should still be a pl.DataFrame.

The data set is too large for anything like a Python for loop.

How can I do this using only Polars functions, without numpy or other libraries?

>Solution :

One way to look at the output is as a cumulative count within groups, where a new group starts every time a 0 appears in the input data. That leads to the following expression:

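df.with_columns(
    # Written against recent Polars, where the method is cum_sum (older releases spell it cumsum).
    # The running sum of the 0/1 column within each group equals the number of 1's since the last 0.
    # It is used here instead of a cumulative count because cum_count's starting value differs between Polars versions.
    pl.col("test")
    .cum_sum()
    .over(pl.when(pl.col("test") == 0).then(1).cum_sum().forward_fill())
)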

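Inside over, the when/then marks each 0 with a literal 1 and leaves every other row null. The cumulative sum of those markers gives each 0 a new group number, and forward_fill copies that number onto the 1's that follow, which creates the groups we need.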
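To see what the grouping does, here is a plain-Python trace of the same idea on the first ten test values. It is only a sketch for following the logic by hand, not a replacement for the Polars expression (it uses exactly the kind of loop the question rules out). The function name trace_groups is made up for this illustration.

```python
def trace_groups(values):
    """Return (group_ids, streaks) for a list of 0/1 values.

    group_ids mirrors the idea of the expression inside over():
    every 0 opens a new group, and the 1's after it inherit that group.
    streaks is the number of 1's seen since the last 0.
    """
    group_ids, streaks = [], []
    group, streak = None, 0  # group stays None until the first 0 is seen
    for v in values:
        if v == 0:
            # A 0 starts a new group and resets the running count.
            group = 1 if group is None else group + 1
            streak = 0
        else:
            # A 1 stays in the current group and extends the streak.
            streak += 1
        group_ids.append(group)
        streaks.append(streak)
    return group_ids, streaks


sample = [0, 1, 1, 0, 1, 1, 1, 0, 0, 1]
group_ids, streaks = trace_groups(sample)
print(group_ids)
print(streaks)
```

Each group begins with a 0, so counting the 1's inside a group is the same as counting consecutive 1's since the last reset.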
