Create duplicate rows based on column values

I’m trying to build a histogram of some data in polars. As part of my histogram code, I need to duplicate some rows. I’ve got a column of values, where each row also has a weight that says how many times the row should be added to the histogram.

How can I duplicate my value rows according to the weight column?

Here is some example data, with a target series:

import polars as pl

df = pl.DataFrame({"value":[1,2,3], "weight":[2, 2, 1]})

print(df)
# shape: (3, 2)
# ┌───────┬────────┐
# │ value ┆ weight │
# │ ---   ┆ ---    │
# │ i64   ┆ i64    │
# ╞═══════╪════════╡
# │ 1     ┆ 2      │
# │ 2     ┆ 2      │
# │ 3     ┆ 1      │
# └───────┴────────┘

s_target = pl.Series(name="value", values=[1,1,2,2,3])
print(s_target)
# shape: (5,)
# Series: 'value' [i64]
# [
#   1
#   1
#   2
#   2
#   3
# ]

Solution:

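How about using repeat_by to build a list of repeated values for each row, then exploding those lists back into rows: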

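(
    df.with_columns(
        # build one list per row, e.g. value 1 with weight 2 becomes [1, 1]
        pl.col("value").repeat_by(pl.col("weight"))
    )
    # flatten the lists; older Polars releases spelled this .arr.explode()
    .select(pl.col("value").explode())
)
In [11]: df.with_columns(pl.col('value').repeat_by(pl.col('weight'))).select(pl.col('value').explode())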
Out[11]:
shape: (5, 1)
┌───────┐
│ value │
│ ---   │
│ i64   │
╞═══════╡
│ 1     │
│ 1     │
│ 2     │
│ 2     │
│ 3     │
└───────┘

I didn’t know you could do this so easily; I only learned about it while writing the answer. Polars is so nice 🙂
