I’m trying to compute the entropy of a list, but I need to do a conversion first:
import polars as pl
df = pl.DataFrame({"Result": "1, 2, 3"})
df.select(pl.col("Result").str.split(",").cast(pl.List(pl.Float64)).entropy()).collect()
but this gives:
ComputeError: cannot cast List type (inner: 'Float64', to: 'Float64')
What’s wrong here?
>Solution:
For this problem you’ll need to do a couple of things:
- properly parse the numbers (including the space after the comma)
- use .list.eval(…entropy()) to calculate the entropy per list
- the result is a list of length 1, so we grab the calculated entropy with .list.get(0)
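To see what the first point is about, here is a minimal sketch using plain Python's str.split instead of Polars (an assumption made purely for illustration; the variable names raw, loose and clean are hypothetical). It only shows what each delimiter leaves behind in the pieces:

```python
raw = "1, 2, 3"

# Splitting on "," alone leaves a leading space on every item after the first.
loose = raw.split(",")

# Splitting on ", " consumes the space as part of the delimiter.
clean = raw.split(", ")

print(loose)
print(clean)
```

The space-padded pieces in the first result are what the first bullet is about; splitting on ", " hands the cast clean numeric strings.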
import polars as pl

print(pl.__version__)  # 0.20.2

df = pl.DataFrame({"Result": ["1, 2, 3", "4, 5, 6"]})

print(
    df.select(
        pl.col("Result").str.split(", ")      # ①
        .cast(pl.List(pl.Float64))
        .list.eval(pl.element().entropy())    # ②
        .list.get(0)                          # ③
    )
    # shape: (2, 1)
    # ┌──────────┐
    # │ Result   │
    # │ ---      │
    # │ f64      │
    # ╞══════════╡
    # │ 1.011404 │
    # │ 1.085189 │
    # └──────────┘
)
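As a sanity check on those numbers, the sketch below recomputes the entropy in plain Python. It assumes entropy() is called with its defaults, which I understand to be natural logarithm with the values normalized to sum to 1; the helper name normalized_entropy is hypothetical and not part of Polars.

```python
import math


def normalized_entropy(values):
    """Shannon entropy (natural log) of values rescaled to sum to 1."""
    total = sum(values)
    probs = [v / total for v in values]
    # Skip zero probabilities, since p * log(p) tends to 0 as p -> 0.
    return -sum(p * math.log(p) for p in probs if p > 0)


for text in ["1, 2, 3", "4, 5, 6"]:
    # Same parsing steps as the answer: split on ", " and convert to float.
    numbers = [float(part) for part in text.split(", ")]
    print(text, "->", round(normalized_entropy(numbers), 6))
```

Under that assumption the two printed values should agree with the table above to the displayed precision.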