python polars: new column based on condition/comparison of two existing columns

I am trying to create a new column in a Polars DataFrame based on a comparison of two existing columns:

import polars as pl
data = {"a": [2, 30], "b": [20, 3]}
df = pl.DataFrame(data)

df
Out[4]:
shape: (2, 2)
┌─────┬─────┐
│ a   ┆ b   │
│ --- ┆ --- │
│ i64 ┆ i64 │
╞═════╪═════╡
│ 2   ┆ 20  │
│ 30  ┆ 3   │
└─────┴─────┘

When I do:

df.with_columns(pl.map(["a", "b"], lambda s: "+" if s[0] < s[1] else "-").alias("strand"))

I am getting an error:

thread '<unnamed>' panicked at 'python apply failed: The truth value of a Series is ambiguous. Hint: use '&' or '|' to chain Series boolean results together, not and/or; to check if a Series contains any values, use 'is_empty()'', src/lazy/apply.rs:185:19

I am able to create a boolean column:

df.with_columns(pl.map(["a", "b"], lambda s: s[0] < s[1] ).alias("strand"))

so with extra steps I should get the column with the desired "+" and "-", but is there some simpler way?

Thank you for your help

DK

>Solution :

You can use Polars expressions, e.g. when/then/otherwise, which evaluate the condition row by row:

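df.with_columns(
   # pl.lit marks "+" and "-" as literal values; newer Polars versions
   # interpret a bare string in then/otherwise as a column name
   pl.when(pl.col("a") < pl.col("b")).then(pl.lit("+")).otherwise(pl.lit("-"))
     .alias("strand")
)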
shape: (2, 3)
┌─────┬─────┬────────┐
│ a   ┆ b   ┆ strand │
│ --- ┆ --- ┆ ---    │
│ i64 ┆ i64 ┆ str    │
╞═════╪═════╪════════╡
│ 2   ┆ 20  ┆ +      │
│ 30  ┆ 3   ┆ -      │
└─────┴─────┴────────┘

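or map the boolean result with .map_dict (note: newer Polars releases deprecate this method in favour of .replace / .replace_strict):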

df.with_columns(
   (pl.col("a") < pl.col("b"))
   .map_dict({True: "+", False: "-"})
   .alias("strand")
)
shape: (2, 3)
┌─────┬─────┬────────┐
│ a   ┆ b   ┆ strand │
│ --- ┆ --- ┆ ---    │
│ i64 ┆ i64 ┆ str    │
╞═════╪═════╪════════╡
│ 2   ┆ 20  ┆ +      │
│ 30  ┆ 3   ┆ -      │
└─────┴─────┴────────┘
