I am trying to count the number of letters in a string in Polars.
I could probably just use an apply method and get the len(Name).
However, I was wondering whether there is a Polars-specific method?
import polars as pl
mydf = pl.DataFrame(
{"start_date": ["2020-01-02", "2020-01-03", "2020-01-04"],
"Name": ["John", "Joe", "James"]})
print(mydf)
┌────────────┬───────┐
│ start_date ┆ Name  │
│ ---        ┆ ---   │
│ str        ┆ str   │
╞════════════╪═══════╡
│ 2020-01-02 ┆ John  │
│ 2020-01-03 ┆ Joe   │
│ 2020-01-04 ┆ James │
└────────────┴───────┘
In the end, John would have 4, Joe would have 3, and James would have 5.
I thought something like the following might work, based on the Pandas equivalent:
# Assume that it's a Pandas DataFrame
mydf['count'] = mydf['Name'].str.len()
# Polars equivalent - raises an error
mydf = mydf.with_columns(
pl.col('Name').str.len().alias('count')
)
>Solution :
You can use either of these:
- .str.lengths(), which counts the number of bytes in the UTF-8 string (doc); this is the faster option
- .str.n_chars(), which counts the number of characters (doc)
mydf.with_columns([
pl.col("Name").str.lengths().alias("len")
])
┌────────────┬───────┬─────┐
│ start_date ┆ Name  ┆ len │
│ ---        ┆ ---   ┆ --- │
│ str        ┆ str   ┆ u32 │
╞════════════╪═══════╪═════╡
│ 2020-01-02 ┆ John  ┆ 4   │
│ 2020-01-03 ┆ Joe   ┆ 3   │
│ 2020-01-04 ┆ James ┆ 5   │
└────────────┴───────┴─────┘
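For plain ASCII names like the ones here, the byte count and the character count are the same, so either method gives 4, 3 and 5. They only differ once a name contains non-ASCII characters. Here is a rough sketch of that difference in plain Python (no Polars involved); the extra names "José" and "Zoë" and the two helper functions are made up for illustration and are not part of the question's data. Also, as far as I know, newer Polars releases renamed these methods to .str.len_bytes() and .str.len_chars(), so check which names your installed version provides.

```python
# Plain-Python illustration of "bytes" vs "characters" (no Polars needed).
# The last two names are hypothetical additions that contain non-ASCII letters.
names = ["John", "Joe", "James", "Jos\u00e9", "Zo\u00eb"]  # "José", "Zoë"


def utf8_byte_length(text: str) -> int:
    """Number of bytes in the UTF-8 encoding (what a byte-length method counts)."""
    return len(text.encode("utf-8"))


def char_length(text: str) -> int:
    """Number of Unicode code points (what a character-count method counts)."""
    return len(text)


byte_lengths = [utf8_byte_length(name) for name in names]
char_lengths = [char_length(name) for name in names]

for name, n_bytes, n_chars in zip(names, byte_lengths, char_lengths):
    print(f"{name}: {n_bytes} bytes, {n_chars} chars")
```

So if the column can contain accented or other non-ASCII letters and you want the number of letters rather than the storage size, the character-counting method is the one to use, at the cost of being somewhat slower.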