I have just started learning Polars and am trying to convert my pandas code to Polars.
Here is my code:
import polars as pl
import pandas as pd

pdf = pl.DataFrame({
    "Country": ["US", "UK", "US"],
    "Date": pd.to_datetime(["2023-01-01", "2023-02-01", "2023-03-01"]),
})
df = pdf.to_pandas()
# Using Pandas
out = df.groupby('Country').apply(lambda row: row['Date'].dt.strftime('%Y-%m').value_counts()).unstack(0).sort_index()
print(out)
# How to do that in polars? (my attempt so far)
pdf = pdf.with_columns(
    Date_ym=pdf['Date'].dt.strftime("%Y-%m")
)
out = (
    pdf.groupby("Country")
    .agg(pl.col('Date_ym').value_counts())
    .explode('Date_ym')
)
print(out)
Required output:
Country   UK   US
2023-01  NaN  1.0
2023-02  1.0  NaN
2023-03  NaN  1.0
>Solution :
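This looks like a job for .pivot(): count the rows per Date and Country, then format the Date column as a year-month string.
# pdf is the Polars frame from the question (df there is the pandas copy).
# Newer Polars releases name the `columns` argument `on`.
(pdf.pivot(values='Date', index='Date', columns='Country', aggregate_function='count')
    .with_columns(
        pl.col('Date').dt.to_string('%Y-%m')
    )
)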
shape: (3, 3)
┌─────────┬──────┬──────┐
│ Date    ┆ US   ┆ UK   │
│ ---     ┆ ---  ┆ ---  │
│ str     ┆ u32  ┆ u32  │
╞═════════╪══════╪══════╡
│ 2023-01 ┆ 1    ┆ null │
│ 2023-02 ┆ null ┆ 1    │
│ 2023-03 ┆ 1    ┆ null │
└─────────┴──────┴──────┘