Pure Polars version of a safe ast.literal_eval

I have data like this:

import polars as pl
df = pl.DataFrame({'a': ["['b', 'c', 'd']"]})

I want to convert the string to a list, so I use:

df = df.with_columns(a=pl.col('a').str.json_decode())

which gives me:

ComputeError: error inferring JSON: InternalError(TapeError) at character 1 (''')

So instead I use this function:

import ast

def safe_literal_eval(val):
    # Parse a Python literal; fall back to the raw string if parsing fails.
    try:
        return ast.literal_eval(val)
    except (ValueError, SyntaxError):
        return val

df = df.with_columns(
    a=pl.col('a').map_elements(safe_literal_eval, return_dtype=pl.List(pl.String))
)

This gives the expected output, but is there a pure Polars way to achieve the same result?

>Solution:

A general equivalent of ast.literal_eval is not yet available in Polars. The problem with json_decode is that the list representation uses single quotes, whereas JSON requires double quotes around strings.
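The quoting rule can be checked in isolation. The sketch below uses Python's standard json module rather than Polars' own parser, so it only illustrates the JSON rule itself; the helper name parses_as_json is made up for this example.

```python
import json

python_repr = "['b', 'c', 'd']"  # single quotes: valid Python literal, invalid JSON
json_text = '["b", "c", "d"]'    # double quotes: valid JSON


def parses_as_json(text: str) -> bool:
    """Hypothetical helper: report whether text is valid JSON."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


print(parses_as_json(python_repr))
print(parses_as_json(json_text))
```

Only the double-quoted form is accepted, which is why the single-quoted strings in the column need to be rewritten before json_decode can handle them.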

In your example, this issue can be circumvented by replacing the single quotes with double quotes using pl.Expr.str.replace_all, as follows.

df.with_columns(
    pl.col("a").str.replace_all("'", '"').str.json_decode()
)

shape: (1, 1)
┌─────────────────┐
│ a               │
│ ---             │
│ list[str]       │
╞═════════════════╡
│ ["b", "c", "d"] │
└─────────────────┘