How to remove last few characters from each row/column from pandas (python) dataframe?

November 22, 2021

I have a dataset whose values vary a lot in format, like this:

    -0.002672945<120>
    -0.077635566{600}
    5.88365537e-005{500}
    -0.116441565{1}
    -4.549649974<29.448>

The values end with all kinds of bracketed suffixes, and I need to remove them. The problem is that the suffixes differ in length: sometimes they are 3 characters, sometimes 6, and so on. I also cannot just keep the first few characters, because some values are in scientific notation, such as 8.645637e-007.

Is there a smart way to clean this kind of mess out of the data?

Solution:

>>> import pandas as pd
>>> df = pd.DataFrame({"x": [
... "-0.002672945<120>",
... "-0.077635566{600}",
... "5.88365537e-005{500}",
... "-0.116441565{1}",
... "-4.549649974<29.448>",
... ]})
>>> df["x"].replace(r"[<{].+$", "", regex=True)
0       -0.002672945
1       -0.077635566
2    5.88365537e-005
3       -0.116441565
4       -4.549649974
Name: x, dtype: object

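The pattern matches the first < or { and everything after it up to the end of the string, so the length of the suffix does not matter. You can then assign the result back to the column in df. Note that the cleaned values are still strings (dtype: object).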
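
As a minimal sketch of that last step, the snippet below assigns the cleaned strings back to the column and then converts them to floats. The use of pd.to_numeric with errors="coerce" is an assumption about what you want (unparseable values become NaN instead of raising); it is not part of the answer above.

```python
import pandas as pd

df = pd.DataFrame({"x": [
    "-0.002672945<120>",
    "-0.077635566{600}",
    "5.88365537e-005{500}",
    "-0.116441565{1}",
    "-4.549649974<29.448>",
]})

# Same regex as in the answer: drop everything from the first "<" or "{"
# to the end of the string, then store the result back in the column.
df["x"] = df["x"].replace(r"[<{].+$", "", regex=True)

# Assumption: the column should end up numeric. errors="coerce" turns any
# value that still cannot be parsed into NaN rather than raising an error.
df["x"] = pd.to_numeric(df["x"], errors="coerce")

print(df.dtypes)
```

If several columns carry the same kind of suffix, the same two steps can be applied to each of them in a loop over the column names.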