My data is formatted like this:
| product_name |
|---|
| HP Ryzen 5 Hexa Core 5500U – (16 GB/512 GB SSD/Windows 11 Home) 15s- eq2182AU Thin and Light Laptop |
| DELL Inspiron Athlon Dual Core 3050U – (8 GB/256 GB SSD/Windows 11 Home) Inspiron 3525 Notebook |
These names are too long, and I would like to shorten them. A common theme with all rows of my data is that all the text before the first occurrence of - ( is what I want to preserve for the product name.
How do I remove all the text that comes after - (, including - ( itself?
>Solution :
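pandas's applymap, which calls a function on every cell of a DataFrame, should do it (since pandas 2.1, applymap is deprecated in favor of DataFrame.map, which is used the same way):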
import pandas as pd

def shorten(s):
    # Keep only the text before the first ' - ('
    return s.split(' - (')[0]

df = pd.DataFrame(['abc - (123)', 'def - (456)'])
print(df)
df = df.applymap(shorten)  # run shorten on every cell
print(df)
Output:
             0
0  abc - (123)
1  def - (456)
     0
0  abc
1  def
If you only want to modify a specific column, e.g. "product_name", call apply on that column instead:
import pandas as pd

def shorten(s):
    # Keep only the text before the first ' - ('
    return s.split(' - (')[0]

df = pd.DataFrame([['abc - (123)'], ['def - (456)']], columns = ['product_name'])
print(df)
# Series.apply runs shorten on each value of this one column
df['product_name'] = df['product_name'].apply(shorten)
print(df)
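One caveat: the sample rows in the question appear to use an en dash (–) rather than a plain hyphen before the parenthesis, and splitting on ' - (' would leave such rows unchanged. As a rough sketch, assuming the separator may be either character, a vectorized regex replace on the column handles both without a helper function. The new column name short_name is made up for illustration, and the second row is altered to use a plain hyphen so that both cases are shown.

```python
import pandas as pd

df = pd.DataFrame({
    "product_name": [
        # Row as it appears in the question (en dash before the parenthesis)
        "HP Ryzen 5 Hexa Core 5500U – (16 GB/512 GB SSD/Windows 11 Home) 15s- eq2182AU Thin and Light Laptop",
        # Same kind of row, altered to use a plain hyphen for illustration
        "DELL Inspiron Athlon Dual Core 3050U - (8 GB/256 GB SSD/Windows 11 Home) Inspiron 3525 Notebook",
    ]
})

# Remove everything from the first "- (" or "– (" onward, including the
# whitespace in front of the dash. "short_name" is a hypothetical column name.
df["short_name"] = df["product_name"].str.replace(
    r"\s*[-–]\s*\(.*$", "", regex=True
)

print(df["short_name"].tolist())
```

A later hyphen that is not followed by an opening parenthesis (such as the one in "15s- eq2182AU") is not treated as the cut point, because the pattern requires the parenthesis right after the dash.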