I have a dataframe like
df = pd.DataFrame({'team': ['Red', 'Blue', 'Yellow', 'Green'],
'points': [11, 7, 8, 10],
'other': ["kk;aaa;;;", ";;bb;;", ";", ";;;;"]})
I would like to replace all cells that contain only ";" with "" (empty cells). There may be only 1 or many ";" in a row. If there is anything in the cells other than ";", I would like to leave them as is.
In my dataframe the other column would become:
other
kk;aaa;;;
;;bb;;
#(nothing)
#(nothing)
>Solution :
You can use Series.replace with a regex pattern:
df['other'] = df.other.replace(r'^;{1,}$','', regex=True)
print(df)
team points other
0 Red 11 kk;aaa;;;
1 Blue 7 ;;bb;;
2 Yellow 8
3 Green 10
Explanation pattern ^;{1,}$:
^asserts position at start of the string;matches the character;{1,}matches the previous token (i.e.;) between one and unlimited times$asserts position at the end of the string
You can also use this on multiple columns, with df.replace. E.g.:
df = pd.DataFrame({'team': ['Red', 'Blue', 'Yellow', 'Green'],
'points': [11, 7, 8, 10],
'other': ["kk;aaa;;;", ";;bb;;", ";", ";;;;"],
'other2': ["kk;aaa;;;", ";;bb;;", ";", ";;;;"]})
cols = ['other','other2']
df[cols] = df[cols].replace(r'^;{1,}$','', regex=True)
print(df)
team points other other2
0 Red 11 kk;aaa;;; kk;aaa;;;
1 Blue 7 ;;bb;; ;;bb;;
2 Yellow 8
3 Green 10