I have a data frame and I want to delete the rows in which the column "PHRASE" contains the pattern "___".
| Index | PHRASE | Label |
|---|---|---|
| 0 | proposed by the president of the | 1 |
| 1 | Living ___ | 1 |
| 2 | "Murder, ___ Wrote" | 0 |
But imagine that the data frame has 2,000,000 entries.
import re
import pandas as pd

df_clean = pd.DataFrame()
z = 0
y = 0
for i in df_original["PHRASE"]:
    x = re.search("___", i)
    if x:
        y = y + 1
    else:
        df_clean.append([i])
        z = z + 1
This is what I came up with so far. I know it's not right. Does anyone know the answer? (By the way, append takes a lot of time.)
>Solution :
df_clean = df_original[~df_original['PHRASE'].str.contains('___')]
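Here the ~ symbol negates the boolean mask returned by str.contains, so only the rows whose phrase does not contain "___" are kept. The whole column is filtered in one vectorized operation, so no Python loop or repeated append is needed.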
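For reference, here is a minimal runnable sketch of the same idea on the three sample rows from the question. The names `has_blank`, `removed`, and `kept` are made up for illustration, and `regex=False` and `na=False` are optional extras that were not in the original answer: the first treats "___" as plain text, the second treats missing values as non-matching, which is only an assumption about how such rows should be handled.

```python
import pandas as pd

# Sample rows copied from the table in the question.
df_original = pd.DataFrame({
    "PHRASE": [
        "proposed by the president of the",
        "Living ___",
        '"Murder, ___ Wrote"',
    ],
    "Label": [1, 1, 0],
})

# Boolean mask: True where the phrase contains the literal text "___".
# regex=False skips regex parsing; na=False counts missing values as "no match".
has_blank = df_original["PHRASE"].str.contains("___", regex=False, na=False)

# ~ inverts the mask, so only rows without the pattern are kept.
df_clean = df_original[~has_blank]

removed = int(has_blank.sum())  # same role as y in the question's loop
kept = len(df_clean)            # same role as z in the question's loop
print(df_clean)
```

The filtered frame keeps the original index labels. If a fresh 0..n-1 index is wanted, `df_clean.reset_index(drop=True)` should do it.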