I am doing data cleaning for my dataset.
How can I remove links of the form "pic.twitter.com…." from a text column in Python (Google Colab)?
Any suggestions are much appreciated. Thanks.
import re

# remove other links
def removelinks(text):
    links = re.compile(r'????')  # <-- what pattern should go here?
    return links.sub(r'', text)

train_df['clean tweet'] = train_df['clean tweet'].apply(removelinks)
train_df.head()
>Solution :
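You can try the following (regex101). The pattern matches the literal pic.twitter.com, then the rest of the link (\S*) and any whitespace after it (\s*):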
df['clean tweet'] = df['clean tweet'].str.replace(r'pic\.twitter\.com\S*\s*', '', regex=True)
print(df)
Prints:
   clean tweet
0  some tweet1
1  some tweet2
2  some tweet3
Initial df:
                             clean tweet
0                            some tweet1
1  pic.twitter.com/some_link some tweet2
2                            some tweet3
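
If you would rather keep the removelinks helper from the question, the same pattern can be dropped into it. This is only a minimal sketch using the standard re module: the sample strings are made up for illustration, and the final strip() is an extra step I added to tidy leftover whitespace when the link sits at the end of a tweet.

```python
import re

# Same pattern as in the str.replace call above, compiled once for reuse.
PIC_LINK = re.compile(r'pic\.twitter\.com\S*\s*')

def removelinks(text):
    """Remove pic.twitter.com links and any whitespace right after them."""
    # strip() is an added assumption: it trims whitespace left at the edges.
    return PIC_LINK.sub('', text).strip()

# Made-up sample tweets, for illustration only.
samples = [
    'some tweet1',
    'pic.twitter.com/some_link some tweet2',
    'some tweet3 pic.twitter.com/abc123',
]
cleaned = [removelinks(s) for s in samples]
print(cleaned)
```

On a DataFrame this would be applied as in the question, e.g. train_df['clean tweet'].apply(removelinks), though the vectorized str.replace shown above is usually the simpler choice.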