I have a data frame with two columns. cnn_handle contains Twitter handles and tweet contains tweets where the Twitter handle in the corresponding row is mentioned. However, most tweets mention at least one other user/handle indicated by @. I want to remove all rows where a tweet contains more than one @.
df
cnn_handle tweet
1 @DanaBashCNN @JohnKingCNN @DanaBashCNN @kaitlancollins @eliehonig @thelauracoates @KristenhCNN CNN you are still FAKE NEWS !!!
2 @DanaBashCNN @DanaBashCNN He could have made the same calls here, from SC.
3 @DanaBashCNN @DanaBashCNN GRAMMER ALERT: THAT'S FORMER PRESIDENT TRUMP Please don't forget this important point. Also please refrain from showing a pic of him till you have one in his casket. thank you
4 @brianstelter @eliehonig @brianstelter My apologies to you sir. Just seems like that story disappeared. Imo the nursing home scandal is just as bad.
5 @brianstelter @DrAndrewBaer1 @JGreenblattADL @brianstelter @CNN @TuckerCarlson @FoxNews Anti-Semite are you, Herr Doktor? How very Mengele of you.
6 @brianstelter @ma_makosh @Shortguy1 @brianstelter @ChrisCuomo Liberals, their feelings before facts and their crucifixion of people before due process. Never a presumption of innocence when it concerns the rival party. So un-American.
7 @andersoncooper @BrendonLeslie And Biden was a staunch opponent of “forced busing”. He also said that integrating schools will cause a “racial jungle”. But u won’t hear this on @ChrisCuomo @jaketapper @Acosta @andersoncooper bc they continue to cover up the truth about Biden & his family.
8 @andersoncooper Anderson Cooper revealed that he "wanted a change" when reflecting on his break from news as #TheMole arrives on Netflix.
9 @andersoncooper @johnnydollar01 @newsbusters @drsanjaygupta @andersoncooper He was terrible as a host
I suspect some type of regular expression is needed. However, I am not sure how to combine it with a greater-than sign.
The desired result, i.e. tweets mentioning only the corresponding cnn_handle:
cnn_handle tweet
2 @DanaBashCNN @DanaBashCNN He could have made the same calls here, from SC.
3 @DanaBashCNN @DanaBashCNN GRAMMER ALERT: THAT'S FORMER PRESIDENT TRUMP Please don't forget this important point. Also please refrain from showing a pic of him till you have one in his casket. thank you
8 @andersoncooper Anderson Cooper revealed that he "wanted a change" when reflecting on his break from news as #TheMole arrives on Netflix.
>Solution :
A straightforward solution using str_count from stringr, which presupposes that @ occurs only in Twitter handles:
base R:
library(stringr)
# keep only rows whose tweet contains at most one "@"
df[str_count(df$tweet, "@") <= 1, ]
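If you would rather avoid the stringr dependency, the same count can be done with base R alone. This is only a minimal sketch: the three-row data frame is a made-up miniature of the data in the question, and `n_at` and `result` are names chosen for illustration.

```r
# Hypothetical miniature of the question's data frame
df <- data.frame(
  cnn_handle = c("@DanaBashCNN", "@DanaBashCNN", "@andersoncooper"),
  tweet = c(
    "@JohnKingCNN @DanaBashCNN CNN you are still FAKE NEWS !!!",
    "@DanaBashCNN He could have made the same calls here, from SC.",
    "Anderson Cooper revealed that he wanted a change."
  ),
  stringsAsFactors = FALSE
)

# gregexpr() locates every literal "@"; regmatches() extracts the matches
# (an empty vector when there are none); lengths() counts them per tweet
n_at <- lengths(regmatches(df$tweet, gregexpr("@", df$tweet, fixed = TRUE)))

# keep rows whose tweet contains at most one "@"
result <- df[n_at <= 1, ]
```

Like the stringr version, this counts every @ character, so it relies on the same assumption that @ appears only in handles.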
dplyr:
library(dplyr)
library(stringr)
df %>%
  filter(str_count(tweet, "@") <= 1)
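If the assumption that @ occurs only in handles does not hold for your data (e.g. email addresses or a stray "@" in the text), you could count only handle-like mentions instead. The following is a rough sketch in base R, not a full implementation of Twitter's handle rules: the helper `count_mentions`, the pattern, and the example tweets are all made up for illustration.

```r
# Hypothetical helper: count "@" that start a handle-like token, i.e. an "@"
# not preceded by a letter, digit or underscore and followed by at least one
# word character. perl = TRUE is needed for the lookbehind.
count_mentions <- function(x) {
  pattern <- "(?<![A-Za-z0-9_])@\\w+"
  lengths(regmatches(x, gregexpr(pattern, x, perl = TRUE)))
}

# Made-up example tweets
tweets <- c(
  "@a @b hi",                              # two mentions
  "contact foo@bar.com via @DanaBashCNN",  # the email address is not counted
  "meet @ noon",                           # a bare "@" is not counted
  "no mentions"
)

n_mentions <- count_mentions(tweets)

# keep tweets with at most one mention
kept <- tweets[n_mentions <= 1]
```

The same pattern should also work inside `str_count()`, since stringr's regex engine supports lookbehind as well.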