I have a csv file in which I have tweets with the following column names: File,User,Date 1,month,day,Tweet,Permalink,Retweet count,Likes count,Tweet value,Language,Location.
I would like to create a new dataframe with tweets that only come from certain cities. I can do it but only for the last city in my list (Girona). So it doesn’t actually add all the rows. Here is my code:
import pandas as pd
import os
path_to_file = "populismo_merge.csv"
df = pd.read_csv(path_to_file, encoding='utf-8', sep=',')
values = df[df['Location'].str.contains("A Coruña",na=False)]
values = df[df['Location'].str.contains("Alava",na=False)]
values = df[df['Location'].str.contains("Albacete",na=False)]
values = df[df['Location'].str.contains("Alicante",na=False)]
values = df[df['Location'].str.contains("Almería",na=False)]
values = df[df['Location'].str.contains("Asturias",na=False)]
values = df[df['Location'].str.contains("Avila",na=False)]
values = df[df['Location'].str.contains("Badajoz",na=False)]
values = df[df['Location'].str.contains("Barcelona",na=False)]
values = df[df['Location'].str.contains("Burgos",na=False)]
values = df[df['Location'].str.contains("Cáceres",na=False)]
values = df[df['Location'].str.contains("Cádiz",na=False)]
values = df[df['Location'].str.contains("Cantabria",na=False)]
values = df[df['Location'].str.contains("Castellón",na=False)]
values = df[df['Location'].str.contains("Ceuta",na=False)]
values = df[df['Location'].str.contains("Ciudad Real",na=False)]
values = df[df['Location'].str.contains("Córdoba",na=False)]
values = df[df['Location'].str.contains("Cuenca",na=False)]
values = df[df['Location'].str.contains("Formentera",na=False)]
values = df[df['Location'].str.contains("Girona",na=False)]
values.to_csv(r'populismo_ciudad.csv', index = False)
Many thanks!!!
>Solution :
You are overwriting the values variable each time. A more concise answer would be along the lines of.
values= df[df['LocationName'].isin(["A Coruña", "Alava", ......)]