Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Merge columns into one single column using pandas

I’m trying to concatenate the string values of 3 different columns into one single column with the name "Address".

This is my data: Dataframe

I’ve tried this code:

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

def get_data(input_file, location_file, output_file):

  df1 = pd.read_excel(input_file, sheet_name="Destino")

  df2 = pd.read_excel(location_file)
  df2_filtered = df2[['TWLAD LGORT Depósito', 'ADRC POST_CODE1 Código postal da cidade', 'ADRC CITY1 Local', 'ADRC STREET Rua']]
  
  df = df1.merge(df2_filtered, left_on='PlanoDestino', right_on='TWLAD LGORT Depósito')
  df = df.drop(columns='TWLAD LGORT Depósito')

  df['Address'] = df['ADRC STREET Rua'] + ' ' + df['ADRC CITY1 Local'] + ' ' + df['ADRC POST_CODE1 Código postal da cidade']

But it’s returning a not concatenated column: Dataframe with Address column

What do I need to do?

I need that the new column "Address" be like this example: "Rua das Pereiras Aveiro 3810-310"

>Solution :

It might be a problem that there are leading or trailing whitespaces in one or both of your data frames meaning they can’t be matched up correctly.

To solve this, try removing whitespaces using the strip() function in your data frames like below before trying to join them together:

df['TWLAD LGORT Depósito'] = df['TWLAD LGORT Depósito'].str.strip()
df['ADRC POST_CODE1 Código postal da cidade'] = df['ADRC POST_CODE1 Código postal da cidade'].str.strip()
df['ADRC CITY1 Local'] = df['ADRC CITY1 Local'].str.strip()
df['ADRC STREET Rua'] = df['ADRC STREET Rua'].str.strip()

Then the line:

df['Address'] = df['ADRC STREET Rua'] + ' ' + df['ADRC CITY1 Local'] + ' ' + df['ADRC POST_CODE1 Código postal da cidade']

should give you the output you expect.

Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading