Changing row values in a dataframe by looking them up in another dataframe

I have a lookup table as a dataframe (1,000 rows) consisting of codes and labels. I have another dataframe (200,000 rows) consisting of codes and geometries.

For each code in the second dataframe, I need to get the corresponding label from the lookup dataframe.

The output should be a dataframe.

I tried it as follows.

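import pandas as pd

# Lookup table: one row per code with its label
df = pd.read_csv(filepath)

codes = df['codes'].values
labels = df['labels'].values

# Data table: codes and geometries
df2 = pd.read_csv(filepath)
print(df2.shape)

# Row-by-row lookup: scans the whole codes array once per row
for ix in df2.index:
    code = df2.loc[ix, 'code']
    df2.loc[ix, 'label'] = labels[codes == code][0]

print(df2)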

The result is correct, but looping over every row like this is very slow.

Can you help me?

Solution:

You should use the merge method of DataFrames (https://pandas.pydata.org/docs/reference/api/pandas.merge.html). It lets you join two dataframes on a common column, without an explicit Python loop. Your code should look like this:

df2 = df2.merge(df, left_on="code", right_on="codes", how="left")
# Check labels using df2["labels"]

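The key columns are specified with the parameters left_on (the column in df2) and right_on (the column in df). Because the two key columns have different names, both "code" and "codes" appear in the result. The parameter how='left' keeps every row of df2, even when its code has no match in the lookup table; the label for such a row is NaN.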
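
To make the join concrete, here is a minimal sketch on made-up data. The frames `lookup` and `data`, their values, and the final column name `label` are assumptions standing in for the two CSV files in the question. It shows the left merge followed by a cleanup of the duplicate key column, and, as an alternative, `Series.map` with a code-to-label Series. The `map` variant assumes each code appears only once in the lookup table.

```python
import pandas as pd

# Hypothetical stand-ins for the two CSV files in the question.
lookup = pd.DataFrame({
    "codes": [10, 20, 30],
    "labels": ["forest", "water", "urban"],
})
data = pd.DataFrame({
    "code": [20, 10, 99, 20],          # 99 has no entry in the lookup table
    "geometry": ["g0", "g1", "g2", "g3"],
})

# Option 1: left merge, then drop the duplicate key column and
# rename "labels" to "label" to match the column name in the question.
merged = (
    data.merge(lookup, left_on="code", right_on="codes", how="left")
        .drop(columns="codes")
        .rename(columns={"labels": "label"})
)

# Option 2: build a code -> label Series and map it onto the code column.
# Assumes the codes in the lookup table are unique.
code_to_label = lookup.set_index("codes")["labels"]
mapped = data.copy()
mapped["label"] = mapped["code"].map(code_to_label)

print(merged)
print(mapped)
```

Both variants keep all rows of `data` and leave the label missing (NaN) for the unmatched code 99. If the lookup table might contain duplicate codes, a left merge would repeat the matching rows of `data`, so it may be worth checking `lookup["codes"].is_unique` first.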
