Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

How do you merge a list to an existing dataframe?

I am trying to replace a value in a pandas dataframe based on another column value.

I made sample code below to replicate the issue, but essentially I want to add a column to an existing dataframe and then replace placeholder information based on another column’s value. The dataframe I am using (not in the example) is based on an excel document that will be used to webscrape information that I want returned in the new column based on the other column’s value. Just wanted to mention that before people asked why not just start the ex_list data in the dataframe. Additionally I only want it replace where the condition is met, not replacing the entire column with a set value.

Sample Code

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

## this would be the excel document df
sample_df = pd.DataFrame({"a":[1,2,3,4,5]})
sample_df["b"] = ""
## this data would be webscrapped using information above
ex_list = [[1, "CHANGE"],[4, "CHANGE"]]

for sub in ex_list:
    location = sample_df.loc[sample_df['a']==sub[0], 'b'].iloc[0]
    sample_df.replace(location, sub[1])
sample.head()

I also tried this just as a quick playaround but it produced the same output

sample_df = pd.DataFrame({"a":[1,2,3,4,5]})
sample_df["b"] = ""
ex_list = [[1, "CHANGE"],[4, "CHANGE"]]

for sub in ex_list:
    sample_df[sample_df['a']==sub[0], 'b'].iloc[0] += sub[1]
sample_df.head()

Both outputs are the same and show no change:

    a   b
0   1   
1   2   
2   3   
3   4   
4   5   

The output I am hoping for is this

    a   b
0   1   CHANGE
1   2   
2   3   
3   4   CHANGE
4   5   

I would appreciate a second pair of eyes on this. Is my method of ‘locating the value’ logic off? I thought the .loc/.iloc would be best but perhaps another way of indexing is best? I would be open to any solutions!

>Solution :

IMHO, the title makes your question an XY. I think you want to simply merge both objects :

# sample_df["b"] = "" # no need for this line anymore

out = sample_df.merge(pd.DataFrame(ex_list, columns=["a", "b"]), how="left")

Output :

print(out)

   a       b
0  1  CHANGE
1  2     NaN
2  3     NaN
3  4  CHANGE
4  5     NaN
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading