Home Appending column values from one dataframe to another as a list

Questions

Appending column values from one dataframe to another as a list

March 4, 2022

I have dozens of very similar dataFrames. What I want is to combine all ‘VALUE’ column values from each into lists, and return a dataFrame where the ‘VALUE’ column is comprised of these lists. I only want to do this for rows where ‘PV’ contains a substring from a list of substrings.

I came up with one way I thought would work, but it’s real nasty and doesn’t work anyways (stopped it at 3m). There has to be a better way of doing this, does anyone here have any ideas? Thanks for any and all help.

import pandas as np
# Example dataFrames
df0 = pd.DataFrame(data={'PV': ['pv1', 'pv2', 'pv3', 'pv4'], 'VALUE': [1, 2, 3, 4]})
df1 = pd.DataFrame(data={'PV': ['pv1', 'pv2', 'pv3', 'pv4'], 'VALUE': [5, 6, 7, 8]})

DATAFRAMES

df0 dataFrame          df1 dataFrame

PV   VALUE             PV   VALUE
pv1    1               pv1    5
pv2    2               pv2    6
pv3    3               pv3    7
pv4    4               pv4    8

# Nasty code I thought might work
strings = ['v2', 'v4']
for i, row0 in df0.iterrows():
    for j, row1 in df1.iterrows():
        if (row0['PV']==row1['PV']) & any(substring in row0['PV'] for substring in strings):
            df0.at[i,'VALUE'] = [row0['VALUE'], row1['VALUE']]

Desired result:

PV   VALUE
pv1    1
pv2  [2,6]
pv3    3
pv4  [4,8]

>Solution :

You could filter df1 for rows that contain strings; concatenate it with df0; then groupby + agg(list) can aggregate "VALUE"s for each "PV".

Finally, you could use mask to take out the elements from the singleton lists.

out = (pd.concat([df0, df1[df1['PV'].str.contains('|'.join(strings))]])
       .groupby('PV', as_index=False)['VALUE'].agg(list))
out['VALUE'] = out['VALUE'].mask(out['VALUE'].str.len().eq(1), out['VALUE'].str[0])

Alternatively, we could make the values in the "VALUE" columns lists and merge + concatenate the lists:

df0['VALUE'] = df0['VALUE'].apply(lambda x: [x])
df1['VALUE'] = df1['VALUE'].where(df1['PV'].str.contains('|'.join(strings)), df1['VALUE'].max()+1).apply(lambda x: [x] if x <= df1['VALUE'].max() else [])
out = df0.merge(df1, on='PV').set_index('PV').sum(axis=1)
out = out.mask(out.str.len().eq(1), out.str[0]).reset_index(name='VALUE')

Output:

    PV   VALUE
0  pv1       1
1  pv2  [2, 6]
2  pv3       3
3  pv4  [4, 8]

dataframe

byMR

Published March 04, 2022

Add a comment

Passing varying length variables to a PySpark groupby().agg function

byMR

March 4, 2022

Questions

What is the distribution of math.random in Java?

byMR

March 4, 2022

Questions

Python prints ascii code + character instead of colored text

byMR

March 4, 2022

Questions

How to generate getters and setters in Javascript?

byMR

March 4, 2022

Questions

Unhandled Exception: type 'Future<dynamic>' is not a subtype of type 'int'

byMR

March 4, 2022

Questions

Don't Accept the number that start with zero using jquery validation

byMR

March 4, 2022

Appending column values from one dataframe to another as a list

MEDevel.com: Open-source for Healthcare and Education

>Solution :

Like this:

Leave a ReplyCancel reply

Read more

Passing varying length variables to a PySpark groupby().agg function

What is the distribution of math.random in Java?

Python prints ascii code + character instead of colored text

How to generate getters and setters in Javascript?

Unhandled Exception: type 'Future<dynamic>' is not a subtype of type 'int'

Don't Accept the number that start with zero using jquery validation

Keep Up to Date with the Most Important News

Appending column values from one dataframe to another as a list

MEDevel.com: Open-source for Healthcare and Education

>Solution :

Share this:

Like this:

Leave a ReplyCancel reply

Keep Up to Date with the Most Important News

Read more

Passing varying length variables to a PySpark groupby().agg function

What is the distribution of math.random in Java?

Python prints ascii code + character instead of colored text

How to generate getters and setters in Javascript?

Unhandled Exception: type 'Future<dynamic>' is not a subtype of type 'int'

Don't Accept the number that start with zero using jquery validation

Discover more from Dev solutions