Join series with repeated index on dataframe where column values are equal to the index in the series

July 22, 2022

Say I have a series and a dataframe like:

import pandas as pd
s = pd.Series([10,20,11,12,30,34],
    index=["red","red","blue","blue","green","green"])
s.index.name="numbers"

df = pd.DataFrame({
    "color":["red","green","blue","blue","red","green"],
    "id":[1,2,3,4,5,6]})

I want to add the values of s to df as a new column, matched where the index of s equals df["color"] and taken in the order in which they appear, i.e.

pd.some_function(df,s,left_on="color",right_index=True)

color   id    numbers
red      1      10
green    2      30
blue     3      11
blue     4      12
red      5      20
green    6      34

I have tried pd.merge, DataFrame.join, etc., but I simply cannot make it work without looping over df filtered by color, adding the data from s, and concatenating the pieces at the end.
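To make the difficulty concrete, here is a minimal sketch of what a plain merge does with this data. Every color occurs twice on both sides, so each row of df matches two values from s and the result has 12 rows instead of 6. The variable name naive is only illustrative, and the column name "numbers" is taken from the desired output above.

```python
import pandas as pd

s = pd.Series([10, 20, 11, 12, 30, 34],
              index=["red", "red", "blue", "blue", "green", "green"])
df = pd.DataFrame({
    "color": ["red", "green", "blue", "blue", "red", "green"],
    "id": [1, 2, 3, 4, 5, 6]})

# A Series needs a name before it can be merged with a DataFrame.
naive = df.merge(s.rename("numbers"), left_on="color", right_index=True)

# Each id is paired with both values of its color (a many-to-many match).
print(len(naive))                            # 12
print(naive.groupby("id").size().tolist())   # [2, 2, 2, 2, 2, 2]
```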

Solution:

You can use groupby.cumcount to set up a unique key for the merge:

idx1 = s.groupby(level=0).cumcount()
# [0, 1, 0, 1, 0, 1]
idx2 = df.groupby('color').cumcount()
# [0, 0, 0, 1, 1, 1]

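# name the index so that reset_index() yields a 'color' column
s.index.name="color"
out = (df
   # match on the color plus its occurrence number within that color
   .merge(s.reset_index(name='number'),
          left_on=['color', idx2], right_on=['color', idx1])
   # the array-valued key shows up in the result as a helper column 'key_1'
   .drop(columns='key_1')
)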
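To see why the cumcount key makes the merge one-to-one, the sketch below pairs each color with its occurrence number on both sides and checks that every pair is unique. The names left_keys and right_keys are illustrative and not part of the answer.

```python
import pandas as pd

s = pd.Series([10, 20, 11, 12, 30, 34],
              index=["red", "red", "blue", "blue", "green", "green"])
df = pd.DataFrame({
    "color": ["red", "green", "blue", "blue", "red", "green"],
    "id": [1, 2, 3, 4, 5, 6]})

# Pair each color with its running occurrence number, on both sides.
left_keys = list(zip(df["color"], df.groupby("color").cumcount().tolist()))
right_keys = list(zip(s.index, s.groupby(level=0).cumcount().tolist()))

print(left_keys)
# [('red', 0), ('green', 0), ('blue', 0), ('blue', 1), ('red', 1), ('green', 1)]
print(right_keys)
# [('red', 0), ('red', 1), ('blue', 0), ('blue', 1), ('green', 0), ('green', 1)]

# Every pair occurs once per side, and both sides hold the same pairs,
# so each row of df finds exactly one partner in s.
assert len(set(left_keys)) == len(left_keys)
assert len(set(right_keys)) == len(right_keys)
assert set(left_keys) == set(right_keys)
```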

Variant, with the occurrence counter stored as an explicit column on both sides:

s.index.name="color"
out = (df
   .assign(idx=df.groupby('color').cumcount())
   .merge(s.reset_index(name='number')
           .assign(idx=s.groupby(level=0).cumcount().values),
          on=['color', 'idx'])
   .drop(columns='idx')
)
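If this is needed in more than one place, the variant could be wrapped in a small helper. The following is only a sketch: the function name merge_by_occurrence, its parameters, and the temporary column "_occ" are hypothetical, and it assumes df has no column called "_occ". Unlike the snippets above it uses how="left", so rows of df without a partner in s would be kept with NaN, and it does not rename the index of s in place.

```python
import pandas as pd


def merge_by_occurrence(df, s, on, name):
    """Attach the values of `s` to `df`, matching the n-th row of each
    `on` value in `df` with the n-th value under the same label in `s`."""
    # Turn the Series into a two-column frame: <on> and <name>.
    right = s.rename_axis(on).reset_index(name=name)
    # Number the occurrences of each label on both sides.
    right["_occ"] = right.groupby(on).cumcount()
    left = df.assign(_occ=df.groupby(on).cumcount())
    # A left merge keeps the row order of df.
    return (left
            .merge(right, on=[on, "_occ"], how="left")
            .drop(columns="_occ"))


s = pd.Series([10, 20, 11, 12, 30, 34],
              index=["red", "red", "blue", "blue", "green", "green"])
df = pd.DataFrame({
    "color": ["red", "green", "blue", "blue", "red", "green"],
    "id": [1, 2, 3, 4, 5, 6]})

out = merge_by_occurrence(df, s, on="color", name="numbers")
print(out)
```

With the example data this should give the "numbers" column from the question, in the order 10, 30, 11, 12, 20, 34.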

output:

   color  id  number
0    red   1      10
1  green   2      30
2   blue   3      11
3   blue   4      12
4    red   5      20
5  green   6      34

dataframe

byMR

Published July 22, 2022
