Follow

Follow

Contact

Home Pandas: Difference between two series with different length (unaligned index)

Questions

Pandas: Difference between two series with different length (unaligned index)

byMR

November 17, 2021

Consider two following series:

sri = inp.groupby(inp.index.date)['value'].count()

2009-01-12    7
2009-01-14    3

and

sro = out.groupby(out.index.date)['value'].count()

2009-01-03      1
2009-01-09     14
2009-01-10     61
2009-01-11     93
2009-01-12    106
2009-01-13    123
2009-01-14    130

When we subtract one from another, sro-sri, we have:

2009-01-03      NaN
2009-01-09      NaN
2009-01-10      NaN
2009-01-11      NaN
2009-01-12     99.0
2009-01-13      NaN
2009-01-14    127.0

However my desired output is:

2009-01-03      1.0
2009-01-04      0.0
2009-01-05      0.0
2009-01-06      0.0
2009-01-07      0.0
2009-01-08      0.0
2009-01-09     14.0
2009-01-10     61.0
2009-01-11     93.0
2009-01-12     99.0
2009-01-13    123.0
2009-01-14    127.0

and we can use the following workaround to generate the same result:

start_date = '2009-01-03'
end_date = '2009-01-15'
df = pd.DataFrame(
    index=pd.date_range(pd.to_datetime(start_date), pd.to_datetime(end_date) - timedelta(days=1), freq='d').date)
df = df.merge(sro.to_frame(), how='outer', left_index=True, right_index=True) \
    .merge(sri.to_frame(), how='outer', left_index=True, right_index=True).fillna(0)
print(df['value_x'] - df['value_y'])

Is there a much more compact solution to generate the same output?

>Solution :

A simple approach for a simple subtraction is to use sub with fillna=0:

sro.sub(sri, fill_value=0).convert_dtypes()

output:

2009-01-03      1
2009-01-09     14
2009-01-10     61
2009-01-11     93
2009-01-12     99
2009-01-13    123
2009-01-14    127

To add the missing indexes:

idx = sro.index.union(sri.index)
(sro.sub(sri, fill_value=0)
    .reindex(pd.date_range(idx.min(), idx.max()).astype(str), fill_value=0)
    .convert_dtypes()
)

output:

2009-01-03      1
2009-01-04      0
2009-01-05      0
2009-01-06      0
2009-01-07      0
2009-01-08      0
2009-01-09     14
2009-01-10     61
2009-01-11     93
2009-01-12     99
2009-01-13    123
2009-01-14    127

used input:

sri = pd.Series({'2009-01-12': 7, '2009-01-14': 3})
sro = pd.Series({'2009-01-03': 1, '2009-01-09': 14, '2009-01-10': 61, '2009-01-11': 93, '2009-01-12': 106, '2009-01-13': 123, '2009-01-14': 130})

merge

byMR

Published November 17, 2021

Add a comment

Leave a ReplyCancel reply

Read more

Questions

vue3 update image src dynamically

byMR

November 17, 2021

Questions

Value imported can't be changed to a number

byMR

November 17, 2021

Questions

Working on a python project and there are syntax errors never faced before

byMR

November 17, 2021

Questions

If no matching row for a searched value found

byMR

November 17, 2021

Questions

how to extract specific range of characters from string with casting to array type– What it can't be done?

byMR

November 17, 2021

Questions

Which HDL (hardware description language) Tesla is using to program the chips for their cars?

byMR

November 17, 2021