Pandas groupby raises ValueError: len(index) != len(labels) when trying to aggregate columns

I have a DataFrame whose column labels are floats, and I want to aggregate the columns by the integer each label rounds to. In the MWE below, the expected output is

   912
0  2.5
1  1.5

because all three column labels round to 912.

MWE:

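import pandas as pd

# Column labels are floats; the cell values are the data to aggregate.
temp = pd.DataFrame({911.7: {0: 0, 1: 1}, 911.9: {0: 2.0, 1: 0.0}, 912.0: {0: 0.5, 1: 0.5}})
round_to = 1
price_digits = 1
# Round each column label to the nearest multiple of round_to.
rounded = [round(round(x / round_to) * round_to, price_digits) for x in temp.columns]
temp.groupby(by=rounded, axis=1).sum()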

Running it produces the following traceback:

Traceback (most recent call last):
  File "D:\Anaconda3\lib\site-packages\IPython\core\interactiveshell.py", line 3331, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-17-983fbc3f7113>", line 1, in <module>
    temp.groupby(by=rounded, axis=1).sum()
  File "D:\Anaconda3\lib\site-packages\pandas\core\groupby\groupby.py", line 1378, in f
    return self._cython_agg_general(alias, alt=npfunc, **kwargs)
  File "D:\Anaconda3\lib\site-packages\pandas\core\groupby\generic.py", line 1004, in _cython_agg_general
    how, alt=alt, numeric_only=numeric_only, min_count=min_count
  File "D:\Anaconda3\lib\site-packages\pandas\core\groupby\generic.py", line 1033, in _cython_agg_blocks
    block.values, how, axis=1, min_count=min_count
  File "D:\Anaconda3\lib\site-packages\pandas\core\groupby\ops.py", line 587, in aggregate
    "aggregate", values, how, axis, min_count=min_count
  File "D:\Anaconda3\lib\site-packages\pandas\core\groupby\ops.py", line 530, in _cython_operation
    result, counts, values, codes, func, is_datetimelike, min_count
  File "D:\Anaconda3\lib\site-packages\pandas\core\groupby\ops.py", line 608, in _aggregate
    agg_func(result, counts, values, comp_ids, min_count)
  File "pandas\_libs\groupby.pyx", line 464, in pandas._libs.groupby._group_add
ValueError: len(index) != len(labels)

This is perplexing because len(rounded) == len(temp.columns) == 3, so there doesn't seem to be a length mismatch.

What would be the appropriate way to achieve my purpose? Thanks in advance!

Pandas version: '1.0.1'. Python version: Python 3.7.6 (default, Jan 8 2020, 16:21:45) [MSC v.1916 32 bit (Intel)].


The MWE does work in most cases. For example, when the third column label is changed from 912.0 to 912.3:

import pandas as pd
round_to = 1
price_digits=1
temp = pd.DataFrame({911.7: {0: 0, 1: 1}, 911.9: {0: 2.0, 1: 0.0}, 912.3: {0: 0.5, 1: 0.5}})
rounded = [round(round(x / round_to) * round_to, price_digits) for x in temp.columns]
temp.groupby(by=rounded, axis=1).sum()

The output will be

Out[14]: 
   912
0  2.5
1  1.5
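
One way to probe the difference between the failing and working inputs is to check whether the group key collides with an existing column label. In the failing frame, the key 912 compares equal to the label 912.0; in the working frame, no label equals 912. Whether this collision is what triggers the error in pandas 1.0.1 is an assumption, not something confirmed in this post; the sketch below only demonstrates the difference. The names failing_cols, working_cols and key are illustrative.

```python
import pandas as pd

failing_cols = pd.Index([911.7, 911.9, 912.0])
working_cols = pd.Index([911.7, 911.9, 912.3])
key = 912  # every label in both frames rounds to this value

# Membership tests use equality, so the int 912 matches the float label 912.0.
key_in_failing = key in failing_cols
key_in_working = key in working_cols
print(key_in_failing, key_in_working)
```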

Solution:

You can convert the list to an Index:

df = temp.groupby(pd.Index(rounded), axis=1).sum()
print(df)
   912
0  2.5
1  1.5

Or pass a lambda function, which is applied to each column label:

rounded = lambda x: round(round(x / round_to) * round_to, price_digits)
df = temp.groupby(rounded, axis=1).sum()
print(df)
   912
0  2.5
1  1.5
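
As a further alternative that is not part of the answer above, the same aggregation can be sketched without axis=1: transpose, group the rows, and transpose back. This may be useful on newer pandas releases, where the axis argument of DataFrame.groupby has been deprecated. The name result is illustrative, and the keys are wrapped in pd.Index as in the first solution.

```python
import pandas as pd

temp = pd.DataFrame({911.7: {0: 0, 1: 1}, 911.9: {0: 2.0, 1: 0.0}, 912.0: {0: 0.5, 1: 0.5}})
round_to = 1
price_digits = 1
rounded = [round(round(x / round_to) * round_to, price_digits) for x in temp.columns]

# After transposing, the float labels become the row index, so an ordinary
# row-wise groupby applies; the final .T restores the original orientation.
result = temp.T.groupby(pd.Index(rounded)).sum().T
print(result)
```

Transposing copies the data and upcasts mixed int/float columns to float, which is harmless here but worth keeping in mind for wide frames.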