I have a DataFrame whose column labels are floats, and I want to aggregate the columns by the integer each label rounds to. In the MWE below, the expected output is
912
0 2.5
1 1.5
because all three column labels round to 912.
MWE:
import pandas as pd
temp = pd.DataFrame({911.7: {0: 0, 1: 1}, 911.9: {0: 2.0, 1: 0.0}, 912.0: {0: 0.5, 1: 0.5}})
round_to = 1
price_digits=1
rounded = [round(round(x / round_to) * round_to, price_digits) for x in temp.columns]
temp.groupby(by=rounded, axis=1).sum()
When actually run, it raises this traceback instead:
Traceback (most recent call last):
File "D:\Anaconda3\lib\site-packages\IPython\core\interactiveshell.py", line 3331, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "<ipython-input-17-983fbc3f7113>", line 1, in <module>
temp.groupby(by=rounded, axis=1).sum()
File "D:\Anaconda3\lib\site-packages\pandas\core\groupby\groupby.py", line 1378, in f
return self._cython_agg_general(alias, alt=npfunc, **kwargs)
File "D:\Anaconda3\lib\site-packages\pandas\core\groupby\generic.py", line 1004, in _cython_agg_general
how, alt=alt, numeric_only=numeric_only, min_count=min_count
File "D:\Anaconda3\lib\site-packages\pandas\core\groupby\generic.py", line 1033, in _cython_agg_blocks
block.values, how, axis=1, min_count=min_count
File "D:\Anaconda3\lib\site-packages\pandas\core\groupby\ops.py", line 587, in aggregate
"aggregate", values, how, axis, min_count=min_count
File "D:\Anaconda3\lib\site-packages\pandas\core\groupby\ops.py", line 530, in _cython_operation
result, counts, values, codes, func, is_datetimelike, min_count
File "D:\Anaconda3\lib\site-packages\pandas\core\groupby\ops.py", line 608, in _aggregate
agg_func(result, counts, values, comp_ids, min_count)
File "pandas\_libs\groupby.pyx", line 464, in pandas._libs.groupby._group_add
ValueError: len(index) != len(labels)
This is perplexing because len(rounded) == len(temp.columns) == 3, so there doesn’t seem to be a length mismatch.
What would be the appropriate way to achieve my purpose? Thanks in advance!
Pandas version: '1.0.1'. Python version: Python 3.7.6 (default, Jan 8 2020, 16:21:45) [MSC v.1916 32 bit (Intel)].
The MWE does work in most cases. For example, when we change the third column label from 912.0 to 912.3:
import pandas as pd
round_to = 1
price_digits=1
temp = pd.DataFrame({911.7: {0: 0, 1: 1}, 911.9: {0: 2.0, 1: 0.0}, 912.3: {0: 0.5, 1: 0.5}})
rounded = [round(round(x / round_to) * round_to, price_digits) for x in temp.columns]
temp.groupby(by=rounded, axis=1).sum()
The output will be
Out[14]:
912
0 2.5
1 1.5
>Solution :
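You can convert the list to an Index, so that pandas treats it as a single array of group labels rather than as a list of column keys: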
df = temp.groupby(pd.Index(rounded), axis=1).sum()
print(df)
912
0 2.5
1 1.5
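A likely reason the plain list fails only in the first case: when every element of a list passed as `by` is itself a label on the columns, pandas appears to treat the list as several separate keys instead of one array of group labels. The sketch below only checks that condition for the two frames from the question; the names `round_label`, `failing` and `working` are introduced here for illustration and are not part of the original code.

```python
import pandas as pd

round_to = 1
price_digits = 1


def round_label(x):
    # Same rounding expression as in the question.
    return round(round(x / round_to) * round_to, price_digits)


# Frame from the failing MWE: 912.0 is one of the column labels.
failing = pd.DataFrame({911.7: {0: 0, 1: 1}, 911.9: {0: 2.0, 1: 0.0}, 912.0: {0: 0.5, 1: 0.5}})
# Frame from the working variant: no column is labelled 912.
working = pd.DataFrame({911.7: {0: 0, 1: 1}, 911.9: {0: 2.0, 1: 0.0}, 912.3: {0: 0.5, 1: 0.5}})

failing_labels = [round_label(c) for c in failing.columns]
working_labels = [round_label(c) for c in working.columns]

# True when every group label is also an existing column label,
# which is the situation where the list is ambiguous.
failing_all_match = all(label in failing.columns for label in failing_labels)
working_all_match = all(label in working.columns for label in working_labels)

print(failing_all_match)  # True
print(working_all_match)  # False
```

Wrapping the list in pd.Index (or passing a function) removes the ambiguity, because the argument is then no longer a plain list of possible keys.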
Or pass a lambda function, which groupby calls on each column label:
rounded = lambda x: round(round(x / round_to) * round_to, price_digits)
df = temp.groupby(rounded, axis=1).sum()
print(df)
912
0 2.5
1 1.5
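Both snippets above rely on `axis=1`, which works in pandas 1.0.1 but was deprecated for `DataFrame.groupby` in later pandas releases. A minimal sketch of an equivalent that avoids the keyword, assuming it is acceptable to transpose the frame, group its rows, and transpose back (the variable name `result` is only illustrative):

```python
import pandas as pd

round_to = 1
price_digits = 1

temp = pd.DataFrame({911.7: {0: 0, 1: 1}, 911.9: {0: 2.0, 1: 0.0}, 912.0: {0: 0.5, 1: 0.5}})
rounded = [round(round(x / round_to) * round_to, price_digits) for x in temp.columns]

# Transpose so the float labels become the row index, group those rows
# by the rounded labels, then transpose back to the original orientation.
result = temp.T.groupby(pd.Index(rounded)).sum().T
print(result)
```

The group labels are still wrapped in pd.Index here, so they are passed as one array of labels regardless of whether they coincide with existing labels.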