How to define a correct index when constructing a simple pandas Series?

I have the following Python dictionary:

import pandas as pd

sdata = {'Ohio': 35000, 'Oregon': 16000, 'Texas': 71000, 'Utah': 5000}

Suppose I want to create a pandas Series from this dictionary. For some reason, I want to construct the Series with additional index labels:

states = ['California', 'Damascus', 'Ohio', 'Oregon', 'Texas', 'Regensburg', 'Munich']
obj4 = pd.Series(sdata, index=states)
obj4

And the output will be:

California        NaN
Damascus          NaN
Ohio          35000.0
Oregon        16000.0
Texas         71000.0
Regensburg        NaN
Munich            NaN
dtype: float64

In this case, the three values in sdata whose keys appear in states (Ohio, Oregon, Texas) were placed at the matching labels. No values were found for California, Damascus, Regensburg, or Munich, so they appear as NaN, and Utah is dropped because it is not in states.
In other words, any index label without a corresponding key in sdata appears as NaN.

However, this does not work when I try to create a Series from a list:

labels = ['Covid', 'Delta', 'Omicron', 'Mu']
obj2 = pd.Series([1.5, -2.5, 0], index=labels) 
obj2

The error message:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-87-3f289c72627f> in <module>()
      1 # use the above created index object as an index in this Serie
----> 2 obj2 = pd.Series([1.5, -2.5, 0], index=labels)
      3 obj2

/usr/local/lib/python3.7/dist-packages/pandas/core/series.py in __init__(self, data, index, dtype, name, copy, fastpath)
    312                     if len(index) != len(data):
    313                         raise ValueError(
--> 314                             f"Length of passed values is {len(data)}, "
    315                             f"index implies {len(index)}."
    316                         )

ValueError: Length of passed values is 3, index implies 4.

Why do I get this error, when the first case shows that a Series can be created with NaN values?

Thank you in advance!

Solution:

Pass only the dictionary to pd.Series, then call Series.reindex:

obj4 = pd.Series(sdata).reindex(states)
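
The difference between the two cases comes down to labels: a dictionary carries its own keys, so pandas can align values to the requested index by key, while a list has only positions, so its length must match the index. The following is a minimal sketch reusing sdata and states from the question; the names from_ctor and from_reindex are illustrative only.

```python
import pandas as pd

sdata = {'Ohio': 35000, 'Oregon': 16000, 'Texas': 71000, 'Utah': 5000}
states = ['California', 'Damascus', 'Ohio', 'Oregon', 'Texas', 'Regensburg', 'Munich']

# Passing index= together with a dict selects values by key;
# labels with no matching key become NaN.
from_ctor = pd.Series(sdata, index=states)

# Building the Series from the dict first and then reindexing
# produces the same labels and values.
from_reindex = pd.Series(sdata).reindex(states)

print(from_ctor.equals(from_reindex))   # compares values and NaN positions
print(from_reindex.isna().sum())        # number of labels missing from sdata
```

Both forms drop 'Utah', because it is not among the requested labels.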

If you need to create the Series from a list, the index must have the same length as the data. For example, with three values, use only the first three labels, then reindex to the full list:

labels = ['Covid', 'Delta', 'Omicron', 'Mu']
obj2 = pd.Series([1.5, -2.5, 0], index=labels[:3]).reindex(labels)
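
As an alternative to slicing the labels, one possible approach is to pair the values with labels up front using dict(zip(...)), so that pandas can align by key as in the dictionary case. This is only a sketch: the names values and paired are hypothetical, and it assumes the values are meant for the first labels in order.

```python
import pandas as pd

labels = ['Covid', 'Delta', 'Omicron', 'Mu']
values = [1.5, -2.5, 0]

# zip stops at the shorter input, so only the first three labels get a value.
paired = dict(zip(labels, values))

# With a dict, pandas aligns by key and fills the unmatched label with NaN.
obj2 = pd.Series(paired, index=labels)
print(obj2)
```

Here 'Mu' has no entry in paired, so it ends up as NaN instead of raising a ValueError.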