Getting an error that I should not be getting

February 1, 2022

I am trying to calculate a percentage by dividing the values in one column by the values in another column, but I keep getting the same error.

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-34-60166e8a919c> in <module>()
      6 dataLake = dataLake[['day','Agent','Resolved','Meta','Week','Year']]
      7 #Creating new data (atingimento)
----> 8 dataLake["atingimento"] = ((dataLake.Resolved.astype(int) / dataLake.Meta.astype(int)) * 100)
      9 dataLake['Resolved'] = dataLake.Resolved.astype(int)
     10 dataLake['Meta'] = dataLake.Meta.astype(str)

4 frames
/usr/local/lib/python3.7/dist-packages/pandas/core/dtypes/cast.py in astype_nansafe(arr, dtype, copy, skipna)
    972         # work around NumPy brokenness, #1987
    973         if np.issubdtype(dtype.type, np.integer):
--> 974             return lib.astype_intsafe(arr.ravel(), dtype).reshape(arr.shape)
    975 
    976         # if we have a datetime/timedelta array of objects

pandas/_libs/lib.pyx in pandas._libs.lib.astype_intsafe()

ValueError: invalid literal for int() with base 10: ''

I tried converting both columns to int using .astype(int), but it does not work. As you can see from the data set below, Google Colab is somehow reading the column ‘Meta’ as a string even though it is in the same format as the column Resolved.

            day |              Agent | Resolved |       Meta | Week | Year
--------------------------------------------------------------------------
103  2021-01-26 |    Ana Carolina B. |      107 | 2525252525 |    4 | 2021
104  2021-01-25 |         Bárbara D. |      275 | 3831252128 |    4 | 2021
105  2021-01-25 |           Danielly |      192 | 3831252128 |    4 | 2021
106  2021-01-26 |     Felipe Pereira |      102 | 3125212822 |    4 | 2021
107  2021-01-26 | Fernanda Favalessa |      207 | 3125212822 |    4 | 2021
108  2021-01-25 |            Guto R. |      215 | 3831252114 |    4 | 2021
109  2021-01-25 |         Helaine S. |      253 | 3831252114 |    4 | 2021
110  2021-01-25 |            João M. |      145 |   38252128 |    4 | 2021
111  2021-01-25 |            João P. |      173 | 3535353535 |    4 | 2021
112  2021-01-26 |      Livia Azeredo |       89 | 3125212822 |    4 | 2021
113  2021-01-26 |        Lucas Alves |       70 | 1815101320 |    4 | 2021
114  2021-01-25 |           Paula P. |      137 | 3831252114 |    4 | 2021

>Solution :

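The message "invalid literal for int() with base 10: ''" means that at least one cell in the columns being converted holds an empty string, which .astype(int) cannot parse. You might want to use pandas.to_numeric, which can convert invalid data to NaN (you can then call fillna to substitute a default value if needed).

In place of: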

dataLake.Resolved.astype(int)

Use:

pd.to_numeric(dataLake['Resolved'], errors='coerce')
# or
pd.to_numeric(dataLake['Resolved'], errors='coerce').fillna(-1) # -1 if invalid

Apply the same change to every other .astype(int) call, including the one on Meta.
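Applied to the failing line from the question, this could look roughly like the sketch below. The small DataFrame is a made-up stand-in for dataLake (only the column names come from the question), and leaving invalid rows as NaN rather than filling them is an assumption about what is wanted:

```python
import pandas as pd

# Hypothetical stand-in for dataLake; the values are invented for illustration.
dataLake = pd.DataFrame({
    "Resolved": ["107", " 275 ", "192"],
    "Meta": ["25", "", "38"],  # the empty string is what breaks .astype(int)
})

# Convert both columns; anything that cannot be parsed becomes NaN.
resolved = pd.to_numeric(dataLake["Resolved"], errors="coerce")
meta = pd.to_numeric(dataLake["Meta"], errors="coerce")

# Rows with an invalid Meta end up with NaN instead of raising ValueError.
dataLake["atingimento"] = resolved / meta * 100
print(dataLake)
```

Keeping NaN in the result makes the problematic rows easy to spot; filling with a default such as -1 before dividing would instead produce a misleading percentage.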

Example:

pd.to_numeric(pd.Series(['1', '   12  ', '']), errors='coerce')

output:

0     1.0
1    12.0
2     NaN
dtype: float64
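To find out which rows actually contain the unparseable values, one option is to compare the column before and after conversion. This is only a sketch: the series below is invented sample data standing in for dataLake['Meta'].

```python
import pandas as pd

# Hypothetical sample standing in for dataLake['Meta'].
meta = pd.Series(["25", "", "38", "n/a"])

converted = pd.to_numeric(meta, errors="coerce")

# Rows that had a value originally but became NaN after conversion.
bad = meta[converted.isna() & meta.notna()]
print(bad)
```

The index of bad shows where the original data needs cleaning.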

byMR
