Unable to parse string at a seemingly random position

I scraped a website in order to convert chart data to a dataframe format.

As a result, every value from the chart is a string, which I am trying to convert to int so it can be plotted.

The generated dataframe:

(Image: screenshot of the dataframe, with the Prime and Subprime columns holding percentage strings.)

Since the values under Prime and Subprime headers are strings, I wrote the following to first get rid of the percentage symbol so I could subsequently convert the string to an int:

masked = df
masked.Prime = masked.Prime.str[:-1]
masked.Subprime = masked.Subprime.str[:-1]

This actually worked for the first dataset. But for the second one, I got the following error:

ValueError: Unable to parse string "0.30%" at position 322

What’s wrong here? I converted the half-parsed dataframe to Excel and it was successful up until that random position.

I’ve looked across this site for possible solutions, but I couldn’t find anything that pertained to this issue.

Solution:

I’d try:

for col in ('Prime', 'Subprime'):
    # Remove surrounding whitespace, then the '%' sign, then parse as a number.
    df[col] = pd.to_numeric(df[col].str.strip().str.strip('%'))

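This strips surrounding whitespace first, then the % sign. The error message shows "0.30%" still carrying its %, which suggests the original value had an extra trailing character (most likely whitespace), so str[:-1] removed that character instead of the %. Note that values such as 0.30 parse as floats, not ints.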
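As a rough illustration, here is a minimal, self-contained sketch comparing the slicing approach from the question with the strip-based one. The sample values are made up, and the trailing space on the last Prime entry is only an assumed cause of the error; the actual scraped data may contain a different stray character.

```python
import pandas as pd

# Hypothetical sample data: the last Prime value has a trailing space.
# This is an assumption about the cause, not something the question confirms.
df = pd.DataFrame({
    "Prime": ["1.20%", "0.95%", "0.30% "],
    "Subprime": ["5.10%", "4.80%", "3.75%"],
})

# Dropping the last character removes the space but leaves the '%' in place.
sliced = df["Prime"].str[:-1]
print(sliced.tolist())  # ['1.20', '0.95', '0.30%']

# Stripping whitespace and then '%' handles both clean and padded values.
for col in ("Prime", "Subprime"):
    df[col] = pd.to_numeric(df[col].str.strip().str.strip("%"))

print(df.dtypes)
```

If the stray character turns out not to be whitespace, printing `repr()` of the offending value (position 322 in the question) should reveal what it actually is.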
