Python Currency string conversion to float

This is the list:

x = ["111,222","111.222","111,222.11","111.222,11","111111","111.22"]

I would like to convert these correctly without using locale. The solution needs to work with the set above, because the original data set is a mess.

I have tried:

  • regex

  • float("".join(["".join([i for i in list(s)[0:-3] if i not in [".",","]]),"".join(list(s)[-3:]).replace(",",".")]) if list(s)[-3] in [".",","] else "".join(list(s)))

  • r = re.sub('[^0-9]', '', s)
    export = float(r[0:-2]+'.'+r[-2:])

  •    comma = ","
       dot = "."
    
       last_comma_index = s.rfind(comma)
       last_dot_index = s.rfind(dot)
    
       if last_comma_index > last_dot_index:
           last_index = last_comma_index
       else:
           last_index = last_dot_index
    
       before_point = s[:last_index]
    
       no_commas = "".join(before_point.split(comma))
       no_dots = "".join(no_commas.split(dot))
    
       export  = no_dots + dot + s[last_index + 1:]
    
  •        s = "".join(c for c in s if c.isdigit() or c in [",", "."])
    
           if "," in s:
               decimal_sep = ","
               thousands_sep = "."
           else:
               decimal_sep = "."
               thousands_sep = ","
    
           s = s.replace(decimal_sep, ".")
           s = re.sub(f"\\{thousands_sep}(?=[0-9])", "", s)
           export = s
    
  •        def parseNumber(text):
    
               try:
                   # First we return None if we don't have something in the text:
                   if text is None:
                       return None
                   if isinstance(text, int) or isinstance(text, float):
                       return text
                   text = text.strip()
                   if text == "":
                       return None
                   # Next we get the first "[0-9,. ]+":
                   n = re.search("-?[0-9]*([,. ]?[0-9]+)+", text).group(0)
                   n = n.strip()
                   if not re.match(".*[0-9]+.*", text):
                       return None
                   # Then we cut to keep only 2 symbols:
                   while " " in n and "," in n and "." in n:
                       index = max(n.rfind(','), n.rfind(' '), n.rfind('.'))
                       n = n[0:index]
                   n = n.strip()
                   # We count the number of symbols:
                   symbolsCount = 0
                   for current in [" ", ",", "."]:
                       if current in n:
                           symbolsCount += 1
                   # If we don't have any symbol, we do nothing:
                   if symbolsCount == 0:
                       pass
                   # With one symbol:
                   elif symbolsCount == 1:
                       # If this is a space, we just remove all:
                       if " " in n:
                           n = n.replace(" ", "")
                       # Else we set it as a "." if one occurrence, or remove it:
                       else:
                           theSymbol = "," if "," in n else "."
                           if n.count(theSymbol) > 1:
                               n = n.replace(theSymbol, "")
                           else:
                               n = n.replace(theSymbol, ".")
                   else:
                       # Now replace symbols so the right symbol is "." and all left are "":
                       rightSymbolIndex = max(n.rfind(','), n.rfind(' '), n.rfind('.'))
                       rightSymbol = n[rightSymbolIndex:rightSymbolIndex+1]
                       if rightSymbol == " ":
                           return parseNumber(n.replace(" ", "_"))
                       n = n.replace(rightSymbol, "R")
                       leftSymbolIndex = max(n.rfind(','), n.rfind(' '), n.rfind('.'))
                       leftSymbol = n[leftSymbolIndex:leftSymbolIndex+1]
                       n = n.replace(leftSymbol, "L")
                       n = n.replace("L", "")
                       n = n.replace("R", ".")
                   # And we cast the text to float or int:
                   n = float(n)
    
                   if n > 5000000:
                       return 0
                   elif n.is_integer():
                       return int(n)
                   else:
                       return n
    
               except: pass
    
               return None
    
  • Decimal

  • locale

    • Most of the answers on StackOverflow revolve around locale, but I have to avoid it because the data is messed up…

The result should be like this:

x = [111222,111222,111222.11,111222.11,111111,111.22]

Looking forward to any suggestions.

Solution:

Try replacing the commas with dots so that the separators are all the same, then split on the separator and check whether the rightmost chunk has length 3.

Since almost no currencies use three decimal places for their fractional amounts, a rightmost chunk of length 3 must be part of a whole number. Otherwise it must be the fractional part of a float. (A few currencies, such as the Kuwaiti dinar, do use three decimal places; this approach would misread those amounts as whole numbers.)

x = ["111,222","111.222","111,222.11","111.222,11","111111","111.22"]

def from_currency(x: str):
    x = x.replace(',', '.')
    if not x.count('.'):
        return int(x)
    *whole_parts, frac = x.split('.')
    if len(frac) == 3:
        return int(''.join([*whole_parts, frac]))
    else:
        whole = ''.join(whole_parts)
        return float(f'{whole}.{frac}')


[from_currency(c) for c in x]
# returns 
[111222,
 111222,
 111222.11,
 111222.11,
 111111,
 111.22]
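
Since the question also mentions Decimal, here is a minimal sketch of the same idea that returns `Decimal` values instead of floats, which avoids binary rounding on monetary amounts. The name `from_currency_decimal` is hypothetical, and stripping currency symbols and handling a leading minus sign are assumptions added for illustration; they are not part of the answer above.

```python
import re
from decimal import Decimal


def from_currency_decimal(text: str) -> Decimal:
    """Parse a currency string with mixed ',' and '.' separators.

    Assumption (same as the answer above): a trailing group of exactly
    three digits is a thousands group, never a fractional part.
    """
    # Keep only digits, separators, and minus signs (assumed extra step).
    cleaned = re.sub(r"[^0-9.,-]", "", text)
    negative = cleaned.startswith("-")
    cleaned = cleaned.replace("-", "")

    # Split on either separator; the last chunk decides whole vs. fraction.
    *whole_parts, last = re.split(r"[.,]", cleaned)
    if not whole_parts or len(last) == 3:
        # No separator at all, or the last chunk is a thousands group.
        value = Decimal("".join([*whole_parts, last]))
    else:
        # The last chunk is the fractional part.
        value = Decimal("".join(whole_parts) + "." + last)
    return -value if negative else value


samples = ["111,222", "111.222", "111,222.11", "111.222,11", "111111", "111.22"]
parsed = [from_currency_decimal(s) for s in samples]
```

Like the function above, this sketch reads an input such as "1.500" as 1500, so it is only suitable when the data contains no three-decimal amounts.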