Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

pandas & excel: keeping trailing zero in number stored as string

I have a small Excel spreadsheet I’m reading in a Python script using the pandas package. There is a column with what can look like a number (such as "1.10") that’s stored as string and when I access the cell in the Python script, it’s extracted as a float and the trailing zero is not present:

 0 Section    float  1.1
 0 Text       str    One
 1 Section    float  1.2
 1 Text       str    Two
 2 Section    float  1.3
 2 Text       str    Three
 3 Section    float  1.4
 3 Text       str    Four
 4 Section    float  1.5
 4 Text       str    Five
 5 Section    float  1.6
 5 Text       str    Six
 6 Section    float  1.7
 6 Text       str    Seven
 7 Section    float  1.8
 7 Text       str    Eight
 8 Section    float  1.9
 8 Text       str    Nine
 9 Section    float  1.1
 9 Text       str    Ten
10 Section    float  1.11
10 Text       str    Eleven

I created a gist with a small Python script and Excel spreadsheet to illustrate the problem.

Does anyone know how to extract this type of cell as a string and retain the trailing blanks?

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

>Solution :

you can modify your code by explicitly specifying the data types for the columns you want to read as strings. Here’s your code that i edited as the code that i wrote before when i had the same problem and hope it works :

import sys
import argparse
import pandas as pd

parser = argparse.ArgumentParser(description=sys.argv[0])
parser.add_argument('path', help='Path to excel file')
args = parser.parse_args()

spreadsheet = pd.read_excel(args.path, sheet_name=None, dtype={'Section': str})
sheet_names = list(spreadsheet.keys())
sheet_name = sheet_names[0]
sheet = spreadsheet[sheet_name]
column_names = list(sheet.columns)
columns = {column_name: list(sheet[column_name]) for column_name in column_names}

for row in range(len(columns[column_names[0]])):
    for (column_name, column) in columns.items():
        cell = str(column[row])  # Convert the cell to string
        print(f'{row:2} {column_name:10} {type(cell).__name__:6} {cell}')

In the above code, the dtype={'Section': str} parameter is passed to the read_excelfunction. This tells pandas to treat the "Section" column as a string when reading the Excel file. By explicitly converting the cell to a string using str(column[row]), the trailing zero will be retained when accessing the cell value.

Make sure to replace 'numbers-as-strings.xlsx' with the path to your actual Excel file.

Hope it leads u to the way .

Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading