Let’s say I have CSV data as note below; let’s call this
name,value1,value2 firstname,34326408129478932874,553 secondname_a_very_long_one,65,123987 thirdname_medium,9686509933423,33
Basically, it’s either single word text (no space separation, so no need for quoting) or numbers (here integers, but could be floats with decimals or scientific 1e-5 notation) – and there is no expectation that a comma could appear somewhere (outside of being a delimiter), so no need for special handling of the quoting of comma either …
So, to ease the strain on my eyes when I view this .csv file in a text editor, I would like to format it with fixed width – space padded (left or right padding choice made per column, and separately for header row); note that the file is still comma delimited as a data format, the fixed width is just for viewing in editor – this is how I would like it to look like, let’s call this
name , value1 , value2 firstname , 34326408129478932874, 553 secondname_a_very_long_one, 65, 123987 thirdname_medium , 9686509933423, 33
So here, the heading row is all left-aligned (right-padded with spaces); columns
value2 are also left-aligned (right-padded with spaces); and column
value1 is right-aligned (right-padded with spaces). The columns are sized (in characters) according to the largest string length of the data in that column; and there is an extra space as visual delimiter after the commas.
Of course, if I want to use this data in Python properly, I’d have to "strip" it first – but I don’t mind, since as I mentioned, the data is such that I don’t have to worry about quoting issues; here is a Python example of how I could use
tmpfw.csv – let’s call it
import sys import csv import pprint with open('tmpfw.csv', newline='') as csvfile: my_csv = csv.reader(csvfile) my_csv_list = list(my_csv) my_csv_list_stripped = [list(map(str.strip, irow)) for irow in my_csv_list] print("\nmy_csv_list:\n") pprint.pprint( my_csv_list ) print("\nmy_csv_list_stripped:\n") pprint.pprint( my_csv_list_stripped ) #print("\nreprint stripped as csv:\n") #csvwriter = csv.writer(sys.stdout) # just print out to terminal #csvwriter.writerows(my_csv_list_stripped)
This is what I get printed:
$ python3 test.py my_csv_list: [['name ', ' value1 ', ' value2'], ['firstname ', ' 34326408129478932874', ' 553'], ['secondname_a_very_long_one', ' 65', ' 123987'], ['thirdname_medium ', ' 9686509933423', ' 33']] my_csv_list_stripped: [['name', 'value1', 'value2'], ['firstname', '34326408129478932874', '553'], ['secondname_a_very_long_one', '65', '123987'], ['thirdname_medium', '9686509933423', '33']]
I can use this as a base to convert the numbers to int later – so, I can use such a fixed-width csv fine, all is good …
So, my question is: let’s say I have the
original.csv – what would be the easiest way in Python to obtain a "fixed-width formatted"
pandas or other libraries have facilities for exporting a CSV format like this?
Sure – compute what the maximum length of each column is, then
.ljust() them accordingly when printing:
import csv import io # pretend reading csv from file csv_data = list(csv.reader(io.StringIO(""" name,value1,value2 firstname,34326408129478932874,553 secondname_a_very_long_one,65,123987 thirdname_medium,9686509933423,33 """.strip()))) n_cols = len(csv_data) col_widths = [max(len(row[i]) for row in csv_data) for i in range(n_cols)] for row in csv_data: print(', '.join(val.ljust(width) for val, width in zip(row, col_widths)))
This prints out
name , value1 , value2 firstname , 34326408129478932874, 553 secondname_a_very_long_one, 65 , 123987 thirdname_medium , 9686509933423 , 33
and naturally you could open a file and
print(..., file=...) instead.