How to sort the contents of a text file

August 23, 2023

I am trying to sort the contents of a text file. Currently, I sort it using Excel. I want to use a Python script to sort the contents automatically.

The code below does not produce the output I need. Any ideas would be much appreciated.

  def sorting(filename):
    infile = open(filename)
    words = []
    for line in infile:
      temp = line.split()
      for i in temp:
        words.append(i)
    infile.close()
    words.sort()
    outfile = open("result.txt", "w")
    for i in words:
      outfile.writelines(i)
      outfile.writelines(" ")
    outfile.close()

  sorting("myfile.txt")

Current code output:

 -0.162 -0.638 -2.248 -2.348 -3.427 -4.622 0.00 0.162 0.176 -> more data...

myfile.txt # the file to be sorted

  ABC1234     -2.385      05:58
  1234DCE     -3.430      05:58
  98ACD12     12.574      05:58
  18DDD12    11.564   05:58
  453FGH2A    0.351   05:58
  A2DA21     0.00     05:58
  8FF3EFA    -0.720   05:58
  ABC1234     -2.348      06:11
  1234DCE     -3.427      06:11
  98ACD12     11.883      06:11
  18DDD12     10.883      06:11
  453FGH2A    0.176   06:11
  A2DA21      0.162   06:11
  8FF3EFA     0.888   06:11
  ABC1234     -2.248      06:18
  1234DCE     -4.622      06:18
  98ACD12     13.356      06:18
  18DDD12    10.915   06:18
  453FGH2A    0.00    06:18
  A2DA21     -0.162   06:18
  8FF3EFA    -0.638   06:18

Intended output: # one row per name, one column per time, latest time first.

  Name       06:18      06:11    05:58      
  ABC1234    -2.248    -2.348    -2.385     
  1234DCE    -4.622    -3.427    -3.430      
  98ACD12    13.356    11.883    12.574     
  18DDD12    10.915    10.883    11.564     
  453FGH2A    0.00      0.176     0.351       
  A2DA21     -0.162     0.162     0.00       
  8FF3EFA    -0.638     0.888    -0.720

Solution:

I don’t usually argue for pandas as the first solution, but in this case I think it is the right answer:

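import pandas as pd

# Build a nested dict of the form {time: {name: value}}
data = {}
with open('myfile.txt') as infile:
    for row in infile:
        cols = row.split()
        if len(cols) != 3:
            continue  # skip blank or malformed lines
        name, value, time = cols
        if time not in data:
            data[time] = {}
        data[time][name] = value

print(data)
# Outer keys (times) become columns; inner keys (names) become the row index
df = pd.DataFrame(data)
print(df)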

Output:

{'05:58': {'ABC1234': '-2.385', '1234DCE': '-3.430', '98ACD12': '12.574', '18DDD12': '11.564', '453FGH2A': '0.351', 'A2DA21': '0.00', '8FF3EFA': '-0.720'}, '06:11': {'ABC1234': '-2.348', '1234DCE': '-3.427', '98ACD12': '11.883', '18DDD12': '10.883', '453FGH2A': '0.176', 'A2DA21': '0.162', '8FF3EFA': '0.888'}, '06:18': {'ABC1234': '-2.248', '1234DCE': '-4.622', '98ACD12': '13.356', '18DDD12': '10.915', '453FGH2A': '0.00', 'A2DA21': '-0.162', '8FF3EFA': '-0.638'}}
           05:58   06:11   06:18
ABC1234   -2.385  -2.348  -2.248
1234DCE   -3.430  -3.427  -4.622
98ACD12   12.574  11.883  13.356
18DDD12   11.564  10.883  10.915
453FGH2A   0.351   0.176    0.00
A2DA21      0.00   0.162  -0.162
8FF3EFA   -0.720   0.888  -0.638
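
The DataFrame above has its columns in file order (earliest time first) and still holds the values as strings, while the intended output in the question lists the latest time first. A minimal sketch of one way to close that gap, assuming the times are always zero-padded HH:MM strings so that plain string sorting matches chronological order. The shortened `data` dict and the names `latest_first` and `table` are only for illustration:

```python
import pandas as pd

# Shortened sample in the same {time: {name: value}} shape as `data` in the answer
data = {
    "05:58": {"ABC1234": "-2.385", "1234DCE": "-3.430"},
    "06:11": {"ABC1234": "-2.348", "1234DCE": "-3.427"},
    "06:18": {"ABC1234": "-2.248", "1234DCE": "-4.622"},
}

df = pd.DataFrame(data)

# Assumption: zero-padded HH:MM strings sort chronologically,
# so a reverse sort puts the latest time first
latest_first = sorted(df.columns, reverse=True)

# Reorder the columns and convert the string values to numbers
df = df[latest_first].astype(float)
df.index.name = "Name"

# Render the table as plain text
table = df.to_string()
print(table)
```

`df.to_string()` returns an ordinary string, so it could be written to a file such as result.txt with a normal `open(..., "w")` call if a text file is the desired result.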

(Side note: you could eliminate two lines of code by using defaultdict for data. In this case, I don’t think it’s worth the trouble.)
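
For reference, the defaultdict variant mentioned in the side note could look roughly like this. It is a sketch only: the three hard-coded `rows` stand in for the lines of the input file so the snippet runs on its own.

```python
from collections import defaultdict

# Three sample rows in the same "name value time" layout as the input file
rows = [
    "ABC1234     -2.385      05:58",
    "1234DCE     -3.430      05:58",
    "ABC1234     -2.348      06:11",
]

# A missing time key gets an empty dict automatically,
# so the explicit "if ... not in data" check is no longer needed
data = defaultdict(dict)
for row in rows:
    name, value, time = row.split()
    data[time][name] = value

print(dict(data))
```

Because defaultdict is a subclass of dict, `data` can be passed to `pd.DataFrame` in the same way as the plain dict.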