I am trying to sort the contents of a text file. Currently, I sort it using Excel. I want to use a Python script to sort the contents automatically.
The code below does not produce the needed output. Any ideas would be very much appreciated.
def sorting(filename):
    infile = open(filename)
    words = []
    for line in infile:
        temp = line.split()
        for i in temp:
            words.append(i)
    infile.close()
    words.sort()
    outfile = open("result.txt", "w")
    for i in words:
        outfile.writelines(i)
        outfile.writelines(" ")
    outfile.close()

sorting("myfile.txt")
Current code output:
-0.162 -0.638 -2.248 -2.348 -3.427 -4.622 0.00 0.162 0.176 -> more data...
myfile.txt (the file to be sorted):
ABC1234 -2.385 05:58
1234DCE -3.430 05:58
98ACD12 12.574 05:58
18DDD12 11.564 05:58
453FGH2A 0.351 05:58
A2DA21 0.00 05:58
8FF3EFA -0.720 05:58
ABC1234 -2.348 06:11
1234DCE -3.427 06:11
98ACD12 11.883 06:11
18DDD12 10.883 06:11
453FGH2A 0.176 06:11
A2DA21 0.162 06:11
8FF3EFA 0.888 06:11
ABC1234 -2.248 06:18
1234DCE -4.622 06:18
98ACD12 13.356 06:18
18DDD12 10.915 06:18
453FGH2A 0.00 06:18
A2DA21 -0.162 06:18
8FF3EFA -0.638 06:18
Intended output (columns ordered from the latest time to the earliest):
Name 06:18 06:11 05:58
ABC1234 -2.248 -2.348 -2.385
1234DCE -4.622 -3.427 -3.430
98ACD12 13.356 11.883 12.574
18DDD12 10.915 10.883 11.564
453FGH2A 0.00 0.176 0.351
A2DA21 -0.162 0.162 0.00
8FF3EFA -0.638 0.888 -0.720
Solution:
I don’t usually argue for pandas as the first solution, but in this case I think it is the right answer:
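import pandas as pd

# Map each time to a {name: value} dict; each time becomes one column.
data = {}
with open('myfile.txt') as infile:
    for row in infile:
        cols = row.rstrip().split()  # cols = [name, value, time]
        if cols[2] not in data:
            data[cols[2]] = {}
        data[cols[2]][cols[0]] = cols[1]
print(data)

df = pd.DataFrame(data)
print(df)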
Output:
{'05:58': {'ABC1234': '-2.385', '1234DCE': '-3.430', '98ACD12': '12.574', '18DDD12': '11.564', '453FGH2A': '0.351', 'A2DA21': '0.00', '8FF3EFA': '-0.720'}, '06:11': {'ABC1234': '-2.348', '1234DCE': '-3.427', '98ACD12': '11.883', '18DDD12': '10.883', '453FGH2A': '0.176', 'A2DA21': '0.162', '8FF3EFA': '0.888'}, '06:18': {'ABC1234': '-2.248', '1234DCE': '-4.622', '98ACD12': '13.356', '18DDD12': '10.915', '453FGH2A': '0.00', 'A2DA21': '-0.162', '8FF3EFA': '-0.638'}}
           05:58   06:11   06:18
ABC1234   -2.385  -2.348  -2.248
1234DCE   -3.430  -3.427  -4.622
98ACD12   12.574  11.883  13.356
18DDD12   11.564  10.883  10.915
453FGH2A   0.351   0.176    0.00
A2DA21      0.00   0.162  -0.162
8FF3EFA   -0.720   0.888  -0.638
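The frame above lists the times in file order and still holds the values as strings, while the intended output wants the latest time first. A minimal sketch of those two extra steps, assuming the times are zero-padded HH:MM strings (so they sort correctly as text); the small dict here is only a stand-in for the full `data` built above:

```python
import pandas as pd

# Small stand-in for the `data` dict built above (values are still strings).
data = {
    "05:58": {"ABC1234": "-2.385", "A2DA21": "0.00"},
    "06:11": {"ABC1234": "-2.348", "A2DA21": "0.162"},
    "06:18": {"ABC1234": "-2.248", "A2DA21": "-0.162"},
}
df = pd.DataFrame(data)

# Convert the string values to floats so they behave as numbers.
df = df.astype(float)

# Zero-padded HH:MM labels sort chronologically as text, so a reverse
# sort puts the latest time in the first column.
df = df[sorted(df.columns, reverse=True)]

# Label the index so the header row starts with "Name".
df.index.name = "Name"
print(df)
```

If the result should go to a file rather than the screen, something like `df.to_csv("result.txt", sep=" ")` would probably do; note that the floats are then written in their default formatting, so `0.00` becomes `0.0`.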
(Side note: you could eliminate two lines of code by using defaultdict for data. In this case, I don't think it's worth the trouble.)
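For anyone curious, the defaultdict variant might look roughly like this. It is only a sketch: the `lines` list is a hypothetical inline sample standing in for the rows of myfile.txt.

```python
from collections import defaultdict

# Hypothetical inline sample standing in for the lines of myfile.txt.
lines = [
    "ABC1234 -2.385 05:58",
    "A2DA21 0.00 05:58",
    "ABC1234 -2.348 06:11",
]

# defaultdict(dict) creates the inner dict the first time a time is seen,
# so the explicit "if cols[2] not in data" check is no longer needed.
data = defaultdict(dict)
for row in lines:
    name, value, time = row.split()
    data[time][name] = value

print(dict(data))
```

Since defaultdict is a dict subclass, `data` can be passed to `pd.DataFrame` the same way as the plain dict.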