I have a file, one.txt, with this data:
822.25 111.48 883.59 256.68
822.25 111.48 883.59 256.68
8.6 123.68 467.27 276.69
0.0 186.77 165.62 375.0
0.0 186.77 165.62 375.0
724.76 177.83 923.52 316.78
724.76 177.83 923.52 316.78
724.76 177.83 923.52 316.78
724.76 177.83 923.52 316.78
724.76 177.83 923.52 316.78
438.03 148.5 540.88 198.54
511.99 170.97 571.74 215.81
511.99 170.97 571.74 215.81
For lines that are repeated, I want to write only one copy. For instance:
724.76 177.83 923.52 316.78
is repeated 5 times; I want to write it only once, do the same for the other repeated lines, and write the result to a new file.
My code:
with open('one.txt', 'r') as infile:
    with open('output.txt', 'w') as outfile:
        for line in infile:
            # How to do this?
            # If a line is repeated, keep only one copy of it.
            outfile.write(line)
>Solution :
You probably want itertools.groupby. Without a key function it yields one group per run of consecutive identical lines, so you can ignore the group iterator and write just the key (the line itself) once per group.
import itertools

with open('one.txt', 'r') as infile:
    with open('output.txt', 'w') as outfile:
        # groupby yields (key, group) pairs; the key is the line itself
        for line, _ in itertools.groupby(infile):
            outfile.write(line)
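To see what groupby does here without touching any files, this is a minimal sketch on an in-memory list. The `lines` list is only a stand-in built from the first few rows of the question's data, and `collapsed` is a name made up for illustration.

```python
import itertools

# In-memory stand-in for the file: the first five rows from the question.
lines = [
    "822.25 111.48 883.59 256.68\n",
    "822.25 111.48 883.59 256.68\n",
    "8.6 123.68 467.27 276.69\n",
    "0.0 186.77 165.62 375.0\n",
    "0.0 186.77 165.62 375.0\n",
]

# With no key function, groupby groups runs of consecutive equal items.
# The first element of each (key, group) pair is the line itself.
collapsed = [line for line, _ in itertools.groupby(lines)]

print("".join(collapsed), end="")
```

Each run of identical rows collapses to a single row, so the five input rows become three.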
This only collapses duplicates that are adjacent. If repeated lines can appear in different places in the file (e.g. a a b a would be written as a b a), keep a set of the lines you have already seen instead:
seen_lines = set()
with open('one.txt', 'r') as infile:
    with open('output.txt', 'w') as outfile:
        for line in infile:
            if line in seen_lines:
                continue
            outfile.write(line)
            seen_lines.add(line)
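The difference between the two approaches can be checked on the a a b a example. This is a rough sketch: `unique_lines` is a hypothetical helper name, not something from the standard library, and the `dict.fromkeys` line is an alternative that relies on dicts preserving insertion order (Python 3.7+).

```python
import itertools


def unique_lines(lines):
    """Yield each line the first time it appears, preserving order."""
    seen = set()
    for line in lines:
        if line not in seen:
            seen.add(line)
            yield line


sample = ["a\n", "a\n", "b\n", "a\n"]

# groupby only merges adjacent duplicates, so the last "a" survives.
consecutive_only = [line for line, _ in itertools.groupby(sample)]

# The seen-set approach drops every later repeat, wherever it occurs.
everywhere = list(unique_lines(sample))

# Same result as the seen-set approach, using an insertion-ordered dict.
via_dict = list(dict.fromkeys(sample))

print(consecutive_only)
print(everywhere)
print(via_dict)
```

Note that lines are compared including their trailing newline, so a final line with no newline at the end of the file would not match an otherwise identical earlier line. Stripping the newline before comparing is one way around that, if it matters for your data.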