I have a file that looks like this:
2000
2000
2001
2001
2001
2001
2002
2002
I need a script to show me this:
2000 - 2
2001 - 4
2002 - 2
I would prefer to use sed or awk.
>Solution :
This is precisely what uniq -c does, provided that identical lines are adjacent (as they are in your file). From man uniq:
DESCRIPTION
Filter adjacent matching lines from INPUT (or standard input), writing to OUTPUT (or standard output).
[ . . . ]
-c, --count
prefix lines by the number of occurrences
So with your example, we get:
$ uniq -c file
2 2000
4 2001
2 2002
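If the input is not already grouped, or if you want the exact "value - count" layout from the question, one possible sketch is to sort first and let awk swap the two columns. The shuffled sample data and the variable name result are assumptions made for illustration only:

```shell
# Sample data from the question, deliberately shuffled to show why sort is needed.
# uniq only merges adjacent duplicates, so sort groups equal lines first.
result=$(printf '%s\n' 2001 2000 2002 2001 2000 2001 2002 2001 |
  sort |
  uniq -c |
  awk '{ print $2 " - " $1 }')   # uniq -c prints "count value"; swap the columns

printf '%s\n' "$result"
```

With a real file, replace the printf stage with sort file.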
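You can also write a small script if you prefer. Unlike uniq, this does not require identical lines to be adjacent, but the order in which awk's for (line in count) visits the entries is unspecified, so pipe the result through sort if the order matters. For instance, with awk: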
$ awk '{ count[$0]++ } END{ for(line in count){ print line,count[line] }}' file
2000 2
2001 4
2002 2
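As a variation on the awk one-liner above, here is a minimal sketch that prints the "value - count" format requested in the question and pipes through sort to get a stable order. The inline sample data and the variable name result are assumptions for illustration; with a real file you would pass its name to awk instead:

```shell
# Count every distinct line in an associative array, then print "value - count".
# The for-in traversal order is unspecified in awk, hence the trailing sort.
result=$(printf '%s\n' 2000 2000 2001 2001 2001 2001 2002 2002 |
  awk '{ count[$0]++ } END { for (line in count) print line " - " count[line] }' |
  sort)

printf '%s\n' "$result"
```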