In the text file shown under "BEFORE FILE" below, how would I collapse the repeated digits so that it looks like the "AFTER FILE" below? The "_PRODxxxx," part, where the x's are digits, will stay in that format.
BEFORE FILE
NET_SalesD_PROD1111,mexico
NET_Sales4_PROD22,newjersy
NET_SalesG_PROD333,bull
AFTER FILE
NET_SalesD_PROD1,mexico
NET_Sales4_PROD2,newjersy
NET_SalesG_PROD3,bull
I have tried using sed with a regex like "PROD[1-9]{2,4}" but cannot get it to work.
>Solution :
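Use a capture group to capture the first digit, and a back-reference (\1+) to match one or more repetitions of that same digit. Then use the same back-reference in the replacement to keep just one copy of it. The -E option enables extended regular expressions, so the parentheses and + need no backslashes.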
sed -E 's/PROD([1-9])\1+,/PROD\1,/'
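As a quick check, here is a minimal sketch that runs the command above on the sample lines from the question. The function name dedupe_prod is hypothetical, used only to wrap the command; it assumes a sed that supports back-references with -E (GNU and BSD sed both do).

```shell
# dedupe_prod is a hypothetical wrapper around the sed command above.
# With no arguments it reads standard input; otherwise it reads the named files.
dedupe_prod() {
  sed -E 's/PROD([1-9])\1+,/PROD\1,/' "$@"
}

# Feed the three sample lines from the question through the filter.
printf '%s\n' \
  'NET_SalesD_PROD1111,mexico' \
  'NET_Sales4_PROD22,newjersy' \
  'NET_SalesG_PROD333,bull' | dedupe_prod
# Output:
# NET_SalesD_PROD1,mexico
# NET_Sales4_PROD2,newjersy
# NET_SalesG_PROD3,bull
```

Because \1 must repeat the captured digit exactly, a value with different digits such as PROD12 is left unchanged. To edit a file in place, GNU sed accepts -i, while BSD/macOS sed needs an explicit suffix argument such as -i ''.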