Compare two files and output the lines that differ, matching on the field before the first separator

I will ask my question with an example (fake passwords + fake names/domain). I have two files:

1.txt (containing new emails with passwords)

gwennette.prutzman93ent@stackprotect.com:fwgzvg
kimbler.ellizabeth@stackprotect.com:ft5tz45
cectvshowtape@stackprotect.com:rfh44f32q
standiford.gyneth5566@stackprotect.com:zh6535
lecroy.jeanlucas5329@stackprotect.com:frb46

2.txt (containing old emails from my database without passwords)

2de0aae2fdfd4025a0236869bb091488@stackprotect.com
standiford.gyneth5566@stackprotect.com
lecroy.jeanlucas5329@stackprotect.com
cectvshowtape@stackprotect.com
fiorillo.alianny@stackprotect.com
gwennette.prutzman93ent@stackprotect.com
kimbler.ellizabeth@stackprotect.com
vincente-gunnard@stackprotect.com
anjum.coetta0376@stackprotect.com
grandison-liboria9587@stackprotect.com

I’m expecting to get an output like this:

3.txt (lines from 2.txt whose email does not appear in the first column of 1.txt, i.e. the part before the ":" separator)

2de0aae2fdfd4025a0236869bb091488@stackprotect.com
fiorillo.alianny@stackprotect.com
vincente-gunnard@stackprotect.com
anjum.coetta0376@stackprotect.com
grandison-liboria9587@stackprotect.com

I’m trying to see which emails in my 2.txt file (database) didn’t go through, so that I can run them again.

I tried several regex solutions, but none of them solved my problem:

findstr /V /I /X /L /G:"2.txt" "1.txt" > "3.txt"

findstr /v /g:"2.txt" "1.txt" > 3.txt

>Solution :

$ awk -F':' 'NR==FNR{a[$1]; next} !($1 in a)' 1.txt 2.txt
2de0aae2fdfd4025a0236869bb091488@stackprotect.com
fiorillo.alianny@stackprotect.com
vincente-gunnard@stackprotect.com
anjum.coetta0376@stackprotect.com
grandison-liboria9587@stackprotect.com

That’s how you’d do it on Unix. If you want to run it on Windows, you’ll have to work out the Windows quoting yourself.
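To show how the one-liner works, here is the same awk program spread over several lines with comments. This is only a sketch: it assumes a POSIX shell and awk, recreates a shortened copy of the question's sample data so it can be run on its own, renames the array to seen for readability, and redirects the result to 3.txt as the question asks.

```shell
#!/bin/sh
# Shortened sample inputs taken from the question.
cat > 1.txt <<'EOF'
gwennette.prutzman93ent@stackprotect.com:fwgzvg
kimbler.ellizabeth@stackprotect.com:ft5tz45
EOF

cat > 2.txt <<'EOF'
2de0aae2fdfd4025a0236869bb091488@stackprotect.com
gwennette.prutzman93ent@stackprotect.com
kimbler.ellizabeth@stackprotect.com
vincente-gunnard@stackprotect.com
EOF

awk -F':' '
    # NR == FNR is true only while the first file (1.txt) is being read.
    # Store each email (the field before the first ":") as an array key.
    NR == FNR { seen[$1]; next }
    # For the second file (2.txt), print lines whose email is not a stored key.
    !($1 in seen)
' 1.txt 2.txt > 3.txt

cat 3.txt
```

The order of the file arguments matters: the file with passwords (1.txt) must come first so its emails are stored before 2.txt is filtered. Note also that if 1.txt is empty, NR == FNR stays true while 2.txt is read, so nothing is printed.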
