I have a VCF file with SNP IDs like this:
AX-14233402__rs35404821
AX-37499887__rs74704183
AX-36783275__rs11997571
I would like to change the SNP IDs to have only the IDs without the AX-... term:
rs35404821
rs74704183
rs11997571
Is there any solution for this? I tried with a gsub command, but nothing changed:
awk '{gsub(/AX*_rs/,"rs"); print}' datafile.vcf > datafile_ID.vcf
>Solution :
With your shown samples, please try the following solution. It sets __ as the field separator for every line, then checks whether the line starts with AX- and, if so, prints the 2nd field of that line.
awk -F'__' '/^AX-/{print $2}' Input_file
Or, if you simply want to print the value after __ without checking whether the line starts with AX-, try the following:
awk -F'__' '{print $2}' Input_file
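As for why the gsub attempt changed nothing: in a regex, X* means "zero or more X characters", not "anything after AX", so /AX*_rs/ never matches AX-14233402__rs.... If the goal is to keep the whole VCF and rewrite only the ID column, something like the sketch below may work. It assumes a tab-separated file with the ID in column 3 (the usual VCF layout) and prefixes of the form AX-<digits>__; the file names sample.vcf and sample_ID.vcf and the sample record are made up for illustration.

```shell
# Hypothetical one-record VCF for illustration; ID is assumed to be column 3.
printf '##fileformat=VCFv4.2\n#CHROM\tPOS\tID\tREF\tALT\n1\t100\tAX-14233402__rs35404821\tA\tG\n' > sample.vcf

# Leave header lines (starting with #) untouched; on data lines,
# strip the leading "AX-<digits>__" from column 3 only, then print every line.
awk 'BEGIN{FS=OFS="\t"} !/^#/{sub(/^AX-[0-9]+__/,"",$3)} 1' sample.vcf > sample_ID.vcf
```

Restricting sub() to $3 avoids touching other columns that might happen to contain a similar string, and the trailing 1 prints every line so the headers are preserved.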