Am I understanding this Regex Pattern correctly?

I am currently validating data, and the following regex pattern is used for an attribute called ID:

java.util.regex.Pattern.matches("^((?i)(?!.*unknown.*)(?!\\b(misc)\\b)(?!.*tbd.*))[A-Za-z0-9-\\s]{1,}$", input_row.ID_A)
&& java.util.regex.Pattern.matches("^[A-Za-z0-9-\\s]{1,}$", input_row.ID_A) 

I understand this as: if an ID contains unknown, misc, or tbd, it will be discarded, but if it consists only of characters from [A-Za-z0-9-\s], it will be kept. Is that correct?

>Solution :

It will match a string consisting only of letters, digits, -, and whitespace, unless it begins with the word misc or contains either unknown or tbd anywhere (the (?i) flag makes those three checks case-insensitive).

(?!.*unknown.*) and (?!.*tbd.*) are negative lookaheads that look for those strings anywhere in the input because of the leading .* (the trailing .* is redundant).

(?!\\b(misc)\\b) is a negative lookahead that looks for misc with word boundaries around it. Since there's no .* at the beginning, it only applies at that position, which is right after ^, that is, the beginning of the string. So misc 12 is rejected, while misc12 and 12 misc are not.

If the pattern inside any of the negative lookaheads matches, the overall regexp match fails.
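To see how the position of a lookahead changes what it rejects, here is a minimal sketch that isolates two of the checks. The class and method names (`LookaheadDemo`, `passesMiscCheck`, `passesTbdCheck`) are made up for illustration and are not part of the original code; the patterns are simplified versions of the lookaheads discussed above.

```java
import java.util.regex.Pattern;

class LookaheadDemo {
    // Hypothetical helper: the lookahead has no leading .*, so it is only
    // tested at the start of the input (Pattern.matches anchors the match).
    static boolean passesMiscCheck(String s) {
        return Pattern.matches("(?i)(?!\\bmisc\\b).*", s);
    }

    // Hypothetical helper: the leading .* lets the lookahead scan ahead,
    // so "tbd" is rejected wherever it occurs in the input.
    static boolean passesTbdCheck(String s) {
        return Pattern.matches("(?i)(?!.*tbd).*", s);
    }

    public static void main(String[] args) {
        for (String s : new String[] {"misc 12", "12 misc", "misc12"}) {
            System.out.println("misc check, \"" + s + "\": " + passesMiscCheck(s));
        }
        for (String s : new String[] {"12 tbd", "TBD-1", "abc"}) {
            System.out.println("tbd check, \"" + s + "\": " + passesTbdCheck(s));
        }
    }
}
```

Only "misc 12" should fail the misc check: "12 misc" does not start with misc, and in "misc12" there is no word boundary between c and 1. Both "12 tbd" and "TBD-1" should fail the tbd check.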

[A-Za-z0-9-\\s]{1,} matches one or more of the characters that are matched by that character class.
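A quick way to check this reading is to run the first pattern from the question against a few sample IDs. The sketch below assumes a hypothetical wrapper class `IdPatternDemo` with a helper `isValidId`; the sample inputs are invented for illustration, and only the regex itself comes from the question.

```java
import java.util.regex.Pattern;

class IdPatternDemo {
    // The first pattern from the question, compiled once for reuse.
    static final Pattern ID_PATTERN = Pattern.compile(
        "^((?i)(?!.*unknown.*)(?!\\b(misc)\\b)(?!.*tbd.*))[A-Za-z0-9-\\s]{1,}$");

    // Hypothetical helper: true when the whole input matches the pattern.
    static boolean isValidId(String id) {
        return ID_PATTERN.matcher(id).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "ABC-123", "id UNKNOWN 7", "misc", "Misc 12",
            "misc12", "12 misc", "TBD", "abc_1", ""
        };
        for (String s : samples) {
            System.out.println("\"" + s + "\" -> " + isValidId(s));
        }
    }
}
```

Of these samples, "ABC-123", "misc12", and "12 misc" should be accepted and the rest rejected: "abc_1" contains a character outside the class, and the empty string fails {1,}. Anything the first pattern accepts also satisfies ^[A-Za-z0-9-\\s]{1,}$, so the second Pattern.matches call in the question does not change the result.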
