Recognize block of data as one block using regex vba

I am trying to create a pattern for the following text

not included

 468049876 
some text some text ffgg   
 30905103300638 
 1
other text other text

no included

Here’s my try

^\s*\d{6,10}(?:\n(?!\s*\d{1,}\n).*){5}

I will be using this pattern in VBA.
The expected output to be matched is the following five lines:

468049876 
some text some text ffgg   
 30905103300638 
 1
other text other text

Update: I have edited the question because I ran into a problem.
Suppose the text looks like this:

not included

 468041476 
some text some text ffgg   
 30905103300638 
 1
other text other text
extra line
 416524332 
some text some text ffgg   
 30905103300638 
 1
other text other text
extra line
6354422
no included

Here I need each block to follow this sequence:
1- A number of 6 to 12 digits
2- Some text on one line
3- A number of exactly 14 digits
4- A number of 1 to 3 digits
5- Text (this is the problem: the text may span two lines instead of one, and I need the extra line joined onto it as a single line)
So the expected output for the example text is

 468041476 
some text some text ffgg   
 30905103300638 
 1
other text other text extra line

and

 416524332 
some text some text ffgg   
 30905103300638 
 1
other text other text extra line

That is, the text should yield only two blocks (each of five lines).

>Solution :

It seems to me you should check for a 6-10 digit number in the negative lookahead, and to match any whitespace except line breaks you can use [^\S\r\n]:

^[^\S\r\n]*\d{6,10}[^\S\r\n]*(?:(?:\r\n?|\n)(?![^\S\r\n]*\d{6,10}[^\S\r\n]*[\r\n]).+)*

If we assume line breaks are \n and whitespace is just spaces, you could write it as

^ *\d{6,10} *(?:\n(?! *\d{6,10} *\n).+)*
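
A quick way to sanity-check the simplified pattern is a small script. The sketch below uses Python rather than VBA, purely as an assumption for testing; the constructs in the pattern (non-capturing group, negative lookahead) are also supported by the VBScript RegExp object, where you would set the Global and MultiLine properties to True. The names PATTERN and SAMPLE are made up for illustration, and SAMPLE is the first example text from the question.

```python
import re

# The simplified pattern; re.MULTILINE makes ^ match at the start of every line.
PATTERN = re.compile(r"^ *\d{6,10} *(?:\n(?! *\d{6,10} *\n).+)*", re.MULTILINE)

# First sample from the question, built line by line so trailing spaces survive.
SAMPLE = "\n".join([
    "not included",
    "",
    " 468049876 ",
    "some text some text ffgg   ",
    " 30905103300638 ",
    " 1",
    "other text other text",
    "",
    "no included",
])

# Collect every block the pattern finds and show it as a list of lines.
matches = [m.group(0) for m in PATTERN.finditer(SAMPLE)]
for block in matches:
    print(block.split("\n"))
```

On this sample the pattern should find a single block of five lines. The match stops at the empty line because .+ requires at least one character.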

See the regex demo. Details:

  • ^ – start of a line (remember to enable multiline mode so that ^ matches at the start of every line)
  • [^\S\r\n]* – zero or more horizontal whitespace
  • \d{6,10} – six to ten digits
  • [^\S\r\n]* – zero or more horizontal whitespace
  • (?: – start of a non-capturing group:
    • (?:\r\n?|\n) – a CRLF, LF or CR line ending
    • (?![^\S\r\n]*\d{6,10}[^\S\r\n]*[\r\n]) – not immediately followed by zero or more horizontal whitespace, six to ten digits, zero or more horizontal whitespace and a line break
    • .+ – a non-empty line (one or more chars other than line break chars, as many as possible)
  • )* – end of the grouping, zero or more occurrences.
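
The pattern above groups lines into blocks, but it does not enforce the five-step sequence from the updated question, and it does not join the extra line onto the fifth one. One possible way to do both is sketched below, again in Python for easy testing. This is a hedged variant, not part of the original answer: BLOCK, extract_blocks and SAMPLE are hypothetical names, and it assumes that a block ends at the next line containing only digits, so any non-numeric lines directly after the fifth line get folded into it.

```python
import re

# A stricter variant of the pattern that spells out the five steps.
# [^\S\r\n]* is horizontal whitespace, as in the pattern explained above.
BLOCK = re.compile(
    r"^[^\S\r\n]*\d{6,12}[^\S\r\n]*\n"           # 1: a number of 6 to 12 digits
    r".+\n"                                      # 2: one line of text
    r"[^\S\r\n]*\d{14}[^\S\r\n]*\n"              # 3: a number of exactly 14 digits
    r"[^\S\r\n]*\d{1,3}[^\S\r\n]*\n"             # 4: a number of 1 to 3 digits
    r".+(?:\n(?![^\S\r\n]*\d+[^\S\r\n]*$).+)*",  # 5: text plus following non-numeric lines
    re.MULTILINE,
)

def extract_blocks(text):
    """Return each block as five lines, folding continuation lines into the fifth."""
    # Normalise CRLF and CR line endings so the pattern only has to handle \n.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    blocks = []
    for m in BLOCK.finditer(text):
        lines = m.group(0).split("\n")
        # Everything from the fifth line onward is joined with single spaces.
        fifth = " ".join(part.strip() for part in lines[4:])
        blocks.append(lines[:4] + [fifth])
    return blocks

# The updated sample from the question, built line by line to keep trailing spaces.
SAMPLE = "\n".join([
    "not included",
    "",
    " 468041476 ",
    "some text some text ffgg   ",
    " 30905103300638 ",
    " 1",
    "other text other text",
    "extra line",
    " 416524332 ",
    "some text some text ffgg   ",
    " 30905103300638 ",
    " 1",
    "other text other text",
    "extra line",
    "6354422",
    "no included",
])

result = extract_blocks(SAMPLE)
for block in result:
    print(block)
```

With this sample the sketch should return two blocks of five lines each, with "other text other text extra line" as the fifth line. The trailing "6354422" line starts no block because the lines after it do not follow the sequence. In VBA the same idea would be to run the pattern with Global and MultiLine enabled and then join the trailing lines of each match in code.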