How can I extract a date from a string that also contains other text?

I have a feature that compares the inserted data with the original data inside an image, and we use Google Vision OCR to extract the text from the image.

The OCR returns its result block by block.

So it gives you an array like this, with one entry per block:

const result = [
  {
    text: 'This is the first block'
  },
  {
    text: 'This is the second block'
  },
  {
    text: 'Created on 20 September 2021'
  },
]

My question is: how can I get the date (20 September 2021) out of this result, so I can compare it with the data that has been inserted?

I tried some logic with loops and regex, but I couldn't finish it because I'm still learning regex, and to be honest, I have already spent a day on it.

One more thing: the images I need to compare are not consistent in how they show the date.

The date might be in a block on its own (no other text, only the date),
or separated by spaces (20 September 2021),
or separated by dashes (20-September-2021),
or separated by slashes (20/September/2021),
or written with a numeric month (20-09-2021).

The main point is that the date format is not always the same.
In this case I am comparing the inserted data with a certificate image.

So, if I can get the date, I will convert it to a consistent format using Moment.js's format().

I think that’s all, thank you.

>Solution :

Based upon your expected inputs, here is some RegExp that will work:

  1. Find 1 or 2 digits (the day)
  2. Find a space, -, or /
  3. Find either 3 to 9 word characters (the month name) or 1 or 2 digits (the month number)
  4. Find a space, -, or /
  5. Find 2 to 4 digits (the year)

const regex = /\d{1,2}(-| |\/)(\w{3,9}|\d{1,2})(-| |\/)\d{2,4}/;
const inputs = ['some random text: 20 September 2020', '20/September/2020', '20-September-20', '20/09/2020', '20 Sep 20', '20-09-2020'];

for (const input of inputs) {
  // match() returns null when the string contains no date, so check before indexing
  const match = input.match(regex);
  console.log(match ? match[0] : 'no date found');
}
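To connect this back to the OCR result in the question, here is a minimal sketch that scans the array of blocks and returns the first date it finds in a single format. It makes several assumptions that are not in the original post: the helper names `toIsoDate` and `findFirstDate` are made up for illustration, dates are always in day-month-year order, month names are English, and a 2-digit year means 20xx. It also narrows the month-name part of the pattern from `\w` to letters only, and it uses plain JavaScript instead of Moment.js for the final formatting so that it runs without any dependency.

```javascript
// Sample input shaped like the `result` array in the question.
const blocks = [
  { text: 'This is the first block' },
  { text: 'This is the second block' },
  { text: 'Created on 20 September 2021' },
];

// Same idea as the pattern above, with capture groups for day, month and year.
// The month name is restricted to letters (the original used \w).
const DATE_RE = /(\d{1,2})[- \/]([A-Za-z]{3,9}|\d{1,2})[- \/](\d{2,4})/;

const MONTHS = ['jan', 'feb', 'mar', 'apr', 'may', 'jun',
                'jul', 'aug', 'sep', 'oct', 'nov', 'dec'];

// Hypothetical helper: returns 'YYYY-MM-DD', or null if no usable date is found.
// Assumes day-month-year order, English month names, and 2-digit years = 20xx.
function toIsoDate(text) {
  const m = DATE_RE.exec(text);
  if (!m) return null;

  const [, dayRaw, monthRaw, yearRaw] = m;
  const day = Number(dayRaw);
  const month = /^\d+$/.test(monthRaw)
    ? Number(monthRaw)
    : MONTHS.indexOf(monthRaw.slice(0, 3).toLowerCase()) + 1;

  // Reject things that matched the pattern but are not plausible dates.
  if (month < 1 || month > 12 || day < 1 || day > 31) return null;

  const year = yearRaw.length <= 2 ? 2000 + Number(yearRaw) : Number(yearRaw);
  const pad = (n) => String(n).padStart(2, '0');
  return `${year}-${pad(month)}-${pad(day)}`;
}

// Hypothetical helper: returns the first date found in any block, or null.
function findFirstDate(ocrBlocks) {
  for (const block of ocrBlocks) {
    const iso = toIsoDate(block.text);
    if (iso) return iso;
  }
  return null;
}

console.log(findFirstDate(blocks)); // 2021-09-20
```

The normalized string can then be compared directly with the inserted data, or handed to Moment.js if you prefer to do the formatting there. Note that this sketch only looks at the first pattern match in each block and does not check whether the day is valid for the given month.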