Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Regex working in php 7.4 but not with mariadb 10.3.38 (though both use pcre)

I’m currently migrating a very old typo3 cms to wordpress 6.1. In order to get links stored in the db working I have to transform them from a proprietary non-xml format to html using a regular expression:

<link fileadmin/user_upload/file47.pdf - download>Linktext</link>

has to transformed with this regex

<link\s(fileadmin\S*?)[>\s].+?>(.+?)<\/link>

to this link:

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

<a href="fileadmin/user_upload/file47.pdf">Linktext</a>

I’ve tested the regular expression here https://regex101.com/r/cF5ZVR/1 – it works, and I’m also able to use it in php 7.4:

php > $pattern = '~<link\s(fileadmin\S*?)[>\s].+?>(.+?)<\/link>~';
php > $replacement = '<a href="$1">$2</a>';
php > $string = '<link fileadmin/user_upload/file47.pdf - download>Linktext</link>';
php > echo preg_replace($pattern, $replacement, $string);
<a href="fileadmin/user_upload/file47.pdf">Linktext</a>

But when I try to do it with a sql statement in MariaDB 10.3.38 I don’t get any result:

select regexp_replace('<link fileadmin/user_upload/file47.pdf - download>Linktext</link>', '<link\s(fileadmin\S*?)[>\s].+?>(.+?)<\/link>', '<a href="\\1">\\2</a>') as replaced;
+-------------------------------------------------------------------+
| replaced                                                          |
+-------------------------------------------------------------------+
| <link fileadmin/user_upload/file47.pdf - download>Linktext</link> |
+-------------------------------------------------------------------+
1 row in set (0,000 sec)

I’ve checked the regex grammar of MariaDB: https://mariadb.com/kb/en/pcre/ and everything seems to look fine.

Can anyone give me a hint what I’m missing here? Thanks!

>Solution :

You select should be:

select regexp_replace('<link fileadmin/user_upload/file47.pdf - download>Linktext</link>', '<link\\s(fileadmin\\S*?)[>\\s].+?>(.+?)<\\/link>', '<a href="\\1">\\2</a>') as replaced;

You have to escape \ symbols in you regexp. You have done it in replacement string, but missed in regex itself.

From MariaDB KB on regex:

Note: Because MariaDB uses the C escape syntax in strings (for example, \n to represent the newline character), you must double any \ that you use in your REGEXP strings.

Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading