How to replace all matches with another format using Regex

I have HTML that contains hrefs as follows:

<a href="https://www.test.com/help">a</a><br/>
...
other html text
...
<a href="https://www.test.com/help2">b</a><br/>
...
other html text
...
<a href="https://www.test.com/help3">c</a>

How can I use Regex to replace every . character with a space inside each href, like this?

<a href="https://www test com/help">a</a><br/>
<a href="https://www test com/help2">b</a><br/>
<a href="https://www test com/help3">c</a>

I am trying the following, but it does not work:

string textToHandle = "html text here";
string newUrl = string.Empty;
Regex rg = new Regex("(?<=(href=\"))[.\\s\\S]*?(?=(\"))", RegexOptions.IgnoreCase);

MatchCollection collection = rg.Matches(textToHandle);
foreach (Match match in collection)
{
    string u = match.Groups[0].Value;
    newUrl = rg.Replace(textToHandle, u.Replace(".", " "));
    Console.WriteLine(newUrl);
}

Console.WriteLine(newUrl);

Solution:

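To replace every . with a space inside each href value, change the pattern to (?<=href="")[^""]*(?="") (shown as it is written inside a C# verbatim string, where "" stands for a single ") and call Regex.Replace() with a MatchEvaluator. Your loop fails because rg.Replace(textToHandle, ...) overwrites every href in the text with the one URL taken from the current match; a MatchEvaluator transforms each match separately: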

using System;
using System.Text.RegularExpressions;

string textToHandle = "<a href=\"https://www.test.com/help3\">c</a>";

// Match only the text between href=" and the closing quote
string pattern = @"(?<=href="")[^""]*(?="")";
Regex rg = new Regex(pattern, RegexOptions.IgnoreCase);

// Called once per match: replace '.' with ' ' in that href value only
MatchEvaluator evaluator = match => match.Value.Replace(".", " ");

// Replace all occurrences in the input text
string newUrl = rg.Replace(textToHandle, evaluator);

Console.WriteLine(newUrl); // <a href="https://www test com/help3">c</a>
Console.ReadKey();

Regex Explanation

  • (?<=href="") Positive lookbehind to ensure the match is preceded by href=".
  • [^""]* Match any character except " (double quotes), zero or more times.
  • (?="") Positive lookahead to ensure the match is followed by a double quote ".
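
To check that the lookarounds touch only the href values, here is a minimal sketch that runs the same pattern over input with several links, similar to the HTML in the question. The names hrefValue and replaceDotsInHrefs are made up for this example, and it assumes C# 9 or later (top-level statements) and that every href value is wrapped in double quotes; single-quoted or unquoted attributes would not match.

```csharp
using System;
using System.Text.RegularExpressions;

// Same pattern as in the answer: the text between href=" and the next ".
// Assumption: href values are always double-quoted.
Regex hrefValue = new Regex(@"(?<=href="")[^""]*(?="")", RegexOptions.IgnoreCase);

// Hypothetical helper: the lambda passed to Replace runs once per match,
// so each href is rewritten from its own value.
Func<string, string> replaceDotsInHrefs =
    html => hrefValue.Replace(html, m => m.Value.Replace(".", " "));

string input =
    "<a href=\"https://www.test.com/help\">a</a><br/>\n" +
    "other html text. It keeps its dots.\n" +
    "<a href=\"https://www.test.com/help2\">b</a><br/>";

Console.WriteLine(replaceDotsInHrefs(input));
// <a href="https://www test com/help">a</a><br/>
// other html text. It keeps its dots.
// <a href="https://www test com/help2">b</a><br/>
```

Dots outside the href values are left alone because the lookbehind and lookahead only assert the surrounding quotes without consuming them. For HTML that is not this regular, an HTML parser is likely a safer choice than a regex.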